This application incorporates by reference in their entirety the attached tables and sequence listing. Tables 1-6 and 8-10 are appended after the claims and are incorporated by reference. The content of the accompanying sequence listing is incorporated herein by reference in its entirety.
The present invention relates to markers and chromosomal amplification correlated to disease, particularly malignant disease such as breast cancer. More specifically, the present invention relates to using cancer markers and chromosomal region analyses for the prediction of patient outcome in breast cancer patients. The present invention also relates to markers and therapeutics targeting in vivo drug resistance. More specifically, the present invention relates to the diagnosis and treatment using cancer markers and therapeutics which target drug resistance in breast cancer patients with low survival rates.
Breast cancer is one of the most common malignancies among women and shares, together with lung carcinoma, the highest fatality rate of all cancers affecting females. Current treatment of breast cancer is limited to highly invasive total or partial mastectomy, radiation therapy, or chemotherapy, the latter two resulting in serious undesirable side effects.
It is now well established that breast cancers progress through accumulation of genomic (Albertson et al., 2003; Knuutila et al., 2000) and epigenomic (Baylin and Herman, 2000; Jones, 2005) aberrations that enable development of aspects of cancer pathophysiology such as reduced apoptosis, unchecked proliferation, increased motility, and increased angiogenesis (Hanahan and Weinberg, 2000). Discovery of the genes that contribute to these pathophysiologies when deregulated by recurrent aberrations is important to understanding mechanisms of cancer formation and progression and to guide improvements in cancer diagnosis and treatment.
Analyses of expression profiles have been particularly powerful in identifying distinctive breast cancer subsets that differ in biological characteristics and clinical outcome (Perou et al., 1999; Perou et al., 2000; Sorlie et al., 2001; Sorlie et al., 2003). For example, unsupervised hierarchical clustering of microarray derived expression data has identified intrinsically variable gene sets that distinguish five breast cancer subtypes: basal-like, luminal A, luminal B, ERBB2 and normal breast-like. The basal-like and ERBB2 subtypes have been associated with strongly reduced survival durations in patients treated with surgery plus radiation (Perou et al., 2000; Sorlie et al., 2001) and some studies have suggested that reduced survival duration in poorly performing subtypes is caused by an inherently high propensity to metastasize (Ramaswamy et al., 2003). These analyses already have led to the development of multi-gene assays that stratify patients into groups that can be offered treatment strategies based on risk of progression (Esteva et al., 2005; Gianni et al., 2005; van 't Veer et al., 2002). However, the predictive power of these assays is still not as high as desired and the assays have not been fully tested in patient populations treated with aggressive adjuvant chemotherapies.
Analyses of breast tumors using fluorescence in situ hybridization (Al-Kuraya et al., 2004; Kallioniemi et al., 1992; Press et al., 2005; Tanner et al., 1994) and comparative genomic hybridization (Kallioniemi et al., 1994; Loo et al., 2004; Naylor et al., 2005; Pollack et al., 1999) show that breast tumors also display a number of recurrent genome copy number aberrations including regions of high level amplification that have been associated with adverse outcome (Al-Kuraya et al., 2004; Cheng et al., 2004; Isola et al., 1995; Jain et al., 2001; Press et al., 2005). This raises the possibility of improved patient stratification through combined analysis of gene expression and genome copy number (Barlund et al., 2000; Pollack et al., 2002; Ray et al., 2004; Yi et al., 2005). In addition, several studies of specific chromosomal regions of recurrent abnormality at 17q12 (Kauraniemi et al., 2001; Kauraniemi et al., 2003) and 8p11 (Gelsi-Boyer et al., 2005; Ray et al., 2004) show the value of combined analysis of genome copy number and gene expression for identification of genes that contribute to breast cancer pathophysiology by deregulating gene expression.
Nevertheless, there is a continued need for further understanding of the genes, and of the chromosomal aberration(s) that occur in cancer, for example breast cancer.
Disclosed herein are roles of genome copy number abnormalities (CNAs) in breast cancer pathophysiology, determined by identifying associations between recurrent CNAs, gene expression and clinical outcome in a set of aggressively treated early stage breast tumors. The analysis shows that the recurrent CNAs differ between tumor subtypes defined by expression pattern and that stratification of patients according to outcome can be improved by measuring both expression and copy number, especially high level amplification. Sixty-six genes (set forth in Table 3) deregulated by the high level amplifications are therapeutic targets; nine of these genes (FGFR1, IKBKB, ERBB2, PROSC, ADAM9, FNTA, ACACA, PNMT, and NR1D1) are “druggable.” Low level CNAs appear to contribute to cancer progression by altering RNA and cellular metabolism.
As used herein, gene amplification is used in a broad sense. It comprises an increase of gene copy number; it can also comprise assessment of amplification of the gene product. Thus levels of gene expression, as well as corresponding protein expression, can be evaluated. In the embodiments that follow, it is understood that assessment of gene expression can be used to assess level of gene product such as RNA or protein.
Thus, embodiments of the invention include: A method for prognosing the outcome of a patient with breast cancer, said method comprising: providing breast cancer tissue from the patient; determining from the provided tissue, the level of gene amplification or gene expression for at least one gene set forth in Table 3; identifying that the at least one gene or gene product is amplified; whereby, when the at least one gene or gene product is amplified, this is an indication that the patient has the predicted disease free survival or probability for distant recurrence set forth in Table 3. This method can comprise that the gene or gene product is ACACA (SEQ ID NOs: 1, 2), ADAM9 (SEQ ID NOs: 3-8), ERBB2 (SEQ ID NOs: 9-14), FGFR1 (SEQ ID NOs: 15, 16), FNTA (SEQ ID NOs: 17, 18), IKBKB (SEQ ID NOs: 19, 20), NR1D1 (SEQ ID NOs: 21, 22), PNMT (SEQ ID NOs: 23, 24), or PROSC (SEQ ID NOs: 25, 26); in particular PROSC (SEQ ID NOs: 25, 26), ADAM9 (SEQ ID NOs: 3-8), FNTA (SEQ ID NOs: 17, 18), ACACA (SEQ ID NOs: 1, 2), PNMT (SEQ ID NOs: 23, 24), or NR1D1 (SEQ ID NOs: 21, 22). In one preferred embodiment, the gene, ADAM9 (SEQ ID NOs: 3, 5 and 7) is a therapeutic target. In certain embodiments, there is a proviso that the gene or gene product is not ERBB2 (SEQ ID NOs: 9-14), FGFR1 (SEQ ID NOs: 15, 16), or IKBKB (SEQ ID NOs: 19, 20). The detecting step can comprise use of a methodology selected from the group consisting of quantitative PCR, FISH, array CGH, quantitative PCR, in situ hybridization for RNA, immunohistochemistry and reverse phase protein lysate arrays for protein. In some embodiments, the gene or gene product is FGF3 (SEQ ID NOs: 65, 66), PPFIA1 (SEQ ID NOs: 69, 70), NEU3 (SEQ ID NOs: 79, 80), CSTF1 (SEQ ID NOs: 117, 118), PCK1 (SEQ ID NOs: 123, 124), VAPB (SEQ ID NOs: 129, 130), GNAS (SEQ ID NOs: 135, 136), BCAS1 (SEQ ID NOs: 115, 116), TMEPA1 (SEQ ID NOs: 125, 126), or STX16 (SEQ ID NOs: 131, 132).
In certain embodiments, the gene or gene product is FGF3 (SEQ ID NOs: 65,66), PPFIA1 (SEQ ID NOs: 69, 70), NEU3 (SEQ ID NOs: 79, 80), GNAS (SEQ ID NOs: 135, 136), TMEPA1 (SEQ ID NOs: 125, 126), STX16 (SEQ ID NOs: 131, 132), or VAPB (SEQ ID NOs: 129, 130). In some embodiments, the breast cancer is a luminal A breast cancer and the gene or gene product is a gene or encoded by a gene at 11q13-14 and/or 20q13, e.g., FGF3 (SEQ ID NOs: 65,66), PPFIA1 (SEQ ID NOs: 69, 70), NEU3 (SEQ ID NOs: 79, 80), GNAS (SEQ ID NOs: 135, 136), TMEPA1 (SEQ ID NOs: 125, 126), STX16 (SEQ ID NOs: 131, 132), or VAPB (SEQ ID NOs: 129, 130).
An embodiment in accordance with the invention comprises: A method for selecting a patient for treatment with a drug that modulates the expression of a gene set forth in Table 3, said method comprising: providing tissue biopsy from the patient; determining from the provided tissue, the level of gene amplification or gene product expression for a gene set forth in Table 3; identifying that one or more of the genes or gene products is amplified; whereby, when the one or more genes or gene products are amplified, this gene and/or gene product is a candidate for treatment with a drug that modulates the expression of the one or more gene of Table 3 or a drug that affects a protein of Table 3. In certain embodiments, the gene or product is ACACA (SEQ ID NOs: 1, 2), ADAM9 (SEQ ID NOs: 3-8), ERBB2 (SEQ ID NOs: 9-14), FGFR1 (SEQ ID NOs: 15, 16), FNTA (SEQ ID NOs: 17, 18), IKBKB (SEQ ID NOs: 19, 20), NR1D1 (SEQ ID NOs: 21, 22), PNMT (SEQ ID NOs: 23, 24), or PROSC (SEQ ID NOs: 25, 26); in particular PROSC (SEQ ID NOs: 25, 26), ADAM9 (SEQ ID NOs: 3-8), FNTA (SEQ ID NOs: 17, 18), ACACA (SEQ ID NOs: 1, 2), PNMT (SEQ ID NOs: 23, 24), or NR1D1 (SEQ ID NOs: 21, 22); and in one embodiment, particularly, ADAM9. In certain embodiments, there is a proviso that the gene or gene product is not ERBB2 (SEQ ID NOs: 9-14), FGFR1 (SEQ ID NOs: 15, 16), or IKBKB (SEQ ID NOs: 19, 20). In some embodiments, the gene or gene product is FGF3 (SEQ ID NOs: 65,66), PPFIA1 (SEQ ID NOs: 69, 70), NEU3 (SEQ ID NOs: 79, 80), CSTF1 (SEQ ID NOs: 117, 118), PCK1 (SEQ ID NOs: 123, 124), VAPB (SEQ ID NOs: 129, 130), GNAS (SEQ ID NOs: 135, 136), BCAS1 (SEQ ID NOs: 115, 116), TMEPA1 (SEQ ID NOs: 125, 126), or STX16 (SEQ ID NOs: 131, 132). In certain embodiments, the gene or gene product is FGF3 (SEQ ID NOs: 65,66), PPFIA1 (SEQ ID NOs: 69, 70), NEU3 (SEQ ID NOs: 79, 80), GNAS (SEQ ID NOs: 135, 136), TMEPA1 (SEQ ID NOs: 125, 126), STX16 (SEQ ID NOs: 131, 132), or VAPB (SEQ ID NOs: 129, 130). 
In some embodiments, the breast cancer is a luminal A breast cancer and the gene or gene product is a gene or encoded by a gene at 11q13-14 and/or 20q13, e.g., FGF3 (SEQ ID NOs: 65, 66), PPFIA1 (SEQ ID NOs: 69, 70), NEU3 (SEQ ID NOs: 79, 80), GNAS (SEQ ID NOs: 135, 136), TMEPA1 (SEQ ID NOs: 125, 126), STX16 (SEQ ID NOs: 131, 132), or VAPB (SEQ ID NOs: 129, 130). The determining step can comprise use of a methodology selected from the group consisting of quantitative PCR, FISH, array CGH, quantitative PCR, in situ hybridization for RNA, immunohistochemistry and reverse phase protein lysate arrays for protein.
An embodiment of the invention comprises: A method for treatment of a patient with breast cancer, said method comprising: providing tissue biopsy from the patient; determining from the provided tissue, the level of gene amplification or level of gene product for a gene set forth in Table 3; identifying that one or more of the genes or gene products is amplified; whereby, when the one or more genes or gene products are amplified, the patient is treated with a drug that modulates the expression of the one or more gene or a drug that affects the gene product. In certain embodiments, the gene or gene product is ACACA (SEQ ID NOs: 1, 2), ADAM9 (SEQ ID NOs: 3-8), ERBB2 (SEQ ID NOs: 9-14), FGFR1 (SEQ ID NOs: 15, 16), FNTA (SEQ ID NOs: 17, 18), IKBKB (SEQ ID NOs: 19, 20), NR1D1 (SEQ ID NOs: 21, 22), PNMT (SEQ ID NOs: 23, 24), or PROSC (SEQ ID NOs: 25, 26); in particular PROSC (SEQ ID NOs: 25, 26), ADAM9 (SEQ ID NOs: 3-8), FNTA (SEQ ID NOs: 17, 18), ACACA (SEQ ID NOs: 1, 2), PNMT (SEQ ID NOs: 23, 24), or NR1D1 (SEQ ID NOs: 21, 22); or more particularly in one embodiment, ADAM9. In certain embodiments there is a proviso that the gene or gene product is not ERBB2 (SEQ ID NOs: 9-14), FGFR1 (SEQ ID NOs: 15, 16), or IKBKB (SEQ ID NOs: 19, 20). In some embodiments, the gene or gene product is FGF3 (SEQ ID NOs: 65, 66), PPFIA1 (SEQ ID NOs: 69, 70), NEU3 (SEQ ID NOs: 79, 80), CSTF1 (SEQ ID NOs: 117, 118), PCK1 (SEQ ID NOs: 123, 124), VAPB (SEQ ID NOs: 129, 130), GNAS (SEQ ID NOs: 135, 136), BCAS1 (SEQ ID NOs: 115, 116), TMEPA1 (SEQ ID NOs: 125, 126), or STX16 (SEQ ID NOs: 131, 132). In certain embodiments, the gene or gene product is FGF3 (SEQ ID NOs: 65, 66), PPFIA1 (SEQ ID NOs: 69, 70), NEU3 (SEQ ID NOs: 79, 80), GNAS (SEQ ID NOs: 135, 136), TMEPA1 (SEQ ID NOs: 125, 126), STX16 (SEQ ID NOs: 131, 132), or VAPB (SEQ ID NOs: 129, 130).
In some embodiments, the breast cancer is a luminal A breast cancer and the gene or gene product is a gene or encoded by a gene at 11q13-14 and/or 20q13, e.g., FGF3 (SEQ ID NOs: 65, 66), PPFIA1 (SEQ ID NOs: 69, 70), NEU3 (SEQ ID NOs: 79, 80), GNAS (SEQ ID NOs: 135, 136), TMEPA1 (SEQ ID NOs: 125, 126), STX16 (SEQ ID NOs: 131, 132), or VAPB (SEQ ID NOs: 129, 130). In one embodiment the drug is an antisense sequence for a gene of Table 3, and the particular antisense sequence corresponds to the one or more amplified genes identified in the identifying step. The determining step can comprise use of a methodology selected from the group consisting of quantitative PCR, FISH, array CGH, quantitative PCR, in situ hybridization for RNA, immunohistochemistry and reverse phase protein lysate arrays for protein.
Another embodiment of the invention comprises: A method for identifying a moiety that modulates a protein, said method comprising: providing a protein selected from the group consisting of PROSC (SEQ ID NO: 26), ADAM9 (SEQ ID NOs: 4, 6, or 8), FNTA (SEQ ID NO: 18), ACACA (SEQ ID NO: 2), PNMT (SEQ ID NO: 24), or NR1D1 (SEQ ID NO: 22); screening the provided protein with a candidate moiety; determining whether the candidate moiety modulates (e.g., alters function or expression of) the protein; and, selecting a moiety that modulates the protein. A further embodiment comprises: A method for modulating a PROSC (SEQ ID NO: 26), ADAM9 (SEQ ID NOs: 4, 6 or 8), FNTA (SEQ ID NO: 18), ACACA (SEQ ID NO: 2), PNMT (SEQ ID NO: 24), or NR1D1 (SEQ ID NO: 22) protein in a living cell, said method comprising: providing a moiety that modulates the protein; administering the moiety to a living cell that expresses PROSC, ADAM9, FNTA, ACACA, PNMT, or NR1D1 protein corresponding to the moiety; whereby, PROSC, ADAM9, FNTA, ACACA, PNMT, or NR1D1 protein in the cell is modulated.
Another embodiment of the invention comprises a method for prognosing the outcome of a patient with breast cancer, said method comprising: providing breast cancer tissue from the patient; determining from the provided tissue, the level of gene deletion for at least one gene from amplicon 8p11-12; identifying that the at least one gene is deleted; whereby, when the at least one gene is deleted, this is an indication that the patient has the predicted disease free survival or probability for distant recurrence set forth in Table 3. In certain embodiments, the at least one gene from amplicon 8p11-12 is selected from the chromosome 8 genes set forth in Table 3. The determining step can comprise use of a methodology selected from the group consisting of quantitative PCR, FISH, array CGH, quantitative PCR, in situ hybridization for RNA, immunohistochemistry and reverse phase protein lysate arrays for protein.
a). Frequencies of genome copy number gain and loss plotted as a function of genome location with chromosomes 1pter to the left and chromosomes 22qter and X to the right. Vertical lines indicate chromosome boundaries and vertical dashed lines indicate centromere locations. Positive and negative values indicate frequencies of tumors showing copy number increases and decreases, respectively, with gain and loss as described in the methods.
b). Frequencies of tumors showing high level amplification. Data are displayed as described in
c-j). Frequencies of tumors showing significant copy number gains and losses as defined in
a). Disease specific survival in 130 breast cancer patients whose tumors were defined using expression profiling to be basal-like (third curve down), luminal A (top curve), luminal B (second curve from top) and ERBB2 class (bottom curve).
b). Disease-specific survival of patients with tumors classified by genome copy number aberration analysis as 1q/16q (top curve), Complex (red) and Amplifying (blue).
c). Survival of patients with (bottom curve) and without (top curve) amplification at any region of recurrent amplification.
d). Survival of patients whose tumors were defined using expression profiling to be luminal A tumors with (bottom curve) and without (top curve) amplification at 8p11-12, 11q13, and/or 20q.
e). Survival of patients whose tumors were not amplified at 8p11-12 and that had normal (top curve) or reduced (bottom curve) genome copy number at 8p11-12.
f). Survival of patients whose tumors had normal (top curve) or abnormal (bottom curve) genome copy number at 8p11-12.
a). Frequencies of genome copy number gain and loss plotted as described in
b). Array CGH analyses of genome copy number for human mammary epithelial cells at passages 16 and 21 before transition through telomere crisis (upper two traces) and at passages 28 and 44 after immortalization (lower two traces) (Chin et al., 2004).
Table 1. Univariate and multivariate associations for individual amplicons and/or disease specific survival and distant recurrence. Also shown are the chromosomal positions of the beginning and ends of the amplicons and the flanking clones. Associations are shown for the entire sample set and for luminal A tumors (univariate associations only).
Table 2. Associations of genomic variables with clinical features.
Table 3. Functional characteristics of 66 genes; these genes are in recurrent amplicons associated with reduced survival duration in breast cancer. Functional annotation was based on the Human Protein Reference Database (http://hprd.org). Genes highlighted in dark gray are associated with reduced survival duration or distant recurrence when over expressed in non-Amplifying tumors. Genes highlighted in light gray are significantly associated with reduced survival duration or distant recurrence (p<0.05) when down regulated in non-Amplifying tumors. Distances to sites of recurrent viral integration were determined from published information (Akagi et al., 2004). The last column identifies genes having predicted protein folding characteristics indicating that they are druggable (see, e.g., Russ and Lampel, 2005).
Table 4. Univariate p-values with the corresponding 95% confidence intervals for associations with disease-specific survival and distant recurrence endpoints and the corresponding multivariate results for those found to be significant in univariate analyses (p<0.05) for at least one of the clinical end points. Only variables individually significant at p<0.05 for at least one of the two end points are included in the multivariate regression. Stage and SBR Grade are treated as continuous variables rather than factors. In each column pair, the left subcolumn lists results for disease-specific survival and the right subcolumn lists results for time to distant recurrence.
Table 5. Comparison of the association between expression subtypes and survival duration in 3 datasets. Log-likelihood ratio test p-value is shown for each model. Basal is the reference in all models. Multivariate models include size and nodal status. In multivariate analyses, the first value shown in each cell is the p-value and the second is the ratio of the medians in the compared groups.
Table 6. Identities of 1432 gene transcripts showing significant associations between genome copy numbers measured using array CGH and transcript levels measured using Affymetrix U133A expression arrays in 101 primary breast tumors. Data will be available through CaBIG and a public web site.
Table 7. The set of genes in Table 3, shown with the corresponding GenBank Accession numbers and the SEQ ID NOs assigned for the gene and gene products.
Table 8. Sequences of siRNAs targeting various human genes encoded by amplicon 11q13.
Table 9. Sequences of shRNAs targeting human NEU3, FGF3 and PPFIA1 genes.
Table 10. Sequences of siRNAs targeting various human genes encoded by amplicon 20q13.
In order to further the understanding of the genes, and of the chromosomal aberration(s) that occur in cancer, for example breast cancer we performed combined analyses of genome copy number and gene expression to identify genes that contribute to breast cancer pathophysiology with emphasis on those that are associated with poor response to current therapies.
By associating clinical endpoints with genome copy number and gene expression, we showed strong associations between expression subtype and genome aberration composition and we identified four human chromosomal regions (8p11-12, 11q13-14, 17q12 and/or 20q13) of recurrent amplification associated with poor outcome in treated patients. Gene expression profiling revealed 66 genes (see, e.g., Table 3 and Table 7) in these regions of amplification whose expression levels were deregulated by the high-level amplifications. We also found a surprising association between low level genome copy number abnormalities (CNAs) and up-regulation of genes associated with RNA and protein metabolism that may suggest a new mechanism by which these aberrations contribute to cancer progression.
We disclose a comprehensive analysis of gene expression and genome copy number in aggressively treated primary human breast cancers performed in order to (a) identify genomic events that are assayed to better stratify patients according to clinical behavior, (b) identify how molecular aberrations contribute to breast cancer pathogenesis and (c) discover genes that are therapeutic targets in patients that do not respond well to current therapies.
Molecular Markers that Predict Outcome
We focused in this study on combined analyses of genome copy number and gene expression in tumors from patients who had aggressive treatment with surgery, radiation of the surgical margins, and hormonal therapy for ER positive disease and aggressive adjuvant chemotherapy as indicated (typically adriamycin and cytoxan but not including Trastuzumab). Analyses of markers in the context of this treatment regimen allowed us to identify those that predicted outcome in patients whose tumors were treated more aggressively than in previously published studies (Esteva et al., 2005; Gianni et al., 2005; van 't Veer et al., 2002). Our analyses of this aggressively treated patient cohort revealed two important associations.
First, we found that patients with tumors classified as basal-like according to expression pattern did not have significantly worse outcome than patients with luminal or normal-like tumors in this tumor set, unlike previous reports (van 't Veer et al., 2002; van de Vijver et al., 2002) (see
Secondly, we found that aggressively treated patients with high level amplification had worse outcome than did patients without amplification (see
Our combined analyses of genome copy number and gene expression showed substantial differences in recurrent genome abnormality composition between tumors classified according to expression pattern and revealed that over 10% of the genes interrogated in this study had expression levels that were highly significantly associated with genome copy number changes. Most of the gene expression changes were associated with low level changes in genome copy number, but 66 were deregulated by the high level amplifications associated with poor outcome (see Table 3), defined as having a multiple testing corrected p-value of less than 0.05. These analyses provide evidence of the etiology of breast cancer subtypes and of mechanisms by which the low-level copy number changes contribute to cancer pathogenesis, and identify a suite of genes that contribute to cancer pathophysiology when over expressed as a result of high level amplification.
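By way of illustration only, the selection of genes by a multiple testing corrected p-value of less than 0.05 can be sketched as follows. The specification does not state which correction was applied; the Benjamini-Hochberg procedure used here is an assumption, and the function names and gene labels are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: the correction method is not named in the text; BH is
    used here only as one common choice.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_genes(gene_p_values, alpha=0.05):
    """Return the genes whose adjusted p-value is below alpha."""
    genes = list(gene_p_values)
    adjusted = benjamini_hochberg([gene_p_values[g] for g in genes])
    return [g for g, q in zip(genes, adjusted) if q < alpha]


if __name__ == "__main__":
    # Hypothetical per-gene p-values for a copy number/expression association.
    print(significant_genes({"GENE_A": 0.01, "GENE_B": 0.5}))
```

A more conservative correction (for example Bonferroni) could be substituted without changing the surrounding logic.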
Breast cancer subtypes.
The differences in recurrent aberration composition between expression subtypes are consistent with a model of cancer progression in which the expression subtype and genotype are determined by the cell type and stage of differentiation that survives telomere crisis and acquires sufficient proliferative advantage to achieve clonal dominance in the tumor (Chin et al., 2004). This model indicates that the genome CNA spectrum is selected to be most advantageous to the progression of the specific cell type that achieves immortality and clonal dominance. In this model, the recurrent genome CNA composition can be considered an independent subtype descriptor, much as genome CNA composition can be considered to be a cancer type descriptor (Knuutila et al., 2000). The independence of the genome CNA composition and basal and luminal expression subtypes is clear from
Low level abnormalities. The most frequent low-level copy number changes were not associated with reduced survival duration although some were associated with other markers usually associated with survival such as tumor size, nodal status, and grade (see Table 2). This raises the question of why the recurrent low-level CNAs are selected. To understand this, we applied the statistical tool GOstat to determine the ontology of the genes deregulated by these abnormalities. This analysis showed that numerous genes involved in RNA and cellular metabolism were significantly up-regulated by these events. Interestingly, we also observed that many of the recurrent low-level aberrations matched the low-level copy number changes in the ZNF217-transfected human mammary epithelial cells that emerged after passage through telomere crisis having achieved clonal dominance in the culture (see FIG. 5), presumably because the aberrations they carried conferred a proliferative advantage (Chin et al., 2004). This indicates that the low-level CNAs contribute to early cancer formation by increasing basal metabolism thereby providing a net survival/proliferative advantage to the cells that carry them. This idea is supported by a report that some of these same classes of genes were associated with proliferative fitness in yeast (Deutschbauer et al., 2005). That study described analyses of proliferative fitness in the complete set of Saccharomyces cerevisiae heterozygous deletion strains and reported reduced growth rates for strains carrying deletions in genes involved in RNA metabolism and ribosome biogenesis and assembly.
High level amplification. We found that high level amplifications of 8p11-12, 11q13-14, 17q12 and/or 20q13 were associated with reduced survival duration and/or distant recurrence overall, and within the luminal A expression subgroup. We identified 66 genes (see, e.g., Table 3) in these regions whose expression levels were correlated with copy number. These 66 genes are shown in Table 7 below along with the GenBank Accession numbers for each of the genes and gene products (proteins), the records of which are hereby incorporated by reference for all purposes. Also shown are the corresponding SEQ ID NOs as assigned here and shown in the sequence listing attached herein in computer readable form.
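For illustration, the kind of over-representation question that an ontology analysis asks can be sketched with a one-sided hypergeometric test. This is an assumed, simplified stand-in and is not a description of how GOstat itself computes its statistics; the function name and the counts in the usage line are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, annotated, drawn, population):
    """P(X >= k) when `drawn` genes are sampled without replacement from
    `population` genes, of which `annotated` carry a given ontology term.

    A small value suggests the term is over-represented among the
    deregulated genes (assumed test, for illustration only).
    """
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    hits = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return hits / total


if __name__ == "__main__":
    # Hypothetical counts: 1000 genes tested, 50 annotated with a term,
    # 100 deregulated genes, 15 of which carry the term.
    print(hypergeom_upper_tail(15, 50, 100, 1000))
```

In practice the resulting p-values would themselves need correction for the number of ontology terms tested.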
GO analyses of those genes showed that they are involved in aspects of nucleic acid metabolism, protein modification, signaling and the cell cycle and/or protein transport and evidence is mounting that many if not most of these genes are functionally important in the cancers in which they are amplified and over expressed (see Table 3). Indeed, published functional studies in model systems already have implicated fourteen genes in diverse aspects of cancer pathophysiology (Table 3, column 8).
Six of these are encoded in the region of amplification at 8p11. These are the RNA binding protein, LSM1 (GenBank Accession No. NM_014462; SEQ ID NO: 35; Fraser et al., 2005), the receptor tyrosine kinase, FGFR1 (GenBank Accession No. AY585209; SEQ ID NO: 15; Braun and Shannon, 2004), the cell cycle regulatory protein, TACC1 (GenBank Accession No. NM_206862; SEQ ID NO: 43; Still et al., 1999), the metalloproteinase, ADAM9 (GenBank Accession Nos. AF495383, NM_003816, NM_001005845; SEQ ID NOs: 3, 5, and 7; Mazzocca et al., 2005), the serine/threonine kinase, IKBKB (GenBank Accession Nos. AY663108, NM_001556, XM_032491; SEQ ID NO: 19; Greten and Karin, 2004; Lam et al., 2005) and the DNA polymerase, POLB (GenBank Accession No. NM_002690; SEQ ID NO: 53; Clairmont et al., 1999).
Functionally validated genes in the region of amplification at 11q13 include the cell cycle regulatory protein, CCND1 (GenBank Accession Nos. NM_053056, NM_001758; SEQ ID NO: 63; Hinds et al., 1994), and the growth factor, FGF3 (GenBank Accession No. NM_005247; SEQ ID NO: 65; Okunieff et al., 2003).
Functionally important genes in the region of amplification at 17q include the transcription regulation protein, PPARBP (GenBank Accession No. NM_004774.2; SEQ ID NO: 97; Zhu et al., 2000), the receptor tyrosine kinase ERBB2 (GenBank Accession Nos. AY208911, NM_004448, NM_001005862; SEQ ID NOs: 9, 11, 13; Slamon et al., 1989) and the adapter protein, GRB7 (GenBank Accession No. NM_001030002; SEQ ID NO: 105; Tanaka et al., 2000).
The AKT pathway-associated-transcription factor, ZNF217 (GenBank Accession No. NM_006526; SEQ ID NO: 113; Huang et al., 2005; Nonet et al., 2001) and the RNA binding protein, RAE1 (GenBank Accession No. NM_003610; SEQ ID NO: 119; Babu et al., 2003) are functionally validated genes encoded in the region of amplification at 20q13.
As set forth in Table 3, column 9, further support for the functional importance of 21 of these genes (TACC1, ADAM9, IKBKB, POLB, CCND1, PCGF2, PSMB3, PIP5K2B, F1120291, STARD3, TCAP, PNMT, PERLD1, GRB7, GSDML, PSMD3, NR1D1, ZNF217, BCAS1, TH1L, and C20orf45) in oncogenesis comes from the observation that they are within 100 Kbp of sites of recurrent tumorigenic viral integration in the mouse (Akagi et al., 2004); in particular, three (IKBKB, CCND1, GRB7) are within 10 Kbp of such a site. Taking proximity to a site of recurrent tumorigenic viral integration as evidence for a role in cancer genesis, an additional 13 genes or transcripts are implicated (see Table 3); these are the genes that are near viral insertion sites but are: (1) not associated with outcome [highlighted gray] and (2) not previously associated with cancer [column 8].
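The proximity criterion above (within 100 Kbp, or within 10 Kbp, of an integration site) can be sketched as a simple interval distance check. The sketch assumes that gene intervals and integration sites have already been placed in one common coordinate system, which for mouse integration sites and human genes would require a separate orthology mapping step not shown here; all names and coordinates are hypothetical.

```python
def distance_to_nearest_site(gene_start, gene_end, site_positions):
    """Return the distance in bp from a gene interval to the nearest site.

    Returns 0 when a site lies inside the gene and None when there are
    no sites to compare against.
    """
    best = None
    for pos in site_positions:
        if gene_start <= pos <= gene_end:
            d = 0
        elif pos < gene_start:
            d = gene_start - pos
        else:
            d = pos - gene_end
        best = d if best is None else min(best, d)
    return best


def genes_near_sites(genes, sites, max_distance_bp=100_000):
    """Map each gene within max_distance_bp of a site to that distance.

    genes: {name: (chromosome, start, end)}
    sites: {chromosome: [position, ...]}
    """
    near = {}
    for name, (chrom, start, end) in genes.items():
        d = distance_to_nearest_site(start, end, sites.get(chrom, []))
        if d is not None and d <= max_distance_bp:
            near[name] = d
    return near


if __name__ == "__main__":
    # Hypothetical coordinates, not taken from Table 3.
    genes = {"GENE_A": ("chr1", 1000, 2000), "GENE_B": ("chr1", 500000, 510000)}
    sites = {"chr1": [50000]}
    print(genes_near_sites(genes, sites))
```

Calling `genes_near_sites` with `max_distance_bp=10_000` gives the stricter 10 Kbp subset.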
The biological roles of the genes deregulated by recurrent high level amplification are diverse and vary between regions of amplification. For example, genes deregulated by amplification at 11q13 and 17q11-12 were predominantly involved in signaling and cell cycle regulation while genes deregulated by amplification at 8p11-12 and 20q13 were of mixed function but were associated most frequently with aspects of nucleic acid metabolism. The predominance of genes involved in nucleic acid metabolism in the region of amplification at 8p11-12 was especially strong.
Gene Deletion. Interestingly, the region of recurrent amplification at 8p11-12 described above was reduced in copy number in some tumors and this event also was associated with poor outcome. Thus, this is evidence that the poor clinical outcome in tumors with 8p11-12 abnormalities is due to increased genome instability/mutagenesis resulting from either up- or down-regulation of genes encoded in this region. This is supported by studies in yeast showing that up- or down-regulation of genes involved in chromosome integrity and segregation can produce similar instability phenotypes (Ouspenski et al., 1999).
Thus, the 66 genes we set forth in Table 3 were found to be deregulated by the high level amplifications and were associated with poor outcome; these genes and their gene products serve as therapeutic targets for cancer treatment, in particular those patients that are refractory to current therapies. Small molecule or antibody based inhibitors have already been developed for FGFR1 (PD173074, (Ray et al., 2004)), IKBKB (PS-1145; (Lam et al., 2005)) and ERBB2 (Trastuzumab, (Vogel et al., 2002)).
Six genes set forth in Table 3 (PROSC, ADAM9, FNTA, ACACA, PNMT, and NR1D1) are considered as druggable based on the presence of predicted protein folds that favor interactions with drug-like compounds (Russ and Lampel, 2005).
Taking ERBB2 as the paradigm (recurrently amplified, over expressed, associated with outcome and with demonstrated functional importance in cancer) indicates that FGFR1, TACC1, ADAM9, IKBKB, PNMT, and GRB7 are high priority therapeutic targets in these regions of amplification. Thus, it is expected that the studies and effects of inhibition on ADAM9, as described in Example 10, may be carried out and observed for any of these genes as well. Furthermore, it is contemplated that antagonists of these genes can be made by one having skill in the art, including but not limited to inhibitory oligonucleotides and peptides, aptamers, small molecules, drugs and antibodies, thereby producing an effect on the gene or gene product as a treatment for breast cancer.
We assessed genome copy number using BAC array CGH (Hodgson et al., 2001; Pinkel et al., 1998; Snijders et al., 2001; Solinas-Toldo et al., 1997) and gene expression profiles using Affymetrix U133A arrays (Ramaswamy et al., 2003; Reyal et al., 2005) in breast tumors from a cohort of patients treated according to the standard of care between 1989 and 1997 (surgery, radiation, hormonal therapy and treatment with high dose adriamycin and cytoxan as indicated). We measured genome copy number profiles for 145 primary breast tumors and gene expression profiles for 130 primary tumors, of which 101 were in common. We analyzed these data to identify recurrent genomic and transcriptional abnormalities and we assessed associations with clinical endpoints to identify genomic events that might contribute to cancer pathophysiology.
Genome copy number and gene expression features. We found that the recurrent genome copy number and gene expression characteristics measured for the patient cohort in this study were similar to those reported in earlier studies. We summarize these briefly.
Associations between CNAs and expression. Combined analyses of genome copy number and expression showed that the recurrent genome CNAs differed between expression subtypes and identified genes whose expression levels were significantly deregulated by the CNAs.
c) shows that the basal-like tumors were relatively enriched for low-level copy number gains involving 3q, 8q, and 10p and losses involving 3p, 4p, 4q, 5q, 12q, 13q, 14q and 15q while
In order to understand how the genome aberrations were influencing cancer pathophysiologies, we identified genes that were deregulated by recurrent genome CNAs. We took these genes to be those whose expression levels were significantly associated with copy number (Holm-adjusted p-value<0.05). These genes, which represent about 10% of the genome interrogated by the Affymetrix HGU133A arrays used in this study, and their copy number-expression level correlation coefficients are listed in Table 4. This extent of genome-aberration-driven deregulation of gene expression is similar to that reported in earlier studies (Hyman et al., 2002; Pollack et al., 1999).
We tested associations between copy number and expression level for 186 genes in regions of amplification at 8p11-12, 11q13-q14, 17q11-12 and 20q13 (see Table 5) and we identified 66 genes in these regions whose expression levels were correlated with copy number (FDR<0.01, Wilcoxon rank sum test; Table 3). These genes define the transcriptionally important extents of the regions of recurrent amplification. Twenty-three were from a 5.5 Mbp region at 8p11-12 flanked by SPFH2 and LOC441347, ten were from a 6.6 Mbp region at 11q13-14 flanked by CCND1 and PRKRIR, nineteen were from a 3.1 Mbp region at 17q12 flanked by LHX1 and NR1D1 and fourteen were from a 5.4 Mbp region at 20q13 flanked by ZNF217 and C20orf45.
Since the recurrent genome aberrations differed between expression subtypes, we explored the extent to which the expression subtypes were determined by genome copy number. Specifically, we applied unsupervised hierarchical clustering to intrinsically variable genes after removing genes whose expression levels were correlated with copy number.
Associations with Clinical Variables.
Associations with histopathology.
These analyses showed that ER/PR negative tumors were predominantly found in the basal-like and “complex” expression and genome aberration subtypes, respectively. Node-positive tumors had significantly more amplified arms and recurrent amplicons than node-negative samples but showed a much more moderate difference in terms of low-level copy number transitions. Stage 1 tumors had moderately fewer low- and high-level changes than higher stage tumors. The number of low and high level abnormalities increased with SBR grade. Interestingly, the “complex” tumors showing many low-level abnormalities were more strongly associated with aberrant p53 expression than “amplifying” tumors. “Simple” tumors tended to have Ki67 proliferation indices <10% while “complex” and “amplifying” tumors typically had Ki67 indices >10%. The number of amplifications increased significantly with tumor size but the number of low level changes did not. We observed no association of genomic changes with the age at diagnosis.
Associations with outcome.
The tumor subtypes based on patterns of gene expression or genome aberration content showed moderate associations with outcome endpoints. For example,
We found that high level amplification was most strongly associated with poor outcome in this aggressively treated patient population. Amplification at any of the 9 recurrent amplicons was an independent risk factor for reduced survival duration (p<0.04) and distant recurrence (p<0.01) in a multivariate Cox-proportional model that included tumor size and nodal status.
Importantly, we found that stratification according to amplification status allowed identification of patients with poor outcome even within an expression subtype.
Considering the strong association between amplification and outcome, we explored the possibility that some of these genes were over expressed in tumors in which they were not amplified and that over expression was associated with reduced survival duration in those tumors. Increased expression levels of 7 genes are labeled in Table 3 in dark gray (CTTN, KRTAP5-9, LHX1, PPARBP, PNMT, GRB7, TMEPAI). These genes were associated with reduced survival or distant recurrence at the p<0.1 level, but only two, the growth factor receptor binding protein, GRB7 (17q), and the keratin associated protein, KRTAP5-9 (11q), were associated at the p<0.05 level.
Interestingly, this expression analysis also revealed an unexpected association between reduced expression levels of genes from regions of amplification and poor outcome (either disease free survival or distant recurrence) in tumors without relevant amplifications (p<0.05). This was especially prominent for genes from the region of amplification at 8p11-12 (14 of 23 genes in this region showed this association) while only two genes from regions of adverse-outcome-associated amplifications on chromosomes 17q and 20q showed this association.
Following this lead, we tested associations between outcome and reduced copy number at 8p11-12 in tumors in which 8p11-12 was not amplified.
We also tested for associations of low level genome copy number changes with the outcome endpoints. The most frequent low-level copy number changes (e.g. increased copy number at 1q, 8q and 20q or decreased copy number at 16q) were not significantly associated with outcome endpoints. However, we did find a significant association of the loss of a small region on 9q22 with adverse outcome, both disease-specific survival and distant recurrence, which persisted even after correction for multiple testing (p<0.05, multivariate Cox regression). This region is defined by BACs CTB-172A10 and RP11-80F13. We also found a marginally significant association between fraction of the genome lost and disease-specific survival in luminal A tumors (p<0.02 and <0.06 for univariate and multivariate regression, respectively, Wilcoxon rank-sum test).
The lack of association of the most frequent low level CNAs with outcome raised the issue of selection pressure during tumor evolution. To understand this, we used the program GoStat (Beissbarth and Speed, 2004) to identify the Gene Ontology (GO) classes of 1444 unique genes (1734 probe sets) whose expression levels were preferentially modulated by low-level CNAs compared to 3026 probe sets whose expression levels did not show associations with copy number. The GO categories most significantly overrepresented in the set of genes with a dosage effect compared to genes with no or minimal dosage effect involved RNA processing (Holm adjusted p-value<0.001), RNA metabolism (p<0.01) and cellular metabolism (p<0.02).
Tumor characteristics. Frozen tissue from UC San Francisco and the California Pacific Medical Center collected between 1989 and 1997 was used for this study. Tissues were collected under IRB approved protocols with patient consent. Tissues were collected, frozen over dry ice within 20 minutes of resection, and stored at −80° C. An H&E section of each tumor sample was reviewed, and the frozen block was manually trimmed to remove normal and necrotic tissue from the periphery. Clinical follow-up was available with a median time of 6.6 years overall and 8 years for censored patients. Tumors were predominantly early stage (83% stage I & II) with an average diameter of 2.6 cm. About half of the tumors were node positive, 67% were estrogen receptor positive, 60% received tamoxifen and half received adjuvant chemotherapy (typically adriamycin and cytoxan). Clinical characteristics of the individual tumors are provided together with expression and array CGH profiles in the CaBIG repository and at http://graylabdata.lbl.gov.
Array CGH. Each sample, such as from Example 1, was analyzed using Scanning and OncoBAC arrays. Scanning arrays were comprised of 2464 BACs selected at approximately megabase intervals along the genome as described previously (Hodgson et al., 2001; Snijders et al., 2001). OncoBAC arrays were comprised of 1860 P1, PAC, or BAC clones. About three-quarters of the clones on the OncoBAC arrays contained genes and STSs implicated in cancer development or progression. All clones were printed in quadruplicate. DNA samples for array CGH were labeled generally as described previously (Hackett et al., 2003; Hodgson et al., 2001; Snijders et al., 2001). Briefly, 500 ng each of cancer and normal female genomic DNA sample was labeled by random priming with CY3- and CY5-dUTP, respectively; denatured; and hybridized with unlabeled Cot-1 DNA to CGH arrays. After hybridization, the slides were washed and imaged using a 16-bit CCD camera through CY3, CY5, and DAPI filters (Pinkel et al., 1998).
Statistical considerations. Data processing. Array CGH data image analyses were performed as described previously (Jain et al., 2002). In this process, an array probe was assigned a missing value for an array if there were fewer than 2 valid replicates or the standard deviation of the replicates exceeded 0.2. Array probes missing in more than 50% of samples in OncoBAC or Scanning array datasets were excluded in the corresponding set. Array probes representing the same DNA sequence were averaged within each dataset and then between the two datasets. Finally, the two datasets were combined and the array probes missing in more than 25% of the samples, unmapped array probes and probes mapped to chromosome Y were eliminated. The final dataset contained 2149 unique probes.
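The replicate and missing-value filters described above can be sketched as follows. This is a minimal illustration, not the processing code used in the study: the function names are hypothetical, the use of the sample standard deviation and of the replicate mean as the probe summary are assumptions, and only the thresholds (fewer than 2 valid replicates, standard deviation above 0.2, more than 50% or 25% of samples missing) come from the text.

```python
from statistics import mean, stdev

MIN_VALID_REPLICATES = 2  # from the text: fewer than 2 valid replicates -> missing
MAX_REPLICATE_SD = 0.2    # from the text: replicate SD above 0.2 -> missing


def summarize_probe(replicates):
    """Return one value per probe per array, or None when the probe is missing.

    `replicates` holds the replicate log2 ratios of one probe on one array,
    with None marking an invalid spot. Averaging the valid replicates is an
    assumption made for this sketch.
    """
    valid = [r for r in replicates if r is not None]
    if len(valid) < MIN_VALID_REPLICATES:
        return None
    if stdev(valid) > MAX_REPLICATE_SD:
        return None
    return mean(valid)


def drop_sparse_probes(matrix, max_missing_fraction):
    """Keep probes whose fraction of missing samples does not exceed the cutoff.

    `matrix` maps a probe name to its per-sample values (None = missing).
    """
    return {
        probe: values
        for probe, values in matrix.items()
        if sum(v is None for v in values) / len(values) <= max_missing_fraction
    }
```

Under these assumptions, `drop_sparse_probes(data, 0.5)` would correspond to the per-dataset filter and `drop_sparse_probes(data, 0.25)` to the filter applied after the two datasets are combined.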
Expression profiling using the Affymetrix High Throughput Analysis (HTA) system. Expression array analysis using the GeneChip® assay is implemented on the Affymetrix HTA system in four automated procedures: target preparation, hybridization, washing/staining and scanning.
Target preparation. For each sample, the RNA target is prepared by putting 2.5 μg of total RNA in 5 μl water and 5 μl of 10 μM T7(dT)24 primer into an MJ Research 96-well reaction plate. The total RNA undergoes an annealing step at 70° C. for 10 minutes followed by a 4° C. cooling step for 5 minutes. The plate is transferred back to the deck position and undergoes first strand cDNA synthesis. 10 μl of First Strand cDNA Synthesis cocktail (4 μl of Affymetrix 5× 1st strand buffer (250 mM Tris-HCl, pH 8.3 at room temperature; 375 mM KCl; 15 mM MgCl2) mixed with 2 μl 0.1 M DTT, 1 μl 10 mM dNTP mix, 1 μl Superscript II (200 U/μl), and 2 μl nuclease free water per reaction) is added, and the plate is then transferred to the thermal cycler and incubated at 42° C. for 60 minutes and 4° C. for 5 minutes. 91 μl of nuclease free water and 39 μl of the Second Strand cDNA Synthesis cocktail (30 μl of Affymetrix 5× 2nd strand buffer (100 mM Tris-HCl (pH 6.9), 23 mM MgCl2, 450 mM KCl, 0.75 mM β-NAD, 50 mM (NH4)2SO4); 3 μl 10 mM dNTP; 1 μl 10 unit/μl DNA Ligase; 4 μl 10 unit/μl DNA Polymerase; and 1 μl 2 units/μl RNase H) are added. The plate is incubated at 16° C. for 120 minutes and 4° C. for 5 minutes. 4 μl of T4 Polymerase cocktail comprised of 2 μl T4 DNA Polymerase plus 2 μl 1× T4 DNA Polymerase Buffer (165 mM Tris-acetate (pH 7.9), 330 mM Sodium-acetate, 50 mM Magnesium-acetate, 5 mM DTT) is added and the plate is taken back to the thermal cycler where it is cycled at 16° C. for 10 minutes, 72° C. for 10 minutes, and cooled to 4° C. for 5 minutes.
The plate is transferred back to the deck and Agencourt Magnetic Beads are used for the cDNA clean-up. 162 μl of magnetic beads are mixed with 90 μl of cDNA in the cDNA Clean-Up Plate and incubated for 5 minutes. Post incubation, the cDNA bound to the beads in the cDNA Clean-Up Plate is moved to the Agencourt magnetic plate. Another 115 μl of magnetic beads is mixed with 64 μl of cDNA, incubated for 5 minutes, and then moved to the Agencourt magnetic plate. Post incubation, the supernatant is removed and two washes with 75% EtOH are performed using 200 μl of solution. The EtOH is then removed and the beads sit for 5 minutes. 40 μl of nuclease free water is added to the beads and mixed well. The solution is then incubated for 1 minute, and then it is taken back to the magnetic plate where it is incubated for 5 minutes to capture the beads on the magnet. 22 μl of eluted cDNA is then transferred to the Purified cDNA Plate (22 μl total volume). 38 μl of IVT cocktail (6 μl 10× IVT Buffer, 18 μl HTA RLR Reagent (labeling NTP), 6 μl HTA Enzyme Mix, 1 μl T7 RNA Polymerase, and 7 μl RNase free water per reaction) is added to the 22 μl of purified cDNA (60 μl total volume). The plate is then transferred to the thermal cycler, where it is incubated for 8 hours at 37° C.
Upon completion, the plate is transferred back to the deck where 120 μl of Agencourt Magnetic Beads are used to clean up the cRNA product. The A260 of the purified cRNA is measured in a plate spectrophotometer, then the concentration in each well of a 96 well plate is adjusted to a calculated value of 0.625 μg/μl. A second reading is taken to verify the normalization process. 30 μl of cRNA is transferred from the cRNA Normalization Plate and dispensed in the Fragmented cRNA Plate. 7.5 μl of 5× fragmentation buffer per sample is added. The plate is then transferred to the thermal cycler where it is held at 94° C. for 35 minutes followed by a cooling step at 20° C. for 5 minutes. The sample is then mixed with 90 μl of hybridization cocktail (3 μl of 20× bioB, bioC, bioD, and creX hybridization controls mixed with 1.6 μl 3 nM oligo-B2, 1 μl 10 mg/ml Herring sperm DNA, 1 μl 50 mg/ml acetylated BSA, and 83.4 μl 1.2× Hybridization Buffer).
Hybridization. The sample is then ready to be hybridized. The peg array plate is incubated in 60 μl pre-hybridization cocktail (1 μl 10 mg/ml Herring sperm DNA, 1 μl 50 mg/ml Acetylated BSA, 84 μl Hybridization buffer, 15 μl nuclease free H2O per reaction). The hybridization-ready sample is taken to the thermal cycler and denatured at 95° C. for 5 minutes. Upon completion of this step, the plate is returned to the deck where 70 μl of sample is transferred to a hybridization tray. The peg plate is then lifted off of the pre-hybridization tray and placed on the hybridization plate. This “hybridization sandwich” is then manually transferred to a hybridization oven where it incubates at 48° C. for 16-18 hours.
Washing/Staining. The robot lifts the peg plate off of the hybridization tray and transfers it to the first low stringency wash (LSW) (6×SSPE, 0.01% Tween-20) where it is dip-washed 36 times. The plate is then transferred to the other three low stringency wash positions where the dipping is repeated. The peg plate is then moved to the high stringency wash (HSW) (100 mM MES, 0.1 M NaCl, 0.01% Tween-20) where it is incubated at 41° C. for 25 minutes. After the incubation, the peg plate is transferred to a fifth LSW tray where the HSW is removed by rinsing. The plate is transferred to the first stain (31.5 μl nuclease free H2O, 35 μl 2× MES stain buffer, 2.8 μl 50 mg/ml Acetylated BSA, 0.7 μl R-Phycoerythrin Streptavidin), where it incubates at room temperature for 10 minutes. At the end of the 10 minute incubation, the peg plate undergoes another 4 cycles of dip washing. The peg tray is then transferred to stain 2 (2.8 μl 50 mg/ml Acetylated BSA, 0.7 μl reagent grade goat IgG, 0.4 μl biotinylated goat Anti-streptavidin antibody per reaction). The above method is repeated for stain 3 (31.5 μl nuclease free H2O, 35 μl 2× MES stain buffer, 2.8 μl 50 mg/ml Acetylated BSA, 0.7 μl R-Phycoerythrin Streptavidin). At the end of the incubation of the third stain, the peg plate is washed 36 times in LSW. The robot then transfers 70 μl of MES holding buffer (68 mM MES, 1.0 M NaCl, 0.01% Tween-20) into a sterile scan tray. The peg tray is then placed into the scan tray for scanning.
Scanning. The 96 well peg plate is scanned by the Affymetrix High Throughput (HT) scanner, a fully automated epi-fluorescence imaging system with an excitation wavelength range of 340 nm to 675 nm and a cooled 1280×1024 CCD camera with 12 bit readout. Scanning resolution is 1.0 μm/pixel with a 10× objective. Images are captured at two different exposure times. Each well will have 49 sub-images/exposure times. The software program then converts these .dat files into mini .cel files and then into composite cel files where the information is analyzed in the Affymetrix GCOS 1.2 software.
Statistical considerations. Data processing. For Affymetrix data, multi-chip robust normalization was performed using RMA software (Irizarry et al., 2003). Transcripts assessed on the arrays were classified into two groups using Gaussian model-based clustering by considering the joint distribution of the median and standard deviation of each probe set across samples. During this process, computational demands were reduced by randomly sampling and clustering 2000 probe intensities using mclust (Yeung et al., 2001; Yeung et al., 2004) with two clusters and unequal variance. Next, the remaining probe intensities were classified into the newly created clusters using linear discriminant analysis. The cluster containing probe intensities with smaller mean and variance was defined as “not expressed” and the second cluster was “expressed”.
Characterizing copy number changes. The array CGH data were analyzed using circular binary segmentation (CBS) (Olshen et al., 2004) to translate intensity measurements into regions of equal copy number as implemented in the DNAcopy R/Bioconductor package. Missing values for probes mapping within segmented regions of equal copy number were imputed by using the value of the corresponding segment. A few probes with missing values (<0.3%) were located between segmented regions and their values were imputed using the maximum value of the two flanking segments. Thus, each probe was assigned a segment value referred to as its “smoothed” value. The scaled median absolute deviation (MAD) of the difference between the observed and smoothed values was used to estimate the tumor-specific experimental variation. All tumors had a noise standard deviation of less than 0.2. The gain and loss status for each probe was assigned using the mergeLevels procedure as described (Willenbrock and Fridlyand, 2005). In this process, segmental values across the genome were merged to create a common set of copy number levels for each individual tumor. The probes corresponding to the copy number level with the smallest absolute median value were declared unchanged whereas all the other probes were either gained or lost depending on the sign of the segment mean. Additionally, because a high level focal aberration that is a single outlier would otherwise be assigned the status of the surrounding segment, such a probe was assigned gain status when amplified as described below.
The frequency of alterations at each probe locus was computed as the proportion of samples showing an aberration at that locus. The genome distance assigned to each probe was computed by assigning a genomic distance equal to half the distance to the neighboring probes or to the end of a chromosome for the probes with only one neighbor. The number of copy number transitions was computed based on the initial DNAcopy segmentation by counting the number of copy number transitions in the genome (Snijders et al., 2003). Single outliers such as high level amplifications were identified by assigning the original observed log2 ratio to the probes for which the observed values were more than 4 tumor-specific MAD away from the smoothed values. The amplification status for a probe was then determined by considering the width of the segment to which that probe belonged (0 if an outlier) and a minimum difference between the smoothed value of the probe (observed value if an outlier) and the segment means of the neighboring segments. The clone was declared amplified if it belonged to a segment spanning less than 20 Mb and the minimum difference was greater than exp(−x^3), where x is the final smoothed value for the clone. Note that this allowed clones with a small log2 ratio to be declared amplified if they were high relative to the surrounding clones, with the required difference becoming larger as the value of the clone gets smaller (e.g. a difference of 1 was required when the clone value was 0 and 0.36 when the clone value was 1; Albertson, Fridlyand, private communication).
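The noise estimate and the between-segment imputation rule above can be illustrated with a short sketch. The function names are hypothetical, and the scale constant 1.4826 (the usual factor that makes the MAD consistent with a normal standard deviation) is an assumption; the text says only that the MAD was scaled.

```python
from statistics import median

MAD_SCALE = 1.4826  # assumption: normal-consistency constant; not stated in the text


def scaled_mad(observed, smoothed):
    """Estimate tumor-specific noise from observed minus smoothed log2 ratios.

    Probes with a missing observed value (None) are skipped.
    """
    residuals = [o - s for o, s in zip(observed, smoothed) if o is not None]
    center = median(residuals)
    return MAD_SCALE * median([abs(r - center) for r in residuals])


def impute_between_segments(left_segment_value, right_segment_value):
    """Impute a missing probe lying between two segments.

    Per the text, the larger of the two flanking segment values is used.
    """
    return max(left_segment_value, right_segment_value)
```

For example, residuals of 0.1, −0.1, 0.3 and 0.0 have a median of 0.05 and a median absolute deviation of 0.1, giving a scaled estimate of about 0.148 under the assumed constant.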
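A minimal sketch of the outlier and amplification rules follows. It assumes the threshold is exp(−x^3), which reproduces both worked values given in the text (1 at x = 0 and about 0.37 at x = 1); the function names and argument layout are hypothetical.

```python
import math

MAX_SEGMENT_WIDTH_MB = 20.0  # from the text: segment must span less than 20 Mb
OUTLIER_MAD_MULTIPLE = 4.0   # from the text: more than 4 tumor-specific MAD


def is_outlier(observed, smoothed, tumor_mad):
    """Flag a probe whose observed value is far from its smoothed value."""
    return abs(observed - smoothed) > OUTLIER_MAD_MULTIPLE * tumor_mad


def required_difference(x):
    """Minimum difference from neighboring segments, assumed to be exp(-x^3)."""
    return math.exp(-(x ** 3))


def is_amplified(value, segment_width_mb, neighbor_segment_means):
    """Apply the width and minimum-difference criteria to one clone.

    `value` is the final smoothed value (the observed value for an outlier),
    and `segment_width_mb` is 0 for an outlier.
    """
    if segment_width_mb >= MAX_SEGMENT_WIDTH_MB:
        return False
    min_difference = min(value - m for m in neighbor_segment_means)
    return min_difference > required_difference(value)
```

With this form, a clone at 1.5 on a 2 Mb segment flanked by segments at 0.1 and 0.0 is called amplified, whereas a clone at 0.2 that sits 0.1 above its neighbors is not, because nearly a full unit of difference is required at low values.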
Clustering of genome copy number profiles. Genome copy number profiles were clustered using smoothed imputed data with outliers present. Agglomerative hierarchical clustering with Pearson correlation as a similarity measure and the Ward method to minimize sum of variances were used to produce compact spherical clusters (Hartigan, 1975). The number of groups was assessed qualitatively by considering the shape of the clustering dendrogram.
Expression subtype assignment. Tumors were classified according to expression phenotype (basal, ERBB2, luminal A, luminal B and normal-like) by assigning each tumor to the subtype of the cluster defined by hierarchical clustering of expression profiles for 122 samples published by Sorlie et al. (Sorlie et al., 2003) to which it had the highest Pearson correlation. The correlation was computed using the subset of Stanford intrinsically variable genes common to both datasets. Unigene IDs were used to match the probes; data for genes with non-unique Unigene IDs were averaged, and the genes were median-centered for both datasets. For robustness, only 79 of the most tightly clustered Stanford samples were used to define Stanford cluster centroids. Unigene IDs for Affymetrix data were obtained from the TIGR Resourcer website, http://pga.tigr.org/tigr-scripts/magic/rl.pl. The Stanford intrinsic genes list was downloaded from http://genome-www.stanford.edu/breast_cancer/robustness/data.shtml. The same procedure was used to assign expression subtypes to the 295 breast tumors dataset published by van de Vijver et al. (van de Vijver et al., 2002) downloaded from http://www.rii.com/publications/2002/default.html, except that matching was done directly using gene names.
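The centroid-based assignment described above amounts to picking the subtype whose centroid has the highest Pearson correlation with the tumor profile. The sketch below is illustrative only: the function names are hypothetical, and the centroid values in the usage example are toy numbers rather than the Stanford centroids.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mean_x, mean_y = mean(x), mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def assign_subtype(profile, centroids):
    """Return the name of the centroid most correlated with `profile`.

    `profile` and every centroid must list the same genes in the same order,
    already median-centered.
    """
    return max(centroids, key=lambda name: pearson(profile, centroids[name]))


# Toy centroids over four matched genes (hypothetical values).
toy_centroids = {
    "basal": [1.0, -1.0, 0.5, -0.5],
    "luminal A": [-1.0, 1.0, -0.5, 0.5],
}
print(assign_subtype([0.8, -0.9, 0.4, -0.2], toy_centroids))
```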
Association of copy number with survival. Stage 4 samples were excluded from all the outcome-related analyses; and disease-specific survival and time to distant recurrence were used as the two endpoints. We identified clinical variables independently associated with outcome endpoints by first using univariate Cox-proportional hazards model to identify clinical variables individually associated with the outcomes and then identifying the subset of variables significant in the additive multivariate model which included all significant variables from univariate analyses. Significance was declared at the 0.05 level. As demonstrated in (Willenbrock and Fridlyand, 2005), analyzing segmented data greatly increases power to detect true significant associations without increasing the false positive rate. Therefore, we used smoothed imputed data with outliers as described above to identify significant associations of low-level copy number changes with outcome endpoints. P-values were adjusted using False Discovery Rate (FDR) and a genome association was considered significant if its FDR was less than 0.05. A Cox proportional model also was used to associate the total number of copy number transitions and amount of genome gained and lost with survival; overall and within expression subtypes. P-values were not adjusted for FDR for these two analyses due to their targeted nature and significance was declared at the 0.05 level.
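The FDR adjustment step can be sketched as below. The text does not name the FDR procedure; the Benjamini-Hochberg step-up method is assumed here because it is the most common choice, and the function name is hypothetical. The per-probe Cox p-values are taken as given inputs.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the text says only that p-values were adjusted using FDR.
    """
    m = len(pvalues)
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # 1-based rank among ascending p-values
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significant_probes(pvalues, fdr_cutoff=0.05):
    """Indices of probes whose adjusted p-value is below the cutoff."""
    return [i for i, q in enumerate(benjamini_hochberg(pvalues)) if q < fdr_cutoff]
```

For raw p-values of 0.01, 0.04, 0.03 and 0.5, the adjusted values are 0.04, about 0.053, about 0.053 and 0.5, so only the first association would pass the 0.05 cutoff.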
Regions of high level amplification were declared recurrent when present in at least 5 samples. The BAC array probes were further manually grouped to form groups of contiguous regions thereby referred to as amplicons, and singletons were excluded. Each sample was further classified as amplified for a given amplicon if it contained at least one amplified probe in the amplicon region. We tested all amplicons for association with the outcome variables by fitting univariate and multivariate Cox-proportional models with and without clinical variables and assessing significance of the standardized Cox-proportional coefficient. Significance was declared at unadjusted p-value<0.05.
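The recurrence rule and the per-sample amplicon call can be expressed compactly. In this sketch the function names and the probe and sample identifiers are hypothetical; only the thresholds (at least 5 samples, at least one amplified probe in the amplicon) come from the text, and the manual grouping of probes into amplicons is not modeled.

```python
from collections import Counter

MIN_RECURRENT_SAMPLES = 5  # from the text: present in at least 5 samples


def recurrent_probes(amplified_by_sample):
    """Return probes amplified in at least MIN_RECURRENT_SAMPLES samples.

    `amplified_by_sample` maps a sample name to the probes amplified in it.
    """
    counts = Counter(
        probe
        for probes in amplified_by_sample.values()
        for probe in set(probes)
    )
    return {probe for probe, n in counts.items() if n >= MIN_RECURRENT_SAMPLES}


def is_sample_amplified(sample_probes, amplicon_probes):
    """A sample is amplified for an amplicon if any amplicon probe is amplified."""
    return bool(set(sample_probes) & set(amplicon_probes))
```

The resulting per-sample amplicon status is the binary covariate that would then enter the univariate and multivariate Cox models.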
Association of copy number with expression. The presence of an overall dosage effect was assessed by subdividing each chromosomal arm into non-overlapping 20 Mb bins and computing the average of cross-Pearson-correlations for all gene transcript-BAC probe pairs that mapped to that bin. We also calculated Pearson correlations and corresponding p-values between expression level and copy number for each gene transcript. Each transcript was assigned an observed copy number of the nearest mapped BAC array probe. 80% of gene transcripts had a nearest clone within 1 Mbp and 50% had a clone within 400 kbp. Correlation between expression and copy number was only computed for the gene transcripts whose absolute assigned copy number exceeded 0.2 in at least 5 samples. This was done to avoid spurious correlations in the absence of real copy number changes. We used conservative Holm p-value adjustment to correct for multiple testing. Gene transcripts with an adjusted p-value<0.05 were considered to have expression levels that were highly significantly affected by gene dosage. This corresponded to a minimum Pearson correlation of 0.44.
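Two steps in this paragraph lend themselves to a short sketch: the pre-filter that requires an absolute copy number above 0.2 in at least 5 samples, and the Holm step-down adjustment. The function names are hypothetical, and the correlation p-values are taken as given inputs rather than computed here.

```python
def has_copy_number_change(copy_numbers, threshold=0.2, min_samples=5):
    """True if |copy number| exceeds `threshold` in at least `min_samples` samples."""
    return sum(abs(c) > threshold for c in copy_numbers) >= min_samples


def holm_adjust(pvalues):
    """Return Holm step-down adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for position, index in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - position) * pvalues[index])
        running_max = max(running_max, candidate)
        adjusted[index] = running_max
    return adjusted
```

Holm's method controls the family-wise error rate, which is why the text calls it conservative: with p-values of 0.01, 0.04 and 0.03 the adjusted values are 0.03, 0.06 and 0.06, so only the first transcript would pass the 0.05 cutoff.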
Associations of transcription and CNA in regions of amplification with outcome in tumors without particular amplicons. We assessed the associations of levels of transcripts in regions of amplifications with survival or distant recurrence in tumors without amplifications in order to find genes that might contribute to progression when deregulated by mechanisms other than amplification (e.g. we assessed associations between expression levels of the genes mapping to the 8p11-12 amplicon and survival in samples without 8p11-12 amplification). We performed separate Cox-proportional regressions for disease-specific survival and distant recurrence. Stage 4 samples were excluded from all analyses.
Testing for functional enrichment. We used the gene ontology statistics tool, GoStat (Beissbarth and Speed, 2004) to test whether gene transcripts with strongest dosage effects were enriched for particular functional groups. The p-values were adjusted using False Discovery Rate. The categories were considered significantly overrepresented if the FDR-adjusted p-value was less than 0.001. Since expressed genes were significantly more likely to show dosage effects than non expressed genes (p-value <2.2e-16, Wilcoxon rank sum test), GoStat comparisons were performed only for expressed genes. Specifically, GO categories for 1734 expressed probes with significant dosage effect (Holm p-value<0.05) were compared with those for 3026 expressed probes with no dosage effect (Pearson correlation<0.1).
Probe Preparation. Methods of preparing probes are well known to those of skill in the art (see, e.g. Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd ed.), Vols. 1-3, Cold Spring Harbor Laboratory, (1989) or Current Protocols in Molecular Biology, F. Ausubel et al., ed. Greene Publishing and Wiley-Interscience, New York (1987)), which are hereby incorporated by reference.
Prior to use, constructs are fragmented to provide smaller nucleic acid fragments that easily penetrate the cell and hybridize to the target nucleic acid. Fragmentation can be by any of a number of methods well known to those of skill in the art. Preferred methods include treatment with a restriction enzyme to selectively cleave the molecules, or alternatively brief heating of the nucleic acids in the presence of Mg2+. Probes are preferably fragmented to an average fragment length ranging from about 50 bp to about 2000 bp, more preferably from about 100 bp to about 1000 bp and most preferably from about 150 bp to about 500 bp.
Methods of labeling nucleic acids are well known to those of skill in the art. Preferred labels are those that are suitable for use with in situ hybridization. The nucleic acid probes may be detectably labeled prior to the hybridization reaction. Alternatively, a detectable label which binds to the hybridization product may be used. Such detectable labels include any material having a detectable physical or chemical property and have been well-developed in the field of immunoassays.
As used herein, a “label” is any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. Useful labels in the present invention include radioactive labels (e.g., 32P, 125I, 14C, 3H, and 35S), fluorescent dyes (e.g. fluorescein, rhodamine, Texas Red, etc.), electron-dense reagents (e.g. gold), enzymes (as commonly used in an ELISA), colorimetric labels (e.g. colloidal gold), magnetic labels (e.g. DYNABEADS™), and the like. Examples of labels which are not directly detected but are detected through the use of a directly detectable label include biotin and digoxigenin as well as haptens and proteins for which labeled antisera or monoclonal antibodies are available.
The particular label used is not critical to the present invention, so long as it does not interfere with the in situ hybridization of the stain. However, stains directly labeled with fluorescent labels (e.g. fluorescein-12-dUTP, Texas Red-5-dUTP, etc.) are preferred for chromosome hybridization.
A direct labeled probe, as used herein, is a probe to which a detectable label is attached. Because the direct label is already attached to the probe, no subsequent steps are required to associate the probe with the detectable label. In contrast, an indirect labeled probe is one which bears a moiety to which a detectable label is subsequently bound, typically after the probe is hybridized with the target nucleic acid.
In addition, the label must be detectable in as low a copy number as possible, thereby maximizing the sensitivity of the assay, and yet be detectable above any background signal. Finally, a label must be chosen that provides a highly localized signal, thereby providing a high degree of spatial resolution when physically mapping the stain against the chromosome. Particularly preferred fluorescent labels include fluorescein-12-dUTP and Texas Red-5-dUTP.
The labels may be coupled to the probes in a variety of means known to those of skill in the art. In a preferred embodiment the nucleic acid probes will be labeled using nick translation or random primer extension (Rigby, et al. J. Mol. Biol., 113: 237 (1977) or Sambrook, et al., Molecular Cloning—A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1985)).
One of skill in the art will appreciate that the probes of this invention need not be absolutely specific for the targeted 8p11-12, 11q13-14, 17q11-12 or 20q13 regions of the genome. Rather, the probes are intended to produce “staining contrast”. “Contrast” is quantified by the ratio of the probe intensity of the target region of the genome to that of the other portions of the genome. For example, a DNA library produced by cloning a particular chromosome (e.g. chromosome 7) can be used as a stain capable of staining the entire chromosome. The library contains both sequences found only on that chromosome, and sequences shared with other chromosomes. Roughly half the chromosomal DNA falls into each class. If hybridization of the whole library were capable of saturating all of the binding sites on the target chromosome, the target chromosome would be twice as bright (contrast ratio of 2) as the other chromosomes since it would contain signal from both the specific and the shared sequences in the stain, whereas the other chromosomes would only be stained by the shared sequences. Thus, only a modest decrease in hybridization of the shared sequences in the stain would substantially enhance the contrast. Likewise, contaminating sequences which only hybridize to non-targeted sequences, for example, impurities in a library, can be tolerated in the stain to the extent that the sequences do not reduce the staining contrast below useful levels.
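The contrast arithmetic in the preceding paragraph can be illustrated with a short calculation. The following Python sketch is illustrative only: the function name and the `shared_efficiency` parameter (the fraction of shared-sequence hybridization that remains, for example after blocking) are assumptions introduced here and are not part of the described method.

```python
def staining_contrast(specific_fraction=0.5, shared_fraction=0.5,
                      shared_efficiency=1.0):
    """Ratio of target-chromosome signal to signal on the other chromosomes.

    The target chromosome is stained by specific + shared sequences;
    the other chromosomes are stained by shared sequences only.
    shared_efficiency is a hypothetical scaling factor for how much of
    the shared-sequence hybridization remains.
    """
    shared_signal = shared_fraction * shared_efficiency
    if shared_signal <= 0:
        raise ValueError("shared signal must be positive to define a ratio")
    return (specific_fraction + shared_signal) / shared_signal


# Half specific, half shared, full hybridization of both classes.
print(staining_contrast())                        # 2.0
# Halving the hybridization of the shared sequences raises the contrast.
print(staining_contrast(shared_efficiency=0.5))   # 3.0
```

Under these assumptions, halving the shared-sequence hybridization raises the contrast ratio from 2 to 3, which is consistent with the statement that a modest decrease substantially enhances contrast.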
In situ Hybridization. Generally, in situ hybridization comprises the following major steps: (1) fixation of the tissue or biological structure to be analyzed; (2) prehybridization treatment of the biological structure to increase accessibility of target DNA, and to reduce nonspecific binding; (3) hybridization of the mixture of nucleic acids to the nucleic acid in the biological structure or tissue; (4) posthybridization washes to remove nucleic acid fragments not bound in the hybridization; and (5) detection of the hybridized nucleic acid fragments. The reagents used in each of these steps and their conditions for use vary depending on the particular application.
In some applications it is necessary to block the hybridization capacity of repetitive sequences. In this case, human genomic DNA is used as an agent to block such hybridization. The preferred size range is from about 200 bp to about 1000 bases, more preferably from about 400 to about 800 bp for double stranded, nick translated nucleic acids.
Hybridization protocols for the particular applications disclosed here are described in Pinkel et al. Proc. Natl. Acad. Sci. USA, 85: 9138-9142 (1988) and in EPO Pub. No. 430,402. Suitable hybridization protocols can also be found in Methods in Molecular Biology Vol. 33, In Situ Hybridization Protocols, K. H. A. Choo, ed., Humana Press, Totowa, N.J., (1994). In a particularly preferred embodiment, the hybridization protocol of Kallioniemi et al., ERBB2 amplification in breast cancer analyzed by fluorescence in situ hybridization. Proc Natl Acad Sci USA, 89: 5321-5325 (1992) is used.
Typically, it is desirable to use dual color FISH, in which two probes are utilized, each labeled by a different fluorescent dye. A test probe that hybridizes to the region of interest is labeled with one dye, and a control probe that hybridizes to a different region is labeled with a second dye. A nucleic acid that hybridizes to a stable portion of the chromosome of interest, such as the centromere region, is often most useful as the control probe. In this way, differences between efficiency of hybridization from sample to sample can be accounted for.
The FISH methods for detecting chromosomal abnormalities can be performed on nanogram quantities of the subject nucleic acids. Paraffin embedded tumor sections can be used, as can fresh or frozen material. Because FISH can be applied to limited material, touch preparations prepared from uncultured primary tumors can also be used (see, e.g., Kallioniemi, A. et al., Cytogenet. Cell Genet. 60: 190-193 (1992)). For instance, small biopsy tissue samples from tumors can be used for touch preparations. Small numbers of cells obtained from aspiration biopsy or cells in bodily fluids (e.g., blood, urine, sputum and the like) can also be analyzed. For prenatal diagnosis, appropriate samples will include amniotic fluid and the like.
Quantitative PCR. Elevated gene expression is detected using quantitative PCR. Primers can be created to detect sequence amplification by signal amplification in gel electrophoresis. As is known in the art, primers or oligonucleotides are generally 15-40 bp in length, and usually flank unique sequence that can be amplified by methods such as polymerase chain reaction (PCR) or reverse transcriptase PCR (RT-PCR, also known as real-time PCR). Methods for RT-PCR and its optimization are known in the art. An example is the PROMEGA PCR Protocols and Guides, found at URL:<http://www.promega.com/guides/per guide/default.htm>, and hereby incorporated by reference. Currently at least four different chemistries, TaqMan® (Applied Biosystems, Foster City, Calif., USA), Molecular Beacons, Scorpions® and SYBR® Green (Molecular Probes), are available for real-time PCR. All of these chemistries allow detection of PCR products via the generation of a fluorescent signal. TaqMan probes, Molecular Beacons and Scorpions depend on Förster Resonance Energy Transfer (FRET) to generate the fluorescence signal via the coupling of a fluorogenic dye molecule and a quencher moiety to the same or different oligonucleotide substrates. SYBR Green is a fluorogenic dye that exhibits little fluorescence when in solution, but emits a strong fluorescent signal upon binding to double-stranded DNA.
Two strategies are commonly employed to quantify the results obtained by real-time RT-PCR: the standard curve method and the comparative threshold method. In the standard curve method, a standard curve is first constructed from an RNA of known concentration. This curve is then used as a reference standard for extrapolating quantitative information for mRNA targets of unknown concentrations. The other quantitation approach is termed the comparative Ct method. This involves comparing the Ct values of the samples of interest with a control or calibrator such as a non-treated sample or RNA from normal tissue. The Ct values of both the calibrator and the samples of interest are normalized to an appropriate endogenous housekeeping gene.
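Both quantitation strategies reduce to short calculations, sketched below in Python. This is a minimal illustration rather than a prescribed analysis: all Ct values and quantities are made-up numbers, the function names are assumptions introduced here, and the 2^-ΔΔCt relation is the commonly used form of the comparative Ct calculation, which assumes that the amount of product roughly doubles each cycle (an assumption not stated in the text above).

```python
import math


def fold_change_comparative_ct(ct_target_sample, ct_ref_sample,
                               ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the comparative Ct method.

    Each target Ct is first normalized to the housekeeping (reference)
    gene, then the sample is compared with the calibrator.
    Assumes roughly 100% amplification efficiency (doubling per cycle).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, slope, intercept):
    """Read an unknown quantity off the fitted standard curve."""
    return 10 ** ((ct - intercept) / slope)


# Comparative Ct with hypothetical values: the sample's normalized Ct is
# 3 cycles lower than the calibrator's, i.e. about 8-fold higher expression.
print(fold_change_comparative_ct(22.0, 18.0, 25.0, 18.0))

# Standard curve from a hypothetical 10-fold dilution series.
slope, intercept = fit_standard_curve([1, 10, 100, 1000],
                                      [30.0, 27.0, 24.0, 21.0])
print(quantity_from_ct(24.0, slope, intercept))
```

The standard curve route yields quantities in the units of the standards, while the comparative Ct route yields only a fold change relative to the calibrator.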
High Throughput Screening. High throughput screening (HTS) methods are used to identify compounds that inhibit candidate genes which are related to drug resistance and reduced survival rate. HTS methods involve providing a combinatorial chemical or peptide library containing a large number of potential therapeutic compounds. Such “libraries” are then screened in one or more assays, as described herein, to identify those library members (particular peptides, chemical species or subclasses) that display the desired characteristic activity. The compounds thus identified can serve as conventional “lead compounds” or can themselves be used as potential or actual therapeutics.
A combinatorial chemical library is a collection of diverse chemical compounds generated by either chemical synthesis or biological synthesis, by combining a number of chemical “building blocks” such as reagents. For example, a linear combinatorial chemical library such as a polypeptide library is formed by combining a set of chemical building blocks (amino acids) in every possible way for a given compound length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical compounds can be synthesized through such combinatorial mixing of chemical building blocks.
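The size of a linear combinatorial library follows directly from the number of building blocks and the compound length. The Python sketch below is illustrative only; the function names are assumptions introduced here, and the enumeration is a conceptual model of "combining in every possible way", not a description of any synthesis device.

```python
from itertools import product

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length, n_building_blocks=len(AMINO_ACIDS)):
    """Number of distinct linear compounds of the given length."""
    return n_building_blocks ** length


def enumerate_library(building_blocks, length):
    """Lazily yield every linear combination of the building blocks."""
    for combo in product(building_blocks, repeat=length):
        yield "".join(combo)


# All pentapeptides from 20 amino acids: 20**5 members.
print(library_size(5))                      # 3200000
# A tiny two-letter, length-3 library enumerated explicitly.
print(list(enumerate_library("AG", 3)))
```

With 20 amino acids, a library of pentapeptides already contains 3.2 million members, which matches the scale described above.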
Preparation and screening of combinatorial chemical libraries is well known to those of skill in the art. Such combinatorial chemical libraries include, but are not limited to, peptide libraries (see, e.g., U.S. Pat. No. 5,010,175, Furka, Int. J. Pept. Prot. Res. 37:487-493 (1991) and Houghton et al., Nature 354:84-88 (1991)). Other chemistries for generating chemical diversity libraries can also be used. Such chemistries include, but are not limited to: peptoids (e.g., PCT Publication No. WO 91/19735), encoded peptides (e.g., PCT Publication WO 93/20242), random bio-oligomers (e.g., PCT Publication No. WO 92/00091), benzodiazepines (e.g., U.S. Pat. No. 5,288,514), diversomers such as hydantoins, benzodiazepines and dipeptides (Hobbs et al., Proc. Nat. Acad. Sci. USA 90:6909-6913 (1993)), vinylogous polypeptides (Hagihara et al., J. Amer. Chem. Soc. 114:6568 (1992)), nonpeptidal peptidomimetics with glucose scaffolding (Hirschmann et al., J. Amer. Chem. Soc. 114:9217-9218 (1992)), analogous organic syntheses of small compound libraries (Chen et al., J. Amer. Chem. Soc. 116:2661 (1994)), oligocarbamates (Cho et al., Science 261:1303 (1993)), and/or peptidyl phosphonates (Campbell et al., J. Org. Chem. 59:658 (1994)), nucleic acid libraries (see Ausubel, Berger and Sambrook, all supra), peptide nucleic acid libraries (see, e.g., U.S. Patent 5,539,083), antibody libraries (see, e.g., Vaughn et al., Nature Biotechnology, 14(3):309-314 (1996) and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang et al., Science, 274:1520-1522 (1996) and U.S. Patent 5,593,853), small organic molecule libraries (see, e.g., benzodiazepines, Baum C&EN, January 18, page 33 (1993); isoprenoids, U.S. Pat. No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent 5,549,974; pyrrolidines, U.S. Pat. Nos. 5,525,735 and 5,519,134; morpholino compounds, U.S. Pat. No. 5,506,337; benzodiazepines, U.S. Pat. No. 5,288,514, and the like).
Devices for the preparation of combinatorial libraries are commercially available (see, e.g., ECIS™, Applied BioPhysics Inc., Troy, N.Y., MPS, 390 MPS, Advanced Chem Tech, Louisville Ky., Symphony, Rainin, Woburn, Mass., 433A Applied Biosystems, Foster City, Calif., 9050 Plus, Millipore, Bedford, Mass.). In addition, numerous combinatorial libraries are themselves commercially available (see, e.g., ComGenex, Princeton, N.J., Tripos, Inc., St. Louis, Mo., 3D Pharmaceuticals, Exton, Pa., Martek Biosciences, Columbia, Md., etc.).
Inhibitor Oligonucleotide and RNA interference (RNAi) Sequence Design. Known methods are used to identify sequences that inhibit candidate genes which are related to drug resistance and reduced survival rate. Such inhibitors may include but are not limited to, siRNA oligonucleotides, antisense oligonucleotides, peptide inhibitors and aptamer sequences that bind and act to inhibit PVT1 expression and/or function.
RNA interference is used to generate small double-stranded RNA (small interference RNA or siRNA) inhibitors to affect the expression of a candidate gene generally through cleaving and destroying its cognate RNA. Small interference RNA (siRNA) is typically 19-22 nt double-stranded RNA. siRNA can be obtained by chemical synthesis or by DNA-vector based RNAi technology. Using DNA vector based siRNA technology, a small DNA insert (about 70 bp) encoding a short hairpin RNA targeting the gene of interest is cloned into a commercially available vector. The insert-containing vector can be transfected into the cell, where it expresses the short hairpin RNA. The hairpin RNA is rapidly processed by the cellular machinery into 19-22 nt double stranded RNA (siRNA). In a preferred embodiment, the siRNA is inserted into a suitable RNAi vector because siRNA made synthetically tends to be less stable and not as effective in transfection.
siRNA can be made using methods and algorithms such as those described by Wang L, Mu F Y. (2004) A Web-based Design Center for Vector-based siRNA and siRNA cassette. Bioinformatics. (In press); Khvorova A, Reynolds A, Jayasena S D. (2003) Functional siRNAs and miRNAs exhibit strand bias. Cell. 115(2):209-16; Harborth J, Elbashir S M, Vandenburgh K, Manninga H, Scaringe S A, Weber K, Tuschl T. (2003) Sequence, chemical, and structural variation of small interfering RNAs and short hairpin RNAs and the effect on mammalian gene silencing. Antisense Nucleic Acid Drug Dev. 13(2):83-105; Reynolds A, Leake D, Boese Q, Scaringe S, Marshall W S, Khvorova A. (2004) Rational siRNA design for RNA interference. Nat Biotechnol. 22(3):326-30 and Ui-Tei K, Naito Y, Takahashi F, Haraguchi T, Ohki-Hamazaki H, Juni A, Ueda R, Saigo K. (2004) Guidelines for the selection of highly effective siRNA sequences for mammalian and chick RNA interference. Nucleic Acids Res. 32(3):936-48, which are hereby incorporated by reference.
Other tools for constructing siRNA sequences are web tools such as the siRNA Target Finder and Construct Builder available from GenScript (http://www.genscript.com), Oligo Design and Analysis Tools from Integrated DNA Technologies (URL:<http://www.idtdna.com/SciTools/SciTools.aspx>), or siDESIGN™ Center from Dharmacon, Inc. (URL:<http://design.dharmacon.com/default.aspx?source=0>). It is suggested that siRNA be built using the ORF (open reading frame) as the target selecting region, preferably 50-100 nt downstream of the start codon. Because siRNAs function at the mRNA level, not at the protein level, the precise target mRNA nucleotide sequence may be required to design an siRNA. Due to the degenerate nature of the genetic code and codon bias, it is difficult to accurately predict the correct nucleotide sequence from the peptide sequence. Additionally, since the function of siRNAs is to cleave mRNA sequences, it is important to use the mRNA nucleotide sequence and not the genomic sequence for siRNA design, although as noted in the Examples, the genomic sequence can be successfully used for siRNA design. However, designs using genomic information might inadvertently target introns and as a result the siRNA would not be functional for silencing the corresponding mRNA.
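The target-region guideline above (an ORF site 50-100 nt downstream of the start codon, 19-22 nt long) can be sketched as a simple window scan over the mRNA sequence. The Python sketch below is illustrative only and is not the algorithm of any of the cited tools; the function names, the toy sequence and the default length of 19 nt are assumptions introduced here, and no scoring or filtering rules are applied.

```python
def reverse_complement_rna(dna):
    """Antisense strand, written as RNA, for a DNA-alphabet target site."""
    pairs = {"A": "U", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def candidate_sirna_targets(mrna, start_codon_index, length=19,
                            window=(50, 100)):
    """Yield (offset, target_site, antisense) for each candidate site.

    offset is the distance in nt from the first base of the start codon
    to the first base of the target site.
    """
    mrna = mrna.upper().replace("U", "T")
    for offset in range(window[0], window[1] + 1):
        begin = start_codon_index + offset
        target = mrna[begin:begin + length]
        if len(target) < length:
            break  # ran off the end of the supplied sequence
        yield offset, target, reverse_complement_rna(target)


# Toy mRNA (hypothetical): a start codon followed by a repeating filler.
toy_mrna = "ATG" + "ACGT" * 40
sites = list(candidate_sirna_targets(toy_mrna, start_codon_index=0))
print(len(sites))   # number of candidate windows found
print(sites[0])     # first candidate: offset, target site, antisense strand
```

In practice each candidate would still need to be screened, for example for specificity as discussed in the following paragraph; that step is outside this sketch.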
Rational siRNA design should also minimize off-target effects which often arise from partial complementarity of the sense or antisense strands to an unintended target. These effects are known to have a concentration dependence and one way to minimize off-target effects is often by reducing siRNA concentrations. Another way to minimize such off-target effects is to screen the siRNA for target specificity.
The siRNA can be modified on the 5’-end of the sense strand to present compounds such as fluorescent dyes, chemical groups, or polar groups. Modification at the 5′-end of the antisense strand has been shown to interfere with siRNA silencing activity and therefore this position is not recommended for modification. Modifications at the other three termini have been shown to have minimal to no effect on silencing activity.
It is recommended that primers be designed to bracket one of the siRNA cleavage sites as this will help eliminate possible bias in the data (i.e., one of the primers should be upstream of the cleavage site, the other should be downstream of the cleavage site). Bias may be introduced into the experiment if the PCR amplifies either 5′ or 3′ of a cleavage site, in part because it is difficult to anticipate how long the cleaved mRNA product may persist prior to being degraded. If the amplified region contains the cleavage site, then no amplification can occur if the siRNA has performed its function.
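The bracketing recommendation can be expressed as a simple coordinate check. The Python sketch below is illustrative only; the function name, the half-open coordinate convention and the example positions are assumptions introduced here.

```python
def primers_bracket_cleavage(fwd_start, fwd_end, rev_start, rev_end,
                             cleavage_site):
    """True if the cleavage site lies between the two primers.

    Coordinates are 0-based positions on the mRNA; each primer occupies
    the half-open interval [start, end). The forward (upstream) primer
    must end at or before the cleavage site and the reverse (downstream)
    primer must begin at or after it.
    """
    if not (fwd_start < fwd_end <= rev_start < rev_end):
        raise ValueError("primers must be ordered and non-overlapping")
    return fwd_end <= cleavage_site <= rev_start


# Hypothetical primer positions around a cleavage site at position 150.
print(primers_bracket_cleavage(100, 120, 200, 220, cleavage_site=150))  # True
# The same primers with a cleavage site downstream of the amplicon.
print(primers_bracket_cleavage(100, 120, 200, 220, cleavage_site=250))  # False
```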
Antisense oligonucleotides (“oligos”) can be designed to inhibit candidate gene function. Antisense oligonucleotides are short single-stranded nucleic acids, which function by selectively hybridizing to their target mRNA, thereby blocking translation. Translation is inhibited by either RNase H nuclease activity at the DNA:RNA duplex, or by inhibiting ribosome progression, thereby inhibiting protein synthesis. This results in discontinued synthesis and subsequent loss of function of the protein for which the target mRNA encodes.
In a preferred embodiment, antisense oligos are phosphorothioated upon synthesis and purification, and are usually 18-22 bases in length. It is contemplated that the candidate gene antisense oligos may have other modifications such as 2′-O-Methyl RNA, methylphosphonates, chimeric oligos, modified bases and many other modifications, including fluorescent oligos.
In a preferred embodiment, active antisense oligos should be compared against control oligos that have the same general chemistry, base composition, and length as the antisense oligo. These can include inverse sequences, scrambled sequences, and sense sequences. The inverse and scrambled are recommended because they have the same base composition, thus same molecular weight and Tm as the active antisense oligonucleotides. Rational antisense oligo design should consider, for example, that the antisense oligos do not anneal to an unintended mRNA or do not contain motifs known to invoke immunostimulatory responses such as four contiguous G residues, palindromes of 6 or more bases and CG motifs.
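The control sequences and motif checks described above can be generated mechanically. The Python sketch below is illustrative only: the function names are assumptions introduced here, "palindrome" is interpreted in the nucleic-acid sense (a stretch equal to its own reverse complement), and the check for unintended mRNA annealing is not covered.

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(_COMPLEMENT)[::-1]


def control_oligos(antisense, seed=0):
    """Inverse, scrambled and sense controls for an antisense DNA oligo.

    The inverse and scrambled controls keep the same base composition
    as the active oligo; the seed only makes the shuffle reproducible.
    """
    antisense = antisense.upper()
    bases = list(antisense)
    random.Random(seed).shuffle(bases)
    return {
        "inverse": antisense[::-1],
        "scrambled": "".join(bases),
        "sense": reverse_complement(antisense),
    }


def flagged_motifs(oligo, min_palindrome=6):
    """List the motifs named in the text that occur in the oligo."""
    oligo = oligo.upper()
    flags = []
    if "GGGG" in oligo:
        flags.append("four contiguous G residues")
    if "CG" in oligo:
        flags.append("CG motif")
    # Any longer palindrome contains a min_palindrome-long one at its
    # centre, so scanning fixed-size windows is sufficient.
    for i in range(len(oligo) - min_palindrome + 1):
        window = oligo[i:i + min_palindrome]
        if window == reverse_complement(window):
            flags.append("palindrome of %d or more bases" % min_palindrome)
            break
    return flags


print(control_oligos("AACCGT"))
print(flagged_motifs("TTGGGGAATTCTT"))
```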
Antisense oligonucleotides can be used in vitro in most cell types with good results. However, some cell types require the use of transfection reagents to effect efficient transport into cellular interiors. It is recommended that optimization experiments be performed by using differing final oligonucleotide concentrations in the 1-5 μM range, in most cases with the addition of transfection reagents. The window of opportunity, i.e., that concentration where you will obtain a reproducible antisense effect, may be quite narrow, where above that range you may experience confusing non-specific, non-antisense effects, and below that range you may not see any results at all. In a preferred embodiment, down regulation of the targeted mRNA will be demonstrated by use of techniques such as northern blot, real-time PCR, cDNA/oligo array or western blot. The same endpoints can be measured in in vivo experiments, while also assessing behavioral endpoints.
For cell culture, antisense oligonucleotides should be re-suspended in sterile nuclease-free water (the use of DEPC-treated water is not recommended). Antisense oligonucleotides can be purified, lyophilized, and ready for use upon re-suspension. Upon suspension, antisense oligonucleotide stock solutions may be frozen at −20° C. and are stable for several weeks.
Aptamer sequences which bind to specific RNA or DNA sequences can be made. Aptamer sequences can be isolated through methods such as those disclosed in co-pending U.S. patent application Ser. No. 10/934,856, entitled, “Aptamers and Methods for their Invitro Selection and Uses Thereof,” which is hereby incorporated by reference.
It is contemplated that the sequences described herein may be varied to result in substantially homologous sequences which retain the same function as the original. As used herein, a polynucleotide or fragment thereof is “substantially homologous” (or “substantially similar”) to another if, when optimally aligned (with appropriate nucleotide insertions or deletions) with the other polynucleotide (or its complementary strand) using an alignment program such as BLASTN (Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990) “Basic local alignment search tool.” J. Mol. Biol. 215:403-410), there is nucleotide sequence identity in at least about 80%, preferably at least about 90%, and more preferably at least about 95-98% of the nucleotide bases.
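The identity thresholds above can be applied to an alignment with a short calculation. The Python sketch below is illustrative only: it takes two already-aligned sequences (gaps written as "-"), it does not perform the alignment itself, and the choice of denominator (all alignment columns, including gapped ones) is one common convention assumed here rather than a definition taken from the text or from BLASTN.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns in which the two bases are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(aligned_a.upper(), aligned_b.upper()))
    # Ignore columns where both sequences carry a gap.
    columns = [(x, y) for x, y in pairs if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def is_substantially_homologous(aligned_a, aligned_b, threshold=80.0):
    """Apply an identity threshold such as 80, 90 or 95 percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-column alignment with a single mismatch.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))                      # 90.0
print(is_substantially_homologous("ACGTACGTAC", "ACGTACGTAA", 95.0))     # False
```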
Inhibition of ADAM9 induces cell apoptosis. It was found that silencing of ADAM9 inhibits breast cancer cell growth and cell proliferation and inhibition of ADAM9 expression in breast cancer cells induces cell apoptosis. Thus, ADAM9 is implicated in proliferative aspects of breast cancer pathophysiology and serves as a possible therapeutic target in breast cancer.
A comprehensive study of gene expression and copy number in primary breast cancers and breast cancer cell lines was carried out, whereby we identified a region of high level amplification on chromosome 8p11 that is associated with reduced survival duration. The metalloproteinase-like, disintegrin-like and cysteine-rich protein, ADAM9, identified herein, maps to the region of amplification at 8p11. siRNA knockdown was applied to explore how amplification and over-expression of this particular gene play a role in breast cancer pathophysiology and to determine if this gene may be a valuable therapeutic target.
We transiently transfected 83 nM of siRNA for ADAM9 into T47D, BT549, SUM52PE, 600MPE and MCF10A breast cancer cell lines. Non-specific siRNA served as a negative control. Cell viability/proliferation was evaluated by CellTiter-Glo® luminescent cell viability assay (CTG, Promega), cell apoptosis was assayed using YoPro-1 and Hoechst staining and cell cycle inhibition was assessed by measuring BrdU incorporation. All cellular measurements were made in adhered cells using the Cellomics high content scanning instrument. All assays were run at 3, 4, 5 and 6 days post transfection.
Briefly, the siRNA transfection protocol was as follows. Cells are plated and grown to 50-70% confluency and transfected using DharmaFECT1. In tubes, mix: Tube A (total volume 10 uL): 9.5 uL SFM media + 0.5 uL siRNA (varied according to the experiment design); Tube B (total volume 10 uL): 9.8 uL SFM media + 0.2 uL DharmaFECT1. Incubate the tubes for 5 min. During this incubation, remove the media from the target cells and replace with SFM in each well. Add the contents of Tube B to Tube A and mix gently. Incubate for 20 min at room temperature. Add the 20 uL mixture dropwise to each well (final volume = 100 uL). Leave for 4 h, aspirate off the media, replace with full growth media and allow the cells to grow for several days.
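The final siRNA concentration in each well follows from a simple dilution calculation. The Python sketch below is illustrative only: it reads the siRNA amount in Tube A as 0.5 uL, and the stock concentration used in the example is a hypothetical value, since the text states only that the amount was varied according to the experiment design.

```python
def final_concentration_nM(stock_uM, sirna_volume_uL=0.5,
                           final_volume_uL=100.0):
    """Final concentration in the well from C_stock * V_added = C_final * V_final."""
    if final_volume_uL <= 0:
        raise ValueError("final volume must be positive")
    return stock_uM * 1000.0 * sirna_volume_uL / final_volume_uL


def sirna_volume_for_target_uL(target_nM, stock_uM, final_volume_uL=100.0):
    """Volume of stock needed to reach a target final concentration."""
    if stock_uM <= 0:
        raise ValueError("stock concentration must be positive")
    return target_nM * final_volume_uL / (stock_uM * 1000.0)


# Hypothetical 10 uM stock: 0.5 uL in a 100 uL well is a 200-fold dilution.
print(final_concentration_nM(10.0))              # 50.0
# Volume of the same hypothetical stock needed for 50 nM in 100 uL.
print(sirna_volume_for_target_uL(50.0, 10.0))    # 0.5
```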
Cell growth analysis was carried out using the CellTiter-Glo® Luminescent Cell Viability Assay (Promega Cat#G7571/2/3). The luminescence signal of viable cells, which reflects the amount of ATP detected in the plates, was read using a custom plate reader and program.
BrdU staining and fixation for Cellomics were used for cell proliferation and cell cycle analysis. To incorporate BrdU and fix the cells, BrdU (Sigma #B5002) was added directly to the cell media at a final concentration of 10 uM and pulsed for 30 minutes in a tissue culture incubator. The media was removed, the cells were washed 2× with 1× PBS, and 70% EtOH was added to cover the cells, which were fixed overnight at 4° C. The next day the 70% EtOH was removed and the cells were allowed to dry. Then 2N HCl was added and the cells were incubated at room temperature for 5-10 minutes, after which the HCl was removed and 1× PBS added to neutralize. Anti-BrdU antibody (Mouse anti-BrdU Clone 3D4 (BD Pharmingen #555627)) was diluted 1:100 in 1× PBS/0.5% Tween-20, added to the cells (50 uL for a 96 well plate; 200 uL for a 24 well plate) and incubated for 45-60 minutes at room temperature on a rocker. The antibody was aspirated and the cells were washed 2× with 1× PBS/0.5% Tween-20. Rabbit Anti-mouse Alexa Fluor 488 (Invitrogen #A-11059) was diluted 1:250 in 1× PBS/0.5% Tween-20. The secondary antibody was added to the cells and incubated for 30-60 minutes at room temperature on a rocker, and the cells were then washed 3× with 1× PBS/0.5% Tween-20. After the last wash was removed, the cells were incubated with 1 ug/ml Hoechst 33342 (Sigma #B2261) diluted in 1× PBS for 45 minutes at room temperature on a rocker. The cells were washed and covered with 1× PBS. Plates were scanned or stored at 4° C. for later scanning on Cellomics.
YoPro-1 staining for Cellomics was used for cell apoptosis analysis. YoPro-1 (final concentration 1 ug/ml) and Hoechst (final concentration 10 ug/ml) were added directly to the cell media. The plates were placed in a 37° C. incubator for 30 min and then read directly on Cellomics.
Significant knockdown of ADAM9 was achieved in BT549 and T47D cells transfected with siRNA-ADAM9 for 48 hr, 72 hr and 96 hr. Silencing of ADAM9 significantly reduced the proliferation of breast cancer cells and inhibited BrdU incorporation after treatment with siRNA compared to controls. Knockdown of ADAM9 in breast cancer cells also induced significant levels of apoptosis. Furthermore, we found that cells responded very well when the concentration of siRNA-ADAM9 was higher than 30 nM. The current results suggest that silencing expression of ADAM9 is a novel approach for inhibition of breast cancer cell growth. ADAM9 may serve as a new candidate therapeutic target for treatment of breast cancer with poor outcome.
As described above, the 11q13 amplicon encodes ten genes or non-coding RNA transcripts that appear likely to contribute to the pathophysiology of breast cancer and that are potential therapeutic targets. None of these genes are considered druggable based on predicted protein folding characteristics. However, all are candidates for siRNA therapeutic attack. We applied an efficient siRNA transfection strategy as explained in Example 9 to assess the therapeutic potential of siRNAs against genes encoded in the region of recurrent amplification at 11q13.
We transiently transfected 50 nM of siRNAs targeting these genes (4 individual siRNAs per gene, Table 8) in cell lines amplified at 11q13 (HCC1954, ZR75B, MDAMB415 and CAMA1) and not amplified (BT474, HS578T and MCF10A). Non-specific siRNA served as a negative control. Viable cell number and apoptosis index were measured for each siRNA. These analyses showed that silencing of CCND1, FGF3, PPFIA1, FOLR3, and NEU3 reduced the cell growth of 11q13-amplified breast cancer cells compared to unamplified controls (
To further validate the therapeutic potential of targeting FGF3, PPFIA1 and NEU3, we packaged shRNA lentivirus (5 shRNAs for each gene, Open Biosystems Inc., Table 9) using the third generation lentivirus packaging system and infected breast cells in which FGF3, PPFIA1 and NEU3 are amplified/overexpressed with these lentiviral shRNAs. Knockdown efficiency was then measured by western blot. We identified successful clones marked with arrows (at least one clone for each gene) that can knock down more than 80% of the protein of the target genes (
Knockdown of FGF3, PPFIA1, and NEU3 also induced cell apoptosis and inhibited cell growth in 3D culture. We measured cell apoptosis by caspase3 activity and/or YoPRO plus Hoechst staining after cells were infected with shRNAs using the methodology described in Example 9. We found that knockdown of FGF3, PPFIA1 and NEU3 by shRNA significantly increased cell apoptosis in breast cancer cells (
Combinational Knockdown of Genes at the 11q13 Amplicon has a Synergistic Effect in Breast Cancer Cells.
To evaluate the synergistic effect of knockdown of candidate therapeutic targets FGF3, PPFIA1, NEU3 and CCND1, we infected breast cancer cells with shRNA lentivirus individually and/or in combination. Our data showed that combinational knockdown of NEU3 and PPFIA1 significantly inhibited cell growth (
Inhibition of genes encoded by the 20q13 amplicon. As described above, the 20q13 amplicon encodes fourteen genes or non-coding RNA transcripts that appear likely to contribute to the pathophysiology of breast cancer and that are potential therapeutic targets. None of these genes are considered druggable based on predicted protein folding characteristics. However, all are candidates for siRNA therapeutic attack. We applied an efficient siRNA transfection strategy as explained in Example 9 to assess the therapeutic potential of siRNAs against genes encoded in the region of recurrent amplification at 20q13. We transiently transfected 50 nM of siRNAs (Table 10) targeting these genes (4 individual siRNAs per gene) in cell lines amplified at 20q13 (BT474, MCF7, MDAMB 157 and SUM52PE) and not amplified (MCF10A and ZR75B). Non-specific siRNA served as a negative control. Viable cell number, proliferation and apoptosis index were measured for each siRNA using the assays described in Example 9. These analyses showed that silencing of CSTF1, PCK1, RAB22A, VAPB, GNAS, C20orf45, BCAS1, TMEPAI and STX16 reduced the cell growth of 20q13-amplified breast cancer cells compared to unamplified controls (
Akagi, K., Suzuki, T., Stephens, R. M., Jenkins, N. A., and Copeland, N. G. (2004). RTCGD: retroviral tagged cancer gene database. Nucleic Acids Res 32, D523-527.
Al-Kuraya, K., Schraml, P., Torhorst, J., Tapia, C., Zaharieva, B., Novotny, H., Spichtin, H., Maurer, R., Mirlacher, M., Kochli, O., et al. (2004). Prognostic relevance of gene amplifications and coamplifications in breast cancer. Cancer Res 64, 8534-8540.
Albertson, D. G., Collins, C., McCormick, F., and Gray, J. W. (2003). Chromosome aberrations in solid tumors. Nat Genet 34, 369-376.
Babu, J. R., Jeganathan, K. B., Baker, D. J., Wu, X., Kang-Decker, N., and van Deursen, J. M. (2003). Rae1 is an essential mitotic checkpoint regulator that cooperates with Bub3 to prevent chromosome missegregation. J Cell Biol 160, 341-353.
Barlund, M., Monni, O., Kononen, J., Cornelison, R., Torhorst, J., Sauter, G., Kallioniemi, O.-P., and Kallioniemi, A. (2000). Multiple genes at 17q23 undergo amplification and overexpression in breast cancer. Cancer Res 60, 5340-5344.
Barlund, M., Tirkkonen, M., Forozan, F., Tanner, M. M., Kallioniemi, O., and Kallioniemi, A. (1997). Increased copy number at 17q22-q24 by CGH in breast cancer is due to high-level amplification of two separate regions. Genes Chromosomes Cancer 20, 372-376.
Baylin, S. B., and Herman, J. G. (2000). DNA hypermethylation in tumorigenesis: epigenetics joins genetics. Trends Genet 16, 168-174.
Beissbarth, T., and Speed, T. P. (2004). GOstat: find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics 20, 1464-1465.
Blegen, H., Will, J. S., Ghadimi, B. M., Nash, H. P., Zetterberg, A., Auer, G., and Ried, T. (2003). DNA amplifications and aneuploidy, high proliferative activity and impaired cell cycle control characterize breast carcinomas with poor prognosis. Anal Cell Pathol 25, 103-114.
Braun, B. S., and Shannon, K. (2004). The sum is greater than the FGFR1 partner. Cancer Cell 5, 203-204.
Callagy, G., Pharoah, P., Chin, S. F., Sangan, T., Daigo, Y., Jackson, L., and Caldas, C. (2005). Identification and validation of prognostic markers in breast cancer with the complementary use of array-CGH and tissue microarrays. J Pathol 205, 388-396.
Cheng, K. W., Lahad, J. P., Kuo, W. L., Lapuk, A., Yamada, K., Auersperg, N., Liu, J., Smith-McCune, K., Lu, K. H., Fishman, D., et al. (2004). The RAB25 small GTPase determines aggressiveness of ovarian and breast cancers. Nat Med 10, 1251-1256.
Chin, K., de Solorzano, C. O., Knowles, D., Jones, A., Chou, W., Rodriguez, E. G., Kuo, W. L., Ljung, B. M., Chew, K., Myambo, K., et al. (2004). In situ analyses of genome instability in breast cancer. Nat Genet 36, 984-988.
Clairmont, C. A., Narayanan, L., Sun, K. W., Glazer, P. M., and Sweasy, J. B. (1999). The Tyr-265-to-Cys mutator mutant of DNA polymerase beta induces a mutator phenotype in mouse LN12 cells. Proc Natl Acad Sci USA 96, 9580-9585.
Deutschbauer, A. M., Jaramillo, D. F., Proctor, M., Kumm, J., Hillenmeyer, M. E., Davis, R. W., Nislow, C., and Giaever, G. (2005). Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics 169, 1915-1925.
Esteva, F. J., Sahin, A. A., Cristofanilli, M., Coombes, K., Lee, S. J., Baker, J., Cronin, M., Walker, M., Watson, D., Shak, S., and Hortobagyi, G. N. (2005). Prognostic role of a multigene reverse transcriptase-PCR assay in patients with node-negative breast cancer not receiving adjuvant systemic therapy. Clin Cancer Res 11, 3315-3319.
Fraser, M. M., Watson, P. M., Fraig, M. M., Kelley, J. R., Nelson, P. S., Boylan, A. M., Cole, D. J., and Watson, D. K. (2005). CaSm-mediated cellular transformation is associated with altered gene expression and messenger RNA stability. Cancer Res 65, 6228-6236.
Fridlyand, J., Snijders, A. M., Ylstra, B., Li, H., Olshen, A., Segraves, R., Dairkee, S., Tokuyasu, T., Ljung, B. M., Jain, A. N., et al. (2006). Breast tumor copy number aberration phenotypes and genomic instability. BMC Cancer 6, 96.
Gelsi-Boyer, V., Orsetti, B., Cervera, N., Finetti, P., Sircoulomb, F., Rouge, C., Lasorsa, L., Letessier, A., Ginestier, C., Monville, F., et al. (2005). Comprehensive profiling of 8p11-12 amplification in breast cancer. Mol Cancer Res 3, 655-667.
Gianni, L., Zambetti, M., Clark, K., Baker, J., Cronin, M., Wu, J., Mariani, G., Rodriguez, J., Carcangiu, M., Watson, D., et al. (2005). Gene Expression Profiles in Paraffin-Embedded Core Biopsy Tissue Predict Response to Chemotherapy in Women With Locally Advanced Breast Cancer. J Clin Oncol.
Greten, F. R., and Karin, M. (2004). The IKK/NF-kappaB activation pathway-a target for prevention and treatment of cancer. Cancer Lett 206, 193-199.
Hackett, C. S., Hodgson, J. G., Law, M. E., Fridlyand, J., Osoegawa, K., de Jong, P. J., Nowak, N. J., Pinkel, D., Albertson, D. G., Jain, A., et al. (2003). Genome-wide array CGH analysis of murine neuroblastoma reveals distinct genomic aberrations which parallel those in human tumors. Cancer Res 63, 5266-5273.
Hanahan, D., and Weinberg, R. A. (2000). The hallmarks of cancer. Cell 100, 57-70.
Hartigan, J. A. (1975). Clustering Algorithms (New York: Wiley).
Hinds, P. W., Dowdy, S. F., Eaton, E. N., Arnold, A., and Weinberg, R. A. (1994). Function of a human cyclin gene as an oncogene. Proc Natl Acad Sci USA 91, 709-713.
Hodgson, G., Hager, J. H., Volik, S., Hariono, S., Wernick, M., Moore, D., Nowak, N., Albertson, D. G., Pinkel, D., Collins, C., et al. (2001). Genome scanning with array CGH delineates regional alterations in mouse islet carcinomas. Nat Genet 29, 459-464.
Huang, G., Krig, S., Kowbel, D., Xu, H., Hyun, B., Volik, S., Feuerstein, B., Mills, G. B., Stokoe, D., Yaswen, P., and Collins, C. (2005). ZNF217 suppresses cell death associated with chemotherapy and telomere dysfunction. Hum Mol Genet 14, 3219-3225.
Hyman, E., Kauraniemi, P., Hautaniemi, S., Wolf, M., Mousses, S., Rozenblum, E., Ringner, M., Sauter, G., Monni, O., Elkahloun, A., et al. (2002). Impact of DNA amplification on gene expression patterns in breast cancer. Cancer Res 62, 6240-6245.
Irizarry, R., Bolstad, B., Collin, F., Cope, L., Hobbs, B., and Speed, T. (2003). Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 31, e15.
Isola, J., Chu, L., DeVries, S., Matsumura, K., Chew, K., Ljung, B. M., and Waldman, F. M. (1999). Genetic alterations in ERBB2-amplified breast carcinomas. Clin Cancer Res 5, 4140-4145.
Isola, J. J., Kallioniemi, O. P., Chu, L. W., Fuqua, S. A., Hilsenbeck, S. G., Osborne, C. K., and Waldman, F. M. (1995). Genetic aberrations detected by comparative genomic hybridization predict outcome in node-negative breast cancer. Am J Pathol 147, 905-911.
Jain, A. N., Chin, K., Borresen-Dale, A. L., Erikstein, B. K., Eynstein Lonning, P., Kaaresen, R., and Gray, J. W. (2001). Quantitative analysis of chromosomal CGH in human breast tumors associates copy number abnormalities with p53 status and patient survival. Proc Natl Acad Sci USA 98, 7952-7957.
Jain, A. N., Tokuyasu, T. A., Snijders, A. M., Segraves, R., Albertson, D. G., and Pinkel, D. (2002). Fully automatic quantification of microarray image data. Genome Res 12, 325-332.
Jones, P. A. (2005). Overview of cancer epigenetics. Semin Hematol 42, S3-8.
Kallioniemi, A., Kallioniemi, O. P., Piper, J., Tanner, M., Stokke, T., Chen, L., Smith, H. S., Pinkel, D., Gray, J. W., and Waldman, F. M. (1994). Detection and mapping of amplified DNA sequences in breast cancer by comparative genomic hybridization. Proc Natl Acad Sci USA 91, 2156-2160.
Kallioniemi, O. P., Kallioniemi, A., Kurisu, W., Thor, A., Chen, L. C., Smith, H. S., Waldman, F. M., Pinkel, D., and Gray, J. W. (1992). ERBB2 amplification in breast cancer analyzed by fluorescence in situ hybridization. Proc Natl Acad Sci USA 89, 5321-5325.
Kauraniemi, P., Barlund, M., Monni, O., and Kallioniemi, A. (2001). New amplified and highly expressed genes discovered in the ERBB2 amplicon in breast cancer by cDNA microarrays. Cancer Res 61, 8235-8240.
Kauraniemi, P., Kuukasjarvi, T., Sauter, G., and Kallioniemi, A. (2003). Amplification of a 280-kilobase core region at the ERBB2 locus leads to activation of two hypothetical proteins in breast cancer. Am J Pathol 163, 1979-1984.
Knuutila, S., Autio, K., and Aalto, Y. (2000). Online access to CGH data of DNA sequence copy number changes. Am J Pathol 157, 689.
Lam, L. T., Davis, R. E., Pierce, J., Hepperle, M., Xu, Y., Hottelet, M., Nong, Y., Wen, D., Adams, J., Dang, L., and Staudt, L. M. (2005). Small molecule inhibitors of IkappaB kinase are selectively toxic for subgroups of diffuse large B-cell lymphoma defined by gene expression profiling. Clin Cancer Res 11, 28-40.
Loo, L. W., Grove, D. I., Williams, E. M., Neal, C. L., Cousens, L. A., Schubert, E. L., Holcomb, I. N., Massa, H. F., Glogovac, J., Li, C. I., et al. (2004). Array comparative genomic hybridization analysis of genomic alterations in breast cancer subtypes. Cancer Res 64, 8541-8549.
Mazzocca, A., Coppari, R., De Franco, R., Cho, J. Y., Libermann, T. A., Pinzani, M., and Toker, A. (2005). A secreted form of ADAM9 promotes carcinoma invasion through tumor-stromal interactions. Cancer Res 65, 4728-4738.
Naylor, T. L., Greshock, J., Wang, Y., Colligon, T., Yu, Q. C., Clemmer, V., Zaks, T. Z., and Weber, B. L. (2005). High resolution genomic analysis of sporadic breast cancer using array-based comparative genomic hybridization. Breast Cancer Res 7, R1186-1198.
Nonet, G., Stampfer, M., Chin, K., Gray, J. W., Collins, C., and Yaswen, P. (2001). The ZNF217 gene amplified in breast cancers promotes immortalization of human mammary epithelial cells. Cancer Res 61, 1250-1254.
Okunieff, P., Fenton, B. M., Zhang, L., Kern, F. G., Wu, T., Greg, J. R., and Ding, I. (2003). Fibroblast growth factors (FGFS) increase breast tumor growth rate, metastases, blood flow, and oxygenation without significant change in vascular density. Adv Exp Med Biol 530, 593-601.
Olshen, A. B., Venkatraman, E. S., Lucito, R., and Wigler, M. (2004). Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5, 557-572.
Ouspenski, I. I., Elledge, S. J., and Brinkley, B. R. (1999). New yeast genes important for chromosome integrity and segregation identified by dosage effects on genome stability. Nucleic Acids Res 27, 3001-3008.
Perou, C. M., Jeffrey, S. S., van de Rijn, M., Rees, C. A., Eisen, M. B., Ross, D. T., Pergamenschikov, A., Williams, C. F., Zhu, S. X., Lee, J. C., et al. (1999). Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc Natl Acad Sci USA 96, 9212-9217.
Perou, C. M., Sorlie, T., Eisen, M. B., van de Rijn, M., Jeffrey, S. S., Rees, C. A., Pollack, J. R., Ross, D. T., Johnsen, H., Akslen, L. A., et al. (2000). Molecular portraits of human breast tumours. Nature 406, 747-752.
Pinkel, D., Segraves, R., Sudar, D., Clark, S., Poole, I., Kowbel, D., Collins, C., Kuo, W. L., Chen, C., Zhai, Y., et al. (1998). High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat Genet 20, 207-211.
Pollack, J. R., Perou, C. M., Alizadeh, A. A., Eisen, M. B., Pergamenschikov, A., Williams, C. F., Jeffrey, S. S., Botstein, D., and Brown, P. O. (1999). Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nat Genet 23, 41-46.
Pollack, J. R., Sorlie, T., Perou, C. M., Rees, C. A., Jeffrey, S. S., Lonning, P. E., Tibshirani, R., Botstein, D., Borresen-Dale, A. L., and Brown, P. O. (2002). Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Natl Acad Sci USA 99, 12963-12968.
Press, M. F., Sauter, G., Bernstein, L., Villalobos, I. E., Mirlacher, M., Zhou, J. Y., Wardeh, R., Li, Y. T., Guzman, R., Ma, Y., et al. (2005). Diagnostic evaluation of HER-2 as a molecular target: an assessment of accuracy and reproducibility of laboratory testing in large, prospective, randomized clinical trials. Clin Cancer Res 11, 6598-6607.
Ramaswamy, S., Ross, K. N., Lander, E. S., and Golub, T. R. (2003). A molecular signature of metastasis in primary solid tumors. Nat Genet 33, 49-54.
Ray, M. E., Yang, Z. Q., Albertson, D., Kleer, C. G., Washburn, J. G., Macoska, J. A., and Ethier, S. P. (2004). Genomic and expression analysis of the 8p11-12 amplicon in human breast cancer cell lines. Cancer Res 64, 40-47.
Reyal, F., Stransky, N., Bernard-Pierrot, I., Vincent-Salomon, A., de Rycke, Y., Elvin, P., Cassidy, A., Graham, A., Spraggon, C., Desille, Y., et al. (2005). Visualizing chromosomes as transcriptome correlation maps: evidence of chromosomal domains containing co-expressed genes—a study of 130 invasive ductal breast carcinomas. Cancer Res 65, 1376-1383.
Russ, A. P., and Lampel, S. (2005). The druggable genome: an update. Drug Discov Today 10, 1607-1610.
Slamon, D. J., Godolphin, W., Jones, L. A., Holt, J. A., Wong, S. G., Keith, D. E., Levin, W. J., Stuart, S. G., Udove, J., Ullrich, A., et al. (1989). Studies of the HER-2/neu proto-oncogene in human breast and ovarian cancer. Science 244, 707-712.
Snijders, A. M., Fridlyand, J., Mans, D. A., Segraves, R., Jain, A. N., Pinkel, D., and Albertson, D. G. (2003). Shaping of tumor and drug-resistant genomes by instability and selection. Oncogene 22, 4370-4379.
Snijders, A. M., Nowak, N., Segraves, R., Blackwood, S., Brown, N., Conroy, J., Hamilton, G., Hindle, A. K., Huey, B., Kimura, K., et al. (2001). Assembly of microarrays for genome-wide measurement of DNA copy number. Nat Genet 29, 263-264.
Solinas-Toldo, S., Lampel, S., Stilgenbauer, S., Nickolenko, J., Benner, A., Dohner, H., Cremer, T., and Lichter, P. (1997). Matrix-based comparative genomic hybridization: biochips to screen for genomic imbalances. Genes Chromosomes Cancer 20, 399-407.
Sorlie, T., Perou, C. M., Tibshirani, R., Aas, T., Geisler, S., Johnsen, H., Hastie, T., Eisen, M. B., van de Rijn, M., Jeffrey, S. S., et al. (2001). Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci USA 98, 10869-10874.
Sorlie, T., Tibshirani, R., Parker, J., Hastie, T., Marron, J. S., Nobel, A., Deng, S., Johnsen, H., Pesich, R., Geisler, S., et al. (2003). Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci USA 100, 8418-8423.
Still, I. H., Hamilton, M., Vince, P., Wolfman, A., and Cowell, J. K. (1999). Cloning of TACC1, an embryonically expressed, potentially transforming coiled coil containing gene, from the 8p11 breast cancer amplicon. Oncogene 18, 4032-4038.
Tanaka, S., Sugimachi, K., Kawaguchi, H., Saeki, H., Ohno, S., and Wands, J. R. (2000). Grb7 signal transduction protein mediates metastatic progression of esophageal carcinoma. J Cell Physiol 183, 411-415.
Tanner, M. M., Tirkkonen, M., Kallioniemi, A., Collins, C., Stokke, T., Karhu, R., Kowbel, D., Shadravan, F., Hintz, M., Kuo, W. L., et al. (1994). Increased copy number at 20q13 in breast cancer: defining the critical region and exclusion of candidate genes. Cancer Res 54, 4257-4260.
van 't Veer, L. J., Dai, H., van de Vijver, M. J., He, Y. D., Hart, A. A., Mao, M., Peterse, H. L., van der Kooy, K., Marton, M. J., Witteveen, A. T., et al. (2002). Gene expression profiling predicts clinical outcome of breast cancer. Nature 415, 530-536.
van de Vijver, M. J., He, Y. D., van't Veer, L. J., Dai, H., Hart, A. A., Voskuil, D. W., Schreiber, G. J., Peterse, J. L., Roberts, C., Marton, M. J., et al. (2002). A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 347, 1999-2009.
Vogel, C. L., Cobleigh, M. A., Tripathy, D., Gutheil, J. C., Harris, L. N., Fehrenbacher, L., Slamon, D. J., Murphy, M., Novotny, W. F., Burchmore, M., et al. (2002). Efficacy and safety of trastuzumab as a single agent in first-line treatment of HER2-overexpressing metastatic breast cancer. J Clin Oncol 20, 719-726.
Weber-Mangal, S., Sinn, H. P., Popp, S., Klaes, R., Emig, R., Bentz, M., Mansmann, U., Bastert, G., Bartram, C. R., and Jauch, A. (2003). Breast cancer in young women (≤35 years): Genomic aberrations detected by comparative genomic hybridization. Int J Cancer 107, 583-592.
Willenbrock, H., and Fridlyand, J. (2005). A comparison study: applying segmentation to array CGH data for downstream analyses. Bioinformatics.
Yeung, K. Y., Fraley, C., Murua, A., Raftery, A. E., and Ruzzo, W. L. (2001). Model-based clustering and data transformations for gene expression data. Bioinformatics 17, 977-987.
Yeung, K. Y., Medvedovic, M., and Bumgarner, R. E. (2004). From co-expression to co-regulation: how many microarray experiments do we need? Genome Biol 5, R48.
Yi, Y., Mirosevich, J., Shyr, Y., Matusik, R., and George, A. L., Jr. (2005). Coupled analysis of gene expression and chromosomal location. Genomics 85, 401-412.
Zhu, Y., Kan, L., Qi, C., Kanwar, Y. S., Yeldandi, A. V., Rao, M. S., and Reddy, J. K. (2000). Isolation and characterization of peroxisome proliferator-activated receptor (PPAR) interacting protein (PRIP) as a coactivator for PPAR. J Biol Chem 275, 13510-13516.
While the present sequences, compositions and processes have been described with reference to specific details of certain exemplary embodiments thereof, it is not intended that such details be regarded as limitations upon the scope of the invention. The present examples, methods, procedures, specific compounds and molecules are meant to exemplify and illustrate the invention and should in no way be seen as limiting the scope of the invention. Any patents, publications, and publicly available sequences mentioned in this specification and listed above are indicative of the level of skill of those skilled in the art to which the invention pertains and are hereby incorporated by reference to the same extent as if each were specifically and individually incorporated by reference.
1,2,3,4 Kruskal-Wallis test (1-7, 11, 12), significance of robust linear regression standardized coefficient (8-10)
5 Fisher exact test (1-7, 11, 12), significance of robust linear regression standardized coefficient (8-10)
1 van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AA, et al. (2002) A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 347: 1999-2009.
2 Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, et al. (2001) Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci USA 98: 10869-10874.
This application is a divisional application of U.S. application Ser. No. 12/330,386, filed Dec. 8, 2008, which is a continuation-in-part of PCT application no. PCT/US2007/070908, filed Jun. 11, 2007, which claims priority to U.S. provisional patent application No. 60/812,704, filed on Jun. 9, 2006, each of which applications is hereby incorporated by reference in its entirety.
This invention was made during work supported by the National Cancer Institute, through Grants CA 58207 and CA 112970, and during work supported by the U.S. Department of Energy under Contract No. DE-AC03-76SF00098, now DE-AC02-05CH11231. The government has certain rights in this invention.
Number | Date | Country |
---|---|---|
60812704 | Jun 2006 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 12330386 | Dec 2008 | US |
Child | 13243712 | | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/US2007/070908 | Jun 2007 | US |
Child | 12330386 | | US |