This disclosure relates to polynucleotide analysis and, in particular, to polynucleotide expression profiling of breast tumors and cancers using libraries or arrays of polynucleotides.
The ERBB2 oncogene, also called HER2 or NEU, is located in band q12 of chromosome 17. It codes for a 185-kDa transmembrane tyrosine kinase related to members of the ERBB family, which also includes epidermal growth factor receptor. ERBB2 is amplified and over-expressed in 15-30% of breast cancers (1). Although its exact role in mammary oncogenesis remains unclear (2, 3, for reviews), the receptor is a clinically relevant target for the treatment of breast cancer for two reasons. First, ERBB2 gene amplification and over-expression of ERBB2 gene products have been associated in many studies with prognosis or response to anticancer therapies (4, 5, for reviews). Second, therapy based on a humanized monoclonal antibody (trastuzumab/Herceptin™) aimed at reducing the aberrant expression of the receptor has shown benefits in metastatic breast cancer patients (6-8, for reviews). However, modifications of chemotherapy and hormonal therapy strategies based on ERBB2 status remain controversial. Furthermore, the clinical efficacy of trastuzumab is unexpectedly variable, implying that additional and/or alternate methods to accurately identify appropriate patients for treatment with ERBB2 antagonists may be warranted.
Currently, ERBB2 status is primarily determined by two different methods: fluorescence in situ hybridization (FISH), which reveals gene amplification, and immunohistochemistry (IHC), which detects the over-expressed ERBB2 protein (9-12, for recent reviews). FISH is a good method for ERBB2 testing, but is technically more difficult to implement than IHC. IHC is easier to perform, but is difficult to standardize (13). IHC is currently the only FDA-approved test for selection of patients for treatment with trastuzumab. The American Society for Clinical Oncology and National Comprehensive Cancer Network guidelines recommend the use of either FISH (PathVysion™) or the HercepTest™, which is a specific IHC test made by the Dako Corporation.
This HercepTest™ method includes a calibrated internal control to semi-quantitatively assess positive staining on a scale ranging from 0 (absence of ERBB2 protein over-expression) to 3+ (maximum of ERBB2 over-expression). Results are scored by a pathologist; interpretation is relatively straightforward in ERBB2-negative individuals (0-1+) and in patients who strongly over-express the protein (3+). Accurate scoring is, however, problematic for the intermediate level 2+. For cases scoring 2+ (10-15% of all breast cancers), the concordance with FISH is, at best, 25%. Importantly, a proportion of 2+ cases are bona fide ERBB2-over-expressing tumors to which Herceptin treatment should be applied.
Thus, universal, accurate, and standardized determination of ERBB2 status has not yet been achieved. The reliability of this determination will greatly influence the selection of the relevant cases and thus the clinical efficacy of Herceptin treatment. Moreover, the establishment of specific methods for patient selection for ERBB2 antagonists may serve as a paradigm for guiding clinical use of the new targeted approaches expected in the near future. It is thus important to further document the methods and parameters useful to assess ERBB2 status.
Moreover, preliminary reports suggest that clinical outcome may vary between patients with the same ERBB2 status and treatment, implying that other factors, in addition to ERBB2, may play a role in determining the level of sensitivity to trastuzumab. Additionally, it may be necessary to combine other targeted therapies with anti-ERBB2 treatment, and identification of complementary or secondary targets may thus prove useful to guide selection of appropriate combination therapy. These secondary targets may contribute to activation of pathways associated with response to ERBB2 hyperactivity. Although common pathways such as the RAS/MAPK pathway and other induced genes have been reported (14), ERBB2-associated signaling cascades have yet to be elucidated. Thus, accurate measurement of ERBB2 status as well as identification of associated molecular alterations are now urgently needed.
The effect of surgery on proliferation of breast carcinomas, in particular those over-expressing HER2 oncoprotein, has been recently assessed (67). It has been found that residual breast carcinomas that had been surgically removed within 48 days after first surgery showed a significant increase in proliferation if they were ERBB2-positive. Treatment of ERBB2-positive tumour cells with trastuzumab before adding a growth stimulus abolished drainage-fluid-induced proliferation. This suggests that ERBB2 over-expression by breast carcinoma cells has a role in post-surgical stimulation of proliferation of breast carcinoma cells.
Emerging technologies may facilitate progress on both ERBB2 typing and target discovery. Among these, DNA microarrays are currently prominent; they provide massive parallel quantification of mRNA expression levels for thousands of genes in a sample (15, 16, for recent reviews). Several reports have shown that this technology can be used to improve the prognostic classification of breast cancers (17-24). 217 breast carcinomas have been analyzed using DNA microarrays containing ~9,000 spotted cDNA clones. Our aim was to identify differences in gene expression patterns between ERBB2-negative and ERBB2-positive breast tumors. We have identified a series of 37 discriminator genes/mRNA/ESTs called “ERBB2 gene expression signature,” the expression of which was able to distinguish ERBB2-negative and positive samples. This signature was independently validated by correlative IHC and FISH analyses. Among the genes included in the signature were potential additional targets, such as GATA4.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
a-2b represent the validation of the ERBB2 gene expression signature by supervised classification of thirty-seven genes/ESTs from an independent series of breast cancer samples.
a contains photomicrographs of tissue microarray sections, showing protein expression by hematoxylin and eosin staining (top) or immuno-histochemical staining (bottom).
a represents an unsupervised classification of 159 breast tumors using hierarchical clustering of 159 breast tumors and 37 clones from the ERBB2 gene expression signature. Each row represents a clone and each column represents a sample. Expression level of each gene in a single sample is relative to its median abundance across all samples and is depicted according to a color scale shown at the bottom. Red and green indicate expression levels respectively above and below the median. The magnitude of deviation from the median is represented by the color saturation. Grey indicates missing data.
a and 7b represents an unsupervised hierarchical classification of 159 breast tumors defining an ERBB2 gene expression signature performed as in
We provide a “gene expression signature” (also referred to as “GES”) that can identify ERBB2 alteration in breast tumors, as well as enhance current understanding of the role of ERBB2 in mammary oncogenesis. The gene expression signature contains genes that are neighbors of ERBB2 on 17q12, and includes potential regulators and/or downstream effectors of ERBB2 (e.g., GATA4) and eventual targets (e.g., cadherin, integrins). The gene expression signature can be used both for breast tumor management in clinical settings and as a research tool, in academic laboratories.
We thus provide a method for analyzing differential gene expression associated with breast tumor, based on the analysis of the over- or under-expression of polynucleotide sequences in a sample or cell line. The analysis comprises the detection of the over-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of at least the predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NOS. 73, 74, 75, 76, 77 (ERBB2);
Set 4: SEQ ID NOS. 78, 79, 80 (GATA4); and
Set 5: SEQ ID NOS. 41, 42, 43 (CDH15).
We also provide a method for analyzing differential gene expression associated with breast tumor, based on the analysis of the over- or under-expression of polynucleotide sequences in a sample or cell line. This analysis includes the detection of the over-expression or under-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of Set 1: SEQ ID NO. 73, 74, 75, 76, 77 (ERBB2), Set 2: SEQ ID NO. 28, 29, 30 (GRB7), Set 3: SEQ ID NO. 83, 84, 85 (NR1D1), Set 4: SEQ ID NO. 78, 79, 80 (GATA4), Set 5: SEQ ID NO. 41, 42, 43 (CDH15), Set 6: SEQ ID NO. 16, 17 (LTA), Set 7: SEQ ID NO. 86, 87, 116 (MAP2K6), Set 8: SEQ ID NO. 54, 55, 113 (PECAM1), Set 9: SEQ ID NO. 44, 45 (PPARBP), Set 13: SEQ ID NO. 10 (LOC148696), Set 18: SEQ ID NO. 24, 25 (STAT3), Set 20: SEQ ID NO. 36, 37, 38 (CDKL5), Set 21: SEQ ID NO. 46, 47, 48 (CSTA), Set 22: SEQ ID NO. 52, 53, 115 (ITGB3), Set 23: SEQ ID NO. 56, 57, 58 (MKI67), Set 24: SEQ ID NO. 59, 60, 61 (PBEF), Set 27: SEQ ID NO. 88, 89, 90 (ITGA2), Set 28: SEQ ID NO. 11 (ESTAA878915), SET 29: SEQ ID NO. 1, 2, 3 (JDP1), SET 35: SEQ ID NO. 67, 68, 69 (FLJ10193), SET 36: SEQ ID NO. 70, 71, 72 (ESR1), SET 43: SEQ ID NO. 104, 105, 106 (DAXX), SET 47: SEQ ID NO. 114, and SET 48: SEQ ID NO. 117, 118 (C17ORF37).
We further provide a polynucleotide library useful for the molecular characterization of a breast cancer, comprising or corresponding to a pool of polynucleotide sequences which are over- or under-expressed in breast tissue.
We still further provide a method for analyzing differential gene expression associated with breast tumor, including a) obtaining nucleic acids from a breast tissue sample from a patient, b) reacting the nucleic acids sample obtained in step (a) with a polynucleotide library or array, and c) detecting the reaction product of step (b).
We yet further provide a method for analyzing differential gene expression associated with breast tumor, including a) obtaining proteins from a breast tissue sample from a patient, and b) measuring in the sample the level of proteins corresponding to proteins coded by a polynucleotide library or array.
We also further provide a method for treating a patient with a breast cancer, including (i) the implementation of a method for analyzing differential gene expression associated with breast tumor on a sample from the patient, and (ii) determining a treatment for this patient based on the analysis of differential gene expression profile.
As used herein, a disease, disorder, e.g., tumor or condition “associated with” an aberrant expression of a nucleic acid refers to a disease, disorder, e.g., tumor or condition in a subject which is caused by, contributed to by, or causative of an aberrant level of expression of a nucleic acid.
As used herein, the term “subsequence” refers to any part of said polynucleotide sequence that is less than the entire polynucleotide sequence, and which would be also suitable to perform the method of analysis. A person skilled in the art can choose the position and length of a subsequence by applying routine experiments. For example, a subsequence of a polynucleotide can be any contiguous sequence of at least about 10, about 25, about 50, about 100, about 200, about 300, about 400, about 800, or about 1,000 nucleotides. Examples of such subsequences are given in Table 1 below, under the heading “Seq3′” or “Seq5′”.
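By way of illustration only, the definition of a subsequence given above can be sketched as a simple check. The function name `is_subsequence` and the default minimum length of 10 nucleotides are assumptions chosen for this example (the text recites several alternative lengths), and the sketch does not test whether a fragment is actually suitable for performing the method of analysis.

```python
def is_subsequence(candidate: str, full: str, min_len: int = 10) -> bool:
    """Return True if `candidate` is a contiguous part of `full` that is
    shorter than `full` and at least `min_len` nucleotides long.

    `min_len` is a hypothetical parameter; 10 is one of the lengths
    recited in the text.
    """
    candidate = candidate.upper()
    full = full.upper()
    # Shorter than the entire sequence, long enough, and contiguous within it.
    return min_len <= len(candidate) < len(full) and candidate in full


# Hypothetical 20-nucleotide sequence used only for demonstration.
example_full = "ACGTTGCAAGCTTAGGCTAA"
example_part = example_full[3:15]  # a 12-nucleotide contiguous fragment
```

With these example values, `is_subsequence(example_part, example_full)` holds, whereas the entire sequence or a 4-nucleotide fragment does not qualify under the assumed minimum length.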
The over- or under-expression of a given polynucleotide sequence, subsequence or complement thereof can be determined by any known method, such as disclosed in PCT patent application WO 02103320, the entire disclosure of which is herein incorporated by reference. Suitable methods can comprise the detection of a difference in the expression of the polynucleotide sequences in relation to at least one control. Said control can comprise, for example, polynucleotide sequence(s) from a sample of the same patient or from a pool of ERBB2+ or ERBB2− patients, or polynucleotide sequences selected from among reference sequence(s) which may already be known to be over- or under-expressed. The expression level of said control polynucleotide sequences can be an average or an absolute value of the expression of reference polynucleotide sequences. The values for control polynucleotide expression can be processed in order to accentuate the difference relative to the expression of the polynucleotide sequences.
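As a non-limiting sketch of a comparison against a control, the expression level measured for a sequence can be expressed as a log2 ratio relative to the average of the control levels. The function name `classify_expression` and the two-fold threshold (`threshold=1.0` in log2 units) are assumptions made for this example and are not values prescribed by this disclosure; all levels are assumed to be positive.

```python
import math
from statistics import mean
from typing import Sequence


def classify_expression(sample_level: float,
                        control_levels: Sequence[float],
                        threshold: float = 1.0) -> str:
    """Classify one sequence as 'over', 'under' or 'unchanged'.

    The control is taken as the average of `control_levels`.
    `threshold` is a hypothetical cut-off on the log2 ratio
    (1.0 corresponds to a two-fold difference).
    """
    control_avg = mean(control_levels)
    log_ratio = math.log2(sample_level / control_avg)
    if log_ratio >= threshold:
        return "over"
    if log_ratio <= -threshold:
        return "under"
    return "unchanged"
```

For example, a sample level of 8.0 against control levels of 2.0 and 2.0 gives a log2 ratio of 2 and is classified as over-expressed under the assumed threshold.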
The analysis of the over- or under-expression of polynucleotide sequences can be carried out on a sample such as biological material derived from any mammalian cells, including cell lines, xenografts, and human tissues (preferably breast tissue), etc. The method can be performed on any sample from a patient or an animal (for example for veterinary applications or preclinical trials).
More particularly, we provide a method for analyzing differential gene expression associated with breast tumors, based on the analysis of the over- or under-expression of polynucleotide sequences on a sample or cell line. The analysis comprises the detection of the over-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of at least the predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NO. 73, 74, 75, 76, 77 (ERBB2);
Set 2: SEQ ID NO. 28, 29, 30 (GRB7);
Set 3: SEQ ID NO. 83, 84, 85 (NR1D1);
Set 4: SEQ ID NO. 78, 79, 80 (GATA4); and
Set 5: SEQ ID NO. 41, 42, 43 (CDH15).
The method can further comprise at least one of the following embodiments:
The detection of the over-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each one of predefined polynucleotide sequences sets consisting of:
Set 6: SEQ ID NO. 16, 17 (LTA);
Set 7: SEQ ID NO. 86, 87, 116 (MAP2K6); and
Set 8: SEQ ID NO. 54, 55, 113 (PECAM1).
The detection of the over-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof from each one of predefined polynucleotide sequences sets consisting of:
Set 9: SEQ ID NO. 44, 45 (PPARBP);
Set 10: SEQ ID NO. 33, 34, 35 (PPP1R1B); and
Set 11: SEQ ID NO. 39, 40 (RPL19).
The detection of the over-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, from each of predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NO. 73, 74, 75, 76, 77 (ERBB2);
Set 2: SEQ ID NO. 28, 29, 30 (GRB7);
Set 3: SEQ ID NO. 83, 84, 85 (NR1D1);
Set 4: SEQ ID NO. 78, 79, 80 (GATA4);
Set 5: SEQ ID NO. 41, 42, 43 (CDH15);
Set 6: SEQ ID NO. 16, 17 (LTA);
Set 7: SEQ ID NO. 86, 87, 116 (MAP2K6);
Set 8: SEQ ID NO. 54, 55, 113 (PECAM1);
Set 9: SEQ ID NO. 44, 45 (PPARBP);
Set 10: SEQ ID NO. 33, 34, 35 (PPP1R1B);
Set 11: SEQ ID NO. 39, 40 (RPL19);
Set 12: SEQ ID NO. 4, 5, 6 (PSMB3);
Set 13: SEQ ID NO. 10 (LOC148696);
Set 14: SEQ ID NO. 12, 13 (NOL3/loc283849);
Set 15: SEQ ID NO. 14, 15 (ITGA2B);
Set 16: SEQ ID NO. 18, 19 (NFKBIE);
Set 17: SEQ ID NO. 22, 23 (PADI2);
Set 18: SEQ ID NO. 24, 25 (STAT3);
Set 19: SEQ ID NO. 26, 27 (OAS2);
Set 20: SEQ ID NO. 36, 37, 38 (CDKL5);
Set 21: SEQ ID NO. 46, 47, 48 (CSTA);
Set 22: SEQ ID NO. 52, 53, 115 (ITGB3);
Set 23: SEQ ID NO. 56, 57, 58 (MKI67);
Set 24: SEQ ID NO. 59, 60, 61 (PBEF);
Set 25: SEQ ID NO. 62, 63, 64 (FADS2);
Set 26: SEQ ID NO. 81, 82 (LOX);
Set 27: SEQ ID NO. 88, 89, 90 (ITGA2); and
Set 28: SEQ ID NO. 11 (ESTAA878915).
The under-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, from each one of predefined polynucleotide sequences sets consisting of:
SET 29: SEQ ID NO. 1, 2, 3 (JDP1);
SET 30: SEQ ID NO. 7, 8, 9 (NAT1);
SET 31: SEQ ID NO. 20, 21 (CELSR2);
SET 32: SEQ ID NO. 31, 32 (ESTN33243);
SET 33: SEQ ID NO. 49, 50, 51 (SCUBE2);
SET 34: SEQ ID NO. 65, 66 (ESTH29301);
SET 35: SEQ ID NO. 67, 68, 69 (FLJ10193); and
SET 36: SEQ ID NO. 70, 71, 72 (ESR1).
According to another embodiment, the method comprises the detection of the over- or under-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NO. 73, 74, 75, 76, 77 (ERBB2);
Set 2: SEQ ID NO. 28, 29, 30 (GRB7);
Set 3: SEQ ID NO. 83, 84, 85 (NR1D1);
Set 4: SEQ ID NO. 78, 79, 80 (GATA4);
Set 5: SEQ ID NO. 41, 42, 43 (CDH15);
Set 6: SEQ ID NO. 16, 17 (LTA);
Set 7: SEQ ID NO. 86, 87, 116 (MAP2K6);
Set 8: SEQ ID NO. 54, 55, 113 (PECAM1);
Set 9: SEQ ID NO. 44, 45 (PPARBP);
Set 10: SEQ ID NO. 33, 34, 35 (PPP1R1B);
Set 11: SEQ ID NO. 39, 40 (RPL19);
Set 13: SEQ ID NO. 10 (LOC148696);
Set 14: SEQ ID NO. 12, 13 (NOL3/loc283849);
Set 15: SEQ ID NO. 14, 15 (ITGA2B);
Set 16: SEQ ID NO. 18, 19 (NFKBIE);
Set 18: SEQ ID NO. 24, 25 (STAT3);
Set 19: SEQ ID NO. 26, 27 (OAS2);
Set 20: SEQ ID NO. 36, 37, 38 (CDKL5);
Set 21: SEQ ID NO. 46, 47, 48 (CSTA);
Set 22: SEQ ID NO. 52, 53, 115 (ITGB3);
Set 23: SEQ ID NO. 56, 57, 58 (MKI67);
Set 24: SEQ ID NO. 59, 60, 61 (PBEF);
Set 26: SEQ ID NO. 81, 82 (LOX);
Set 27: SEQ ID NO. 88, 89, 90 (ITGA2);
SET 29: SEQ ID NO. 1, 2, 3 (JDP1);
SET 33: SEQ ID NO. 49, 50, 51 (SCUBE2);
SET 34: SEQ ID NO. 65, 66 (ESTH29301);
SET 35: SEQ ID NO. 67, 68, 69 (FLJ10193); and
SET 36: SEQ ID NO. 70, 71, 72 (ESR1).
By “over- or under-expression” of a polynucleotide sequence, it is meant that over-expression of certain sequences is detected simultaneously with the under-expression of other sequences. “Simultaneously” means concurrent with or within a biologically or functionally relevant period of time during which the over-expression of a sequence may be followed by the under-expression of another sequence, or conversely, e.g., because the expression of both polynucleotide sequences is directly or indirectly correlated.
In a further embodiment, we provide a method for analyzing differential gene expression associated with breast tumors, based on the analysis of the over- or under-expression of polynucleotide sequences in a sample or cell line, said analysis comprising:
the detection of the over-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NO. 73, 74, 75, 76, 77 (ERBB2);
Set 2: SEQ ID NO. 28, 29, 30 (GRB7);
Set 6: SEQ ID NO. 16, 17 (LTA);
Set 23: SEQ ID NO. 56, 57, 58 (MKI67); and
the detection of the under-expression of at least one, preferably at least two or three, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from SET 36: SEQ ID NO. 70, 71, 72 (ESR1).
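The embodiment above can be sketched, for illustration only, as a check over per-sequence expression calls. The dictionary layout, the function name `matches_signature`, and the representation of calls as the strings "over" and "under" keyed by SEQ ID NO. are assumptions introduced for this example. The parameter `min_per_set` reflects the "at least one, preferably at least two" language; a value exceeding the size of a set simply cannot be satisfied for that set.

```python
from typing import Dict, List, Mapping

# Sets recited in the embodiment above, keyed by set label, with SEQ ID NOs.
OVER_SETS: Dict[str, List[int]] = {
    "Set 1 (ERBB2)": [73, 74, 75, 76, 77],
    "Set 2 (GRB7)": [28, 29, 30],
    "Set 6 (LTA)": [16, 17],
    "Set 23 (MKI67)": [56, 57, 58],
}
UNDER_SETS: Dict[str, List[int]] = {
    "SET 36 (ESR1)": [70, 71, 72],
}


def matches_signature(calls: Mapping[int, str], min_per_set: int = 1) -> bool:
    """Return True if every over-expression set has at least `min_per_set`
    sequences called 'over' and every under-expression set has at least
    `min_per_set` sequences called 'under'.

    `calls` maps a SEQ ID NO. to a hypothetical call string.
    """
    def satisfied(sets: Dict[str, List[int]], wanted: str) -> bool:
        return all(
            sum(calls.get(seq_id) == wanted for seq_id in seq_ids) >= min_per_set
            for seq_ids in sets.values()
        )

    return satisfied(OVER_SETS, "over") and satisfied(UNDER_SETS, "under")
```

For instance, calls of "over" for SEQ ID NO. 73, 28, 16 and 56 together with "under" for SEQ ID NO. 70 satisfy the check with `min_per_set=1`, but not with `min_per_set=2`.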
In a further embodiment, we provide a method for analyzing differential gene expression associated with breast tumors based on the analysis of the over- or under-expression of polynucleotide sequences on a sample or cell line, said analysis comprising the detection of the over-expression or under-expression of at least one, preferably at least two, three or all, polynucleotide(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NO. 75, 76, 77 (ERBB2);
Set 2: SEQ ID NO. 28, 29, 30 (GRB7);
Set 4: SEQ ID NO. 78, 79, 80 (GATA4);
Set 5: SEQ ID NO. 41, 42, 43 (CDH15);
SET 31: SEQ ID NO. 20, 21 (CELSR2);
SET 36: SEQ ID NO. 70, 71, 72 (ESR1); and
SET 48: SEQ ID NO. 117, 118 (C17ORF37).
In a particular embodiment this method comprises:
the detection of the over-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NO. 75, 76, 77 (ERBB2);
Set 2: SEQ ID NO. 28, 29, 30 (GRB7);
Set 4: SEQ ID NO. 78, 79, 80 (GATA4);
Set 5: SEQ ID NO. 41, 42, 43 (CDH15); and
the detection of the under-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
SET 31: SEQ ID NO. 20, 21 (CELSR2);
SET 36: SEQ ID NO. 70, 71, 72 (ESR1); and
SET 48: SEQ ID NO. 117, 118 (C17ORF37).
In a further embodiment, we provide a method for analyzing differential gene expression associated with breast tumors based on the analysis of the over- or under-expression of polynucleotide sequences in a sample or cell line, said analysis comprising the detection of the over-expression or under-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NO. 73, 74, 75, 76, 77 (ERBB2);
Set 2: SEQ ID NO. 28, 29, 30 (GRB7);
Set 3: SEQ ID NO. 83, 84, 85 (NR1D1);
Set 4: SEQ ID NO. 78, 79, 80 (GATA4);
Set 5: SEQ ID NO. 41, 42, 43 (CDH15);
Set 6: SEQ ID NO. 16, 17 (LTA);
Set 7: SEQ ID NO. 86, 87, 116 (MAP2K6);
Set 8: SEQ ID NO. 54, 55, 113 (PECAM1);
Set 9: SEQ ID NO. 44, 45 (PPARBP);
Set 13: SEQ ID NO. 10 (LOC148696);
Set 18: SEQ ID NO. 24, 25 (STAT3);
Set 20: SEQ ID NO. 36, 37, 38 (CDKL5);
Set 21: SEQ ID NO. 46, 47, 48 (CSTA);
Set 22: SEQ ID NO. 52, 53, 115 (ITGB3);
Set 23: SEQ ID NO. 56, 57, 58 (MKI67);
Set 24: SEQ ID NO. 59, 60, 61 (PBEF);
Set 27: SEQ ID NO. 88, 89, 90 (ITGA2);
Set 28: SEQ ID NO. 11 (ESTAA878915);
SET 29: SEQ ID NO. 1, 2, 3 (JDP1);
SET 35: SEQ ID NO. 67, 68, 69 (FLJ10193);
SET 36: SEQ ID NO. 70, 71, 72 (ESR1);
SET 43: SEQ ID NO. 104, 105, 106 (DAXX);
SET 47: SEQ ID NO. 114; and
SET 48: SEQ ID NO. 117, 118 (C17ORF37).
In another embodiment this method comprises:
the detection of the over-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NO. 73, 74, 75, 76, 77 (ERBB2);
Set 2: SEQ ID NO. 28, 29, 30 (GRB7);
Set 3: SEQ ID NO. 83, 84, 85 (NR1D1);
Set 4: SEQ ID NO. 78, 79, 80 (GATA4);
Set 5: SEQ ID NO. 41, 42, 43 (CDH15);
Set 6: SEQ ID NO. 16, 17 (LTA);
Set 7: SEQ ID NO. 86, 87, 116 (MAP2K6);
Set 8: SEQ ID NO. 54, 55, 113 (PECAM1);
Set 9: SEQ ID NO. 44, 45 (PPARBP);
Set 13: SEQ ID NO. 10 (LOC148696);
Set 18: SEQ ID NO. 24, 25 (STAT3);
Set 20: SEQ ID NO. 36, 37, 38 (CDKL5);
Set 21: SEQ ID NO. 46, 47, 48 (CSTA);
Set 22: SEQ ID NO. 52, 53, 115 (ITGB3);
Set 23: SEQ ID NO. 56, 57, 58 (MKI67);
Set 24: SEQ ID NO. 59, 60, 61 (PBEF);
Set 27: SEQ ID NO. 88, 89, 90 (ITGA2);
Set 28: SEQ ID NO. 11 (ESTAA878915);
SET 47: SEQ ID NO. 114;
SET 48: SEQ ID NO. 117, 118 (C17ORF37); and
the detection of the under-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
SET 29: SEQ ID NO. 1, 2, 3 (JDP1);
SET 35: SEQ ID NO. 67, 68, 69 (FLJ10193);
SET 36: SEQ ID NO. 70, 71, 72 (ESR1); and
SET 43: SEQ ID NO. 104, 105, 106 (DAXX).
In another embodiment, this method further comprises:
the detection of the over-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
SET 38: SEQ ID NO. 94, 95 (B3GNT3);
SET 40: SEQ ID NO. 99; and
SET 44: SEQ ID NO. 107, 108 (ACTR1A); and
the detection of the under-expression of at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
SET 31: SEQ ID NO. 20, 21 (CELSR2);
SET 33: SEQ ID NO. 49, 50, 51 (SCUBE2);
SET 37: SEQ ID NO. 91, 92, 93 (RHOBTB3);
SET 39: SEQ ID NO. 96, 97, 98 (NUDT14);
SET 41: SEQ ID NO. 100, 101 (CASKIN1);
SET 42: SEQ ID NO. 102, 103 (KIF5C);
SET 45: SEQ ID NO. 109, 110, 111 (MAPT); and
SET 46: SEQ ID NO. 112.
The number of sequences according to the various embodiments can vary in the range of from 1 to the total number of sequences described therein; e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115 or 120 sequences.
The number of sets according to the various embodiments can vary in the range of from 1 to the total number of sets described therein; e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44 or 45 sets.
Table 1 hereafter displays a library of polynucleotide sequences of SEQ ID NO. 1 to SEQ ID NO. 118 above. Table 1 indicates the name of the gene with its gene symbol, its clone reference (Image, or Ipsogen in italics) and for each gene the relevant sequence(s) defining the set (identification numbers: SEQ ID NO.). We conveniently define the nucleotide sequences by reference to different sets, but can also define the polynucleotide sequences by the name of the gene or subsequences thereof.
We provide a method in which the differential gene expression corresponds to an alteration of ERBB2 gene expression of some or all of the polynucleotide sequences from Table 1, or subsequences or complements thereof, in breast tumor and/or an alteration of an ER gene expression in breast tumor.
The detection of over- or under-expression of polynucleotide sequences according to the method can be carried out by any suitable technique, for example by FISH or IHC. It can be performed, for example, on nucleic acids obtained from a breast tissue sample or from a tumor cell line.
In one embodiment, the polynucleotides, or subsequences or complements thereof, are immobilized on DNA microarrays.
The detection of over- or under-expression of polynucleotide sequences according to the method can also be carried out at the protein level, for example, by detecting proteins expressed from nucleic acid in a breast tissue sample.
We provide particularly a method for monitoring the treatment of a patient with a breast cancer comprising the implementation of the above methods on nucleic acids or protein in a breast tissue sample from said patient.
Advantageously, the method is performed on patient scoring +2 with the HercepTest™ (see
Also advantageously, the method is performed on patients to determine their need to be pre-treated with an ERBB2 antagonist, e.g., Herceptin™ (trastuzumab), before surgical removal of ERBB2-positive primary breast tumors. Treatment with an ERBB2 inhibitor such as Herceptin™ before ablation could reduce tumor proliferation and metastatic risk stimulated by surgical resection.
We further provide a polynucleotide library useful for the molecular characterization of a breast cancer, comprising or corresponding to a pool of polynucleotide sequences over- or under-expressed in breast tissue. In one embodiment, the pool comprises or corresponds to at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
The pool can also comprise at least one, preferably at least two, more preferably three or all, polynucleotide sequence, subsequence or complement thereof, selected in each of predefined polynucleotide sequences sets of at least one of the following groups:
Set 12: SEQ ID NO. 4, 5, 6 (PSMB3); Set 13: SEQ ID NO. 10 (LOC148696); Set 14: SEQ ID NO. 12, 13 (NOL3/loc283849); Set 15: SEQ ID NO. 14, 15 (ITGA2B); Set 16: SEQ ID NO. 18, 19 (NFKBIE); Set 17: SEQ ID NO. 22, 23 (PADI2); Set 18: SEQ ID NO. 24, 25 (STAT3); Set 19: SEQ ID NO. 26, 27 (OAS2); Set 20: SEQ ID NO. 36, 37, 38 (CDKL5); Set 21: SEQ ID NO. 46, 47, 48 (CSTA); Set 22: SEQ ID NO. 52, 53, 115 (ITGB3); Set 23: SEQ ID NO. 56, 57, 58 (MKI67); Set 24: SEQ ID NO. 59, 60, 61 (PBEF); Set 25: SEQ ID NO. 62, 63, 64 (FADS2); Set 26: SEQ ID NO. 81, 82 (LOX); Set 27: SEQ ID NO. 88, 89, 90 (ITGA2); SET 28: SEQ ID NO. 11 (ESTAA878915); and
SET 29: SEQ ID NO. 1, 2, 3 (JDP1); SET 30: SEQ ID NO. 7, 8, 9 (NAT1); SET 31: SEQ ID NO. 20, 21 (CELSR2); SET 32: SEQ ID NO. 31, 32 (ESTN33243); SET 33: SEQ ID NO. 49, 50, 51 (SCUBE2); SET 34: SEQ ID NO. 65, 66 (ESTH29301); SET 35: SEQ ID NO. 67, 68, 69 (FLJ10193); and SET 36: SEQ ID NO. 70, 71, 72 (ESR1).
A specific polynucleotide library useful for the molecular characterization of a breast cancer comprises or corresponds to a pool of polynucleotide sequences over- or under-expressed in breast tissue, said pool comprising or corresponding to at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NO. 73, 74, 75, 76, 77 (ERBB2); Set 2: SEQ ID NO. 28, 29, 30 (GRB7); Set 3: SEQ ID NO. 83, 84, 85 (NR1D1); Set 4: SEQ ID NO. 78, 79, 80 (GATA4); Set 5: SEQ ID NO. 41, 42, 43 (CDH15); Set 6: SEQ ID NO. 16, 17 (LTA); Set 7: SEQ ID NO. 86, 87, 116 (MAP2K6); Set 8: SEQ ID NO. 54, 55, 113 (PECAM1); Set 9: SEQ ID NO. 44, 45 (PPARBP); Set 10: SEQ ID NO. 33, 34, 35 (PPP1R1B); Set 11: SEQ ID NO. 39, 40 (RPL19); Set 13: SEQ ID NO. 10 (LOC148696); Set 14: SEQ ID NO. 12, 13 (NOL3/loc283849); Set 15: SEQ ID NO. 14, 15 (ITGA2B); Set 16: SEQ ID NO. 18, 19 (NFKBIE); Set 18: SEQ ID NO. 24, 25 (STAT3); Set 19: SEQ ID NO. 26, 27 (OAS2); Set 20: SEQ ID NO. 36, 37, 38 (CDKL5); Set 21: SEQ ID NO. 46, 47, 48 (CSTA); Set 22: SEQ ID NO. 52, 53, 115 (ITGB3); Set 23: SEQ ID NO. 56, 57, 58 (MKI67); Set 24: SEQ ID NO. 59, 60, 61 (PBEF); Set 26: SEQ ID NO. 81, 82 (LOX); Set 27: SEQ ID NO. 88, 89, 90 (ITGA2); SET 29: SEQ ID NO. 1, 2, 3 (JDP1); SET 33: SEQ ID NO. 49, 50, 51 (SCUBE2); SET 34: SEQ ID NO. 65, 66 (ESTH29301/NA); SET 35: SEQ ID NO. 67, 68, 69 (FLJ10193); and SET 36: SEQ ID NO. 70, 71, 72 (ESR1).
A further specific polynucleotide library useful for the molecular characterization of a breast cancer comprises or corresponds to a pool of polynucleotide sequences over- or under-expressed in breast tissue, said pool comprising or corresponding to at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NO. 73, 74, 75, 76, 77 (ERBB2); Set 2: SEQ ID NO. 28, 29, 30 (GRB7); Set 6: SEQ ID NO. 16, 17 (LTA); Set 23: SEQ ID NO. 56, 57, 58 (MKI67); and SET 36: SEQ ID NO. 70, 71, 72 (ESR1).
A further specific polynucleotide library useful for the molecular characterization of a breast cancer-comprises or corresponds to a pool of polynucleotide sequences over- or under-expressed in breast tissue, said pool comprising or corresponding to at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NO. 75, 76, 77 (ERBB2); Set 2: SEQ ID NO. 28, 29, 30 (GRB7); Set 4: SEQ ID NO. 78, 79, 80 (GATA4); Set 5: SEQ ID NO. 41, 42, 43 (CDH15); SET 31: SEQ ID NO. 20, 21 (CELSR2); SET 36: SEQ ID NO. 70, 71, 72 (ESR1); and SET 48: SEQ ID NO. 117, 118 (C17ORF37).
A further specific polynucleotide library useful for the molecular characterization of a breast cancer comprises or corresponds to a pool of polynucleotide sequences over- or under-expressed in breast tissue, said pool comprising or corresponding to at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of:
Set 1: SEQ ID NO. 73, 74, 75, 76, 77 (ERBB2); Set 2: SEQ ID NO. 28, 29, 30 (GRB7); Set 3: SEQ ID NO. 83, 84, 85 (NR1D1); Set 4: SEQ ID NO. 78, 79, 80 (GATA4); Set 5: SEQ ID NO. 41, 42, 43 (CDH15); Set 6: SEQ ID NO. 16, 17 (LTA); Set 7: SEQ ID NO. 86, 87, 116 (MAP2K6); Set 8: SEQ ID NO. 54, 55, 113 (PECAM1); Set 9: SEQ ID NO. 44, 45 (PPARBP); Set 13: SEQ ID NO. 10 (LOC148696); Set 18: SEQ ID NO. 24, 25 (STAT3); Set 20: SEQ ID NO. 36, 37, 38 (CDKL5); Set 21: SEQ ID NO. 46, 47, 48 (CSTA); Set 22: SEQ ID NO. 52, 53, 115 (ITGB3); Set 23: SEQ ID NO. 56, 57, 58 (MKI67); Set 24: SEQ ID NO. 59, 60, 61 (PBEF); Set 27: SEQ ID NO. 88, 89, 90 (ITGA2); Set 28: SEQ ID NO. 11 (ESTAA878915); SET 29: SEQ ID NO. 1, 2, 3 (JDP1); SET 35: SEQ ID NO. 67, 68, 69 (FLJ10193); SET 36: SEQ ID NO. 70, 71, 72 (ESR1); SET 43: SEQ ID NO. 104, 105, 106 (DAXX); SET 47: SEQ ID NO. 114; and SET 48: SEQ ID NO. 117, 118 (C17ORF37).
This pool may further comprise at least one, preferably at least two, more preferably three or all, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of predefined polynucleotide sequences sets consisting of: SET 31: SEQ ID NO. 20, 21 (CELSR2); SET 33: SEQ ID NO. 49, 50, 51 (SCUBE2); SET 37: SEQ ID NO. 91, 92, 93 (RHOBTB3); SET 38: SEQ ID NO. 94, 95 (B3GNT3); SET 39: SEQ ID NO. 96, 97, 98 (NUDT14); SET 40: SEQ ID NO. 99; SET 41: SEQ ID NO. 100, 101 (CASKIN1); SET 42: SEQ ID NO. 102, 103 (KIF5C); SET 44: SEQ ID NO. 107, 108 (ACTR1A); SET 45: SEQ ID NO. 109, 110, 111 (MAPT); and SET 46: SEQ ID NO. 112.
The term “pool”, as used herein, refers to a number of sequences that may vary in a range of from 1 to the total number of polynucleotide sequences, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115 or 120 sequences.
The polynucleotide libraries can be immobilized on a solid support to form an array. The solid support can, for example, be selected from the group consisting of nylon membrane, nitrocellulose membrane, glass slide, glass beads, membranes on glass support or a silicon chip.
Thus, a method comprises:
(a) obtaining nucleic acids from a breast tissue sample from a patient;
(b) reacting said nucleic acids obtained in step (a) with a polynucleotide library; and
(c) detecting the reaction product of step (b).
The polynucleotide sample can be labeled, e.g., before reaction step (b), and the label of the polynucleotide sample can be selected from the group consisting of radioactive, colorimetric, enzymatic, molecular amplification, bioluminescent or fluorescent labels. For example, a preferred label can be selected from the group consisting of biotin and digoxigenin.
The method can further comprise obtaining a control sample comprising polynucleotides, reacting said control sample with a polynucleotide library, detecting a control sample reaction product and comparing the amount of said polynucleotide sample reaction product to the amount of said control sample reaction product.
By “nucleic acids” is meant polynucleotides; e.g., isolated polynucleotides, such as deoxyribonucleic acid (DNA), and, where appropriate, ribonucleic acid (RNA). “Nucleic acids” should also be understood to include, as equivalents, analogs of RNA or DNA made from nucleotide analogs, and, as applicable to the embodiment being described, single (sense or antisense) and double-stranded polynucleotides. ESTs, chromosomes, cDNAs, mRNAs, and rRNAs are representative examples of molecules that may be referred to as nucleic acids. DNA can be obtained, for example, from said nucleic acids sample and RNA can be obtained, for example, by transcription of said DNA. In addition, mRNA can be isolated from said nucleic acids sample and cDNA can be obtained by reverse transcription of said mRNA.
In a further embodiment, a method can be performed at the protein level. Such a method can comprise:
(a) obtaining proteins from a breast tissue sample from a patient; and
(b) measuring proteins in the sample obtained in step (a), in which the level of proteins in the sample corresponds to proteins coded by a polynucleotide library. It is understood that the proteins can be obtained directly from the sample, e.g., by standard extraction or isolation techniques, or can be obtained by translation of mRNA obtained from the samples.
Our methods are useful for detecting, diagnosing, staging, monitoring, predicting, or preventing conditions associated with breast cancer. They are particularly useful for predicting clinical outcome of breast cancer and/or predicting occurrence of metastatic relapse and/or determining the stage or aggressiveness of a breast disease in at least about 50%, e.g., at least about 55%, e.g., at least about 60%, e.g., at least about 65%, e.g., at least about 70%, e.g., at least about 75%, e.g., at least about 80%, e.g., at least about 85%, e.g., at least about 90%, e.g., at least about 95%, e.g., about 100% of the patients. The methods are also useful for selecting more appropriate doses and/or schedules for administering chemotherapeutics and/or biopharmaceuticals and/or radiation therapy to circumvent toxicities in a patient.
By “aggressiveness of a breast disease” is meant, e.g., cancer growth rate or potential to metastasize; a so-called “aggressive cancer” will grow or metastasize more rapidly than a non-aggressive cancer, or significantly affect overall health status and quality of life.
By “predicting clinical outcome” is meant, e.g., the ability for a skilled artisan to classify patients into at least two prognostic classes (good vs. poor) showing significantly different long-term Metastasis Free Survival (MFS).
We also provide a method for treating a patient with a breast cancer, comprising i) implementing a method of analyzing differential gene expression profile on a sample from said patient, and ii) determining a treatment for this patient based on the analysis of differential gene expression profile obtained with said method. “Treating” encompasses palliative care as well as ameliorating at least one symptom of the condition or disease.
The methods can achieve high specificity and sensitivity levels of at least about 80%, e.g., about 85%, e.g., about 90%, e.g., about 93%, e.g., about 95%, e.g., about 97%, e.g., about 99% in predicting the clinical outcome, in predicting occurrence of metastatic relapse, or in determining the stage or aggressiveness of breast cancer.
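As an illustration of how the sensitivity and specificity levels mentioned above are conventionally computed, the following minimal sketch derives both values from confusion-matrix counts. The function name and the example counts are hypothetical and are given for illustration only; they are not data from this disclosure.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: patients with the outcome who were / were not predicted to have it.
    tn/fp: patients without the outcome who were / were not predicted free of it.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # fraction of actual positives detected
    specificity = tn / (tn + fp)  # fraction of actual negatives recognized
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=18, fn=2, tn=45, fp=5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```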
a represents the analysis of protein expression using immunohistochemistry on tissue microarray sections. “TMA1” indicates a hematoxylin-eosin staining (H & E) of paraffin block section (25×30 mm) from TMA1 containing 552 tumors and control samples. Examples of IHC staining are indicated by the numbers 1-4. Section 1 shows a sample with ERBB2 expression equal to 3+ and section 2 shows a sample with no detected ERBB2 expression. Section 3 shows a sample with GATA4 expression equal to Q=300, and section 4 shows a sample with no GATA4 expression.
b represents the analysis of ERBB2 gene copy number in breast tumors using fluorescence in situ hybridization (FISH) on tissue microarray sections. “TMA2” indicates H & E staining of paraffin block section (25×30 mm) from TMA2 containing 94 tumors. Below the TMA2 section, two sections of invasive breast carcinomas are shown, the first with ERBB2 amplification and the second with normal gene copy number. Red dots (arrows) represent ERBB2 copies and green dots represent centromere 17, on interphase chromosomes.
Herceptest™ is the first co-approval of a molecular diagnostic and a therapeutic agent, consisting of: stringent standardization of HER-2/neu antisera and IHC protocols; increased awareness for scrupulous quality control; standardized, universal controls and a system for pathological scoring; and results interpreted by pathologists specifically trained to consistently score Her-2 immunostaining (i.e., use of reference laboratories).
As shown in
A weak positive result on the Herceptest™ would depict weak to moderate complete membrane staining in more than 10 percent of the tumor cells.
A strong positive result on the Herceptest™ would depict strong complete membrane staining in more than 10 percent of the tumor cells.
The row/column representation principle in
We thus provide a set of genes, the analysis of which produces a gene expression profile that can discriminate between ERBB2+ and ERBB2− breast tumors.
Content of the signature
The identity of the discriminator genes gives insight into the underlying biological mechanisms associated with ERBB2 status and with the aggressive phenotype of ERBB2+ breast cancers. They also provide new diagnostic, prognostic and predictive factors, as well as new therapeutic targets.
Twenty-nine genes/ESTs were significantly over-expressed in ERBB2+ tumors. Without wishing to be bound by any theory, their co-expression may indicate co-amplification (same chromosomal location), regulation by ERBB2, coregulation by common factors or association with unknown phenotypic feature of disease. In addition to ERBB2 itself, there were 6 genes from region q12 of chromosome 17 in the signature (See
Collectively, these data indicate that neoangiogenesis and/or changes in blood vessel organization may play an important role in the pathogenesis of these tumors, and confirm that Herceptin and anti-cancer agents have an additive and/or synergistic activity. Other genes in the near vicinity of ERBB2 locus may be co-amplified with ERBB2 gene but may not be expressed due to the absence of an appropriate promoter or to repression. It is known that only a small proportion of genes from a given amplicon are over-expressed (37).
Other over-expressed genes were not located on chromosome arm 17q. CDH15, also called M-Cadherin or myotubule cadherin, is expressed in myoepithelial cells and may play a role in the muscle-like differentiation of these cells. Again, without wishing to be bound by any theory, this might suggest that ERBB2+ tumors have a certain degree of myoepithelial differentiation; alternatively they may be characterized by a high degree of dedifferentiation with appearance of new markers (this may also be true for other RNAs such as PECAM1).
An interesting finding was GATA4, whose co-expression with ERBB2 was validated at the protein level. This gene codes for a transcription factor of the GATA family (38). It is expressed in adult vertebrate heart, gut epithelium, and gonads. GATA4 is essential for cardiovascular development (39, 40), and regulates genes critical for myocardial differentiation and function. Likewise, ERBB2 is essential for heart development (41; reviewed in 42). Therefore, without wishing to be bound by any theory, ERBB2 may exert some of its downstream effects through GATA4 or, alternatively, GATA4 may stimulate ERBB2 gene transcription by positive feedback regulation.
MAP2K6 is also strongly expressed in cardiac muscle (43). The major adverse effect of Herceptin is cardiotoxicity (44). Investigation of the functional relationship between ERBB2, GATA4 and MAP2K6 may enhance current understanding of cardiotoxicities associated with ERBB2 antagonists, and contribute to design ways to circumvent this side-effect. Activation of GATA4 is thought to occur through RHO GTPases (45, 46), which are also central to the physiologic and pathophysiologic functions of integrins and cadherins (47, for review).
The data disclosed herein also show variability in ERBB2 and/or GATA4 gene expression, and ERBB2 and GATA4 co-variability may potentially serve as an indicator of patient risk for cardiotoxicity from Herceptin treatment. Therefore, we also provide a method for determining the risk of adverse cardiovascular secondary events for patients treated with Herceptin, comprising the analysis of the differential expression of the GATA4 gene from a sample or cell line of said patient.
As discussed above, we provide a method comprising the detection of the over- or under-expression of at least one, preferably at least two or more preferably three, polynucleotide sequence(s), subsequence(s) or complement(s) thereof, selected from each of at least one predefined polynucleotide sequence sets consisting of:
Set 1: SEQ ID NO. 73, 74, 75, 76, 77 (ERBB2); and
Set 4: SEQ ID NO. 78, 79, 80 (GATA4).
The MKI67 gene encodes the proliferation marker Ki67/MIB1. This marker was upregulated in ERBB2+ samples, suggesting that ERBB2+ tumors are proliferative tumors. Immunohistochemical results on ˜250 TMA1 tumors for ERBB2 and Ki67 stainings showed that expression of both proteins was correlated, confirming gene clustering at the protein level, in agreement with recent reports (48, 49). The over-expression of the CSTA gene, which encodes cystatin A, a cysteine protease inhibitor of the stefin family that acts as an endogenous inhibitor of cathepsins, can be put in perspective with the finding of Oh et al. (14) on the downregulation of cathepsin D in ERBB2-transfected MCF-7 cells. Finally, the presence of genes encoding two structurally-related factors, lymphotoxin A (LTA) and preB-cell colony-enhancing factor (PBEF), and of NFKBIE implies that specific immune and inflammatory mechanisms may be associated with ERBB2+ tumors.
Five genes with known function were downregulated in ERBB2-positive tumors. Interestingly, one of these was ESR1, which encodes estrogen receptor α, an important modulator of hormone-dependent mammary oncogenesis. It is recognized that most ERBB2-amplified tumors are ER-negative and are resistant to hormone therapy (50-53). Moreover, an interplay between ERBB2 and ER pathways has been demonstrated (54). SCUBE2, a gene encoding a secreted protein with an EGF-like domain (55), and CELSR2, which encodes a non-classical cadherin, might have antagonistic regulatory roles of ERBB2 activities at the cell membrane. SCUBE2 and NAT1 were associated with ESR1 in a gene expression signature associated with ER positivity (24).
Several recent gene expression studies have addressed the issue of ERBB2 status and function in breast cancer. Most of them used cancer cell lines, and others included tissue samples.
An early large-scale study of the ERBB2 amplicon was done on 7 breast tumor cell lines by Kauraniemi et al. (30) using a custom-made cDNA microarray that included 217 clones from chromosome region 17q12. ERBB2, GRB7 and PPP1R1B were consistently over-expressed when amplified, in conjunction with other genes that were not on the microarray constructed from libraries. Willis et al. (56) used a commercially available oligonucleotide chip (Affymetrix GeneChip Hu35K) to study mRNA from 12 breast tumors and from two cell lines also typed using comparative genomic hybridization. A total of 20 known genes showed significant over-expression in tumors with gains of region 17q12-23. These included ERBB2, GRB7 and PPARBP, but also MLLT6, KRT10 and TUBG1, which were not identified in the gene signature.
Wilson et al. (31) used a commercially available “breast specific” nylon microarray with ˜5,000 cDNAs to study cell lines and two sets of 5 ERBB2-positive and negative pooled breast tumors. Only a few genes from 17q were among the upregulated genes; these included RPL19 and LASP1. Dressman et al. (57) studied 34 tumors and established a gene expression signature specific to ERBB2+ samples that contained several 17q genes including GRB7, NR1D1, PSMB3, and RPL19. Sorlie et al. (24) have also defined an ERBB2+ signature with five genes from 17q12, including ERBB2 and GRB7.
Genes located in the vicinity of ERBB2 are frequently co-upregulated following DNA amplification. This phenomenon is less marked for genes located further from ERBB2, which may be included only when the amplification affects a large segment of the region. Some of the genes close to ERBB2 did not appear in the present signature, whereas they were upregulated in other studies (e.g., LASP1, MLLT6). This may be due to a different proportion of tumors with variably-sized amplicons in the analyzed panels.
While amplification of region 17q12-21 can affect ERBB2 chromosomal neighbors, ERBB2 protein over-expression can affect downstream targets and possibly also upstream regulators via positive feedback regulatory mechanisms. Balance in cadherins and integrins and functional processes associated with cell-matrix adhesive systems seem particularly affected in ERBB2-positive tumors (31). This suggests that ERBB2 oncogenic activity may be associated with cell motility, as has been proposed previously (58, 59).
A recent study, using DNA microarrays from the Sanger center containing ˜6,000 unique genes/ESTs, has described the transcriptional changes associated with a series of 61 genes following over-expression of a transfected ERBB2 gene in an immortalized HB4a human mammary luminal epithelial cell line (60). Previously, several studies had identified genes whose transcription is affected by ERBB2 over-expression or amplification using differential screening (14, 61). Some of these genes are located near the ERBB2 locus. The present gene expression signature (GES) shares no common gene with the list of Kumar-Sinha et al. (62), established by comparing cell lines including an ERBB2-transfected cell line; however, a gene related to fatty acid biology, FADS2, is part of the present gene expression signature.
Tiwari et al. (63) reported a relationship between ERBB2, fatty acids and 2′,5′ oligoadenylate synthetases (OAS2), which is included in the present “ERBB2 cluster” (See the figures). Peroxisome proliferator-activated receptors (PPARs) are known regulators of lipid metabolism; their trans-activating capacity depends on the recruitment of auxiliary proteins (64, for review). Modifications of fatty acid metabolism in ERBB2+ tumors may thus be associated with over-expression of PPARBP.
Alteration of ERBB2 expression is associated with poor prognosis (unfavorable clinical outcome with metastasis and death) and can be countered by a targeted therapy based on a humanized antibody, trastuzumab (Herceptin™). Therefore, the determination of ERBB2 status is important in breast cancer management. Accurate quantitation of ERBB2 expression, however, has proved to be difficult since both IHC and FISH have limitations and can be influenced by many variables (9-13). As a consequence, there is still no consensus on the best method for assessing ERBB2 status. In routine practice, IHC, which more than FISH detects the actual target of Herceptin™, is faster and more economical but highly dependent on fixative conditions, staining procedures, scoring system, quality controls and interlaboratory standardization. In addition, results are often difficult to interpret since a number of cases show only moderate over-expression of the protein, and the results are subject to interobserver variability. FISH methods are quantitative and sensitive (65), but are also expensive and time-consuming and require specialized expertise and equipment. Indeed, variable concordance between IHC and FISH has led to the current practice of testing 2+ HercepTest patients by both IHC and FISH before making a clinical decision on whether to recommend treatment with ERBB2 antagonists.
The work carried out shows the potential of DNA microarray-based gene expression profiling to establish ERBB2 status, and to identify among ERBB2 2+ cases those with gene amplification and those without.
Our methods will now be illustrated by the following non-limiting examples.
Using DNA microarrays, 217 breast cancer samples obtained from 210 women treated at the Institut Paoli-Calmettes between 1988 and 2001 were studied. Inclusion criteria of samples were: i) sporadic primary localized breast cancer treated with surgery followed by adjuvant anthracycline-based chemotherapy, and ii) tumor material quickly dissected and frozen in liquid nitrogen and stored at −160° C. Exclusion criteria included locally advanced or inflammatory or metastatic forms. The main characteristics of patients and tumors are listed in Table 2 below.
Immunohistochemical parameters collected included estrogen receptor (ER), progesterone receptor (PR) and P53 status (positivity cut-off values of 1%), and ERBB2 status (0-3+ score as illustrated by the HercepTest kit scoring guidelines). All tumor sections were reviewed de novo by two pathologists prior to analysis, and all samples contained more than 50% tumor cells. The series of 217 samples was divided into two sets: a first set of 163 samples, from which was derived, before supervised analysis, a “learning” set of 145 samples, and a second set of 54 samples designated the “validation” set.
A consecutive series of 552 women with unilateral localized invasive breast carcinomas treated at the Institut Paoli-Calmettes between June 1981 and December 1999 was studied using a first TMA designated TMA1. Of the 552 cases studied, 257 were available for ERBB2, GATA4, ER and Ki67 staining. According to the WHO classification, there were 194 ductal, 26 lobular, 10 tubular, 3 medullary carcinomas and 24 other histological types. The average age at diagnosis was 59 years, median age 60, with a range of 25 to 91 years. A total of 135 tumors were associated with lymph node invasion, and 199 were positive for ER. A set of 94 tumors (chosen within tumors analyzed by DNA microarrays) was included in a second TMA designated TMA2.
Except for SUM-52, SUM-102, and SUM-149 (a gift of S. P. Ethier, Ann Arbor, Mich.), the breast cancer cell lines (BT474, HCC38, HCC1395, HCC1569, HCC1937, MDA-MB-157, MDA-MB-231, MDA-MB-453, SK-BR-3, SK-BR-7, T47D, UACC-812, and ZR-75-1) were obtained from the American Type Culture Collection (ATCC; Rockville, Md.). All cell lines were grown according to the recommendations of the supplier.
Total RNA was extracted from frozen tumor samples and cell lines by standard methods using guanidinium isothiocyanate solution and centrifugation on cesium chloride cushion, as previously described in (25), the entire disclosure of which is herein incorporated by reference. RNA integrity was controlled by electrophoresis on agarose gels and by Agilent analysis (Bioanalyzer, Palo Alto, Calif.) before labeling.
PCR products from a total of 9038 IMAGE clones, including 3910 expressed sequence tags (ESTs) and 5125 known genes, were spotted on 12×8.5 cm² nylon filters with a Microgrid II robot (Biorobotics Apogent Discoveries). Several controls were included in the microarrays, such as poly(A)+ stretches, plant cDNAs, and PCR controls. Microarray spotting and hybridization processes were done as previously described in (19), the entire disclosure of which is herein incorporated by reference.
Hybridizations of microarray membranes were done with radioactive [alpha-33P]-dCTP-labeled probes made from 5 μg of total RNA from each sample according to described protocols. Membranes were then washed, exposed to phosphor-imaging plates and scanned with a FUJI BAS 1500 machine. Signal intensities were quantified with ArrayGauge software (Fuji, Dusseldorf, Germany), normalized for the amount of spotted DNA as described in (21), the entire disclosure of which is herein incorporated by reference, and for the variability of experimental conditions using non-linear rank-based methods as described in (26), the entire disclosure of which is herein incorporated by reference, then log-transformed. We first applied supervised analysis to identify the optimal set of genes which best discriminated between ERBB2-negative and positive breast cancer samples. The positivity cut-off of ERBB2 status was defined by protein expression using IHC (HercepTest™ kit): positive status was defined as 3+ and negative status as 0 or 1+ (See
ProfileSoftware™ Corporate (Ipsogen, Marseille) was utilized for all analyses. This program uses a discriminating score (DS) (17) combined with iterative random permutation tests. The DS was calculated for each gene as DS=(M1−M2)/(S1+S2), where M1 and S1 respectively represent the mean and standard deviation of expression levels of the gene in subgroup 1 (ERBB2-positive), and M2 and S2 those in subgroup 2 (ERBB2-negative). Statistical confidence levels were estimated by bootstrap resampling as previously described in (27), the entire disclosure of which is herein incorporated by reference, with a false positive rate of 2/10000.
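By way of illustration only, the discriminating score defined above can be sketched for a single gene as follows. This is not the ProfileSoftware™ implementation; the function name is hypothetical, the expression values are invented, and the use of the sample (n−1) standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def discriminating_score(pos_values, neg_values):
    """DS = (M1 - M2) / (S1 + S2) for one gene.

    pos_values: log-transformed expression levels in subgroup 1 (ERBB2-positive).
    neg_values: log-transformed expression levels in subgroup 2 (ERBB2-negative).
    Assumption: S1 and S2 are sample standard deviations (n - 1 denominator).
    """
    m1, s1 = mean(pos_values), stdev(pos_values)
    m2, s2 = mean(neg_values), stdev(neg_values)
    if s1 + s2 == 0:
        raise ValueError("DS is undefined when both subgroups have zero spread")
    return (m1 - m2) / (s1 + s2)


# Invented values for one gene: M1 = 4, S1 = 1, M2 = 2, S2 = 1.
print(discriminating_score([3.0, 4.0, 5.0], [1.0, 2.0, 3.0]))
```

A large positive DS indicates higher expression in the ERBB2-positive subgroup relative to the combined spread; a large negative DS indicates the reverse.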
Briefly, approximately two-thirds (n=106) of the samples from the learning set (n=145) were randomly selected to include at least 20 ERBB2-positive cases. They were then submitted to the supervised analysis described above. The process was repeated 30 times (30 randomly defined subgroups of 106 samples), thus generating 30 lists of genes. These lists were then compared and a gene was considered a discriminator if present in at least 25 gene lists out of 30, allowing the identification of the most relevant genes, independent of the sample set used.
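The iterative selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the program actually used: `select_genes` is a hypothetical callable standing in for the supervised analysis of one subset, subsets with too few ERBB2-positive cases are simply redrawn, and the seed and draw limit are arbitrary.

```python
import random
from collections import Counter


def stable_discriminators(samples, select_genes, n_iter=30, subset_size=106,
                          min_lists=25, min_positive=20, seed=0):
    """Return genes selected in at least `min_lists` of `n_iter` random subsets.

    samples      -- list of (sample_id, is_erbb2_positive) tuples
    select_genes -- hypothetical callable: subset -> iterable of gene IDs
    """
    n_pos = sum(1 for _, pos in samples if pos)
    if subset_size > len(samples) or n_pos < min_positive:
        raise ValueError("sample set too small for the requested subsets")
    rng = random.Random(seed)
    counts = Counter()
    done = draws = 0
    while done < n_iter:
        draws += 1
        if draws > 1000 * n_iter:
            raise RuntimeError("could not draw enough valid subsets")
        subset = rng.sample(samples, subset_size)
        if sum(1 for _, pos in subset if pos) < min_positive:
            continue  # redraw: too few ERBB2-positive cases in this subset
        counts.update(set(select_genes(subset)))  # one vote per gene per list
        done += 1
    return sorted(g for g, c in counts.items() if c >= min_lists)
```

With the defaults, a gene is retained only if it appears in at least 25 of the 30 gene lists, mirroring the thresholds given in the text.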
Unsupervised hierarchical clustering was applied to investigate relationships between samples and relationships between genes identified by supervised analysis. The hierarchical clustering was applied to data log-transformed and median-centred on genes using the ProfileSoftware™ Corporate program (Ipsogen, Marseille) (average linkage clustering using uncentered Pearson correlation as similarity metric) and results were displayed with the same program.
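For illustration, the clustering steps named above (median-centering on genes, uncentered Pearson correlation as similarity, average linkage) can be sketched in plain Python. This is a toy sketch under assumptions, not the ProfileSoftware™ implementation: distance is taken as one minus the similarity, the function names are hypothetical, and the profiles are invented.

```python
from math import sqrt
from statistics import mean, median


def median_center(rows):
    """Subtract each row's (gene's) median from its values."""
    centered = []
    for row in rows:
        med = median(row)
        centered.append([v - med for v in row])
    return centered


def uncentered_pearson(x, y):
    """Correlation computed without subtracting the means (cosine similarity)."""
    num = sum(a * b for a, b in zip(x, y))
    den = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    if den == 0:
        raise ValueError("zero vector has no defined correlation")
    return num / den


def average_linkage(vectors):
    """Agglomerate vectors; return the merges as pairs of index tuples."""
    def dist(i, j):
        return 1.0 - uncentered_pearson(vectors[i], vectors[j])

    clusters = [(i,) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average linkage: mean pairwise distance between members.
                d = mean(dist(i, j) for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merges.append((clusters[a], clusters[b]))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Invented profiles: rows 0 and 1 are proportional after centering.
profiles = median_center([[1.0, 2.0, 6.0], [2.0, 4.0, 12.0], [6.0, 2.0, 1.0]])
print(average_linkage(profiles))
```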
Two TMAs, TMA1 (552 samples) and TMA2 (94 samples), were prepared as described in (28) with slight modifications (29), the entire disclosures of which are herein incorporated by reference. For each tumor, a representative tumor area was carefully selected by histopathological analysis of a hematoxylin-eosin stained section of a donor block. Core cylinders (one for each tumor for TMA2 and three for each tumor for TMA1) with a diameter of 0.6 mm for TMA1 and 2 mm for TMA2 were punched from this area and deposited into a recipient paraffin block using a specific arraying device (Beecher Instruments, Silver Spring, Md.). In addition to tumor tissues, the recipient block also included normal breast and established breast tumor cell lines to serve as internal controls: BT-474, known to have four to eight-fold amplification of the ERBB2 gene, and MCF-7, whose chromosomes 17 each have one copy of the ERBB2 gene (30). Five-μm sections of the resulting array block were mounted onto glass slides and used for IHC (TMA1) and FISH (TMA2) analyses. The reliability of the method was assessed by comparison with conventional sections for the usual prognostic parameters (including estrogen receptor and ERBB2); the value of the kappa test was 0.95 (29).
The following antibodies were used for IHC: polyclonal antibody anti-ERBB2 (Dako-HercepTest™, Copenhagen, Denmark), used strictly following the guidelines described by the manufacturer; goat polyclonal antibody anti-GATA4 (sc-1237, 1:50 dilution; Santa Cruz Biotechnology, Inc., Santa Cruz, Calif.); anti-MIB1/Ki67 (1:100 dilution, Dako); and anti-ER (clone 6F11, 1:60 dilution, Novocastra Laboratories).
IHC was done on five-μm sections of TMA1. Briefly, tissues were deparaffinized in Histolemon (Carlo Erba Reagenti, Rodano, Italy) and rehydrated in graded alcohol. Antigen retrieval was done by incubation at 98° C. in citrate buffer. Slides were transferred to a Dako autostainer, except for Dako-HercepTest™ where guidelines are imposed by the manufacturer. Staining was done at room temperature as follows: after washes in phosphate buffer, endogenous peroxidase activity was quenched by treatment with 0.1% H2O2, slides were pre-incubated with blocking serum (Dako Corporation) for 10 min, then incubated with the affinity-purified antibody for one hour. After washes, slides were incubated with biotinylated antibody against rabbit IgG for 20 min followed by streptavidin-conjugated peroxidase (Dako LSAB®2 kit). Immunoreactive complexes were visualized with the peroxidase substrate, diaminobenzidine, counter-stained with hematoxylin, and coverslipped using Aquatex (Merck, Darmstadt, Germany) mounting solution. Slides were evaluated under a light microscope by three pathologists.
Immunoreactivities for GATA4 and ER were classified by estimating the percentage (P) of tumor cells showing characteristic staining (from undetectable level or 0%, to homogeneous staining or 100%) and by estimating the intensity (I) of staining (weak staining or 1, moderate staining or 2, strong staining or 3). Results were scored by multiplying the percentage of positive cells by the intensity, i.e., by the so-called quick score (Q) (Q=P×I; maximum=300). For Ki67, only the percentage (P) of tumor cells was estimated, since intensity does not vary, and for ERBB2, the status was defined using the Dako scale. Expression levels allowed the tumors to be grouped in two categories: no expression (Q=0 for GATA4 and ER, P<20 for Ki67, and 0/1+ for ERBB2), and expression (Q>0 for GATA4 and ER, P>20 for Ki67, and 2+/3+ for ERBB2). The average of the score of a minimum of two core biopsies was calculated for each case of TMA1.
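The quick score arithmetic described above can be illustrated with the following minimal sketch. The function names are hypothetical and the core values are invented; treating a core with 0% stained cells as Q=0 regardless of intensity is an assumption made for the sketch.

```python
from statistics import mean


def quick_score(percent_positive, intensity):
    """Q = P x I, with P in 0..100 and I in {1, 2, 3}; maximum 300."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("P must be between 0 and 100")
    if percent_positive == 0:
        return 0  # assumption: no stained cells means Q = 0
    if intensity not in (1, 2, 3):
        raise ValueError("I must be 1 (weak), 2 (moderate) or 3 (strong)")
    return percent_positive * intensity


def case_quick_score(cores):
    """Average Q over the cores of one case; at least two cores are required."""
    if len(cores) < 2:
        raise ValueError("a minimum of two core biopsies is required")
    return mean(quick_score(p, i) for p, i in cores)


def is_expressed(q):
    """GATA4 and ER: expression when Q > 0, no expression when Q = 0."""
    return q > 0


# Invented example: two cores scored (P=100, I=3) and (P=50, I=2).
q = case_quick_score([(100, 3), (50, 2)])
print(q, is_expressed(q))
```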
FISH for ERBB2 gene amplification was done on TMA2 using the Dako ERBB2 FISH PharmDX™ Kit according to the manufacturer's instructions. In brief, TMA2 sections were baked overnight at 55° C., deparaffinized in Histolemon (Carlo Erba Reagenti, Rodano, Italy), rehydrated in graded alcohol and washed in Dako wash buffer. Slides were pretreated by immersion in Dako pretreatment solution at 97° C. for 10 min and cooled to room temperature. Slides were then washed in Dako wash buffer and immersed in Dako pepsin at room temperature for 10 min. Pepsin was removed with two changes of wash buffer. Slides were dehydrated in graded alcohol. Ten μl of HER2/CEN17 (centromere 17) Probe Mix (Dako) was added to the sample area of each section. Sections were coverslipped and the edges were sealed with rubber cement. Slides were placed on a flat metal surface and heated at 82° C. for 5 min to codenature the probe and target DNA, and transferred to a preheated humidified hybridization chamber to hybridize the probe and DNA for 18 h at 45° C. After hybridization, the rubber cement and the coverslips were removed from the slides. Sections were washed in wash buffer at 65° C. then at room temperature. Slides were dehydrated in graded alcohol and air-dried in the dark. Nuclei were counterstained with 15 μl of DAPI/antifade and coverslipped. Slides were stored at −4° C. in the dark for up to 7 days prior to analysis.
Sections were examined with a fluorescence microscope (Zeiss-Axiophot) using the filter recommended by Dako. The invasive lesion selected for the TMA2 was easily localized under the microscope. Approximately forty malignant, non-overlapping cell nuclei were scored for each case; nuclei were included only if HER2 and CEN17 signals were clearly detected. A ratio of HER2/CEN17 was calculated for each specimen that met these inclusion criteria. ERBB2 was considered amplified when the HER2/CEN17 FISH ratio was ≥2.0. Each assay was read twice by two observers. Specimens were considered negative when less than 10% of tumor cells showed amplification of ERBB2.
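As a worked illustration of the amplification criterion above, the following sketch computes a HER2/CEN17 ratio from per-nucleus signal counts and applies the 2.0 cut-off. The text does not specify how the ratio was aggregated over nuclei; dividing the total HER2 signals by the total CEN17 signals over scorable nuclei is an assumption, and the function names and counts are hypothetical.

```python
def her2_cen17_ratio(nuclei):
    """Ratio of total HER2 signals to total CEN17 signals.

    nuclei: list of (her2_signals, cen17_signals), one pair per scored nucleus.
    Nuclei lacking either signal are skipped, as only nuclei with both
    signals clearly detected are scored.
    """
    scorable = [(h, c) for h, c in nuclei if h > 0 and c > 0]
    if not scorable:
        raise ValueError("no scorable nuclei")
    total_her2 = sum(h for h, _ in scorable)
    total_cen17 = sum(c for _, c in scorable)
    return total_her2 / total_cen17


def is_amplified(ratio, threshold=2.0):
    """ERBB2 is considered amplified when HER2/CEN17 >= 2.0."""
    return ratio >= threshold


# Invented counts for two specimens.
amplified = her2_cen17_ratio([(8, 2), (10, 2), (6, 2)])   # 24 / 6
normal = her2_cen17_ratio([(2, 2), (3, 2), (0, 2)])       # 5 / 4, last nucleus skipped
print(amplified, is_amplified(amplified))
print(normal, is_amplified(normal))
```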
Correlations between hierarchical clustering-based tumor groups and molecular and histoclinical parameters were investigated by using the chi-square test. All p-values were two-sided at the 5% level of significance. Distributions of molecular markers analyzed by TMA1 were compared using Fisher's exact test.
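For readers unfamiliar with the exact test mentioned above, a two-sided Fisher's exact test on a 2×2 table can be sketched from the hypergeometric distribution as follows. This is a generic textbook formulation, not the statistical software used in the study; the function name and the example table are hypothetical, and the small relative tolerance is an implementation choice to guard against floating-point ties.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical table: marker A positive/negative versus marker B positive/negative.
print(fisher_exact_two_sided(3, 0, 0, 3))
```

A p-value below 0.05 would be read as significant at the 5% level stated in the text.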
The mRNA expression profiles from 217 different human breast cancer samples and 16 breast cancer cell lines were determined with cDNA microarrays containing ~9,000 spotted PCR products from known genes and ESTs. Analysis, both supervised and unsupervised, identified an ERBB2-specific gene expression signature (GES). To further validate this signature, studies were completed by FISH and IHC analyses on breast cancer tissue microarrays.
1) Identification and Validation of an ERBB2 Gene Expression Signature from Tumor Profiling
Supervised analysis was utilized to identify a gene expression signature correlated with ERBB2 status. It was applied to the mRNA expression profiles from 145 randomly chosen breast cancer samples (learning set) by comparing two subgroups defined by their ERBB2 status as determined by standard IHC: samples scoring 0 and 1+ (hereafter designated ERBB2−, 116 samples) were compared to samples scoring 3+ (ERBB2+, 29 samples). Cases with equivocal 2+ (n=10) or unavailable (n=8) staining were excluded from analysis. To identify a molecular signature independent from the predefined subgroups of tumors identified by IHC, several different subsets of samples were iteratively defined and supervised analysis was performed on each of these subsets independently. Thirty such iterations were done. The lists of genes identified as significant discriminators (these lists ranged from 80 to 274 clones) were then compared, revealing 37 clones present in at least 25 lists: these clones defined an ERBB2-specific gene expression signature (GES). All of the genes identified in this signature were tag-resequenced to confirm their identity.
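The list-comparison step above (retaining clones present in at least 25 of the 30 discriminator lists) can be sketched in Python as follows. This covers only the final recurrence filter, not the supervised analysis that produces each list; the function name `recurrent_clones` and the clone identifiers in the usage example are hypothetical.

```python
from collections import Counter


def recurrent_clones(gene_lists, min_lists=25):
    """Return clones that appear in at least `min_lists` of the given lists."""
    counts = Counter()
    for clones in gene_lists:
        # A clone is counted once per list, even if it is listed twice
        counts.update(set(clones))
    return sorted(clone for clone, k in counts.items() if k >= min_lists)


# Usage with made-up clone identifiers: 30 lists, one per iteration
example_lists = [
    ["clone_A"]
    + (["clone_B"] if i < 25 else [])
    + (["clone_C", "clone_C"] if i < 24 else [])
    for i in range(30)
]
signature = recurrent_clones(example_lists, min_lists=25)
```

Here `clone_A` occurs in all 30 lists and `clone_B` in 25, so both pass the filter, while `clone_C` occurs in only 24 lists and is dropped.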
Once identified on this set of 145 samples, the ERBB2 GES was validated in an independent set of 54 breast cancer samples (validation set). As shown in
2) Comparative Analysis of ERBB2 Gene Expression Signature of Human Breast Tissues to Breast Cancer Cell Lines
A series of 16 breast cancer cell lines was profiled on the Ipsogen DiscoveryChip. These included 5 cell lines (BT474, HCC1569, MDA-MB-453, SK-BR-3 and UACC-812) known to have amplification and/or high mRNA expression of the ERBB2 gene (30, 31). The ERBB2 GES successfully separated ERBB2+ and ERBB2− cell lines (
Collectively, these analyses demonstrated that the ERBB2 gene expression signature classified breast tumors and cell lines consistently with ERBB2 status as evaluated by the standard procedure (HercepTest™, Dako Corporation).
3) Analysis of Breast Tumor Samples Using Tissue Microarrays
Significant discriminator genes were further validated by immunohistochemical analysis of their corresponding proteins (
We found 40% of ERBB2-positive tumors in ER-negative tumors but only 10% in ER-positive tumors.
A total of 68 (72%) of the 94 samples included in TMA2 were available for FISH analysis of the ERBB2 locus. Examples of results are shown in
The previous supervised analyses did not include the breast cancer samples scored 2+ for ERBB2 IHC. We reclassified these cases together with all 145 samples previously analyzed (which included the 68 cases with available FISH ERBB2 data) by using a hierarchical clustering program based on the ERBB2 GES. Results are displayed in
Despite significant transcriptional heterogeneity between tumors for these genes, the combined expression patterns defined at least three clusters of tumors, designated A, B and C. Group A (73 cases, in green) displayed over-expression of the “ER cluster” and under-expression of the “ERBB2 cluster” overall compared to groups B and C. Conversely, the “ERBB2 cluster” and the “ER cluster” were upregulated and downregulated, respectively, in group C samples (36 cases, in red) overall, as compared to the other groups. Finally, group B (50 cases, in black) displayed an intermediate profile with heterogeneous expression of the “ERBB2 cluster” and under-expression of the “ER cluster”.
Correlations of tumor groups as defined by hierarchical clustering with ERBB2 status were analyzed. As expected, group C strongly differed from the other groups with respect to ERBB2 protein expression since 93% of all ERBB2 3+ samples were located in this group. In group C, 77% of samples scored 3+, 9% 2+ and 14% 0-1+; in contrast, in groups A and B, respectively, these rates were 0% and 5% (3+), 3% and 10% (2+), and 97% and 85% (0-1+) (p<0.0001, Chi2 test, A vs B vs C groups). As expected, there was also a strong correlation between tumor groups and FISH status, with most of the FISH-positive cases clustered in group C (p<0.0001, Chi2 test, A vs B vs C groups). ERBB2 FISH information and IHC status were both available in 64 cases out of 159. Interestingly, the three 2+ tumors located in group C displayed ERBB2 amplification (FISH-positive), while the seven 2+ tumors included in group A (2 cases) and group B (5 cases) had no amplification (FISH-negative). These results show that our ERBB2 GES could separate FISH-positive and FISH-negative ERBB2 2+ tumors, providing more specific information than FISH with respect to ERBB2 IHC status (HercepTest™). Indeed, the correlation between GES groups (C samples vs A+B samples) and FISH result (negative vs positive) provided a sensitivity of 90% and a specificity of 88% (concordance in 89% of cases). In comparison, the correlation between IHC-based grouping (0-1+ vs 2-3+) and FISH status showed an equal sensitivity of 90% but a weaker specificity of 76% (concordance in 82% of cases) (Table 4 hereunder).
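For reference, sensitivity, specificity and concordance of a classifier against a FISH reference can be computed from a 2x2 table as in the Python sketch below. The function name `diagnostic_metrics` is hypothetical, and the counts in the usage example are arbitrary placeholders chosen for illustration; they are not the study data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and concordance against a reference test.

    tp: classifier positive, FISH positive
    fp: classifier positive, FISH negative
    fn: classifier negative, FISH positive
    tn: classifier negative, FISH negative
    """
    sensitivity = tp / (tp + fn)   # fraction of FISH-positive cases detected
    specificity = tn / (tn + fp)   # fraction of FISH-negative cases excluded
    concordance = (tp + tn) / (tp + fp + fn + tn)  # overall agreement
    return sensitivity, specificity, concordance


# Placeholder counts (NOT the study data), for illustration only
sens, spec, conc = diagnostic_metrics(tp=9, fp=2, fn=1, tn=8)
```

With these placeholder counts the sketch gives a sensitivity of 0.90, a specificity of 0.80 and a concordance of 0.85.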
Sensitivity was better for the two-comparisons; as shown in
5) Correlation with Histoclinical Parameters
We searched for correlations between tumor groups and relevant molecular and histoclinical parameters of samples. Our GES-based grouping correlated with SBR grade and hormone receptor status, further, albeit indirectly, validating our classification. Group C did not contain grade 1 samples; 44% of samples were grade 2 and 56% were grade 3. In groups A+B, 15% of samples were grade 1, 48% were grade 2 and 37% were grade 3 (p=0.02, Chi2 test). In group C, samples were more likely to be ER-negative (59%), compared with 27% in groups A+B (p=0.001, Chi2 test). A similar, although non-significant, correlation was found for PR status (p=0.07, Chi2 test). No correlation was found with pathological size of tumors, axillary lymph node status or P53 IHC status.
All documents referred to above are herein incorporated by reference in their entirety. A variety of modifications to the embodiments described will be apparent to those skilled in the art from the disclosure provided herein. Thus, the disclosure may be embodied in other specific forms without departing from the spirit or essential attributes thereof and, accordingly, reference should be made to the appended claims, rather than to the foregoing specification, as indicating the scope of the disclosure.
This is a divisional of U.S. Ser. No. 10/928,465, filed Sep. 27, 2004, which claims the benefit of U.S. Ser. No. 60/498,497, filed on Aug. 28, 2003, the entire disclosure of which is herein incorporated by reference.
Number | Date | Country
---|---|---
60498497 | Aug 2003 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 10928465 | Aug 2004 | US
Child | 12212282 | | US