The present invention relates to methods for identifying ERBB2 alteration in tumors, in particular cancer, based on the analysis of the over or under expression of polynucleotide sequences in a tissue sample.
The amplification of the ERBB2 region of chromosome 17 results in the constitutive overexpression of the ERBB2 (also named “HER2”) oncogene protein and fuels uncontrolled tumor growth in approximately 15 to 30% of breast tumors. ERBB2 is considered today as a predictive marker for clinical benefit from trastuzumab, or Herceptin®, a monoclonal antibody directed against the ERBB2 protein, in both primary and metastatic tumors. However, current testing methods are inaccurate in as many as 20% of cases, and this may lead to some patients missing the benefit of Herceptin® therapy or, on the contrary, to unnecessary therapy being prescribed for others.
Currently, tumors are tested for ERBB2 with two main complementary technologies: immunohistochemistry (IHC), which identifies ERBB2 protein expressed in the tumor cells, and in situ hybridization (ISH), which quantifies ERBB2 DNA copy number in the cell chromosomes. Some RT-PCR assays, which quantify the amount of ERBB2 mRNA, have also been developed more recently.
There is a need for a cancer signature showing higher performance, in terms of robustness, specificity and sensitivity, for identifying ERBB2 alteration in tumors, in particular cancer.
The Applicant has now defined a new signature predicting ERBB2 status.
The authors of the present invention have now discovered, entirely unexpectedly, a signature predicting ERBB2 status, which correlates with the expression of the HER2 protein at cell membrane level. The test, developed on a set of 152 tumors, was validated in 3 independent datasets totaling 152 tumors. The test correlates with the IHC method in 96% of the cases and it resolves 95% of equivocal IHC cases.
Surprisingly, the Inventors found some genes that are strongly correlated with ERBB2 IHC.
These genes make it possible to obtain a signature predicting ERBB2 status in one step, with a global performance (sensitivity, specificity, robustness, etc.) improved compared to the prior two-step methods, such as those requiring a FISH score to be determined after the IHC method has been performed.
Furthermore, these genes are independent of the oestrogen receptor (ER) status of the patient. There is therefore no need to perform the ER test before performing the test with the genes of the invention.
Finally, the Inventors found that these genes are located in the ERBB2 amplicon and capture information about DNA amplification.
The method of the invention also reconciles information at the protein, RNA and DNA level. In other words, the information obtained by using the method of the invention reflects the situation at the genomic, transcriptomic, as well as proteomic level.
So, the invention relates to a method for identifying ERBB2 alteration in tumors, in particular cancer, based on the analysis of the over or under expression of genes in a tissue sample, said analysis comprising:
In a particular aspect of the invention, the method of detection of the expression of the group of genes may comprise, or may consist of at least three, or at least four, or at least five, or at least six, or at least seven, or of eight genes selected among the following genes: ERBB2, C17orf37, GRB7, PERLD1, STARD3, CRKRS, FGFR2, ZRANB1.
In another particular aspect of the invention, the method of detection of the expression of the group of genes may comprise, or may consist of, at least three, or at least four, or at least five, or at least six, or at least seven, or of eight genes selected among the following genes: ERBB2, C17orf37, GRB7, PERLD1, STARD3, CRKRS, FGFR2, ZRANB1, and of the gene corresponding to SEQ ID NO. 31.
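As a rough illustration of what the wording "at least three ... or of eight genes" covers, the following Python sketch enumerates the possible groups drawn from the eight genes listed above. The function name `candidate_gene_groups` is hypothetical and is introduced here only for illustration; the gene symbols are those given in the text.

```python
from itertools import combinations

# The eight genes named in the text.
GENES = ["ERBB2", "C17orf37", "GRB7", "PERLD1",
         "STARD3", "CRKRS", "FGFR2", "ZRANB1"]

def candidate_gene_groups(genes, min_size=3):
    """Yield every subset of `genes` that contains at least `min_size` genes."""
    for size in range(min_size, len(genes) + 1):
        for group in combinations(genes, size):
            yield group

groups = list(candidate_gene_groups(GENES))
print(len(groups))  # 219
```

The particular groups recited in the following paragraphs (for example ERBB2, C17orf37 and GRB7) are individual members of this enumeration.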
In a particular embodiment of the invention, the group of genes may comprise, or may consist of: ERBB2, C17orf37 and GRB7.
In another particular embodiment of the invention, the group of genes may comprise, or may consist of: ERBB2, C17orf37, GRB7, and the gene corresponding to SEQ ID NO. 31.
In another particular aspect of the invention, the group of genes may comprise, or may consist of: ERBB2, C17orf37, GRB7 and PERLD1.
In another particular aspect of the invention, the group of genes may comprise, or may consist of: ERBB2, C17orf37, GRB7 and PERLD1, and the gene corresponding to SEQ ID NO. 31.
In another particular aspect of the invention, the group of genes may comprise, or may consist of: ERBB2, C17orf37, GRB7, PERLD1 and STARD3.
In another particular aspect of the invention, the group of genes may comprise, or may consist of: ERBB2, C17orf37, GRB7, PERLD1 and STARD3 and of the gene corresponding to SEQ ID NO. 31.
In another aspect of the invention, the group of genes may comprise, or may consist of: ERBB2, C17orf37, GRB7, PERLD1, STARD3 and CRKRS.
In another aspect of the invention, the group of genes may comprise, or may consist of: ERBB2, C17orf37, GRB7, PERLD1, STARD3 and CRKRS and of the gene corresponding to SEQ ID NO. 31.
The sequences used to detect the above-mentioned genes may be any kind of nucleic acid, as the man skilled in the art knows how to detect one gene among others in a tissue sample.
In a particular embodiment of the invention, this detection may be realized by hybridization of polynucleotide sequences from a tissue sample with cDNA total sequence or with cDNA subsequences of said genes, or with primers, or with the following polynucleotide sequences: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21, SEQ ID NO. 22, SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 25, SEQ ID NO. 26, SEQ ID NO. 27, SEQ ID NO. 28, SEQ ID NO. 29, SEQ ID NO. 30, SEQ ID NO. 31, SEQ ID NO. 32.
In another particular embodiment of the invention, this detection may be realized by hybridization of polynucleotide sequences from a tissue sample with a group of polynucleotide sequences comprising, or consisting of, at least one, or at least two, or at least three, or at least four, or at least five, or at least six, or at least seven, of the following sequences: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21, SEQ ID NO. 22, SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 25, SEQ ID NO. 26, SEQ ID NO. 27, SEQ ID NO. 30, SEQ ID NO. 31, SEQ ID NO. 32.
The polynucleotide sequences SEQ ID NO. 17 to SEQ ID NO. 32 are polynucleotide sequences (also called “probesets”) capable of reacting with nucleic acid samples of the genes shown in Table 1:
The sequences mentioned above are the following ones:
These probesets are AFFYMETRIX (HG-U133_PLUS_2) probes (http://www.affymetrix.com/products_services/arrays/specific/hgu133plus.affx).
SEQ ID NO. 1 and 2 represent two isoforms of the ERBB2 gene. These two isoforms are matched by the probeset SEQ ID NO. 17.
SEQ ID NO. 5 and 6 represent two isoforms of the GRB7 gene. These two isoforms are matched by the probeset SEQ ID NO. 20.
SEQ ID NO. 8 and 9 represent two isoforms of the CRKRS gene. These two isoforms are matched by the probeset SEQ ID NO. 25.
SEQ ID NO. 10 and 11 represent two isoforms of the FGFR2 gene. These two isoforms are matched by the probeset SEQ ID NO. 27.
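The isoform-to-probeset correspondences stated above can be held in a simple lookup table. The sketch below is only an illustration: the names `ISOFORM_TO_PROBESET` and `probeset_for_isoform` are hypothetical, and only the eight correspondences given in the text are included.

```python
# Isoform SEQ ID NO. -> (gene, matching probeset SEQ ID NO.), as stated in the text.
ISOFORM_TO_PROBESET = {
    1: ("ERBB2", 17), 2: ("ERBB2", 17),
    5: ("GRB7", 20), 6: ("GRB7", 20),
    8: ("CRKRS", 25), 9: ("CRKRS", 25),
    10: ("FGFR2", 27), 11: ("FGFR2", 27),
}

def probeset_for_isoform(seq_id):
    """Return (gene, probeset SEQ ID NO.) for an isoform, or None if it is not listed."""
    return ISOFORM_TO_PROBESET.get(seq_id)

print(probeset_for_isoform(6))  # ('GRB7', 20)
```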
According to a particular embodiment of the invention, the method of the invention may be realized by hybridization of the polynucleotide sequences group comprising, or consisting of: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19 and SEQ ID NO. 20.
According to another particular embodiment of the invention, the method of the invention may be realized by hybridization of the polynucleotide sequences group comprising, or consisting of: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19 and SEQ ID NO. 20, and of SEQ ID NO. 31.
According to another particular embodiment of the invention, the method of the invention may be realized by hybridization of the polynucleotide sequences group comprising, or consisting of: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21, and SEQ ID NO. 22.
According to another particular embodiment of the invention, the method of the invention may be realized by hybridization of the polynucleotide sequences group comprising, or consisting of: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21, SEQ ID NO. 22, and SEQ ID NO. 31.
According to another particular embodiment of the invention, the method of the invention may be realized by hybridization of the polynucleotide sequences group comprising, or consisting of: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21, SEQ ID NO. 22, SEQ ID NO. 23 and SEQ ID NO. 24.
According to another particular embodiment of the invention, the method of the invention may be realized by hybridization of the polynucleotide sequences group comprising, or consisting of: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21, SEQ ID NO. 22, SEQ ID NO. 23, SEQ ID NO. 24 and SEQ ID NO. 31.
According to another particular embodiment of the invention, the method of the invention may be realized by hybridization of the polynucleotide sequences group comprising, or consisting of: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21, SEQ ID NO. 22, SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 25, SEQ ID NO. 26, SEQ ID NO. 28, SEQ ID NO. 30, SEQ ID NO. 31, SEQ ID NO. 32.
According to another particular embodiment of the invention, the method of the invention may be realized by hybridization of the polynucleotide sequences group comprising, or consisting of: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21, SEQ ID NO. 22, SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 25, SEQ ID NO. 26, SEQ ID NO. 27, SEQ ID NO. 28, SEQ ID NO. 29, SEQ ID NO. 30, SEQ ID NO. 31, SEQ ID NO. 32.
Advantageously, the method of the invention comprises the following steps:
Advantageously, the nucleic acids sample may be labelled before reaction step (a).
Advantageously, the label of the polynucleotide sample may be selected from the group consisting of radioactive, colorimetric, enzymatic, e.g. biotinylated label, molecular amplification, bioluminescent or fluorescent labels.
Advantageously, the tissue may be fixed, paraffin-embedded, or fresh, or frozen.
For all the particular aspects of the invention, the expression of polynucleotide sequences in a tissue sample may be determined by measuring the expression level of RNA transcript(s) by real-time polymerase chain reaction (RT-PCR).
For all the particular aspects of the invention, the method may further comprise obtaining a control polynucleotide sample, reacting said control sample with said polynucleotide sequences, detecting a control sample reaction product and comparing the amount of said polynucleotide sample reaction product to the amount of said control sample reaction product.
Advantageously, the tissue sample may be a human sample.
Advantageously, the method of the invention allows the detection of cancers selected from the group consisting of breast cancer, lung cancer, colorectal cancer, pancreatic cancer, prostate cancer, ovarian cancer, head and neck cancer, esophageal cancer, glioblastoma multiforme, hepatocellular cancer, gastric cancer, cervical cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, and brain cancer.
Advantageously, the tissue sample may be a breast cancer sample.
Advantageously, the method of the invention allows the determination of the expression of the ERBB2 protein at cell membrane level.
Advantageously, the method of the invention allows the determination of the ERBB2 immunohistochemical (IHC) status of a cancer patient, e.g., a breast cancer patient.
Another object of the invention is the use of the method of the invention for detecting, diagnosing, staging, monitoring cancer or following up the stage or aggressiveness of a cancer.
Any of the polynucleotide sequences groups as mentioned above may be used for the use according to the invention.
Advantageously, this use allows the monitoring of the treatment of a patient with a cancer selected from the group consisting of breast cancer, lung cancer, colorectal cancer, pancreatic cancer, prostate cancer, ovarian cancer, head and neck cancer, esophageal cancer, glioblastoma multiforme, hepatocellular cancer, gastric cancer, cervical cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, and brain cancer, e.g., breast cancer, and comprises the implementation of the method in any of its aspects on nucleic acids from a cancer tissue, e.g. breast cancer tissue sample of a patient.
Advantageously, the use of the method of the invention allows the assessment of the ERBB2 gene expression status of a patient whose status could not previously be clearly assessed with an immunohistochemical (IHC) assay for determination of ERBB2 overexpression in breast cancer, e.g. of patients scoring 2+ with the HercepTest™ (Dako, Denmark, AS).
In other words, the use of this method allows the assessment of the ERBB2 gene expression status of a patient presenting equivocal results with IHC assay.
Indeed, a 2+ score obtained with the HercepTest™ does not allow the ERBB2 status to be determined.
Advantageously, the monitoring relates to the clinical efficacy of an anti-ERBB2 treatment, e.g. by Herceptin™ (trastuzumab) treatment.
Advantageously, the use of the method allows the determination of a treatment for the patient or animal with a cancer, e.g., breast cancer, based on the analysis of the differential gene expression profile obtained with said method.
Another object of the invention is a polynucleotide library useful for the molecular characterization of a cancer, e.g. breast cancer, that may comprise or may consist of polynucleotide sequences for detecting the genes as defined above.
Advantageously, the polynucleotide library may comprise, or may consist of cDNA total sequence or of cDNA subsequences of said genes.
Advantageously, the polynucleotide library may comprise, or may consist of primers allowing the detection of the genes mentioned above.
Advantageously, the polynucleotide library may comprise, or may consist of any of the groups of probesets as described above.
Advantageously, the polynucleotide library may comprise, or may consist, of: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21, SEQ ID NO. 22, SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 25, SEQ ID NO. 26, SEQ ID NO. 28, SEQ ID NO. 30, SEQ ID NO. 31, SEQ ID NO. 32.
In any of these embodiments, the polynucleotide library may be immobilized on a solid support.
In this case, the support may be selected from the group comprising nylon membrane, nitrocellulose membrane, glass slide, glass beads, membranes on glass support or silicon chip.
Another object of the invention is a kit comprising polynucleotide sequences, e.g., primers and probes, allowing the detection of the expression of the gene(s) and/or sequence(s) of the invention as defined above.
In a particular embodiment of the invention, the kit comprises a polynucleotide library as described above.
Any of the polynucleotide sequences groups as mentioned above may be used in the kit according to the invention.
The kit may comprise one or more of (1) nucleic acid extraction buffer/reagents and protocol; (2) reverse transcription buffer/reagents and protocol; and (3) qPCR buffer/reagents and protocol suitable for performing the method of the invention.
The kit may also comprise data retrieval and/or analysis software.
The kit may be used by a laboratory or physician and be sent to a laboratory for sample testing, e.g., ISO-17025 MapQuant DX™ Lab Services at DNAVision SA (Gosselies, Belgium) on Affymetrix GeneChip® Systems 3000Dx2 (GCS3000Dx2), ensuring highly reproducible sample processing.
Another aspect of the invention relates to a report comprising a summary of the normalized expression levels of an RNA transcript or its expression products in a cancer cell obtained from a subject, wherein said RNA transcript is the RNA of a gene set selected from one of the groups described above.
Another aspect of the invention relates to a report comprising a prediction of the response of a subject to treatment with an anti ERBB2 treatment, e.g. an ERBB2 antibody, based on the determination of the normalized expression levels of an RNA transcript or its expression products in a cancer cell obtained from the subject, wherein said RNA transcript is the RNA transcript of a gene group as described above.
Another object of the invention is a method for determining amplification of ERBB2 gene locus on chromosome 17q12-17q21.1 comprising determining the expression level of one or more RNA transcripts or their expression products in a biological sample containing cancer cells obtained from said subject, wherein the RNA transcript is of at least one, at least two, or at least three, or at least four, or at least five, or at least six, or at least seven, or of eight or larger group of genes selected from the group of genes located within less than one megabase on either side of ERBB2 gene on chromosome 17q12-17q21.1.
In said method, the gene(s) is (are) selected from ERBB2, C17orf37, GRB7, PERLD1, STARD3 and CRKRS. Advantageously, the method further includes the hybridization of the tissue sample with the polynucleotide sequence SEQ ID NO. 31.
Another object of the invention is a method for predicting the response of a subject diagnosed with ERBB2 positive cancer to treatment with an ERBB2 inhibitor, comprising determining the expression level of one or more RNA transcripts or their expression products in a biological sample containing cancer cells obtained from said subject, wherein the RNA transcript is of one or more genes selected from the group consisting of ERBB2 and genes located near ERBB2 on chromosome 17q12-17q21.1, particularly the groups of genes as described above, notably the genes of Table 1.
This method may further comprise the detection of the expression of SEQ ID NO. 31.
Unless otherwise noted, technical terms are used according to conventional usage.
In order to facilitate review of the various embodiments of the invention, the following explanation of specific terms is provided:
“Overexpression of polynucleotide sequences” means that the expression level of certain polynucleotide sequences is higher than the expression level of a control polynucleotide sequence.
“Underexpression of polynucleotide sequences” means that the expression level of certain polynucleotide sequences is lower than the expression level of a control polynucleotide sequence.
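The two definitions above amount to a comparison against a control level. The following sketch is a literal, simplified reading of them; the function name `expression_call` and the "unchanged" label for equal levels are assumptions made for illustration, and no tolerance or statistical test is applied.

```python
def expression_call(sample_level, control_level):
    """Compare a sample expression level with the control expression level."""
    if sample_level > control_level:
        return "overexpressed"
    if sample_level < control_level:
        return "underexpressed"
    return "unchanged"  # equal levels: label chosen for this sketch only

print(expression_call(9.2, 6.1))  # overexpressed
print(expression_call(4.0, 6.1))  # underexpressed
```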
There are many ways to collect quantitative or relative data on nucleic acid sequences, and the analytical methodology does not affect the utility of nucleic acid sequence expression in assessing the clinical outcome of a female mammal suffering from breast cancer. Methods for determining quantities of nucleic acid expression in a biological sample are well known to one of skill in the art. Examples of such methods include northern blot, cDNA arrays, oligo arrays and quantitative reverse transcription PCR, e.g. real-time polymerase chain reaction (RT-PCR).
In the present invention, the term “polynucleotide” refers to a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. A polynucleotide in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.
Detection preferably involves calculating/quantifying a relative expression (transcription) level for each nucleic acids sequence.
By “ERBB2 amplicon”, in the sense of the present invention, is meant a wide region of amplification on chromosome 17q12-17q21.1, which contains many genes frequently amplified in breast tumours. This amplicon contains especially the ERBB2 gene.
By “genes”, in the sense of the present invention, is meant a polynucleotide sequence, e.g., isolated, such as deoxyribonucleic acid (DNA), and, where appropriate, ribonucleic acid (RNA). This sequence may be the complete sequence of the gene, or a subsequence of the gene that may be at least 90%, at least 95% identical to the complete gene sequence, which would be also suitable to perform the method of the analysis according to the invention. A person skilled in the art may choose the position and length of the gene by applying routine experiments. The term should also be understood to include, as equivalents, analogs of RNA or DNA made from nucleotide analogs, and, as applicable to the embodiment being described, single (sense or antisense) and double-stranded polynucleotides. ESTs, chromosomes, cDNAs, mRNAs, and rRNAs are representative examples of molecules that may be referred to as nucleic acids. DNA may be obtained from said nucleic acids sample and RNA may be obtained by transcription of said DNA. In addition, mRNA may be isolated from said nucleic acids sample and cDNA may be obtained by reverse transcription of said mRNA.
By “polynucleotide sequences group consisting of”, in the sense of the present invention, is meant a group of polynucleotide sequences comprising exactly the polynucleotide sequences mentioned, with neither more nor fewer polynucleotide sequences than those of the group.
By “cDNA total sequence of the gene”, in the sense of the invention, is meant the cDNA sequence resulting from the transcription of the DNA sequence coding for the gene.
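The difference between a group "comprising" and a group "consisting of" given sequences can be pictured with set operations. This is a sketch only: the function names `comprises` and `consists_of` are hypothetical, and sequences are represented simply by their SEQ ID numbers.

```python
def comprises(group, required):
    """True if `group` contains all required sequences (extra sequences allowed)."""
    return set(required) <= set(group)

def consists_of(group, required):
    """True if `group` contains exactly the required sequences, no more and no fewer."""
    return set(group) == set(required)

required = [17, 18, 19, 20]      # SEQ ID NO. 17 to 20
group = [17, 18, 19, 20, 31]     # same group plus SEQ ID NO. 31

print(comprises(group, required))    # True
print(consists_of(group, required))  # False
```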
By “cDNA subsequences of the gene”, in the sense of the invention, is meant a sequence of nucleic acids of cDNA total sequence of the gene that allows a specific hybridization under stringent conditions, as an example more than 10 nucleotides, preferably more than 15 nucleotides, and most preferably more than 25 nucleotides, as an example more than 50 nucleotides or more than 100 nucleotides.
The polynucleotide sample isolated from the subject and obtained at step (a) may be RNA, preferably mRNA. Said polynucleotide sample isolated from the patient can also correspond to cDNA obtained by reverse transcription of the mRNA, or a product of ligation after specific hybridization of specific probes to mRNA or cDNA.
The sequences SEQ ID NO. 17 to SEQ ID NO. 32 are Affymetrix sequences (also referred to hereafter as “probeset sequences”).
By “reacting nucleic acids sample with polynucleotide sequences”, in the sense of the invention, is meant contacting the nucleic acids sample with polynucleotide sequences in conditions allowing the hybridization of cDNA total sequence of the gene or of cDNA subsequences or of primers of the gene or of probeset sequences with polynucleotide sequences of the corresponding gene.
Animals correspond to animals such as humans, mice, rats, guinea pigs, monkeys, cats, dogs, pigs, horses, or cows, preferably to humans, and most preferably to women.
Biological sample means any biological material, such as a cell, a tissue sample, or a biopsy from breast cancer.
A “Control” as used herein corresponds to one or more biological samples from a cell, a tissue sample or a biopsy from breast. Said control may be obtained from the same female mammal as the one to be tested or from another female mammal, preferably of the same species, or from a population of female mammals, preferably of the same species, that may be the same as or different from the test female mammal or subject. Said control may correspond to a biological sample from a cell, a cell line, a tissue sample or a biopsy from breast.
DNA or RNA arrays consist of large numbers of DNA or RNA molecules, respectively, spotted in a systematic order on a solid support or substrate such as a nylon membrane, glass slide, glass beads or a silicon chip. Depending on the size of each DNA or RNA spot on the array, DNA or RNA arrays can be categorized as microarrays (each DNA or RNA spot has a diameter less than 250 microns) and macroarrays (spot diameter is greater than 300 microns). When the solid substrate used is small in size, arrays are also referred to as DNA or RNA chips. Depending on the spotting technique used, the number of spots on a glass microarray can range from hundreds to thousands.
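The size-based categories given above can be expressed as a small helper. The function name `array_category` is hypothetical; the text does not assign a category to spot diameters from 250 to 300 microns, so the sketch returns `None` for that range rather than guessing.

```python
def array_category(spot_diameter_microns):
    """Categorize an array by spot diameter, following the thresholds in the text."""
    if spot_diameter_microns < 250:
        return "microarray"
    if spot_diameter_microns > 300:
        return "macroarray"
    return None  # 250 to 300 microns: not categorized in the text

print(array_category(100))  # microarray
print(array_category(500))  # macroarray
```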
Typically, a method of monitoring gene expression by DNA or RNA array involves the following steps:
In the present invention, the term “immobilized on a support” means bound directly or indirectly thereto including attachment by covalent binding, hydrogen bonding, ionic interaction, hydrophobic interaction or otherwise.
Preferably, the polynucleotide sample obtained at step (a) is labeled before its reaction at step (b) with the probe immobilized on a solid support. Such labeling is well known to one of skill in the art and includes, but is not limited to, radioactive, colorimetric, enzymatic, e.g. biotinylation, molecular amplification, bioluminescent, electrochemical or fluorescent labeling.
Advantageously, the reaction product of step (c) is quantified by further comparison of said reaction product to a control sample.
Detection preferably involves calculating/quantifying a relative expression (transcription) level for each nucleic acids sequence.
Then, the determination of the relative expression level for each of the nucleic acid sequences previously described makes it possible to assess the clinical outcome of the subject, i.e. a female mammal, suffering from a cancer, e.g. a breast cancer, by the method of the invention.
The method of assessing the clinical outcome of a patient suffering from a cancer may further involve a step of taking a biological sample, preferably breast cancer tissue or cells, from a patient. Such methods of sampling are well known to one of skill in the art; surgery is one example.
The provided method may also correspond to an in vitro method, which does not include such a step of sampling.
By “differential expression profile”, in the sense of the invention, is meant the difference between the level of expression of a gene in a control tissue, i.e. a breast tissue free of cancer, and the level of expression of the same gene in the sample analysed.
By “aggressiveness of a cancer”, in the sense of the invention, is meant, e.g., cancer growth rate or potential to metastasise. A so-called “aggressive cancer” will grow or metastasise rapidly or significantly affect overall health status and quality of life.
By “specificity”, in the sense of the invention, is meant the capacity of a method, especially a diagnostic method, to exclude a disease (or a health problem) when it is really absent. The specificity is the proportion of healthy persons for whom the result of the method or test is negative, calculated as follows: true negatives/(true negatives+false positives).
By “sensitivity”, in the sense of the invention, is meant the capacity of a method, especially a diagnostic method, to detect a disease (or a health problem) when it really exists. The sensitivity is the proportion of all the sick persons for whom the result of the method is positive, calculated as follows: true positives/(true positives+false negatives).
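The two ratios defined above can be computed directly from the counts of a confusion matrix. In the sketch below the function names are illustrative and the counts in the usage lines are made-up numbers, not results from the studies described in this document.

```python
def specificity(true_negatives, false_positives):
    """Proportion of disease-free cases with a negative test result."""
    return true_negatives / (true_negatives + false_positives)

def sensitivity(true_positives, false_negatives):
    """Proportion of diseased cases with a positive test result."""
    return true_positives / (true_positives + false_negatives)

# Hypothetical counts, for illustration only.
print(specificity(true_negatives=9, false_positives=1))
print(sensitivity(true_positives=8, false_negatives=2))
```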
By “robustness”, in the sense of the invention, is meant the quality of being able to withstand changes in procedure or circumstances. It designates a method, or a group of genes, capable of coping well with variations (sometimes unpredictable variations) in its operating environment.
The method, and particularly the polynucleotide sequences groups of the invention, are “robust”, as they have been constructed by cross-validation. The method is furthermore independent of the subjective interpretation of an anatomo-pathologist.
For the classification of the patient as ERBB2+ or ERBB2−, the man skilled in the art can use any method allowing the measurement of the expression of the genes of the invention. For example, the man skilled in the art can use the SVM method described in Vapnik (Vapnik, 1998, Statistical Learning Theory. V. N. Vapnik. Wiley Interscience). The content of this document is hereby incorporated by reference.
The present invention will be understood more clearly on reading the description of the experimental studies performed in the context of the research carried out by the applicant, which should not be interpreted as being limiting in nature.
The test has been developed on 152 tumor samples from Institut Paoli Calmettes (IPC) cancer Center: 126 IHC 0, 26 IHC 3+. These tumors have been profiled on an Affymetrix platform, HG-U133 plus 2.0 GeneChip®.
The HER2 signature has been obtained by the RFE-SVM (Recursive Feature Elimination-Support Vector Machine) classification method (Guyon et al. 2002; Machine Learning, 46, 389-422) by using the predefined set as the learning set.
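For orientation only, the general idea of recursive feature elimination driven by a linear SVM can be sketched in pure Python as follows. This is not the R implementation used for the study: the trainer is a simple full-batch subgradient descent on the hinge loss without a bias term, the function names `train_linear_svm` and `rfe_svm`, the hyperparameters and the toy data are all assumptions made for illustration, and no cross-validation is shown.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit weights by full-batch subgradient descent on the L2-regularized hinge loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]            # gradient of the regularization term
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1.0:                     # sample violates the margin
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def rfe_svm(X, y, n_keep):
    """Repeatedly retrain and drop the feature with the smallest squared weight."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        del remaining[worst]
    return remaining

# Toy data (hypothetical): only feature 0 separates the two classes.
X_toy = [[2.0, 0.5, 0.0], [2.5, -0.5, 0.0], [-2.0, 0.5, 0.0], [-2.5, -0.5, 0.0]]
y_toy = [1, 1, -1, -1]
print(rfe_svm(X_toy, y_toy, n_keep=1))  # [0]
```

In a real setting the features would be probeset expression values and the labels the IHC classes, and the elimination would normally be wrapped in cross-validation.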
We used the R Magpie implementation package (Ambroise, McLachlan). To guarantee the robustness of our selection, we used a cross-validation protocol. We first filtered out absent probesets (expression level lower than 5.5 on the whole tumor set) and invariant probesets (standard deviation lower than 0.5): these two probeset categories tend to add noise to the classification.
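The filtering step described above can be sketched as follows. Two readings are assumed here and are not confirmed by the text: "absent" is taken to mean that the expression level is below 5.5 in every tumor, and the standard deviation is the sample standard deviation. The function name `filter_probesets` and the probeset identifiers and values in the example are hypothetical.

```python
from statistics import stdev

def filter_probesets(expression, absent_below=5.5, min_sd=0.5):
    """Keep probesets that are neither absent nor invariant.

    `expression` maps a probeset identifier to its expression levels across tumors.
    """
    kept = {}
    for probeset, levels in expression.items():
        absent = max(levels) < absent_below   # below the threshold in every tumor
        invariant = stdev(levels) < min_sd    # almost no variation across tumors
        if not absent and not invariant:
            kept[probeset] = levels
    return kept

# Hypothetical expression values for three probesets over four tumors.
example = {
    "ps_absent": [3.0, 4.0, 5.0, 4.5],
    "ps_flat": [8.0, 8.1, 7.9, 8.0],
    "ps_kept": [6.0, 9.0, 7.0, 11.0],
}
print(list(filter_probesets(example)))  # ['ps_kept']
```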
The RFE-SVM algorithm provides an optimal signature with the 16 probesets: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21, SEQ ID NO. 22, SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 25, SEQ ID NO. 26, SEQ ID NO. 27, SEQ ID NO. 28, SEQ ID NO. 29, SEQ ID NO. 30, SEQ ID NO. 31, SEQ ID NO. 32, of Table 1.
The 16 probesets are located on the 17q12-17q21.1 locus, except those for ZRANB1 and FGFR2, which are both on locus 10q26.
We have chosen the following 14 probesets among the 16 probesets: SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21, SEQ ID NO. 22, SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 25, SEQ ID NO. 26, SEQ ID NO. 28, SEQ ID NO. 30, SEQ ID NO. 31, SEQ ID NO. 32, of Table 1.
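The relationship between the 16-probeset signature and the 14 probesets retained can be checked with a set difference. The variable names below are illustrative; the SEQ ID numbers are those listed above.

```python
# SEQ ID numbers of the 16 probesets returned by RFE-SVM (17 to 32 inclusive).
ALL_16 = set(range(17, 33))

# SEQ ID numbers of the 14 probesets retained.
CHOSEN_14 = {17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 30, 31, 32}

dropped = sorted(ALL_16 - CHOSEN_14)
print(dropped)  # [27, 29]
```

The two probesets left out are therefore SEQ ID NO. 27, stated earlier to match the FGFR2 isoforms, and SEQ ID NO. 29.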
Performances have first been evaluated on 3 independent sets of tumors according to the following clinical criteria:
Performances were already very satisfactory using our 16 probesets (Table 2):
We have chosen to test the 14 probesets of the amplicon in order to understand the role of ZRANB1 and FGFR2. When doing that, we have globally improved the performance and validated the signature of the group of 14 probesets.
This gene collection is particularly relevant since it covers ERBB2 amplicon from CRKRS to GRB7.
When comparing our 14-probeset signature to prior art signatures, or to a single mRNA, we noticed that it is improved in terms of sensitivity, specificity and robustness.
The method of the invention is an SVM model based on the expression of 14 probesets corresponding to 6 genes of the 17q12 locus and one unknown sequence of the 17q locus.
The test has been developed on 152 tumors and validated on 3 independent sets of 152 tumors. The test correlates with IHC method in 96% of cases and resolves equivocal cases (IHC 2+) in 95% of cases. We have also observed a concordance with FISH in more than 91% of cases but on a limited number of tumors (n=11).
We have validated our 14 probesets signature on 5 independent sets of tumors according to the following clinical criteria:
From these 5 independent sets, 282 tumors have been selected based on their high-quality genomic profile, according to the criteria (average background, average noise, scale factor, percentage of present, GAPDH, beta-actin and degradation slope of RNA) defined by Affymetrix (“GeneChip® Expression Analysis Technical Manual”, 2004) and which are generally applied in the art. As a threshold, we have chosen two standard deviations for each criterion (which results in an alpha of 5% if the distribution is normal).
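The two-standard-deviation threshold and the approximate 5% alpha mentioned above can be illustrated as follows. The sketch assumes a two-sided rule (values further than two sample standard deviations from the mean of a criterion are flagged), which the text does not state explicitly; the function names and the example values are hypothetical.

```python
import math
from statistics import mean, stdev

def two_sided_alpha(k=2.0):
    """Probability that a standard normal variable lies more than k SD from its mean."""
    return 1.0 - math.erf(k / math.sqrt(2.0))

def flag_outliers(values, k=2.0):
    """Flag values lying more than k sample standard deviations from the mean."""
    m, s = mean(values), stdev(values)
    return [abs(v - m) > k * s for v in values]

print(round(two_sided_alpha(), 4))  # 0.0455, i.e. close to 5%

# Hypothetical values of one quality criterion across ten arrays.
criterion = [1.0] * 9 + [10.0]
print(flag_outliers(criterion))  # only the last array is flagged
```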
For all these tumors, we have the detailed information IHC: 189 IHC 0, 22 IHC 1+, 20 IHC 2+, 51 IHC 3+.
Furthermore for IHC 2+, we have the FISH score expressed as positive or negative.
When comparing our 14-probeset signature to prior art signatures, or to a single mRNA, on the 5 independent sets representing the 282 selected tumors, we noticed a good overall correlation as well as good sensitivity and specificity.
The test, previously developed on 152 tumors, has been validated on 5 independent sets representing the 282 selected tumors. The test correlates with IHC method in 94% of cases with a global sensitivity and specificity of 78% and 98%, respectively. The test helps classify 271 tumors out of 282 (96%). The test also helps resolve equivocal cases (IHC 2+) in 95% of cases (19/20). We also observe a concordance with FISH in 95% of cases (n=19).
Thus, with a one-step test using our 14-probeset signature, we have succeeded in globally improving the performance (sensitivity, specificity) compared to prior two-step tests, such as those requiring a FISH score to be determined after the IHC method has been performed.
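The counts and percentages quoted for the 282-tumor validation can be re-derived with a few lines of arithmetic. The helper name `percent` is illustrative; all figures are taken from the text.

```python
# IHC categories of the 282 selected tumors, as listed in the text.
ihc_counts = {"IHC 0": 189, "IHC 1+": 22, "IHC 2+": 20, "IHC 3+": 51}
total = sum(ihc_counts.values())

def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)

print(total)                # 282
print(percent(271, total))  # 96 (tumors classified by the test)
print(percent(19, 20))      # 95 (equivocal IHC 2+ cases resolved)
```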
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB09/55625 | 12/9/2009 | WO | 00 | 6/10/2011 |
Number | Date | Country | |
---|---|---|---|
61121218 | Dec 2008 | US | |
61140110 | Dec 2008 | US |