The invention generally relates to a molecular classification of disease and particularly to methods and compositions for determining BRCA deficiency.
The instant application was filed with one (1) table (Table 1) under 37 C.F.R. §§1.52(e)(1)(iii) & 1.58(b), submitted electronically as the following text file: “3317-01-1P-2010-10-01-TABLE1-BGJ.txt”; creation date: Oct. 1, 2010; Size: 86,503 bytes. This file and all its contents are incorporated by reference herein in their entirety.
The breast and ovarian cancer susceptibility genes, BRCA1 and BRCA2, were discovered in patients having a family history of breast or ovarian cancer. Miki et al., S
It has been discovered that measuring expression of the BRCA1 and/or BRCA2 (referred to collectively as “BRCA”) genes together with cell-cycle progression (“CCP”) gene expression can effectively identify tumors with BRCA deficiency. Specifically, we determined that tumors in which BRCA and CCP expression are anti-correlated represent a subgroup of BRCA deficient tumors. This subgroup is generally characterized by BRCA hypermethylation. Thus the invention generally provides compositions and methods for determining BRCA status.
In one aspect the invention provides a method for determining gene expression comprising measuring the expression of BRCA1 and/or BRCA2 (BRCA expression) in a sample and measuring the expression of a panel of CCP genes in the sample. Some embodiments further comprise determining whether BRCA expression is correlated to CCP expression. Some embodiments further comprise analyzing methylation in BRCA1 and/or BRCA2 in the sample.
As mentioned above, anti-correlation between BRCA and CCP expression is correlated with BRCA deficiency. Thus another aspect of the invention provides a method for determining whether a sample is BRCA deficient comprising measuring the expression of BRCA1 and/or BRCA2 (BRCA expression) in said sample and measuring the expression of a panel of CCP genes in the sample. Some embodiments further comprise determining whether BRCA expression is correlated to CCP expression. In some embodiments, anti-correlation between BRCA and CCP expression indicates the sample is BRCA deficient. In some embodiments anti-correlation between BRCA and CCP expression indicates the sample has BRCA hypermethylation. Some embodiments further comprise analyzing methylation in BRCA1 and/or BRCA2 in the sample.
In some embodiments the panel of CCP genes comprises at least two (or five, or six, or ten, or 15) CCP genes from any of Tables 1 to 5 or Panels A to G. In some embodiments the panel of CCP genes comprises the genes in any of Tables 1 to 5 or Panels A to G.
In some embodiments, determining the expression of a panel of genes comprising CCP genes involves determining the expression of a plurality of test genes comprising at least 4, 6, 8, 10, 15 or more CCP genes and deriving a test value from the determined expression, wherein the CCP genes are weighted to contribute at least 50%, at least 75% or at least 85% of the test value. Thus, in some embodiments, the invention provides a method for determining whether a sample is BRCA deficient comprising (1) determining in a sample from a patient (a) the expression of BRCA1 and/or BRCA2, and (b) the expression of a panel of genes including at least 4 or at least 8 cell-cycle genes; (2) providing a test value by (a) weighting the determined expression of each of a plurality of test genes selected from the panel of genes with a predefined coefficient, and (b) combining the weighted expression to provide the test value, wherein the cell-cycle genes are weighted to contribute at least 50%, at least 75% or at least 85% of the test value; and (3) comparing the test value to the expression of BRCA1 and/or BRCA2 to determine whether these are correlated or anti-correlated. In some embodiments the method further comprises (4) correlating an anti-correlation between the test value and BRCA1 and/or BRCA2 expression to BRCA deficiency.
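As a purely illustrative sketch of the weighting and combining steps described above, the following Python fragment computes a weighted test value and the share of the total weight carried by CCP genes. The function names, the expression values, the coefficients, the placeholder gene OTHER1, and the reading of “contribute at least 50%” as the CCP share of the summed absolute coefficients are all assumptions made for illustration; the specification does not prescribe them.

```python
def weighted_panel_value(expression, coefficients):
    """Combine per-gene expression values using predefined coefficients."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def ccp_weight_share(coefficients, ccp_genes):
    """Fraction of total absolute weight carried by CCP genes.

    This definition of "contribution" is an assumption of this sketch.
    """
    total = sum(abs(w) for w in coefficients.values())
    if total == 0:
        raise ValueError("at least one coefficient must be non-zero")
    ccp = sum(abs(w) for g, w in coefficients.items() if g in ccp_genes)
    return ccp / total


# Hypothetical normalized expression values and coefficients.
expression = {"TPX2": 2.0, "CCNB2": 1.0, "KIF4A": 3.0, "OTHER1": 4.0}
coefficients = {"TPX2": 0.3, "CCNB2": 0.3, "KIF4A": 0.2, "OTHER1": 0.2}
ccp_genes = {"TPX2", "CCNB2", "KIF4A"}

panel_value = weighted_panel_value(expression, coefficients)
share = ccp_weight_share(coefficients, ccp_genes)
meets_75_percent = share >= 0.75
```

Under these assumed numbers the CCP genes carry 80% of the weight, so the hypothetical panel would satisfy both the 50% and the 75% conditions but not the 85% condition.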
BRCA deficiency is associated with various characteristics in tumors. Thus in one aspect the invention provides a method of classifying a cancer comprising measuring the expression of BRCA1 and/or BRCA2 (BRCA expression) in said sample and measuring the expression of two or more CCP genes in the sample. Some embodiments further comprise determining whether BRCA expression is correlated to CCP expression. In some embodiments, anti-correlation between BRCA and CCP expression indicates any one of the following: greater likelihood of survival (e.g., progression-free survival, overall survival, etc.), greater likelihood of response to DNA damaging agents (e.g., platinum chemotherapy drugs, etc.), greater likelihood of response to drugs targeting the poly (ADP-ribose) polymerase (PARP) pathway, etc. Some embodiments further comprise determining whether BRCA1 and/or BRCA2 is hypermethylated.
In some embodiments gene expression is determined using any of the following techniques: quantitative PCR™ (e.g., TaqMan™), microarray hybridization analysis, quantitative sequencing, etc. In some embodiments methylation is analyzed using any of the following techniques: Southern blotting, methylation-specific polymerase chain reaction (MSPCR), restriction landmark genomic scanning for methylation (RLGS-M), CpG island microarray, single nucleotide primer extension (SNuPE), combined bisulfite restriction analysis (COBRA), etc.
In another aspect the invention provides systems related to the above methods of the invention. In one embodiment the invention provides a system for determining gene expression in a tumor sample, comprising: (1) a sample analyzer for determining the expression levels of BRCA1 and/or BRCA2 and a panel of genes comprising at least two CCP genes in a sample, wherein the sample analyzer contains the sample, mRNA from the sample and expressed from the panel of genes, or cDNA synthesized from said mRNA; (2) a first computer program for (a) receiving gene expression data on BRCA1 and/or BRCA2, (b) receiving gene expression data on at least two test genes selected from the panel of genes, (c) weighting the determined expression of each of the test genes with a predefined coefficient, and (d) combining the weighted expression to provide a CCP test value representing the expression level of the panel of genes.
In some embodiments the above system further comprises a computer program for comparing the expression of BRCA1 and/or BRCA2 to the CCP test value, wherein high expression of BRCA1 and/or BRCA2 coupled with a high CCP test value indicates BRCA and CCP expression are correlated, wherein low expression of BRCA1 and/or BRCA2 coupled with a low CCP test value indicates BRCA and CCP expression are correlated, wherein high expression of BRCA1 and/or BRCA2 coupled with a low CCP test value indicates BRCA and CCP expression are anti-correlated, and wherein low expression of BRCA1 and/or BRCA2 coupled with a high CCP test value indicates BRCA and CCP expression are anti-correlated.
In some embodiments the above system further comprises a computer program for receiving data on the correlation between BRCA expression and CCP expression in a patient sample and concluding that the sample is BRCA deficient if BRCA expression and CCP expression are anti-correlated in the sample. In some embodiments the system comprises a sample analyzer for determining the methylation status of BRCA1 and/or BRCA2.
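The comparison logic described for the above system can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, and the inputs are assumed to be high/low calls that have already been made for BRCA expression and for the CCP test value.

```python
def correlation_call(brca_high: bool, ccp_high: bool) -> str:
    """Return 'correlated' when both calls agree, else 'anti-correlated'."""
    return "correlated" if brca_high == ccp_high else "anti-correlated"


def is_brca_deficient(brca_high: bool, ccp_high: bool) -> bool:
    """In some embodiments, anti-correlation indicates BRCA deficiency."""
    return correlation_call(brca_high, ccp_high) == "anti-correlated"


# Low BRCA expression coupled with a high CCP test value.
example_call = correlation_call(brca_high=False, ccp_high=True)
example_deficient = is_brca_deficient(brca_high=False, ccp_high=True)
```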
In yet another aspect the invention provides a kit for practicing the methods and for use in the systems of the present invention. The kit may include a carrier for the various components of the kit. The carrier can be a container or support, in the form of, e.g., bag, box, tube, rack, and is optionally compartmentalized. The carrier may define an enclosed confinement for safety purposes during shipment and storage.
The kit includes various components useful in determining the expression of BRCA1 and/or BRCA2, the expression of at least two CCP genes, and optionally the expression of one or more housekeeping gene markers and/or the methylation status of BRCA1 and/or BRCA2. For example, the kit may include oligonucleotides specifically hybridizing under high stringency to mRNA or cDNA of BRCA1, BRCA2, or the genes in Tables 1 to 5 or Panels A to F. Such oligonucleotides can be used as PCR primers in RT-PCR reactions, or hybridization probes.
Various techniques for determining BRCA status are known to those skilled in the art. In some embodiments the whole genome of one or more cells is determined and the sequence of a BRCA gene found within that genome is analyzed for mutations. In some embodiments a BRCA gene is specifically sequenced, which may include exon sequencing, sequencing of exons along with at least some amount of flanking intronic sequence, or sequencing of the entire genomic region containing the BRCA gene of interest. Copy number analysis may also be used. In some embodiments large rearrangement analysis is used to determine whether large portions of the BRCA gene (or even the entire gene) have been deleted or duplicated. In some embodiments methylation analysis is used to determine BRCA status.
The foregoing and other advantages and features of the invention, and the manner in which the same are accomplished, will become more readily apparent upon consideration of the following detailed description of the invention taken in conjunction with the accompanying examples and drawings, which illustrate preferred and exemplary embodiments.
It has been discovered that measuring BRCA expression together with cell-cycle progression (“CCP”) gene expression can effectively identify tumors with BRCA deficiency (Example 2). Specifically, we determined that tumors in which BRCA and CCP expression are anti-correlated represent a subgroup of BRCA deficient tumors (id.). This subgroup is generally characterized by BRCA hypermethylation (id.). Thus determining BRCA and CCP expression levels can effectively identify BRCA deficient tumors better than BRCA expression alone. Accordingly the invention generally provides compositions and methods for determining BRCA status.
In one aspect the invention provides a method for determining gene expression comprising measuring the expression of BRCA1 and/or BRCA2 (BRCA expression) in a sample and measuring the expression of a panel of CCP genes in the sample. Some embodiments further comprise determining whether BRCA expression is correlated to CCP expression. Some embodiments further comprise analyzing methylation in BRCA1 and/or BRCA2 in the sample.
As mentioned above, anti-correlation between BRCA and CCP expression is correlated with BRCA deficiency. Thus another aspect of the invention provides a method for determining whether a sample is BRCA deficient comprising measuring the expression of BRCA1 and/or BRCA2 (BRCA expression) in said sample and measuring the expression of a panel of CCP genes in the sample. “BRCA deficient” and “BRCA deficiency” mean attenuated cellular activity of BRCA1 and/or BRCA2 protein. This can include deletion of part or all of the BRCA1 and/or BRCA2 gene, lowered transcription and/or stability of BRCA1 and/or BRCA2 mRNA (e.g., as caused by hypermethylation), lowered translation of BRCA1 and/or BRCA2 protein, or mutation(s) in the BRCA1 and/or BRCA2 gene or transcripts leading to a protein with lowered biochemical activity.
“Cell-cycle progression gene” and “CCP gene” herein refer to a gene whose expression level closely tracks the progression of the cell through the cell-cycle. See, e.g., Whitfield et al., M
Whether a particular gene is a CCP gene may be determined by any technique known in the art, including that taught in Whitfield et al., M
Additional CCP gene panels useful in the invention are as follows:
Various embodiments of the invention involve determining the expression of genes (e.g., BRCA1, BRCA2, CCP genes, etc.) in a sample. In the context of an individual test gene, “expression level” means the amount (normalized or absolute) of an analyte associated with that gene in a sample. For example, the level of BRCA1 expression can be the amount of BRCA1 transcript (or cDNA reverse transcribed from such transcript) or protein in a sample.
Those skilled in the art are familiar with various techniques for determining the expression level of a gene or protein in a tissue or cell sample. Gene expression can be determined either at the RNA level (i.e., noncoding RNA (ncRNA), mRNA, miRNA, tRNA, rRNA, snoRNA, siRNA and piRNA) or at the protein level. Expression analysis at the RNA level can be done using, e.g., microarray analysis (e.g., for assaying mRNA or microRNA expression, copy number, etc.), quantitative real-time PCR™ (“qRT-PCR™”, e.g., TaqMan™), etc. Levels of proteins in a tumor sample can be determined by any known techniques in the art, e.g., HPLC, mass spectrometry, or using antibodies specific to selected proteins (e.g., IHC, ELISA, etc.). The activity level of a polypeptide encoded by a gene may be used in much the same way as the expression level of the gene or polypeptide. Often higher activity levels indicate higher expression levels while lower activity levels indicate lower expression levels. Thus, in some embodiments, the activity level of a polypeptide encoded by a gene is determined rather than or in addition to the expression level of the gene. Those skilled in the art are familiar with techniques for measuring the activity of various such proteins, including BRCA1, BRCA2, and those encoded by the genes listed in Tables 1 to 5. The methods of the invention may be practiced independent of the particular technique used.
In some embodiments, the expression of one or more normalizing genes is also obtained for use in normalizing the expression of test genes. As used herein, “normalizing genes” refers to the genes whose expression is used to calibrate or normalize the measured expression of the gene of interest (e.g., test genes). Importantly, the expression of normalizing genes should be independent of cancer outcome/prognosis, and the expression of the normalizing genes should be very similar among all the tumor samples. Normalization ensures accurate comparison of expression of a test gene between different samples. For this purpose, housekeeping genes known in the art can be used. Housekeeping genes are well known in the art, with examples including, but not limited to, GUSB (glucuronidase, beta), HMBS (hydroxymethylbilane synthase), SDHA (succinate dehydrogenase complex, subunit A, flavoprotein), UBC (ubiquitin C) and YWHAZ (tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, zeta polypeptide). One or more housekeeping genes can be used. Preferably, at least 2, 5, 10 or 15 housekeeping genes are used to provide a combined normalizing gene set. The amount of gene expression of such normalizing genes can be averaged, or combined together by straight addition or by a defined algorithm. Some examples of particularly useful housekeeper genes for use in the methods and compositions of the invention include those listed in Table A below.
In the case of measuring RNA levels for the genes, one convenient and sensitive approach is the real-time quantitative PCR™ (qPCR™) assay, following a reverse transcription reaction. Typically, a cycle threshold (Ct) is determined for each test gene and each normalizing gene, i.e., the number of cycles at which the fluorescence from a qPCR reaction is detectable above background.
The overall expression of the one or more normalizing genes can be represented by a “normalizing value” which can be generated by combining the expression of all normalizing genes, either weighted equally (straight addition or averaging) or by different predefined coefficients. In one simple example, the normalizing value CtH can be the cycle threshold (Ct) of one single normalizing gene, or an average of the Ct values of 2 or more, preferably 10 or more, or 15 or more normalizing genes, in which case, the predefined coefficient is 1/N, where N is the total number of normalizing genes used. Thus, CtH=(CtH1+CtH2+ . . . +CtHn)/N. As will be apparent to skilled artisans, depending on the normalizing genes used, and the weight desired to be given to each normalizing gene, any coefficients (from 0/N to N/N) can be given to the normalizing genes in weighting the expression of such normalizing genes. That is, CtH=xCtH1+yCtH2+ . . . +zCtHn, wherein x+y+ . . . +z=1.
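The two formulas for the normalizing value can be illustrated with a short Python sketch. The function name and the Ct values below are hypothetical; the only constraint taken from the text is that the coefficients sum to 1, with equal coefficients of 1/N giving the simple average.

```python
def normalizing_value(ct_values, weights=None):
    """Compute CtH as a weighted sum of normalizing-gene Ct values.

    With weights omitted, each gene gets the coefficient 1/N (simple average).
    """
    n = len(ct_values)
    if n == 0:
        raise ValueError("at least one normalizing gene is required")
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("one weight per normalizing gene is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * ct for w, ct in zip(weights, ct_values))


# Hypothetical Ct values for three housekeeping genes.
housekeeping_ct = [20.0, 22.0, 24.0]
ct_h_equal = normalizing_value(housekeeping_ct)
ct_h_weighted = normalizing_value(housekeeping_ct, [0.5, 0.25, 0.25])
```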
As discussed above, the methods of the invention generally involve determining the level of expression of a panel of CCP genes. With modern high-throughput techniques, it is often possible to determine the expression level of tens, hundreds or thousands of genes. Indeed, it is possible to determine the level of expression of the entire transcriptome (i.e., each transcribed gene in the genome). Once such a global assay has been performed, one may then informatically analyze one or more subsets (i.e., panels) of genes. For example, one may analyze the expression of a panel comprising primarily CCP genes according to the present invention by combining the expression level values of the individual test genes to obtain a test value.
As will be apparent to a skilled artisan, such a test value represents the overall expression level of the panel of test genes (e.g., a panel composed of substantially CCP genes). In one embodiment, to provide a test value in the methods of the invention, the normalized expression for a test gene can be obtained by normalizing the measured Ct for the test gene against the CtH, i.e., ΔCt1=(Ct1−CtH). Thus, the test value representing the overall expression of the plurality of test genes can be provided by combining the normalized expression of all test genes, either by straight addition or averaging (i.e., weighted equally) or by a different predefined coefficient. For example, the simplest approach is averaging the normalized expression of all test genes: test value=(ΔCt1+ΔCt2+ . . . +ΔCtn)/n. As will be apparent to skilled artisans, depending on the test genes used, different weight can also be given to different test genes in the present invention.
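The ΔCt normalization and the simple averaged test value described above can be sketched as follows. The function names and Ct values are hypothetical and serve only to make the arithmetic concrete.

```python
def delta_ct(ct_test, ct_h):
    """Normalize a test-gene Ct against the normalizing value CtH."""
    return ct_test - ct_h


def averaged_panel_value(test_cts, ct_h):
    """Average the normalized expression (delta Ct) of all test genes."""
    if not test_cts:
        raise ValueError("at least one test gene is required")
    deltas = [delta_ct(ct, ct_h) for ct in test_cts]
    return sum(deltas) / len(deltas)


# Hypothetical Ct values for three test genes and a normalizing value CtH.
test_gene_cts = [25.0, 27.0, 26.0]
ct_h = 22.0
panel_value = averaged_panel_value(test_gene_cts, ct_h)
```

With these assumed numbers the individual ΔCt values are 3, 5 and 4, giving an averaged test value of 4. Because Ct counts amplification cycles, a lower ΔCt corresponds to more starting template.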
Thus in methods of the invention described herein comprising determining the expression of a panel of CCP genes, such determining step may comprise: (1) determining the expression of a panel of genes in the sample comprising at least two CCP genes; and (2) providing a test value by (a) weighting the determined expression of each of a plurality of test genes selected from said panel of genes with a predefined coefficient, and (b) combining the weighted expression to provide said test value. This test value represents the level of expression of the panel of genes in the sample. In embodiments involving comparison or analysis of CCP expression, the test value will often be compared to BRCA expression in order to determine whether the two are correlated or anti-correlated. In some embodiments, anti-correlation indicates BRCA deficiency.
In some embodiments the methods of the invention comprise determining the status of a panel (i.e., a plurality) of test genes comprising a plurality of CCP genes (e.g., to provide a test value representing the average expression of the test genes). For example, increased expression in a panel of test genes may refer to the average expression level of all panel genes in a particular patient being higher than the average expression level of these genes in normal patients (or higher than some index value that has been determined to represent the normal average expression level). Alternatively, increased expression in a panel of test genes may refer to increased expression in at least a certain number (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30 or more) or at least a certain proportion (e.g., 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 100%) of the genes in the panel as compared to the average normal expression level.
In some embodiments the plurality of test genes (which may itself be a sub-panel analyzed informatically) comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 70, 80, 90, 100, 200, or more CCP genes. In some embodiments the plurality of test genes comprises at least 10, 15, 20, or more CCP genes. In some embodiments the plurality of test genes comprises between 5 and 100 CCP genes, between 7 and 40 CCP genes, between 5 and 25 CCP genes, between 10 and 20 CCP genes, or between 10 and 15 CCP genes. In some embodiments CCP genes comprise at least a certain proportion of the plurality of test genes used to provide a test value. Thus in some embodiments the plurality of test genes comprises at least 25%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% CCP genes. In some preferred embodiments the plurality of test genes comprises at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 70, 80, 90, 100, 200, or more CCP genes, and such CCP genes constitute at least 50%, 60%, 70%, preferably at least 75%, 80%, 85%, more preferably at least 90%, 95%, 96%, 97%, 98%, or 99% or more of the total number of genes in the plurality of test genes.
In some embodiments the CCP genes are the genes in any one of Table 1 and Panels A through G. In some embodiments the test panel comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, or more of the genes in any of Tables 1 to 5 and Panels A to F. In some embodiments the invention provides methods comprising determining (e.g., in a sample) the expression of the genes in any one of Tables 1 to 5 and Panels A to F.
It has been determined that, once the CCP phenomenon reported herein is appreciated, the choice of individual CCGs for a test panel can, in some embodiments, be somewhat arbitrary. In other words, many CCGs have been found to be very good surrogates for each other. Thus any CCG (or panel of CCGs) can be used in the various embodiments of the invention. In other embodiments of the invention, optimized CCGs are used. One way of assessing whether particular CCGs will serve well in the methods and compositions of the invention is by assessing their correlation with the mean expression of CCGs (e.g., all known CCGs, a specific set of CCGs, etc.). Those CCGs that correlate particularly well with the mean are expected to perform well in assays of the invention, e.g., because these will reduce noise in the assay.
126 CCGs and 47 housekeeping genes had their expression compared to the CCG and housekeeping mean in order to determine preferred genes for use in some embodiments of the invention. Rankings of select CCGs according to their correlation with the mean CCG expression as well as their ranking according to predictive value are given in Tables 2, 3, 5, 6, & 7.
Assays of 126 CCGs and 47 HK (housekeeping) genes were run against 96 commercially obtained, anonymous prostate tumor FFPE samples without outcome or other clinical data. The working hypothesis was that the assays would measure with varying degrees of accuracy the same underlying phenomenon (cell cycle proliferation within the tumor for the CCGs, and sample concentration for the HK genes). Assays were ranked by the Pearson's correlation coefficient between the individual gene and the mean of all the candidate genes, that being the best available estimate of biological activity. Rankings for these 126 CCGs according to their correlation to the overall CCG mean are reported in Table 6.
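The ranking procedure described above, in which each assay is scored by its Pearson correlation with the mean of all candidate genes, can be sketched in Python. The gene labels and expression values are invented for illustration, and the helper names are hypothetical; this is not the code used to generate the reported rankings.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def rank_by_correlation_to_mean(expr):
    """Rank genes by correlation with the per-sample mean of all genes."""
    genes = list(expr)
    n_samples = len(expr[genes[0]])
    mean_profile = [
        sum(expr[g][i] for g in genes) / len(genes) for i in range(n_samples)
    ]
    scores = {g: pearson(expr[g], mean_profile) for g in genes}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical expression of three genes across four samples.
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 1.0, 3.0, 2.0],
}
ranking = rank_by_correlation_to_mean(expr)
```

In this toy data GENE_A and GENE_B track the mean closely, while GENE_C does not and ranks last.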
After excluding CCGs with low average expression, assays that produced sample failures, CCGs with correlations less than 0.58, and HK genes with correlations less than 0.95, a subset of 56 CCGs (Panel H) and 36 HK candidate genes were left. Correlation coefficients were recalculated on these subsets, with the rankings shown in Tables 7 and 8, respectively.
The CCGs in Panel F were likewise ranked according to correlation to the CCG mean as shown in Table 9 below.
When choosing specific CCGs for inclusion in any embodiment of the invention, the individual predictive power of each gene may be used to rank them in importance. The inventors have determined that the CCGs in Panel C can be ranked as shown in Table 10 below according to the predictive power of each individual gene. The CCGs in Panel F can be similarly ranked as shown in Table 11 below.
Thus, in some embodiments of each of the various aspects of the invention the plurality of test genes comprises the top 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40 or more genes listed in Table 6, 7, 9, 10, or 11. In some embodiments the plurality of test genes comprises at least some number of CCGs (e.g., at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more CCGs) and this plurality of CCGs comprises at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, or 20 of the following genes: ASPM, BIRC5, BUB1B, CCNB2, CDC2, CDC20, CDCA8, CDKN3, CENPF, DLGAP5, FOXM1, KIAA0101, KIF11, KIF2C, KIF4A, MCM10, NUSAP1, PRC1, RACGAP1, and TPX2. In some embodiments the plurality of test genes comprises at least some number of CCGs (e.g., at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more CCGs) and this plurality of CCGs comprises at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, or 20 of the following genes: TPX2, CCNB2, KIF4A, KIF2C, BIRC5, RACGAP1, CDC2, PRC1, DLGAP5/DLG7, CEP55, CCNB1, TOP2A, CDC20, KIF20A, BUB1B, CDKN3, NUSAP1, CCNA2, KIF11, and CDCA8. In some embodiments the plurality of test genes comprises at least some number of CCGs (e.g., at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more CCGs) and this plurality of CCGs comprises any one, two, three, four, five, six, seven, eight, nine, or ten or all of gene numbers 1 & 2, 1 to 3, 1 to 4, 1 to 5, 1 to 6, 1 to 7, 1 to 8, 1 to 9, or 1 to 10 of any of Table 6, 7, 9, 10, or 11. In some embodiments the plurality of test genes comprises at least some number of CCGs (e.g., at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more CCGs) and this plurality of CCGs comprises any one, two, three, four, five, six, seven, eight, or nine or all of gene numbers 2 & 3, 2 to 4, 2 to 5, 2 to 6, 2 to 7, 2 to 8, 2 to 9, or 2 to 10 of any of Table 6, 7, 9, 10, or 11.
In some embodiments the plurality of test genes comprises at least some number of CCGs (e.g., at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more CCGs) and this plurality of CCGs comprises any one, two, three, four, five, six, seven, or eight or all of gene numbers 3 & 4, 3 to 5, 3 to 6, 3 to 7, 3 to 8, 3 to 9, or 3 to 10 of any of Table 6, 7, 9, 10, or 11. In some embodiments the plurality of test genes comprises at least some number of CCGs (e.g., at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more CCGs) and this plurality of CCGs comprises any one, two, three, four, five, six, or seven or all of gene numbers 4 & 5, 4 to 6, 4 to 7, 4 to 8, 4 to 9, or 4 to 10 of any of Table 6, 7, 9, 10, or 11. In some embodiments the plurality of test genes comprises at least some number of CCGs (e.g., at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more CCGs) and this plurality of CCGs comprises any one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, or 15 or all of gene numbers 1 & 2, 1 to 3, 1 to 4, 1 to 5, 1 to 6, 1 to 7, 1 to 8, 1 to 9, 1 to 10, 1 to 11, 1 to 12, 1 to 13, 1 to 14, or 1 to 15 of any of Table 6, 7, 9, 10, or 11.
In CCP signatures the particular CCP genes analyzed are often not as important as the total number of CCP genes. The number of CCP genes analyzed can vary depending on many factors, e.g., technical constraints, cost considerations, the classification being made, the cancer being tested, the desired level of predictive power, etc. Increasing the number of CCP genes analyzed in a panel according to the invention is, as a general matter, advantageous because, e.g., a larger pool of genes to be analyzed means less “noise” caused by outliers and less chance of an error in measurement or analysis throwing off the overall predictive power of the test. However, cost and other considerations will sometimes limit this number and finding the optimal number of CCP genes for a signature is desirable.
It has been discovered that the predictive power of a CCP signature often ceases to increase significantly beyond a certain number of CCP genes (see
(Pn+1−Pn)<CO,
wherein P is the predictive power (i.e., Pn is the predictive power of a signature with n genes and Pn+1 is the predictive power of a signature with n genes plus one) and CO is some optimization constant. Predictive power can be defined in many ways known to those skilled in the art including, but not limited to, the signature's p-value. CO can be chosen by the artisan based on his or her specific constraints. For example, if cost is not a critical factor and extremely high levels of sensitivity and specificity are desired, CO can be set very low such that only trivial increases in predictive power are disregarded. On the other hand, if cost is decisive and moderate levels of sensitivity and specificity are acceptable, CO can be set higher such that only significant increases in predictive power warrant increasing the number of genes in the signature.
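One way to apply the condition (Pn+1−Pn)<CO is sketched below in Python. The sketch assumes a predictive-power measure for which larger values are better (for example, a transformed p-value); the function name, the power values and the two choices of CO are hypothetical.

```python
def optimal_gene_count(power_by_n, c_o):
    """Return the smallest n for which P(n+1) - P(n) < C_O.

    power_by_n[i] is the predictive power of a signature with i + 1 genes.
    If the gain never drops below c_o, the largest tested n is returned.
    """
    for i in range(len(power_by_n) - 1):
        if power_by_n[i + 1] - power_by_n[i] < c_o:
            return i + 1
    return len(power_by_n)


# Hypothetical predictive power for signatures of 1 to 5 genes.
power = [1.0, 2.0, 2.8, 3.0, 3.05]

n_when_cost_matters = optimal_gene_count(power, c_o=0.5)
n_when_power_matters = optimal_gene_count(power, c_o=0.1)
```

A higher CO stops at fewer genes (here three), while a lower CO continues to add genes until the gain becomes trivial (here four).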
Alternatively, a graph of predictive power as a function of gene number may be plotted (as in
Example 1 and
Determining expression levels can be, to varying degrees, quantitative, qualitative, or both. For example, when determining the BRCA1 mRNA transcript levels in a sample, the absolute number of transcripts can be determined. Alternatively, the absolute number of transcripts may be normalized against some standard as discussed above to yield a relative rather than absolute expression level. When determining protein expression levels, more qualitative analysis is common. For example, tissue samples may be stained with an antibody against BRCA1 protein and the level of staining in tumor cells can be assigned certain semi-quantitative numbers (e.g., −1, 0, +1). Assigning particular expression levels in this way will often be based on an internal control (e.g., surrounding non-tumor cells) or an external control (e.g., unrelated BRCA-intact cells).
Those skilled in the art are familiar with various ways of determining the expression of a panel (plurality) of genes (e.g., CCP genes). One may determine the expression of a panel of genes by determining the average (e.g., mean, median, weighted average, etc.) expression level, normalized or absolute, of panel genes in a sample obtained from a particular patient (either throughout the sample or in a subset of cells from the sample or in a single cell). Increased expression in this context will mean the average expression is higher than the average expression level of these genes in normal patients (or higher than some index value, e.g., a value that has been determined to represent the average expression level in a reference population (e.g., patients with cancer or patients with the same cancer)). Alternatively, one may determine the expression of a panel of genes by determining the average expression level (normalized or absolute) of at least a certain number (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30 or more) or at least a certain proportion (e.g., 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 100%) of the genes in the panel. Alternatively, one may determine the expression of a panel of genes by determining the absolute copy number of the mRNA (or protein) of all the genes in the panel and either totaling or averaging these across the genes.
In preferred embodiments, the test value representing the expression level of a test gene (e.g., BRCA1) or a plurality of test genes (e.g., a panel of CCP genes) is compared to one or more reference values (or index values) to determine if expression of the test gene(s) is high, low, average, etc. Once BRCA and CCP expression have thus been determined as high, low, etc., one can, according to the methods of the present invention, determine whether BRCA and CCP expression are correlated or anti-correlated.
Those skilled in the art are familiar with various ways of deriving and using index values. For example, the index value may represent the gene expression levels found in a normal sample obtained from the patient of interest, in which case an expression level (e.g., test value) in the test sample significantly above this index value would indicate high expression in the sample.
Alternatively, the index value may represent the average expression level for a set of individuals from a diverse population or a subset of the population. For example, one may determine the average expression level of a gene or gene panel in a random sampling of patients. This average expression level may be termed the “threshold index value.” In some embodiments of the invention the methods comprise determining whether the expression of one or more test genes is “increased” or “high.” In the context of the invention, “increased” or “high” expression of a test gene means the patient's expression level is either elevated over a normal index value or a threshold index (e.g., by at least some threshold amount (e.g., a standard deviation)) or within the range of expression that has been determined in patients to be high (e.g., top quartile of reference patients).
Alternative index values may be derived by dividing patients into groups based on expression level. For example, one may determine the level of expression of the test gene(s) for a set of patients and group the patients into terciles, quartiles, quintiles, etc. A threshold may be set at the boundary of each group, with test patients being placed into a group (e.g., quartile) depending on which threshold(s) their determined expression exceeds.
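As a non-limiting sketch of the grouping approach just described, thresholds may be placed at the boundaries between equal-sized groups of reference patients, and a test patient assigned according to how many thresholds the patient's expression exceeds. The function names and the "inclusive" quantile method are assumptions chosen for this example; other quantile definitions would serve equally well.

```python
from bisect import bisect_left
from statistics import quantiles


def group_thresholds(reference_values, n_groups=4):
    """Return the n_groups - 1 cut points separating the reference
    patients into equal-sized groups (quartiles when n_groups is 4)."""
    return quantiles(reference_values, n=n_groups, method="inclusive")


def assign_group(value, thresholds):
    """Return the 1-based group of a test value: one plus the number of
    thresholds that the value strictly exceeds."""
    return 1 + bisect_left(thresholds, value)


# Hypothetical reference expression levels, for illustration only.
reference = [1, 2, 3, 4, 5, 6, 7, 8, 9]
cuts = group_thresholds(reference)        # quartile boundaries
print(cuts, assign_group(8, cuts))
```

Under this sketch, a patient placed in the highest group (e.g., the top quartile) would be one example of a patient with "high" expression.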
Alternatively, index values may be determined as follows: In order to assign patients to risk groups (e.g., high likelihood of having cancer, high likelihood of recurrence/progression), a threshold value will be set for the cell cycle mean. The optimal threshold value is selected based on the receiver operating characteristic (ROC) curve, which plots sensitivity vs (1−specificity). For each increment of the cell cycle mean, the sensitivity and specificity of the test are calculated using that value as a threshold. The actual threshold will be the value that optimizes these metrics according to the artisan's requirements (e.g., what degree of sensitivity or specificity is desired, etc.).
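The ROC-based selection described above may be sketched as follows. This is an illustrative example only: the function names are hypothetical, and the use of Youden's J statistic (sensitivity + specificity − 1) as the optimization criterion is an assumption, since the criterion is left to the artisan's requirements. An optional minimum sensitivity stands in for such a requirement.

```python
def roc_points(scores, labels):
    """For each distinct score used as a threshold, return a tuple
    (threshold, sensitivity, specificity). A sample is called positive
    when its score is greater than or equal to the threshold."""
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    points = []
    for t in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y and s >= t)
        true_neg = sum(1 for s, y in zip(scores, labels) if not y and s < t)
        points.append((t, true_pos / positives, true_neg / negatives))
    return points


def pick_threshold(points, min_sensitivity=0.0):
    """Among thresholds meeting the minimum sensitivity, return the one
    maximizing Youden's J (an assumed criterion, see lead-in)."""
    eligible = [p for p in points if p[1] >= min_sensitivity]
    if not eligible:
        raise ValueError("no threshold meets the sensitivity requirement")
    return max(eligible, key=lambda p: p[1] + p[2] - 1)[0]


# Hypothetical cell cycle means and outcomes (1 = event, 0 = no event).
cell_cycle_means = [0.1, 0.2, 0.6, 0.9]
outcomes = [0, 0, 1, 1]
print(pick_threshold(roc_points(cell_cycle_means, outcomes)))
```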
As mentioned above, anti-correlation between BRCA and CCP expression indicates BRCA deficiency. Thus in one aspect the invention provides a method for determining whether a sample is BRCA deficient comprising measuring the expression of BRCA1 and/or BRCA2 (BRCA expression) in said sample, measuring the expression of a panel of CCP genes in the sample, and determining whether BRCA expression is correlated to CCP expression. In this context, BRCA and CCP expression are “correlated” in a sample if BRCA and CCP expression are both high, low, or intermediate in the sample. Conversely, BRCA and CCP expression are “anti-correlated” in a sample if one is low while the other is high or if one is either high or low and the other is intermediate in the sample. In a preferred embodiment BRCA and CCP expression are anti-correlated if BRCA (especially BRCA1) expression is low and CCP expression (especially expression of one of the panels in Tables 1 to 5 (e.g., Panels A to F)) is high.
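The definitions of "correlated" and "anti-correlated" given above may be expressed as a simple comparison of categorical expression levels. The following sketch is illustrative only; the level labels and function names are assumptions made for this example.

```python
LEVELS = ("low", "intermediate", "high")


def correlation_status(brca_level, ccp_level):
    """Return "correlated" when BRCA and CCP expression fall in the same
    category (both high, both low, or both intermediate) and
    "anti-correlated" otherwise."""
    if brca_level not in LEVELS or ccp_level not in LEVELS:
        raise ValueError("levels must be one of: " + ", ".join(LEVELS))
    return "correlated" if brca_level == ccp_level else "anti-correlated"


def is_preferred_anticorrelation(brca_level, ccp_level):
    """Preferred embodiment: low BRCA expression with high CCP expression."""
    return brca_level == "low" and ccp_level == "high"


print(correlation_status("low", "high"))            # anti-correlated
print(correlation_status("intermediate", "high"))   # anti-correlated
print(correlation_status("high", "high"))           # correlated
```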
In some embodiments the sample is from a patient having (or suspected of having) ovarian cancer, breast cancer, lung cancer, colon cancer, or prostate cancer, or any combination of these. In some embodiments, the sample is a tumor tissue sample, a blood or blood derivative (e.g., serum, plasma) sample, a urine sample, or any other sample derived from the body of a patient. In some embodiments the sample used to determine expression levels is some derivative of these bodily samples (e.g., an isolate of the RNA, DNA, protein, etc. from a bodily sample).
In some embodiments, the invention provides a method for determining whether a sample is BRCA deficient comprising measuring the expression of BRCA1 and/or BRCA2 (BRCA expression) in said sample, measuring the expression of a panel of CCP genes in the sample, and determining whether BRCA expression is correlated to CCP expression, wherein anti-correlation between BRCA and CCP expression indicates the sample is BRCA deficient.
In some embodiments anti-correlation between BRCA and CCP expression indicates the sample has BRCA hypermethylation. Some embodiments further comprise determining the methylation status and level of a gene or panel of genes (preferably the BRCA1 and/or BRCA2 gene) in the sample. As used herein, “methylation status” is used to indicate the presence or absence or the level or extent of methyl group modification in the polynucleotide of at least one gene. As used herein, “methylation level” is used to indicate the quantitative measurement of methylated DNA for a given gene, defined as the percentage of total DNA copies of that gene that are determined to be methylated, based on quantitative methylation-specific PCR.
Any assay that can be employed to determine the methylation status of the gene or gene panel should suffice for the purposes of the present invention. In general, assays are designed to assess the methylation status of individual genes, or portions thereof. Examples of types of assays used to assess the methylation pattern include, but are not limited to, Southern blotting, methylation-specific polymerase chain reaction (MSPCR), restriction landmark genomic scanning for methylation (RLGS-M), CpG island microarray, single nucleotide primer extension (SNuPE), and combined bisulfite restriction analysis (COBRA). The COBRA technique is disclosed in Xiong & Laird, N
In some embodiments the panel of CCP genes comprises at least two (or five, or six, or ten, or 15, or more) CCP genes from any of Tables 1 to 5. In some embodiments the panel of CCP genes comprises the genes listed in Table 4. In some embodiments the panel of CCP genes comprises the genes in Panel F. In some embodiments the panel of CCP genes comprises the genes listed in Table 5.
BRCA deficiency has been found to be correlated with, inter alia, progression-free survival (Example 2). Specifically, BRCA deficient patients show a significantly longer progression-free survival than non-BRCA-deficient patients. Thus in one aspect the invention provides a method of classifying a cancer comprising measuring the expression of BRCA1 and/or BRCA2 (BRCA expression) in a sample and measuring the expression of two or more CCP genes in the sample. Some embodiments further comprise determining whether BRCA expression is correlated to CCP expression. In some embodiments, anti-correlation between BRCA and CCP expression indicates any one of the following: greater likelihood of survival (e.g., progression-free survival, overall survival, etc.), greater likelihood of response to DNA damaging agents (e.g., platinum chemotherapy drugs, etc.), greater likelihood of response to drugs targeting the poly (ADP-ribose) polymerase (PARP) pathway, etc.
As used herein, a patient has an “increased likelihood” of some clinical feature or outcome (e.g., recurrence, progression, response to a particular therapeutic regimen, etc.) if the probability of the patient having the feature or outcome exceeds some reference probability or value. The reference probability may be the probability of the feature or outcome across the general relevant patient population. For example, if the probability of recurrence in the general breast cancer population is X % and a particular patient has been determined by the methods of the present invention to have a probability of recurrence of Y %, and if Y>X, then the patient has an “increased likelihood” of recurrence. Alternatively, as discussed above, a threshold or reference value may be determined and a particular patient's probability of recurrence may be compared to that threshold or reference.
Those skilled in the art are familiar with various techniques for determining gene expression and any technique that determines gene expression can be used in the methods of the invention. In some embodiments gene expression is determined using any of the following techniques: quantitative PCR™ (e.g., TaqMan™), microarray hybridization analysis, quantitative sequencing, etc.
The results of any analyses according to the invention will often be communicated to physicians, genetic counselors and/or patients (or other interested parties such as researchers) in a transmittable form that can be communicated or transmitted to any of the above parties. Such a form can vary and can be tangible or intangible. The results can be embodied in descriptive statements, diagrams, photographs, charts, images or any other visual forms. For example, graphs showing expression or activity level or sequence variation information for various genes can be used in explaining the results. Diagrams showing such information for additional target gene(s) are also useful in indicating some testing results. The statements and visual forms can be recorded on a tangible medium such as paper, computer readable media such as floppy disks, compact disks, etc., or on an intangible medium, e.g., an electronic medium in the form of email or a website on the internet or an intranet. In addition, results can also be recorded in a sound form and transmitted through any suitable medium, e.g., analog or digital cable lines, fiber optic cables, etc., via telephone, facsimile, wireless mobile phone, internet phone and the like.
Thus, the information and data on a test result can be produced anywhere in the world and transmitted to a different location. As an illustrative example, when an expression level, activity level, or sequencing (or genotyping) assay is conducted outside the United States, the information and data on a test result may be generated, cast in a transmittable form as described above, and then imported into the United States. Accordingly, the present invention also encompasses a method for producing a transmittable form of information on at least one of (a) expression level or (b) activity level for at least one patient sample. The method comprises the steps of (1) determining at least one of (a) or (b) above according to methods of the present invention; and (2) embodying the result of the determining step in a transmittable form. The transmittable form is the product of such a method.
Techniques for analyzing such expression, activity, and/or sequence data (indeed any data obtained according to the invention) will often be implemented using hardware, software or a combination thereof in one or more computer systems or other processing systems capable of effectuating such analysis.
Thus one aspect of the present invention provides systems related to the above methods of the invention. In one embodiment the invention provides a system for determining gene expression in a tumor sample, comprising: (1) a sample analyzer for determining the expression levels of BRCA1 and/or BRCA2 and a panel of genes comprising at least two CCP genes in a sample, wherein the sample analyzer contains the sample, mRNA from the sample and expressed from the panel of genes, or cDNA synthesized from said mRNA; (2) a first computer program means for (a) receiving gene expression data on BRCA1 and/or BRCA2, (b) receiving gene expression data on at least two test genes selected from the panel of genes, (c) weighting the determined expression of each of the test genes with a predefined coefficient, and (d) combining the weighted expression to provide a CCP test value representing the expression level of the panel of genes.
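As one illustrative, non-limiting example of the weighting and combining steps performed by the first computer program means, each test gene's expression may be multiplied by its predefined coefficient and the products summed. Combining by summation, the function name, and the gene names and coefficient values shown are all assumptions made for this sketch.

```python
def ccp_test_value(expression, coefficients):
    """Weight each test gene's expression by its predefined coefficient
    and combine the weighted values (here, by summation) into a single
    CCP test value.

    expression:   dict mapping gene name -> determined expression level
    coefficients: dict mapping gene name -> predefined coefficient
    """
    missing = [g for g in coefficients if g not in expression]
    if missing:
        raise KeyError("no expression data for: " + ", ".join(missing))
    return sum(coefficients[g] * expression[g] for g in coefficients)


# Hypothetical test genes and coefficients, for illustration only.
received = {"GENE_A": 2.0, "GENE_B": 3.0}
coeffs = {"GENE_A": 0.5, "GENE_B": 1.0}
print(ccp_test_value(received, coeffs))
```

With equal coefficients that sum to one, this reduces to the simple mean of the test genes.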
As with the methods of the invention, the systems of the invention may be used to determine whether BRCA and/or CCP expression in a sample are high, low, etc. Thus in some embodiments the above system further comprises a computer program means of comparing the expression of BRCA1 and/or BRCA2 to a reference value, wherein expression of BRCA1 and/or BRCA2 above this reference value indicates said BRCA1 and/or BRCA2 expression is high. In some embodiments the above system further comprises a computer program means of comparing the CCP test value to a reference value, wherein a CCP test value above this reference value indicates CCP expression is high.
As with the methods of the invention, the systems of the invention may be used to determine whether BRCA and CCP expression are correlated in a sample. Thus in some embodiments the above system further comprises a computer program means of comparing the expression of BRCA1 and/or BRCA2 to the CCP test value, wherein high expression of BRCA1 and/or BRCA2 coupled with a high CCP test value indicates BRCA and CCP expression are correlated, wherein low expression of BRCA1 and/or BRCA2 coupled with a low CCP test value indicates BRCA and CCP expression are correlated, wherein high expression of BRCA1 and/or BRCA2 coupled with a low CCP test value indicates BRCA and CCP expression are anti-correlated, and wherein low expression of BRCA1 and/or BRCA2 coupled with a high CCP test value indicates BRCA and CCP expression are anti-correlated.
As with the methods of the invention, the systems of the invention may be used to determine whether the sample is BRCA deficient. Thus in some embodiments the above system further comprises a computer program means of receiving data on the correlation between BRCA expression and CCP expression in a patient sample and concluding that the sample is BRCA deficient if BRCA expression and CCP expression are anti-correlated in the sample.
In some embodiments the system comprises a sample analyzer for determining the methylation status of BRCA1 and/or BRCA2. In some embodiments this sample analyzer is the same as the sample analyzer for determining gene expression.
In the systems of the invention, as with the methods of the invention described above, the test genes may comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 70, 80, 90, 100, 200, or more CCP genes. In some embodiments the test genes comprise at least 10, 15, 20, or more CCP genes. In some embodiments the test genes comprise between 5 and 100 CCP genes, between 7 and 40 CCP genes, between 5 and 25 CCP genes, between 10 and 20 CCP genes, or between 10 and 15 CCP genes. In some embodiments CCP genes comprise at least a certain proportion of the test genes used to provide a test value. Thus in some embodiments the test genes comprise at least 25%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% CCP genes. In some preferred embodiments the test genes comprise at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 70, 80, 90, 100, 200, or more CCP genes, and such CCP genes constitute at least 50%, 60%, 70%, preferably at least 75%, 80%, 85%, more preferably at least 90%, 95%, 96%, 97%, 98%, or 99% or more of the total number of test genes.
In some embodiments, the system further comprises a display module displaying the comparison between the test value and the one or more reference values, or displaying a result of the comparing step.
In a preferred embodiment, the amount of RNA transcribed from the panel of genes including test genes is measured in the sample. In addition, the amount of RNA of one or more housekeeping genes in the sample is also measured, and used to normalize or calibrate the expression of the test genes, as described above.
The sample analyzer can be any instrument useful in determining gene expression, including, e.g., a sequencing machine, a real-time PCR machine, a microarray instrument, etc. In embodiments comprising a sample analyzer for determining methylation status, such a sample analyzer can be any instrument useful in determining methylation status.
The computer-based analysis function can be implemented in any suitable language and/or browsers. For example, it may be implemented with C language and preferably using object-oriented high-level programming languages such as Visual Basic, SmallTalk, C++, and the like. The application can be written to suit environments such as the Microsoft Windows™ environment including Windows™ 98, Windows™ 2000, Windows™ NT, and the like. In addition, the application can also be written for the Macintosh™, SUN™, UNIX or LINUX environment. In addition, the functional steps can also be implemented using a universal or platform-independent programming language. Examples of such multi-platform programming languages include, but are not limited to, hypertext markup language (HTML), JAVA™, JavaScript™, Flash programming language, common gateway interface/structured query language (CGI/SQL), practical extraction report language (PERL), AppleScript™ and other system script languages, programming language/structured query language (PL/SQL), and the like. Java™- or JavaScript™-enabled browsers such as HotJava™, Microsoft™ Explorer™, or Netscape™ can be used. When active content web pages are used, they may include Java™ applets or ActiveX™ controls or other active content technologies.
The analysis function can also be embodied in computer program products and used in the systems described above or other computer- or internet-based systems. Accordingly, another aspect of the present invention relates to a computer program product comprising a computer-usable medium having computer-readable program codes or instructions embodied thereon for enabling a processor to carry out gene expression analysis. These computer program instructions may be loaded onto a computer or other programmable apparatus to produce a machine, such that the instructions which execute on the computer or other programmable apparatus create means for implementing the functions or steps described above. These computer program instructions may also be stored in a computer-readable memory or medium that can direct a computer or other programmable apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory or medium produce an article of manufacture including instruction means which implement the analysis. The computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions or steps described above.
Some embodiments of the present invention provide a system for determining whether a patient sample is BRCA deficient. Generally speaking, the system comprises (1) computer program means for receiving, storing, and/or retrieving data on the correlation between BRCA and CCP expression in a patient sample; (2) computer program means for querying this patient data; (3) computer program means for concluding whether there is or is not a correlation; and optionally (4) computer program means for outputting/displaying this conclusion. In some embodiments this means for outputting the conclusion may comprise a computer program means for informing a health care professional of the conclusion. In some embodiments the system further comprises a computer program means for receiving, storing, and/or retrieving data on BRCA and CCP expression in a patient sample and a computer program means for determining if BRCA and CCP expression are correlated in such sample.
One example of such a computer system is the computer system [300] illustrated in
The at least one memory module [306] may include, e.g., a removable storage drive [308], which can be in various forms, including but not limited to, a magnetic tape drive, a floppy disk drive, a VCD drive, a DVD drive, an optical disk drive, etc. The removable storage drive [308] may be compatible with a removable storage unit [310] such that it can read from and/or write to the removable storage unit [310]. Removable storage unit [310] may include a computer usable storage medium having stored therein computer-readable program codes or instructions and/or computer readable data. For example, removable storage unit [310] may store patient data. Examples of removable storage unit [310] are well known in the art, including, but not limited to, floppy disks, magnetic tapes, optical disks, and the like. The at least one memory module [306] may also include a hard disk drive [312], which can be used to store computer readable program codes or instructions, and/or computer readable data.
In addition, as shown in
Computer system [300] may include at least one processor module [302]. It should be understood that the at least one processor module [302] may consist of any number of devices. The at least one processor module [302] may include a data processing device, such as a microprocessor or microcontroller or a central processing unit. The at least one processor module [302] may include another logic device such as a DMA (Direct Memory Access) processor, an integrated communication processor device, a custom VLSI (Very Large Scale Integration) device or an ASIC (Application Specific Integrated Circuit) device. In addition, the at least one processor module [302] may include any other type of analog or digital circuitry that is designed to perform the processing functions described herein.
As shown in
The at least one input module [330] may include, for example, a keyboard, mouse, touch screen, scanner, and other input devices known in the art. The at least one output module [324] may include, for example, a display screen, such as a computer monitor, TV monitor, or the touch screen of the at least one input module [330]; a printer; and audio speakers. Computer system [300] may also include modems, communication ports, network cards such as Ethernet cards, and newly developed devices for accessing intranets or the internet.
The at least one memory module [306] may be configured for storing patient data entered via the at least one input module [330] and processed via the at least one processor module [302]. Patient data relevant to the present invention may include expression level, activity level, copy number and/or sequence information for a CCP and optionally PTEN. Patient data relevant to the present invention may also include clinical parameters relevant to the patient's disease. Any other patient data a physician might find useful in making treatment decisions/recommendations may also be entered into the system, including but not limited to age, gender, and race/ethnicity and lifestyle data such as diet information. Other possible types of patient data include symptoms currently or previously experienced, patient's history of illnesses, medications, and medical procedures.
The at least one memory module [306] may include a computer-implemented method stored therein. The at least one processor module [302] may be used to execute software or computer-readable instruction codes of the computer-implemented method. The computer-implemented method may be configured to, based upon the patient data, indicate whether the patient has an increased likelihood of recurrence, progression or response to any particular treatment, generate a list of possible treatments, etc.
In certain embodiments, the computer-implemented method may be configured to identify a patient as having or not having cancer or as having or not having an increased likelihood of recurrence or progression. For example, the computer-implemented method may be configured to inform a physician that a particular patient has cancer, has a quantified probability of having cancer, has an increased likelihood of recurrence, etc. Alternatively or additionally, the computer-implemented method may be configured to actually suggest a particular course of treatment based on the answers to/results for various queries.
In some embodiments, the computer-implemented method of the invention [400] is open-ended. In other words, the apparent first step [410] in
Regarding the above computer-implemented method [400], the answers to queries may be determined by the method instituting a search of patient data for the answer. For example, to answer the query [410], patient data may be searched for BRCA and CCP expression data. If such a comparison has not already been performed, the method may compare these data to some reference in order to determine if the respective expressions are high, low, average, etc. The method may also compare the respective expressions to determine if BRCA and CCP expression are correlated. Additionally or alternatively, the method may present one or more of the queries (e.g., [410]) to a user (e.g., a physician) of the computer system [300]. For example, the query [410] may be presented via an output module [324]. The user may then answer “Yes” or “No” via an input module [330]. The method may then proceed based upon the answer received. Likewise, the conclusions [430, 431, 440, 441] may be presented to a user of the computer-implemented method via an output module [324].
As used herein in the context of computer-implemented embodiments of the invention, “displaying” means communicating any information by any sensory means. Examples include, but are not limited to, visual displays, e.g., on a computer screen or on a sheet of paper printed at the command of the computer, and auditory displays, e.g., computer generated or recorded auditory expression of a patient sample's BRCA status.
The practice of the present invention may also employ conventional biology methods, software and systems. Computer software products of the invention typically include computer readable media having computer-executable instructions for performing the logic steps of the method of the invention. Suitable computer readable media include floppy disk, CD-ROM/DVD/DVD-ROM, hard-disk drive, flash memory, ROM/RAM, magnetic tapes, etc. Basic computational biology methods are described in, for example, Setubal et al., I
The present invention may also make use of various computer program products and software for a variety of purposes, such as probe design, management of data, analysis, and instrument operation. See U.S. Pat. Nos. 5,593,839; 5,795,716; 5,733,729; 5,974,164; 6,066,454; 6,090,555; 6,185,561; 6,188,783; 6,223,127; 6,229,911 and 6,308,170. Additionally, the present invention may have embodiments that include methods for providing genetic information over networks such as the Internet as shown in U.S. Ser. Nos. 10/197,621 (U.S. Pub. No. 20030097222); 10/063,559 (U.S. Pub. No. 20020183936), 10/065,856 (U.S. Pub. No. 20030100995); 10/065,868 (U.S. Pub. No. 20030120432); 10/423,403 (U.S. Pub. No. 20040049354).
In one aspect, the present invention provides methods of treating a cancer patient comprising determining whether BRCA and CCP expression are correlated in a sample from the patient and (1) recommending, prescribing, or administering a particular treatment regimen if BRCA and CCP expression are anti-correlated in the sample or (2) recommending, prescribing, or administering a particular treatment regimen if BRCA and CCP expression are correlated in the sample. In some embodiments, the particular treatment regimen comprises a DNA-damaging agent (e.g., platinum) chemotherapy if BRCA and CCP expression are anti-correlated in the sample. In some embodiments, the particular treatment regimen comprises PARP-inhibitor drugs if BRCA and CCP expression are anti-correlated in the sample. In some embodiments, if BRCA and CCP expression are correlated in the sample the particular treatment regimen comprises a regimen chosen from the group consisting of AC, FEC, FAC, FEC-T, Epirubicin-CMF, TAC, AC-Paclitaxel, AT, TC, T-Carboplatin, Lapatinib, Trastuzumab, Bevacizumab, Sunitinib, Docetaxel, Paclitaxel, Nano Paclitaxel, Docetaxel/capecitabine, Paclitaxel/gemcitabine, Docetaxel/gemcitabine, Gemcitabine, Trastuzumab/Docetaxel, Trastuzumab/Paclitaxel, Capecitabine, Lapatinib/Capecitabine, Ixabepilone, and Toco-P.
The methods of the invention are useful, inter alia, in identifying individuals who may benefit from germline BRCA testing but who may not meet the commonly applied criteria for identifying such individuals. For instance, commonly used criteria include personal history of cancer and significant family history of cancer. As used herein, “personal history of cancer” has its conventional meaning in the art (e.g., a previous cancer in the individual in question). As used herein, “significant family history of cancer” also has its conventional meaning in the art. Various guidelines have been devised and are used by healthcare professionals to determine whether an individual has a “significant family history of cancer.” These include guidelines of American Gastroenterological Association; American Society of Breast Surgeons; American Society of Clinical Oncology; American Society of Colon & Rectal Surgeons; Oncology Nursing Society; Society of Gynecologic Oncologists (e.g., women with breast cancer at ≦40 years, women with bilateral breast cancer (particularly if the first cancer was at ≦50 years); women with breast cancer at ≦50 years and a close relative with breast cancer at ≦50 years; women of Ashkenazi Jewish ancestry with breast cancer at ≦50 years; women with breast or ovarian cancer at any age and two or more close relatives with breast cancer at any age (particularly if at least one breast cancer was at ≦50 years); unaffected women with a first or second degree relative that meets one of the above criteria), etc. Other widely accepted criteria include individuals with a personal or family history of breast cancer before age 50 or ovarian cancer at any age; individuals with two or more primary diagnoses of breast and/or ovarian cancer; individuals of Ashkenazi Jewish descent with a personal or family history of breast cancer before age 50 or ovarian cancer at any age; male breast cancer patients.
A patient lacks a “significant family history of cancer” when one or more of these criteria are not met (usually all). Thus in some embodiments the patient to be assessed by the methods of the invention has a significant family history of cancer. In some embodiments the patient has a personal history of cancer.
In another aspect of the present invention, a kit is provided for practicing the methods and for use in the systems of the present invention. The kit may include a carrier for the various components of the kit. The carrier can be a container or support, in the form of, e.g., bag, box, tube, rack, and is optionally compartmentalized. The carrier may define an enclosed confinement for safety purposes during shipment and storage.
The kit includes various components useful in determining the expression of BRCA1 and/or BRCA2, the expression of at least two CCP genes, and optionally the expression of one or more housekeeping gene markers and/or the methylation status of BRCA1 and/or BRCA2. For example, the kit may include oligonucleotides specifically hybridizing under high stringency to mRNA or cDNA of BRCA1, BRCA2, or the genes in Tables 1 to 5 or Panels A to F. Such oligonucleotides can be used as PCR primers in RT-PCR reactions, or hybridization probes. In some embodiments the kit comprises reagents (e.g., probes, primers, and/or antibodies) for determining the expression level of a panel of genes, where said panel comprises at least 25%, 30%, 40%, 50%, 60%, 75%, 80%, 90%, 95%, 99%, or 100% CCP genes (e.g., CCP genes in Tables 1 to 5 or Panels A to F). In some embodiments the kit consists of reagents (e.g., probes, primers, and/or antibodies) for determining the expression level of no more than 2500 genes, wherein at least 5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 150, 200, 250, or more of these genes are CCP genes (e.g., Tables 1 to 5 or Panels A to F).
The oligonucleotides in the detection kit can be labeled with any suitable detection marker including but not limited to, radioactive isotopes, fluorephores, biotin, enzymes (e.g., alkaline phosphatase), enzyme substrates, ligands and antibodies, etc. See Jablonski et al., Nucleic Acids Res., 14:6115-6128 (1986); Nguyen et al., Biotechniques, 13:116-123 (1992); Rigby et al., J. Mol. Biol., 113:237-251 (1977). Alternatively, the oligonucleotides included in the kit are not labeled, and instead, one or more markers are provided in the kit so that users may label the oligonucleotides at the time of use.
Various other components useful in the detection techniques may also be included in the detection kit of this invention. Examples of such components include, but are not limited to, Taq polymerase, deoxyribonucleotides, dideoxyribonucleotides, other primers suitable for the amplification of a target DNA sequence, RNase A, and the like. In addition, the detection kit preferably includes instructions on using the kit for practice the prognosis method of the present invention using human samples.
The following example illustrates the validation of a CCP gene panel in predicting time to chemical recurrence after radical prostatectomy in prostate cancer patients. The following CCP gene panel was tested:
Mean mRNA expression for the above 31 CCP genes was tested on 440 prostate tumor FFPE samples using a Cox Proportional Hazard model in Splus 7.1 (Insightful, Inc., Seattle Wash.). The p-value for the likelihood ratio test was 3.98×10⁻⁵. The mean of CCP expression is robust to measurement error and individual variation between genes.
The study further aimed at determining the optimal number of CCP genes to include in a CCP panel. As mentioned above, CCP expression levels are correlated to each other so it was possible that measuring a small number of genes would be sufficient, e.g., to predict prostate cancer outcome. In order to determine the optimal number of CCP genes for the signature, the predictive power of the mean was tested for randomly selected sets of from 1 to 30 of the CCP genes listed above. To evaluate how smaller subsets of the larger CCP set (i.e., smaller CCP panels) performed, the study also compared how well the signature predicted outcome as a function of the number of CCP genes included in the signature (
This simulation showed that there is a threshold range of CCP genes in a panel that provides significantly improved predictive power (
Unselected human ovarian cancer tissues (235) were obtained under Institutional Review Board (IRB)-approved protocols. Table 9 shows the patient/cancer characteristics.
RNA/DNA Extraction from Frozen Cancers
10 μm thick sections from frozen cancer blocks in Tissue-Tek OCT (Qiagen, Valencia, Calif.) were homogenized using a TissueRuptor (Qiagen) after adding QIAzol lysis reagent, followed by RNA isolation using a Qiagen miRNeasy Mini Kit per the manufacturer's protocol. A QIAamp DNA Mini Kit (Qiagen) was used to isolate DNA per the manufacturer's protocol with overnight incubation at 56° C. and RNase A treatment.
Reverse transcription was performed using a High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Inc.) per the manufacturer's instructions. For pre-amplification, a 0.2× probe mix was made by combining 1 μL of each of 91 20× gene expression assays from Applied Biosystems, Inc. and 9 μL of low-EDTA TE. Pre-amplification was performed using 2.5 μL of 2× TaqMan® PreAmp Master Mix (Applied Biosystems, Inc.), 1.25 μL of 0.2× probe mix, and 1.25 μL cDNA. Applied Biosystems TaqMan assays (BRCA1: Hs00173233_m1/Hs00173237_m1/Hs01556190_m1/Hs01556191_m1; BRCA2: Hs00609060_m1; housekeepers: Hs99999908_m1 (GUSB)/Hs00188166_m1 (SDHA)/Hs00237047_m1 (YWHAZ)/Hs00824723_m1 (UBC)/Hs00609297_m1 (HMBS)) were used for pre-amplification and qPCR on a Fluidigm (South San Francisco, Calif.) BioMark instrument. Cycle conditions were 95° C. for 10 minutes, followed by 17 cycles of 95° C. for 15 seconds and 60° C. for 4 minutes. The PCR products were diluted 1:5 with low-EDTA TE. Samples were assessed on gene expression M48 dynamic arrays (Fluidigm) per the manufacturer's protocol.
500 ng to 1 μg of RNA was treated with Amplification Grade Deoxyribonuclease I (Sigma-Aldrich Inc.) in a 10 μL reaction at room temperature for 30 minutes. 1 μL of Stop Solution was then added and the mixture was heated to 70° C. for 10 minutes. 14 μL of RNase-free water was added to give 1 μg of RNA in 25 μL, which was used in a 50 μL reverse transcription reaction using the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Inc.).
Pre-amplification was done using a 0.2× probe mix made by combining 1 μL of each of the 48 individual 20× gene expression assays from Applied Biosystems, Inc. and 52 μL of low-EDTA TE. Pre-amplification was performed using 2.5 μL of TaqMan® PreAmp Master Mix (2×) (Applied Biosystems, Inc.), 1.25 μL of the 0.2× probe mix, and 1.25 μL cDNA.
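The dilution arithmetic behind the two 0.2× probe mixes can be checked with a short calculation. The following Python sketch is illustrative only; the function name and parameters are hypothetical and are not part of the described protocol. It assumes that 1 μL of each 20× assay is pooled and that volumes are additive.

```python
def pooled_assay_concentration(n_assays, vol_per_assay_ul, diluent_ul, stock_x=20.0):
    """Return (final concentration of each assay in 'x' units, total volume in uL).

    Assumes each assay stock is at `stock_x` (20x here) and volumes are additive.
    """
    total_ul = n_assays * vol_per_assay_ul + diluent_ul
    return stock_x * vol_per_assay_ul / total_ul, total_ul


# 91 assays at 1 uL each plus 9 uL low-EDTA TE
conc_91, total_91 = pooled_assay_concentration(91, 1.0, 9.0)
# 48 assays at 1 uL each plus 52 uL low-EDTA TE
conc_48, total_48 = pooled_assay_concentration(48, 1.0, 52.0)

# Pre-amplification reaction: 2.5 uL of 2x master mix + 1.25 uL probe mix + 1.25 uL cDNA
reaction_ul = 2.5 + 1.25 + 1.25
master_mix_final_x = 2.0 * 2.5 / reaction_ul

print(conc_91, total_91, conc_48, total_48, reaction_ul, master_mix_final_x)
```

Both pools come to 100 μL, so each 20× assay is diluted 100-fold to 0.2×, and the 2× master mix ends up at 1× in the 5 μL reaction.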
The range of expression of the genes involved in the calculation of the CCP score was too large to allow accurate quantification under uniform conditions. Two pre-amplifications were therefore run independently at each of two cycle conditions, 8 and 18 cycles. Cycle conditions were 95° C. for 10 minutes, followed by 8 or 18 cycles of 95° C. for 15 seconds and 60° C. for 4 minutes. The products were then diluted 1:5 using low-EDTA TE. Samples were run against the 48 assays (Table 10) on Fluidigm Gene Expression 48.48 Dynamic Arrays per the manufacturer's protocol.
qPCR Analysis
The comparative CT method was used to calculate relative gene expression using the CT for the BRCA2 assay, the average CT from the BRCA1 assays, and the average CT from the housekeeper genes. qPCR was performed on the 220 cancers from which high-quality RNA was obtained.
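By way of illustration only, the comparative CT calculation can be sketched in Python as follows. The CT values are hypothetical, the function names are not taken from any software used in the study, and the conversion 2^(−ΔCT) assumes roughly 100% amplification efficiency, which is a common convention rather than something stated above.

```python
from statistics import mean


def delta_ct(target_cts, housekeeper_cts):
    """Average target CT minus average housekeeper CT."""
    return mean(target_cts) - mean(housekeeper_cts)


def relative_expression(d_ct):
    """Expression relative to the housekeepers, assuming doubling per cycle."""
    return 2.0 ** (-d_ct)


# Hypothetical CT values for one sample
brca1_cts = [18.0, 18.5, 17.5, 18.0]              # four BRCA1 assays, averaged
brca2_cts = [20.0]                                # single BRCA2 assay
housekeeper_cts = [15.0, 16.0, 14.0, 15.5, 14.5]  # five housekeeper assays

brca1_dct = delta_ct(brca1_cts, housekeeper_cts)
brca2_dct = delta_ct(brca2_cts, housekeeper_cts)

print(brca1_dct, relative_expression(brca1_dct))
print(brca2_dct, relative_expression(brca2_dct))
```

A larger ΔCT corresponds to lower expression, which is why expression is reported on the −ΔCT scale elsewhere in this example.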
The MeAH-011E Methyl-Profiler™ DNA Methylation PCR Assay Human Breast Cancer, Signature Panel (24-Genes, 384-Well Plates) was used per the manufacturer's protocol for the 4-sample format. 125 ng of RNase-treated genomic DNA was used per restriction enzyme digestion, for a total of 500 ng. Incubation of digestion reactions was performed at 37° C. for 6 hours.
CCP scores were calculated for each sample in the following manner. CT values less than 8 were considered to be above the limit of detection and were removed from the analysis. Data from the two pre-amplification cycling conditions were normalized by subtracting the average of the CT values of the genes that were not missing any values and whose CT values were between 8 and 23 under both conditions. These centered CT values were averaged for each gene with at least two CT values whose standard deviation was less than or equal to 3. ΔCT was calculated as the difference in centered CT values between the gene of interest and the average of the housekeeper genes. ΔCT was then centered for each gene by the average ΔCT across all the samples that were not missing ΔCT for any gene. The negative of the average of the centered ΔCT across the cell-cycle genes is the CCP score.
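The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the software actually used: the data layout (gene → pre-amplification condition → replicate CT values), the function names, and the toy gene names and CT values are all hypothetical, and a CT value removed for being below 8 is treated as a missing value.

```python
from statistics import mean, stdev

LOW_CT, HIGH_CT, MAX_SD = 8.0, 23.0, 3.0  # thresholds quoted in the text


def centered_cts(sample):
    """sample: {gene: {condition: [CT replicates]}} -> {gene: mean centered CT}."""
    conditions = sorted({c for reps in sample.values() for c in reps})
    # Remove CT values below 8.
    kept = {g: {c: [v for v in reps.get(c, []) if v >= LOW_CT] for c in conditions}
            for g, reps in sample.items()}
    # Centering genes: no value removed or absent, and every CT <= 23, in all conditions.
    centering = [
        g for g in sample
        if all(kept[g][c]
               and len(kept[g][c]) == len(sample[g].get(c, []))
               and max(kept[g][c]) <= HIGH_CT
               for c in conditions)
    ]
    # Per-condition offset puts the 8- and 18-cycle data on the same scale.
    offset = {c: mean(v for g in centering for v in kept[g][c]) for c in conditions}
    result = {}
    for g in kept:
        values = [v - offset[c] for c in conditions for v in kept[g][c]]
        if len(values) >= 2 and stdev(values) <= MAX_SD:
            result[g] = mean(values)
    return result


def delta_cts(sample, ccp_genes, housekeepers):
    """Centered CT of each CCP gene minus the average centered housekeeper CT."""
    cts = centered_cts(sample)
    hk = mean(cts[h] for h in housekeepers if h in cts)
    return {g: cts[g] - hk for g in ccp_genes if g in cts}


def ccp_scores(samples, ccp_genes, housekeepers):
    """samples: {sample_id: sample} -> {sample_id: CCP score}."""
    dcts = {s: delta_cts(d, ccp_genes, housekeepers) for s, d in samples.items()}
    # Center each gene using only samples with a delta-CT for every CCP gene.
    complete = [s for s, d in dcts.items() if all(g in d for g in ccp_genes)]
    gene_mean = {g: mean(dcts[s][g] for s in complete) for g in ccp_genes}
    return {s: -mean(d[g] - gene_mean[g] for g in d) for s, d in dcts.items() if d}


def toy_sample(shift):
    """Hypothetical sample: two housekeepers, two CCP genes, two replicates per condition."""
    return {
        "HK1": {"8": [22.0, 22.0], "18": [12.0, 12.0]},
        "HK2": {"8": [20.0, 20.0], "18": [10.0, 10.0]},
        "CCP1": {"8": [21.0 + shift] * 2, "18": [11.0 + shift] * 2},
        "CCP2": {"8": [22.0 + shift] * 2, "18": [12.0 + shift] * 2},
    }


scores = ccp_scores({"A": toy_sample(-1.0), "B": toy_sample(1.0)},
                    ccp_genes=["CCP1", "CCP2"], housekeepers=["HK1", "HK2"])
print(scores)
```

In this toy data, sample A has lower CT values (higher expression) for the cell-cycle genes than sample B, so it receives the higher CCP score. The sketch raises an error if a sample has no centering genes; handling of such cases is not described in the text and is left out here.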
A patient sample was considered BRCA deficient (79 out of 242 tested) if it had a mutation in BRCA1/2 (41 out of 227 tested), abnormal expression of BRCA1 (47/239), or more than 10% methylation of BRCA1 (9 out of 53 tested).
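The classification rule just stated can be written as a small function. The Python sketch below is illustrative; the function and argument names are hypothetical. Because not every sample was tested for every criterion, the handling of untested criteria (represented as `None`) is an assumption of this sketch and is not specified in the text.

```python
from typing import Optional

METHYLATION_CUTOFF_PCT = 10.0  # "more than 10% methylation of BRCA1"


def is_brca_deficient(has_brca_mutation: Optional[bool],
                      abnormal_brca1_expression: Optional[bool],
                      brca1_methylation_pct: Optional[float]) -> Optional[bool]:
    """Return True if any criterion is met, False if none is met,
    or None if no criterion was tested (assumption of this sketch)."""
    results = []
    if has_brca_mutation is not None:
        results.append(has_brca_mutation)
    if abnormal_brca1_expression is not None:
        results.append(abnormal_brca1_expression)
    if brca1_methylation_pct is not None:
        results.append(brca1_methylation_pct > METHYLATION_CUTOFF_PCT)
    if not results:
        return None
    return any(results)


print(is_brca_deficient(False, False, 12.5))   # methylation criterion met
print(is_brca_deficient(False, None, 10.0))    # exactly 10% is not "more than 10%"
print(is_brca_deficient(None, None, None))     # nothing tested
```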
The association between progression free survival (PFS) and BRCA deficiency was tested using the partial likelihood ratio test from a Cox's proportional hazards model with PFS as the response and BRCA deficiency as the only predictor. The hazard ratio (HR) for deficient patients versus non-deficient patients was 0.66 (p-value=0.014, n=193, 16% censoring), indicating decreased risk of disease progression in deficient patients.
The samples in this study consisted of 216 fresh frozen breast tumors from 4 commercial sources. All but one had ER, PR, and HER2 status. Unless stated otherwise, all assay and statistical details for this study were as described in Example 2 above.
Three ER− patients were PR+. As such, each sample was assigned one of three subtypes based on ER status first and then on HER2 status in the ER− tumors: 113 ER+, 64 triple negative, and 38 ER−/HER2+. One ER− patient was missing HER2 status. As a result, her tumor subtype could not be assigned.
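The subtype assignment described in this paragraph amounts to a two-step rule, sketched below in Python for illustration. The function name and constants are hypothetical. As in the text, PR status is not consulted, and a sample whose needed status is missing (`None`) receives no subtype.

```python
from typing import Optional

ER_POS = "ER+"
TRIPLE_NEG = "triple negative"
ER_NEG_HER2_POS = "ER−/HER2+"


def assign_subtype(er_positive: Optional[bool],
                   her2_positive: Optional[bool]) -> Optional[str]:
    """ER status is checked first; HER2 status is used only for ER-negative tumors."""
    if er_positive is None:
        return None
    if er_positive:
        return ER_POS
    if her2_positive is None:
        return None  # e.g., the one ER-negative patient missing HER2 status
    return ER_NEG_HER2_POS if her2_positive else TRIPLE_NEG


print(assign_subtype(True, None))
print(assign_subtype(False, True))
print(assign_subtype(False, False))
print(assign_subtype(False, None))
```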
BRCA1 expression was measured and calculated for 215 patients' tumors. Three qPCR assays for BRCA1 (Hs00173233_m1 (BRCA1), Hs00173237_m1 (BRCA1(2)), and Hs01556190_m1 (BRCA1(3))) and three housekeeper genes (MMADHC, RPS23, and SDHA) were used to measure BRCA1 expression on these samples. Each sample was preamplified with all the assays 4 times: twice for 12 cycles and twice for 18 cycles. CT was determined for each assay-sample-preamp. For each sample, the genes with CT between 8 and 23 on all preamps were identified as centering genes. Their CT values were averaged for each preamp. This quantity was subtracted from the CT of each measurement to put the CT from different numbers of cycles of preamp on the same scale. All replicates with CT greater than 8 were averaged for each assay. ΔCT was calculated for each BRCA1 assay by subtracting the average of the three housekeeper genes. The pairwise relationships between the normalized expression for the BRCA1 assays are shown in
As the correlation of the three BRCA1 assays was high, BRCA1 expression was calculated as the average −ΔCT of the three assays.
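As an illustration of this step, the Python sketch below computes pairwise Pearson correlations between per-assay ΔCT values and then the average −ΔCT per sample. The ΔCT values are hypothetical, the short assay labels are chosen only for this example, and the use of the Pearson coefficient is an assumption, since the text does not state which correlation measure was used.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def brca1_expression(dct_by_assay):
    """Average of -deltaCT across assays, one value per sample."""
    per_sample = zip(*dct_by_assay.values())
    return [-mean(values) for values in per_sample]


# Hypothetical deltaCT values for four samples on the three BRCA1 assays
dct_by_assay = {
    "BRCA1": [1.0, 2.0, 3.0, 4.0],
    "BRCA1(2)": [1.5, 2.5, 3.5, 4.5],
    "BRCA1(3)": [0.5, 1.5, 2.5, 4.0],
}

correlations = {(a, b): pearson(dct_by_assay[a], dct_by_assay[b])
                for a, b in combinations(dct_by_assay, 2)}
expression = brca1_expression(dct_by_assay)

print(correlations)
print(expression)
```

Averaging is only reasonable when the assays track one another closely, which is why the pairwise correlations are inspected first.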
Cell-cycle gene expression was measured and calculated for 215 patients' samples in the same manner as BRCA1 expression, with a few exceptions. First, the ProAssay04 set of assays, which consists of 31 cell-cycle genes and 15 housekeepers (Table 15 above), was used instead of 3 housekeepers and 3 assays for the gene of interest. Second, 8 and 18 cycles of preamp were used instead of 12 and 18. Lastly, before averaging all the genes, each gene was centered by the average expression of that gene in the samples where all the cell-cycle genes performed well.
The correlation between each of the cell-cycle genes and the CCP score is shown in
Methylation of the BRCA1 promoter region was measured in 199 tumors.
It is specifically contemplated that any embodiment of any method or composition of the invention may be used with respect to any other method or composition of the invention.
In the context of genes and gene products, the name of the gene is generally italicized herein following convention. In such cases, the italicized gene name is generally to be understood to refer to the gene (i.e., genomic), its mRNA (or cDNA) product, and/or its protein product. Generally, though not always, a non-italicized gene name refers to the gene's protein product.
The use of the term “or” in the claims is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and “and/or.”
Throughout this application, the term “about” is used to indicate that a value includes the standard deviation of error for the device or method being employed to determine the value.
Following long-standing patent law, the words “a” and “an,” when used in conjunction with the word “comprising” in the claims or specification, denote one or more, unless specifically noted.
Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating specific embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
All of the compositions and methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the compositions and methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents that are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
Other features and advantages of the invention will be apparent from the preceding detailed description and from the following claims.
This application is a continuation of International Application No. PCT/US11/054,369, filed Sep. 30, 2011, which claims priority benefit of U.S. Provisional Application No. 61/388,692, filed Oct. 1, 2010. The contents of each of these prior applications are hereby incorporated by reference in their entirety.
| | Number | Date | Country |
|---|---|---|---|
| Parent | PCT/US11/54369 | Sep 2011 | US |
| Child | 13852129 | | US |