1. Field of the Invention
The invention relates to the use of protein biomarkers for the differential diagnosis, determination of prognosis, and monitoring of the progression of treatment of chronic lymphocytic leukemia.
2. Background Information
Cancers are the second most prevalent cause of death in the United States, causing 450,000 deaths per year. One in three Americans will develop cancer, and one in five will die of cancer. While substantial progress has been made in identifying some of the likely environmental and hereditary causes of cancer, there is a need for substantial improvement in the diagnosis and therapy for cancer and related diseases and disorders.
The course of chronic lymphocytic leukemia (CLL) is variable. Some patients have aggressive disease and require therapy within a relatively short time after diagnosis, whereas others have indolent, asymptomatic disease, and need no therapy for many years. CLL treatment depends upon both the stage and symptoms of the individual patient.
A large group of CLL patients have low-grade, or indolent, disease, which does not benefit from treatment. Individuals with CLL-related complications or more advanced disease often benefit from treatment. To this end, several molecules have been investigated in the pursuit of a biomarker to use for diagnosis of CLL. However, as yet, there is no definitive biomarker that can universally detect and distinguish the aggressive and indolent forms of CLL (see, e.g., J. Binet et al, Perspectives on the use of new diagnostic tools in the treatment of chronic lymphocytic leukemia, Blood 2006 107: 859-861; and, L. Rassenti et al, ZAP-70 compared with immunoglobulin heavy-chain gene mutation status as a predictor of disease progression in chronic lymphocytic leukemia, N Engl J Med. 2004 Aug. 26; 351(9):856-7). A need, therefore, exists for a method of using biomarkers in connection with the differential diagnosis and treatment of CLL.
The present invention is based on the seminal discovery that a panel of biomarkers is differentially expressed in patients with CLL at distinct stages of the disease. More specifically, the invention relates to a molecular classification procedure based on activity levels of modules in protein networks.
Accordingly, the present invention provides a molecular differential diagnosis (prognosis) tool that assigns patients to “aggressive” (high-risk) or “indolent” (low-risk) groups based on their gene expression correlated to the treatment-free survival from the date of sample collection.
Accordingly, the invention provides a method for predicting the prognostic risk posed by chronic lymphocytic leukemia (CLL) to a patient diagnosed with the disease. The method includes obtaining a sample from the patient; and comparing expression of a first plurality of genes from said sample to expression of a second plurality of genes comprising a biomarker subnetwork, wherein the genes of the subnetwork encode proteins known to exhibit protein-protein interactions, and wherein further said proteins are associated with relatively high or low risk for progression of the disease, whereby similarity between expression of said pluralities of genes indicates the relative level of such risk for the patient. In various aspects, the genes of the subnetwork encode proteins that comprise one or more protein biomarkers listed in Tables 1 through 4. In related aspects, the subnetwork includes one or more subnetworks listed in Table 5. Additionally, the subnetwork may include one or more of the subnetworks of
In another embodiment, the invention provides a method of diagnosing CLL in a subject. The method includes diagnosing a subject as having or being at risk of having chronic lymphocytic leukemia (CLL), by obtaining a sample from the subject; and comparing the expression or activation of one or more biomarkers listed in Tables 1 through 5 or
In another embodiment, the invention provides a method for distinguishing aggressive CLL from indolent CLL in a subject. For example, a method is provided for differentially diagnosing aggressive chronic lymphocytic leukemia (CLL) versus indolent CLL in a subject. The method includes obtaining a sample from a subject; and comparing the level of expression of one or more biomarkers listed in Table 1 and/or Table 2 in a sample from the subject suspected of having aggressive CLL with a control indolent CLL sample, wherein greater expression of one or more of said biomarkers listed in Table 1 in the subject sample versus the control indolent CLL sample is diagnostic of aggressive CLL in the subject, or wherein lesser expression of one or more of said biomarkers listed in Table 2 in the subject sample versus the control indolent CLL sample is diagnostic of indolent CLL in the subject.
According to another embodiment, a method is provided for monitoring a therapeutic regimen or progression of CLL in a subject. The method includes identifying when a pattern of biomarker expression indicative of CLL in the indolent state changes to a pattern indicative of CLL in the aggressive state. Detection of such a shift can provide a basis upon which to alter therapy at the early stages of aggressive CLL, thereby potentially improving the clinical outcome for the patient. Additionally, a method is provided that entails monitoring a therapeutic regimen for treating a subject having or at risk of having CLL, including determining a change in activity or expression of one or more biomarkers listed in any of Tables 1 through 4, subnetwork listed in Table 5 or
In yet another aspect, the invention provides a method for diagnosing CLL in a subject, comprising the steps of a) providing a gene expression profile of a sample from the subject suspected of having CLL, wherein the sample simultaneously expresses a plurality of genes at the protein level that are markers (biomarkers) for CLL; and b) comparing the subject's gene expression profile to a reference gene expression profile obtained from a corresponding control sample of B cells, wherein the reference gene expression profile comprises an expression value of one or more target genes for biomarkers indicative of CLL; and/or c) comparing the gene expression profile to a database of CLL protein biomarker subnetworks, to provide a differential diagnosis between indolent and aggressive CLL and/or provide a prognosis for the patient.
In another embodiment, the invention provides a computer microchip programmed with one or more datasets concerning CLL subnetwork markers that can be used in clinical routines for predicting whether a patient with a particular expression profile is likely to need treatment within a short time. A microarray-like approach can be implemented. Accordingly, the invention provides a diagnostic chip comprising nucleotides with at least 80%, 85%, 90%, 95%, or greater percent homology to the sequences of two or more genes listed in Tables 1 through 4, and the subnetworks listed in Table 5 and those of
In another embodiment, the invention provides a software program with algorithms for comparing one or more datasets concerning CLL subnetwork markers that can be used in clinical routines for predicting whether a patient with a particular expression profile is likely to need treatment within a short time. Accordingly, a computer-readable medium including algorithms for execution of the comparisons included in the methods described herein is provided.
Methods are provided for the diagnosis and monitoring of treatment of CLL based on detection of certain biomarkers in samples from patients who have, or are suspected of having, CLL. Further, expression of one or more such biomarkers can be used to distinguish aggressive CLL from indolent CLL. In addition, certain cellular pathways have been identified as biomarkers of CLL whose activation or inactivation is diagnostic for CLL.
Before the present compositions and methods are described, it is to be understood that this invention is not limited to particular compositions, methods, and experimental conditions described, as such compositions, methods, and conditions may vary. It is also to be understood that the terminology used herein is for purposes of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only in the appended claims.
As used in this specification and the appended claims, the singular forms “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise. Thus, for example, reference to “the method” includes one or more methods, and/or steps of the type described herein, which will become apparent to those persons skilled in the art upon reading this disclosure, and so forth.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the invention, the preferred methods and materials are now described.
In various aspects, the invention provides biomarkers relating to CLL diagnosis, prognosis, treatment, and pathology. A biomarker is a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacological responses to a therapeutic intervention. Biomarkers vary widely in nature, ease of measurement, and correlation with physiological states of interest. As in the present invention, biomarkers may include up- or down-regulation of gene expression or subnetworks of a plurality of genes. For example, increased or decreased expression of a gene encoding a protein during pathogenesis of a disease, such as CLL, implicates the protein as a protein biomarker. Similarly, increased or decreased expression of a plurality of genes, or subnetwork, during pathogenesis of a disease, such as CLL, implicates the subnetwork as a biomarker subnetwork. For example, the subnetwork may include a plurality of genes whose expression products (proteins) are known to interact in cellular processes to define biological pathways and processes.
Biomarker subnetworks and/or protein biomarkers are provided whose up-regulation or down-regulation is indicative of an increase or decrease in the relative “activeness” of the disease. In particular, marker gene expression defines a repertoire of transcriptional activity contributing to or resulting from the dynamic evolution of CLL cells. Network-based gene expression analysis reveals subnetworks of proteins that are coordinately dysregulated during disease progression. With knowledge of such subnetworks, CLL progression can be monitored by analyzing subnetwork activities inferred from gene expression profiles.
As used herein, the terms “patient” and “individual” are used interchangeably and mean a mammalian subject to be treated, with human patients being preferred. In some cases, the methods of the invention find use in experimental animals, in veterinary application, and in the development of animal models for disease, including, but not limited to, rodents including mice, rats, and hamsters, and primates.
Gene expression profiling has been used to define a repertoire of transcriptional activity contributing to or resulting from the dynamic evolution of CLL cells. To evaluate this, samples obtained from CLL patients may be profiled for expression using, for example, microarray technology, such as mRNA expression microarrays. However, one of skill in the art would understand that expression profiling may be performed by any method known in the art, including methods such as serial analysis of gene expression (SAGE) or SuperSAGE technology.
As discussed herein, the methods of the present invention may be used, for example, to evaluate CLL patients and those at risk for CLL. In any of the methods of diagnosis, prognosis, disease progression and therapeutic efficacy described herein, either the presence or the absence of one or more biomarkers of CLL, may be used to generate such clinical measures.
In one embodiment, the invention provides a molecular prognosis tool that assigns patients to “aggressive” (high-risk) or “indolent” (low-risk) groups based on their gene expression correlated to the treatment-free survival from the date of sample collection. To achieve better prediction performance based upon biologically defensible models, the network-based classification scheme developed for predicting metastasis potential of breast cancers (Chuang et al., Mol Syst Biol., 2008) was adapted to CLL analysis. The network-based approach identifies prognostic markers not as individual genes but as subnetworks extracted from molecular interaction databases (
Approximately 30 prognostic subnetworks have been identified which provide new putative cancer markers and an array of “small-scale” models charting the molecular mechanisms correlated with CLL progression, e.g. subnetworks detailing interactions between proteins participating in Wnt signaling, Smoothened signaling, or cell death (
The network-based classification of the invention achieves higher accuracy in predicting duration of treatment-free survival in newly diagnosed patients than commonly-used prognostic factors or conventional gene-expression array analyses (compare
To perform the methods of the invention, the sample of cells examined according to the present method can be obtained from the subject to be treated; e.g., from a blood sample. As used herein, the term “sample” refers to any sample suitable for the methods provided by the present invention. The sample may be any sample that includes cells or cellular components suitable for detection. In one aspect, the sample is a blood sample, including, for example, whole blood or any fraction or component thereof. A blood sample suitable for use with the present invention may be extracted from any source known that includes blood cells or components thereof, such as venous, arterial, peripheral, tissue, cord, and the like. For example, a sample may be obtained and processed using well known and routine clinical methods (e.g., procedures for drawing and processing whole blood). In one aspect, an exemplary sample may be peripheral blood drawn from a subject with cancer.
Once disease is established and a treatment protocol is initiated, the methods of the invention may be repeated on a regular basis to monitor the expression level of genes associated with CLL in the subject and correlate them to the subnetwork data. The gene expression level data allows one to distinguish between aggressive and indolent CLL, while comparison of the expression data to subnetwork matrices for CLL aids in classification of each patient into high or low risk categories (
A panel of protein biomarkers that can be used collectively or individually to diagnose CLL is described in Tables 1 through 4 herein below. Analysis of the panel has also revealed that up- or down-regulation of certain cellular pathways can be used as an indicator of CLL, as described in Table 5 below. Likewise
Most of the individual members of the panel, and the identified cellular pathways, have not previously been recognized as CLL biomarkers or drug targets. The panel of biomarkers can be quantitated using mass spectrometry, antibodies, or other assays well known in the art that identify and measure relative or absolute quantities of gene expression or the resulting proteins, such as expression profiling using microarray or SAGE technology.
Furthermore, the biomarkers can be used to discover drugs by developing assays that report agonist or antagonist interactions of drug candidates with the panel of biomarkers, both individually and collectively, and that report both direct and indirect effects on the biomarkers.
In all of the embodiments of the invention described herein, analysis, such as comparison of the data generated relating to gene expression, is generally performed using software algorithms generally known in the art. Thus, generation, comparison and analysis of data are performed using computer-readable media including such software. The results are typically output to a user via a visual display.
The invention further provides a microarray device, such as a microarray including nucleotide sequences of the biomarkers disclosed herein. The microarrays are provided for obtaining clinically relevant data regarding CLL diagnosis and the like as described.
Microarrays or arrays of the present invention may include any one, two or three dimensional arrangement of addressable regions bearing a particular chemical moiety or moieties (for example, polynucleotide sequences) associated with that region. Preferably the chemical moieties include oligonucleotides (i.e., probes). An array is addressable in that it has multiple regions of different moieties (i.e., different oligonucleotide sequences) such that a region (i.e., a feature or spot of the array) is at a particular predetermined location (i.e., an address) on the array. An array layout refers to one or more characteristics of the array, such as feature positioning, feature size, and some indication of a moiety at a given location. An array includes a support substrate that may be of any suitable type known in the art, such as glass, to which one or more chemical moieties are linked or bound using methods well known in the art.
As used herein, the terms “polynucleotide” and “oligonucleotide” refer to nucleic acid molecules. A polynucleotide or oligonucleotide includes single or multiple stranded configurations, where one or more of the strands may or may not be completely aligned with another. The terms “polynucleotide” and “oligonucleotide” are intended to be generic to polydeoxyribonucleotides (containing 2-deoxy-D-ribose), polyribonucleotides (containing D-ribose), or any other type of polynucleotide which is an N-glycoside of a purine or pyrimidine base. Additionally, the terms are intended to include polymers in which the conventional backbone has been replaced with a non-naturally occurring or synthetic backbone or in which one or more of the conventional bases has been replaced with a non-naturally occurring or synthetic base. As such a polynucleotide or oligonucleotide may include naturally occurring nucleotides and phosphodiester bonds that are chemically synthesized.
While the terms “oligonucleotide” and “polynucleotide” are intended to be synonymous, an “oligonucleotide” may generally refer to a nucleotide multimer of about 2 to 100 nucleotides in length, such as a probe, while a “polynucleotide” includes a nucleotide multimer having any number of nucleotides, such as the entire genome of an organism or a portion thereof.
The probes for use with the microarray may be oligodeoxyribonucleotides or oligoribonucleotides, or any modified forms of these polymers that are capable of hybridizing with a target nucleic acid sequence by complementary base-pairing. Complementary base pairing means sequence-specific base pairing which includes, for example, Watson-Crick base pairing as well as other forms of base pairing such as Hoogsteen base pairing. Modified forms include 2′-O-methyl oligoribonucleotides and so-called PNAs, in which oligodeoxyribonucleotides are linked via peptide bonds rather than phosphodiester bonds. The probes can be attached by any linkage to a support substrate (i.e., 3′, 5′ or via the base).
A variety of methods are well known in the art for manufacturing microarrays, including methods for binding or affixing probes in a variety of configurations to a solid support, such as glass, plastic or silicon wafer. Such methods include fabrication using a variety of technologies, such as printing with fine-pointed pins onto glass slides, photolithography using pre-made masks, photolithography using dynamic micromirror devices, ink-jet printing, or electrochemistry on microelectrode arrays.
In various aspects, the number of probes affixed to the array support can be quite large. For example, the array may include up to about 6 million probes. Further, the probe sequences may be about 80, 85, 90, 95% or more, homologous to one or more nucleotide sequences of the biomarkers identified herein. For example, the probe sequences may be about 95% homologous or greater to the biomarkers identified in
The following examples are provided to further illustrate the embodiments of the present invention, but are not intended to limit the scope of the invention. While they are typical of those that might be used, other procedures, methodologies, or techniques known to those skilled in the art may alternatively be used.
Purified B cell samples from aggressive and indolent CLL patients were lysed and the proteins were digested with trypsin. O16/O18 labeling was used for relative quantitation by spectral count, while iTRAQ labeling was used to obtain more accurate quantitation. An automated 2D nanoflow LC system was coupled to an LTQ mass spectrometer to identify and quantify the peptides. Mass spectra were searched against the IPI™ database using Agilent Spectrum Mill™ software. Search results for individual spectra were automatically validated using filtering criteria from an in-situ False Discovery Rate (FDR) calculation. IDs for identified proteins were converted to gene symbols and Unigene IDs using the IPI gene cross reference table. Protein function analysis was done using the NCI DAVID™ website.
Five pairs of aggressive and indolent CLL samples were analyzed using O16/O18 labeling. A total of 15,442 IPI protein sequences were identified at a protein FDR of 3.5%, which corresponds to 6,348 unigenes. The relative protein spectral count ratio was used for quantitation. The spectral count ratio was digitized into either up (+1), down (−1), or undetected (0) categories. Sixteen replicate runs were performed to obtain good quantitative statistics. A protein relative abundance heat map was generated using this method (
Proteins that consistently showed up or down trends were selected as potential biomarkers (
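The digitization and selection steps described above can be illustrated with a minimal sketch. It assumes that each protein has one spectral count ratio (aggressive over indolent) per replicate run, with `None` standing for an undetected protein. The handling of a ratio of exactly 1, the 75% agreement cutoff, and the protein names are hypothetical choices made for illustration only and are not taken from the described experiments.

```python
from typing import Dict, List, Optional


def digitize_ratio(ratio: Optional[float]) -> int:
    """Map a spectral count ratio to up (+1), down (-1), or undetected (0)."""
    if ratio is None:      # protein not detected in this run
        return 0
    if ratio > 1.0:        # more spectra in the aggressive sample
        return 1
    if ratio < 1.0:        # fewer spectra in the aggressive sample
        return -1
    return 0               # assumption: an unchanged ratio is treated like "undetected"


def consistent_trend(calls: List[int], min_fraction: float = 0.75) -> int:
    """Return +1 or -1 when the calls agree across runs, otherwise 0.

    Assumed criterion: at least `min_fraction` of all runs show the
    trend and no run shows the opposite trend.
    """
    n = len(calls)
    if n == 0:
        return 0
    ups = calls.count(1)
    downs = calls.count(-1)
    if downs == 0 and ups / n >= min_fraction:
        return 1
    if ups == 0 and downs / n >= min_fraction:
        return -1
    return 0


# Hypothetical ratios for three proteins across four replicate runs.
ratios: Dict[str, List[Optional[float]]] = {
    "protein_A": [2.0, 3.1, 1.8, None],
    "protein_B": [0.5, 0.4, 0.6, 0.7],
    "protein_C": [2.0, 0.5, None, 1.2],
}

trends = {
    name: consistent_trend([digitize_ratio(r) for r in values])
    for name, values in ratios.items()
}
candidates = {name: t for name, t in trends.items() if t != 0}
print(candidates)
```

In this sketch only proteins whose calls never contradict each other survive as candidates; a real analysis would choose the agreement criterion to suit the number of replicate runs.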
Table 1 lists the panel of protein biomarkers that are up-regulated in aggressive CLL compared to indolent CLL B-cells.
Table 2 lists the panel of protein biomarkers that are down-regulated in aggressive CLL compared to indolent CLL B-cells.
musculus MRPS18b mitochondrial ribosomal protein S18b
Table 3 lists the panel of protein biomarkers that are up-regulated in CLL compared to normal B-cells.
Table 4 lists the panel of protein biomarkers that are down-regulated in CLL compared to normal B-cells.
Biomarker subnetworks relating to measures of CLL prognosis, diagnosis and/or pathology were also identified by determining up- and down-regulation of genes within known subnetworks of cellular pathways. Cellular pathways of molecular interaction and reaction networks are provided in various curated databases, such as Biocarta™ and KEGG™. Table 5 lists known Biocarta™ (biocarta.com/genes/index.asp) and KEGG™ (genome.jp/kegg/pathway.html) subnetworks of cellular pathways that are up- or down-regulated in CLL B cells compared to normal B cells and thus determined to be biomarker subnetworks.
Helicobacter pylori infection
The clinical course of patients with chronic lymphocytic leukemia (CLL) is heterogeneous. For unknown reasons, in some patients the disease becomes fatal within a few years, while others may remain symptom-free for more than a decade. Several prognostic factors have been identified that can stratify patients into groups that differ in their relative tendency for disease progression and/or survival. Microarray studies have highlighted differences in mRNA levels found between such CLL subgroups.
Gene expression profiling was evaluated as a means to define a repertoire of transcriptional activity contributing to or resulting from the dynamic evolution of CLL cells. Samples from 131 CLL patients (>90% CD19+CD5+ peripheral blood mononuclear cells in each sample) were profiled on mRNA expression microarrays using Affymetrix HG-U133 plus 2 GeneChips™. Patterns of gene activity were correlated with the time interval to treatment of each CLL patient from the date of sample collection (treatment-free survival). An expression-based prognosis tool that assigns patients to “aggressive” (high-risk) or “indolent” (low-risk) groups, based upon biologically defensible models incorporating knowledge of molecular pathways, was next developed.
To identify progression-associated pathway markers, a network-based marker identification approach was adopted. The network-based approach identified prognostic markers not as individual genes but as subnetworks extracted from molecular interaction databases (
Specifically, each subnetwork was scored by a vector of activities across all patients, where the activity for a given patient is a function of the expression levels of its member genes. A subnetwork's prognostic power was scored by the Cox metric, which measures the correlation between the activity vector and the patients' treatment-free survival. The resulting 30 prognostic biomarker subnetworks identify new putative cancer markers and provide an array of “small-scale” models charting the molecular mechanisms correlated with CLL disease progression, e.g. subnetworks detailing interactions between proteins participating in cell cycle and death, Myc regulation, proteasome or Wnt signaling (
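As an illustration of how a subnetwork activity vector might be computed, the following sketch standardizes each member gene across patients and combines the standardized values per patient. The text states only that the activity is a function of the member genes' expression levels; the particular aggregation used here (sum of z-scores divided by the square root of the number of genes) is an assumption, as are the gene names and expression values. The Cox scoring against treatment-free survival is not shown.

```python
import math
import statistics
from typing import Dict, List


def zscore_genes(expression: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Standardize each gene across patients (mean 0, standard deviation 1)."""
    standardized = {}
    for gene, values in expression.items():
        mean = statistics.fmean(values)
        sd = statistics.pstdev(values)
        # A gene with no variation carries no signal; map it to zeros.
        standardized[gene] = [(v - mean) / sd if sd > 0 else 0.0 for v in values]
    return standardized


def subnetwork_activity(expression: Dict[str, List[float]],
                        members: List[str]) -> List[float]:
    """Return one activity value per patient for the given member genes.

    Assumed aggregation: sum of member-gene z-scores / sqrt(number of genes).
    """
    if not members:
        raise ValueError("a subnetwork needs at least one member gene")
    z = zscore_genes({gene: expression[gene] for gene in members})
    n_patients = len(z[members[0]])
    scale = math.sqrt(len(members))
    return [sum(z[gene][j] for gene in members) / scale for j in range(n_patients)]


# Hypothetical expression values: three genes measured in three patients.
expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [5.0, 5.0, 5.0],
}
activity = subnetwork_activity(expression, ["GENE_A", "GENE_B"])
print(activity)
```

The resulting list is the activity vector for one subnetwork; scoring its prognostic power would then require correlating it with each patient's treatment-free survival.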
To examine the utility of the 30 subnetwork markers in CLL risk assessment, a 5-fold cross validation was performed where 80% of the 131 patients were designated as a training set and the rest as a test set. The patients in a training set were first separated into two groups based on the similarity between their activity patterns of the 30 prognostic subnetworks; one set was assigned as “aggressive” and the other as “indolent” according to the median survival time of each patient set. Patients in a test set were segregated into either the “aggressive” or “indolent” group based on their activity similarity to that of the training samples. The survival difference between the two predicted groups of patients in a test set was then used as a metric to evaluate the prognostic power of the given markers.
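The cross-validation procedure can be sketched as follows. The sketch assumes a random 5-fold partition of the 131 patients and assigns each test patient to the training group whose mean activity profile is closest in squared Euclidean distance. The similarity measure, the random partition, and the example profiles are assumptions made for illustration; the text does not specify them, and the step that labels the two training groups by median survival is taken as given.

```python
import random
from typing import Dict, Iterator, List, Tuple


def five_fold_splits(n_patients: int, n_folds: int = 5,
                     seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; each test fold holds about 1/n_folds of patients."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, test


def centroid(profiles: List[List[float]]) -> List[float]:
    """Mean activity profile of a group of training patients."""
    return [sum(column) / len(profiles) for column in zip(*profiles)]


def assign_group(profile: List[float], centroids: Dict[str, List[float]]) -> str:
    """Return the label of the closest group centroid (squared Euclidean distance)."""
    def distance(a: List[float], b: List[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: distance(profile, centroids[label]))


splits = list(five_fold_splits(131))

# Hypothetical two-subnetwork activity profiles for already-labeled training groups.
centroids = {
    "aggressive": centroid([[1.0, 2.0], [3.0, 4.0]]),
    "indolent": centroid([[-1.0, -2.0], [-3.0, -2.0]]),
}
print(assign_group([1.5, 2.5], centroids))
```

Each patient appears in exactly one test fold, so combining the five test predictions gives one predicted risk group per patient, which is the basis for comparing survival between the two predicted groups.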
Combining the results from the 5 test trials, the two risk groups defined by the subnetwork markers displayed significantly different behaviors with respect to treatment-free survival (
Although the invention has been described with reference to the above example, which is incorporated herein by reference, it will be understood that modifications and variations are encompassed within the spirit and scope of the invention. Accordingly, the invention is limited only by the following claims.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US09/53948 | 8/14/2009 | WO | 00 | 3/22/2011 |
Number | Date | Country | |
---|---|---|---|
61089462 | Aug 2008 | US | |
61120347 | Dec 2008 | US |