The present invention relates generally to the detection and/or prognosis and/or diagnosis and/or treatment of sub-types of acute lymphoblastic leukemia and/or chronic myeloid leukemia.
Leukemias are classified into four major groups or types: acute myeloid leukemia (AML), acute lymphoblastic leukemia (ALL), chronic myeloid leukemia (CML) and chronic lymphocytic leukemia (CLL). Within these groups, several subcategories can be further identified using a panel of standard diagnostic techniques. These different subcategories of leukemia are associated with varying clinical outcomes and therefore are the basis for different treatment strategies.
The development of new specific drugs and treatment approaches requires the identification of specific subtypes that may benefit from a distinct therapeutic protocol, thereby improving the outcome of distinct subsets of leukemia. Because patients suffering from these specific leukemia subtypes must be identified as quickly as possible so that the best therapy can be applied, diagnostics today must accomplish sub-classification with maximal precision. Thus, methods and compositions that provide additional leukemia diagnostic and prognostic markers are needed in the art.
Compositions and methods for the identification, prognosis, classification, diagnosis and/or treatment of leukemia or a genetic predisposition to leukemia are provided. In one embodiment, the present invention is based on the discovery of multiple genomic abnormalities of the IKZF1 (Ikaros) gene which are shown herein to be associated with acute lymphoblastic leukemia (ALL), more particularly, with BCR-ABL1 positive ALL, and to be associated with chronic myeloid leukemia (CML), more particularly, a subtype of CML termed blast crisis chronic myeloid leukemia (BC-CML). In another embodiment, the present invention demonstrates that the genomic abnormalities of the IKZF1 gene can be used as prognostic markers to identify a subgroup of BCR-ABL1 negative ALL having very poor outcomes. The present invention therefore provides compositions comprising polynucleotides, including both genomic sequences of the various IKZF1 genomic abnormalities disclosed herein and any transcripts encoded thereby. Such polynucleotides comprising the genomic abnormalities of the IKZF1 gene find use, for example, as biomarkers in methods for detecting genomic abnormalities which are associated with ALL, more specifically, which are associated with BCR-ABL1 positive ALL, and/or for detecting genomic abnormalities which are associated with CML, more particularly, with BC-CML or the likelihood of progression into blastic transformation of CML. In another embodiment, the biomarkers can be used as prognostic markers to identify a subgroup of ALL having very poor outcomes. Accordingly, the present invention encompasses methods and compositions useful in the identification and/or the prognosis and/or predisposition and/or treatment of a subject with ALL and/or a subject with CML, more particularly, with BC-CML or the likelihood of progression into blastic transformation of CML.
The compositions of the invention can further be employed in methods for selecting a therapy for a subject affected by leukemia, including, for example, selecting an appropriate therapy for ALL and/or selecting a therapy for CML, more particularly, a therapy for a patient with BC-CML or for a patient with CML having a likelihood of progression into blastic transformation of CML. Further provided are methods for identifying agents that target a polypeptide expressed from the IKZF1 genomic abnormality. Thus, methods to screen for compounds that can serve as molecular targets for drugs useful in modulating the activity of the polypeptides expressed from the IKZF1 genomic abnormalities are provided. Such compounds can find use in treating ALL and/or treating a subject with CML, more particularly, treating a subject with BC-CML or a patient having CML with the likelihood of progression into blastic transformation of CML. Accordingly, the present invention encompasses methods and compositions useful in the identification and/or the prognosis and/or predisposition and/or treatment of ALL and/or CML, more specifically, BC-CML.
The present inventions now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all embodiments of the inventions are shown. Indeed, these inventions may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.
Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
In one embodiment, the present invention has identified various genomic abnormalities in the IKZF1 gene that are correlated with ALL, more particularly, with BCR-ABL1 positive ALL, and that are correlated with CML, more particularly, BC-CML or the likelihood of progression into blastic transformation of CML. In addition, the genomic abnormalities in the IKZF1 gene can further be used as prognostic markers of ALL, more particularly, prognostic markers for subtypes of ALL having very poor outcomes, including the B-progenitor ALL subtypes, both BCR-ABL1(+) and BCR-ABL1(−). Various methods and compositions that allow for the direct detection of such genomic abnormalities in IKZF1 are provided. Compositions of the invention include IKZF1 polynucleotides and variants and fragments thereof that can be used to detect the chromosomal abnormalities in the IKZF1 gene that are associated with ALL, more particularly, with BCR-ABL1 positive ALL, that are associated with CML, more particularly, BC-CML, and that are associated with the prognosis of a subtype of ALL having very poor outcomes, including B-progenitor ALL. “Acute lymphoblastic leukemia” or “ALL” comprises a heterogeneous group of leukemic disorders characterized by recurring chromosomal abnormalities including translocations, trisomies and deletions. As used herein, “BCR-ABL1 positive ALL” comprises an ALL subtype that is characterized by the presence of the Philadelphia chromosome arising from the t(9;22)(q34;q11.2) translocation, which encodes the constitutively activated BCR-ABL1 tyrosine kinase. See, for example, Ribeiro et al. (1987) Blood 70:948 and Gleissner et al. (2002) Blood 99:1536, both of which are herein incorporated by reference. Chronic myeloid leukemia is a myeloproliferative disorder characterized by the presence of the BCR-ABL1 transcript in most cases.
CML typically presents as an indolent chronic phase, and subsequently progresses through a more aggressive accelerated phase, eventually terminating in an overt blastic phase (blast crisis), which may be of lymphoid or myeloid lineage.
As used herein, the “IKZF1” gene or the “Ikaros” gene refers to a genomic polynucleotide that encodes an IKZF1 polypeptide, where the encoded polypeptide is a member of a family of zinc finger nuclear proteins that is required for normal lymphoid development. The IKZF1 polypeptide has a central DNA-binding domain consisting of four zinc fingers, and a homo- and heterodimerization domain consisting of the two C-terminal zinc fingers (
The term “gene” refers to a nucleic acid (e.g., DNA) sequence that comprises coding sequences necessary for the production of a polypeptide, precursor, or RNA (e.g., rRNA, tRNA). The term also encompasses the coding region of a structural gene and the sequences located adjacent to the coding region on both the 5′ and 3′ ends which allow for the expression of the sequence. Sequences located 5′ of the coding region and present on the mRNA are referred to as 5′ non-translated sequences. Sequences located 3′ or downstream of the coding region and present on the mRNA are referred to as 3′ non-translated sequences. A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments of a gene that are transcribed into heterogeneous nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide.
As used herein, a “genomic abnormality” refers to any alteration in the genomic sequence. Such alterations include a point mutation, a deletion, a substitution, or an amplification of the gene, including a complete or partial deletion or amplification of any one or any combination of the promoter, the 5′ regulatory region of the IKZF1 gene, the coding region of the IKZF1 gene, and/or the 3′ regulatory region of the IKZF1 gene. Substitutions and/or deletions and/or additions can be 1, 2, 3, 5, 10, 30, 60, 100, 200, 300, 400, or 500 nucleotides in length, or longer. Such alterations can further include an insertion into the genomic sequence in any one or any combination of the various regions outlined above. In specific embodiments, the genomic abnormality comprises a deletion of the entire IKZF1 gene. In other embodiments, the genomic abnormality comprises an intragenic deletion. In other embodiments, the genomic abnormality comprises sequence mutations (nucleotide substitutions) of the gene.
As used herein, a “genomic abnormality” of IKZF1 is characterized phenotypically by the association of the genomic abnormality with ALL and/or CML, more particularly, with BCR-ABL1 positive ALL and/or with BC-CML or the likelihood of progression into blastic transformation of CML. In still other embodiments, the genomic abnormality of the IKZF1 gene is characterized phenotypically by the association of the genomic abnormality with a subgroup of ALL having very poor outcomes, including BCR-ABL1 positive and BCR-ABL1 negative B-progenitor ALL subtypes.
The term “intragenic deletion” refers to any internal deletion in the genomic DNA of a gene. Thus, the term “intragenic deletion of IKZF1” refers to any internal deletion in the genomic DNA comprising the IKZF1 gene. As used herein, an intragenic deletion of an IKZF1 allele is characterized phenotypically by the association of the intragenic deletion with ALL and/or CML, more particularly, with BCR-ABL1 positive ALL and/or BC-CML or the likelihood of progression into BC-CML. At the genetic level, the intragenic deletion is part of the genetic make-up of the cell (contained within the genomic DNA). In specific embodiments, the intragenic deletion of IKZF1 comprises an internal deletion of various exons including, for example, a deletion of at least one of exon 0, exon 1, exon 2, exon 3, exon 4, exon 5, exon 6, and/or exon 7 of the IKZF1 gene or any combination thereof. It is recognized that as used herein, a deletion of an exon or intron can encompass both the complete absence of the recited exon or intron sequence, or the absence of at least a fragment of the full exon or full intron. In other words, the chromosomal break can occur anywhere within the recited exon or in the flanking intron. The exons of the human IKZF1 gene are designated in the genomic sequence of the human IKZF1 gene in SEQ ID NO: 1.
In specific embodiments, the genomic abnormality of the IKZF1 gene comprises a deletion of exon 3 through exon 6. In further embodiments, the genomic abnormality resulting in the deletion of exon 3 through exon 6 results from a proximal chromosomal break point occurring within intron 2 and a distal chromosomal break point occurring within intron 6. See, for example, Table 9. The specific genomic abnormality depicted in Table 9 is referred to herein as IKZF1Δexon3-6 or IK6.
Additional, non-limiting examples of genomic abnormalities of the IKZF1 gene are shown throughout the experimental section. For instance, the genomic abnormality of the IKZF1 gene can comprise a deletion of exon 2 through exon 6 (referred to herein as IKZF1Δexon2-6 or Ik9). In such rearrangements, the genomic abnormality could result from a proximal chromosomal break point occurring in intron 1 or in exon 2 and a distal chromosomal break point occurring in intron 6 or exon 6. In still other examples, the genomic abnormality of the IKZF1 gene can comprise a deletion of exon 1 through exon 6 (referred to herein as IKZF1Δexon1-6 or Ik6). In such rearrangements, the genomic abnormality could result from a proximal chromosomal break point occurring upstream of exon 1 or in exon 1 and a distal chromosomal break point occurring in intron 6 or exon 6.
The term “intragenic substitution” refers to any internal substitution in the genomic DNA of a gene. Thus, the term “intragenic substitution of IKZF1” refers to any internal substitution or point mutation in the genomic DNA comprising the IKZF1 gene. As used herein, an intragenic substitution of an IKZF1 allele is characterized phenotypically by the association of the intragenic substitution with ALL and/or CML, more particularly, with BCR-ABL1 positive ALL; and/or with BC-CML or progression into blastic transformation of CML; and/or with a subgroup of ALL with very poor outcomes.
The term “intragenic addition” refers to any internal addition in the genomic DNA of a gene. Thus, the term “intragenic addition of IKZF1” refers to any internal addition in the genomic DNA comprising the IKZF1 gene. As used herein, an intragenic addition of an IKZF1 allele is characterized phenotypically by the association of the intragenic addition with ALL and/or CML, more particularly, with BCR-ABL1 positive ALL; and/or with BC-CML or progression into blastic transformation CML; and/or with a subgroup of ALL with very poor outcomes.
Further provided is a series of genetic abnormalities that are shown to be associated with CML in blast crisis and that, in specific embodiments, comprise point mutations in IKZF1.
In specific embodiments, the genomic abnormality in the IKZF1 gene results in the expression of a dominant negative isoform of the IKZF1 polypeptide. In specific embodiments, the dominant negative isoform of the IKZF1 polypeptide lacks the ability to bind DNA. In other embodiments, the genomic abnormality in the IKZF1 gene results in the complete loss of expression of the IKZF1 polypeptide. In still further embodiments, the genomic abnormality of the IKZF1 gene results from a recombinase activating gene (RAG) mediated recombination event. Representative methods to assay for such activities are disclosed herein in the experimental section.
The term “junction of a genomic abnormality” refers to the region of the polynucleotide which is joined following the occurrence of the genomic abnormality. In view of the characterization of the various chromosomal abnormalities of IKZF1 disclosed herein, novel polynucleotides are provided that comprise the novel polynucleotide junctions of IKZF1 that occur following the various genomic abnormalities.
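By way of illustration only, the notion of a junction produced by an intragenic deletion can be sketched in Python. The reference string, the breakpoint coordinates and the helper name `delete_region` below are hypothetical toy values chosen for this sketch; they are not drawn from SEQ ID NO: 1 or from any breakpoint disclosed herein.

```python
def delete_region(reference: str, start: int, end: int, flank: int = 10):
    """Remove reference[start:end] and return (rearranged sequence, junction).

    The junction is the last `flank` bases before the proximal break point
    joined to the first `flank` bases after the distal break point.
    """
    if not (0 <= start < end <= len(reference)):
        raise ValueError("break points must satisfy 0 <= start < end <= len(reference)")
    rearranged = reference[:start] + reference[end:]
    left = reference[max(0, start - flank):start]
    right = reference[end:end + flank]
    return rearranged, left + right


# Hypothetical 32-base toy reference; bases 8..23 are deleted.
reference = "AAAACCCCGGGGTTTTACGTACGTTTGGCCAA"
rearranged, junction = delete_region(reference, start=8, end=24, flank=4)

print(rearranged)              # sequence after the deletion
print(junction)                # sequence spanning the newly joined ends
print(junction in reference)   # the junction is absent from the intact reference
```

Because the junction sequence exists only in the rearranged polynucleotide, a probe or primer pair directed to it can, in principle, distinguish the rearranged allele from the intact one.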
In specific embodiments, the polynucleotides comprising the IKZF1 genomic abnormalities, or active variants and fragments thereof, do not encode an IKZF1 polypeptide, but rather have the ability to specifically detect the IKZF1 genomic abnormality in the genomic DNA of a biological sample, and thereby allow for the identification/classification and/or the prognosis and/or predisposition of the biological sample to ALL, more particularly, BCR-ABL1 positive ALL and/or to CML, more particularly, to BC-CML or the likelihood of progression into blastic transformation of CML. In other embodiments, the polynucleotides comprising IKZF1 genomic abnormalities or active fragments or variants thereof allow for the detection of prognostic markers of a subtype of ALL having very poor outcomes. Various methods and compositions to carry out such methods are disclosed elsewhere herein.
In specific embodiments, detecting the IKZF1 genomic abnormalities finds use in selecting a therapy for a subject affected by leukemia. Thus, upon the detection of the IKZF1 genomic abnormality, and in specific embodiments, the identification of the specific IKZF1 genomic abnormality, a therapy may be selected or customized for the subject in view of the IKZF1 genomic abnormalities.
In one embodiment, a method for making a prognosis of an acute lymphoblastic leukemia having a poor outcome in a patient is provided. Thus, the genomic abnormalities of the IKZF1 gene can be used as prognostic markers that allow for the prediction of the probable course and outcome of ALL and/or the likelihood of recovery from the disease. As demonstrated herein, the genomic abnormalities of IKZF1 identify a subgroup of ALL with very poor outcomes. Thus, the identification of genomic abnormalities can be used to improve the ability to accurately stratify patients for appropriate therapy. Such a prognosis can be used to improve outcome prediction, predict risk of relapse, predict risk of treatment failure, and/or design treatment regimens. Such methods comprise assaying the nucleic acid complement of a biological sample for a genomic abnormality in the IKZF1 gene and detecting the genomic abnormality of the IKZF1 gene in the nucleic acid complement of the biological sample, where the presence of the genomic abnormality of the IKZF1 gene is indicative of a subgroup of ALL with poor outcomes. A prognosis of the patient's ALL based on the genomic abnormalities of the IKZF1 gene is then provided.
As used herein, the “nucleic acid complement” of a sample comprises any polynucleotide contained in the sample. The nucleic acid complement that is employed in the methods and compositions of the invention can include all of the polynucleotides contained in the sample or any fraction thereof. For example, the nucleic acid complement could comprise the genomic DNA and/or the mRNA and/or cDNAs of the given biological sample. Thus, the genomic abnormalities in the IKZF1 gene can be detected in the genomic DNA or through the transcribed products thereof.
Methods are further provided that allow for determining the progression of chronic myeloid leukemia in a patient. In one embodiment, a method for classifying a cell sample as BC-CML or having a likelihood of progression into blastic transformation of CML is provided. Such methods can comprise determining if the biological sample comprises a genomic abnormality of the IKZF1 gene. The presence of the genomic abnormality of the IKZF1 gene is indicative of progression into blastic transformation of CML. Thus, the methods and compositions of the invention allow one to distinguish patients having a likelihood of progression into blastic transformation of CML and/or to determine the general course of treatment for these patients.
Various methods and compositions for identifying a genomic abnormality in the IKZF1 gene are provided. Such methods find use in identifying and/or detecting such rearrangements in any biological material and thus allow for the identification, prognosis, classification, treatment, and/or diagnosis of leukemia or a genetic predisposition to ALL, more particularly, BCR-ABL1 positive ALL and/or to CML, more particularly, with BC-CML or the likelihood of progression into blastic transformation of CML. Such methods further find use to detect a subset of BCR-ABL1 positive and BCR-ABL1 negative B-progenitor ALL subtypes having very poor outcomes.
In one embodiment, a method is provided for assaying a biological sample for a genomic abnormality of the IKZF1 gene. The method comprises (a) providing a biological sample from a subject, wherein the biological sample comprises genomic DNA of the subject and (b) determining if the genomic DNA comprises a genomic abnormality in the IKZF1 gene. In one embodiment, the presence of the genomic abnormality of the IKZF1 gene is indicative of ALL, more particularly, BCR-ABL1 positive ALL. In another embodiment, the presence of the genomic abnormality of the IKZF1 gene is indicative of CML, more particularly, BC-CML or the likelihood of progression into blastic transformation of CML. In still another embodiment, the presence of the genomic abnormality of the IKZF1 gene is used as a prognostic marker to identify a subgroup of ALL with very poor outcomes, including the BCR-ABL1 positive and BCR-ABL1 negative B-progenitor ALL subtypes.
Such methods can be used to identify various IKZF1 genomic abnormalities including for example, a deletion of the entire IKZF1 gene, an intragenic deletion of the IKZF1 gene, or a deletion of at least one exon of the IKZF1 gene. In specific methods, the IKZF1 genomic abnormality that is detected comprises a deletion of exon 3 through exon 6 of the IKZF1 gene; a deletion of exon 2 through exon 6 of the IKZF1 gene; or a deletion of exon 1 through exon 6 of the IKZF1 gene. Alternatively, such methods can be employed to detect any of the additional IKZF1 genomic abnormalities disclosed herein.
It is further recognized that the diagnostic method used to detect the genomic abnormalities may be one which allows for the detection of the rearrangement without discriminating between the various IKZF1 genomic abnormalities disclosed herein.
Alternatively, the method employed may be such as to allow for a specific IKZF1 rearrangement to be distinguished. In other methods, an initial assay may be performed to confirm the presence of an IKZF1 genomic abnormality but not identify the specific genomic abnormality. If desired, a secondary assay can then be performed to determine the identity of the particular IKZF1 genomic abnormality. The second assay may use a different detection technology than the initial assay.
It is further recognized that the IKZF1 genomic abnormalities may be detected along with other markers in a multiplex or panel format. Markers are selected for their predictive value alone or in combination with the IKZF1 genomic abnormalities. Markers for other leukemias, diseases, infections, and metabolic conditions are also contemplated for inclusion in a multiplex or panel format. For example, when detecting IKZF1 genomic abnormalities to identify a subgroup of ALL with very poor outcomes, a test for the BCR-ABL1 translocation can also be performed. Such a test, however, is not required. Ultimately, the information provided by the methods of the present invention will assist a physician in choosing the best course of treatment for a particular patient.
As used herein, a “biological sample” can comprise any sample in which one desires to determine if the nucleic acid complement of the sample contains an IKZF1 genomic abnormality. For example, a biological sample can comprise a sample from any organism, including a mammal, such as a human, a primate, a rodent, a domestic animal (such as a feline or canine) or an agricultural animal (such as a ruminant, horse, swine or sheep). The biological sample can be derived from any cell, tissue or biological fluid from the organism of interest. The sample may comprise any clinically relevant tissue, such as, but not limited to, a bone marrow sample, a tumor biopsy, a fine needle aspirate, or a sample of bodily fluid, such as blood, plasma, serum, lymph, ascitic fluid, cystic fluid or urine. The sample used in the methods of the invention will vary based on the assay format, nature of the detection method, and the tissues, cells or extracts which are used as the sample. It is recognized that the sample typically requires preliminary processing designed to isolate or enrich the sample for the genomic DNA. A variety of techniques known to those of ordinary skill in the art may be used for this purpose.
As used herein, a “probe” is an isolated polynucleotide to which is attached a conventional detectable label or reporter molecule, e.g., a radioactive isotope, ligand, chemiluminescent agent, enzyme, etc. Such a probe is complementary to a strand of a target polynucleotide, which in specific embodiments of the invention comprises a polynucleotide comprising a junction of the IKZF1 genomic abnormality. Deoxyribonucleic acid probes may include those generated by PCR using IKZF1 specific primers, oligonucleotide probes synthesized in vitro, or DNA obtained from bacterial artificial chromosome or cosmid libraries. Probes include not only deoxyribonucleic or ribonucleic acids but also polyamides and other probe materials that can specifically detect the presence of the target DNA sequence. For nucleic acid probes, examples of detection reagents include, but are not limited to, radiolabeled probes, enzymatic labeled probes (horseradish peroxidase, alkaline phosphatase), affinity labeled probes (biotin, avidin, or streptavidin), and fluorescent labeled probes (6-FAM, VIC, TAMRA, MGB). One skilled in the art will readily recognize that the nucleic acid probes described in the present invention can readily be incorporated into one of the established kit formats which are well known in the art.
As used herein, “primers” are isolated polynucleotides that are annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, and then extended along the target DNA strand by a polymerase, e.g., a DNA polymerase. Primer pairs of the invention are used for amplification of a target polynucleotide, e.g., by the polymerase chain reaction (PCR) or other conventional nucleic-acid amplification methods. “PCR” or “polymerase chain reaction” is a technique used for the amplification of specific DNA segments (see, U.S. Pat. Nos. 4,683,195 and 4,800,159; herein incorporated by reference).
Probes and primers are of sufficient nucleotide length to bind to the target DNA sequence and specifically detect and/or identify a polynucleotide comprising an IKZF1 genomic abnormality or a junction of an IKZF1 genomic abnormality. It is recognized that the hybridization conditions or reaction conditions can be determined by the operator to achieve this result. The length may be any length sufficient to be useful in a detection method of choice. Generally, 8, 11, 14, 16, 18, 20, 22, 24, 26, 28, 30, 40, 50, 75, 100, 200, 300, 400, 500, 600, 700 nucleotides or more, or between about 11-20, 20-30, 30-40, 40-50, 50-100, 100-200, 200-300, 300-400, 400-500, 500-600, 600-700, 700-800, or more nucleotides in length are used. Such probes and primers can hybridize specifically to a target sequence under high stringency hybridization conditions. Probes and primers according to embodiments of the present invention may have complete DNA sequence identity of contiguous nucleotides with the target sequence, although probes that differ from the target DNA sequence and that retain the ability to specifically detect and/or identify a target DNA sequence may be designed by conventional methods. Accordingly, probes and primers can share about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater sequence identity or complementarity to the target polynucleotide (i.e., SEQ ID NO: 1 or to a fragment thereof). Probes can be used as primers, but are generally designed to bind to the target DNA or RNA and are not used in an amplification process.
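The percent identity figures recited above can be illustrated with a short Python sketch. It is a simplified, ungapped comparison on the same strand, offered only as an assumption-laden example: the sequences, the helper names `percent_identity` and `best_window`, and the use of the 80% figure as a cut-off are illustrative choices, and real probe design would also account for complementarity, gaps and hybridization conditions.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window(probe: str, target: str):
    """Slide the probe along the target without gaps.

    Returns (best percent identity, offset of the first window achieving it).
    """
    if len(probe) > len(target):
        raise ValueError("probe must not be longer than target")
    best = (-1.0, -1)
    for offset in range(len(target) - len(probe) + 1):
        pid = percent_identity(probe, target[offset:offset + len(probe)])
        if pid > best[0]:
            best = (pid, offset)
    return best


# Hypothetical toy sequences: a 20-mer probe carrying two mismatches
# relative to bases 5..24 of a 30-base target.
target = "TTGACCATGGCTAGCTAGGATCCAATTGCA"
probe = "CATAGCTAGCTACGATCCAA"

identity, offset = best_window(probe, target)
meets_threshold = identity >= 80.0  # lowest identity figure recited in the text
print(identity, offset, meets_threshold)
```

In this toy case the probe matches 18 of 20 positions, so it would fall within the recited identity range while not being identical to the target.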
Specific primers can be used to amplify the junction of an IKZF1 genomic abnormality to produce an amplicon that can be used as a “specific probe” or can itself be detected for identifying an IKZF1 genomic abnormality in a biological sample. When the probe is hybridized with the polynucleotides of a biological sample under conditions which allow for the binding of the probe to the sample, this binding can be detected and thus allow for an indication of the presence of the IKZF1 genomic abnormality in the biological sample. Such identification of a bound probe has been described in the art. The specific probe may comprise a sequence that is at least 80%, between 80 and 85%, between 85 and 90%, between 90 and 95%, or between 95 and 100% identical (or complementary) to a specific region of the IKZF1 gene.
As used herein, “amplified DNA” or “amplicon” refers to the product of polynucleotide amplification of a target polynucleotide that is part of a nucleic acid template. For example, to determine whether the nucleic acid complement of a biological sample comprises an IKZF1 genomic abnormality, the nucleic acid complement of the biological sample may be subjected to a polynucleotide amplification method using a primer pair that includes a first primer derived from the 5′ flanking sequence adjacent to a junction of an IKZF1 genomic abnormality, and a second primer derived from the 3′ flanking sequence adjacent to the junction of the IKZF1 genomic abnormality to produce an amplicon that is diagnostic for the presence of the IKZF1 genomic abnormality. By “diagnostic” for an IKZF1 genomic abnormality is intended the use of any method or assay which discriminates between the presence and the absence of an IKZF1 genomic abnormality in a biological sample. The amplicon is of a length and has a sequence that is also diagnostic for the IKZF1 genomic abnormality (i.e., has a junction sequence of the IKZF1 genomic abnormality). The amplicon may range in length from the combined length of the primer pair plus one nucleotide base pair to any length of amplicon producible by a DNA amplification protocol. A member of a primer pair derived from the flanking sequence may be located a distance from the junction or breakpoint. This distance can range from one nucleotide base pair up to the limits of the amplification reaction, or about twenty thousand nucleotide base pairs. The use of the term “amplicon” specifically excludes primer dimers that may be formed in the DNA thermal amplification reaction.
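The flanking-primer strategy described above can be sketched as a minimal in-silico amplification in Python. This is an illustrative simplification under stated assumptions: the sequences are hypothetical toy strings, primers are required to match exactly, the helper names `reverse_complement` and `amplicon` are invented for this sketch, and the small `max_len` passed in the usage example merely stands in for the practical limit of an amplification reaction (given in the text as about twenty thousand base pairs).

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon(template: str, fwd: str, rev: str, max_len: int = 20_000):
    """Return the product bounded by an exactly matching primer pair, or None.

    `fwd` matches the top strand; `rev` is written 5'->3' on the bottom strand.
    Products longer than `max_len` are treated as not amplifiable.
    """
    start = template.find(fwd)
    if start == -1:
        return None
    rev_site = reverse_complement(rev)
    site = template.find(rev_site, start + len(fwd))
    if site == -1:
        return None
    product = template[start:site + len(rev_site)]
    return product if len(product) <= max_len else None


# Hypothetical toy allele structure: 5' flank, deleted region, 3' flank.
left = "ATGCCGTAAGCTTGCA"
middle = "GGGTTTCCCAAA" * 5
right = "TCAGGCTAACGTTAGC"

wild_type = left + middle + right   # intact allele
deleted = left + right              # allele carrying the deletion
junction = left[-4:] + right[:4]    # sequence spanning the joined ends

fwd = left[:8]                       # first primer, from the 5' flank
rev = reverse_complement(right[-8:]) # second primer, from the 3' flank

product_deleted = amplicon(deleted, fwd, rev, max_len=50)
product_wild = amplicon(wild_type, fwd, rev, max_len=50)
print(product_deleted, product_wild)
```

With these toy values, only the deleted allele yields a product within the length limit, and that product contains the junction sequence, which is the sense in which the amplicon is diagnostic.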
Methods for preparing and using probes and primers are described, for example, in Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, ed. Sambrook et al., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. 1989 (hereinafter, “Sambrook et al., 1989”); Current Protocols in Molecular Biology, ed. Ausubel et al., Greene Publishing and Wiley-Interscience, New York, 1992 (with periodic updates) (hereinafter, “Ausubel et al., 1992”); and Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press: San Diego, 1990. PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as the PCR primer analysis tool in Vector NTI version 10 (Informax Inc., Bethesda Md.); PrimerSelect (DNASTAR Inc., Madison, Wis.); and Primer (Version 0.5, © 1991, Whitehead Institute for Biomedical Research, Cambridge, Mass.). Additionally, the sequence can be visually scanned and primers manually identified using guidelines known to one of skill in the art.
As outlined in further detail below, any conventional nucleic acid hybridization or amplification or sequencing method can be used to specifically detect the presence of a polynucleotide arising due to an IKZF1 genomic abnormality. By “specifically detect” is intended that the polynucleotide can be used either as a primer to amplify the junction of an IKZF1 genomic abnormality or as a probe that hybridizes under stringent conditions to a polynucleotide having an IKZF1 genomic abnormality. The level or degree of hybridization which allows for the specific detection of the IKZF1 genomic abnormality is sufficient to distinguish the polynucleotide with the IKZF1 genomic abnormality from a polynucleotide that does not contain the rearrangement and thereby allow for discriminately identifying an IKZF1 genomic abnormality. By “shares sufficient sequence identity or complementarity to allow for the amplification of an IKZF1 chromosome rearrangement” is intended that the sequence shares at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity or complementarity to a fragment or across the full length of the IKZF1 polynucleotide.
The IKZF1 genomic abnormalities may be detected using a variety of nucleic acid techniques known to those of ordinary skill in the art, including but not limited to: nucleic acid sequencing; nucleic acid hybridization; and, nucleic acid amplification. Nucleic acid hybridization includes methods using labeled probes directed against purified DNA, amplified DNA, and fixed leukemic cell preparations (fluorescence in situ hybridization).
Illustrative non-limiting examples of nucleic acid sequencing techniques include, but are not limited to, chain terminator (Sanger) sequencing and dye terminator sequencing. Chain terminator sequencing uses sequence-specific termination of a DNA synthesis reaction using modified nucleotide substrates. Extension is initiated at a specific site on the template DNA by using a short radioactive, or other labeled, oligonucleotide primer complementary to the template at that region. The oligonucleotide primer is extended using a DNA polymerase, standard four deoxynucleotide bases, and a low concentration of one chain terminating nucleotide, most commonly a di-deoxynucleotide. This reaction is repeated in four separate tubes with each of the bases taking turns as the di-deoxynucleotide. Limited incorporation of the chain terminating nucleotide by the DNA polymerase results in a series of related DNA fragments that are terminated only at positions where that particular di-deoxynucleotide is used. For each reaction tube, the fragments are size-separated by electrophoresis in a slab polyacrylamide gel or a capillary tube filled with a viscous polymer. The sequence is determined by reading which lane produces a visualized mark from the labeled primer as you scan from the top of the gel to the bottom. Dye terminator sequencing alternatively labels the terminators. Complete sequencing can be performed in a single reaction by labeling each of the di-deoxynucleotide chain-terminators with a separate fluorescent dye, which fluoresces at a different wavelength.
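The four-reaction chain terminator scheme described above can be illustrated with a toy model. The sketch below assumes idealized chemistry (every possible terminated fragment is produced and resolved by size) and uses a hypothetical newly synthesized strand; it does not model signal intensities or gel artifacts.

```python
def four_lane_fragments(new_strand: str, primer_len: int) -> dict[str, list[int]]:
    """Fragment lengths expected in each ddNTP reaction (one lane per base).

    `new_strand` is the strand being synthesized, written 5'->3', whose first
    `primer_len` bases are the labeled primer. A fragment of length i + 1
    appears in the lane whose di-deoxynucleotide matches base i.
    """
    lanes: dict[str, list[int]] = {base: [] for base in "ACGT"}
    for i in range(primer_len, len(new_strand)):
        lanes[new_strand[i]].append(i + 1)
    return lanes


def read_sequence(lanes: dict[str, list[int]]) -> str:
    """Read bases in order of increasing fragment length (5'->3' past the primer)."""
    calls = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in calls)


if __name__ == "__main__":
    lanes = four_lane_fragments("GATTACAGGCT", primer_len=4)
    print(lanes)
    print(read_sequence(lanes))
```

Ordering the fragments from shortest to longest and noting which lane each falls in recovers the sequence synthesized downstream of the primer.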
The present invention further provides methods for identifying nucleic acids containing an IKZF1 genomic abnormality which do not necessarily require sequence amplification and are based on, for example, the known methods of Southern (DNA:DNA) blot hybridizations, in situ hybridization and FISH of chromosomal material, using appropriate probes. Such nucleic acid probes can be used that comprise nucleotide sequences in proximity to the IKZF1 genomic abnormality junction, or breakpoint. By “in proximity to” is intended within about 100 kilobases (kb) of the IKZF1 genomic abnormality junction.
In situ hybridization (ISH) is a type of hybridization that uses a labeled complementary DNA or RNA strand as a probe to localize a specific DNA or RNA sequence in a portion or section of tissue (in situ), or, if the tissue is small enough, the entire tissue (whole mount ISH). DNA ISH can be used to determine the structure of chromosomes. Sample cells and tissues are usually treated to fix the target transcripts in place and to increase access of the probe. The probe hybridizes to the target sequence at elevated temperature, and then the excess probe is washed away. The probe that was labeled with either radio-, fluorescent- or antigen-labeled bases is localized and quantitated in the tissue using either autoradiography, fluorescence microscopy or immunohistochemistry, respectively. ISH can also use two or more probes, labeled with radioactivity or the other non-radioactive labels, to simultaneously detect two or more transcripts. In some embodiments, the IKZF1 genomic abnormalities are detected using fluorescence in situ hybridization (FISH).
In specific embodiments, probes for detecting an IKZF1 genomic abnormality are labeled with appropriate fluorescent or other markers and then used in hybridizations. The Examples section provided herein sets forth various protocols that are effective for detecting the genomic abnormalities, but one of skill in the art will recognize that many variations of these assays can be used equally well. Specific protocols are well known in the art and can be readily adapted for the present invention. Guidance regarding methodology may be obtained from many references including: In situ Hybridization: Medical Applications (eds. G. R. Coulton and J. de Belleroche), Kluwer Academic Publishers, Boston (1992); In situ Hybridization: In Neurobiology; Advances in Methodology (eds. J. H. Eberwine, K. L. Valentino, and J. D. Barchas), Oxford University Press Inc., England (1994); In situ Hybridization: A Practical Approach (ed. D. G. Wilkinson), Oxford University Press Inc., England (1992); Kuo et al. (1991) Am. J. Hum. Genet. 42:112-119; Klinger et al. (1992) Am. J. Hum. Genet. 51:55-65; and Ward et al. (1993) Am. J. Hum. Genet. 52:854-865. There are also kits that are commercially available and that provide protocols for performing FISH assays (available from e.g., Oncor, Inc., Gaithersburg, Md.). Patents providing guidance on methodology include U.S. Pat. Nos. 5,225,326; 5,545,524; 6,121,489 and 6,573,043. All of these references are hereby incorporated by reference in their entirety and may be used along with similar references in the art and with the information provided in the Examples section herein to establish procedural steps convenient for a particular laboratory.
Southern blotting can be used to detect specific DNA sequences. In such methods, DNA that is extracted from a sample is fragmented, electrophoretically separated on a matrix gel, and transferred to a membrane filter. The filter bound DNA is subject to hybridization with a labeled probe complementary to the sequence of interest. Hybridized probe bound to the filter is detected.
In hybridization techniques, all or part of a polynucleotide that selectively hybridizes to a target polynucleotide having an IKZF1 genomic abnormality is employed. By “stringent conditions” or “stringent hybridization conditions” when referring to a polynucleotide probe is intended conditions under which a probe will hybridize to its target sequence to a detectably greater degree than to other sequences (e.g., at least 2-fold over background). Stringent conditions are sequence-dependent and will be different in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences that are 100% complementary to the probe can be identified (homologous probing). Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of identity are detected (heterologous probing). Generally, a probe is less than about 1000 nucleotides in length or less than 500 nucleotides in length.
As used herein, a substantially identical or complementary sequence is a polynucleotide that will specifically hybridize to the complement of the nucleic acid molecule to which it is being compared under high stringency conditions. Appropriate stringency conditions which promote DNA hybridization, for example, 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by a wash of 2×SSC at 50° C., are known to those skilled in the art or can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. Typically, stringent conditions for hybridization and detection will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37° C., and a wash in 1× to 2×SSC (20×SSC=3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55° C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1.0 M NaCl, 1% SDS at 37° C., and a wash in 0.5× to 1×SSC at 55 to 60° C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.1×SSC at 60 to 65° C. Optionally, wash buffers may comprise about 0.1% to about 1% SDS. Duration of hybridization is generally less than about 24 hours, usually about 4 to about 12 hours. The duration of the wash time will be at least a length of time sufficient to reach equilibrium.
In hybridization reactions, specificity is typically the function of post-hybridization washes, the critical factors being the ionic strength and temperature of the final wash solution. For DNA-DNA hybrids, the Tm can be approximated from the equation of Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284: Tm=81.5° C.+16.6 (log M)+0.41 (% GC)−0.61 (% form)−500/L; where M is the molarity of monovalent cations, % GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. The Tm is the temperature (under defined ionic strength and pH) at which 50% of a complementary target sequence hybridizes to a perfectly matched probe. Tm is reduced by about 1° C. for each 1% of mismatching; thus, Tm, hybridization, and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with ≥90% identity are sought, the Tm can be decreased 10° C. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence and its complement at a defined ionic strength and pH. However, severely stringent conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4° C. lower than the thermal melting point (Tm); moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10° C. lower than the thermal melting point (Tm); low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14, 15, or 20° C. lower than the thermal melting point (Tm). Using the equation, hybridization and wash compositions, and desired Tm, those of ordinary skill will understand that variations in the stringency of hybridization and/or wash solutions are inherently described. If the desired degree of mismatching results in a Tm of less than 45° C. (aqueous solution) or 32° C. (formamide solution), it is optimal to increase the SSC concentration so that a higher temperature can be used. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, Part I, Chapter 2 (Elsevier, N.Y.); and Ausubel et al., eds. (1995) Current Protocols in Molecular Biology, Chapter 2 (Greene Publishing and Wiley-Interscience, New York). See Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.) and Haymes et al. (1985) In: Nucleic Acid Hybridization, a Practical Approach, IRL Press, Washington, D.C.
A polynucleotide is said to be the “complement” of another polynucleotide if they exhibit complementarity. As used herein, molecules are said to exhibit “complete complementarity” when every nucleotide of one of the polynucleotide molecules is complementary to a nucleotide of the other. Two molecules are said to be “minimally complementary” if they can hybridize to one another with sufficient stability to permit them to remain annealed to one another under at least conventional “low-stringency” conditions. Similarly, the molecules are said to be “complementary” if they can hybridize to one another with sufficient stability to permit them to remain annealed to one another under conventional “high-stringency” conditions.
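The Meinkoth and Wahl approximation recited above, together with the rule of about 1° C. per 1% mismatch, can be worked through numerically. The following is a minimal sketch; it assumes that “log” denotes the base-10 logarithm, and the function names and example values are hypothetical rather than conditions taken from the Examples.

```python
import math


def melting_temperature(monovalent_molar: float, percent_gc: float,
                        percent_formamide: float, hybrid_length_bp: int) -> float:
    """Approximate Tm (deg C) of a DNA-DNA hybrid.

    Tm = 81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(%form) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / hybrid_length_bp)


def mismatch_adjusted_tm(tm: float, min_percent_identity: float) -> float:
    """Lower Tm by about 1 deg C for each 1% of mismatching to be tolerated."""
    return tm - (100.0 - min_percent_identity)


def wash_temperature(tm: float, degrees_below_tm: float = 5.0) -> float:
    """Temperature set a chosen number of degrees below Tm (about 5 for stringent)."""
    return tm - degrees_below_tm


if __name__ == "__main__":
    # Hypothetical hybrid: 1 M Na+, 50% GC, 50% formamide, 500 bp.
    tm = melting_temperature(1.0, 50.0, 50.0, 500)
    print(tm, mismatch_adjusted_tm(tm, 90.0), wash_temperature(tm))
```

For the hypothetical values shown, the terms are 81.5 + 0 + 20.5 − 30.5 − 1, so the approximate Tm is 70.5° C., and seeking sequences with at least 90% identity lowers the working Tm by 10° C.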
Regarding the amplification of a target polynucleotide (e.g., by PCR) using a particular amplification primer pair, “stringent conditions” are conditions that permit the primer pair to hybridize to the target polynucleotide to which a primer having the corresponding sequence (or its complement) would bind and preferably to produce an identifiable amplification product (the amplicon) having a junction of an IKZF1 genomic abnormality in a DNA thermal amplification reaction. In a PCR approach, oligonucleotide primers can be designed for use in PCR reactions to amplify a junction of an IKZF1 genomic abnormality. Methods for designing PCR primers and PCR cloning are generally known in the art and are disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.). See also Innis et al., eds. (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, eds. (1995) PCR Strategies (Academic Press, New York); and Innis and Gelfand, eds. (1999) PCR Methods Manual (Academic Press, New York). Methods of amplification are further described in U.S. Pat. Nos. 4,683,195, 4,683,202 and Chen et al. (1994) PNAS 91:5695-5699. These methods as well as other methods known in the art of DNA amplification may be used in the practice of the embodiments of the present invention. It is understood that a number of parameters in a specific PCR protocol may need to be adjusted to specific laboratory conditions and may be slightly modified and yet allow for the collection of similar results. These adjustments will be apparent to a person skilled in the art.
The amplified polynucleotide (amplicon) can be of any length that allows for the detection of the IKZF1 genomic abnormality. For example, the amplicon can be about 10, 50, 100, 200, 300, 500, 700, 1000, 2000, 3000, 4000, 5000 nucleotides in length or longer.
Any primer can be employed in the methods of the invention that allows a junction of the IKZF1 genomic abnormality to be amplified and/or detected. For example, in specific embodiments, at least one of the primers employed in the method of detection or amplification comprises the sequence set forth in SEQ ID NO:74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, and/or 104. Methods for designing PCR primers are generally known in the art and are disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.). See also Innis et al., eds. (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, eds. (1995) PCR Strategies (Academic Press, New York); and Innis and Gelfand, eds. (1999) PCR Methods Manual (Academic Press, New York). Other known methods of PCR that can be used in the methods of the invention include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, mixed DNA/RNA primers, vector-specific primers, partially mismatched primers, and the like.
Thus, in specific embodiments, a method of detecting the presence of an IKZF1 genomic abnormality in a biological sample is provided. The method comprises (a) providing a sample comprising the genomic DNA of a subject; (b) providing a pair of DNA primer molecules that can amplify an amplicon having a junction of an IKZF1 genomic abnormality; (c) providing DNA amplification reaction conditions; (d) performing the DNA amplification reaction, thereby producing a DNA amplicon molecule; and (e) detecting the DNA amplicon molecule. In order for a nucleic acid molecule to serve as a primer or probe it need only be sufficiently complementary in sequence to be able to form a stable double-stranded structure under the particular solvent and salt concentrations employed.
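Steps (b) through (e) above can be mimicked in silico to predict what a primer pair would amplify from a given template. The sketch below assumes exact primer matches on a single template strand; all sequences are short hypothetical strings chosen for illustration and are not the primers of SEQ ID NO:74-104 or any IKZF1 sequence.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def predict_amplicon(template: str, forward_primer: str, reverse_primer: str):
    """Return the predicted amplicon, or None if either primer has no exact site."""
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical locus: an internal segment is deleted in the rearranged allele.
LEFT, DELETED, RIGHT = "ATGGCCTTAGCA", "GGGTTTCCCAAA", "TCGATCGGATCC"
WILD_TYPE = LEFT + DELETED + RIGHT
REARRANGED = LEFT + RIGHT
JUNCTION = LEFT[-6:] + RIGHT[:6]          # sequence spanning the new junction
FORWARD = LEFT[:8]                        # primer in the left flank
REVERSE = reverse_complement(RIGHT[-10:])  # primer in the right flank

if __name__ == "__main__":
    for name, allele in (("wild type", WILD_TYPE), ("rearranged", REARRANGED)):
        amplicon = predict_amplicon(allele, FORWARD, REVERSE)
        print(name, len(amplicon), JUNCTION in amplicon)
```

In this toy case both alleles yield a product, but only the amplicon from the rearranged allele contains the junction sequence, and it is shorter by the length of the deleted segment.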
In still other embodiments, genomic abnormalities of genomic DNA may be amplified prior to or simultaneous with detection. Illustrative non-limiting examples of nucleic acid amplification techniques include, but are not limited to, polymerase chain reaction (PCR), ligase chain reaction (LCR), strand displacement amplification (SDA), and nucleic acid sequence based amplification (NASBA). The polymerase chain reaction (U.S. Pat. Nos. 4,683,195, 4,683,202, 4,800,159 and 4,965,188, each of which is herein incorporated by reference in its entirety), commonly referred to as PCR, uses multiple cycles of denaturation, annealing of primer pairs to opposite strands, and primer extension to exponentially increase copy numbers of a target nucleic acid sequence. For other various permutations of PCR see, e.g., U.S. Pat. Nos. 4,683,195, 4,683,202 and 4,800,159; Mullis et al, (1987) Meth. Enzymol. 155: 335; and, Murakawa et al., (1988) DNA 7: 287, each of which is herein incorporated by reference in its entirety.
The ligase chain reaction (Weiss (1991) Science 254: 1292, herein incorporated by reference in its entirety), commonly referred to as LCR, uses two sets of complementary DNA oligonucleotides that hybridize to adjacent regions of the target nucleic acid. The DNA oligonucleotides are covalently linked by a DNA ligase in repeated cycles of thermal denaturation, hybridization and ligation to produce a detectable double-stranded ligated oligonucleotide product.
Strand displacement amplification (Walker et al. (1992) Proc. Natl. Acad. Sci. USA 89: 392-396; U.S. Pat. Nos. 5,270,184 and 5,455,166, each of which is herein incorporated by reference in its entirety), commonly referred to as SDA, uses cycles of annealing pairs of primer sequences to opposite strands of a target sequence, primer extension in the presence of a dNTP[alpha]S to produce a duplex hemiphosphorothioated primer extension product, endonuclease-mediated nicking of a hemimodified restriction endonuclease recognition site, and polymerase-mediated primer extension from the 3′ end of the nick to displace an existing strand and produce a strand for the next round of primer annealing, nicking and strand displacement, resulting in geometric amplification of product. Thermophilic SDA (tSDA) uses thermophilic endonucleases and polymerases at higher temperatures in essentially the same method (EP Pat. No. 0 684 315).
Non-amplified or amplified IKZF1 genomic abnormalities can be detected by any conventional means. For example, the genomic abnormalities can be detected by hybridization with a detectably labeled probe and measurement of the resulting hybrids. Illustrative non-limiting examples of detection methods are described below.
One illustrative detection method, the Hybridization Protection Assay (HPA), involves hybridizing a chemiluminescent oligonucleotide probe (e.g., an acridinium ester-labeled (AE) probe) to the target sequence, selectively hydrolyzing the chemiluminescent label present on unhybridized probe, and measuring the chemiluminescence produced from the remaining probe in a luminometer. See, e.g., U.S. Pat. No. 5,283,174 and Nelson et al. (1995) Nonisotopic Probing, Blotting, and Sequencing, ch. 17 (Larry J. Kricka ed., 2d ed.), each of which is herein incorporated by reference in its entirety.
Another illustrative detection method provides for quantitative evaluation of the amplification process in real-time. Evaluation of an amplification process in “real-time” involves determining the amount of amplicon in the reaction mixture either continuously or periodically during the amplification reaction, and using the determined values to calculate the amount of target sequence initially present in the sample. A variety of methods for determining the amount of initial target sequence present in a sample based on real-time amplification are well known in the art. These include methods disclosed in U.S. Pat. Nos. 6,303,305 and 6,541,205, each of which is herein incorporated by reference in its entirety. Another method for determining the quantity of target sequence initially present in a sample, but which is not based on a real-time amplification, is disclosed in U.S. Pat. No. 5,710,029, herein incorporated by reference in its entirety.
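One common way to calculate the initial amount of target from real-time data is a standard curve relating threshold cycle (Ct) to the logarithm of known input amounts. The sketch below assumes that approach and a log-linear relationship; it is not asserted to be the method of the patents cited above, and the function names and standard values are hypothetical.

```python
import math


def fit_standard_curve(known_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in known_copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def initial_copies(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


if __name__ == "__main__":
    # Hypothetical dilution series of a standard and its measured Ct values.
    standards = [1e2, 1e3, 1e4, 1e5]
    cts = [33.36, 30.04, 26.72, 23.40]
    slope, intercept = fit_standard_curve(standards, cts)
    print(slope, intercept, initial_copies(28.38, slope, intercept))
```

A slope near −3.32 cycles per ten-fold dilution corresponds to roughly a doubling of product per cycle; slopes that depart substantially from that value suggest the standard curve should be re-examined before use.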
Amplification products may be detected in real-time through the use of various self-hybridizing probes, most of which have a stem-loop structure. Such self-hybridizing probes are labeled so that they emit differently detectable signals, depending on whether the probes are in a self-hybridized state or an altered state through hybridization to a target sequence. By way of non-limiting example, “molecular torches” are a type of self-hybridizing probe that includes distinct regions of self-complementarity (referred to as “the target binding domain” and “the target closing domain”) which are connected by a joining region (e.g., non-nucleotide linker) and which hybridize to each other under predetermined hybridization assay conditions. In a preferred embodiment, molecular torches contain single-stranded base regions in the target binding domain that are from 1 to about 20 bases in length and are accessible for hybridization to a target sequence present in an amplification reaction under strand displacement conditions. Under strand displacement conditions, hybridization of the two complementary regions, which may be fully or partially complementary, of the molecular torch is favored, except in the presence of the target sequence, which will bind to the single-stranded region present in the target binding domain and displace all or a portion of the target closing domain. The target binding domain and the target closing domain of a molecular torch include a detectable label or a pair of interacting labels (e.g., luminescent/quencher) positioned so that a different signal is produced when the molecular torch is self-hybridized than when the molecular torch is hybridized to the target sequence, thereby permitting detection of probe:target duplexes in a test sample in the presence of unhybridized molecular torches. Molecular torches and a variety of types of interacting label pairs are disclosed in U.S. Pat. No. 6,534,274, herein incorporated by reference in its entirety.
Another example of a detection probe having self-complementarity is a “molecular beacon.” Molecular beacons include nucleic acid molecules having a target complementary sequence, an affinity pair (or nucleic acid arms) holding the probe in a closed conformation in the absence of a target sequence present in an amplification reaction, and a label pair that interacts when the probe is in a closed conformation. Hybridization of the target sequence and the target complementary sequence separates the members of the affinity pair, thereby shifting the probe to an open conformation. The shift to the open conformation is detectable due to reduced interaction of the label pair, which may be, for example, a fluorophore and a quencher (e.g., DABCYL and EDANS). Molecular beacons are disclosed in U.S. Pat. Nos. 5,925,517 and 6,150,097, each of which is herein incorporated by reference in its entirety.
Other self-hybridizing probes are well known to those of ordinary skill in the art. By way of non-limiting example, probe binding pairs having interacting labels, such as those disclosed in U.S. Pat. No. 5,928,862 (herein incorporated by reference in its entirety) might be adapted for use in the present invention. Probe systems used to detect single nucleotide polymorphisms (SNPs) might also be utilized in the present invention. Additional detection systems include “molecular switches,” as disclosed in U.S. Publ. No. 20050042638, herein incorporated by reference in its entirety. Other probes, such as those comprising intercalating dyes and/or fluorochromes, are also useful for detection of amplification products in the present invention. See, e.g., U.S. Pat. No. 5,814,447 (herein incorporated by reference in its entirety).
Various methods can be used to detect the IKZF1 genomic abnormality or amplicon having a junction of an IKZF1 genomic abnormality, including, but not limited to, Genetic Bit Analysis (Nikiforov et al. (1994) Nucleic Acid Res. 22: 4167-4175) where a DNA oligonucleotide is designed which overlaps both the adjacent flanking DNA sequence and the inserted DNA sequence. The oligonucleotide is immobilized in wells of a microwell plate. Following PCR of the region of interest (using one primer in the inserted sequence and one in the adjacent flanking sequence) a single-stranded PCR product can be hybridized to the immobilized oligonucleotide and serve as a template for a single base extension reaction using a DNA polymerase and labeled ddNTPs specific for the expected next base. Readout may be fluorescent or ELISA-based. A signal indicates presence of the insert/flanking sequence due to successful amplification, hybridization, and single base extension.
Another detection method is the Pyrosequencing technique as described by Winge ((2000) Innov. Pharma. Tech. 00: 18-24). In this method, an oligonucleotide is designed that overlaps the junction. The oligonucleotide is hybridized to a single-stranded PCR product from the region of interest (one primer in the inserted sequence and one in the flanking sequence) and incubated in the presence of a DNA polymerase, ATP, sulfurylase, luciferase, apyrase, adenosine 5′ phosphosulfate and luciferin. dNTPs are added individually and the incorporation results in a light signal which is measured. A light signal indicates the presence of the transgene insert/flanking sequence due to successful amplification, hybridization, and single or multi-base extension.
Fluorescence Polarization as described by Chen et al. ((1999) Genome Res. 9: 492-498) is also a method that can be used to detect an amplicon of the invention. Using this method, an oligonucleotide is designed which overlaps the inserted DNA junction. The oligonucleotide is hybridized to a single-stranded PCR product from the region of interest (one primer in the inserted DNA and one in the flanking DNA sequence) and incubated in the presence of a DNA polymerase and a fluorescent-labeled ddNTP. Single base extension results in incorporation of the ddNTP. Incorporation can be measured as a change in polarization using a fluorometer. A change in polarization indicates the presence of the genomic abnormality sequence due to successful amplification, hybridization, and single base extension.
Taqman® (PE Applied Biosystems, Foster City, Calif.) is a method of detecting and quantifying the presence of a DNA sequence and is fully described in the instructions provided by the manufacturer. Briefly, a FRET oligonucleotide probe is designed which overlaps the junction. The FRET probe and PCR primers (one primer in the insert DNA sequence and one in the flanking genomic sequence) are cycled in the presence of a thermostable polymerase and dNTPs. Hybridization of the FRET probe results in cleavage and release of the fluorescent moiety away from the quenching moiety on the FRET probe. A fluorescent signal indicates the presence of the flanking/transgene insert sequence due to successful amplification and hybridization.
In one embodiment, the method of detecting a genomic abnormality of IKZF1 comprises (a) contacting the biological sample with a polynucleotide probe that hybridizes under stringent hybridization conditions with a polynucleotide having an IKZF1 genomic abnormality and specifically detects the IKZF1 genomic abnormality; (b) subjecting the sample and probe to stringent hybridization conditions; and (c) detecting hybridization of the probe to the polynucleotide, wherein detection of hybridization indicates the presence of the IKZF1 genomic abnormality.
The materials used in the above assay methods are ideally suited for the preparation of a kit. Various detection reagents can be developed and used to assay the presence of the IKZF1 genomic abnormality. The terms “kits” and “systems,” as used herein in the context of the IKZF1 genomic abnormality detection reagents, are intended to refer to such things as combinations of multiple IKZF1 genomic abnormality detection reagents, or one or more IKZF1 genomic abnormality detection reagents in combination with one or more other types of elements or components (e.g., other types of biochemical reagents, containers, packages, such as packaging intended for commercial sale, substrates to which SNP detection reagents are attached, electronic hardware components, and the like). Accordingly, the present invention further provides IKZF1 genomic abnormality detection kits and systems, including but not limited to, packaged probe and primer sets (e.g., TaqMan probe/primer sets), arrays/microarrays of nucleic acid molecules, and beads that contain one or more probes, primers, or other detection reagents for detecting one or more IKZF1 genomic abnormalities. The kits/systems can optionally include various electronic hardware components. For example, arrays (e.g., DNA chips) and microfluidic systems (e.g., lab-on-a-chip systems) provided by various manufacturers typically include hardware components. Other kits/systems (e.g., probe/primer sets) may not include electronic hardware components, but can include, for example, one or more IKZF1 genomic abnormality detection reagents along with other biochemical reagents packaged in one or more containers.
In some embodiments, an IKZF1 genomic abnormality kit typically contains one or more detection reagents and other components (e.g., a buffer, enzymes, such as DNA polymerases or ligases, chain extension nucleotides, such as deoxynucleotide triphosphates, positive control sequences, negative control sequences, and the like) necessary to carry out an assay or reaction, such as amplification and/or detection of a polynucleotide comprising a junction of one of the IKZF1 genomic abnormalities. A kit can further contain means for determining the amount of the target polynucleotide and means for comparing with an appropriate standard, and can include instructions for using the kit to detect the IKZF1 genomic abnormality. In one embodiment, kits are provided which contain the necessary reagents to carry out one or more assays to detect one or more of the IKZF1 genomic abnormalities as disclosed herein. The IKZF1 genomic abnormality detection kits/systems may contain, for example, one or more probes, or pairs of probes, that hybridize to a nucleic acid molecule at or near the junction region.
In specific embodiments, a kit for identifying an IKZF1 genomic abnormality in a biological sample is provided. The kit comprises a first and a second primer, wherein the first and second primer amplify a polynucleotide comprising an IKZF1 genomic abnormality junction and thereby detect an IKZF1 genomic abnormality.
Further provided are polynucleotide detection kits comprising at least one polynucleotide that can specifically detect an IKZF1 genomic abnormality. In specific embodiments, the polynucleotide comprises at least one polynucleotide molecule of a sufficient length of contiguous nucleotides homologous or complementary to SEQ ID NO: 1 or a variant thereof to allow for the detection of an IKZF1 genomic abnormality.
Further provided are methods for identifying agents that target a polypeptide expressed from the IKZF1 genomic abnormalities. Thus, methods to screen for compounds that can serve as molecular targets for drugs useful in modulating the activity of the polypeptides expressed from the IKZF1 genomic abnormalities are provided. Such compounds can find use in treating ALL (i.e., BCR-ABL1 positive ALL, B-progenitor (+) ALL or B-progenitor (−) ALL), and/or in treating CML, more particularly, in treating BC-CML or treating, preventing or delaying progression into BC-CML. The invention provides a method (also referred to herein as a “screening assay”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides, peptidomimetics, small molecules, or other drugs) that modulate (e.g., inhibit) the activity of a polypeptide expressed from the IKZF1 gene having a genomic abnormality.
The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including biological libraries, spatially addressable parallel solid phase or solution phase libraries, synthetic library methods requiring deconvolution, the “one-bead one-compound” library method, and synthetic library methods using affinity chromatography selection. The biological library approach is limited to peptide libraries, while the other four approaches are applicable to peptide, nonpeptide oligomer, or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).
Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.
Libraries of compounds may be presented in solution (e.g., Houghten (1992) Bio/Techniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (U.S. Pat. No. 5,223,409), spores (U.S. Pat. Nos. 5,571,698; 5,403,484; and 5,223,409), plasmids (Cull et al. (1992) Proc. Natl. Acad. Sci. USA 89:1865-1869), or phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. USA 87:6378-6382; and Felici (1991) J. Mol. Biol. 222:301-310).
The compounds screened in the above assay can be, but are not limited to, small molecules, peptides, carbohydrates, or vitamin derivatives. The agents can be selected and screened at random or rationally selected or designed using protein modeling techniques. For random screening, agents such as peptides or carbohydrates are selected at random and are assayed for their ability to bind to the polypeptide expressed from the IKZF1 gene having the genomic abnormality. Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be “rationally selected or designed” when the agent is chosen based on the configuration of the polypeptide expressed from the IKZF1 gene having the genomic abnormality. For example, one skilled in the art can readily adapt currently available procedures to generate peptides capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides, see, for example, Hurby et al., “Application of Synthetic Peptides: Antisense Peptides,” in Synthetic Peptides: A User's Guide, W.H. Freeman, New York (1992), pp. 289-307; and Kaspczak et al., Biochemistry 28:9230-2938 (1989).
Determining the ability of the test compound to specifically bind to the polypeptide expressed from the IKZF1 gene having the genomic abnormality can be accomplished, for example, by coupling the test compound with a radioisotope or enzymatic label such that binding of the test compound to the polypeptide expressed from the IKZF1 gene having the genomic abnormality can be determined by detecting the labeled compound in a complex. For example, test compounds can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, test compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.
In another embodiment, an assay of the present invention is a cell-free assay comprising contacting a polypeptide expressed from the IKZF1 gene having the genomic abnormality with a test compound and determining the ability of the test compound to specifically bind to the polypeptide expressed from the IKZF1 gene having the genomic abnormality. Binding of the test compound to the polypeptide expressed from the IKZF1 gene having the genomic abnormality can be determined either directly or indirectly as described above.
In another embodiment, an assay is a cell-free assay comprising contacting the polypeptide expressed from the IKZF1 gene having the genomic abnormality with a test compound and determining the ability of the test compound to specifically modulate (i.e., inhibit or activate) the activity of the polypeptide expressed from the IKZF1 gene having the genomic abnormality. Determining the ability of the test compound to inhibit the activity of a polypeptide expressed from the IKZF1 gene having the genomic abnormality can be accomplished using any method that can assay for IKZF1 activity. In addition, one could assay for efficacy in the treatment of ALL (i.e., BCR-ABL1 positive ALL, B-progenitor (+) ALL or B-progenitor (−) ALL) and/or in the treatment of CML, more particularly, in the treatment of BC-CML or treating, preventing or delaying progression into BC-CML.
Such desired compounds may be further screened for selectivity by determining whether they suppress or eliminate phenotypic changes or activities associated with expression of the polypeptides expressed from IKZF1 genes having a genomic abnormality in the cells. The agents are screened by administering the agent to the cell or alternatively, the activity of the selective agent can be monitored in an in vitro assay. It is recognized that it is preferable that a range of dosages of a particular agent be administered to the cells to determine if the agent is useful for treating ALL, more particularly, BCR-ABL1 positive ALL and/or in the treatment of CML, more particularly, in the treatment of BC-CML and/or treating, preventing or delaying progression into BC-CML.
There are numerous variations of the above assays which can be used by a skilled artisan in order to isolate agonists. See, for example, Burch, R. M., in Medications Development. Drug Discovery, Databases, and Computer-Aided Drug Design, NIDA Research Monograph 134, NIH Publication No. 93-3638, Rapaka, R. S., and Hawks, R. L., eds., U.S. Dept. of Health and Human Services, Rockville, Md. (1993), pages 37-45.
Using the above procedures, the present invention provides a compound capable of binding or modulating the activity of a polypeptide expressed from the IKZF1 gene having the genomic abnormality, produced by a method comprising the steps of (a) contacting said compound with the polypeptide expressed from the IKZF1 gene having the genomic abnormality, and (b) determining whether the compound specifically binds or modulates the activity of the polypeptide expressed from the IKZF1 gene having the genomic abnormality. Additional step(s) to determine whether such binding is selective for the IKZF1 polypeptide expressed from an IKZF1 gene lacking a genomic abnormality may also be employed.
As used herein, “sequence identity” or “identity” in the context of two polynucleotides or polypeptide sequences makes reference to the residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window. When percentage of sequence identity is used in reference to proteins it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. When sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Sequences that differ by such conservative substitutions are said to have “sequence similarity” or “similarity”. Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, Calif.).
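The partial-mismatch scoring described above can be sketched as follows. This is a minimal illustration only: the residue groupings and the partial score of 0.5 are assumptions chosen for the example and are not the scheme implemented in PC/GENE. The function names are hypothetical, and the two sequences are assumed to be already aligned, of equal length, and free of gaps.

```python
# Illustrative sketch: groups and the 0.5 partial score are assumptions,
# not the scoring scheme implemented in PC/GENE.
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FWY"),   # aromatic
    set("KRH"),   # positively charged
    set("DE"),    # negatively charged
    set("STNQ"),  # polar, uncharged
]


def position_score(a, b, partial=0.5):
    """Score one aligned position: 1 if identical, `partial` if the two
    residues share an (assumed) conservative group, otherwise 0."""
    if a == b:
        return 1.0
    for group in CONSERVATIVE_GROUPS:
        if a in group and b in group:
            return partial
    return 0.0


def percent_identity_and_similarity(seq_a, seq_b, partial=0.5):
    """Return (percent identity, adjusted percent) for two ungapped,
    pre-aligned amino acid sequences of equal length."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(seq_a)
    identical = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    adjusted = sum(position_score(a, b, partial) for a, b in zip(seq_a, seq_b))
    return 100.0 * identical / n, 100.0 * adjusted / n


# Hypothetical five-residue example: one conservative (K/R) and one
# non-conservative (D/A) difference.
identity, similarity = percent_identity_and_similarity("MKVLD", "MRVLA")
```

In this toy example the conservative K/R substitution counts as half a match, so the adjusted percentage is higher than the plain identity percentage.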
As used herein, “percentage of sequence identity” means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
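The calculation just described can be written out as a short sketch. It assumes the two sequences have already been optimally aligned by some other tool and are supplied as equal-length strings in which a gap is written as "-"; the function name and the example sequences are hypothetical.

```python
def percent_sequence_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of sequence identity over a comparison window.

    Matched positions are those where both aligned sequences carry the
    same base or residue; a position containing a gap never matches.
    The denominator is the total number of positions in the window.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matched / len(aligned_a)


# Hypothetical 10-position window with one gap and one mismatch:
# 8 matched positions / 10 positions * 100
value = percent_sequence_identity("ACGT-ACGTA", "ACGTTACCTA")
```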
Unless otherwise stated, sequence identity/similarity values provided herein refer to the value obtained using GAP Version 10 using the following parameters: % identity and % similarity for a nucleotide sequence using GAP Weight of 50 and Length Weight of 3, and the nwsgapdna.cmp scoring matrix; % identity and % similarity for an amino acid sequence using GAP Weight of 8 and Length Weight of 2, and the BLOSUM62 scoring matrix; or any equivalent program thereof. By “equivalent program” is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by GAP Version 10.
By “fragment” is intended a portion of the polynucleotide. Fragments of an IKZF1 polynucleotide or an exon or intron or promoter or 5′/3′ regulatory region thereof or fragments of a polynucleotide comprising an IKZF1 genomic abnormality are useful as, for example, probes and primers and need not encode the IKZF1 polypeptide. Instead, such fragments and variants are able to detect an IKZF1 genomic abnormality that is associated with ALL, more particularly with BCR-ABL1 positive ALL and/or associated with CML, more particularly, BC-CML or the likelihood of progression into blastic transformation of CML. Alternatively, such fragments and variants are able to detect an IKZF1 genomic abnormality that is predictive of a subtype of ALL having a very poor outcome. Thus, fragments of a nucleotide sequence may range from at least about 10, about 15, 20 nucleotides, about 50 nucleotides, about 75 nucleotides, about 100 nucleotides, 200 nucleotides, 300 nucleotides, 400 nucleotides, 500 nucleotides, 600 nucleotides, 700 nucleotides and up to the full-length polynucleotide employed in the invention. Methods to assay for the activity of a desired polynucleotide or polypeptide are described elsewhere herein.
“Variants” is intended to mean substantially similar sequences. For polynucleotides, a variant comprises a deletion and/or addition of one or more nucleotides at one or more internal sites within the native polynucleotide and/or a substitution of one or more nucleotides at one or more sites in the native polynucleotide. Generally, variants of a particular polynucleotide of the invention having the desired activity will have at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to that particular polynucleotide as determined by sequence alignment programs and parameters described elsewhere herein.
An “isolated” or “purified” polynucleotide or polypeptide or biologically active fragment or variant thereof, is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. Preferably, an “isolated” nucleic acid is free of sequences (preferably protein encoding sequences) that naturally flank the nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For purposes of the invention, “isolated” when used to refer to nucleic acid molecules excludes isolated chromosomes. For example, in various embodiments, the isolated nucleic acid molecules can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleotide sequences that naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived.
As used herein, the use of the term “polynucleotide” is not intended to limit the present invention to polynucleotides comprising DNA. Those of ordinary skill in the art will recognize that polynucleotides can comprise ribonucleotides and combinations of ribonucleotides and deoxyribonucleotides. Such deoxyribonucleotides and ribonucleotides include both naturally occurring molecules and synthetic analogues. The polynucleotides of the invention also encompass all forms of sequences including, but not limited to, single-stranded forms, double-stranded forms, hairpins, stem-and-loop structures, and the like.
The following examples are offered by way of illustration and not by way of limitation.
The Philadelphia chromosome, encoding BCR-ABL1, is the defining lesion of chronic myelogenous leukemia (CML) and a subset of acute lymphoblastic leukemia (ALL). To define oncogenic lesions that cooperate with BCR-ABL1 to induce ALL, we performed genome-wide analysis of diagnostic leukemic samples from 304 individuals with ALL, including 43 BCR-ABL1 B-progenitor ALLs, and 34 CML cases. IKZF1 (encoding the transcription factor Ikaros) was deleted in 83.7% of BCR-ABL1 ALL, but not in chronic phase CML. Deletion of IKZF1 was also identified as an acquired lesion in lymphoid blast crisis of CML. The IKZF1 deletions resulted in haploinsufficiency, expression of a dominant negative Ikaros isoform or the complete loss of Ikaros expression. Sequencing of IKZF1 deletion breakpoints suggested that aberrant RAG-mediated recombination is responsible for the deletions. These findings suggest that genetic lesions resulting in the loss of Ikaros function are a key event in the development of BCR-ABL1 ALL.
Patients and samples. Two hundred eighty two patients with acute lymphoblastic leukemia (ALL) treated at St. Jude Children's Research Hospital, 22 adult BCR-ABL1 ALL patients treated at the University of Chicago, and 49 samples obtained from 23 adult patients with chronic myeloid leukemia (CML) treated at the Institute of Medical and Veterinary Science, Adelaide, and 36 AML and ALL cell lines were studied (Tables 1 and 2). The CML cohort included 24 chronic phase, 7 accelerated phase and 15 blast crisis samples, and three samples obtained at complete cytogenetic response. All blast crisis samples were flow sorted to at least 90% blast purity prior to DNA extraction using FACS Vantage SE (with DiVa option) flow cytometers (BD Biosciences, San Jose, Calif.) and fluorescein isothiocyanate labelled CD45, allophycocyanin labelled CD33 and phycoerythrin labelled CD19 and CD13 antibodies (BD Biosciences). Germline tissue was obtained by also sorting the non-blast population in 7 cases. Informed consent for the use of leukemic cells for research was obtained from patients, parents or guardians in accordance with the Declaration of Helsinki, and study approval was obtained from the SJCRH institutional review board.
Single nucleotide polymorphism microarray analysis. Collection and processing of diagnostic and remission bone marrow and peripheral blood samples for Affymetrix single nucleotide polymorphism microarray analysis has been previously reported in detail9. Affymetrix 250K Sty and Nsp arrays were performed on all samples. 50 k Hind 240 and 50 k Xba 240 arrays were performed for 252 ALL samples (Table 1).
Fluorescent in situ hybridization. Fluorescence in situ hybridization for IKZF1 deletion was performed using diagnostic bone marrow or peripheral blood leukemic cells in Carnoy's fixative as previously described9. BAC clones CTD-2382L6 and CTC-791O3 (for IKZF1, Open Biosystems, Huntsville, Ala.) were labelled with fluorescein isothiocyanate, and control 7q31 probes RP11-460K21 (Children's Hospital Oakland Research Institute, Oakland, Calif.) and CTB-133K23 (Open Biosystems) were labelled with rhodamine. At least 100 interphase nuclei were scored per case.
IKZF1 PCR, cloning, quantitative PCR and genomic sequencing. RNA was extracted and reverse transcribed using random hexamer primers and Superscript III (Invitrogen, Carlsbad, Calif.) as previously described9. IKZF1 transcripts were amplified from cDNA using the Advantage 2 PCR enzyme (Clontech, Mountain View) as previously described9 using primers that anneal in exon 0 and 7 of IKZF1. PCR products were purified, and sequenced directly and after cloning into pGEM-T-Easy (Promega, Madison, Wis.). Genomic quantitative PCR for exons 1-7 of IKZF1, and real-time PCR to quantify expression of Ik6 were performed as previously described9. All primers and probes are listed in Table 10. Genomic sequencing of IKZF1 exons 0-7 in all ALL and CML samples was performed as previously described9.
Western blotting. Whole cell lysates of 3-6×10⁶ leukemic cells were prepared and blotted as previously described9 using N- and C-terminus specific rabbit polyclonal Ikaros antibodies (Santa Cruz Biotechnology, Santa Cruz, Calif.).
Methylation analysis. Methylation status of the IKZF1 promoter CpG island (chr7:50121508-50121714) was assessed using MALDI-TOF mass spectrometry of PCR-amplified, bisulfite modified genomic DNA extracted from leukemic cells as previously described8,29.
Statistical analysis. Associations between ALL subtype and IKZF1 deletion frequency were calculated using the exact likelihood ratio test. Differences in Ik6 expression between IKZF1 Δ3-6 and non-Δ3-6 cases were assessed using the exact Wilcoxon-Mann-Whitney test. All P values reported are two-sided. Analyses were performed using StatXact v8.0.0 (Cytel, Cambridge, Mass.).
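For readers unfamiliar with the exact Wilcoxon-Mann-Whitney test, the idea can be sketched by full enumeration for small groups. This is an illustrative sketch only and does not reproduce StatXact: the function names are hypothetical, the two-sided P value is defined here as the fraction of group assignments whose rank sum lies at least as far from its expected value as the observed rank sum, and the input values are invented rather than taken from the study.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def exact_wmw_two_sided(x, y):
    """Exact two-sided P value by enumerating every way of assigning
    len(x) of the pooled ranks to the first group (small samples only)."""
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    n_x = len(x)
    observed = sum(ranks[:n_x])
    expected = n_x * (len(pooled) + 1) / 2.0
    observed_dev = abs(observed - expected)
    total = 0
    as_extreme = 0
    for subset in combinations(range(len(pooled)), n_x):
        total += 1
        rank_sum = sum(ranks[i] for i in subset)
        # Small tolerance guards against floating-point noise in midranks.
        if abs(rank_sum - expected) >= observed_dev - 1e-12:
            as_extreme += 1
    return as_extreme / total


# Invented expression values for two small groups (not study data).
p_value = exact_wmw_two_sided([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

Full enumeration grows combinatorially with sample size, so a sketch like this is only practical for small groups; dedicated statistical software uses faster algorithms.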
Cell lines examined by SNP array. Thirty-six acute myeloid and lymphoid leukemia cell lines were genotyped using the Affymetrix Mapping 250 k Sty and Nsp arrays. These were the ALL cell lines 380 (MYC-IGH and BCL2-IGH B-precursor), 697 (TCF3-PBX1), AT1 (ETV6-RUNX1), BV173 (CML in lymphoid blast crisis), CCRF-CEM (TAL-SIL), Jurkat (T-ALL), Kasumi-2 (TCF3-PBX1), MHH-CALL-2 (hyperdiploid B-precursor ALL), MHH-CALL-3 (TCF3-PBX1), MOLT3 (T-ALL), MOLT4 (T-ALL), NALM-6 (B-precursor ALL), OP1 (BCR-ABL1), Reh (ETV6-RUNX1), RS4;11 (MLL-AF4), SD1 (BCR-ABL1), SUP-B15 (BCR-ABL1), TOM-1 (BCR-ABL1), U-937 (PICALM-AF10), UOCB1 (TCF3-HLF), YT (NK leukemia); and the AML cell lines CMK (FAB M7), HL-60 (FAB M2), K-562 (CML in myeloid blast crisis), Kasumi-1 (RUNX1-RUNX1T1), KG-1 (myelocytic leukemia), ME-1 (CBFB-MYH11), ML-2 (MLL-AF6), M-07e (FAB M7), Mono Mac 6 (MLL-AF9), MV4-11 (MLL-AF4), NB4 (PML-RARA), NOMO-1 (MLL-AF9), PL21 (FAB M3), SKNO-1 (RUNX1-RUNX1T1) and THP-1 (FAB M5). Cell lines were obtained from the Deutsche Sammlung von Mikroorganismen und Zellkulturen, Braunschweig, Germany; the American Type Culture Collection, Manassas, Va.; from local institutional repositories; or were gifts from Olaf Heidenreich (SKNO-1) and Dario Campana (OP1). Cells were cultured in accordance with previously published recommendations30. The paediatric BCR-ABL1 B-precursor ALL cell line OP131 was cultured in RPMI-1640 containing 100 units/ml penicillin, 100 μg/ml streptomycin, 2 mM glutamine and 10% fetal bovine serum. DNA was extracted from 5×10⁶ cells obtained during log phase growth after washing in PBS using the QIAamp DNA blood mini kit (Qiagen, Valencia, Calif.).
Obtaining primary SNP array data. SNP array CEL and SNP call TXT files (generated by Affymetrix GTYPE 4.0 using the DM algorithm) have been deposited in NCBI's Gene Expression Omnibus (GEO, www.ncbi.nlm.nih.gov/geo/) and are accessible through GEO Series accession numbers GSE9109-9113. These accessions contain the following data: GSE9109: Sty and Nsp files for 304 ALL samples, and Hind and Xba files for 252 of these samples; GSE9110: Sty and Nsp files for 56 CML samples; GSE9111: Sty, Nsp, Hind and Xba files for 50 remission acute leukemia samples used as references for copy number analysis; GSE9112: Sty and Nsp files for 36 acute leukemia cell lines; GSE9113: A superseries containing all of the above data.
Acute lymphoblastic leukemia (ALL) comprises a heterogeneous group of disorders characterized by recurring chromosomal abnormalities including translocations, trisomies and deletions. An ALL subtype with especially poor prognosis is characterized by the presence of the Philadelphia chromosome arising from the t(9;22)(q34;q11.2) translocation, which encodes the constitutively activated BCR-ABL1 tyrosine kinase. BCR-ABL1 positive ALL constitutes 5% of paediatric B-progenitor ALL and approximately 40% of adult ALL1,2. Expression of BCR-ABL1 is also the pathologic lesion underlying chronic myelogenous leukemia (CML)3. Data from murine studies demonstrate that expression of BCR-ABL1 in haematopoietic stem cells can alone induce a CML-like myeloproliferative disease, but cooperating oncogenic lesions are required for the generation of a blastic leukemia4,5. Although the p210 and p190 BCR-ABL1 fusions are most commonly found in CML and paediatric BCR-ABL1 ALL respectively, either fusion may be found in adult BCR-ABL1 ALL6. Importantly, a number of genetic lesions including additional cytogenetic aberrations and mutations in tumor suppressor genes have been described in CML cases progressing to blast crisis7. However, the specific lesions responsible for the generation of BCR-ABL1 acute lymphoid leukemia and blastic transformation of CML remain incompletely understood7. To identify cooperating oncogenic lesions in ALL, we recently performed a genome-wide analysis of paediatric ALL8. This analysis identified an average of 6.8 genomic copy number alterations (CNA) in 9 BCR-ABL1 ALL cases, including deletions in genes that play a regulatory role in normal B cell development.
To extend this analysis and identify lesions that distinguish CML from BCR-ABL1 ALL, we have now examined DNA from leukemic samples from 304 paediatric and adult ALLs (254 B-progenitor, 50 T-lineage), including 21 paediatric and 22 adult BCR-ABL1 ALL, and 23 adult CML samples (Table 1). Samples were analyzed using the 250 k Sty and Nsp Affymetrix SNP arrays (and also the 100K arrays for most cases). This identified a mean of 8.79 somatic CNA per BCR-ABL1 ALL case (range 1-26), with 1.44 gains (range 0-13) and 7.33 losses (range 0-25) (Table 2). No significant differences were noted in the frequency of CNAs between paediatric and adult BCR-ABL1 ALL cases. The most frequent somatic CNA was deletion of IKZF1, which encodes the transcription factor Ikaros (Table 3). IKZF1 was deleted in 36 (83.7%) of 43 BCR-ABL1 ALL cases, including 76.2% of paediatric and 90.9% of adult BCR-ABL1 ALL cases. CDKN2A was deleted in 53.5% of BCR-ABL1 ALL cases, most of which (87.5%) also had deletions of IKZF1 (
Ikaros is a member of a family of zinc finger nuclear proteins that is required for normal lymphoid development9-12. Ikaros has a central DNA-binding domain consisting of four zinc fingers, and a homo- and heterodimerization domain consisting of the two C-terminal zinc fingers13 (
The expression of aberrant, dominant negative Ikaros isoforms in B- and T-lineage ALL has been previously reported by several groups15-22, although alternative splicing has been reported to be the underlying mechanism23. Importantly, the Δ3-6 isoform of Ikaros has been shown to function as a dominant negative inhibitor of the transcriptional activity of Ikaros and related family members13. Moreover, mice heterozygous for a null IKZF1 allele develop clonal T cell expansions24 and mice transgenic for an IKZF1 Δ3-6 gene lack T, B, NK and dendritic cells, and develop a T cell lymphoproliferative disease25,26, demonstrating that alteration in the level of IKZF1 expression is oncogenic.
The high frequency of focal deletions in IKZF1 in BCR-ABL1 ALL suggests that expression of alternative IKZF1 transcripts may be the result of specific genetic lesions, and not alternative splicing of an intact gene. To further explore this possibility, we performed RT-PCR analysis for IKZF1 transcripts in 159 cases (
To identify CNAs in CML, we performed SNP array analysis on 23 CML cases. In addition to chronic phase CML (CP-CML), we also examined matched accelerated phase (AP-CML, N=7) and blast crisis (BC-CML, N=15, 12 myeloid and 3 lymphoid) samples (Table 6). This identified only 0.47 CNAs per CP-CML case (range 0-8) (Table 7), suggesting that BCR-ABL1 is sufficient to induce CML, but alone does not result in substantial genomic instability. Importantly, no recurrent lesions were identified. In contrast, there was a mean of 7.8 CNAs per BC-CML case (range 0-28) (Table 7), with IKZF1 deletions in four BC samples, including two of the three cases with lymphoid blast crisis (
To explore the mechanism responsible for the identified IKZF1 deletion, we sequenced the IKZF1 Δ3-6 genomic breakpoints (Table 8 and
In summary, we have identified a high frequency of CNAs in BCR-ABL1 ALL and BC-CML, but not in CP-CML. The high frequency of recurring CNAs suggests that these lesions directly contribute to the generation of BCR-ABL1 ALL. Among the identified lesions, our analysis revealed a near obligate deletion of IKZF1 in BCR-ABL1 ALL, with 83.7% of paediatric and adult cases containing a deletion that leads to a reduction in dose and/or the expression of an altered Ikaros isoform. By contrast, deletion of IKZF1 was not detected in CP-CML, but was identified as an acquired lesion in 2 of 3 lymphoid BC-CML samples. Furthermore, our data suggest that the IKZF1 deletions result from aberrant RAG-mediated recombination. These data, and the low frequency of IKZF1 deletions in other paediatric B-progenitor ALL cases, suggest that alterations in Ikaros directly contribute to the pathogenesis of BCR-ABL1 ALL. How reduced activity of Ikaros, and possibly that of other family members through the expression of dominant negative Ikaros isoforms, collaborates with BCR-ABL1 to induce lymphoblastic leukemia remains to be determined. Importantly, mice with attenuated Ikaros expression exhibit a partial block of B lymphoid maturation at the pro-B cell stage28, suggesting that Ikaros loss may contribute to the arrested B lymphoid maturation in BCR-ABL1 ALL. However, the high co-occurrence of PAX5 deletions in many cases suggests that IKZF1 deletion contributes to transformation in additional ways. The frequent co-deletion of CDKN2A (encoding INK4A/ARF) with IKZF1 in BCR-ABL1 ALL is a notable finding. This suggests that attenuated Ikaros activity may either collaborate with disruption of INK4A/ARF-mediated tumor suppression, or act through alternative uncharacterized tumor suppressor pathways in ALL.
Dissecting the contribution of altered Ikaros activity to BCR-ABL1 leukaemogenesis should not only provide valuable mechanistic insights, but will also help to determine if the presence of this genetic lesion can be used to gain a therapeutic advantage against this aggressive leukemia.
Table 1. The acute lymphoblastic leukemia cases studied by Affymetrix SNP array. * Adult BCR-ABL1 B-ALL cases. †IKZF1 sequencing for these cases was not performed or failed due to limited DNA. ND, not done.
Table 5 shows IKZF1 genomic quantitative PCR and fluorescent in situ hybridization (FISH) results. Genomic qPCR of all 7 coding IKZF1 exons was performed for 8 cases to verify the extent of IKZF1 deletions. In the remaining cases, a subset of exons was tested to confirm the focal IKZF1 deletions. IKZF1/RNAseP qPCR ratios of less than 0.75 indicate deletion, and ratios of less than 0.3 indicate homozygous deletion. e, exon; homo, homozygous (deletion); WT, wild type.
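The ratio cut-offs stated for Table 5 can be expressed as a simple classification rule. The sketch below is illustrative: the function and constant names are hypothetical, the example ratios are invented, and treating any ratio of 0.75 or above as wild type is an assumption inferred from the stated thresholds rather than something the table legend says explicitly.

```python
# Thresholds taken from the Table 5 legend; names are hypothetical.
DELETION_THRESHOLD = 0.75
HOMOZYGOUS_THRESHOLD = 0.3


def classify_ikzf1_ratio(ratio):
    """Classify an IKZF1/RNAseP genomic qPCR ratio for a single exon."""
    if ratio < 0:
        raise ValueError("a qPCR ratio cannot be negative")
    if ratio < HOMOZYGOUS_THRESHOLD:
        return "homozygous deletion"
    if ratio < DELETION_THRESHOLD:
        return "deletion"
    # Assumption: ratios at or above 0.75 are reported as wild type.
    return "wild type"


# Invented per-exon ratios for one hypothetical case (not study data).
example_ratios = {"e1": 0.98, "e3": 0.52, "e5": 0.12}
calls = {exon: classify_ikzf1_ratio(r) for exon, r in example_ratios.items()}
```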
heptamer RSSs located immediately within the deleted segment. Representative BCR-ABL1 cases are shown. The heptamer RSSs are shown underlined and in bold, and nucleotides matching the RSS exactly are shown in red. The additional nucleotides between the consensus genomic sequence suggest the action of TdT. The intron 2 junction sequences for the BCR-ABL-SNP clones #1, 4, 7, 10, 12, 13, 16, 21, 26, 33, 34, 38, 39, and 42 are set forth in SEQ ID NOS: 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, and 20, respectively. The normal sequence of intron 2 is set forth in SEQ ID NO:21. The intron 6 junction sequences for the BCR-ABL-SNP clones #1, 4, 7, 10, 12, 13, 16, 21, 26, 38, 39, 42, 33 and 34 are set forth in SEQ ID NOS:22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, and 35, respectively. The normal sequence of intron 6 is set forth in SEQ ID NO:36.
Table 9 shows the sequencing of IKZF1 Δ3-6 deletions that demonstrates the restricted location of the breakpoints in both introns 2 and 6, and the heptamer RSSs located immediately within the deleted segment. The heptamer RSSs are shown underlined and in bold, and nucleotides matching the RSS exactly are shown in red. The additional nucleotides between the consensus genomic sequence suggest the action of terminal deoxynucleotidyl transferase (TdT).
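Locating heptamer recombination signal sequences (RSSs) near a breakpoint can be sketched as a simple string scan. This is an illustration under stated assumptions: it uses the canonical consensus heptamer CACAGTG, which this document does not itself spell out, it reports exact matches only, and the function names and the toy sequence are hypothetical rather than taken from SEQ ID NOS: 7-36.

```python
# Canonical consensus RSS heptamer (an assumption; not quoted in the text).
RSS_HEPTAMER = "CACAGTG"

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def find_heptamers(seq, heptamer=RSS_HEPTAMER):
    """Return (0-based start, strand) for every exact heptamer match.

    "+" marks a match to the heptamer as written, "-" a match to its
    reverse complement, i.e. a heptamer lying on the opposite strand.
    """
    seq = seq.upper()
    targets = {"+": heptamer, "-": reverse_complement(heptamer)}
    hits = []
    for start in range(len(seq) - len(heptamer) + 1):
        window = seq[start:start + len(heptamer)]
        for strand, target in targets.items():
            if window == target:
                hits.append((start, strand))
    return hits


# Toy sequence with one heptamer on each strand (not a real breakpoint).
hits = find_heptamers("TTTCACAGTGAAACACTGTGTT")
```

A real analysis would typically also tolerate near matches, since cryptic RSSs often deviate from the consensus; exact matching is used here only to keep the sketch short.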
Despite best current therapy, up to 20% of pediatric acute lymphoblastic leukemia (ALL) cases relapse. Recent genome-wide analyses have identified a high frequency of recurring DNA copy number abnormalities (CNA) in ALL, but the prognostic impact of these abnormalities has not been defined. We studied a cohort of 221 children with high-risk B-progenitor ALL that excluded known very high risk ALL subtypes (BCR-ABL1, hypodiploid and infant ALL), using single nucleotide polymorphism microarrays, transcriptional profiling and resequencing. A CNA poor outcome predictor was identified and tested in an independent validation cohort of 258 B-progenitor ALL cases.
Over 50 recurring CNA were identified, most commonly targeting genes encoding regulators of B-lymphoid development (66.8% of cases), with PAX5 targeted in 31.7% and IKZF1 in 28.6%. We identified a CNA predictor of very poor outcome in an independent validation cohort (P<0.0001), that was strongly associated with deletion or mutation of IKZF1, a gene that encodes the lymphoid transcription factor IKAROS. The gene expression signature of the poor outcome group was characterized by reduced expression of B-lineage specific genes, and was highly similar to the signature of BCR-ABL1 ALL, another high-risk ALL subtype also characterized by a high frequency of IKZF1 deletion. Genetic alterations of IKZF1 identify a subgroup of ALL with very poor outcome. Incorporation of molecular tests to identify IKZF1 alterations in diagnostic leukemic blasts should improve the ability to accurately stratify patients for appropriate therapy.
Cure rates for children with acute lymphoblastic leukemia (ALL) now exceed 80%1, but current therapies result in substantial toxicities, and up to 20% of ALL cases relapse2. In B-progenitor ALL, a number of recurring chromosomal abnormalities are used in risk stratification, including hyperdiploidy, hypodiploidy, translocations t(12;21)[ETV6-RUNX1], t(9;22)[BCR-ABL1], t(1;19)[TCF3-PBX1] and rearrangement of MLL. Although treatment failure is common in BCR-ABL1 and MLL-rearranged ALL, relapse occurs in all subtypes, and the biological basis of resistance to therapy is poorly understood.
Recent genome-wide analyses of DNA copy number abnormalities (CNA) have identified numerous recurring genetic alterations in ALL3-6. Genes encoding transcriptional regulators of B lymphoid development, including PAX5, EBF1 and IKZF1 are mutated in over 40% of B-progenitor ALL3. Notably, deletion of IKZF1, encoding the early lymphoid transcription factor IKAROS, is a near obligate event in BCR-ABL1 positive ALL, and at the progression of chronic myeloid leukemia to lymphoid blast crisis. Other CNAs involve tumor suppressors and cell cycle regulators (CDKN2A/B, RB1, PTEN, ETV6), regulators of apoptosis (BTG1), drug receptor genes (NR3C1 and NR3C2), and lymphoid signaling molecules (BTLA, CD200)3.
A systematic analysis of associations between CNA and outcome in ALL has not been performed. Here we report a study examining CNAs in a cohort of 221 children with high-risk ALL. We identified a CNA outcome predictor driven by deletion of IKZF1 that predicts a high risk of relapse. Association of this CNA predictor with poor outcome was validated in an independent cohort of 258 B-progenitor ALL cases. This CNA predictor was associated with a gene expression signature characterized by downregulation of B-lymphoid developmental genes and was also highly related to the expression signature of BCR-ABL1 pediatric ALL.
Two patient cohorts were examined, the first comprising 221 non-infant B-progenitor ALL cases treated on the Children's Oncology Group (COG) P9906 study that incorporated an augmented intensive regimen of post-induction chemotherapy (Table 11)7,8. All patients were at high risk of treatment failure based on the presence of central nervous system or testicular disease, MLL gene rearrangement, or age, gender and presentation leukocyte count9. BCR-ABL1 and hypodiploid ALL, and patients with induction failure were excluded. One hundred seventy cases (76.9%) lacked a recurring chromosomal abnormality. The validation cohort comprised 258 children with B-progenitor ALL treated at St Jude Children's Research Hospital3,5, and included both standard and high risk patients, common aneuploidies and recurring translocations (including 21 BCR-ABL1 positive cases; Table 12). Informed consent and Institutional Review Board approval were obtained for both cohorts. Minimal residual disease (MRD) was measured at days 8 (peripheral blood) and 29 (bone marrow) of initial induction chemotherapy for 197 cases in the P9906 cohort, and at days 19 and 46 for 160 cases in the St Jude cohort using multiparameter immunophenotyping as previously described8,10,11.
The P9906 cohort comprised 221 B-progenitor ALL cases treated on the Children's Oncology Group P9906 study with an augmented intensive regimen of post-induction chemotherapy7 (Table 11). All patients were high risk based on the presence of central nervous system or testicular disease, MLL rearrangement, or based on age, gender and presentation leukocyte count28. BCR-ABL1 and hypodiploid ALL, and cases of primary induction failure were excluded. Hyperdiploid (as defined by trisomy of chromosomes 4 and 10 on cytogenetic analysis) and ETV6-RUNX1 cases were excluded unless CNS or testicular involvement was present at diagnosis. Of 276 cases enrolled, 271 were eligible, and 221 had suitable material for genomic analysis. Twenty-five (11.3%) cases were TCF3-PBX1 positive, 19 harbored MLL-rearrangements, four were hyperdiploid, and three were ETV6-RUNX1 positive.
One hundred seventy (76.9%) lacked a recurring chromosomal abnormality. The validation cohort comprised 258 B-progenitor ALL cases treated at St Jude Children's Research Hospital3,29, and included 44 high hyperdiploid (greater than 50 chromosomes), 10 hypodiploid, 17 TCF3-PBX1 positive, 50 ETV6-RUNX1 positive, 21 BCR-ABL1 positive and 24 MLL rearranged B-progenitor ALL cases, and 92 cases with low hyperdiploid, pseudodiploid, normal or miscellaneous karyotypes. These cases were treated on St Jude Total XI (N=8), XII (N=13), XIII (N=105), XIV (N=4), XV (N=114) and Interfant-99 (infant; N=5) protocols30-34. Nine cases were treated off protocol. The clinical protocol was approved by the National Cancer Institute and by the Institutional Review Board at each of the Children's Research institutions. Patients and/or a parent/guardian provided informed consent to participate in the clinical trial and for future research using clinical specimens.
Leukemic and remission samples from all P9906 cases were genotyped using 250K Sty and Nsp SNP arrays (Affymetrix, Santa Clara, Calif.). St Jude samples were genotyped with SNP 6.0 arrays (N=36), 250K Sty and Nsp arrays (N=37), and 250K and 50K arrays (N=185). SNP array analyses, gene expression profiling, and the use of Gene Set Enrichment Analysis36 and Gene Set Analysis13 to compare gene expression signatures and examine associations between gene sets and outcome are described herein.
All cases in the P9906 cohort were genotyped using 250K Sty and Nsp arrays, which together examine over 500,000 genomic loci. Thirty-six cases from the St Jude cohort were genotyped using SNP 6.0 arrays, which examine over 1.87 million loci; 37 were genotyped with 250K Sty and Nsp arrays; and the remaining 185 were genotyped with both 250K arrays and two 50K arrays, which together examine over 615,000 markers. SNP array data preprocessing and inference of DNA copy number abnormalities (CNAs) and loss-of-heterozygosity (LOH) were performed as previously described3,5. Briefly, SNP calls were generated using the DM or Birdseed algorithms in GTYPE 4.0 or Genotyping Console (Affymetrix). Summarization of probe level data was performed using the PM/MM (50K and 250K arrays) or PM-only (SNP 6.0 arrays) model-based expression algorithms in dChip (www.dchip.org)11. Normalization of array signals was performed using a reference normalization algorithm that utilizes only those SNP probes from diploid regions of each array to guide normalization3. To identify all tumor-acquired regions of CNA for each sample, circular binary segmentation36 (implemented as the DNAcopy package in R) was performed by directly comparing each tumor sample to the corresponding remission sample.
Genomic resequencing of all the coding exons of PAX5, IKZF1 and EBF1 was performed for all P9906 samples by Agencourt Biosciences (Beverly, Mass.). Genomic DNA was amplified in 384 well plates, with each PCR reaction containing 10 ng DNA, 1× HotStar buffer, 0.8 mM dNTPs, 1 mM MgCl2, 0.2 U HotStar enzyme (Qiagen) and 0.2 µM forward and reverse primers in 10 µl reaction volumes. PCR cycling parameters were: one cycle of 95° C. for 15 min, 35 cycles of 95° C. for 20 s, 60° C. for 30 s and 72° C. for 1 min, followed by one cycle of 72° C. for 3 min. PCR products were purified using proprietary large scale automated template purification systems using solid-phase reversible immobilization, and then sequenced using dye-terminator chemistry and ABI 3700/3730 machines (Applied Biosystems, Foster City, Calif.). Base calls and quality scores were determined using the program PHRED37,38.
Sequence variations including substitutions and insertion/deletions (indels) were analyzed using the SNPdetector39 and the IndelDetector40 software. A usable read was required to have at least one 30-bp window in which 90% of the bases have a PHRED quality score of at least 30. Poor quality reads were filtered prior to variation detection. The minimum threshold of secondary to primary peak ratio for substitution and indel detection was set to 20% and 10%, respectively. All sequence variations were annotated using a previously developed variation annotation pipeline41. Any variation that did not match a known polymorphism (defined as a dbSNP record that does not belong to the OMIM SNP or COSMIC somatic variation databases42,43) and resulted in a non-silent amino acid change was considered a putative mutation.
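The read-usability criterion above can be illustrated with a minimal sketch. It assumes per-base PHRED scores are already available as a list of integers; the function name and its parameters are hypothetical and are not taken from SNPdetector or IndelDetector.

```python
def is_usable_read(quals, window=30, min_fraction=0.9, min_q=30):
    """Return True if at least one `window`-bp stretch of the read has
    at least `min_fraction` of its bases with PHRED score >= `min_q`.

    Illustrative helper only; not the SNPdetector implementation.
    """
    if len(quals) < window:
        return False
    # 1 marks a base that meets the quality cutoff, 0 one that does not.
    flags = [1 if q >= min_q else 0 for q in quals]
    count = sum(flags[:window])
    if count / window >= min_fraction:
        return True
    # Slide the window one base at a time, updating the running count.
    for i in range(window, len(flags)):
        count += flags[i] - flags[i - window]
        if count / window >= min_fraction:
            return True
    return False
```

With the default settings, a 30-bp window passes when at least 27 of its bases reach a PHRED score of 30.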
All putative sequence mutations were confirmed by repeat genomic PCR and sequencing of both tumor and remission DNA. Where possible, expression of mutated PAX5 and IKZF1 alleles was confirmed by amplification and direct sequencing of full length PAX5 and IKZF1 cDNA as previously described3,29. Transcripts were then cloned into pGEM-T-Easy (Promega, Madison, Wis.) and multiple colonies sequenced. Confirmation of CNAs involving PAX5 and IKZF1 by genomic quantitative PCR was performed as previously described3,29.
Missense substitutions were generated in the PAX5 (residues 1-149)/ETS-1 (residues 331-440)/DNA structure44 and subjected to local refinement using the program O21. Structural representation was performed with the program PyMOL (DeLano Scientific)46.
Supervised principal components (SPC) analysis46,47 was used to examine associations between CNAs and outcome of therapy in a genome-wide fashion. This method has previously been used to examine associations between transcriptional profiling data and outcome in cancer47. In this approach, regions of somatic DNA deletion for each sample were transformed into a matrix in which each column represented an individual case, each row represented an individual gene, and each cell represented copy number status for each gene targeted by CNAs in at least one case. Using the P9906 cohort as the training set, a modified univariate Cox score was calculated for the association between copy number status of each gene and event-free survival, and genes whose Cox score exceeded a threshold that best predicted survival were used to carry out supervised principal components analysis. To determine the Cox threshold, the training set was split and principal components were derived from one half of the samples, and then used in a Cox model to predict survival in the other half. By varying the threshold of Cox scores and using twofold cross-validation, this process was repeated ten times, and a threshold of ±1.8 (averaged over ten separate repeats of this procedure) was used to generate the principal components subsequently used to predict outcome.
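The transformation of per-sample deletion regions into a gene-by-case matrix can be sketched as follows. This is a simplified illustration: copy number status is reduced to 1 (deleted) or 0 (not deleted), a gene counts as targeted when any deletion overlaps it, and the gene names, case names and coordinates are hypothetical placeholders rather than study data.

```python
def deletion_matrix(genes, deletions_by_case):
    """Build a gene-by-case matrix of deletion status.

    genes: dict mapping gene name -> (chrom, start, end)
    deletions_by_case: dict mapping case id -> list of (chrom, start, end)
    Returns (gene_names, case_names, matrix), where matrix[i][j] is 1 if
    gene i is overlapped by a deletion in case j, else 0. Genes that are
    not deleted in any case are dropped, mirroring the restriction to
    genes targeted in at least one case.
    """
    case_names = sorted(deletions_by_case)
    rows = {}
    for gene, (g_chrom, g_start, g_end) in genes.items():
        row = []
        for case in case_names:
            hit = any(
                chrom == g_chrom and start <= g_end and end >= g_start
                for chrom, start, end in deletions_by_case[case]
            )
            row.append(1 if hit else 0)
        if any(row):
            rows[gene] = row
    gene_names = sorted(rows)
    return gene_names, case_names, [rows[g] for g in gene_names]


# Hypothetical toy input, not study data.
toy_genes = {"geneA": ("chr7", 100, 200), "geneB": ("chr9", 500, 600),
             "geneC": ("chr1", 10, 20)}
toy_deletions = {"case1": [("chr7", 150, 300)],
                 "case2": [("chr9", 0, 499)],
                 "case3": [("chr7", 50, 120), ("chr9", 550, 560)]}
toy_result = deletion_matrix(toy_genes, toy_deletions)
```

Each row of the resulting matrix would then be scored against event-free survival before the principal components step.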
For each case, we used the first principal component in a regression model to calculate an SPC risk score that represents the sum of the weighted copy number levels for each gene found to be significantly associated with prognosis. To validate the SPC predictor, we computed risk scores for each of the 258 cases in the St Jude validation cohort using the model developed in the P9906 training set, and tested whether these scores were correlated with survival. To illustrate the performance of the SPC risk score in predicting survival, cases in the validation cohort were classified as being high or low risk according to the calculated SPC risk score, and the cumulative incidence of hematologic relapse and any relapse in each SPC risk group was analyzed using Gray's estimator47. To examine the role of individual genes in determining outcome, we computed importance scores for genes with Cox scores exceeding the threshold defined by cross validation. The importance score is equivalent to the correlation between each gene and the first supervised principal component. Associations between genes with the top importance scores and hematologic and any relapse were then calculated using Gray's estimator. Event-free survival (EFS) was defined as the time from diagnosis until the date of failure (relapse, death, or second malignancy) or until the last follow-up date for all event-free survivors. Associations between genetic variables (deletions±sequence mutations of individual genes, presence and number of B-cell pathway lesions) and EFS were estimated by the methods of Kaplan and Meier. Standard errors were calculated by the methods of Peto et al48. The Mantel-Haenszel test was used to compare EFS estimates for patients with and without a lesion at each locus49. The proportional hazards model of Fine and Gray was used to adjust for age, presentation leukocyte count, cytogenetic subtype and levels of minimal residual disease (MRD)50.
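As an illustration of the Kaplan-Meier estimate of EFS referred to above, the following is a minimal product-limit sketch. It is not the R, SAS or SPLUS code used in the study; the function name is hypothetical and the input values are toy numbers.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times: follow-up time for each patient
    events: 1 if the patient failed (relapse, death or second
            malignancy) at that time, 0 if censored at last follow-up
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one failure occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            leaving += 1
            i += 1
        if failures > 0:
            survival *= 1 - failures / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Toy data: five patients, failures at times 1, 3 and 5.
toy_curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
```

Censored patients contribute to the number at risk up to their last follow-up time but do not lower the survival estimate themselves.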
Analyses were performed using R (www.r-project.org)51, SAS (SAS v9.1.2, SAS Institute, Cary, N.C.) and SPLUS (SPLUS 7.0, Insightful Corp., Palo Alto, Calif.).
To evaluate associations between genetic alterations and MRD, MRD data were converted into an ordinal variable (<0.01%, 0.01%≦MRD<1%, and ≧1%) and association analyses were performed using the Chi-Square test (FREQ procedure, SAS) with estimation of false discovery rate (MULTTEST, SAS). Significantly associated variables were then adjusted for age, presentation leukocyte count and genetic subtype using logistic regression.
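The conversion of MRD levels into three ordinal categories, and the contingency table on which a chi-square test operates, can be sketched as follows. This is an illustrative stand-in for the SAS FREQ procedure, not a reproduction of it; the function names are hypothetical, and only the test statistic is computed (no p-value).

```python
from collections import Counter


def mrd_category(mrd_percent):
    """Map an MRD level in percent to 0 (<0.01%), 1 (0.01% to <1%) or 2 (>=1%)."""
    if mrd_percent < 0.01:
        return 0
    if mrd_percent < 1.0:
        return 1
    return 2


def contingency_table(altered_flags, mrd_percents):
    """Rows: lesion present, lesion absent. Columns: the three MRD categories."""
    counts = Counter(
        (bool(a), mrd_category(m)) for a, m in zip(altered_flags, mrd_percents)
    )
    return [[counts[(status, cat)] for cat in range(3)] for status in (True, False)]


def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:  # skip cells in empty rows or columns
                stat += (observed - expected) ** 2 / expected
    return stat


# Toy input: six hypothetical cases, two of which carry the lesion.
toy_table = contingency_table(
    [True, True, False, False, False, False],
    [2.0, 0.5, 0.001, 0.005, 0.02, 0.0],
)
toy_stat = chi_square_statistic(toy_table)
```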
Gene expression profiling was performed using U133 Plus 2 microarrays (Affymetrix) for 198 P9906 samples, and using U133A microarrays (Affymetrix) for 175 St Jude samples. Probe intensities were generated using the MAS 5.0 algorithm, probe sets called absent in all samples in each cohort were excluded, and expression data log-transformed. To define the gene expression signature of poor outcome ALL in each cohort, we used limma (Linear Models for Microarray Analysis)52, the empirical Bayes t-test implemented in Bioconductor53 and the Benjamini-Hochberg method of false discovery rate (FDR) estimation54 to identify probe sets differentially expressed between cases defined as high or low risk according to their SPC risk score. This approach was also used to define the gene expression signature of BCR-ABL1 positive de novo pediatric ALL in the St Jude cohort.
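The Benjamini-Hochberg adjustment named above can be sketched in a few lines. This is a generic implementation of the step-up procedure, given for illustration; it is not the Bioconductor code used in the study, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), and
    a running minimum taken from the largest rank downwards keeps the
    adjusted values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values for four hypothetical probe sets.
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A probe set would be called differentially expressed when its adjusted value falls at or below the chosen FDR threshold.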
To assess similarity between the high-risk gene expression signatures of the P9906 and St Jude cohorts, and between the high-risk signatures and the signature of BCR-ABL1 positive ALL, gene set enrichment analysis (GSEA)55 and direct comparison of the signatures were performed. Gene sets of the top up- and down-regulated genes in the signatures of high risk P9906 and St Jude ALL, and BCR-ABL1 positive ALL were created and added to the collection of curated gene sets available at the Molecular Signatures Database (www.broad.mit.edu/gsea/msigdb/). GSEA of high risk ALL was then performed for each cohort using this expanded collection of gene sets. In a complementary approach, we determined the fraction of the top 100 differentially expressed probe sets in P9906 high-risk ALL that were also differentially expressed in St Jude BCR-ABL1 positive ALL (at an FDR threshold of 5%). The Gene Set Analysis (GSA) algorithm, a modification of GSEA that allows testing of associations between gene sets and time-dependent variables such as survival time56, was used to examine associations between gene sets and EFS in the P9906 cohort.
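The complementary overlap comparison can be illustrated with a short sketch. It assumes a ranked list of probe set identifiers from one signature and a mapping of probe set to FDR from the other; the function name and probe set identifiers are hypothetical, and treating an FDR exactly equal to the threshold as significant is an assumption.

```python
def overlap_fraction(top_probe_sets, fdr_by_probe_set, fdr_threshold=0.05):
    """Fraction of `top_probe_sets` that are also significant elsewhere.

    top_probe_sets: probe set ids from the first signature (e.g. top 100)
    fdr_by_probe_set: dict of probe set id -> FDR in the second signature
    Probe sets missing from the second data set count as not significant.
    """
    if not top_probe_sets:
        return 0.0
    shared = [
        p for p in top_probe_sets
        if fdr_by_probe_set.get(p, 1.0) <= fdr_threshold
    ]
    return len(shared) / len(top_probe_sets)


# Toy input with hypothetical probe set identifiers.
toy_fraction = overlap_fraction(
    ["ps1", "ps2", "ps3", "ps4"],
    {"ps1": 0.01, "ps2": 0.2, "ps3": 0.05},
)
```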
P9906 SNP array data are available to academic researchers upon request at caArray at CaBIG (the National Cancer Institute Cancer Biomedical Informatics Grid) (www.array.nci.nih.gov/caarray/project/mulli-00112), and St Jude SNP array data at the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO)57,58 (www.ncbi.nlm.nih.gov/geo, accession GSE11445). Primary gene expression data are available at GEO (P9906 data accession GSE11877, St Jude data accession GSE12995) and, for P9906 data, at caArray. All P9906 SNP array, gene expression, and sequence analysis data are available at http://target.cancer.gov/data/. All sequencing traces and sequencing primer information have been deposited with NCBI's trace archive.
We identified a mean of 8.36 CNAs per case in the P9906 cohort (Table 13), and over 50 recurring CNA where the minimal common region of change involved one or few genes (Table 14). The most common alterations were deletions of CDKN2A/B (45.7%), the lymphoid transcription factor genes PAX5 (31.7%,
Twenty-two cases harbored 27 PAX5 sequence mutations (Table 17). The most frequent was the previously identified P80R mutation in the paired domain of PAX5 that results in marked attenuation of the DNA-binding and transactivating activity of PAX53 (
Sixty-three (28.6%) cases had deletion of IKZF1 (Tables 14 and 18,
Associations with Outcome
Supervised principal components (SPC) analysis of the P9906 cohort identified associations between copy number status of 23 genes and treatment outcome (Table 21). The resulting SPC risk score was associated with the risk of experiencing any adverse event in the St Jude validation cohort. The 10 year incidence of events in SPC-predicted high risk cases was 59.3% (95% confidence interval (CI) 43.6%-75.1%), compared to 26.7% (CI 19.5%-33.9%) for predicted low risk cases (P<0.0001;
Alterations of IKZF1, BTLA/CD200 and EBF1 were most significantly associated with the P9906 SPC predictor (Table 21). Of these, only IKZF1 was significantly associated with the predictor defined in the St Jude cohort (Table 22). Deletion or mutation of IKZF1 was significantly associated with increased risk of relapse and adverse events in both cohorts (Table 37, FIG. 14A,D, Tables 23-25). IKZF1 deletions were also associated with inferior outcome in St Jude BCR-ABL1 negative ALL (Table 37,
Furthermore, alteration of IKZF1 had independent prognostic significance after adjusting for age, presenting leukocyte count and cytogenetic subtype (Table 25). Deletions of EBF1 and BTLA/CD200 were only associated with inferior outcome in the P9906 cohort. Whilst an increasing number of genetic alterations targeting B cell development was also associated with inferior outcome (Supplementary Tables 23-25), no independent association between PAX5 lesions and outcome was observed in either cohort.
Associations with Minimal Residual Disease During Remission Induction Therapy
Consistent with previous data8,10,11, elevated MRD levels were strongly associated with increased risk of relapse in both cohorts (COG day 8 P<0.0001, day 29 P<0.0001; St Jude day 19 P<0.0001, day 46 P<0.0001). IKZF1 and EBF1 alterations were strongly associated with elevated day 29 MRD levels in the P9906 cohort. Sixteen of 66 (24.2%) IKZF1 deleted/mutated cases had high-level (>1%) MRD at day 29, compared to 6.5% of those without this abnormality (P=0.0002, Table 38 and Table 27). These associations remained significant in multivariable analysis adjusting for age, presentation leukocyte count and genetic subtype (EBF1 odds ratio (OR) 5.5, P=0.001; IKZF1 OR 2.7, P=0.002; Table 28). Importantly, the associations of IKZF1 with relapse and adverse events remained significant after adjusting for age, leukocyte count, subtype and MRD in this cohort (Table 29).
IKZF1 alterations were also associated with outcome in the subgroup of St Jude cases with MRD data (N=160; Tables 30 and 31). Deletion or mutation of IKZF1 was strongly associated with elevated MRD levels in this subset of patients. Thirteen (61.9%) IKZF1 deleted/mutated cases had high (≧1.0%) levels of residual disease at day 19, compared to 9.3% of cases without deletion (P<0.0001, Table 38 and Table 32). This association was also observed for day 46 MRD (33.3% v 0.7%, P<0.0001, Table 38 and Table 33). IKZF1 status was also associated with both day 19 (P=0.0001) and day 46 MRD (P=0.0001) in the BCR-ABL1 negative St Jude cohort (Tables 34 and 35).
The association between IKZF1 alterations and outcome in both cohorts, as well as prior data showing that deletion of IKZF1 is frequent in BCR-ABL1 positive ALL5, suggest that IKAROS abnormalities are important in the pathogenesis of both poor outcome, BCR-ABL1 negative ALL and BCR-ABL1 positive ALL. To explore this, we used gene set enrichment analysis to compare the gene expression signatures of P9906 and St Jude poor outcome ALL, and BCR-ABL1 positive and P9906 poor outcome (BCR-ABL1 negative) ALL. This analysis revealed significant similarity of signatures of the poor outcome P9906 and St Jude ALL groups (Supplementary
Accurate risk stratification is critical to ensure that patients with high-risk ALL receive treatment of appropriate intensity, while low-risk cases are spared unnecessary toxicity. Current risk stratification is primarily based upon clinical variables, immunophenotype, detection of sentinel cytogenetic/molecular lesions, and early response to therapy1. However, a substantial proportion of patients relapse but have no known risk factors at diagnosis. It is thus critical to identify new markers that improve outcome prediction and identify new treatment targets. Here we have used high-resolution genome-wide copy number analysis to identify genetic lesions associated with outcome.
The most striking finding was a strong association between deletions or mutation of IKZF1 (IKAROS) and poor outcome in two independent cohorts notable for markedly different sample composition and treatment schedules. Importantly, the association of IKZF1 status and outcome was independent of age, presenting leukocyte count, cytogenetic subtype and MRD levels, indicating that IKZF1 profiling at diagnosis will be useful in identifying individuals at high risk of treatment failure. Moreover, the gene expression signatures of poor outcome (IKZF1-deleted) P9906 and St Jude ALL were highly similar, and also similar to the signature of BCR-ABL1 positive ALL, where IKZF1 deletion is extremely common. As BCR-ABL1 ALL also has a poor prognosis, these findings suggest that IKZF1 mutation may be a key determinant of the poor outcome of both BCR-ABL1 positive and negative disease. The similarity of the gene expression signatures of IKZF1-mutated, BCR-ABL1 negative ALL and BCR-ABL1 positive ALL raises the possibility that the poor outcome, IKZF1-deleted cases may harbor hitherto unidentified activating mutations in tyrosine kinases.
IKAROS is a transcription factor with well-established roles in lymphopoiesis and cancer19. Normal IKAROS contains four N-terminal zinc fingers required for normal DNA binding, and two C-terminal zinc fingers that mediate dimerization. IKAROS is required for the development of all lymphoid lineages19, and mice heterozygous for a dominant negative IKAROS mutation develop aggressive T-lineage hematopoietic disease20. Ikzf1 is also a common target of integration in murine retroviral mutagenesis studies21.
Alternate IKAROS transcripts have been widely described in normal hematopoietic cells and leukemic blasts22. Isoforms lacking most or all of the N-terminal zinc fingers have attenuated DNA binding capacity but retain their ability to homo- and heterodimerize, and thus act as dominant negative inhibitors of IKAROS23. These isoforms have been reported at variable frequency in ALL22. Recently, we reported a near obligate deletion of IKZF1 in BCR-ABL1 positive ALL and lymphoid blast crisis CML, suggesting that perturbation of IKAROS is a key event in the pathogenesis and progression of BCR-ABL1 ALL5. Importantly, there was complete correlation between the extent of genomic deletion and the expression of aberrant IKAROS isoforms5. For example, all cases expressing the dominant negative Ik6 isoform, which lacks exons 3-6 and all N-terminal zinc fingers, had genomic deletions of these exons5.
The present study demonstrates that IKZF1 alterations are present in a substantial proportion of BCR-ABL1 negative B-progenitor ALL cases, predominantly in cases that lack other common recurrent cytogenetic abnormalities (38.8% of P9906 and 22.8% of St Jude cases with normal or miscellaneous cytogenetic abnormalities). As in BCR-ABL1 positive ALL, IKZF1 deletions involved either the entire locus or subsets of exons, and are predicted to result in either haploinsufficiency or the expression of dominant negative IKAROS isoforms. Moreover, we have identified sequence mutations of IKZF1 in ALL that are predicted to result in loss of normal IKAROS function or expression of a novel dominant negative isoform, G158S.
Using GSEA, we found negative enrichment of genes involved in normal B lymphoid signaling and development in poor outcome ALL. This is compatible with the known requirement for IKAROS in lymphoid development19, and previous studies showing that expression of dominant negative IKAROS isoforms impairs B lymphoid differentiation24. Together, these data suggest that attenuation of normal IKAROS activity and the resulting block in lymphoid maturation render leukemic cells less susceptible to eradication by chemotherapeutic agents. Whether this relates to enrichment for properties that are characteristic of leukemia initiating or stem cells, including their inherent drug resistance mechanisms, remains to be determined25.
Notably, we did not find outcome to be associated with extensively studied loci such as CDKN2A/B26,27, or with PAX5 status, despite PAX5 alterations being the most common B-cell pathway lesions observed in both cohorts. This suggests that PAX5 is important in establishing the leukemic clone, whereas deletions of IKZF1 may also directly contribute to treatment resistance. Experimental studies addressing the relative contribution of these two lesions to leukemogenesis and treatment resistance will provide valuable insights into how these genetic alterations contribute to the molecular pathology of ALL.
In summary, we have identified alterations of IKZF1 as a new prognostic marker in childhood B-progenitor ALL, and integrated genomic analysis suggests that IKZF1 directly contributes to treatment resistance in ALL. These results provide a strong rationale for the integration of IKZF1 status analysis in the diagnostic evaluation of patients with ALL.
Most children with acute lymphoblastic leukemia (ALL) can be cured, but for the subset of patients who relapse, the prognosis is dismal. To explore the genetic basis of relapse, we performed genome-wide DNA copy number analyses on matched diagnosis and relapse samples from 61 patients with ALL. In the majority of cases, the diagnosis and relapse samples showed different patterns of genomic copy number abnormalities (CNAs), with the abnormalities acquired at relapse preferentially affecting genes involved in cell cycle regulation and B cell development. Although the diagnosis and relapse samples were genetically related, most relapse samples lacked some of the CNAs present at diagnosis, suggesting that the cells responsible for relapse are ancestral to the primary leukemia cells. Backtracking studies demonstrated that cells corresponding to the relapse clone were often present as minor sub-populations at diagnosis. These data suggest that genomic abnormalities contributing to ALL relapse are selected for during treatment and that the signaling pathways affected by these acquired alterations may be rational targets for therapeutic intervention.
Despite cure rates for pediatric acute lymphoblastic leukemia (ALL) exceeding 80% (58), treatment failure remains a significant problem. Relapsed ALL ranks as the fourth most common childhood malignancy and has an overall survival rate of only 30% (59, 60). Important biological and clinical differences have been identified between diagnostic and relapsed leukemic cells including the acquisition of new chromosomal abnormalities, gene mutations, and reduced responsiveness to chemotherapeutic agents (61-64). However, many questions remain about the molecular abnormalities responsible for relapse, as well as the relationship between the cells giving rise to the primary and recurrent leukemias in individual patients.
Genome-wide analyses of DNA copy number abnormalities (CNAs) and loss-of-heterozygosity (LOH) using single nucleotide polymorphism (SNP) arrays have provided important insights into the pathogenesis of newly diagnosed ALL. We have previously reported multiple recurring somatic CNAs in genes encoding transcription factors, cell cycle regulators, apoptosis mediators, lymphoid signaling molecules and drug receptors in B-progenitor and T-lineage ALL (65, 66). To gain insights into the molecular lesions responsible for ALL relapse, we have now performed genome-wide CNA and LOH analyses on matched diagnostic and relapse bone marrow samples from 61 pediatric ALL patients (data not shown). These samples included 47 B-progenitor and 14 T-lineage ALL (T-ALL) cases (67). Samples were flow sorted to ensure at least 80% tumor cell purity prior to DNA extraction (data not shown). DNA copy number and LOH data were obtained using Affymetrix SNP 6.0 (47 diagnosis-relapse pairs) or 500K arrays (14 pairs). Remission bone marrow samples were also analyzed for 48 patients (data not shown).
These analyses identified a mean of 10.8 somatic CNAs per B-ALL case at diagnosis, and 7.1 CNAs per T-ALL case (data not shown). 48.9% of B-ALL cases at diagnosis had CNAs in genes known to regulate B-lymphoid development, including PAX5 (N=12), IKZF1 (N=12), EBF1 (N=2), and RAG1/2 (N=2) (tables S5, S6 and S9). Deletion of CDKN2A/B was present in 36.2% of B-ALL and 71.4% T-ALL cases, and deletion of ETV6 in 11 B-ALL cases. We also identified novel CNAs involving ARID2, which encodes a member of a chromatin remodeling complex (68), the cyclic AMP regulated phosphoprotein ARPP-21, the IL3RA and CSF2RA cytokine receptor genes (data not shown), and the Wnt/β-catenin pathway genes CTNNB1, WNT9B and CREBBP (data not shown).
Although evidence for clonal evolution and/or selection at relapse has been previously reported (61, 63, 64, 69-78), we observed a striking degree of change in the number, extent, and nature of CNAs between diagnosis and relapse in paired samples of ALL. A significant increase in the mean number of CNAs per case was observed in relapse B-ALL samples (10.8 at diagnosis versus 14.0 at relapse, P=0.0005), with the majority being additional regions of deletion (6.8 deletions/case at diagnosis versus 9.2/case at relapse, P=0.0006; and 4.0 gains/case at diagnosis versus 4.8 gains/case at relapse, P=0.03; data not shown). By contrast, no significant changes in lesion frequency were observed in T-ALL (data not shown).
The majority (88.5%) of relapse samples harbored at least some of the CNAs present in the matched diagnosis sample, indicating a common clonal origin (data not shown); however, 91.8% exhibited a change in the pattern of CNAs from diagnosis to relapse (data not shown). 34% acquired new CNAs, 12% showed loss of lesions present at diagnosis, and 46% both acquired new lesions and lost lesions present at diagnosis. In 11% of relapsed samples (three B-ALL and four T-ALL cases) all CNAs present at diagnosis were lost at relapse, raising the possibility that the relapse represents the emergence of a second unrelated leukemia. One case (BCR-ABL-SNP-#15) retained the same translocation at relapse, indicating a common clonal origin. In the remaining three cases, lack of similarity of the patterns of deletion at immunoglobulin (Ig) and T-cell antigen receptor (TcR) gene loci suggested that relapse represented emergence of a distinct leukemia (data not shown). For all other relapse cases (86%), analysis of Ig/TCR deletions demonstrated a clonal relationship between diagnostic and relapse samples (data not shown).
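The categories used above to describe how CNA patterns change from diagnosis to relapse can be expressed as simple set comparisons. This sketch assumes each CNA has already been reduced to a comparable identifier, which in practice requires matching breakpoints between samples; the function name, category labels and identifiers are hypothetical.

```python
def classify_pair(diagnosis_cnas, relapse_cnas):
    """Classify a diagnosis/relapse pair by how its CNA set changed."""
    dx, rel = set(diagnosis_cnas), set(relapse_cnas)
    acquired = rel - dx  # lesions seen only at relapse
    lost = dx - rel      # lesions seen only at diagnosis
    if not acquired and not lost:
        return "unchanged"
    if dx and not (dx & rel):
        # No diagnosis lesion is retained at relapse.
        return "all diagnosis CNAs lost"
    if acquired and lost:
        return "acquired and lost"
    if acquired:
        return "acquired only"
    return "lost only"


# Toy pairs with placeholder CNA identifiers.
toy_labels = [
    classify_pair({"a", "b"}, {"a", "b"}),
    classify_pair({"a"}, {"a", "c"}),
    classify_pair({"a", "b"}, {"a"}),
    classify_pair({"a", "b"}, {"a", "c"}),
    classify_pair({"a"}, {"c"}),
]
```

Pairs in the last category are the ones that would need Ig/TCR comparison to decide between a shared clonal origin and a second, unrelated leukemia.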
The genes most frequently affected by CNAs acquired at relapse were CDKN2A/B, ETV6, and regulators of B-cell development (Table 39, and data not shown). Sixteen B- and two T-ALL cases acquired new CNAs of CDKN2A/B, 10 of which lacked CDKN2A/B deletions at diagnosis (data not shown). The CDKN2A/B deletions acquired at relapse were bi-allelic in 70% of cases, resulting in a complete loss of expression of all three encoded proteins: INK4A (p16), ARF (p14), and INK4B (p15). Deletion of ETV6, a frequent abnormality at diagnosis in ETV6-RUNX1 B-ALL (65, 76), was also common in relapsed ALL, being identified in 11 cases (10 B-ALLs and one T-ALL), with only one case ETV6-RUNX1 positive (data not shown). Mutations of genes regulating B cell development are common at diagnosis in B-ALL (65), and additional lesions in this pathway were observed at relapse, with a number of cases acquiring multiple hits within the pathway (data not shown). Four cases lacked CNAs in this pathway at diagnosis but acquired deletions in PAX5 (N=1), IKZF1 (N=2), or TCF3 (N=1) at relapse. Eleven cases with CNAs in this pathway at diagnosis acquired additional lesions at relapse, most commonly IKZF1 (5 cases), IKZF2 (two cases) and IKZF3 (one case) (data not shown). New CNAs were also observed in PAX5 (N=3), TCF3 (N=3), RAG1/2 (N=2; data not shown) and EBF1 (N=1, data not shown). CNAs involving genes encoding regulators of lymphoid development were also observed in four T-ALL relapse samples but involved the early lymphoid regulators IKZF1 (N=2), IKZF2 (N=1) and LEF1 (N=2; data not shown), rather than B lineage specific genes such as PAX5 and EBF1.
A number of other less frequent CNAs previously detected in diagnostic ALL samples (65) were also observed as new lesions at relapse, including CNAs of ADD3, ARPP-21, ATM, BTG1, CD200/BTLA, FHIT, KRAS, IL3RA/CSF2RA, NF1, PTCH, TBL1XR1, TOX, WT1, NR3C1 and DMD (data not shown), as well as progression of intrachromosomal amplification of chromosome 21, a poor prognostic marker in childhood ALL (79) (data not shown). In addition, relapsed T-ALL was remarkable for the loss and acquisition of sentinel lesions in T-ALL, including the loss of NUP214-ABL1 in one case, and the acquisition of NUP214-ABL1, LMO2, and MYB amplification at relapse (65, 80-82) (data not shown).
In addition to defining CNAs, we also performed an analysis of regions of copy-neutral LOH (CN-LOH), which can signify mutated, reduplicated genes. CN-LOH was identified in only 15 B- and 3 T-ALL cases (data not shown). The most common region involved was chromosome 9p (N=8), which in each case contained a homozygous CDKN2A/B deletion, consistent with reduplication of a hemizygous CDKN2A/B deletion.
To determine which biologic pathways were most frequently targeted by relapse-acquired CNAs, we categorized each gene contained within altered genomic regions into one or more of 148 biologic pathways. The pathways were then assessed for their frequency of involvement by CNAs across the dataset using Fisher's exact test (66). This analysis identified cell cycle regulation and B-cell development as the most common pathways targeted at relapse (data not shown).
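As an illustration of the pathway assessment described above, the following minimal sketch computes a one-sided Fisher's exact test for a single pathway from a 2x2 table of gene counts. The layout of the table, the choice of a one-sided test, the function name, and the example counts are all assumptions made for illustration; the text specifies only that Fisher's exact test was used.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: genes in altered regions that belong to the pathway
    b: genes in altered regions outside the pathway
    c: genes in unaltered regions that belong to the pathway
    d: genes in unaltered regions outside the pathway

    Returns P(X >= a) under the hypergeometric null with fixed margins,
    i.e. the probability of seeing at least this many pathway genes
    among the altered genes by chance.
    """
    altered = a + b          # row total: genes in altered regions
    in_pathway = a + c       # column total: genes in the pathway
    total = a + b + c + d
    upper = min(altered, in_pathway)
    tail = sum(
        comb(in_pathway, k) * comb(total - in_pathway, altered - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, altered)


# Hypothetical counts for two pathways (illustration only, not study data).
pathway_tables = {
    "cell cycle regulation": (12, 188, 40, 9760),
    "example control pathway": (1, 199, 51, 9749),
}
p_values = {name: fisher_exact_greater(*t) for name, t in pathway_tables.items()}
```

Because 148 pathways are assessed, a real analysis would normally also adjust the resulting p-values for multiple testing; that step is not described in the text and is not included in this sketch.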
There was a clear clonal relationship between the diagnosis and relapse ALL samples in most cases (93.6% of B-ALL and 71.4% of T-ALL cases). This suggests that the relapse-associated CNAs were either present at low levels at diagnosis and selected for at relapse, or acquired as new genomic alterations after initial therapy. To explore these possibilities, we mapped the genomic breakpoints of several CNAs acquired at relapse (ADD3, C20orf94, DMD, ETV6, IKZF2, and IKZF3) and developed lesion-specific PCR assays. Evidence of the relapse clone was detected in 7 of 10 diagnostic samples analyzed (
By carefully analyzing the changes in CNAs between matched diagnostic and relapse samples, we were able to map their evolutionary relationship (
These results extend previous studies examining individual genetic loci in relapsed ALL (71, 73, 77, 78, 83-85), and provide important insights into the spectrum of genetic lesions that underlie this process. Although our data are limited to a single class of mutations (CNAs), they demonstrate that no single genetic lesion or alteration of a single pathway is responsible for relapse. Moreover, global genomic instability does not appear to be a prevalent mechanism. Instead, a diversity of mutations appears to contribute to relapse, with the most common alterations targeting key regulators of tumor suppression, cell cycle control, and lymphoid/B-cell development. Notably, few lesions involved genes with roles in drug import, metabolism, export and/or response (an exception being the glucocorticoid receptor gene NR3C1), suggesting that the mechanism of relapse is more complex than simple “drug resistance”.
The diversity of genes that are targeted by relapse-associated CNAs coupled with the presence of the relapse clone as a minor sub-population at diagnosis that escapes drug-induced killing represent formidable challenges to the development of effective therapy for relapsed ALL. Nevertheless, our study has identified several common pathways that may contain rational targets against which novel therapeutic agents can be developed.
All publications and patent applications mentioned in the specification are indicative of the level of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be obvious that certain changes and modifications may be practiced within the scope of the appended claims.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US08/82592 | 11/6/2008 | WO | 00 | 4/19/2010 |

Number | Date | Country |
---|---|---|
61002351 | Nov 2007 | US |
60986530 | Nov 2007 | US |
61012554 | Dec 2007 | US |