The present invention relates generally to compositions and methods for diagnosing diseases which have an allele-specific therapy and a disease-causing mutation that is sufficiently distant from the molecular site of the therapy to require a diagnostic linking method.
Expansions of CAG trinucleotide repeats (CAG repeats) in coding regions of human genes cause numerous disorders by generating proteins with elongated polyglutamine (polyQ) stretches. This group of disorders includes by way of example spinocerebellar ataxia type 1, spinocerebellar ataxia type 2, spinocerebellar ataxia type 3, spinocerebellar ataxia type 6, spinocerebellar ataxia type 7, spinocerebellar ataxia type 17, spinal and bulbar muscular atrophy, Huntington's disease, and dentatorubral-pallidoluysian atrophy. (Wanker E. E. (2000) Biol. Chem., 381:937-942; Gusella J. F. and MacDonald, M. E. (2000) Nature Rev. Neurosci., 1:109-115; and Usdin K. and Grabczyk, E. (2000) Cell. Mol. Life Sci., 57:914-931).
For purposes of illustration, only Huntington's disease (HD) will be discussed herein. HD is a chronic neurodegenerative disorder which is inherited as an autosomal dominant trait and is characterized by chorea, dementia and personality disorder. Martin, J. B. and Gusella, J. F. (1986) N. Engl. J. Med. 315:1267-1276. The gene responsible for HD contains an expanded and unstable CAG trinucleotide repeat. Huntington's Disease Collaborative Research Group (1993) Cell 72:971-983.
The HD gene (IT15 gene), which encodes huntingtin, a 350 kDa protein of unknown function, is located on the human chromosome 4 and consists of 67 exons. The disease-causing mutation is a CAG repeat expansion located within exon 1 of the HD gene (HD exon1). The CAG repeat is translated into a polyQ stretch. The disease manifests itself when the polyQ stretch exceeds the critical length of 37 glutamines (pathological threshold), whereas 8-35 glutamine residues in huntingtin are tolerated by neuronal cells. Experimental evidence has been presented that huntingtin fragments with polyQ tracts in the pathological range (more than 37 glutamines), but not in the normal range (20-32 glutamines), form high molecular weight protein aggregates with a fibrillar morphology in vitro and in cell culture model systems (Scherzinger et al. (1999) Proc. Natl Acad. Sci. USA, 96:4604-4609; and Waelter et al., (2001) Mol. Biol. Cell, 12:1393-1407). In addition, inclusions with aggregated N-terminally truncated huntingtin protein were detected in HD transgenic mice carrying a CAG repeat expansion of 115-156 units and in HD patient brains (Davies et al., (1997) Cell, 90:537-548; and DiFiglia et al., (1997) Science, 277:1990-1993), suggesting that the process of aggregate formation may be important for the progression of HD. The mechanisms, however, by which the elongated polyQ sequences in huntingtin cause dysfunction and neurodegeneration are not yet understood (Scherzinger et al., (1999); Tobin A. J. and Signer, E. R. (2000) Trends Cell Biol., 10:531-536; and Perutz M. F. (1999) Glutamine repeats and neurodegenerative diseases: molecular aspects. Trends Biochem. Sci., 24:58-63).
Unaffected individuals have repeat numbers of up to 30, while individuals at a high risk of developing HD carry more than 37 CAG repeats. Individuals with 30-37 repeats have a high risk of passing on repeats in the affected size range to their offspring (Andrew et al., (1997) Hum. Mol. Genet., 6:2005-2010; Duyao et al., (1993) Nature Genet., 4:387-392; and Snell et al., (1993) Nature Genet., 4:393-397).
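By way of non-limiting illustration only, the repeat-number ranges recited above may be expressed as a simple classification routine. The function name and category labels below are hypothetical and are not part of the original disclosure. Because the stated ranges overlap at exactly 30 repeats, the sketch assumes, conservatively, that a count of 30 falls in the intermediate range.

```python
def classify_cag_repeats(repeat_count: int) -> str:
    """Classify a CAG repeat count using the ranges recited in the text.

    Assumption (not from the source): a count of exactly 30 is treated as
    intermediate, since the recited ranges overlap at that value.
    """
    if repeat_count < 0:
        raise ValueError("repeat count cannot be negative")
    if repeat_count > 37:
        # "more than 37 CAG repeats": high risk of developing HD
        return "high risk"
    if repeat_count >= 30:
        # "30-37 repeats": risk of passing on affected-range repeats
        return "intermediate"
    return "unaffected"


if __name__ == "__main__":
    for n in (20, 33, 37, 40):
        print(n, classify_cag_repeats(n))
```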
It is known that patients are able to survive and live healthy lives with only one functioning copy of the IT15 gene. Thus, selective inactivation of the allele with a disease-causing mutation should diminish or even eliminate the disease while improving the possibilities of survival in heterozygous patients.
The combination of emotional, cognitive and motor symptoms in HD contributes to an unusually high cost of care. People with Huntington's Disease require care from health professionals of many stripes, including general practitioners, neurologists, social workers, home health aides, psychologists, physical therapists, and speech/language pathologists.
Currently, there are a few diagnostic approaches for nucleic acid sequence identification. U.S. Patent Application Publication No. 20040048301 describes allele-specific primer extension in the presence of labeled nucleotides for sequence identification, but does not include allele-specific primer extension for enrichment of one allele over the other for further analysis of the allele of interest as part of the kit. WO Patent Application No. 2003100101 describes isolation of one sequence in a mixture by hybridization markers and single-strand specific nucleases for use in single-molecule analysis. U.S. Patent Application Publication No. 20030039964 describes a method for isolation of one sequence in a mixture by hybridization to a fixed probe, but does not disclose the use of reverse transcription. U.S. Pat. No. 6,013,431 describes a method for analysis of bases adjacent to a hybridized, immobilized oligo, but does not disclose enrichment of one allele over the other. WO Patent Application No. 9820166 describes a method for specific selection of one allele over the other, followed by mass spectroscopic analysis of the selected molecule, but does not disclose the use of reverse transcription. None of these references disclose methods and diagnostic kits for linking polymorphic sequences to expanded repeat mutations for improved allele-specific diagnosis.
U.S. Patent Publication No. 20040241854 (Davidson) discloses allele-specific inhibition of specific single nucleotide polymorphism variants, and presents data showing that “expanded CAG repeats and adjacent sequences, while accessible to RNAi, may not be preferential targets for silencing” thus describing the problem that our invention addresses (determining what SNP variant at a remote molecular position is linked to the expanded CAG repeat in a particular patient), but does not teach the use of reverse transcription using an allele-specific primer to solve this problem, nor otherwise disclose a method for how to solve this problem. U.S. Patent Publication No. 20060270623 (McSwiggen) discloses multiple siRNA sequences, including those comprising SNP variants, but does not provide any working examples regarding allele-specific RNA interference using these disclosed siRNA sequences, nor disclose how to determine which allele-specific siRNA to administer to a particular Huntington's disease human patient in order to effectively treat that patient's disease by suppression of only the expanded Huntington allele in that patient.
The McSwiggen publication provides a list of SNPs in the IT15 gene and recites siNA molecules complementary to different SNP variants. This publication identifies 68 possible SNP sites but does not teach or suggest which of these sites are particularly useful for treatment of heterozygous patients having Huntington's disease, thus implying that all of these SNPs have equal value in diagnostics and treatment of patients having Huntington's disease.
Accordingly, there is need in the art for novel compounds, methods, and kits for allele-specific diagnostics and therapies, providing optimized diagnostics and/or treatment outcomes.
The instant application addresses these and other needs by disclosing kits targeting specific SNPs in a patient having Huntington's disease or at risk of Huntington's disease for successful diagnosis and/or treatment.
More specifically, the invention provides kits for allele-specific diagnostics and/or allele-specific inhibition of mutated IT15 gene expression in a cell, the kit comprising: a) a first set of allele-specific RT-PCR primers, the first set comprising a first allele-specific RT-PCR primer, capable of selectively hybridizing to a first variant of a first SNP and at least a second allele-specific RT-PCR primer capable of selectively hybridizing to at least a second variant of the first SNP; and b) a second set of allele-specific RT-PCR primers, the second set comprising a first allele-specific RT-PCR primer, capable of selectively hybridizing to a first variant of a second SNP and at least a second allele-specific RT-PCR primer capable of selectively hybridizing to at least a second variant of the second SNP; wherein the first SNP and the second SNP are different, and the first SNP and the second SNP are selected from the group consisting of rs362331, rs362307, rs35892913, rs362267, rs2276881, rs17781557, and rs363125.
In other embodiments, the kits of the instant invention also comprise an allele-specific therapy, most preferably in the form of a short RNA molecule comprising a double-stranded portion, or a nucleic acid construct encoding such short RNA molecule.
The present invention relates to methods and kits for performing allele-specific reverse transcription from an SNP site and analysis of a cDNA at a region of gene mutation. The methods, systems and reagents of the present invention are applicable to any disease which contains an SNP variant of an allele in a heterozygous subject that is on the same mRNA transcript as a disease-causing mutation that is at a remote region of the gene's mRNA.
To aid in the understanding of the invention, the following non-limiting definitions are provided:
As used herein, the term “mutated” refers to an allele having at least 39 CAG repeats.
The term “gene” refers to a DNA sequence that comprises control and coding sequences necessary for the production of a polypeptide or its precursor. The polypeptide can be encoded by a full length coding sequence (either genomic DNA or cDNA) or by any portion of the coding sequence so long as the desired activity is retained. In some aspects, the term “gene” also refers to an mRNA sequence or a portion thereof that directly codes for a polypeptide or its precursor.
The term “transfection” refers to the uptake of foreign DNA by a cell. A cell has been “transfected” when exogenous (i.e., foreign) DNA has been introduced inside the cell membrane. Transfection can be either transient (i.e., the introduced DNA remains extrachromosomal and is diluted out during cell division) or stable (i.e., the introduced DNA integrates into the cell genome or is maintained as a stable episomal element).
“Cotransfection” refers to the simultaneous or sequential transfection of two or more vectors into a given cell.
The term “promoter element” or “promoter” refers to a DNA regulatory region capable of being bound by an RNA polymerase in a cell (e.g., directly or through other promoter-bound proteins or substances) and initiating transcription of a coding sequence. A promoter sequence is, in general, bounded at its 3′ terminus by the transcription initiation site and extends upstream (5′ direction) to include the minimum number of bases or elements necessary to initiate transcription at any level. Within the promoter sequence may be found a transcription initiation site (conveniently defined, for example, by mapping with nuclease S1), as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase. The promoter may be operably associated with other expression control sequences, including enhancer and repressor sequences.
The term “in operable combination”, “in operable order” or “operably linked” refers to the linkage of nucleic acid sequences in such a manner that a nucleic acid molecule capable of directing the transcription of a given gene and/or the synthesis of a desired protein molecule is produced. The term also refers to the linkage of amino acid sequences in such a manner so that a functional protein is produced.
The term “vector” refers to a nucleic acid assembly capable of transferring gene sequences to target cells (e.g., viral vectors, non-viral vectors, particulate carriers, and liposomes). The term “expression vector” refers to a nucleic acid assembly containing a promoter which is capable of directing the expression of a sequence or gene of interest in a cell. Vectors typically contain nucleic acid sequences encoding selectable markers for selection of cells that have been transfected by the vector. Generally, “vector construct,” “expression vector,” and “gene transfer vector,” refer to any nucleic acid construct capable of directing the expression of a gene of interest and which can transfer gene sequences to target cells. Thus, the term includes cloning and expression vehicles, as well as viral vectors.
The term “antibody” refers to a whole antibody, both polyclonal and monoclonal, or a fragment thereof, for example a F(ab)2, Fab, FV, VH or VK fragment, a single chain antibody, a multimeric monospecific antibody or fragment thereof, or a bi- or multi-specific antibody or fragment thereof. The term also includes humanized and chimeric antibodies.
The term “treating” or “treatment” of a disease refers to executing a protocol, which may include administering one or more drugs to a patient (human or otherwise), in an effort to alleviate signs or symptoms of the disease. Alleviation can occur prior to signs or symptoms of the disease appearing, as well as after their appearance. Thus, “treating” or “treatment” includes “preventing” or “prevention” of disease. In addition, “treating” or “treatment” does not require complete alleviation of signs or symptoms, does not require a cure, and specifically includes protocols which have only a marginal effect on the patient.
The term “patient” refers to a biological system to which a treatment can be administered. A biological system can include, for example, an individual cell, a set of cells (e.g., a cell culture), an organ, a tissue, or a multi-cellular organism. A patient can refer to a human patient or a non-human patient.
The terms “remote region” or “remote location” indicate a distance of at least 100 base pairs from the SNP site to the site of the disease-causing mutation, such as, for example, at least 0.5 kb, or at least 1 kb, or at least 2 kb or at least 3 kb, or at least 4 kb or at least 5 kb, or at least 6 kb or more.
The term “practitioner” refers to a person who uses methods, kits and compositions of the current invention on the patient. The term includes, without limitations, doctors, nurses, scientists, and other medical or scientific personnel.
The terms “siRNA molecule,” “shRNA molecule,” “RNA molecule,” “DNA molecule,” “cDNA molecule” and “nucleic acid molecule” are each intended to cover a single molecule, a plurality of molecules of a single species, and a plurality of molecules of different species.
The term “siRNA” refers to a double-stranded RNA molecule wherein each strand is between about 15 and about 30 bases of ribonucleic acid in length, and the two strands have a region of complementarity such that the two strands hybridize or “base pair” together through the annealing of complementary bases (Adenosine to Uracil, and Guanine to Cytosine). For some siRNA molecules, the two strands hybridize together in a manner such that there is an overhang of non-annealed bases at the 5′ or 3′ ends of the strand. For other siRNA molecules, the two strands hybridize together such that each base of one strand is paired with a base of the other strand. For some siRNA molecules, the two strands may not be 100% complementary, but may have some bases that do not hybridize due to a mismatch. For some siRNA molecules, the RNA bases may be chemically modified, or additional chemical moieties may be conjugated to one or more ends of one or more of the strands.
The term “shRNA” refers to a “short, hairpin” RNA molecule comprised of a single strand of RNA bases that self-hybridizes in a hairpin structure. The RNA molecule is comprised of a stem region of RNA bases that hybridize together to form a double-stranded region, and a loop region of RNA bases that form the bend of the hairpin. The term “shRNA” also refers to a DNA molecule from which a short, hairpin RNA molecule may be transcribed in vitro or in vivo.
The methods of the present invention utilize routine techniques in the field of molecular biology. Basic texts disclosing general molecular biology methods include Sambrook et al., Molecular Cloning, A Laboratory Manual (3d ed. 2001) and Ausubel et al., Current Protocols in Molecular Biology (1994).
The present invention relates generally to compositions and methods for diagnosing diseases which have an allele-specific therapy and a disease-causing mutation that is sufficiently distant from the molecular site of the therapy. Table 1 depicts certain diseases applicable to the present invention. Table 1 was derived from information previously published (DiProspero (2005)). Table 1 describes in part examples of triplet repeat expansion diseases and the mutant gene associated with each disease.
The present invention is not limited to the diseases described above. There may be situations where a disease is caused by many different mutations in a single gene (thus designing many different gene-targeting therapies may not be practical from a commercial perspective). However, if one or two expressed SNPs are present in the disease-associated gene, then the SNPs may actually serve as the molecular target for the therapy (and thus determination of linkage of the SNP to the disease-causing mutation would be essential).
For purposes of illustration, only HD will be discussed herein as an example of a triplet repeat expansion disease and example of the applicability of the present invention in providing methods and kits for determining allele-specific reverse transcription from an SNP site and analysis of a cDNA at a region of mutation.
The coding region of the IT15 gene is about 13,000 bases long. The HD disease-causing mutation is the expansion of the CAG repeat region. The CAG repeat region starts at about nucleotide position 15. If the CAG triplet repeats about 25 or 30 times, the patient is not at risk of the disease. If, however, more than 37 CAG repeats occur in a row in the nucleotide sequence, then the patient is going to get Huntington's disease.
About ten thousand bases downstream from the CAG repeat sequence, there is a natural variation (Single Nucleotide Polymorphism, or SNP) of the IT15 gene in the human population, where for many people it might be an “A residue” and for many others it is a “C residue”. That is just a normal variation, as it does not cause any disease. The information about the SNP can be used to determine that a child of a Huntington's disease patient has inherited an allele with the “A residue” from one parent and an allele with the “C residue” from the other parent.
The practitioner also knows that one of the patient's parents has HD and would like to know if the patient will also get HD. The practitioner can actually determine whether the patient is going to get HD or not, by looking at both of the patient's IT15 alleles, and determining how many CAG repeats each allele contains. If one of the CAG repeat tracts is longer than 37 repeats, then the patient will get HD. Further, the practitioner can determine whether the patient is heterozygous (i.e., one allele has a normal number of repeats, e.g., 20, while the other allele has expanded repeats, e.g., 37). Analyzing the IT15 gene downstream of the CAG repeats, the practitioner may find that the patient received a “C residue” from one parent and an “A residue” from the other parent. Thus, the crucial issue for the allele-specific diagnosis is which SNP variant is on the same mRNA transcript as the expanded number of repeats in the patient's IT15 gene. Isolating the genetic information from the patient's parents may not help because it is possible that one or both parents are also heterozygous (e.g., each parent has two SNP variants of the gene, i.e., an A residue variant and a C residue variant). This disclosure provides a method of determining which SNP allele of the gene co-segregates with the disease-causing mutation.
Accordingly, one aspect of the present invention provides a diagnostic test, allowing the practitioner to determine which allele, classified by the nucleotide at the SNP position, co-segregates with the disease-causing mutation. In one embodiment, the test comprises a method for determining which single nucleotide polymorphism variant of an allele from a gene isolated from a heterozygous patient is on the same mRNA transcript as a disease-causing mutation at a remote region of the gene's mRNA comprising: a) an allele-specific reverse transcription reaction using an allele-specific primer which recognizes one single nucleotide polymorphism variant, and b) analysis of an allele-specific cDNA product from the allele-specific reverse transcription reaction at the remote region of the gene to determine the presence or absence of the mutation on the allele-specific cDNA product. A concurrently filed PCT application entitled “METHODS AND KITS FOR LINKING POLYMORPHIC SEQUENCES TO EXPANDED REPEAT MUTATIONS” (Inventors: Paul van Bilsen, William F. Kaemmerer, and Eric Burright, Medtronic Docket No. P0024231.03) and incorporated herein by reference is drawn to this aspect of the invention.
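By way of non-limiting illustration only, the decision made in step b) may be sketched as follows, assuming that a CAG repeat count has already been measured for each allele-specific cDNA pool. The function name, the dictionary layout, and the handling of ambiguous results are hypothetical choices made for this sketch; the threshold of 37 follows the "more than 37 CAG repeats" language used herein.

```python
from typing import Dict, Optional

PATHOLOGICAL_THRESHOLD = 37  # alleles with more than 37 repeats are treated as expanded


def linked_snp_variant(
    pool_repeats: Dict[str, int],
    threshold: int = PATHOLOGICAL_THRESHOLD,
) -> Optional[str]:
    """Return the SNP variant whose allele-specific cDNA pool carries the expansion.

    pool_repeats maps an SNP variant (e.g. "A" or "C") to the CAG repeat count
    measured on the cDNA produced with that variant's allele-specific primer.
    Returns None when no pool, or more than one pool, exceeds the threshold,
    because linkage cannot then be assigned to a single variant (assumption).
    """
    expanded = [variant for variant, count in pool_repeats.items() if count > threshold]
    return expanded[0] if len(expanded) == 1 else None


if __name__ == "__main__":
    # Hypothetical measurements from an "A reaction" and a "C reaction".
    print(linked_snp_variant({"A": 44, "C": 20}))
```

In this hypothetical example the "A" pool carries the expanded repeat, so the "A" variant would be the candidate target for an allele-specific therapy.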
The inventors have discovered that the primer should preferably be shorter than about 20 nucleotides, e.g., about 15 nucleotides long, because primers which are longer than about 20 nucleotides are unlikely to discriminate between the targeted SNP variants.
In layman's terms, the practitioner takes RNA from the patient and applies a reverse transcription primer that recognizes just the “A allele.” The “A allele” specific primer will have at its 3′ position a complement to the SNP variant of interest. In the case of the A-variant, the “A allele” specific primer will have a T at the 3′ end, and so when this “A allele” specific primer anneals to the mRNA, its 3′ end will base-pair with the A at the SNP position and allow the reverse transcriptase to proceed to synthesize the cDNA from the “A allele.” Conversely, the “A allele” primer will not base-pair at the 3′ end of the primer with the “C allele” (since T is not complementary to C). Thus, the reverse transcriptase will not be able to produce cDNA from the C allele. On the other hand, in the “A” portion of a reaction, the practitioner will obtain a pool of the cDNAs that corresponds to the “A allele.” The reaction can be repeated in a separate tube with a “C allele” specific primer and no “A allele” primer. A person of ordinary skill in the art will understand that the “C allele” specific primer will have a G on its 3′ end. Essentially the practitioner will perform at least one allele-specific reverse transcription reaction, but preferably two allele-specific reverse transcription reactions (each with its own allele-specific primer), on the mRNA from the patient. As a result, the practitioner will have two sub-populations of cDNA, wherein each sub-population is allele-specific, and the practitioner knows which pool corresponds to which variant. Thus, the practitioner will be able to use any number of possible methods, the simplest being PCR, to analyze the upstream portion of the cDNA containing the CAG repeat region and quantify the number of the repeats from the cDNA products that came specifically from the “C reaction” or specifically from the “A reaction.” Non-limiting examples of such methods are restriction endonuclease EcoP15I cleavage and fluorescent PCR.
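By way of non-limiting illustration only, the primer geometry described above (an approximately 15-nucleotide primer whose 3′ terminal base is the complement of the SNP variant) may be sketched as follows. The function names and the example transcript fragment are hypothetical and are not taken from the IT15 sequence. The sketch assumes the mRNA is given 5′ to 3′ and that the SNP position is a 0-based index into it.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of an RNA or DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def allele_specific_rt_primer(mrna: str, snp_index: int, variant: str, length: int = 15) -> str:
    """Design a DNA primer (5'->3') whose 3' end pairs with the SNP variant.

    The primer anneals to the SNP base and the (length - 1) bases immediately
    3' of it on the mRNA, so that reverse transcription extends from the SNP
    toward the 5' end of the transcript.
    """
    if snp_index < 0 or snp_index + length > len(mrna):
        raise ValueError("not enough sequence 3' of the SNP for this primer length")
    # Substitute the variant of interest at the SNP position of the target site.
    target_site = variant.upper() + mrna[snp_index + 1 : snp_index + length].upper()
    return reverse_complement(target_site)


if __name__ == "__main__":
    # Hypothetical transcript fragment; the SNP is at index 4.
    fragment = "CCCCAGATTACAGATTACAGG"
    print(allele_specific_rt_primer(fragment, 4, "A"))  # 3' terminal T
    print(allele_specific_rt_primer(fragment, 4, "C"))  # 3' terminal G
```

The two primers differ only at the 3′ terminal base, which is the position relied upon above for discrimination between the variants.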
The recognition sequence for EcoP15I consists of two copies of the asymmetric sequence 5′-CAGCAG present on opposite strands of the DNA double helix. Cleavage occurs 25-27 bp downstream of one of these sites. The distance between the two sites can be up to 3.5 kb. HD exon1 DNA is a substrate of EcoP15I. Because of the overlapping recognition sites within the CAG repeat, EcoP15I digestion of DNA fragments bearing HD exon1 generates a ladder of restriction fragments, whose number corresponds to the number of CAG repeats. Making use of this fact an EcoP15I cleavage assay can be developed to determine the number of CAG repeats in HD exon1 DNA on the basis of the resulting DNA fragment pattern. See, e.g., Möncke-Buchner et al., Nucleic Acids Res. 2002 Aug. 15; 30(16): e83. Fluorescent PCR entails labeling the primers with a label, such as, for example, 5-carboxyfluorescein (FAM). The method of separation of amplification products involves capillary electrophoresis or denaturing polyacrylamide gel electrophoresis. A representative example of a successful determination of the number of CAG repeats by this method is disclosed, for example, in Toth et al., Clinical Chemistry. 1997; 43:2422-2423.
The embodiment of the invention described above employs the notion that a mismatch on the 3′ position of the allele-specific primer will not allow reverse transcriptase to produce cDNA from the allele with a mismatched SNP variant. A person of ordinary skill in the art will undoubtedly recognize that the 3′ end of the allele-specific primer does not have to be positioned at the single nucleotide polymorphism nucleotide position. For example, a skilled artisan may design primers and conditions of the reverse transcription reaction in such a way that the allele-specific primer will not bind altogether and thus lead to the same end result: absence of cDNA from the allele with a mismatched SNP variant.
The accurate determination of the number of CAG repeats is required for the DNA-based predictive testing of at-risk individuals. To date, CAG repeat length determination is based on polymerase chain reaction (PCR) amplification of genomic DNA using primers flanking the CAG repeat region in the IT15 gene, and subsequent electrophoretic separation of the products in denaturing polyacrylamide gels (Williams et al., (1999) Comparative semi-automated analysis of (CAG) repeats in the HD gene: use of internal standards. Mol. Cell Probes, 13:283-289).
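By way of non-limiting illustration only, the conversion of an electrophoretically sized PCR product into a CAG repeat number may be sketched as follows. The function name is hypothetical. The parameter `flank_bp` (the combined length of the amplified sequence outside the CAG repeat, including the primers) is an assumption that depends entirely on the primer pair chosen by the practitioner and is not specified herein.

```python
def repeats_from_amplicon(amplicon_bp: float, flank_bp: int) -> int:
    """Estimate the CAG repeat number from a sized PCR product.

    amplicon_bp: measured product length in base pairs (may be fractional,
                 as reported by capillary electrophoresis sizing).
    flank_bp:    hypothetical, primer-dependent length of the non-repeat
                 sequence contained in the product.
    """
    if amplicon_bp < flank_bp:
        raise ValueError("amplicon is shorter than its flanking sequence")
    # Each CAG repeat contributes three base pairs to the product length.
    return round((amplicon_bp - flank_bp) / 3)


if __name__ == "__main__":
    # Hypothetical sizing result with an assumed 100 bp of flanking sequence.
    print(repeats_from_amplicon(160.4, flank_bp=100))
```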
Numerous methods and commercial kits for the synthesis of first strand cDNA molecules are well known in the art. Examples include the Superscript™ Double Strand cDNA Synthesis Kit (Invitrogen, Carlsbad, Calif.), the Array 50™, Array 350™ and Array 900™ Detection Kits (Genisphere, Hatfield, Pa.), and the CyScribe™ Post-Labelling Kit (Amersham, Piscataway, N.J.). RNA molecules (e.g., mRNA, hnRNA, rRNA, tRNA, miRNA, snoRNA, non-coding RNAs) from a source of interest are used as templates in a reverse transcription reaction. The RNA may be obtained from a mammalian or more preferably human tissue or cell source. The methods of the present invention are particularly suited for amplification of RNA from small numbers of cells, including single cells, which can be purified from complex cellular samples using, e.g., micromanipulation, fluorescence-activated cell sorting (FACS) and laser microdissection techniques (see Player et al., Expert Rev. Mol. Diagn. 4:831 (2004)).
Any reverse transcriptase can be used in the initial reverse transcription reaction, including thermostable, RNase H+ and RNase H− reverse transcriptases. Preferably, an RNase H− reverse transcriptase is used.
Primers for first strand cDNA synthesis can be obtained commercially or synthesized and purified using techniques well known in the art. As disclosed above, the inventors discovered that primers which are about 15 nucleotides long provide the best results in terms of discriminating between the SNP variants and efficiency in producing the allele-specific cDNA.
PCR amplifications of the CAG repeat region have primarily been performed by incorporating [α-32P]dNTPs, or using 32P or fluorescently end-labeled primers. Sizing of fluorescently end-labeled amplification products was performed in various Applied Biosystems DNA sequencers (Andrew et al., (1993) Nature Genet., 4:398-403; Choudhry et al., (2001) Hum. Mol. Genet., 10:2437-2446; Ishii et al., (2001) J. Clin. Endocrinol. Metab., 86:5372-5378; Le et al., (1997) Mol. Pathol., 50:261-265; Mangiarini et al., (1997) Nature Genet., 15:197-200; Pelotti et al., (2001) Am. J. Forensic Med. Pathol., 22:55-57; Wallerand et al., (2001) Fertil. Steril., 76:769-774; Warner et al., (1993) Mol. Cell Probes, 7:235-239; and Warner et al., (1996) J. Med. Genet., 33:1022-1026).
A high-resolution method can be used for the exact length determination of CAG repeats in HD genes as well as in genes affected in related CAG repeat disorders (Elisabeth Möncke-Buchner et al., Nucleic Acids Res. 2002 Aug. 15; 30(16)).
A wide variety of kits may be prepared according to the present invention. For example, a kit may include a single stranded promoter template comprising at least one RNA polymerase recognition sequence; and instructional materials for synthesizing cDNA molecules using said promoter template. While the instructional materials typically comprise written or printed materials, they are not limited to such. Any medium capable of storing such instructions and communicating them to an end user is contemplated by this invention. Such media include, but are not limited to, electronic storage media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such media may include addresses to internet sites that provide such instructional materials.
The kits of the present invention may further include one or more of the following components or reagents: a reverse transcriptase (preferably with RNA-dependent DNA polymerase activity); an RNase inhibitor; an enzyme for attaching a 3′ oligodeoxynucleotide tail onto DNA molecules (e.g., terminal deoxynucleotidyl transferase); an enzyme for degrading RNA in RNA/DNA duplexes (e.g., RNase H); and one or more RNA polymerases (e.g., T7, T3 or SP6 RNA polymerase). Additionally, the kits may include buffers, primers (e.g., oligodT primers, random primers), nucleotides, labeled nucleotides, polyA polymerase, RNase-free water, containers, vials, reaction tubes, and the like compatible with the synthesis of complementary DNA (cDNA) molecules from RNA according to the methods of the present invention. The components and reagents may be provided in containers with suitable storage media.
A person of ordinary skill in the art will appreciate that such allele-specific diagnosis empowers a practitioner to devise and implement an allele-specific treatment which generally comprises inactivation of the mutated copy of the gene. It is known that patients are able to survive and live healthy lives with only one functioning copy of the HD gene. It is known that the expression of the mutant gene is causing the trouble for the HD patient. Applicants' therapeutic model provides for selectively shutting off mutant gene expression without affecting expression of the normal gene, and is applicable to any disease which contains an SNP variant of an allele in a heterozygous subject that is on the same mRNA transcript as a disease-causing mutation that is at a remote region of the gene's mRNA.
Accordingly, another aspect of the present invention provides a method of treating a patient susceptible to Huntington's disease comprising: a) determining which single nucleotide polymorphism variant is on the same mRNA transcript as a disease-causing mutation according to an allele-specific reverse transcription reaction using an allele-specific primer which recognizes one single nucleotide polymorphism variant, wherein further the 3′ end of the primer is positioned at the single nucleotide polymorphism nucleotide position, and b) analysis of the resulting cDNA product from the reverse transcription reaction at the region of the mutation to determine the presence or absence of the mutation on this allele-specific cDNA product, and c) applying an allele-specific therapy to the SNP variant. In one embodiment, the allele-specific therapy comprises an RNA molecule comprising a double-stranded portion, wherein the single nucleotide polymorphism site is located within seven nucleotides from an end of the double stranded portion. In one embodiment, the double stranded portion is between 15 and 23 nucleotides long, e.g., about 19 nucleotides long. Further, as discussed above, the siRNA molecule may contain a loop (e.g., shRNA), a 3′ overhang or a 5′ overhang which are outside of the double-stranded portion. The instant invention also provides a method of allele-specific therapy, wherein the double stranded portion does not contain a mismatch in a position adjacent to the single nucleotide polymorphism site. Thus, in one embodiment of the invention, the siRNA molecule does not contain any mismatches and one strand of the double-stranded portion is 100% identical to the portion of the targeted mRNA transcript. In one embodiment, wherein the disease treated by the allele-specific therapy is Huntington's disease, the non-limiting example of a single nucleotide polymorphism site suitable for the allele-specific therapy is rs363125.
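By way of non-limiting illustration only, the positional constraint recited above (a double-stranded portion of about 19 nucleotides with the SNP site within seven nucleotides of an end) may be sketched as an enumeration of candidate target windows on the mRNA. The function name and parameters are hypothetical. The sketch assumes that "within seven nucleotides from an end" means the SNP occupies one of the first seven or last seven positions of the window, and it returns only the strand identical to the targeted transcript, without overhangs or loops.

```python
from typing import List, Tuple


def candidate_sirna_windows(
    mrna: str,
    snp_index: int,
    duplex_len: int = 19,
    max_from_end: int = 7,
) -> List[Tuple[int, str]]:
    """List (start, sequence) target windows that satisfy the SNP placement rule.

    mrna is given 5'->3' and snp_index is a 0-based index into it. A window
    is kept when the SNP lies within max_from_end positions (counted from 1)
    of either end of the window (an assumed reading of the text).
    """
    windows = []
    first_start = max(0, snp_index - duplex_len + 1)
    last_start = min(snp_index, len(mrna) - duplex_len)
    for start in range(first_start, last_start + 1):
        offset = snp_index - start          # 0-based SNP position in window
        from_5p_end = offset + 1            # 1-based distance from 5' end
        from_3p_end = duplex_len - offset   # 1-based distance from 3' end
        if min(from_5p_end, from_3p_end) <= max_from_end:
            windows.append((start, mrna[start : start + duplex_len]))
    return windows


if __name__ == "__main__":
    # Hypothetical 60-nucleotide transcript fragment with the SNP at index 30.
    fragment = "ACGU" * 15
    for start, seq in candidate_sirna_windows(fragment, 30):
        print(start, seq)
```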
It should be noted that the allele-specific therapy could itself operate at a different SNP site than the SNP site used to make the determination about which allele contains the mutation, so long as the SNP site of the therapy target and the SNP site used to identify the mutation-containing allele are already determined, before the therapy is administered to the patient, to be linked; that is, on the same mRNA transcript.
In some embodiments of the present invention the allele-specific therapy comprises allele-specific RNA interference using siRNA or shRNA. In this embodiment of the invention, the allele-specific therapy destroys the “A allele” of the patient. In this embodiment the siRNA targets the “A allele” upon introduction into the subject's brain by any method known to those of skill in the art (See for example, U.S. application Ser. No. 11/253,393, U.S. application Ser. No. 10/852,997, U.S. application Ser. No. 10/721,693, U.S. application Ser. No. 11/157,608, and PCT Patent Application No. US05/022156, which are incorporated herein in their entirety). When the siRNA is delivered into a cell it is used by proteins in the cell (known as the RISC complex) to find and destroy the mRNA from the Huntington's gene that has the “A allele.” Thus, the messenger RNA is destroyed before it can be used to make protein. Conversely, the allele that came from the healthy parent does not get destroyed and so its messenger RNA still survives to be used to make functional biologically active protein.
The design and use of small interfering RNA complementary to mRNA targets that produce particular proteins is a recent tool employed by molecular biologists to prevent translation of specific mRNAs. Various groups have been recently studying the effectiveness of siRNAs as biologically active agents for suppressing the expression of specific proteins involved in neurological disorders. Caplen et al. Human Molecular Genetics, 11(2): 175-184 (2002) assessed a variety of different double stranded RNAs for their ability to inhibit cell expression of mRNA transcripts of the human androgen receptor gene containing different CAG repeats. Their work found gene-specific inhibition occurred with double stranded RNAs containing CAG repeats only when flanking sequences to the CAG repeats were present in the double stranded RNAs. They were also able to show that constructed double stranded RNAs were able to rescue caspase-3 activation induced by expression of a protein with an expanded polyglutamine region. Xia, Mao, et al., Nature Biotechnology, 20: 1006-1010 (2002) demonstrated the inhibition of polyglutamine (CAG) expression in engineered neural PC12 clonal cell lines that express a fused polyglutamine-fluorescent protein using constructed recombinant adenovirus expressing siRNAs targeting the mRNA encoding green fluorescent protein.
One aspect of the present invention provides an siRNA molecule corresponding to at least a portion of a gene containing an SNP variant of an allele in a heterozygous subject that is on the same mRNA transcript as a disease-causing mutation located at a remote region of the gene's mRNA, wherein such siRNA nucleic acid sequence is capable of inhibiting expression of the mRNA transcript containing the disease-causing mutation in a cell. siRNAs are typically short (19-29 nucleotides), double-stranded RNA molecules that cause sequence-specific degradation of complementary target mRNA known as RNA interference (RNAi). Bass, Nature 411:428 (2001).
Accordingly, in some embodiments, the siRNA molecules comprise a double-stranded structure comprising a sense strand and an antisense strand, wherein the antisense strand comprises a nucleotide sequence that is complementary to at least a portion of a desired nucleic acid sequence and the sense strand comprises a nucleotide sequence that is complementary to at least a portion of the nucleotide sequence of said antisense region, and wherein the sense strand and the antisense strand each comprise about 19-29 nucleotides.
Any desired nucleic acid sequence can be targeted by the siRNA molecules of the present invention. Nucleic acid sequences encoding desired gene targets are publicly available from Genbank.
The siRNA molecules targeted to desired sequence can be designed based on criteria well known in the art (e.g., Elbashir et al., EMBO J. 20:6877 (2001)). For example, the target segment of the target mRNA preferably should begin with AA (most preferred), TA, GA, or CA; the GC ratio of the siRNA molecule preferably should be 45-55%; the siRNA molecule preferably should not contain three of the same nucleotides in a row; the siRNA molecule preferably should not contain seven mixed G/Cs in a row; the siRNA molecule preferably should comprise two nucleotide overhangs (preferably TT) at each 3′ terminus; the target segment preferably should be in the ORF region of the target mRNA and preferably should be at least 75 bp after the initiation ATG and at least 75 bp before the stop codon; and the target segment preferably should not contain more than 16-17 contiguous base pairs of homology to other coding sequences.
Based on some or all of these criteria, siRNA molecules targeted to desired sequences can be designed by one of skill in the art using the aforementioned criteria or other known criteria (e.g., Gilmore et al., J. Drug Targeting 12:315 (2004); Reynolds et al., Nature Biotechnol. 22:326 (2004); Ui-Tei et al., Nucleic Acids Res. 32:936 (2004)). Such criteria are available in various web-based program formats useful for designing and optimizing siRNA molecules (e.g., siDESIGN Center at Dharmacon; BLOCK-iT RNAi Designer at Invitrogen; siRNA Selector at Wistar Institute; siRNA Selection Program at Whitehead Institute; siRNA Design at Integrated DNA Technologies; siRNA Target Finder at Ambion; and siRNA Target Finder at Genscript).
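By way of illustration only, the sequence-level criteria listed above can be sketched as a simple screening function. This is a minimal sketch and not a design tool described in this application: the function name, the choice of a 21-nucleotide target segment (a leading dinucleotide followed by a 19-nucleotide core), and the decision to compute the GC ratio over the core are assumptions made for the example. The positional criteria (distance from the initiation ATG and the stop codon) and the homology criterion are not covered.

```python
import re

def screen_sirna_target(target_dna):
    """Check a 21-nt target segment (DNA alphabet) against sequence-level criteria.

    Assumption: the first two nucleotides are the leading dinucleotide
    (AA, TA, GA or CA) and the remaining 19 nucleotides form the siRNA core.
    """
    seq = target_dna.upper()
    core = seq[2:]
    gc_ratio = sum(base in "GC" for base in core) / len(core)
    return {
        # target segment begins with AA, TA, GA or CA
        "starts_with_NA": seq[:2] in {"AA", "TA", "GA", "CA"},
        # GC ratio between 45% and 55% (computed over the core, an assumption)
        "gc_45_to_55": 0.45 <= gc_ratio <= 0.55,
        # no three identical nucleotides in a row
        "no_triple_repeat": re.search(r"(.)\1\1", core) is None,
        # no run of seven mixed G/C nucleotides
        "no_seven_gc_run": re.search(r"[GC]{7}", core) is None,
    }
```

A candidate that returns True for every key passes this partial screen; it would still need to be checked against the positional and homology criteria by other means.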
siRNA molecules targeted to desired sequences can be produced in vitro by annealing two complementary single-stranded RNA molecules together (one of which matches at least a portion of a desired nucleic acid sequence) (e.g., U.S. Pat. No. 6,506,559) or through the use of a short hairpin RNA (shRNA) molecule which folds back on itself to produce the requisite double-stranded portion (Yu et al., Proc. Natl. Acad. Sci. USA 99:6047 (2002)). Such single-stranded RNA molecules can be chemically synthesized (e.g., Elbashir et al., Nature 411:494 (2001)) or produced by in vitro transcription using DNA templates (e.g., Yu et al., Proc. Natl. Acad. Sci. USA 99:6047 (2002)). When chemically synthesized, chemical modifications can be introduced into the siRNA molecules to improve biological stability. Such modifications include phosphorothioate linkages, fluorine-derivatized nucleotides, deoxynucleotide overhangs, 2′-O-methylation, 2′-O-allylation, and locked nucleic acid (LNA) substitutions (Dorsett and Tuschl, Nat. Rev. Drug Discov. 3:318 (2004); Gilmore et al., J. Drug Targeting 12:315 (2004)).
siRNA molecules targeted to desired target sequences can be introduced into cells to inhibit expression. Alternatively, DNA molecules from which shRNA molecules targeted to desired target sequences can be transcribed can be introduced into cells to inhibit expression. Accordingly, another aspect of the present invention provides for inhibiting expression of an mRNA sequence containing an SNP allele and a disease-causing mutation in a cell comprising introducing into a cell at least one siRNA molecule or shRNA molecule that corresponds to at least a portion of the mRNA nucleic acid sequence. Any cell can be targeted. For example, the siRNA or shRNA molecules are introduced into a heart cell or brain cell. In some embodiments, the brain cell is from a subject at risk for HD, i.e., the offspring of an HD patient.
The siRNA molecules produced herein can be introduced into cells in vitro or ex vivo using techniques well-known in the art, including electroporation, calcium phosphate co-precipitation, microinjection, lipofection, polyfection, and conjugation to cell penetrating peptides (CPPs). The siRNA molecules can also be introduced into cells in vivo by direct delivery into specific organs such as the liver, brain, eye, lung and heart, or systemic delivery into the blood stream or nasal passage using naked siRNA molecules or siRNA molecules encapsulated in biodegradable polymer microspheres (Gilmore et al., J. Drug Targeting 12:315 (2004)).
Alternatively, siRNA molecules targeted to specific mRNA sequences can be introduced into cells in vivo by endogenous production from an expression vector(s) encoding the sense and antisense siRNA sequences. Accordingly, another aspect of the present invention provides an expression vector comprising at least one DNA sequence encoding a siRNA molecule corresponding to at least a portion of a specific mRNA nucleic acid sequence capable of inhibiting expression of a specific mRNA in a cell operably linked to a genetic control element capable of directing expression of the siRNA molecule in a cell. Expression vectors can be transfected into cells using any of the methods described above.
Genetic control elements include a transcriptional promoter, and may also include transcription enhancers to elevate the level of mRNA expression, a sequence that encodes a suitable ribosome binding site, and sequences that terminate transcription. Suitable eukaryotic promoters include constitutive RNA polymerase II promoters (e.g., cytomegalovirus (CMV) promoter, the SV40 early promoter region, the promoter contained in the 3′ long terminal repeat of Rous sarcoma virus (RSV), the herpes thymidine kinase (TK) promoter, and the chicken beta-actin promoter), cardiac-tissue-specific RNA polymerase II promoters (e.g., the ventricular myosin light chain 2 (MLC-2v) promoter, and the sodium-calcium exchanger gene H1 promoter (NCX1H1)), and RNA polymerase III promoters (e.g., U6, H1, 7SK and 7SL).
In some embodiments, the sense and antisense strands of siRNA molecules are encoded by different expression vectors (i.e., cotransfected) (e.g., Yu et al., Proc. Natl. Acad. Sci. USA 99:6047 (2002)). In other embodiments, the sense and antisense strands of siRNA molecules are encoded by the same expression vector. The sense and antisense strands can be expressed separately from a single expression vector, using either convergent or divergent transcription (e.g., Wang et al., Proc. Natl. Acad. Sci. USA 100:5103 (2003); Tran et al., BMC Biotechnol. 3:21 (2003)). Alternatively, the sense and antisense strands can be expressed together from a single expression vector in the form of a single hairpin RNA molecule, either as a short hairpin RNA (shRNA) molecule (e.g., Arts et al., Genome Res. 13:2325 (2003)) or a long hairpin RNA molecule (e.g., Paddison et al., Proc. Natl. Acad. Sci. USA 99:1443 (2002)).
Although numerous expression vectors can be used to express siRNA molecules in cells (Dorsett and Tuschl, Nat. Rev. Drug Discov. 3:318 (2004)), viral expression vectors are preferred, particularly those that efficiently transduce heart cells (e.g., alphaviral, lentiviral, retroviral, adenoviral, adeno-associated viral (AAV)) (Williams and Koch, Annu. Rev. Physiol. 66:49 (2004); del Monte and Hajjar, J. Physiol. 546.1:49 (2003)). Both adenoviral and AAV vectors have been shown to be effective at delivering transgenes (including transgenes directed to diseases) into heart, including failing cardiomyocytes (e.g., Iwanaga et al., J. Clin. Invest. 113:727 (2004); Seth et al., Proc. Natl. Acad. Sci. USA 101:16683 (2004); Champion et al., Circulation 108:2790 (2003); Li et al., Gene Ther. 10:1807 (2003); Vassalli et al., Int. J. Cardiol. 90:229 (2003); del Monte et al., Circulation 105:904 (2002); Hoshijima et al., Nat. Med. 8:864 (2002); Eizema et al., Circulation 101:2193 (2000); Miyamoto et al., Proc. Natl. Acad. Sci. USA 97:793 (2000); He et al., Circulation 100:974 (1999)). Recent reports have demonstrated the use of AAV vectors for sustained gene expression in mouse and hamster myocardium and arteries for over one year (Li et al., Gene Ther. 10:1807 (2003); Vassalli et al., Int. J. Cardiol. 90:229 (2003)). In particular, expression vectors based on AAV serotype 6 have been shown to efficiently transduce both skeletal and cardiac muscle (e.g., Blankinship et al., Mol. Ther. 10:671 (2004)). The present invention also provides for the use of coxsackie viral vectors for delivery of desired siRNA sequences.
Following introduction of the desired siRNA molecules into cells, changes in desired gene product levels can be measured if desired. Desired gene products include, for example, desired mRNA and desired polypeptide, and both can be measured using methods well-known to those skilled in the art. For example, desired mRNA can be directly detected and quantified using, e.g., Northern hybridization, in situ hybridization, dot and slot blots, or oligonucleotide arrays, or can be amplified before detection and quantitation using, e.g., polymerase chain reaction (PCR), reverse-transcription-PCR (RT-PCR), PCR-enzyme-linked immunosorbent assay (PCR-ELISA), or ligase chain reaction (LCR).
Desired polypeptide (or fragments thereof) can be detected and quantified using various well-known immunological assays, such as, e.g., enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), immunoprecipitation, immunofluorescence, and Western blotting. Anti-desired antibodies (preferably anti-human desired) for use in immunological assays are commercially available from, e.g., EMD Biosciences (San Diego, Calif.), Upstate (Charlottesville, Va.), Abcam (Cambridge, Mass.), Affinity Bioreagents (Golden, Colo.) and Novus Biologicals (Littleton, Colo.), or may be produced by methods well-known to those skilled in the art.
As noted above, prior art identifies multiple SNPs on the IT15 gene. However, the prior art is silent regarding frequencies of these SNPs in Huntington's patients. Thus, the practitioner does not have guidance as to which SNPs or SNP combinations provide better targets for diagnosis and treatment of the population of Huntington's patients as a whole. Lack of such information leads to loss of time, reagents and needlessly increased costs of treating the patients.
The instant application addresses this need by disclosing SNPs and SNP combinations (i.e., groups of 2, 3, 4, 5, 6, and 7 SNPs) which encompass a portion of the population of Huntington's patients that is greater than the portion which would be encompassed by randomly selecting the corresponding number of SNPs (i.e., 2, 3, 4, 5, 6 or 7). The SNPs targeted by the kits of the instant invention are selected from rs362331, rs362307, rs35892913, rs362267, rs2276881, rs17781557, and rs363125, as exemplified in Table 2.
In these sequences, a non-standard character, (i.e., “y”, “r”, “k”, or “m”) designates the SNP nucleotide, and “y” means “C” or “T” (or, in case of RNA, “U”), “r” means “A” or “G”, “k” means “G” or “T” (or, in case of RNA, “U”), and “m” means “A” or “C.”
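For illustration, the convention described above can be expressed as a small lookup that expands a sequence containing one SNP code into its two allelic variants. This is a minimal sketch: the function name and the example sequence in the usage note are hypothetical and are not taken from the sequence listing.

```python
# SNP codes as defined in the text; for RNA, "T" is written as "U".
SNP_CODES = {
    "y": ("C", "T"),
    "r": ("A", "G"),
    "k": ("G", "T"),
    "m": ("A", "C"),
}

def expand_snp_variants(sequence, rna=False):
    """Return the two allelic sequences encoded by a sequence with one SNP code."""
    positions = [i for i, ch in enumerate(sequence) if ch in SNP_CODES]
    if len(positions) != 1:
        raise ValueError("expected exactly one SNP code (y, r, k or m)")
    i = positions[0]
    variants = []
    for base in SNP_CODES[sequence[i]]:
        if rna and base == "T":
            base = "U"  # RNA form of the variant nucleotide
        variants.append(sequence[:i] + base + sequence[i + 1:])
    return variants
```

For a hypothetical sequence "ACGyTTA", the function returns the C variant "ACGCTTA" and the T variant "ACGTTTA".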
Notably, in addition to the SNPs previously disclosed in U.S. Patent Publication No. 20060270623 (McSwiggen), the instant invention discloses two SNPs, rs35892913 and rs17781557, not listed in McSwiggen, which are particularly useful for the kits of the instant invention.
Generally, a kit of the instant invention comprises between two and seven sets of primers. Each set of primers targets an SNP at a unique location within the IT15 gene. Thus, the kits will target between two and seven SNPs, respectively.
Further, within each set, at least two primers are present. The first primer selectively hybridizes to a first variant of a SNP, and at least the second primer selectively hybridizes to at least a second variant of the same SNP.
By way of example, Table 3 provides a list of primers suitable for allele-specific reverse transcription reaction targeting SNPs rs363125, rs362331, rs362307, rs35892913, rs2276881, rs17781557, and rs362267.
In one embodiment, the primers for the seven SNPs are as follows:
Applicants have surprisingly discovered that if these seven SNPs or subcombinations thereof are used, as opposed to randomly selected SNPs, an increased percentage of Huntington's patients (preferably, those of European descent) can be diagnosed (i.e., it can be determined which SNP variant of the IT15 gene contains an increased number of CAG repeats), and accordingly treated in an allele-specific manner.
In one embodiment, the SNP pair is rs362331 and rs362307. If these SNPs are used, 65.1% [99% CI=58.0% to 70.8%] of Huntington's disease patients of European descent are amenable to diagnosis and treatment with the corresponding kit.
In another embodiment, the kit targets three SNPs. In a preferred embodiment, the SNPs are rs362331, rs362307, and rs35892913. If these SNPs are used, 74.6% [99% CI=67.9% to 80.6%] of Huntington's disease patients of European descent are amenable to diagnosis and treatment with the corresponding kit.
In another embodiment, the kit targets four SNPs. In a preferred embodiment, the SNPs are rs362331, rs362307, rs35892913, and rs362267. If these SNPs are used, 78.6% [99% CI=72.2% to 84.1%] of Huntington's disease patients of European descent are amenable to diagnosis and treatment with the corresponding kit.
In another embodiment, the kit targets five SNPs. In a preferred embodiment, the SNPs are rs362331, rs362307, rs35892913, rs362267, and rs2276881. If these SNPs are used, 82.3% [99% CI=76.2% to 87.4%] of Huntington's disease patients of European descent are amenable to diagnosis and treatment with the corresponding kit.
In another embodiment, the kit targets six SNPs. In a preferred embodiment, the SNPs are rs362331, rs362307, rs35892913, rs362267, rs2276881, and rs17781557. If these SNPs are used, 85.3% [99% CI=79.6% to 90.0%] of Huntington's disease patients of European descent are amenable to diagnosis and treatment with the corresponding kit.
In another embodiment, the kit targets all seven preferred SNPs. If these SNPs are used, then 85.6% [99% CI=80.0% to 90.2%] of Huntington's disease patients of European descent are amenable to diagnosis and treatment with the corresponding kit.
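The coverage figures stated above for kits of two to seven SNPs can be tabulated to show the incremental gain from each added SNP. The following is only an arithmetic sketch over the point estimates given in the text; the variable names are illustrative, and the confidence intervals are not taken into account.

```python
# Point estimates of patient coverage (%) by number of SNPs in the kit, from the text.
COVERAGE_BY_KIT_SIZE = {2: 65.1, 3: 74.6, 4: 78.6, 5: 82.3, 6: 85.3, 7: 85.6}

def incremental_gains(coverage):
    """Return the gain in coverage (percentage points) for each added SNP."""
    sizes = sorted(coverage)
    return {
        size: round(coverage[size] - coverage[prev], 1)
        for prev, size in zip(sizes, sizes[1:])
    }

gains = incremental_gains(COVERAGE_BY_KIT_SIZE)
for size, gain in gains.items():
    print(f"{size} SNPs: +{gain} percentage points over {size - 1} SNPs")
```

On these point estimates, the gain from each added SNP shrinks as the kit grows, and the seventh SNP adds 0.3 percentage points.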
In another aspect, the kit comprises a set of PCR primers, one of which hybridizes to a portion of the IT15 gene located upstream of the CAG repeat region, and the other hybridizes to a portion of the IT15 gene located downstream of the CAG repeat region. It is preferred that the product resulting from the PCR reaction using these primers would be relatively short for an easier differentiation between the normal CAG repeat number and an increased CAG repeat number. Suitable non-limiting examples of such PCR primers are SEQ ID NOs 122 (5′-GCCTTCGAGTCCCTCAAGT-3′) and 123 (5′-GACAATGATTCACACGGTCT-3′). If the product resulting from the PCR reaction using SEQ ID NOs 122 and 123 as PCR primers has length greater than or equal to 343 base pairs then the allele used as the template has an abnormal number of CAG repeats. Specifically, the expected PCR product is 226 base pairs plus (3 times N base pairs) where N is the number of CAG repeats in the Huntington messenger RNA; a mutant, disease-causing Huntington messenger RNA has 39 or more CAG repeats.
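The relationship between PCR product length and CAG repeat number stated above can be written out as a short calculation. This is a minimal sketch: the constants 226 and 39 come from the text, while the function names are illustrative. The inverse function assumes an exact product length (for example, from sequencing), since a length estimated from a gel would only be approximate.

```python
BASE_LENGTH_BP = 226     # product length excluding the CAG repeat tract (from the text)
PATHOGENIC_REPEATS = 39  # mutant threshold stated in the text

def expected_product_length(cag_repeats):
    """Expected length in base pairs: 226 + 3 * N."""
    return BASE_LENGTH_BP + 3 * cag_repeats

def repeats_from_product_length(length_bp):
    """Invert the formula; requires an exact length consistent with 226 + 3 * N."""
    extra = length_bp - BASE_LENGTH_BP
    if extra < 0 or extra % 3 != 0:
        raise ValueError("length is not of the form 226 + 3 * N")
    return extra // 3

def is_expanded(length_bp):
    """True if the product is at least as long as the 39-repeat product (343 bp)."""
    return length_bp >= expected_product_length(PATHOGENIC_REPEATS)
```

For example, an allele with 39 repeats gives 226 + 3 × 39 = 343 base pairs, which is the threshold length recited above.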
In yet another embodiment, the kits may further comprise a corresponding number of sets of allele-specific therapeutic agents against the same SNPs as the corresponding SNP-specific primers. Thus, if the kit targets two SNPs (e.g., rs362331 and rs362307), two sets of therapeutic agents are provided. The first set of therapeutic agents is against the variants of SNP rs362331, and the second set of therapeutic agents is against the variants of SNP rs362307.
Within each set, at least two populations of therapeutic agents are provided. The first population targets (i.e., preferentially inhibits) a first variant of a SNP, and at least the second population targets at least the second variant of that SNP. Thus, the kits provide a means of applying an allele-specific therapy to the single nucleotide polymorphism variant containing the abnormal number of CAG repeats.
In some embodiments, three sets of therapeutic agents are provided with kits targeting three SNPs, four sets of therapeutic agents are provided with kits targeting four SNPs, five sets of therapeutic agents are provided with kits targeting five SNPs, six sets of therapeutic agents are provided with kits targeting six SNPs, and seven sets of therapeutic agents are provided with kits targeting seven SNPs.
As discussed above, the therapeutic agents are short RNA molecules comprising a double-stranded portion. In different embodiments, these molecules can be provided as two strands (shown as strand 1 and strand 2 in Table 5, for example) connected by A-U and G-C base pairs, and in another embodiment, such molecules can be represented as single chain molecules forming double-stranded portions by means of a loop.
In other embodiments, the therapeutic agents may be provided as expression constructs (e.g., expression cassette, a vector, etc) which encode the short RNA molecule as described above.
Non-limiting examples of suitable allele-specific therapeutic agents are provided in Table 5. As used in that table, “1” in “1a)” and “1b)” indicates that the SNP targeted by the therapeutic siRNA is SNP rs362331, “2” in “2a)” and “2b)” indicates SNP rs362307, “3” indicates SNP rs35892913, “4” indicates SNP rs362267, “5” indicates SNP rs2276881, “6” indicates SNP rs17781557, and “7” indicates SNP rs363125. Designations “a” and “b” in the left column refer to the different variants of the same SNP.
Thus, the kit can be used in several steps. First, the practitioner determines whether a patient heterozygous for the mutated Huntingtin is also heterozygous for one of the SNPs targeted in a kit, and also which SNP variant is present in the mutated IT15 gene. Once the suitable SNP variant is found, allele-specific treatment is administered.
In a patient that has an expanded CAG repeat on both of his or her alleles, and is thus homozygous with respect to the disease-causing CAG repeat length, the kit can be used to identify an allele-specific treatment targeting the larger of his or her two alleles. As will be appreciated by those knowledgeable in the art and in the clinical aspects of Huntington's disease, an allele-specific treatment targeting the larger of the two alleles in a homozygous patient may have benefit in delaying the onset of the patient's disease symptoms, since shorter CAG repeat regions are associated with a later onset of disease symptoms in the patient's life.
By way of example, the algorithm for the use of a kit targeting two SNPs, (e.g., SNP rs362331 and SNP rs362307) may be expressed as follows:
If an increased number of CAG repeats (greater than or equal to 39) is present at the C allele of SNP rs362331 and a normal number of CAG repeats is present at the T allele of SNP rs362331, then the practitioner should select an allele-specific therapy exemplified in SEQ ID NOs: 124-131, e.g., duplexes of SEQ ID NOs 124 and 125, 126 and 127, 128 and 129, or 130 and 131.
On the other hand, if an increased number of CAG repeats (greater than or equal to 39) is present at the T allele of SNP rs362331 and a normal number of CAG repeats is present at the C allele of SNP rs362331, then the practitioner should select an allele-specific therapy exemplified in SEQ ID NOs: 132-139, e.g., duplexes of SEQ ID NOs 132 and 133, 134 and 135, 136 and 137, or 138 and 139.
If the analysis of the patient's mRNA by RT-PCR using the allele-specific reverse transcription primers of SEQ ID NOs 39 and 47 shows that the patient is homozygous at SNP site rs362331 (revealed by the fact that only one of the two reverse transcription primers yields an RT-PCR product) then SNP rs362331 is not suitable for allele-specific treatment of this particular patient. Accordingly, the practitioner should select another SNP, which, in this particular example, is SNP rs362307.
If an increased number of CAG repeats (greater than or equal to 39) is present at the C allele of SNP rs362307 and a normal number of CAG repeats is present at the T allele of SNP rs362307, then the practitioner should select an allele-specific therapy exemplified in SEQ ID NOs: 140-147, e.g., duplexes of SEQ ID NOs 140 and 141, 142 and 143, 144 and 145, or 146 and 147.
On the other hand, if an increased number of CAG repeats (greater than or equal to 39) is present at the T allele of SNP rs362307 and a normal number of CAG repeats is present at the C allele of SNP rs362307, then the practitioner should select an allele-specific therapy exemplified in SEQ ID NOs: 148-155, e.g., duplexes of SEQ ID NOs 148 and 149, 150 and 151, 152 and 153, or 154 and 155.
If the analysis of the patient's mRNA by RT-PCR using the allele-specific reverse transcription primers of SEQ ID NOs 56 and 63 shows that the patient is homozygous at SNP site rs362307 (revealed by the fact that only one of the two reverse transcription primers yields an RT-PCR product) then SNP rs362307 is not suitable for allele-specific treatment of this particular patient.
A similar algorithm applies to allele-specific diagnosis and allele-specific treatment using kits with a greater number of SNPs, such as 3, 4, 5, 6, or 7.
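The two-SNP selection logic described above can be sketched in code as follows. This is an illustrative sketch rather than a clinical procedure: the SEQ ID NO ranges and the threshold of 39 repeats are taken from the text, whereas the function name, the input format (a mapping from SNP identifier to the CAG repeat count found for each allele that yielded an RT-PCR product), and the order in which SNPs are tried are assumptions. The case of a patient with expanded repeats on both alleles is not handled here.

```python
PATHOGENIC_REPEATS = 39  # threshold stated in the text

# SEQ ID NO ranges for each (SNP, allele carrying the expansion), from the text.
THERAPY_SEQ_IDS = {
    ("rs362331", "C"): range(124, 132),
    ("rs362331", "T"): range(132, 140),
    ("rs362307", "C"): range(140, 148),
    ("rs362307", "T"): range(148, 156),
}

def select_therapy(genotyping, snp_order=("rs362331", "rs362307")):
    """Return (snp, expanded_allele, duplexes) for the first usable SNP, else None.

    genotyping maps SNP id -> {allele: CAG repeat count}; a SNP with only one
    allele entry is treated as homozygous and skipped.
    """
    for snp in snp_order:
        alleles = genotyping.get(snp, {})
        if len(alleles) < 2:
            continue  # homozygous at this SNP: not suitable for this patient
        expanded = [a for a, n in alleles.items() if n >= PATHOGENIC_REPEATS]
        if len(expanded) != 1:
            continue  # no expansion, or expansion on both alleles (not covered)
        ids = list(THERAPY_SEQ_IDS[(snp, expanded[0])])
        duplexes = list(zip(ids[0::2], ids[1::2]))  # e.g. (124, 125), (126, 127), ...
        return snp, expanded[0], duplexes
    return None
```

Extending the lookup table and `snp_order` with further SNPs would give the corresponding logic for kits targeting three to seven SNPs, provided the matching SEQ ID NOs are supplied.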
Specific embodiments according to the methods of the present invention will now be described in the following examples. The examples are illustrative only, and are not intended to limit the remainder of the disclosure in any way.
RNA-Isolation and Reverse Transcription Reaction.
Applicants analyzed the CAG-repeat sequences in the Huntington's disease gene in the mRNA obtained from a patient's cells using the following allele-specific reverse transcription reaction. The subject SNP sites in the Huntington's disease gene (IT15) are designated using the identification number provided by the National Center for Biotechnology Information (NCBI) database, accessible at: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=snp.
Development of Allele-Specific Reverse Transcription.
Donor-1 (a Caucasian female who was 28 years old at the time of cell collection and whose mother was diagnosed with HD at the age 49) was determined to be heterozygous (adenine versus cytosine) at SNP site rs363125. In order to design an allele-specific RNA interference-based therapy for donor-1, it is necessary to determine which of these SNP sequences is associated with the expanded CAG repeat mutation that is located approximately 5000 nucleotides upstream from the SNP position. Appropriate siRNA or shRNA targeting SNP site rs363125 should only reduce protein expression from the allele that contains the expanded CAG repeat and should therefore only be specific for the associated SNP. Correspondence between the SNP identity and the expanded allele cannot be readily determined by cDNA sequencing or by comparing the lengths of PCR products spanning the SNP position and the CAG repeat region. To solve this problem, the inventors developed a strategy that uses SNP-specific reverse transcription (RT) primers to selectively generate cDNA from only one allelic species of HD mRNA. The primers each contained either guanine or thymine at the 3′ terminal position, corresponding to SNP site rs363125, as shown in
RNA isolated from the fibroblasts was reverse transcribed using the Superscript III RT kit (Invitrogen) in the presence of 100 nM of one of the following DNA primers: the 20-mer 5′-GTGTTCTTCTAGCGTTGAAT-3′, SEQ ID NO: 19 (or a shorter, corresponding 15-mer or 10-mer ending in T-3′, corresponding to SEQ ID NO: 14 and SEQ ID NO: 9, respectively) or the 20-mer primer 5′-GTGTTCTTCTAGCGTTGAAG-3′, SEQ ID NO: 31 (or a shorter, corresponding 15-mer or 10-mer primer ending in G-3′, corresponding to SEQ ID NO: 26 and SEQ ID NO: 21, respectively). The CAG repeat sequence on either RT product was then amplified by PCR (Bio-Rad iCycler) using Accuprime GC-Rich DNA-polymerase (Invitrogen), and forward primer 5′-GCCTTCGAGTCCCTCAAGT-3′ and reverse primer 5′-GACAATGATTCACACGGTCT-3′ at 0.2 μM each (SEQ ID NO: 122 and SEQ ID NO: 123, respectively). The resulting PCR products contain the complete CAG repeat sequence of one of the two alleles of the GM04022 cells. CAG-repeat size for each allelic RT product was determined using standard 1.5% agarose gel electrophoresis with ethidium bromide staining, and also by sequencing of the products of the PCR amplification of the CAG repeat region.
Gel electrophoresis of the respective PCR products (
Correspondence Between a SNP Identity and an Individual's Expanded Allele.
Based on the relative sizes of the PCR products in
Development of SNP-Specific Real Time PCR Assays.
To be able to verify that allele-specific suppression of HD mRNA is occurring in cells, it is necessary to be able to quantify the amount of HD mRNA corresponding to each allele individually. Molecular beacons are synthetic oligonucleotide probes that have a fluorophore and a quencher covalently linked to the respective ends of the oligo. In solution, the beacon adopts a hairpin conformation, causing the fluorophore to be quenched. However, upon hybridization with complementary DNA in a PCR reaction, the hairpin conformation is lost and fluorescence from the fluorophore can be detected. Beacons can be constructed such that as little as a single nucleotide mismatch between the beacon and the complementary DNA is sufficient for the probe to be more stable in its self-annealed state than in the probe-cDNA hybrid. The inventors designed two such beacons corresponding to the two allelic variants of SNP rs363125, for the A allele and the C allele, respectively (
As shown in
Allele-Specific Suppression of Huntingtin mRNA.
In one set of experiments, fibroblasts were cultured in 25 cm2 culture flasks (Nunc) as described, but without the addition of PSN antibiotics and Fungizone. Lipofectamine 2000 (Invitrogen) was used to conduct siRNA transfection at three different conditions: 1) mock transfection (n=8); 2) transfection with scrambled siRNA (Ambion) (n=4); and 3) transfection with siRNA sequence 5′-GAAGUACUGUCCCCAUCUCdTdT-3′, SEQ ID NO: 228 and its complementary strand SEQ ID NO: 229, (Ambion) (n=7) at a concentration of 100 nM. This siRNA has a guanine located at position 16 relative to the 3′ end of the complementary region of the target huntingtin mRNA, providing specificity for the allele containing cytosine at the SNP position. A parallel cell culture was transfected with a fluorescently labeled Block-It siRNA (Invitrogen) to verify efficiency of Lipofectamine 2000 transfection. The cells were incubated overnight at 37° C., 5% CO2 and maximum humidity. The cultures were then washed with PBS and fluorescence microscopy (Leica DM-IRB) was used to confirm transfection in the Block-It transfected cultures, which were considered representative for all transfection conditions. The cells were cultured for another day before RNA isolation as described, approximately 48 hours post-transfection.
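The position of the SNP-discriminating nucleotide within this siRNA can be checked with a few lines of code. This is a sketch under stated assumptions: only the 19-nucleotide RNA portion of SEQ ID NO: 228 is used (the dTdT overhang is omitted), and the opposite strand is computed here as a perfect reverse complement, which is an assumption because the sequence of SEQ ID NO: 229 is not reproduced in this passage.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))

# 19-nt RNA portion of SEQ ID NO: 228, written 5' to 3' (dTdT overhang omitted).
listed_strand = "GAAGUACUGUCCCCAUCUC"

# Assumed perfectly complementary opposite strand, written 5' to 3'.
opposite_strand = reverse_complement_rna(listed_strand)

# Position 16 counted from the 3' end of the listed strand.
base_16_from_3_prime = listed_strand[-16]

# Position 16 counted from the 5' end of the opposite strand.
base_16_from_5_prime = opposite_strand[15]

print(opposite_strand, base_16_from_3_prime, base_16_from_5_prime)
```

Under these assumptions, the listed strand carries a guanine at position 16 from its 3′ end, and the opposite strand carries the pairing cytosine at position 16 from its 5′ end.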
Fibroblasts from donor-1 were transfected with siRNA designed to specifically target the mRNA containing cytosine at SNP site rs363125. This siRNA molecule (siRNA 363125_C-16, GAAGUACUGUCCCCAUCUCdTdT, comprising SEQ ID NO: 228) was designed such that the cytosine nucleotide of the SNP is located at position 16 relative to the 5′ end of the sense strand of the siRNA molecule. The amount of mRNA from both endogenous alleles was separately quantified using the molecular beacons developed to be specific for the allelic variants at this SNP site. The results showed that about 48 hours following treatment of the fibroblasts with siRNA 363125_C-16, mRNA transcripts containing cytosine at position rs363125 were detected at levels approximately 80% lower (p<0.01, two-tailed) than those detected in controls that were mock transfected, or transfected with a scrambled siRNA (
Further experiments described below show that Applicants have developed siNA molecules which are capable of selectively inhibiting different variants of the same SNP.
In one set of experiments, psiCheck-2 vector constructs were transfected into NIH/3T3 cells expressing a chimeric gene comprising a region of IT15 and the Renilla luciferase gene. NIH/3T3 cells were seeded at 5,000 cells per well into wells of a 96-well plate. The next day, cells were transfected with a psiCheck-2 plasmid vector containing a target sequence from the region surrounding a SNP site in the Huntington's disease gene, as described below. This transfection consisted of a cotransfection of 100 nanograms of plasmid and 10 picomoles of candidate allele-specific siRNA using 0.4 microliters of DharmaFECT Duo (Dharmacon). On the third day, the medium of the cells was refreshed, and on day 4 the cells were assayed for relative levels of expression of Renilla compared to Firefly luciferase.
The psiCheck-2 vector (Promega, catalog number 8021) is designed to monitor changes in expression of a target gene which is fused to the Renilla luciferase reporter gene. The target gene of interest is cloned into the multiple cloning region downstream of the translational stop codon. Once this vector is transfected into mammalian cells, a fusion mRNA of the Renilla luciferase and the gene of interest is transcribed, which is translated to the Renilla luciferase enzyme. siRNA molecules or vectors expressing shRNA molecules are cotransfected into the cells. If the specific siRNA/shRNA binds to the target mRNA and initiates the RNA interference process, the fused Renilla luciferase plus gene of interest mRNA will be cleaved and subsequently degraded, decreasing the amount of luciferase enzyme. The activity of the luciferase enzyme is measured for quantification.
The psiCheck-2 vector also contains a secondary firefly reporter gene. This firefly reporter gene is an intraplasmid normalization reporter used to normalize for differences in transfection efficiency. Thus the Renilla luciferase signal is normalized to the Firefly luciferase signal.
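The normalization described above can be illustrated with a minimal sketch. The function names and the luminometer readings below are hypothetical and chosen only for illustration; they are not data from the experiments, and the sketch assumes each well yields one Renilla reading and one Firefly reading.

```python
def normalized_ratio(renilla: float, firefly: float) -> float:
    """Renilla signal divided by the Firefly signal from the same well."""
    if firefly <= 0:
        raise ValueError("firefly signal must be positive")
    return renilla / firefly


def percent_reduction(treated_ratio: float, control_ratio: float) -> float:
    """Percent drop in normalized Renilla expression relative to a control well."""
    return 100.0 * (1.0 - treated_ratio / control_ratio)


# Hypothetical readings in arbitrary units (assumed values, not measured data).
control = normalized_ratio(renilla=50_000, firefly=25_000)
treated = normalized_ratio(renilla=9_000, firefly=22_500)
reduction = percent_reduction(treated, control)
print(f"normalized reduction: {reduction:.1f}%")
```

Dividing by the Firefly signal before comparing wells means that a well which simply took up less plasmid is not mistaken for a well in which the siRNA was effective.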
For each allelic variant of the Huntington gene SNPs shown in each graph, the region of the Huntington gene containing the SNP (with the nucleotide for the allelic variant of that SNP, e.g., an A) was cloned into the psiCheck-2 vector at the multiple cloning region downstream of the translational stop codon for the Renilla luciferase reporter. Then, the plasmid was transfected into cells, and a candidate allele-specific siRNA was cotransfected into those cells to determine how effectively the candidate allele-specific siRNA targets its matching SNP variant. Similarly, another plasmid containing the same sample region of the Huntington gene containing the SNP, but with the nucleotide (e.g., a G) for the other allelic variant of the SNP was prepared. Then, this other plasmid was transfected into cells, and the candidate allele-specific siRNA was applied to those cells to determine how effectively (or ineffectively) the candidate allele-specific siRNA targets its non-matching allelic variant.
In one set of experiments, the vectors contained SEQ ID NO 258: 5′ TCGACCTGTACAGTAATTAATAGGTTAAGAGATGGGGACAGTAATTCAACGCT AGAAGAACACAGTGAAGGGAAACAAATAAAG 3′ or SEQ ID NO 259: 5′ TCGACCTGTACAGTAATTAATAGGTTAAGAGATGGGGACAGTACTTCAACGCT AGAAGAACACAGTGAAGGGAAACAAATAAAG 3′ comprising allelic variants of rs363125 (as exemplified by SEQ ID NO: 7). For each allelic variant of that SNP, the following groups were used:
Thus, for example, when cells are transfected with a plasmid containing the region of the Huntington gene SEQ ID NO: 258 containing rs363125 (as exemplified by SEQ ID NO: 7), with an A at the SNP position, the candidate allele-specific siRNA matching that A allele (SEQ ID NOs 220 and 221) results in nearly 80% reduction in the Renilla luciferase reporter expression compared to the Firefly luciferase signal. However, when cells are transfected with a plasmid containing SEQ ID NO: 259 comprising the same region of the Huntington gene containing rs363125 but with a C at the SNP position, the candidate allele-specific siRNA (SEQ ID NOs 220 and 221) matching the A allele (and thus mismatching the C in the plasmid) results in only about 30% reduction in the Renilla luciferase reporter expression compared to the Firefly luciferase signal, as shown in
Referring to the experiments illustrated in
The cells were harvested approximately 48 hours after transfection, and Renilla luciferase activity was measured and normalized to firefly luciferase, which was used as an internal control. siRNAs 124 and 132, as designated in the previous paragraph, subsections (d) and (g), respectively, showed the most promising results. In particular, siRNA 124 suppressed the C-allele to below 10% of control, while the expression of the T-allele was reduced to about 50%. siRNA 132 did not affect the expression of the C-allele while reducing the expression of the T-allele to less than about 35%.
In a set of experiments illustrated in
The results of the experiments indicate that siRNAs 188 and 196, as described in the previous paragraph, sections (b) and (e), respectively, were the most efficient. siRNA 188 did not affect the expression of the G-allele while decreasing the expression of the A-allele by over 90%. siRNA 196 did not significantly decrease the expression of the A-allele, while decreasing the expression of the G-allele by over 90%. Notably, siRNAs 246 and 248, as described in the previous paragraph, sections (c) and (f), respectively, which have mutations adjacent to the SNP site, unexpectedly did not effectively inhibit either the A-allele or the G-allele, contrary to the teaching of Davidson (see US 20040241854, paragraphs 323-338).
Genetic material from 327 Huntington's patients (obtained by the Carlo Besta Institute, Milan, Italy, with informed consent from the patients for use of the material for research purposes) was analyzed for the presence and frequency of SNPs, including those disclosed in the prior art. The following results were obtained:
The median heterozygosity on these SNPs is only 14.37%, and the mean heterozygosity is 16.41%. Thus, these data underscore the importance of optimizing SNP choice for allele-specific diagnosis and allele-specific treatment of patients suffering from or at risk of developing Huntington's disease.
From these data, however, one cannot make any conclusions regarding the heterozygosity rates obtained jointly on two or more SNPs (i.e., what percent of the patients are heterozygous on at least one SNP out of some set of SNPs.) For example, while 14.37% of patients are heterozygous at rs362303, and 16.21% are heterozygous at rs362304, the number of patients who are heterozygous on at least one of the two SNPs is only 16.51% (not anywhere near the sum of the two, nor any other mathematical combination of the individual percentages). This is because these events (heterozygosity at a SNP) are not independent events, but are linked (because the SNPs are on the same gene, some of them quite close to each other). Accordingly, mechanistically selecting SNPs with the highest frequency of heterozygosity will not necessarily yield the kit capable of encompassing the greatest proportion of the Huntington's population. Conversely, some SNPs have a relatively low frequency of heterozygosity but provide a greater benefit than SNPs with a higher frequency of heterozygosity (e.g., compare heterozygosity frequencies at rs17781557 and rs362273).
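The point that joint heterozygosity must be counted patient by patient, rather than derived from the individual percentages, can be sketched as follows. The SNP labels `snp_a`, `snp_b`, `snp_c` and the ten-patient data set are hypothetical and serve only to illustrate the counting; they are not the genotypes of the 327 patients.

```python
from typing import Dict, List


def heterozygosity(flags: List[bool]) -> float:
    """Percent of patients flagged as heterozygous."""
    return 100.0 * sum(flags) / len(flags)


def joint_heterozygosity(genotypes: Dict[str, List[bool]], snps: List[str]) -> float:
    """Percent of patients heterozygous at one or more of the given SNPs."""
    per_patient = zip(*(genotypes[s] for s in snps))
    return heterozygosity([any(calls) for calls in per_patient])


# Toy data: True means the patient is heterozygous at that SNP (assumed values).
toy = {
    "snp_a": [True, True, True, False, False, False, False, False, False, False],
    "snp_b": [True, True, False, True, False, False, False, False, False, False],
    "snp_c": [False, False, False, False, True, True, False, False, False, False],
}

pair_ab = joint_heterozygosity(toy, ["snp_a", "snp_b"])  # linked: mostly the same patients
pair_ac = joint_heterozygosity(toy, ["snp_a", "snp_c"])  # complementary patients
print(pair_ab, pair_ac)
```

In this toy set, `snp_b` is individually more often heterozygous than `snp_c`, yet pairing `snp_a` with `snp_c` covers more patients, because `snp_a` and `snp_b` are heterozygous in largely the same individuals.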
Among the 26 SNPs tested so far (for the 327 patients), the median joint heterozygosity is 30.89% at two randomly chosen SNPs. Among the 325 possible pairs, 11 pairs provide joint heterozygosity which is at least about 75% higher than the median joint heterozygosity (i.e., greater than 30.89 times 1.75, or 54.06%).
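The pair count and the threshold stated above follow from simple arithmetic, which the following sketch reproduces. The variable names are illustrative; the only inputs are the 26 SNPs and the 30.89% median taken from the text.

```python
import math

N_SNPS = 26                  # SNPs tested, per the text
MEDIAN_JOINT_PCT = 30.89     # median joint heterozygosity of a random pair, per the text

# Number of unordered pairs that can be formed from 26 SNPs.
n_pairs = math.comb(N_SNPS, 2)

# A pair qualifies if its joint heterozygosity is at least 75% above the median.
threshold_pct = round(MEDIAN_JOINT_PCT * 1.75, 2)

print(n_pairs, threshold_pct)
```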
As disclosed in Table 6, SNP rs362331 has a heterozygosity frequency of 46.18%. Because the joint heterozygosity of a pair can never be lower than the heterozygosity of either member, if one SNP is rs362331, the combination of this SNP with any other SNP disclosed in Table 6 will necessarily yield superior results compared to 85% of pairs of randomly selected SNPs (because 85%, or 276, out of the 325 possible pairs of SNPs have a joint heterozygosity less than 46.18%). Thus, there are 25 possible pairs, each comprising SNP rs362331, which are among preferred embodiments of the kits of the instant invention.
In addition, other SNP pairs exist which do not comprise SNP rs362331 and which provide heterozygosity at least 75% greater than the joint median heterozygosity of SNPs listed in Table 6. These additional pairs are shown in Table 7.
Thus, the use of these SNP pairs, as discovered by the Applicants, increases the probability of diagnosing and treating a Huntington's patient by at least about 75% over what could be done using a randomly selected pair of SNPs, or the best individual SNP (rs362331, in terms of its individual rate of heterozygosity). In fact, the joint heterozygosity frequency for the best possible pair (rs362331 paired with rs362307), which is 65.14%, increases the probability of heterozygosity by more than 100% over that of a randomly selected pair.
Further, in the unlikely scenario that one designing such a kit for multi-SNP diagnosis and/or treatment of Huntington's patients would select SNP rs362331 (this SNP has a heterozygosity frequency of 46.18%) as a first SNP and select another SNP randomly, the median joint heterozygosity for the 25 pairs where one member of the pair is rs362331 is 48.62%. Thus, the addition of a random SNP to rs362331 increases the probability of joint heterozygosity by only about 5.3%. In contrast, the selected SNPs listed in Table 8 increase the joint heterozygosity by at least about 10.6%, which is twice the increase obtained in a scenario where the second SNP is chosen randomly.
Joint median heterozygosity at SNPs rs362331, rs2276881 and rs363125 is about 55%. Accordingly, the siNAs disclosed in the previous examples may be used in the kits of the instant invention, which kits can diagnose and treat about 55% of Huntington's population in an allele-specific manner.
In more preferred embodiments, the second SNP increases the joint median heterozygosity by at least about 28.5%, which is more than five times greater compared to a randomly selected SNP. SNPs rs35892913, GAG/-, and rs362307 are examples of such preferred candidates for the second SNP.
The joint heterozygosity frequency for the best possible second SNP (rs362307) increases the joint median heterozygosity by about 41.1%, which is more than seven times greater compared to a randomly selected SNP.
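The relative increases quoted in this passage can be checked from the percentages given in the text. The sketch below is only an arithmetic illustration; the helper name `relative_increase` and the constant names are hypothetical, and every number is taken from the preceding paragraphs.

```python
def relative_increase(new_pct: float, base_pct: float) -> float:
    """Relative increase of new_pct over base_pct, expressed in percent."""
    return 100.0 * (new_pct - base_pct) / base_pct


RS362331_ALONE = 46.18          # heterozygosity of rs362331 alone
MEDIAN_WITH_RANDOM_2ND = 48.62  # median over the 25 pairs containing rs362331
BEST_PAIR = 65.14               # rs362331 paired with rs362307
MEDIAN_RANDOM_PAIR = 30.89      # median over all 325 pairs

random_gain = relative_increase(MEDIAN_WITH_RANDOM_2ND, RS362331_ALONE)  # about 5.3
best_gain = relative_increase(BEST_PAIR, RS362331_ALONE)                 # about 41.1
best_vs_random_pair = relative_increase(BEST_PAIR, MEDIAN_RANDOM_PAIR)   # above 100

print(f"{random_gain:.1f}% {best_gain:.1f}% {best_vs_random_pair:.1f}%")
print(f"best second SNP vs random second SNP: {best_gain / random_gain:.1f}x")
```

Note that these are relative increases in the fraction of patients covered, not percentage-point differences: adding rs362307 to rs362331 raises coverage from 46.18% to 65.14% of patients.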
All publications cited in the specification, both patent publications and non-patent publications, are indicative of the level of skill of those skilled in the art to which this invention pertains. All these publications are herein fully incorporated by reference to the same extent as if each individual publication were specifically and individually indicated as being incorporated by reference.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US08/64532 | 5/22/2008 | WO | 00 | 4/16/2010 |