Mitochondrial DNA polymorphisms

[0002] The Sequence Listing associated with this application is provided on CD-ROM in lieu of a paper copy, and is hereby incorporated by reference into the specification. Three CD-ROMs are provided, containing identical copies of the sequence listing: CD-ROM No. 1 is labeled COPY 1, contains the file 461.app.txt which is 11.3 MB and created on Nov. 25, 2002; CD-ROM No. 2 is labeled COPY 2, contains the file 461.app.txt which is 11.3 MB and created on Nov. 25, 2002; CD-ROM No. 3 is labeled CRF, contains the file 461.app.txt which is 11.3 MB and created on Nov. 25, 2002.

TECHNICAL FIELD

[0003] The present invention relates generally to mitochondrial DNA polymorphisms and, more specifically, to compositions and methods based upon the identification of mitochondrial DNA polymorphisms for use in disease diagnosis, prognosis and treatment; patient and population profiling; pharmacogenomics; phylogenetic and population genetic analysis; genealogy; forensics and paternity testing; and related areas.

BACKGROUND OF THE INVENTION

[0004] Mitochondria are the subcellular organelles that manufacture bioenergetically essential adenosine triphosphate (ATP) by oxidative phosphorylation. Functional mitochondria contain gene products encoded by mitochondrial genes situated in mitochondrial DNA (mtDNA) and by extramitochondrial genes not situated in the circular mitochondrial genome. The 16.5 kb mtDNA encodes 22 tRNAs, two ribosomal RNAs (12S and 16S rRNA) and only 13 polypeptide subunits of the electron transport chain (ETC), the elaborate multi-subunit complex mitochondrial assembly where, for example, respiratory oxidative phosphorylation takes place. (See, e.g., Wallace et al., in Mitochondria & Free Radicals in Neurodegenerative Diseases, M. F. Beal, N. Howell and I. Bodis-Wollner, eds., 1997 Wiley-Liss, Inc., New York, pp. 283-307, and references cited therein; see also, e.g., Scheffler, I. E., Mitochondria, 1999 Wiley-Liss, Inc., New York.) More than 5000 copies of the mitochondrial genome may be present within a single cell, due to the presence of numerous mitochondria within a single cell, and of multiple copies of mtDNA within each mitochondrion. Furthermore, since mitochondrial DNA is strictly maternally inherited, all copies of mitochondrial DNA within an individual are generally monoclonal. Finally, certain regions of mtDNA are highly polymorphic. Mitochondrial DNA includes gene sequences encoding a number of ETC components, including seven subunits of NADH dehydrogenase, also known as ETC Complex I (ND1, ND2, ND3, ND4, ND4L, ND5 and ND6); one subunit of Complex III (ubiquinol:cytochrome c oxidoreductase, Cytb); three cytochrome c oxidase (Complex IV) subunits (COX1, COX2 and COX3); and two proton-translocating ATP synthase (Complex V) subunits (ATPase6 and ATPase8).

[0005] The first complete sequence of a human mitochondrial DNA (mtDNA), also referred to as the Cambridge reference sequence (CRS), was originally published in 1981 and was more recently revised (Anderson, S. et al., Nature 290:457-465, 1981; Andrews, R. M. et al., Nature Genetics 23:147, 1999). The mtDNA is strictly maternally inherited and has a mutation rate ten times that of nuclear DNA, with certain regions exhibiting notably high degrees of polymorphism. During human evolution and subsequent colonization of the continents by human subpopulations, the genealogic pattern of mtDNA inheritance indicates divergence of distinct maternal lineages, which are observed to harbor population-specific mtDNA polymorphisms. Thus, nine different European mtDNA haplogroups (e.g., discrete constellations of homoplasmic mtDNA polymorphisms that are highly conserved among members of a common maternal lineage), as well as distinct Asian, Native American and African mtDNA haplogroups, have been described on the basis of the presence or absence in mtDNA of one or several restriction endonuclease recognition sites (described in Wallace et al., Gene 238:211-230, 1999; Torroni, A. et al., Genetics 144:1835-1850, 1996).

[0006] A number of degenerative diseases are thought to be caused by, or are associated with, alterations in mitochondrial function. These diseases include Alzheimer's Disease, diabetes mellitus, Parkinson's Disease, Huntington's disease, dystonia, Leber's hereditary optic neuropathy, schizophrenia, and myodegenerative disorders such as “mitochondrial encephalopathy, lactic acidosis, and stroke” (MELAS), and “myoclonic epilepsy ragged red fiber syndrome” (MERRF). Other diseases involving altered metabolism or respiration within cells may also be regarded as diseases associated with altered mitochondrial function. The extensive list of additional diseases associated with altered mitochondrial function continues to expand as aberrant mitochondrial or mitonuclear activities are implicated in particular disease processes.

[0007] There is evidence to suggest that the genetic basis of at least some diseases associated with altered mitochondrial function resides in mitochondrial DNA rather than in extramitochondrial DNA such as that found in the nucleus (e.g., nuclear chromosomal and extrachromosomal DNA). For example, non-insulin-dependent diabetes mellitus (NIDDM) exhibits a predominantly maternal pattern of inheritance, and in at least some cases this disease appears to be associated with a mitochondrial DNA (mtDNA) abnormality not found in the CRS or the revised CRS. Thus, for instance, approximately 1.5% of all diabetic individuals harbor a mutation at mtDNA position 3243 in the mitochondrial gene encoding leucyl-tRNA (tRNALeu). This mutation is known as the MELAS (mitochondrial encephalopathy, lactic acidosis and stroke) mutation. (Gerbitz et al., Biochim. Biophys. Acta 1271:253-260, 1995.) Similar theories have been advanced for analogous relationships between mtDNA mutations and other diseases associated with altered mitochondrial function, including but not limited to Alzheimer's Disease (AD), Huntington's Disease (HD), Parkinson's Disease (PD), dystonia, Leber's hereditary optic neuropathy (LHON), schizophrenia, and myoclonic epilepsy ragged red fiber syndrome (MERRF) (See, e.g., Chinnery et al., 1999 J. Med. Genet. 36:425). In addition, a limited number of specific single nucleotide polymorphisms that correlate with Alzheimer's disease or with type 2 diabetes mellitus have been identified (see U.S. patent application Ser. No. 09/551,941; see also co-pending application serial No. 60/333,448). Certain somatic mtDNA sequence changes have also been detected in tissues from colorectal cancer (Polyak et al., (1998) Nature Genetics 20:291-293) and in lung, bladder, and head and neck tumors (Fliss et al., (2000) Science 287:2017-2019). The identification of additional mtDNA mutations associated with diseases may provide targets for the development of diagnostic and/or therapeutic agents.

[0008] As is well known in the art generally, and especially with regard to extramitochondrial DNA such as nuclear chromosomal and/or extrachromosomal DNA, direct (e.g., by nucleotide sequencers) or indirect (e.g., by RFLP, SSCP, etc.) determination of DNA sequence variations in individuals and in populations has been used for a wide range of purposes. For example, identification of variability at specific marker loci is useful in a wide range of genetic studies (e.g., genetic counseling, diagnosis of inherited disorders and/or of cancer, pharmacogenetics, etc.), in commercial breeding, in genotyping of samples (e.g., for transplantation, transfusion, cell or tissue grafting, etc.), forensic analysis, paternity testing and the like.

[0009] Currently available DNA-based identification technology makes use of a variety of DNA fingerprinting techniques (for a discussion of common procedures, see Murch, R. S. and Budowle, B., (1997) Are Developments in Forensic Applications of DNA Technology Consistent with Privacy Protection? in Genetic Secrets: Protecting Privacy and Confidentiality in the Genetic Era (ed. Mark A. Rothstein), Yale University Press, New Haven, Conn.). DNA fingerprinting includes a variety of methods for assessing sequence differences in DNA isolated from various sources, e.g., by comparing the presence of marker DNA in samples of isolated DNA. Typically, DNA fingerprinting is used to analyze and compare DNA from different species of organisms, or from different individuals of the same species. Many technologies have been used in DNA fingerprinting, including, inter alia, restriction fragment length polymorphism (RFLP; e.g., Botstein et al. (1980) Am. J. Hum. Genet. 32:314-331), single strand conformation polymorphism (SSCP; Fischer et al. (1983) Proc. Natl. Acad. Sci. USA 80:1579-1583; Orita et al. (1989) Genomics 5:874-879), amplified fragment-length polymorphism (AFLP; Vos et al. (1995) Nucleic Acids Res. 23:4407-4414), microsatellite or simple sequence repeat analysis (SSR; Weber J L and May P E (1989) Am. J. Hum. Genet. 44:388-396), random amplified polymorphic DNA analysis (RAPD; Williams et al. (1990) Nucleic Acids Res. 18:6531-6535), sequence tagged site analysis (STS; Olson et al. (1989) Science 245:1434-1435), genetic-bit analysis (GBA; Nikiforov et al. (1994) Nucleic Acids Res. 22:4167-4175), allele-specific polymerase chain reaction (ASPCR; Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448; Newton et al. (1989) Nucleic Acids Res. 17:2503-2516), nick-translation PCR (e.g., TaqMan™; Lee et al. (1993) Nucleic Acids Res. 21:3761-3766), and allele-specific hybridization (ASH; Wallace et al. (1979) Nucleic Acids Res. 6:3543-3557; Sheldon et al. (1993) Clinical Chemistry 39(4):718-719) among others.

[0010] According to certain commonly used methods of DNA fingerprinting, variable number tandem repeat (VNTR) regions within genomic DNA are genetically characterized by restriction fragment length polymorphisms (RFLP) analysis. Alternative approaches, for example, those that are generally based upon use of the polymerase chain reaction (PCR), include dot-blot assays, electrophoretic analysis and direct nucleic acid sequencing. By such methods, a variety of genomic DNA sequence polymorphisms can be analyzed, including, for example, polymorphisms in HLA-DQA1 (or other loci of the highly polymorphic major histocompatibility complex (MHC)), low-density lipoprotein receptor (LDL-R), glycophorin A, hemoglobin G gammaglobin, D7S8 and other group-specific components.

[0011] When such techniques are directed to analysis of nuclear (e.g., chromosomal) DNA, however, certain limitations become apparent, including, for instance, (1) the availability of only a limited amount of sample material because there are only two copies of each gene per cell, and (2) the presence in a sample of two different nucleotide sequences at a particular genetic locus where the subject is heterozygous at that locus (e.g., by having inherited different allelic forms of the gene from each parent). Given the devastating consequences of human diseases such as, for example, Alzheimer's disease and type 2 diabetes mellitus, clearly there is a need to develop improved compositions and methods for identifying the presence of, or risk for having, such diseases in individuals. In addition, there is a clear need in the art to develop more sensitive and reliable compositions and methods for use in forensics and in other applications requiring the identification of individuals and/or their genetic relationships to others. The present invention addresses these needs by providing compositions and methods that employ mtDNA as a source of genetic markers for diagnostic, prognostic, pharmacogenetic, evolutionary, forensic and/or genealogic analyses, and offers other related advantages.

SUMMARY OF THE INVENTION

[0012] The present invention is directed in part to compositions and methods that relate to identification of mitochondrial DNA polymorphisms. Accordingly, it is an aspect of the invention to provide a method for determining the mitochondrial haplogroup of a subject, comprising determining, in a biological sample comprising mitochondrial DNA from a subject, the presence or absence of at least one mitochondrial single nucleotide polymorphism that is associated with a mitochondrial haplogroup. In certain embodiments the mitochondrial DNA is amplified. In certain other embodiments, at least one mitochondrial single nucleotide polymorphism that is associated with a mitochondrial haplogroup is present in a mitochondrial DNA region that is a D-loop, a mitochondrial rRNA gene, a mitochondrial NADH dehydrogenase gene, a mitochondrial tRNA gene, a mitochondrial cytochrome c oxidase gene, a mitochondrial ATP synthase gene or a mitochondrial cytochrome b gene.

[0013] In certain other embodiments at least one mitochondrial single nucleotide polymorphism that is associated with a mitochondrial haplogroup is a haplogroup-specific polymorphism or a haplogroup-associated polymorphism, wherein a mitochondrial single nucleotide polymorphism that is haplogroup-specific is, for haplogroup A, B, C, D, E, H, I, J, K, L1, L2, L3, T, U, V, W or X, located at a nucleotide that corresponds to a nucleotide position of SEQ ID NO:1 that is selected from:

TABLE 1

Nucleotide
Position    Gene        Haplogroup
64          D-LOOP      A
235         D-LOOP      A
663         12S rRNA    A
1598        12S rRNA    A
1736        16S rRNA    A
3316        ND1         A
4248        ND1         A
4824        ND2         A
4970        ND2         A
6308        COI         A
7112        COI         A
7724        COII        A
8794        ATPase6     A
11314       ND4         A
12468       ND5         A
12811       ND5         A
13855       ND5         A
14364       ND6         A
16111       D-LOOP      A
16290       D-LOOP      A
16319       D-LOOP      A
827         12S rRNA    B
3547        ND1         B
4820        ND2         B
4977        ND2         B
6023        COI         B
6216        COI         B
6413        COI         B
6473        COI         B
6755        COI         B
7241        COI         B
9950        COIII       B
11177       ND4         B
15535       CYT B       B
16217       D-LOOP      B
249         D-LOOP      C
289         D-LOOP      C
290         D-LOOP      C
3552        ND1         C
4715        ND2         C
7196        COI         C
7694        COII        C
8584        ATPase6     C
9545        COII        C
10454       tRNA-R      C
12642       ND5         C
12978       ND5         C
14318       ND6         C
15487       CYT B       C
15930       tRNA-T      C
2092        16S rRNA    D
4883        ND2         D
5178        ND2         D
8414        ATPase8     D
11593       ND4         D
14668       ND6         D
3027        16S rRNA    E
3705        ND1         E
4491        ND2         E
7598        COII        E
239         D-LOOP      H
456         D-LOOP      H
477         D-LOOP      H
951         12S rRNA    H
961         12S rRNA    H
3277        tRNA-L      H
3333        ND1         H
3591        ND1         H
3796        ND1         H
3915        ND1         H
3992        ND1         H
4024        ND1         H
4310        tRNA-I      H
4336        tRNA-Q      H
4531        ND2         H
4727        ND2         H
4745        ND2         H
4772        ND2         H
4793        ND2         H
5004        ND2         H
6365        COI         H
6776        COI         H
6869        COI         H
7013        COI         H
8269        COII        H
8448        ATPase8     H
8602        ATPase6     H
8803        ATPase6     H
8839        ATPase6     H
8843        ATPase6     H
8898        ATPase6     H
9123        ATPase6     H
9150        ATPase6     H
9380        COIII       H
9804        COIII       H
10044       tRNA-G      H
11353       ND4         H
11560       ND4         H
12579       ND5         H
13404       ND5         H
13680       ND5         H
13759       ND5         H
14125       ND5         H
14350       ND6         H
14365       ND6         H
14470       ND6         H
14582       ND6         H
14872       CYT B       H
15466       CYT B       H
15789       CYT B       H
15808       CYT B       H
15833       CYT B       H
16162       D-LOOP      H
16293       D-LOOP      H
199         D-LOOP      I
250         D-LOOP      I
3447        ND1         I
3990        ND1         I
4529        ND2         I
6734        COI         I
8616        ATPase6     I
9947        COIII       I
10034       tRNA-G      I
10238       ND3         I
11065       ND4         I
12501       ND5         I
13780       ND5         I
15758       CYT B       I
16391       D-LOOP      I
228         D-LOOP      J
295         D-LOOP      J
462         D-LOOP      J
2158        16S rRNA    J
2387        16S rRNA    J
3394        ND1         J
5198        ND2         J
5633        tRNA-A      J
6464        COI         J
6554        COI         J
6671        COI         J
7476        tRNA-S      J
7711        COII        J
10084       ND3         J
10172       ND3         J
10192       ND3         J
10499       ND4L        J
10598       ND4L        J
10685       ND4L        J
11377       ND4         J
12127       ND4         J
12570       ND5         J
12612       ND5         J
13281       ND5         J
13681       ND5         J
13879       ND5         J
13933       ND5         J
14569       ND6         J
15679       CYT B       J
15812       CYT B       J
16069       D-LOOP      J
16092       D-LOOP      J
16261       D-LOOP      J
114         D-LOOP      K
497         D-LOOP      K
593         tRNA-F      K
1189        12S rRNA    K
2217        16S rRNA    K
2483        16S rRNA    K
3480        ND1         K
4295        tRNA-I      K
4561        ND2         K
5814        tRNA-C      K
6260        COI         K
9006        ATPase6     K
9055        ATPase6     K
9698        COIII       K
9716        COIII       K
9962        COIII       K
10289       ND3         K
10550       ND4L        K
10978       ND4         K
11299       ND4         K
11470       ND4         K
11485       ND4         K
11840       ND4         K
11869       ND4         K
11923       ND4         K
12954       ND5         K
13135       ND5         K
13740       ND5         K
13967       ND5         K
14002       ND5         K
14037       ND5         K
14040       ND5         K
14167       ND6         K
15884       CYT B       K
15946       tRNA-T      K
16224       D-LOOP      K
16234       D-LOOP      K
16463       D-LOOP      K
185         D-LOOP      L1
186         D-LOOP      L1
189         D-LOOP      L1
236         D-LOOP      L1
247         D-LOOP      L1
297         D-LOOP      L1
357         D-LOOP      L1
710         12S rRNA    L1
825         12S rRNA    L1
1048        12S rRNA    L1
1738        16S rRNA    L1
2245        16S rRNA    L1
2395        16S rRNA    L1
2758        16S rRNA    L1
2768        16S rRNA    L1
2885        16S rRNA    L1
3308        ND1         L1
3516        ND1         L1
3666        ND1         L1
3693        ND1         L1
3777        ND1         L1
3796        ND1         L1
3843        ND1         L1
4312        tRNA-I      L1
4586        ND2         L1
5036        ND2         L1
5393        ND2         L1
5442        ND2         L1
5603        tRNA-A      L1
5655        tRNA-A      L1
5913        COI         L1
5951        COI         L1
6071        COI         L1
6150        COI         L1
6185        COI         L1
6253        COI         L1
6548        COI         L1
6827        COI         L1
6989        COI         L1
7055        COI         L1
7076        COI         L1
7146        COI         L1
7337        COI         L1
7389        COI         L1
7867        COII        L1
8248        COII        L1
8428        ATPase8     L1
8655        ATPase6     L1
8784        ATPase6     L1
8877        ATPase6     L1
9042        ATPase6     L1
9072        ATPase6     L1
9347        COIII       L1
9755        COIII       L1
9818        COIII       L1
10321       ND3         L1
10586       ND4L        L1
10589       ND4L        L1
10664       ND4L        L1
10688       ND4L        L1
10792       ND4         L1
10793       ND4         L1
10810       ND4         L1
11176       ND4         L1
11641       ND4         L1
11654       ND4         L1
11899       ND4         L1
12007       ND4         L1
12049       ND4         L1
12519       ND5         L1
12720       ND5         L1
12810       ND5         L1
13149       ND5         L1
13276       ND5         L1
13485       ND5         L1
13506       ND5         L1
13789       ND5         L1
13880       ND5         L1
13980       ND5         L1
14000       ND5         L1
14148       ND5         L1
14178       ND6         L1
14203       ND6         L1
14308       ND6         L1
14560       ND6         L1
14769       CYT B       L1
14911       CYT B       L1
15115       CYT B       L1
15136       CYT B       L1
16148       D-LOOP      L1
16187       D-LOOP      L1
16188       D-LOOP      L1
16230       D-LOOP      L1
16264       D-LOOP      L1
16265       D-LOOP      L1
16360       D-LOOP      L1
16527       D-LOOP      L1
143         D-LOOP      L2
1442        12S rRNA    L2
1706        16S rRNA    L2
2332        16S rRNA    L2
2358        16S rRNA    L2
2416        16S rRNA    L2
2789        16S rRNA    L2
3495        ND1         L2
3918        ND1         L2
4158        ND1         L2
4185        ND1         L2
4370        tRNA-Q      L2
4767        ND2         L2
5027        ND2         L2
5285        ND2         L2
5331        ND2         L2
5581        tRNA-W      L2
5744        NON-CODING  L2
6713        COI         L2
7175        COI         L2
7274        COI         L2
7624        COII        L2
7771        COII        L2
8080        COII        L2
8206        COII        L2
8387        ATPase8     L2
8541        ATPase8     L2
8790        ATPase6     L2
8925        ATPase6     L2
9221        COIII       L2
10115       ND3         L2
11944       ND4         L2
12236       tRNA-S      L2
12630       ND5         L2
12693       ND5         L2
12948       ND5         L2
13803       ND5         L2
14059       ND5         L2
14544       ND6         L2
14566       ND6         L2
14599       ND6         L2
15110       CYT B       L2
15217       CYT B       L2
15229       CYT B       L2
15236       CYT B       L2
15244       CYT B       L2
15391       CYT B       L2
15629       CYT B       L2
15945       tRNA-T      L2
16114       D-LOOP      L2
16213       D-LOOP      L2
16309       D-LOOP      L2
16390       D-LOOP      L2
200         D-LOOP      L3
2000        16S rRNA    L3
3438        ND1         L3
3450        ND1         L3
5773        tRNA-C      L3
6524        COI         L3
6587        COI         L3
6680        COI         L3
7424        COI         L3
7618        COII        L3
8616        ATPase6     L3
8618        ATPase6     L3
8650        ATPase6     L3
9554        COIII       L3
10086       ND3         L3
10373       ND3         L3
10667       ND4L        L3
10819       ND4         L3
11800       ND4         L3
13101       ND5         L3
13886       ND5         L3
13914       ND5         L3
14152       ND6         L3
14212       ND6         L3
14284       ND6         L3
15099       CYT B       L3
15311       CYT B       L3
15670       CYT B       L3
15824       CYT B       L3
15942       tRNA-T      L3
15944       tRNA-T      L3
16124       D-LOOP      L3
16327       D-LOOP      L3
930         12S rRNA    T
2141        16S rRNA    T
2850        16S rRNA    T
4688        ND2         T
4917        ND2         T
5277        ND2         T
6489        COI         T
7022        COI         T
8572        ATPase6     T
8697        ATPase6     T
9117        ATPase6     T
9899        COIII       T
10463       tRNA-R      T
11242       ND4         T
11812       ND4         T
12633       ND5         T
13368       ND5         T
13758       ND5         T
13965       ND5         T
14233       ND6         T
14687       tRNA-E      T
14905       CYT B       T
15028       CYT B       T
15274       CYT B       T
15607       CYT B       T
15928       tRNA-T      T
16163       D-LOOP      T
16182       D-LOOP      T
16186       D-LOOP      T
16294       D-LOOP      T
16296       D-LOOP      T
16324       D-LOOP      T
988         12S rRNA    U
1700        16S rRNA    U
1721        16S rRNA    U
2294        16S rRNA    U
3116        16S rRNA    U
3197        16S rRNA    U
3348        ND1         U
3720        ND1         U
3849        ND1         U
4553        ND2         U
4646        ND2         U
4703        ND2         U
4732        ND2         U
4811        ND2         U
5319        ND2         U
5390        ND2         U
5495        ND2         U
5656        NON-CODING  U
5999        COI         U
6045        COI         U
6047        COI         U
6146        COI         U
6518        COI         U
6629        COI         U
6719        COI         U
7109        COI         U
7385        COI         U
7768        COII        U
7805        COII        U
8473        ATPase8     U
9070        ATPase6     U
9266        COIII       U
9477        COIII       U
9667        COIII       U
10506       ND4L        U
10876       ND4         U
10907       ND4         U
10927       ND4         U
11197       ND4         U
11332       ND4         U
11732       ND4         U
12346       ND5         U
12557       ND5         U
12618       ND5         U
13020       ND5         U
13617       ND5         U
13637       ND5         U
14139       ND5         U
14179       ND6         U
14182       ND6         U
14620       ND6         U
14793       CYT B       U
14866       CYT B       U
15191       CYT B       U
15218       CYT B       U
15454       CYT B       U
15693       CYT B       U
15907       tRNA-T      U
16051       D-LOOP      U
16256       D-LOOP      U
16270       D-LOOP      U
16399       D-LOOP      U
4580        ND2         V
15904       tRNA-T      V
194         D-LOOP      W
1243        12S rRNA    W
1406        12S rRNA    W
3505        ND1         W
6528        COI         W
8994        ATPase6     W
10097       ND3         W
11674       ND4         W
11947       ND4         W
12414       ND5         W
13263       ND5         W
15775       CYT B       W
15884       CYT B       W
16292       D-LOOP      W
225         D-LOOP      X
226         D-LOOP      X
6371        COI         X
8393        ATPase8     X
8705        ATPase6     X
14470       ND6         X
15927       tRNA-T      X
−73         D-LOOP      not present in [H]
−7028       COI         not present in [H]
−11698      ND4         not present in [H]
and 14766,  CYT B       not present in [H]

[0014] and wherein a mitochondrial single nucleotide polymorphism that is haplogroup-associated is, for haplogroup A, B, C, D, E, H, I, J, K, L1, L2, L3, T, U, W or X, located at a nucleotide that corresponds to a nucleotide position of SEQ ID NO:1 that is selected from:

TABLE 2

Nucleotide
Position    Gene        Haplogroup
16362       D-LOOP      A, C, D, E, H, L3
12705       ND5         A, C, D, E, I, L1, L2, L3, W, X
1888        16S rRNA    A, C, T
13708       ND5         A, J, X
8027        COII        A, L1
153         D-LOOP      A, X
207         D-LOOP      B, I, L2, W
13590       ND5         B, L2
9449        COIII       B, L3
499         D-LOOP      B, U
16325       D-LOOP      C, D
10400       ND3         C, D, E
14783       CYT B       C, D, E
10398       ND3         C, D, E, I, J, K, L1, L2, L3
15043       CYT B       C, D, E, I, T
489         D-LOOP      C, D, E, J
8701        ATPase6     C, D, E, L1, L2, L3
9540        COIII       C, D, E, L1, L2, L3
10873       ND4         C, D, E, L1, L2, L3
15301       CYT B       C, D, E, L2, L3
11914       ND4         C, K, L1, L2
16298       D-LOOP      C, V
3010        16S rRNA    D, H, J, U
16311       D-LOOP      H, K, L1, L3
93          D-LOOP      H, L1
16304       D-LOOP      H, T
16291       D-LOOP      H, U
16356       D-LOOP      H, U
16482       D-LOOP      H, U
15924       tRNA-T      I, K, U
10915       ND4         I, L1
204         D-LOOP      I, L2, W
8251        COII        I, W
1719        16S rRNA    I, X
14798       CYT B       J, K
15257       CYT B       J, K
5460        ND2         J, L1, W
185         D-LOOP      J, L3
11002       ND4         J, L3
4216        ND1         J, T
11251       ND4         J, T
15452       CYT B       J, T
16126       D-LOOP      J, T
9548        COIII       J, U
13934       ND5         J, U
5231        ND2         K, L1
709         12S rRNA    K, L1, T, W
1811        16S rRNA    K, U
11467       ND4         K, U
12308       tRNA-L      K, U
12372       ND5         K, U
182         D-LOOP      L1, L2
198         D-LOOP      L1, L2
769         12S rRNA    L1, L2
1018        12S rRNA    L1, L2
3594        ND1         L1, L2
4104        ND1         L1, L2
7256        COI         L1, L2
7521        tRNA-D      L1, L2
13650       ND5         L1, L2
16278       D-LOOP      L1, L2, L3, X
189         D-LOOP      L1, L3
2352        16S rRNA    L1, L3
13105       ND5         L1, L3
5046        ND2         L1, W
6152        COI         L2, U
15784       CYT B       L2, U, W
5147        ND2         L3, T
6221        COI         L3, X
5426        ND2         T, U
13734       ND5         T, U
and 13966.  ND5         T, X

[0015] In certain other embodiments the mitochondrial single nucleotide polymorphism that is associated with a mitochondrial haplogroup is a haplogroup-specific polymorphism as just described, that is, a polymorphism present in members of only one haplogroup.
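
By way of illustration only, the lookup implied by the haplogroup-specific positions listed above can be sketched in Python. The dictionary holds a small excerpt of those positions rather than the full list. The function name `candidate_haplogroups`, the input format (a set of positions at which a sample differs from SEQ ID NO:1) and the hit-counting rule are assumptions made for this sketch and are not part of the described method.

```python
from collections import Counter

# Excerpt of the haplogroup-specific positions listed above
# (nucleotide position in SEQ ID NO:1 -> haplogroup). Not the full list.
HAPLOGROUP_SPECIFIC = {
    663: "A", 4824: "A", 8794: "A",
    827: "B", 4820: "B",
    4883: "D", 5178: "D",
    4580: "V", 15904: "V",
    1243: "W", 8994: "W",
    6371: "X", 8393: "X",
}


def candidate_haplogroups(variant_positions):
    """Count, per haplogroup, how many listed markers the sample carries.

    Assumption of this sketch: the sample is represented as a set of
    positions at which a polymorphism was detected.
    """
    hits = Counter(
        HAPLOGROUP_SPECIFIC[p]
        for p in variant_positions
        if p in HAPLOGROUP_SPECIFIC
    )
    # Haplogroups ordered from most to fewest marker hits.
    return hits.most_common()


sample = {263, 663, 4824, 8794, 16290}  # hypothetical sample
print(candidate_haplogroups(sample))    # [('A', 3)]
```

Ranking by hit count is only one possible design. Under the "at least one" criterion stated above, a single haplogroup-specific hit is already informative.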

[0016] According to another embodiment of the invention there is provided a method for determining the mitochondrial haplogroup subgroup of a subject, comprising determining, in a biological sample comprising mitochondrial DNA from a subject, the presence or absence of at least one mitochondrial single nucleotide polymorphism that is associated with a mitochondrial haplogroup subgroup. In certain other embodiments the mitochondrial haplogroup is haplogroup K, U, J, T, W, I, H, V, X, L1, L2 or L3. In certain other embodiments at least one mitochondrial single nucleotide polymorphism that is associated with a mitochondrial haplogroup subgroup is a mitochondrial single nucleotide polymorphism located at a nucleotide that corresponds to a nucleotide position of SEQ ID NO:1 that is selected from the group consisting of position 3010, 16162, 16189, 16304, 1811, 3197, 9477, 14793, 16256, 13617, 16270, 7768, 14182, 3480, 9055, 9698, 10550, 11299, 14167, 14798, 16224, 16311, 1189, 10398, 497, 11470, 11914, 15924, 3010, 10398, 12612, 13798, 16069, 295, 489, 228, 462, 16193, 709, 1888, 4917, 8697, 10463, 13368, 14905, 15607, 15928, 16189, 16294, 5426, 6489, 11812, 15043, 16298, 12633, 16163 and 16186.

[0017] In another embodiment the invention provides a method for determining a mitochondrial haplogroup subgroup of a subject, comprising determining, in a biological sample comprising mitochondrial DNA from a subject of known mitochondrial haplogroup selected from haplogroups K, U, X, I, J, T, L1, L2 and L3, the presence or absence of a set comprising a plurality of single nucleotide polymorphisms wherein each polymorphism is located at a nucleotide that corresponds to a nucleotide position of SEQ ID NO:1, the set selected from the group consisting of:
a first haplogroup K subgroup comprising a polymorphism at positions 1147, 12308, 12372, 1811, 3480, 9055, 9698, 10550, 11299, 14167, 14798, 16224, 16311 and 709;
a second haplogroup K subgroup comprising a polymorphism at positions 1147, 12308, 12372, 1811, 3480, 9055, 9698, 10550, 11299, 14167, 14798, 16224, 16311, 1189 and 10398;
a third haplogroup K subgroup comprising a polymorphism at positions 1147, 12308, 12372, 1811, 3480, 9055, 9698, 10550, 11299, 14167, 14798, 16224, 16311, 1189, 10398 and 497;
a fourth haplogroup K subgroup comprising a polymorphism at positions 1147, 12308, 12372, 1811, 3480, 9055, 9698, 10550, 11299, 14167, 14798, 16224, 16311, 1189, 10398, 497 and 11914;
a fifth haplogroup K subgroup comprising a polymorphism at positions 1147, 12308, 12372, 1811, 3480, 9055, 9698, 10550, 11299, 14167, 14798, 16224, 16311, 1189, 10398, 497, 11914, 11470 and 15924;
a sixth haplogroup K subgroup comprising a polymorphism at positions 1147, 12308, 12372, 1811, 3480, 9055, 9698, 10550, 11299, 14167, 14798, 16224, 16311, 1189, 10398, 497, 11914, 11470, 15924, 12978 and 12954;
a seventh haplogroup K subgroup comprising a polymorphism at positions 1147, 12308, 12372, 1811, 3480, 9055, 9698, 10550, 11299, 14167, 14798, 16224, 16311, 1189, 10398, 497, 11914, 11470, 15924, 12978, 12954 and 114;
a first haplogroup U subgroup comprising a polymorphism at positions 11467, 12308, 12372 and 1811;
a second haplogroup U subgroup comprising a polymorphism at positions 11467, 12308, 12372, 3197, 9477, 13617 and 16270;
a third haplogroup U subgroup comprising a polymorphism at positions 11467, 12308, 12372, 3197, 9477, 13617, 16270, 7768 and 14182;
a fourth haplogroup U subgroup comprising a polymorphism at positions 11467, 12308, 12372, 3197, 9477, 13617, 16270, 14793 and 16256;
a fifth haplogroup U subgroup comprising a polymorphism at positions 11467, 12308, 12372, 3197, 9477, 13617, 16270, 14793, 16256 and 15218;
a first haplogroup X subgroup comprising a polymorphism at positions 12705, 16223, 1719, 6221, 6371, 13966, 14470, 16278 and 225;
a second haplogroup X subgroup comprising a polymorphism at positions 12705, 16223, 1719, 6221, 6371, 13966, 14470, 16278, 225 and 226;
a first haplogroup I subgroup comprising a polymorphism at positions 12705, 16223, 1719, 10238, 10398, 12501, 13780 and 15043;
a second haplogroup I subgroup comprising a polymorphism at positions 12705, 16223, 1719, 10238, 10398, 12501, 13780, 15043, 250, 4529, 10034, 15924 and 16391;
a first haplogroup J subgroup comprising a polymorphism at positions 4216, 11251, 15452, 3010, 10398, 12612, 13708, 16069 and 16126;
a second haplogroup J subgroup comprising a polymorphism at positions 4216, 11251, 15452, 3010, 10398, 12612, 13708, 16069, 16126, 295 and 489;
a third haplogroup J subgroup comprising a polymorphism at positions 4216, 11251, 15452, 3010, 10398, 12612, 13708, 16069, 16126, 295, 489 and 15257;
a fifth haplogroup J subgroup comprising a polymorphism at positions 4216, 11251, 15452, 3010, 10398, 12612, 13708, 16069, 16126, 295, 489 and 462;
a sixth haplogroup J subgroup comprising a polymorphism at positions 4216, 11251, 15452, 3010, 10398, 12612, 13708, 16069, 16126, 295, 489, 462 and 228;
a first haplogroup T subgroup comprising a polymorphism at positions 709, 1888, 4917, 8697, 10463, 13368, 14905, 15607, 15928, 16126, 16294 and 12633;
a second haplogroup T subgroup comprising a polymorphism at positions 709, 1888, 4917, 8697, 10463, 13368, 14905, 15607, 15928, 16126, 16294, 12633, 16163 and 16186;
a third haplogroup T subgroup comprising a polymorphism at positions 709, 1888, 4917, 8697, 10463, 13368, 14905, 15607, 15928, 16126, 16294, 11812, 14233 and 16296;
a fourth haplogroup T subgroup comprising a polymorphism at positions 709, 1888, 4917, 8697, 10463, 13368, 14905, 15607, 15928, 16126, 16294, 11812, 14233, 16296, 930, 5147 and 16304;
a fifth haplogroup T subgroup comprising a polymorphism at positions 709, 1888, 4917, 8697, 10463, 13368, 14905, 15607, 15928, 16126, 16294, 11812, 14233, 16296, 5426, 6489 and 15043;
a first haplogroup L1 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 769, 1018, 3594, 4104, 7256, 7521, 13650, 247, 825, 2758, 2885, 2666, 7055, 7146, 7389, 8468, 8655, 10688, 10810, 13105, 13506, 13789, 14178, 14560, 16187 and 16311;
a second haplogroup L1 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 769, 1018, 3594, 4104, 7256, 7521, 13650, 247, 825, 2758, 2885, 2666, 7055, 7146, 7389, 8468, 8655, 10688, 10810, 13105, 13506, 13789, 14178, 14560, 16187, 16311 and 182;
a third haplogroup L1 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 769, 1018, 3594, 4104, 7256, 7521, 13650, 247, 825, 2758, 2885, 2666, 7055, 7146, 7389, 8468, 8655, 10688, 10810, 13105, 13506, 13789, 14178, 14560, 16187, 16311, 182, 8027 and 16294;
a fourth haplogroup L1 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 769, 1018, 3594, 4104, 7256, 7521, 13650, 247, 825, 2758, 2885, 2666, 7055, 7146, 7389, 8468, 8655, 10688, 10810, 13105, 13506, 13789, 14178, 14560, 16187, 16311, 182, 357, 709, 710, 1738, 2352, 2768, 3308, 3693, 5036, 5393, 5655, 6548, 6827, 6989, 7867, 8248, 12519, 13880, 14203, 15115, 16126 and 16264;
a fifth haplogroup L1 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 769, 1018, 3594, 4104, 7256, 7521, 13650, 247, 825, 2758, 2885, 2666, 7055, 7146, 7389, 8468, 8655, 10688, 10810, 13105, 13506, 13789, 14178, 14560, 16187, 16311, 182, 357, 709, 710, 1738, 2352, 2768, 3308, 3693, 5036, 5393, 5655, 6548, 6827, 6989, 7867, 8248, 12519, 13880, 14203, 15115, 16126, 16264 and 14769;
a first haplogroup L2 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 769, 1018, 3594, 4104, 7256, 7521, 13650, 2416, 8206, 9221, 10115, 11944, 13590, 15301, 16278 and 16390;
a second haplogroup L2 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 769, 1018, 3594, 4104, 7256, 7521, 13650, 2416, 8206, 9221, 10115, 11944, 13590, 15301, 16278, 16390 and 182;
a third haplogroup L2 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 769, 1018, 3594, 4104, 7256, 7521, 13650, 2416, 8206, 9221, 10115, 11944, 13590, 15301, 16278, 16390, 2789, 7175, 7274, 7771, 11914, 12693, 13803, 14566, 15784 and 16294;
a fourth haplogroup L2 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 769, 1018, 3594, 4104, 7256, 7521, 13650, 2416, 8206, 9221, 10115, 11944, 13590, 15301, 16278, 16390, 2789, 7175, 7274, 7771, 11914, 12693, 13803, 14566, 15784, 16294 and 16309;
a fifth haplogroup L2 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 769, 1018, 3594, 4104, 7256, 7521, 13650, 2416, 8206, 9221, 10115, 11944, 13590, 15301, 16278, 16390, 2789, 7175, 7274, 7771, 11914, 12693, 13803, 14566, 15784, 16294, 16309, 3918, 5285, 15244 and 15629;
a first haplogroup L3 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223 and 15301;
a second haplogroup L3 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 15301 and 13105;
a third haplogroup L3 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 15301, 13105, 3450, 5773, 6221, 9449, 10089, 10373, 13914, 15311, 15824, 15944, 16124, 16278 and 16362;
a fourth haplogroup L3 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 15301, 489, 10400, 14783 and 15043;
a fifth haplogroup L3 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 15301, 489, 10400, 14783, 15043, 2092, 3010, 4883, 5178, 6578, 14668 and 16325;
a sixth haplogroup L3 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 15301, 489, 10400, 14783, 15043 and 16362;
a seventh haplogroup L3 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 15301, 489, 10400, 14783, 15043, 4715, 7196, 8584 and 16298;
an eighth haplogroup L3 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 15301, 489, 10400, 14783, 15043, 4715, 7196, 8584, 16298, 249, 3552, 9545, 11914, 13263, 14318, 15487, 16325 and 16327;
a ninth haplogroup L3 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 15301, 489, 10400, 14783, 15043, 4715, 7196, 8584, 16298, 249, 3552, 9545, 11914, 13263, 14318, 15487, 16325, 16327, 289 and 290; and
a tenth haplogroup L3 subgroup comprising a polymorphism at positions 8701, 9540, 10398, 10873, 12705, 16223, 15301, 489, 10400, 14783, 15043, 4715, 7196, 8584, 16298, 249, 3552, 9545, 11914, 13263, 14318, 15487, 16325, 16327, 289, 290 and 15930.
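
Because many of the subgroup position sets recited above are nested, one way to assign a subgroup can be sketched as a subset test in Python. The haplogroup U position sets are copied from the paragraph above. The labels such as "U fourth", the function name `most_specific_subgroup` and the tie-breaking rule (prefer the largest fully matched set) are assumptions of this sketch, not features of the described method.

```python
# Position sets for the haplogroup U subgroups recited above.
U_SHARED_A = {11467, 12308, 12372}
U_SHARED_B = U_SHARED_A | {3197, 9477, 13617, 16270}

U_SUBGROUPS = {
    "U first": U_SHARED_A | {1811},
    "U second": U_SHARED_B,
    "U third": U_SHARED_B | {7768, 14182},
    "U fourth": U_SHARED_B | {14793, 16256},
    "U fifth": U_SHARED_B | {14793, 16256, 15218},
}


def most_specific_subgroup(observed, subgroups):
    """Return the label of the largest position set fully present in `observed`.

    Returns None when no subgroup's set is completely matched.
    """
    matched = [
        (len(positions), label)
        for label, positions in subgroups.items()
        if positions <= observed  # every position of the set was detected
    ]
    if not matched:
        return None
    return max(matched)[1]


# Hypothetical sample carrying the "fourth" set plus an unrelated position.
observed = U_SHARED_B | {14793, 16256, 73}
print(most_specific_subgroup(observed, U_SUBGROUPS))  # U fourth
```

Requiring every position of a set to be present is a strict reading chosen for simplicity; a practical assay would likely need to tolerate missing or uncalled positions.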

[0018] In another embodiment the invention provides a method for determining the genetic relationship between two subjects, comprising determining, in each of a first biological sample comprising mitochondrial DNA from a first subject and a second biological sample comprising mitochondrial DNA from a second subject, the presence or absence of at least one mitochondrial single nucleotide polymorphism, wherein either (i) the presence of at least one mitochondrial single nucleotide polymorphism in both of said first and second biological samples, or (ii) the absence of at least one mitochondrial single nucleotide polymorphism from both of said first and second biological samples, indicates a genetic relationship between the subjects, and therefrom determining the genetic relationship between the subjects. In certain embodiments at least one mitochondrial single nucleotide polymorphism is associated with a mitochondrial haplogroup that is haplogroup A, B, C, D, E, H, I, J, K, L1, L2, L3, T, U, V, W or X. In certain further embodiments at least one mitochondrial single nucleotide polymorphism is a haplogroup-specific polymorphism as described above.

[0019] The invention also provides, in other embodiments, a method for determining the genetic relationship between (i) an unknown source or biological subject from which an unidentified sample is obtained, and (ii) a known source or biological subject from which an identified sample is obtained, comprising determining the presence or absence of at least one mitochondrial single nucleotide polymorphism, in each of a first biological sample derived from an unknown subject or biological source and a second biological sample derived from a known subject or biological source, wherein said first and second biological samples each comprise mitochondrial DNA, wherein either (i) the presence of at least one mitochondrial single nucleotide polymorphism in both of said first and second biological samples, or (ii) the absence of at least one mitochondrial single nucleotide polymorphism from both of said first and second biological samples, indicates a genetic relationship between the subjects, and therefrom determining the genetic relationship between the biological samples.
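The comparison recited in the foregoing methods may be sketched, purely as an illustration, as a tally over a panel of nucleotide positions. The sketch assumes each sample has already been reduced to the set of positions at which a polymorphism was detected; the function name compare_samples and the choice of panel are hypothetical.

```python
def compare_samples(panel, first_sample, second_sample):
    """Classify each panel position by presence or absence in two samples.

    panel          -- positions examined in both samples
    first_sample   -- positions at which the first sample is polymorphic
    second_sample  -- positions at which the second sample is polymorphic
    """
    both_present = sorted(p for p in panel if p in first_sample and p in second_sample)
    both_absent = sorted(p for p in panel if p not in first_sample and p not in second_sample)
    discordant = sorted(p for p in panel if (p in first_sample) != (p in second_sample))
    return {
        "both_present": both_present,
        "both_absent": both_absent,
        "discordant": discordant,
    }
```

For example, with the panel {10398, 12705, 16223}, a first sample polymorphic at 10398 and 16223 and a second sample polymorphic at 10398 only, position 10398 is present in both, 12705 is absent from both, and 16223 is discordant. How many concordant positions suffice to indicate a relationship is not fixed by this sketch.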

[0020] Turning to another embodiment of the present invention, there is provided a method of determining the presence of or the risk for having a disease associated with a mitochondrial DNA single nucleotide polymorphism, comprising (a) identifying at least one haplogroup-associated mitochondrial DNA single nucleotide polymorphism in a biological sample comprising mitochondrial DNA from a subject suspected of having or being at risk for having a disease associated with a mitochondrial DNA single nucleotide polymorphism; and (b) identifying in said sample at least one disease associated mitochondrial DNA single nucleotide polymorphism that is not a haplogroup-associated mitochondrial DNA single nucleotide polymorphism, and therefrom determining the presence or risk of disease. In certain other embodiments the disease associated mitochondrial DNA single nucleotide polymorphism that is not a haplogroup-associated mitochondrial DNA single nucleotide polymorphism is an Alzheimer's disease-associated polymorphism, and in certain other embodiments the disease associated mitochondrial DNA single nucleotide polymorphism that is not a haplogroup-associated mitochondrial DNA single nucleotide polymorphism is a type 2 diabetes-associated polymorphism.
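Steps (a) and (b) of the foregoing method may be sketched, under stated assumptions, as a partition of the polymorphisms found in a sample. The sketch assumes the haplogroup-associated and disease-associated position sets are supplied by the user; the function name partition_polymorphisms is hypothetical, and the position sets in the usage note are placeholders chosen only to show the set logic.

```python
def partition_polymorphisms(sample, haplogroup_associated, disease_associated):
    """Split a sample's polymorphic positions per steps (a) and (b).

    Returns a tuple (haplogroup_hits, disease_hits) where
    haplogroup_hits -- positions in the sample that are haplogroup-associated
    disease_hits    -- positions in the sample that are disease-associated
                       and are NOT haplogroup-associated
    """
    sample = set(sample)
    haplogroup_hits = sample & set(haplogroup_associated)
    disease_hits = (sample - set(haplogroup_associated)) & set(disease_associated)
    return haplogroup_hits, disease_hits
```

As a usage illustration, a sample polymorphic at {8701, 9540, 15043, 4336}, evaluated against a haplogroup-associated set {8701, 9540, 15043} and a disease-associated set {4336, 15043}, yields haplogroup hits {8701, 9540, 15043} and a single disease hit {4336}; position 15043 is excluded from the disease hits because it is also haplogroup-associated.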

[0021] These and other aspects of the present invention will become apparent upon reference to the following detailed description and attached drawings. All references disclosed herein are hereby incorporated by reference in their entireties as if each was incorporated individually.

BRIEF DESCRIPTION OF THE DRAWINGS

[0022] FIG. 1 provides a phylogenetic tree of European mtDNA haplotypes.

[0023] FIG. 2 provides a phylogenetic tree of African mtDNA haplotypes.

[0024] FIGS. 3-6 show reduced median networks of the specified mtDNA haplogroups. For reduced median network analysis, see, e.g., Bandelt et al., 1995 Genetics 141:743-753.

[0025] FIG. 3 shows a reduced median network of European mtDNA haplogroups.

[0026] FIG. 4 shows a reduced median network of European H and V mtDNA haplogroups.

[0027] FIG. 5 shows a reduced median network of African mtDNA haplogroups.

[0028] FIG. 6 shows a reduced median network of Asian mtDNA haplogroups.

DETAILED DESCRIPTION OF THE INVENTION

[0029] The present invention provides improved compositions and methods for identifying individuals, subpopulations and populations by determination of mtDNA haplogroup, genealogic, forensic, and related genetic relationships. As described herein, surprising diversity in mtDNA sequences permits expanded definition of mitochondrial polymorphism and redefinition of mtDNA haplogroups and subgroups at a level of refinement not previously recognized. The invention thus exploits the high mutation rate of mitochondrial DNA (mtDNA) to identify individuals, subpopulations and/or populations on the basis of specific mutations associated with particular characteristics such as race, genealogy and/or the presence of, or risk for having, certain diseases. In addition, mtDNA may be used to identify specific individuals. The present invention is directed generally to compositions and methods for identifying mtDNA mutations and thereby diagnosing the risk for having, or presence of, a disease. The invention also permits determination of other characteristics such as genealogy, population, race or ethnic group. In addition, the methods of the present invention are directed to identifying genetic and familial relationships between subjects or biological sources of mitochondrial DNA samples for a variety of purposes including, for instance, maternity testing, forensic studies, genetic counseling and genealogical analysis, and the like.

[0030] Biological samples may be provided by obtaining a blood sample, biopsy specimen, tissue explant, organ culture or any other tissue or cell preparation from a subject or a biological source, and in most preferred embodiments of the invention the biological sample comprises mtDNA. The subject or biological source may be a human or another biological organism, including a genetically engineered organism, such as a non-human animal, a plant, a unicellular organism or a multicellular organism or mitochondria prepared therefrom. The subject or biological source may also be a primary cell culture or culture adapted cell line including but not limited to genetically engineered cell lines that may contain chromosomally integrated or episomal recombinant nucleic acid sequences, immortalized or immortalizable cell lines, somatic cell hybrid or cytoplasmic hybrid “cybrid” cell lines (see, e.g., U.S. Pat. No. 5,888,498), differentiated or differentiatable cell lines, transformed cell lines and the like. A biological sample may, for example, be derived from a recombinant cell line or from a transgenic animal.

[0031] In certain embodiments of the invention, a subject or biological source may be infected with a microorganism such as a DNA virus, a retrovirus, a mycoplasma or a bacterium. In particular embodiments of the invention, for instance, those that relate to forensic sciences, a subject or biological source may provide material comprising mitochondrial DNA that is found at a crime scene or that may be otherwise associated with a person (including, for example, a criminal suspect), place or thing with which a suspect may have come into contact, for use as evidence. In certain related embodiments a biological sample may be derived from an unknown source or biological subject to provide an unidentified sample, which may then be characterized using the compositions and methods described herein. In certain other related embodiments such characterization may be used to determine a mitochondrial genetic relationship between the unknown source or biological subject and one or more of a particular species, a mitochondrial haplogroup, a mitochondrial haplogroup subgroup or a known source or biological subject having at least one mitochondrial single nucleotide polymorphism as provided herein, for instance, to identify the biological subject and/or to determine a genetic relationship between the subject and another individual, population or subpopulation (e.g., a haplogroup, subgroup or family).

[0032] In certain embodiments of the invention, the subject or biological source may be suspected of having or being at risk of having a disease associated with altered mitochondrial function (e.g., Alzheimer's disease, type 2 diabetes mellitus), and in certain embodiments of the invention, the subject or biological source may be known to be free of a risk for, or presence of, such a disease, or the risk or presence of a disease may not be known. In other embodiments of the invention, the subject or source may be suspected of being involved in an illegal activity. A subject or sample may be suspected of being genetically related to a specific individual or of belonging to a certain genealogical lineage, race or ethnic group. In other embodiments, the subject or source may not be suspected of being involved in an illegal activity or of being genetically related to a specific individual or of belonging to a certain genealogical lineage, race or ethnic group.

[0033] For example, and according to non-limiting theory, in certain embodiments it may be desirable to use as a subject or biological source a control individual selected according to criteria that will be apparent to a person having ordinary skill in the art based on one or more variables which may require normalization, typically an age- and/or sex-matched individual, a healthy individual or an individual appropriate as a control for a subject suspected of having or being at risk for a particular disease. It may also be desirable to use as a subject or biological source a control individual for comparison purposes for maternity or forensic tests. Those having ordinary skill in the art are thus familiar with the design and selection of appropriate controls for different particular purposes. For instance, in certain embodiments it may be desirable to identify such a control individual who is believed to be free of a particular disease, and in certain other embodiments, a control individual may share a mitochondrial genetic relationship to a subject suspected of having a particular disease, such as the mother or sibling of the subject.

[0034] According to the present invention there is provided the unexpected discovery that determination of unprecedented polymorphism in mitochondrial DNA permits refinement of the assignment of individuals to particular mitochondrial haplogroups and haplogroup subgroups as provided herein. Additionally, the present invention exploits the surprising discovery that in many cases, the haplogroup and/or haplogroup subgroup to which an individual belongs may not be determinative of a presence of a disease or of a risk for having a disease. Rather, as provided by the present disclosure, the improved ability to identify mitochondrial single nucleotide polymorphisms that are associated with particular haplogroups and/or haplogroup subgroups as described herein further permits identification of additional mitochondrial single nucleotide polymorphisms.

[0035] As described in greater detail herein, such additional polymorphisms, which are not definitive for a particular haplogroup and/or haplogroup subgroup, are useful correlates for other purposes, for example, in the identification of unique individuals (e.g., in forensics) and/or for determination of an individual's disease predisposition. Thus, in certain embodiments the present invention provides an improved system for distinguishing individuals belonging to the same mitochondrial haplogroup on the basis of particular mitochondrial DNA polymorphisms described herein. For instance, individuals of a common maternal lineage would fall into a common haplogroup according to standard mitochondrial genotyping methodologies without there being a basis for further differentiation, while the present invention provides one or more mitochondrial single nucleotide polymorphisms that are unique to each individual within a haplogroup (or haplogroup subgroup) thereby permitting such differentiation. These and other embodiments are also described in greater detail below.

[0036] In certain preferred embodiments it may be desirable to determine whether the subject, patient or biological source falls within clinical parameters indicative of a disease such as, but not limited to, Alzheimer's disease (AD) or type 2 diabetes mellitus (type 2 DM). Signs and symptoms of AD accepted by those skilled in the art may be used to so designate a subject or biological source, for example, clinical signs referred to in McKhann et al. (Neurology 34:939, 1984, National Institute of Neurological and Communicative Disorders and Stroke and Alzheimer's Disease and Related Disorders Association Criteria of Probable AD, NINCDS-ADRDA) and references cited therein, or other means known in the art for diagnosing AD. Similarly, signs and symptoms of type 2 diabetes mellitus accepted by those skilled in the art may be used to so designate a subject or biological source, for example, clinical signs referred to in Gavin et al. (Diabetes Care 22 (Suppl. 1):S5-S19, 1999, American Diabetes Association Expert Committee on the Diagnosis and Classification of Diabetes Mellitus) and references cited therein, or other means known in the art for diagnosis of type 2 diabetes mellitus. Also, by way of illustration and not limitation, accepted criteria such as particular clinical signs and symptoms (or ranges or combinations thereof) will be known to those familiar with the art for any of the other diseases associated with altered mitochondrial function as provided herein. (See, e.g., Chinnery et al., 1999 J. Med. Genet. 36:425.)

[0037] For example, according to certain embodiments the present invention provides a method for determining the risk for having, or presence of, a malignant condition in a subject. A malignant condition in a subject, as used herein, refers to the presence of dysplastic, cancerous and/or transformed cells in the subject, including, for example, neoplastic, tumor, non-contact inhibited or oncogenically transformed cells, or the like. By way of illustration and not limitation, in the context of the present invention a malignant condition may refer further to the presence in a subject of cancer cells such as colorectal cancer, lung cancer, bladder cancer, or head and neck tumors, as have been described in the context of somatic mtDNA sequence variations distinct from the mitochondrial single nucleotide polymorphisms described herein (cf. Fliss et al., 2000 Science 287:2017; Polyak et al., 1998 Nat. Genet. 20:291).

[0038] In other preferred embodiments, it may be useful to combine the present invention with other procedures for identifying an individual, including methods of analyzing either mtDNA or genomic DNA (e.g., extramitochondrial DNA, i.e., nuclear chromosomal or episomal DNA) such as, for example, RFLP analysis, allele specific oligonucleotide analysis or any other technique for DNA analysis known in the art, including those described above.

[0039] As noted above, a biological sample for use according to the most highly preferred embodiments of the present invention contains mtDNA as provided herein, and may comprise any source of mitochondrial DNA, including any tissue or cell preparation in which mitochondrially derived nucleic acids (e.g., mtDNA) are present. In addition, a source or biological sample comprising mitochondrial DNA may include a source of mtDNA wherein cells or tissues are not present. Biological samples may therefore contain live cells, or dead cells or no cells. Compositions and methods useful for obtaining and detecting mtDNA are provided, for example, in U.S. Pat. Nos. 5,565,323 and 5,840,493. Biological samples may thus be provided by obtaining a sample of blood, hair, scalp, skin or other epithelial cells, bone, saliva, mucous or other secretion, semen, or other forensic sample, biopsy specimen, tissue explant, organ culture or any other tissue, cell preparation or non-cell preparation from a subject or a biological source as provided herein. Thus, for example, in certain specific embodiments, biological samples of the invention may include mtDNA isolated at a crime scene or from another source of forensic evidence. In certain other embodiments, biological samples may include mtDNA isolated from archaeological sites or from human or animal remains.

[0040] Any mtDNA sequence or portion of a mutated mtDNA (e.g., mtDNA that contains at least one single nucleotide polymorphism as provided herein, including mtDNA that contains a plurality of such single nucleotide polymorphisms) sequence that corresponds to the human mtDNA sequence disclosed by Anderson et al. (SEQ ID NO:1, 1981 Nature 290:457; see also Marzuki et al., 1991 Human Genet. 88:139) and revised according to Andrews et al. (1999 Nature Genetics 23:147), or a portion thereof or several portions thereof, may be useful in these embodiments of the invention. Examples of human mtDNA point mutations derived from specific mtDNA sequence regions that are useful in certain embodiments of the invention are disclosed, according to the nucleotide positions at which wildtype and mutant mtDNA differ, in Tables 1 and 2. Those familiar with the art will recognize the established convention for naming regions of the circular mtDNA genome according to the D-loop and the several mtDNA gene loci, including the mitochondrial rRNA genes, the mitochondrial tRNA genes, the mitochondrial NADH dehydrogenase genes, the mitochondrial cytochrome c oxidase genes, the mitochondrial ATP synthase genes and the mitochondrial cytochrome b gene, and the corresponding nucleotide position numbers of SEQ ID NO:1 that are spanned by each of these regions (see, e.g., Scheffler, Mitochondria, 1999 John Wiley & Sons, pages 48-140, and references cited therein; see also “Mitomap” at http://www.gen.emory.edu/mitomap.html). Thus, for example, Table 1 shows mitochondrial single nucleotide polymorphisms that include polymorphisms which have been correlated with Alzheimer's disease, as disclosed in co-pending U.S. patent application Ser. No. 09/551,941 which is hereby incorporated by reference, and Table 2 shows mitochondrial single nucleotide polymorphisms that include polymorphisms which have been correlated with type 2 diabetes in certain mitochondrial haplogroups, as disclosed in the co-pending U.S. Patent application No. 60/333,448, hereby incorporated by reference. Full-length mtDNA sequences from 560 human subjects are disclosed herein in the Sequence Listing and are set forth at SEQ ID NOS:2-561.

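TABLE 1

MTDNA SINGLE NUCLEOTIDE POLYMORPHISMS ASSOCIATED WITH AD

GENE | NUCL. POSITION | NUCLEOTIDE (CRS) | NUCLEOTIDE SUBSTITUTION | # SAMPLES | HAPLOGROUP
D-LOOP | 72 | T | C | 1 | H
D-LOOP | 114 | C | T | 1 | K
D-LOOP | 146 | T | C | 2 | U, H
D-LOOP | 185 | G | A | 1 | J
D-LOOP | 189 | A | G | 1 | K, W
D-LOOP | 199 | T | C | 1 | I
D-LOOP | 204 | T | C | 1 | W
D-LOOP | 207 | G | A | 2 | W, I
D-LOOP | 228 | G | A | 1 | J
D-LOOP | 236 | T | C | 1 | H
D-LOOP | 239 | T | C | 2 | H
D-LOOP | 456 | C | T | 1 | H
D-LOOP | 462 | C | T | 2 | J
D-LOOP | 482 | T | C | 1 | J
D-LOOP | 489 | T | C | 2 | J
D-LOOP | 497 | C | T | 1 | K, K
D-LOOP | 500 | C | G | 5 | H, W, J
D-LOOP | 516 | C | T | 1 | U
D-LOOP | 522 | C | DEL | 1 | H
D-LOOP | 523 | A | DEL | 1 | H
D-LOOP | 547 | A | T | 1 | I
12S RRNA | 593 | T | C | 1 | K
12S RRNA | 669 | T | C | 1 | I
12S RRNA | 960 | C | DEL | 1 | U
12S RRNA | 1007 | G | A | 1 | J
12S RRNA | 1243 | T | C | 1 | W
12S RRNA | 1393 | G | A | 1 | H
16S RRNA | 1719 | G | A | 1 | H, I
16S RRNA | 1809 | T | C | 1 | U
16S RRNA | 2352 | T | C | 1 | H
16S RRNA | 2483 | T | C | 1 | K
16S RRNA | 2702 | G | A | 1 | I
16S RRNA | 2851 | A | G | 1 | H
16S RRNA | 3197 | T | C | 1 | U
ND1 | 3333 | C | T | 1 | H
ND1 | 3336 | T | C | 1 | I
ND1 | 3348 | A | G | 1 | U
ND1 | 3394 | T | C | 1 | J
ND1 | 3398 | T | C | 1 | I
ND1 | 3423 | G | T | 1 | J
ND1 | 3505 | A | G | 1 | W
ND1 | 3559 | C | T | 1 | H
ND1 | 3915 | G | A | 2 | H
ND1 | 3992 | C | T | 1 | H
ND1 | 4024 | A | G | 1 | H
ND1 | 4095 | C | T | 1 | H
ND1 | 4216 | T | C | 3 | T, J
TRNA-Q | 4336 | T | C | 1 | H
ND2 | 4529 | A | T | 1 | I
ND2 | 4727 | A | G | 2 | H
ND2 | 4793 | A | G | 1 | H
ND2 | 4917 | A | G | 1 | T
ND2 | 4991 | G | A | 1 | H
ND2 | 5004 | T | C | 2 | H, W
ND2 | 5046 | G | A | 1 | W
ND2 | 5228 | C | G | 1 | H
ND2 | 5315 | A | G | 1 | I
ND2 | 5418 | T | G | 1 | J
ND2 | 5426 | T | C | 1 | T
ND2 | 5460 | G | A | 3 | H, W
ND2 | 5461 | C | G | 1 | J
TRNA-W | 5516 | A | G | 1 | H
TRNA-W | 5554 | C | A | 1 | U
TRNA-A | 5634 | A | G | 1 | H
TRNA-A/TRNA-N | 5656 | A | G | 1 | U
TRNA-C | 5773 | G | A | 1 | J
CO1 | 6182 | G | A | 1 | U
CO1 | 6221 | T | C | 1 | H
CO1 | 6341 | C | T | 1 | U
CO1 | 6367 | T | C | 1 | K
CO1 | 6371 | C | T | 1 | H
CO1 | 6489 | C | A | 1 | T
CO1 | 7184 | A | G | 1 | J
CO1 | 7325 | A | G | 1 | H
CO2 | 7621 | T | C | 1 | K
CO2 | 7768 | A | G | 1 | U
CO2 | 7787 | C | T | 1 | H
CO2 | 7789 | G | A | 1 | J
CO2 | 7864 | C | T | 1 | W
CO2 | 7895 | G | A | 1 | U
CO2 | 7963 | A | G | 1 | J
CO2 | 8149 | A | G | 1 | H
CO2 | 8251 | G | A | 2 | W, I
CO2 | 8269 | G | A | 1 | H
CO2/TRNA-K | 8276-8284 |  | DEL | 1 | T
ATPASE 8 | 8470 | A | G | 1 | H
ATPASE 8 | 8485 | G | A | 1 | I
ATPASE 8 | 8508 | A | G | 1 | I
ATPASE 6 | 8602 | T | C | 1 | H
ATPASE 6 | 8697 | G | A | 1 | T
ATPASE 6 | 8752 | A | G | 1 | H
ATPASE 6 | 8901 | A | G | 1 | I
ATPASE 6 | 8994 | G | A | 1 | W
ATPASE 6 | 9123 | G | A | 1 | H
CO3 | 9254 | A | G | 1 | H
CO3 | 9362 | A | G | 1 | H
CO3 | 9380 | G | A | 2 | H
CO3 | 9477 | G | A | 1 | U
CO3 | 9554 | G | A | 1 | H
CO3 | 9708 | T | C | 1 | H
CO3 | 9804 | G | A | 1 | H
CO3 | 9861 | T | C | 1 | H
TRNA-G | 10034 | T | C | 1 | I
TRNA-G | 10044 | A | G | 1 | H
ND3 | 10238 | T | C | 2 | I
TRNA-R | 10463 | T | C | 2 | T, J
ND4L | 10589 | G | A | 1 | H
ND4 | 10978 | A | G | 1 | K
ND4 | 11065 | A | G | 1 | I
ND4 | 11251 | A | G | 1 | J
ND4 | 11253 | T | C | 1 | H
ND4 | 11272 | A | G | 1 | U
ND4 | 11470 | A | G | 2 | K
ND4 | 11527 | C | T | 1 | J
ND4 | 11611 | G | A | 1 | H
ND4 | 11674 | C | T | 1 | W
ND4 | 11812 | A | G | 1 | T
ND4 | 11914 | G | A | 2 | K
ND4 | 11947 | A | G | 1 | W
ND5 | 12414 | T | C | 1 | W
ND5 | 12501 | G | A | 2 | I
ND5 | 12609 | T | C | 1 | U
ND5 | 12705 | C | T | 4 | H, W, I
ND5 | 12954 | T | C | 1 | K
ND5 | 13111 | T | C | 1 | H
ND5 | 13194 | G | A | 2 | H, U
ND5 | 13212 | C | T | 1 | H
ND5 | 13368 | G | A | 1 | T
ND5 | 13617 | T | C | 1 | U
ND5 | 13780 | A | G | 2 | I
ND5 | 13966 | A | G | 1 | H
ND5 | 14020 | T | C | 1 | T
ND5 | 14148 | A | G | 1 | W
ND6 | 14178 | T | C | 1 | I
ND6 | 14179 | A | G | 1 | U
ND6 | 14182 | T | C | 1 | U
ND6 | 14212 | T | C | 1 | H
ND6 | 14233 | A | G | 1 | T
ND6 | 14470 | T | C | 1 | H
ND6 | 14582 | A | G | 1 | H
CYT.B | 14905 | G | A | 1 | T
CYT.B | 15028 | C | A | 1 | T
CYT.B | 15043 | G | A | 3 | T, I
CYT.B | 15191 | T | C | 1 | U
CYT.B | 15299 | T | C | 1 | I
CYT.B | 15380 | A | G | 1 | U
CYT.B | 15553 | G | A | 1 | H
CYT.B | 15607 | A | G | 1 | T
CYT.B | 15758 | A | G | 1 | I
CYT.B | 15790 | C | T | 1 | U
CYT.B | 15808 | A | G | 1 | H
CYT.B | 15833 | C | T | 1 | H
CYT.B | 15884 | G | C | 1 | W
TRNA-T | 15924 | A | G | 3 | K, I
TRNA-T | 15928 | G | A | 1 | T
D-LOOP | 16069 | C | T | 2 | J
D-LOOP | 16086 | T | C | 1 | I
D-LOOP | 16093 | T | C | 1 | K
D-LOOP | 16126 | T | C | 3 | T, J
D-LOOP | 16129 | G | A | 2 | H, I
D-LOOP | 16145 | G | A | 1 | I
D-LOOP | 16147 | C | A | 1 | I
D-LOOP | 16172 | T | C | 1 | U
D-LOOP | 16174 | C | T | 1 | U
D-LOOP | 16182 | A | C | 1 | T
D-LOOP | 16183 | A | C | 4 | T, U, H
D-LOOP | 16189 | T | C | 5 | T, U, H
D-LOOP | 16192 | C | T | 1 | U
D-LOOP | 16193 | C | T | 1 | J
D-LOOP | 16223 | C | T | 4 | H, W, I
D-LOOP | 16224 | T | C | 3 | K
D-LOOP | 16234 | C | T | 1 | K
D-LOOP | 16235 | A | G | 1 | J
D-LOOP | 16239 | C | T | 1 | H
D-LOOP | 16248 | C | T | 1 | I
D-LOOP | 16256 | C | T | 1 | J
D-LOOP | 16261 | C | T | 1 | H
D-LOOP | 16270 | C | T | 1 | U
D-LOOP | 16278 | C | T | 2 | U, H
D-LOOP | 16290 | C | T | 1 | H
D-LOOP | 16292 | C | T | 1 | W
D-LOOP | 16293 | A | G | 1 | H
D-LOOP | 16294 | C | T | 1 | T
D-LOOP | 16298 | T | C | 2 | T, H
D-LOOP | 16300 | A | G | 1 | J
D-LOOP | 16304 | T | C | 1 | H
D-LOOP | 16309 | A | G | 1 | J
D-LOOP | 16311 | T | C | 5 | H, K, U
D-LOOP | 16320 | C | T | 1 | I
D-LOOP | 16355 | C | T | 1 | I
D-LOOP | 16362 | T | C | 2 | H
D-LOOP | 16391 | G | A | 1 | I
D-LOOP | 16482 | A | G | 2 | H
D-LOOP | 16524 | A | G | 1 | K
D-LOOP | 45 | C | INS | 1 | U
D-LOOP | 140 | C | T | 1 | H
D-LOOP | 188 | A | G | 2 | J
D-LOOP | 291 | A | INS | 1 | U
D-LOOP | 455 | T | DEL | 1 | U
D-LOOP | 500 | C | G | 2 | H
D-LOOP | 513 | G | DEL | 1 | H
D-LOOP | 514 | C | DEL | 1 | H
D-LOOP | 516 | C | T | 1 | U
D-LOOP | 527 | C | G | 2 | K, T
D-LOOP | 533 | A | G | 1 | I
D-LOOP | 568 | CCC | INS | 1 | I
D-LOOP | 710 | G | A | 1 | H
12S RRNA | 749 | A | G | 1 | K
12S RRNA | 960 | C | DEL | 1 | U
12S RRNA | 1508 | C | T | 1 | H
16S RRNA | 1809 | T | C | 1 | U
16S RRNA | 2352 | T | C | 1 | H
16S RRNA | 2833 | A | G | 1 | U
ND1 | 3559 | C | T | 1 | X
ND1 | 3745 | G | A | 1 | U
ND2 | 5237 | G | A | 1 | K
ND2 | 5348 | C | T | 1 | H
TRNA-W | 5516 | A | G | 1 | H
TRNA-W | 5554 | C | A | 1 | U
TRNA-C | 5773 | G | A | 1 | H
CO1 | 5979 | G | A | 1 | H
CO1 | 6182 | G | A | 1 | U
CO1 | 6272 | A | G | 1 | H
CO1 | 6320 | T | C | 1 | T
CO1 | 6341 | C | T | 1 | U
CO1 | 6480 | G | A | 2 | H, T
CO1 | 6498 | C | G | 1 | U
CO1 | 6722 | G | A | 1 | B
CO1 | 6845 | C | T | 1 | K
CO1 | 6911 | T | C | 1 | U
CO2 | 7787 | C | T | 1 | X
CO2 | 7830 | G | A | 1 | H
CO2 | 7853 | G | A | 1 | T
CO2 | 7927 | C | G | 1 | H
CO2 | 7978 | C | G | 1 | H
CO2 | 7985 | C | G | 2 | H, U
CO2 | 8149 | A | G | 1 | H
ATPASE 6 | 8865 | G | A | 1 | U
ATPASE 6 | 9098 | T | C | 1 | B
CO3 | 9129 | C | T | 1 | H
CO3 | 9708 | T | C | 1 | H
CO3 | 9861 | T | C | 1 | H
ND3 | 10154 | A | G | 1 | K
ND3 | 10394 | C | T | 1 | H
TRNA-R | 10448 | T | C | 1 | H
ND4L | 10724 | T | C | 1 | U
ND4 | 11272 | A | G | 1 | U
ND4 | 11590 | A | G | 1 | H
ND4 | 11824 | A | G | 1 | H
ND4 | 11893 | A | G | 1 | H
TRNA-S | 12217 | A | G | 1 | H
ND5 | 12471 | T | C | 1 | H
ND5 | 12609 | T | C | 1 | U
ND5 | 12738 | T | G | 1 | K
ND5 | 13194 | G | A | 1 | U
ND5 | 13212 | C | T | 1 | X
ND5 | 13351 | C | T | 1 | U
ND5 | 13746 | C | T | 1 | H
ND5 | 13889 | G | A | 1 | H
ND5 | 13943 | C | T | 1 | K
ND5 | 14133 | A | G | 1 | H
ND6 | 14323 | G | A | 1 | H
ND6 | 14512 | T | C | 1 | U
TRNA-E | 14684 | C | T | 1 | U
CYB | 14869 | G | A | 1 | H
CYB | 14978 | A | G | 1 | H
CYB | 15295 | C | T | 1 | T
CYB | 15380 | A | G | 1 | U
CYB | 15734 | G | A | 1 | U
CYB | 15790 | C | T | 1 | U
TRNA-T | 15947 | A | G | 1 | J
D-LOOP | 16136 | T | C | 1 | H
D-LOOP | 16168 | C | T | 1 | U
D-LOOP | 16174 | C | T | 1 | U
D-LOOP | 16209 | T | C | 1 | T
D-LOOP | 16219 | A | G | 1 | U
D-LOOP | 16239 | C | T | 1 | U
D-LOOP | 16260 | C | T | 1 | J
D-LOOP | 16263 | T | C | 1 | H
D-LOOP | 16295 | C | T | 1 | H
D-LOOP | 16343 | A | G | 1 | U
D-LOOP | 16524 | A | G | 1 | H
D-LOOP | 16526 | G | A | 1 | U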

[0041]

TABLE 2

MITOCHONDRIAL SNPS AND HAPLOGROUPS IN NIDDM

GENE | NUCL. POSITION | NUCLEOTIDE (CRS) | NUCLEOTIDE SUBSTITUTION | # SAMPLES | HAPLOGROUP
D-LOOP | 62 | G | C | 1 | B
D-LOOP | 62 | G | C | 1 | B
D-LOOP | 94 | G | A | 1 | A
D-LOOP | 94 | G | A | 1 | A
D-LOOP | 95 | A | C | 1 | L1
D-LOOP | 114 | C | G | 1 | B
D-LOOP | 151 | C | T | 1 | L1
D-LOOP | 159 | T | C | 1 | H
D-LOOP | 188 | A | G | 1 | J
D-LOOP | 198 | C | T | 1 | L2
D-LOOP | 215 | A | G | 1 | C
D-LOOP | 215 | A | G | 1 | C
D-LOOP | 257 | A | G | 1 | H
D-LOOP | 267 | T | C | 1 | D
D-LOOP | 271 | C | T | 1 | B
D-LOOP | 288 | A | G | 1 | A
D-LOOP | 317 | G | A | 1 | L1
D-LOOP | 317 | G | A | 1 | L1
D-LOOP | 320 | C | T | 1 | B
D-LOOP | 343 | C | T | 1 | B
D-LOOP | 362 | A | INS | 1 | A
D-LOOP | 418 | C | T | 1 | L2
D-LOOP | 453 | T | DEL | 1 | B
D-LOOP | 454 | T | DEL | 1 | B
D-LOOP | 466 | T | C | 1 | U
D-LOOP | 480 | T | C | 1 | H
D-LOOP | 481 | C | T | 1 | B
D-LOOP | 481 | C | T | 1 | B
D-LOOP | 493 | A | G | 1 | C
D-LOOP | 568 | C | INS | 1 | D
D-LOOP | 568 | C | INS | 1 | H
D-LOOP | 568 | C | INS | 1 | T
D-LOOP | 568 | CC | INS | 1 | K
D-LOOP | 568 | C | INS | 1 | C
TRNA-F | 629 | T | C | 1 | A
12S RRNA | 735 | A | G | 1 | H
12S RRNA | 921 | T | C | 1 | L3
12S RRNA | 956 | C | INS | 1 | B
12S RRNA | 956 | C | INS | 1 | A
12S RRNA | 956 | C | INS | 1 | B
12S RRNA | 960 | C | DEL | 1 | U
12S RRNA | 961 | T | C | 1 | A
12S RRNA | 961 | T | C | 1 | A
12S RRNA | 966 | CC | INS | 1 | A
12S RRNA | 966 | C | INS | 1 | A
12S RRNA | 1002 | C | T | 1 | B
12S RRNA | 1002 | C | T | 1 | B
12S RRNA | 1007 | G | C | 1 | L2
12S RRNA | 1420 | T | C | 1 | L1
12S RRNA | 1462 | G | A | 1 | H
16S RRNA | 1766 | T | C | 1 | L1
16S RRNA | 2145 | G | A | 1 | H
16S RRNA | 2157 | A | INS | 1 | L1
16S RRNA | 2735 | G | A | 1 | C
16S RRNA | 2755 | A | G | 1 | L1
16S RRNA | 2863 | T | C | 1 | L1
16S RRNA | 2903 | T | C | 1 | L3
16S RRNA | 3203 | A | G | 1 | L3
ND1 | 3308 | T | A | 1 | A
ND1 | 3309 | C | INS | 1 | A
ND1 | 3311 | C | T | 1 | A
ND1 | 3338 | T | C | 1 | L2
ND1 | 3338 | T | C | 1 | B
ND1 | 3513 | C | T | 1 | L1
ND1 | 3766 | T | C | 1 | B
ND1 | 3766 | T | C | 1 | B
ND1 | 3777 | T | C | 1 | L2
ND1 | 3927 | A | G | 1 | L1
ND1 | 3936 | C | A | 1 | L2
ND1 | 4012 | A | G | 1 | B
ND1 | 4129 | A | G | 1 | T
ND1 | 4167 | C | T | 1 | B
ND1 | 4167 | C | T | 1 | B
ND1 | 4232 | T | C | 1 | B
ND1 | 4242 | C | T | 1 | C
TRNA-Q | 4386 | T | C | 1 | H
TRNA-Q | 4392 | C | T | 1 | L3
TRNA-M | 4435 | A | G | 1 | D
ND2 | 4506 | A | G | 1 | L1
ND2 | 4655 | G | A | 1 | L2
ND2 | 4733 | T | C | 1 | L3
ND2 | 4908 | C | T | 1 | U
ND2 | 5156 | A | T | 1 | H
ND2 | 5165 | C | T | 1 | A
ND2 | 5183 | A | G | 1 | A
ND2 | 5237 | G | A | 1 | L1
ND2 | 5372 | A | G | 1 | L3
TRNA-W | 5463 | G | A | 1 | H
TRNA-C | 5824 | G | A | 1 | D
TRNA-Y/CO1 | 5899 | C | DEL | 1 | B
TRNA-Y/CO1 | 5900 | CC | INS | 1 | L1
CO1 | 6026 | G | A | 1 | L2
CO1 | 6164 | C | T | 1 | B
CO1 | 6164 | C | T | 1 | L2
CO1 | 6164 | C | T | 1 | B
CO1 | 6308 | C | T | 1 | A
CO1 | 6308 | C | T | 1 | A
CO1 | 6311 | C | T | 1 | L2
CO1 | 6340 | C | T | 1 | H
CO1 | 6378 | T | C | 1 | L1
CO1 | 6392 | T | C | 1 | T
CO1 | 6620 | T | C | 1 | A
CO1 | 6663 | A | G | 1 | L2
CO1 | 6710 | A | G | 1 | K
CO1 | 6722 | G | A | 1 | B
CO1 | 6932 | A | G | 1 | L3
CO1 | 6951 | G | A | 1 | D
CO1 | 6951 | G | A | 1 | K
CO1 | 7158 | A | G | 1 | B
CO1 | 7158 | A | G | 1 | B
CO1 | 7202 | A | G | 1 | L1
CO2 | 7621 | T | C | 1 | H
CO2 | 7692 | T | C | 1 | L1
CO2 | 7697 | G | A | 1 | C
CO2 | 7702 | G | A | 1 | L2
CO2 | 7853 | G | A | 1 | D
CO2 | 7978 | C | G | 1 | L3
CO2 | 7978 | C | G | 1 | L2
CO2 | 7985 | C | G | 1 | L3
CO2 | 7985 | C | G | 1 | L2
CO2 | 8032 | C | T | 1 | A
CO2 | 8087 | T | C | 1 | L1
CO2 | 8098 | A | G | 1 | K
CO2 | 8119 | T | C | 1 | L2
ATPASE 8 | 8419 | T | C | 1 | L1
ATPASE 8 | 8460 | A | G | 1 | B
ATPASE 8 | 8460 | A | G | 1 | B
ATPASE 8 | 8478 | C | T | 1 | H
ATPASE 8 | 8530 | A | G | 1 | L3
ATPASE 8 | 8557 | G | A | 1 | A
ATPASE 6 | 8574 | C | T | 1 | H
ATPASE 6 | 8604 | T | C | 1 | L2
ATPASE 6 | 8679 | A | G | 1 | H
ATPASE 6 | 8680 | C | T | 1 | D
ATPASE 6 | 8746 | T | G | 1 | A
ATPASE 6 | 8856 | G | C | 1 | K
ATPASE 6 | 8987 | T | C | 1 | L2
ATPASE 6 | 9017 | T | C | 1 | A
ATPASE 6 | 9052 | A | G | 1 | L1
ATPASE 6 | 9097 | A | G | 1 | B
ATPASE 6 | 9098 | T | C | 1 | B
ATPASE 6 | 9111 | T | C | 1 | L3
ATPASE 6 | 9145 | G | A | 1 | W
ATPASE 6 | 9192 | G | A | 1 | E
ATPASE 6 | 9198 | C | T | 1 | W
CO3 | 9254 | A | G | 1 | L3
CO3 | 9336 | A | G | 1 | L1
CO3 | 9557 | C | T | 1 | C
CO3 | 9647 | T | C | 1 | L1
CO3 | 9813 | T | A | 1 | K
CO3 | 9836 | T | C | 1 | A
CO3 | 9903 | T | C | 1 | L2
CO3 | 9932 | G | A | 1 | L3
CO3 | 9932 | G | A | 1 | B
TRNA-G | 9950 | T | C | 1 | B
TRNA-G | 10031 | T | C | 1 | A
ND3 | 10335 | C | T | 1 | B
ND4L | 10654 | C | T | 1 | H
ND4 | 10834 | C | T | 1 | E
ND4 | 10853 | C | T | 1 | C
ND4 | 10920 | C | T | 1 | L2
ND4 | 11239 | A | G | 1 | L3
ND4 | 11854 | T | C | 1 | K
ND4 | 11884 | A | G | 1 | B
ND4 | 11935 | T | C | 1 | W
ND4 | 12063 | C | T | 1 | U
ND4 | 12064 | C | T | 1 | U
ND4 | 12092 | C | T | 1 | C
TRNA-H | 12172 | A | G | 1 | L1
TRNA-H | 12172 | A | G | 1 | H
ND5 | 12453 | T | C | 1 | H
ND5 | 12454 | G | A | 1 | C
ND5 | 12477 | T | C | 1 | L1
ND5 | 12582 | A | G | 1 | U
ND5 | 12723 | A | G | 1 | K
ND5 | 12768 | A | G | 1 | L1
ND5 | 12870 | C | T | 1 | L3
ND5 | 13254 | T | C | 1 | E
ND5 | 13263 | A | G | 1 | W
ND5 | 13269 | A | G | 1 | L3
ND5 | 13473 | A | C | 1 | U
ND5 | 13494 | C | T | 1 | E
ND5 | 13542 | A | G | 1 | L3
ND5 | 13617 | T | C | 1 | U
ND5 | 13629 | A | G | 1 | L2
ND5 | 13707 | G | A | 1 | A
ND5 | 13711 | G | A | 1 | H
ND5 | 13749 | C | T | 1 | W
ND5 | 13781 | T | C | 1 | H
ND5 | 13924 | C | T | 1 | L2
ND5 | 14053 | A | G | 1 | L1
ND5 | 14088 | T | C | 1 | L1
ND6 | 14173 | T | C | 1 | L2
ND6 | 14251 | A | G | 1 | H
ND6 | 14280 | A | G | 1 | A
ND6 | 14388 | A | G | 1 | 
ND6 | 14460 | C | G | 1 | A
ND6 | 14548 | A | G | 1 | H
TRNA-E | 14693 | A | G | 1 | D
CYT B | 14769 | A | G | 1 | L1
CYT B | 14974 | C | G | 1 | B
CYT B | 15016 | C | T | 1 | L1
CYT B | 15043 | G | A | 1 | C
CYT B | 15107 | C | T | 1 | A
CYT B | 15266 | A | G | 1 | D
CYT B | 15317 | G | A | 1 | E
CYT B | 15386 | C | T | 1 | A
CYT B | 15451 | C | T | 1 | L2
CYT B | 15496 | A | G | 1 | L3
CYT B | 15508 | C | T | 1 | K
CYT B | 15511 | T | C | 1 | U
CYT B | 15589 | C | A | 1 | H
CYT B | 15616 | C | T | 1 | B
CYT B | 15616 | C | T | 1 | B
CYT B | 15766 | A | G | 1 | 
CYT B | 15784 | T | C | 1 | W
TRNA-T | 15946 | C | T | 1 | K
D-LOOP | 16086 | T | C | 1 | L2
D-LOOP | 16086 | T | C | 1 | L2
D-LOOP | 16093 | T | C | 1 | L3
D-LOOP | 16109 | A | C | 1 | C
D-LOOP | 16114 | C | T | 1 | H
D-LOOP | 16136 | T | C | 1 | B
D-LOOP | 16145 | G | A | 1 | L1
D-LOOP | 16147 | C | T | 1 | H
D-LOOP | 16153 | G | A | 1 | H
D-LOOP | 16163 | A | G | 1 | T
D-LOOP | 16166 | A | G | 1 | L3
D-LOOP | 16185 | C | T | 1 | L3
D-LOOP | 16185 | C | T | 1 | L3
D-LOOP | 16188 | C | T | 1 | C
D-LOOP | 16213 | G | A | 1 | L1
D-LOOP | 16235 | A | G | 1 | H
D-LOOP | 16242 | C | T | 1 | H
D-LOOP | 16245 | C | T | 1 | W
D-LOOP | 16249 | T | C | 1 | B
D-LOOP | 16259 | C | T | 1 | T
D-LOOP | 16274 | G | A | 1 | D
D-LOOP | 16274 | G | A | 1 | L1
D-LOOP | 16274 | G | A | 1 | T
D-LOOP | 16274 | G | A | 1 | H
D-LOOP | 16286 | C | G | 1 | L1
D-LOOP | 16295 | C | T | 1 | L2
D-LOOP | 16311 | T | C | 1 | C
D-LOOP | 16312 | A | G | 1 | B
D-LOOP | 16336 | G | A | 1 | A
D-LOOP | 16344 | C | T | 1 | B
D-LOOP | 16354 | C | T | 1 | D
D-LOOP | 16354 | C | T | 1 | H
D-LOOP | 16355 | C | T | 1 | L2
D-LOOP | 16357 | T | C | 1 | B
D-LOOP | 16357 | T | C | 1 | H
D-LOOP | 16468 | T | C | 1 | D
D-LOOP | 16468 | T | C | 1 | A
D-LOOP | 16483 | G | A | 1 | B
D-LOOP | 16512 | T | C | 1 | A
D-LOOP | 16554 | A | G | 1 | A

[0042] Portions of the mtDNA sequence of SEQ ID NO:1, and portions of a sample mtDNA sequence derived from a biological source or subject as provided herein, are regarded as “corresponding” nucleic acid sequences, regions, fragments or the like, based on the convention for numbering mtDNA nucleic acid positions according to SEQ ID NO:1 (Anderson et al., Nature 290:457, 1981), wherein a sample mtDNA sequence is aligned with the mtDNA sequence of SEQ ID NO:1 such that at least 70%, preferably at least 80% and more preferably at least 90% of the nucleotides in a given sequence of at least 20 consecutive nucleotides of a sequence are identical. For example, a portion of the mtDNA sequence in a biological sample containing mtDNA from a subject suspected of having or being at risk for having AD, or, as another example, a portion of the mtDNA sequence in mtDNA containing at least one mitochondrial single nucleotide polymorphism that is associated with Alzheimer's disease as provided herein (e.g., mutated mtDNA), may be aligned with a corresponding portion of the mtDNA sequence of SEQ ID NO:1 using any of a number of alignment procedures and/or tools with which those having ordinary skill in the art will be familiar (e.g., CLUSTAL W, Thompson et al., 1994 Nucl. Ac. Res. 22:4673; CAP, www.no.embnet.org/clustalw.html: FASTA/FASTP, Pearson, 1990 Proc. Nat. Acad. Sci. USA 85:2444, available from D. Hudson, Univ. of Virginia, Charlottesville, Va.). In certain preferred embodiments, a sample mtDNA sequence is greater than 95% identical to a corresponding mtDNA sequence of SEQ ID NO:1. In certain other preferred embodiments, a sample mtDNA sequence is identical to a corresponding mtDNA sequence of SEQ ID NO:1. Those oligonucleotide probes having sequences that are identical in corresponding regions of the mtDNA sequence of SEQ ID NO:1 and sample mtDNA may be identified and selected following hybridization target DNA sequence analysis, to verify the absence of mutations.
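The identity criterion recited above (at least 70% identical nucleotides over a sequence of at least 20 consecutive nucleotides) may be illustrated by the following minimal sketch. It assumes the sample and reference sequences have already been aligned, for example by one of the tools named above, and are indexed in the same coordinates; the function names window_identity and corresponds are hypothetical.

```python
def window_identity(reference, sample, start, length=20):
    """Fraction of identical nucleotides in one window of two pre-aligned sequences.

    start is a 0-based offset into both strings; length is the window size.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    ref_window = reference[start:start + length]
    sample_window = sample[start:start + length]
    if len(ref_window) < length or len(sample_window) < length:
        raise ValueError("window runs past the end of a sequence")
    matches = sum(r == s for r, s in zip(ref_window, sample_window))
    return matches / length


def corresponds(reference, sample, start, length=20, threshold=0.70):
    """True when the window meets the identity threshold (70% by default)."""
    return window_identity(reference, sample, start, length) >= threshold
```

Raising threshold to 0.80 or 0.90 models the preferred and more preferred levels of identity recited above; the sketch performs no alignment of its own.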

[0043] According to the present invention and as known in the art, the term “haplotype” refers to a particular combination of genetic markers in a defined region of the mitochondrial genome. Such genetic markers include, for example, RFLPs and SNPs. RFLPs (restriction fragment length polymorphisms) result from an alteration in a recognition site, often a palindrome, that is cleaved in a site-specific manner by a DNase known as a restriction enzyme. A SNP (single nucleotide polymorphism) is a change (e.g., a deletion, insertion or substitution) in any single nucleotide base in a region of a genome of interest. In particularly preferred embodiments provided by the instant disclosure, the genome of interest is the mitochondrial genome. Because SNPs vary from individual to individual, they are useful markers for studying the association of a genome. Moreover, because they occur more frequently than other markers such as RFLPs, analysis of SNPs should produce a “higher resolution” picture of disease, haplogroup and individual-associated genetic marker segregation (Weiss, (1998) Genome Res. 8:691-697; Gelbert and Gregg, (1997) Curr. Opin. Biotechnol. 8:669-674).
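As a non-limiting illustration of the three kinds of single nucleotide change named above (substitution, insertion and deletion), the following sketch lists the differences between a reference sequence and a sample sequence. It assumes the two sequences are already aligned to equal length with "-" marking gaps, and it numbers positions by the reference (1-based); the function name call_snps and the tuple layout are hypothetical.

```python
def call_snps(aligned_reference, aligned_sample):
    """List differences between two gap-aligned sequences.

    Each call is (reference_position, reference_base, sample_base, kind), where
    kind is "SUB", "INS" or "DEL". An insertion is reported at the position of
    the reference base that precedes it.
    """
    if len(aligned_reference) != len(aligned_sample):
        raise ValueError("aligned sequences must have equal length")
    calls = []
    ref_pos = 0
    for ref_base, sample_base in zip(aligned_reference, aligned_sample):
        if ref_base != "-":
            ref_pos += 1  # advance only on real reference bases
        if ref_base == sample_base:
            continue
        if ref_base == "-":
            calls.append((ref_pos, "-", sample_base, "INS"))
        elif sample_base == "-":
            calls.append((ref_pos, ref_base, "-", "DEL"))
        else:
            calls.append((ref_pos, ref_base, sample_base, "SUB"))
    return calls
```

A haplotype in the sense used above could then be represented as the collection of such calls within a defined region; that representation is an assumption of this sketch.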

[0044] The term “haplogroup” refers to a group of haplotypes found in association with one another. Several mitochondrial DNA haplotypes and haplogroups are known in the art, including nine European mtDNA haplogroups as well as discrete Asian, Native American and African mtDNA haplogroups, each identified on the basis of the presence or absence of one or more specific restriction endonuclease recognition sites (see, e.g., Wallace et al., 1999 Gene 238:211; Torroni et al., 1996 Genetics 144:1835). In addition to haplogroups that may be regarded as clusters of haplotypes, designation of individuals as belonging to various nodes or branches within such a cluster, for example, subgroupings, subclusters, subcategories or the like, may be referred to as assignment to a “haplogroup subgroup”, as described, for example, by Macaulay et al. (1999 Am J. Hum. Genet. 64:232-249). As shown in FIGS. 1 and 2, for example, and as provided by the unexpected discovery of the present invention, particular mitochondrial haplogroups (e.g., U, K, J, T, etc.) may be divided and further subdivided into haplogroup subgroups on the basis of polymorphisms detected at nucleotide positions having the indicated numbers corresponding to nucleotide positions in SEQ ID NO:1.

[0045] Nucleic acid sequences within the scope of the invention include isolated DNA and RNA sequences that specifically hybridize under conditions of moderate or high stringency to mtDNA nucleotide sequences, including mtDNA sequences disclosed herein or fragments thereof, and their complements. As used herein, conditions of moderate stringency, as known to those having ordinary skill in the art, and as defined by Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Vol. 1, pp. 1.101-104, Cold Spring Harbor Laboratory Press (1989), include, for example, the use as a prewashing solution for nitrocellulose filters on which proband nucleic acids have been immobilized of 5× SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0), hybridization conditions of 50% formamide, 6× SSC at 42° C. (or other similar hybridization solution), and washing conditions of about 50-60° C., 0.5× SSC, 0.1% SDS. Conditions of high stringency are defined as hybridization conditions as above, and with washing at 60-68° C., 0.2× SSC, 0.1% SDS. In other embodiments, hybridization to an mtDNA nucleotide sequence may be at normal stringency, which is approximately 25-30° C. below Tm of the native duplex (e.g., 5× SSPE, 0.5% SDS, 5× Denhardt's solution, 50% formamide, at 42° C. or equivalent conditions), at low stringency hybridizations, which utilize conditions approximately 40° C. below Tm, or at high stringency hybridizations, which utilize conditions approximately 10° C. below Tm. The skilled artisan will recognize that the temperature, salt concentration, and/or chaotrope composition of hybridization and wash solutions may be adjusted as necessary according to factors such as the length and nucleotide base composition of the probe. (See also, e.g., Ausubel et al., (1987) Current Protocols in Molecular Biology, Greene Publishing.) Thus, desired variations in stringency of hybridization conditions may be achieved by altering the time, temperature and/or concentration of the solutions used for pre-hybridization, hybridization and wash steps. Accordingly, it will be appreciated that suitably stringent conditions can be readily selected without undue experimentation where a desired selectivity of the probe is identified, based on its ability to hybridize to one or more certain proband sequences while not hybridizing to certain other proband sequences.
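
By way of illustration only, the relationship between Tm and the stringency levels discussed above may be sketched as follows. The sketch takes a previously determined Tm as input and assumes offsets of approximately 25-30° C. below Tm for normal stringency, approximately 40° C. below Tm for low stringency and approximately 10° C. below Tm for high stringency. The names STRINGENCY_OFFSETS_C and hybridization_window are hypothetical and are not part of any disclosed method.

```python
# Illustrative sketch only: offsets (degrees C below Tm) assumed for each
# stringency level, stored as (smallest offset, largest offset).
STRINGENCY_OFFSETS_C = {
    "high": (10.0, 10.0),
    "normal": (25.0, 30.0),
    "low": (40.0, 40.0),
}


def hybridization_window(tm_c, stringency):
    """Return (lowest, highest) hybridization temperature in degrees C.

    tm_c is the melting temperature of the native duplex, supplied by the
    caller; this sketch does not estimate Tm from sequence or salt.
    """
    try:
        smallest_offset, largest_offset = STRINGENCY_OFFSETS_C[stringency]
    except KeyError:
        raise ValueError(f"unknown stringency level: {stringency!r}")
    return (tm_c - largest_offset, tm_c - smallest_offset)


if __name__ == "__main__":
    for level in ("high", "normal", "low"):
        print(level, hybridization_window(72.0, level))
```

For a duplex with an assumed Tm of 72° C., the normal-stringency window computed this way spans 42-47° C., which is consistent with the 42° C. example condition recited above.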

[0046] An “isolated nucleic acid molecule” refers to a polynucleotide molecule in the form of a separate fragment or as a component of a larger nucleic acid construct, that has been separated from its source cell (including the chromosome it normally resides in) at least once, preferably in a substantially pure form. Isolated nucleic acids may be nucleic acids having particular disclosed nucleotide sequences or may be regions, portions or fragments thereof. Those having ordinary skill in the art are able to prepare isolated nucleic acids having the complete nucleotide sequence, or the sequence of any portion of a particular isolated nucleic acid molecule, when provided with the appropriate nucleic acid sequence information as disclosed herein. Nucleic acid molecules may be comprised of a wide variety of nucleotides, including DNA, RNA, nucleotide analogues such as phosphorothioates or peptide nucleic acids, or other analogues with which those skilled in the art will be familiar, or some combination of these.

[0047] The present invention, as described herein, provides mtDNA sequences and isolated mtDNA nucleic acid molecules. mtDNA may be isolated from cellular DNA according to well known methodologies, for example those described in U.S. Pat. No. 5,840,493, which is hereby incorporated by reference in its entirety. The Sequence Listing includes full-length mtDNA sequences from 560 different human subjects, as set forth at SEQ ID NOS:2-561.

[0048] Where it is advantageous to use oligonucleotide primers according to the present invention, such primers may be 10-60 nucleotides in length, preferably 15-35 nucleotides and still more preferably 18-30 nucleotides in length. Primers may be useful in the present invention for quantifying mtDNA mutations, including single nucleotide polymorphisms or homoplasmic mtDNA mutations provided herein, by any of a variety of techniques well known in the art for determining the amount of specific nucleic acid target sequences present in a sample based on specific hybridization of a primer to the target sequence. Optionally, in certain of these techniques, hybridization precedes nucleotide polymerase catalyzed extension of the primer using the strand containing the target sequence as a template, and/or ligation of oligonucleotides hybridized to adjacent target sequences, and embodiments of the invention using primer extension are particularly preferred.
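
As a non-limiting sketch of how the primer length ranges recited above might be applied, the following example classifies a candidate primer length and extracts a primer-length stretch of sequence ending immediately 5' of a polymorphic position. The placement of the primer directly adjacent to the site, the function names and the toy reference sequence are assumptions made for illustration; strand selection and the circularity of mtDNA are not addressed.

```python
def primer_length_tier(length):
    """Classify a primer length against the ranges recited in the text."""
    if 18 <= length <= 30:
        return "still more preferred"
    if 15 <= length <= 35:
        return "preferred"
    if 10 <= length <= 60:
        return "permitted"
    return "out of range"


def primer_upstream_of(reference, position, length):
    """Return the `length` bases ending immediately 5' of `position`.

    `position` is 1-based, matching the nucleotide position numbering
    used for the reference sequence. The returned string is the stretch
    of the given strand, not its reverse complement.
    """
    if primer_length_tier(length) == "out of range":
        raise ValueError("primer length must be 10-60 nucleotides")
    start = position - 1 - length  # 0-based index of the first primer base
    if start < 0:
        raise ValueError("not enough sequence upstream of the position")
    return reference[start:position - 1]


if __name__ == "__main__":
    toy_reference = "ACGT" * 6  # hypothetical 24-base sequence
    print(primer_length_tier(18))
    print(primer_upstream_of(toy_reference, 21, 18))
```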

[0049] For examples of references on such quantitative detection techniques, including those that may be used to detect nucleotide insertions, substitutions or deletions in a portion of an mtDNA sequence site near an oligonucleotide primer target hybridization site that corresponds to a portion of the wildtype mtDNA sequence as disclosed in Anderson et al. (1981 Nature 290:457, SEQ ID NO:1) or a mutated site such as may be created by any of the mtDNA point mutations disclosed herein, and further including those that involve primer extension, see U.S. Pat. No. 5,760,205 and the references cited therein, all of which are hereby incorporated by reference, and see also, for example, Botstein et al. (Am. J. Hum. Gen. 32:314, 1980), Gibbs et al. (Nucl. Ac. Res. 17:2437, 1989), Newton et al. (Nucl. Ac. Res. 17:2503, 1989), Grossman et al. (Nucl. Ac. Res. 22:4527, 1994), and Saiki et al. (Proc. Nat. Acad. Sci. 86:6230, 1989), all of which are hereby incorporated by reference. A particularly useful method for this purpose is the primer extension assay disclosed by Fahy et al. (Nucl. Acids Res. 25:3102, 1997) and by Ghosh et al. (Am. J. Hum. Genet. 58:325, 1996), both of which references are hereby incorporated in their entireties, as is Krook et al. (Hum. Molec. Genet. 1:391, 1995) which teaches modification of primer extension reactions to detect multiple nucleotide substitutions, insertions, deletions or other mutations. Other examples of useful techniques for quantifying the presence of specific nucleic acid target sequences in a sample include but need not be limited to labeled probe hybridization to the target nucleic acid sequences with or without first partially separating target nucleic acids from other nucleic acids present in the sample.

[0050] Examples of other useful techniques for determining the amount of specific nucleic acid target sequences present in a sample based on specific hybridization of a primer to the target sequence include specific amplification of target nucleic acid sequences and quantification of amplification products, including but not limited to polymerase chain reaction (PCR; Gibbs et al., (1989) Nucl. Ac. Res. 17:2437), transcriptional amplification systems, strand displacement amplification and self-sustained sequence replication (3SR; Ghosh et al, (1995) in Molecular Methods for Virus Detection, Academic Press, NY, pp. 287-314), the cited references for which are hereby incorporated in their entireties. Examples of other useful techniques include ligase chain reaction, single stranded conformational polymorphism analysis, Q-beta replicase assay, restriction fragment length polymorphism (RFLP; Botstein et al., (1980) Am. J. Hum. Gen. 32:314) analysis and cycled probe technology, as well as other suitable methods that will be known to those familiar with the art.

[0051] In a particularly preferred embodiment of the invention, primer extension is used to quantify mtDNA mutations present in a biological sample (Ghosh et al., (1996) Am. J. Hum. Genet. 58:325). This embodiment may offer certain advantages by permitting both wildtype and mutant mtDNA to be simultaneously quantified using a single oligonucleotide primer capable of hybridizing to a complementary nucleic acid target sequence that is present in a defined region of wildtype mtDNA and in a corresponding region of a mutated mtDNA sequence. Without wishing to be bound by theory, the use of a single primer for quantification of wildtype and mutated mtDNA is believed to avoid uncertainties associated with potential disparities in the relative hybridization properties of multiple primers and may offer other advantages. Where such a target sequence is situated adjacent to a mutated mtDNA nucleotide sequence position that is a nucleotide substitution, insertion or deletion relative to the corresponding wildtype mtDNA sequence position, primer extension assays may be designed such that oligonucleotide extension products of primers hybridizing to mutated mtDNA are of different lengths than oligonucleotide extension products of primers hybridizing to wildtype mtDNA. Accordingly, the amount of mutant mtDNA in a sample and the amount of wildtype mtDNA in the sample may be determined by quantification of distinct extension products that are separable on the basis of sequence length or molecular mass.
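
The quantification described above may be illustrated, without limitation, by the following sketch, which sums detector signal for extension products of the expected wildtype and mutant lengths and reports the mutant proportion. The sketch assumes that signal is proportional to the molar amount of each product; the function names, product lengths and signal values are hypothetical.

```python
from collections import defaultdict


def signal_by_length(peaks):
    """Sum signal for each product length from (length, signal) pairs."""
    totals = defaultdict(float)
    for length, signal in peaks:
        if signal < 0:
            raise ValueError("signal must be non-negative")
        totals[length] += signal
    return dict(totals)


def mutant_fraction(peaks, wildtype_length, mutant_length):
    """Fraction of mutant product among wildtype plus mutant products.

    Peaks at any other length (for example unextended primer) are ignored.
    """
    totals = signal_by_length(peaks)
    wildtype = totals.get(wildtype_length, 0.0)
    mutant = totals.get(mutant_length, 0.0)
    if wildtype + mutant == 0:
        raise ValueError("no signal at the expected product lengths")
    return mutant / (wildtype + mutant)


if __name__ == "__main__":
    # Hypothetical peaks: (product length in nucleotides, signal).
    example_peaks = [(21, 750.0), (23, 250.0), (19, 40.0)]
    print(mutant_fraction(example_peaks, wildtype_length=21, mutant_length=23))
```

Because both products derive from the same primer, the ratio does not depend on any difference in hybridization efficiency between primers, which is the advantage noted above for the single-primer format.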

[0052] Sequence length or molecular mass of primer extension assay products may be determined using any known method for characterizing the size of nucleic acid sequences with which those skilled in the art are familiar. In a preferred embodiment, primer extension products are characterized by gel electrophoresis. In another preferred embodiment, primer extension products are characterized by mass spectrometry (MS), which may further include matrix assisted laser desorption ionization/time of flight (MALDI-TOF) analysis or other MS techniques known to those having skill in the art. See, for example, U.S. Pat. No. 5,622,824, U.S. Pat. No. 5,605,798 and U.S. Pat. No. 5,547,835, all of which are hereby incorporated by reference in their entireties. In another preferred embodiment, primer extension products are characterized by liquid or gas chromatography, which may further include high performance liquid chromatography (HPLC), gas chromatography-mass spectrometry (GC-MS) or other well known chromatographic methodologies.

[0053] In another particularly preferred embodiment of the invention, DNA in a biological sample containing mtDNA is first amplified by methodologies well known in the art, such that the amplification products may be used as templates in a method for detecting single nucleotide polymorphisms or homoplasmic mtDNA mutations present in the sample. Accordingly, it may be desirable to employ oligonucleotide primers that are complementary to target sequences that are identical in, and common to, wildtype and mutant mtDNA, for example PCR amplification templates and primers prepared according to Fahy et al. (Nucl. Acids Res., 25:3102, 1997) and Davis et al. (Proc. Nat. Acad. Sci. USA 94:4526, 1997; see also Hirano et al., Proc. Nat. Acad. Sci. USA 94:14894, 1997, and Wallace et al., Proc. Nat. Acad. Sci. USA 94:14900,1997.)

[0054] In certain other preferred embodiments, mtDNA mutations may be efficiently detected, screened and/or quantified by high throughput hybridization methodologies directed to independently probing a plurality of distinct mtDNAs, or a plurality of distinct oligonucleotide primers as provided herein, that have been immobilized as nucleic acid arrays on a solid phase support. Typically, the solid support may be silica, quartz or glass, or any other material on which nucleic acid may be immobilized in a manner that permits appropriate hybridization, washing and detection steps as known in the art and as provided herein. In preferred embodiments, solid-phase nucleic acid arrays are precisely spatially addressed, as described, for example, in U.S. Pat. No. 5,800,992 (see also, e.g., WO 95/21944; Schena et al., 1995 Science 270:467-470; Pease et al., 1994 Proc. Nat. Acad. Sci. USA 91:5022; Lipshutz et al., 1995 Biotechniques 19:442-447).

[0055] Detection of hybridized (e.g., duplexed) nucleic acids on the nucleic acid array may be achieved according to any known procedure, for example, by spectrometry or potentiometry (e.g., MALDI-MS). Within certain preferred embodiments the array contains oligonucleotides that are less than 50 bp in length. For high throughput screening of nucleic acid arrays, the format is preferably amenable to automation. It is preferred, for example, that an automated apparatus for use according to high throughput screening embodiments of the present invention is under the control of a computer or other programmable controller. The controller can continuously monitor the results of each step of the nucleic acid deposition, washing, hybridization, detection and related processes, and can automatically alter the testing paradigm in response to those results.

[0056] The present invention also provides compositions and methods that are useful in pharmacogenomics, for the classification and/or stratification of a subject or a patient population. Such stratification may involve, for example, correlation of single nucleotide polymorphisms or homoplasmic mtDNA mutations as provided herein with, for instance, one or more particular traits in a subject, such as, for example, indicators of the responsiveness to, or efficacy of, a particular therapeutic treatment or characteristic genomic DNA alterations, mutations, deletions, insertions or polymorphisms.

[0057] As described herein, determination of specific single nucleotide polymorphisms or homoplasmic mtDNA mutations may be used to stratify a patient population. Accordingly, in another preferred embodiment of the invention, determination of such mutations in a biological sample from a subject diagnosed with a disease may provide a useful correlative indicator for that subject. A disease subject so classified on the basis of one or more specific mutations may then be monitored using clinical parameters referred to above and known in the art, such that correlation between particular mtDNA mutations and any particular clinical score used to evaluate a disease may be monitored. For example, stratification of an AD patient population according to at least one of the single nucleotide polymorphisms or homoplasmic mtDNA mutations provided herein may provide a useful marker with which to correlate the efficacy of any candidate therapeutic agent being used in AD subjects. In a further preferred embodiment of this aspect of the invention, determination of one or more specific mtDNA mutations in concert with determination of an AD subject's APOE genotype may also be useful. These and related advantages will be appreciated by those familiar with the art.

[0058] In particularly preferred embodiments, oligonucleotide primers will be employed that permit specific detection of the single nucleotide polymorphisms or homoplasmic mtDNA point mutations disclosed in Table 3, wherein specific substitution and deletion mutations in mitochondrial genes including, for example, those encoding 12S rRNA, 16S rRNA, several tRNAs, COX1, COX2, COX3, cytochrome b, ATPase 8, ATPase 6, ND1, ND2, ND4 and ND5 are disclosed, as are numerous mutations in the mtDNA D-loop region. Each mutation listed in Table 3 is designated with (i) the identity of the particular nucleotide position of the mutation according to the wildtype human mtDNA sequence (Anderson et al., 1981 Nature 290:457; see also Andrews et al., 1999 Nature Genetics 23:147 and references cited therein), (ii) the mitochondrial gene region according to the convention of Anderson et al. (1981) and (iii) if the mutation is not a transition (purine-to-purine or pyrimidine-to-pyrimidine), the identity of the mutated nucleotide at that position in the case of a transversion (purine-to-pyrimidine or pyrimidine-to-purine), identified as disclosed herein, or of a deletion or insertion mutation. Thus, for example, the purine nucleotide G (guanine) situated at position 3010 of the wildtype mtDNA 16S rRNA gene is mutated to the purine nucleotide A (adenine) in mtDNA analyzed from a substantial number of haplogroup H samples (see Table 3).
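
The distinction between transitions and transversions used in designating the mutations of Table 3 may be sketched, for illustration only, as follows. The function name classify_substitution is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(reference_base, variant_base):
    """Return 'transition' or 'transversion' for a single-base substitution.

    A transition stays within one chemical class (purine-to-purine or
    pyrimidine-to-pyrimidine); a transversion crosses classes.
    """
    ref = reference_base.upper()
    var = variant_base.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or var not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == var:
        raise ValueError("reference and variant bases are identical")
    same_class = {ref, var} <= PURINES or {ref, var} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    print(classify_substitution("G", "A"))  # the position 3010 example
    print(classify_substitution("C", "A"))  # e.g. the C>A entry at 5178
```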

TABLE 3
HAPLOGROUP-SPECIFIC AND HAPLOGROUP-ASSOCIATED MITOCHONDRIAL SINGLE NUCLEOTIDE POLYMORPHISMS IN MTDNA OF 560 UNRELATED INDIVIDUALS

NP     GENE        TV/DEL/INS  HAPLOGROUP

A. Haplogroup-specific polymorphism

64     D-LOOP                  A
235    D-LOOP                  A
663    12S rRNA                A
1598   12S rRNA                A
1736   16S rRNA                A
3316   ND1                     A
4248   ND1                     A
4824   ND2                     A
4970   ND2                     A
6308   COI                     A
7112   COI                     A
7724   COII        A>T         A
8794   ATPase6                 A
11314  ND4                     A
12468  ND5                     A
12811  ND5                     A
13855  ND5                     A
14364  ND6                     A
16111  D-LOOP                  A
16290  D-LOOP                  A
16319  D-LOOP                  A
827    12S rRNA                B
3547   ND1                     B
4820   ND2                     B
4977   ND2                     B
6023   COI                     B
6216   COI                     B
6413   COI                     B
6473   COI                     B
6755   COI                     B
7241   COI                     B
9950   COIII                   B
11177  ND4                     B
15535  CYT B                   B
16217  D-LOOP                  B
249    D-LOOP      DEL         C
289    D-LOOP      DEL         C
290    D-LOOP      DEL         C
3552   ND1                     C
4715   ND2                     C
7196   COI         C>A         C
7694   COIII                   C
8584   ATPase6                 C
9545   COIII                   C
10454  tRNA-R                  C
12642  ND5                     C
12978  ND5                     C
14318  ND6                     C
15487  CYT B       A>T         C
15930  tRNA-T                  C
2092   16S rRNA                D
4883   ND2                     D
5178   ND2         C>A         D
8414   ATPase8                 D
11593  ND4         A>T         D
14668  ND6                     D
3027   16S rRNA                E
3705   ND1                     E
4491   ND2                     E
7598   COII                    E
239    D-LOOP                  H
456    D-LOOP                  H
477    D-LOOP                  H
951    12S rRNA                H
961    12S rRNA                H
3277   tRNA-L                  H
3333   ND1                     H
3591   ND1                     H
3796   ND1                     H
3915   ND1                     H
3992   ND1                     H
4024   ND1                     H
4310   tRNA-I                  H
4336   tRNA-Q                  H
4531   ND2                     H
4727   ND2                     H
4745   ND2                     H
4772   ND2                     H
4793   ND2                     H
5004   ND2                     H
6365   COI                     H
6776   COI                     H
6869   COI                     H
7013   COI                     H
8269   COII                    H
8448   ATPase8                 H
8602   ATPase6                 H
8803   ATPase6                 H
8839   ATPase6                 H
8843   ATPase6                 H
8898   ATPase6                 H
9123   ATPase6                 H
9150   ATPase6                 H
9380   COIII                   H
9804   COIII                   H
10044  tRNA-G                  H
11353  ND4                     H
11560  ND4                     H
12579  ND5                     H
13404  ND5                     H
13680  ND5                     H
13759  ND5                     H
14125  ND5                     H
14350  ND6                     H
14365  ND6                     H
14470  ND6         T>A         H
14582  ND6                     H
14872  CYT B                   H
15466  CYT B                   H
15789  CYT B                   H
15808  CYT B                   H
15833  CYT B                   H
16162  D-LOOP                  H
16293  D-LOOP                  H
199    D-LOOP                  I
250    D-LOOP                  I
3447   ND1                     I
3990   ND1                     I
4529   ND2         A>T         I
6734   COI                     I
8616   ATPase6     G>T         I
9947   COIII                   I
10034  tRNA-G                  I
10238  ND3                     I
11065  ND4                     I
12501  ND5                     I
13780  ND5                     I
15758  CYT B                   I
16391  D-LOOP                  I
228    D-LOOP                  J
295    D-LOOP                  J
462    D-LOOP                  J
2158   16S rRNA                J
2387   16S rRNA                J
3394   ND1                     J
5198   ND2                     J
5633   tRNA-A                  J
6464   COI         C>A         J
6554   COI                     J
6671   COI                     J
7476   tRNA-S                  J
7711   COII                    J
10084  ND3                     J
10172  ND3                     J
10192  ND3                     J
10499  ND4L                    J
10598  ND4L                    J
10685  ND4L                    J
11377  ND4                     J
12127  ND4                     J
12570  ND5                     J
12612  ND5                     J
13281  ND5                     J
13681  ND5                     J
13879  ND5                     J
13933  ND5                     J
14569  ND6                     J
15679  CYT B                   J
15812  CYT B                   J
16069  D-LOOP                  J
16092  D-LOOP                  J
16261  D-LOOP                  J
114    D-LOOP                  K
497    D-LOOP                  K
593    tRNA-F                  K
1189   12S rRNA                K
2217   16S rRNA                K
2483   16S rRNA                K
3480   ND1                     K
4295   tRNA-I                  K
4561   ND2                     K
5814   tRNA-C                  K
6260   COI                     K
9006   ATPase6                 K
9055   ATPase6                 K
9698   COIII                   K
9716   COIII                   K
9962   COIII                   K
10289  ND3                     K
10550  ND4L                    K
10978  ND4                     K
11299  ND4                     K
11470  ND4                     K
11485  ND4                     K
11840  ND4                     K
11869  ND4         C>A         K
11923  ND4                     K
12954  ND5                     K
13135  ND5                     K
13740  ND5                     K
13967  ND5                     K
14002  ND5                     K
14037  ND5                     K
14040  ND5                     K
14167  ND6                     K
15884  CYT B                   K
15946  tRNA-T                  K
16224  D-LOOP                  K
16234  D-LOOP                  K
16463  D-LOOP                  K
185    D-LOOP      G>T         L1
186    D-LOOP                  L1
189    D-LOOP                  L1
236    D-LOOP                  L1
247    D-LOOP                  L1
297    D-LOOP                  L1
357    D-LOOP                  L1
710    12S rRNA                L1
825    12S rRNA                L1
1048   12S rRNA                L1
1738   16S rRNA                L1
2245   16S rRNA                L1
2395   16S rRNA    DEL         L1
2758   16S rRNA                L1
2768   16S rRNA                L1
2885   16S rRNA                L1
3308   ND1                     L1
3516   ND1         C>A         L1
3666   ND1                     L1
3693   ND1                     L1
3777   ND1                     L1
3796   ND1         A>T         L1
3843   ND1                     L1
4312   tRNA-I                  L1
4586   ND2                     L1
5036   ND2                     L1
5393   ND2                     L1
5442   ND2                     L1
5603   tRNA-A                  L1
5655   tRNA-A                  L1
5913   COI                     L1
5951   COI                     L1
6071   COI                     L1
6150   COI                     L1
6185   COI                     L1
6253   COI                     L1
6548   COI                     L1
6827   COI                     L1
6989   COI                     L1
7055   COI                     L1
7076   COI                     L1
7146   COI                     L1
7337   COI                     L1
7389   COI                     L1
7867   COII                    L1
8248   COII                    L1
8428   ATPase8                 L1
8655   ATPase6                 L1
8784   ATPase6                 L1
8877   ATPase6                 L1
9042   ATPase6                 L1
9072   ATPase6                 L1
9347   COIII                   L1
9755   COIII                   L1
9818   COIII                   L1
10321  ND3                     L1
10586  ND4L                    L1
10589  ND4L                    L1
10664  ND4L                    L1
10688  ND4L                    L1
10792  ND4                     L1
10793  ND4                     L1
10810  ND4                     L1
11176  ND4                     L1
11641  ND4                     L1
11654  ND4                     L1
11899  ND4                     L1
12007  ND4                     L1
12049  ND4                     L1
12519  ND5                     L1
12720  ND5                     L1
12810  ND5                     L1
13149  ND5                     L1
13276  ND5                     L1
13485  ND5                     L1
13506  ND5                     L1
13789  ND5                     L1
13880  ND5         C>A         L1
13980  ND5                     L1
14000  ND5         T>A         L1
14148  ND5                     L1
14178  ND6                     L1
14203  ND6                     L1
14308  ND6                     L1
14560  ND6                     L1
14769  CYT B                   L1
14911  CYT B                   L1
15115  CYT B                   L1
15136  CYT B                   L1
16148  D-LOOP                  L1
16187  D-LOOP                  L1
16188  D-LOOP      C>G         L1
16230  D-LOOP                  L1
16264  D-LOOP                  L1
16265  D-LOOP                  L1
16360  D-LOOP                  L1
16527  D-LOOP                  L1
143    D-LOOP                  L2
1442   12S rRNA                L2
1706   16S rRNA                L2
2332   16S rRNA                L2
2358   16S rRNA                L2
2416   16S rRNA                L2
2789   16S rRNA                L2
3495   ND1         C>A         L2
3918   ND1                     L2
4158   ND1                     L2
4185   ND1                     L2
4370   tRNA-Q                  L2
4767   ND2                     L2
5027   ND2                     L2
5285   ND2                     L2
5331   ND2         C>A         L2
5581   tRNA-W                  L2
5744   NON-CODING              L2
6713   COI                     L2
7175   COI                     L2
7274   COI                     L2
7624   COII        T>A         L2
7771   COII                    L2
8080   COII                    L2
8206   COII                    L2
8387   ATPase8                 L2
8541   ATPase8                 L2
8790   ATPase6                 L2
8925   ATPase6                 L2
9221   COIII                   L2
10115  ND3                     L2
11944  ND4                     L2
12236  tRNA-S                  L2
12630  ND5                     L2
12693  ND5                     L2
12948  ND5                     L2
13803  ND5                     L2
14059  ND5                     L2
14544  ND6                     L2
14566  ND6                     L2
14599  ND6                     L2
15110  CYT B                   L2
15217  CYT B                   L2
15229  CYT B                   L2
15236  CYT B                   L2
15244  CYT B                   L2
15391  CYT B                   L2
15629  CYT B                   L2
15945  tRNA-T      T-INS       L2
16114  D-LOOP      C>A         L2
16213  D-LOOP                  L2
16309  D-LOOP                  L2
16390  D-LOOP                  L2
200    D-LOOP                  L3
2000   16S rRNA                L3
3438   ND1                     L3
3450   ND1                     L3
5773   tRNA-C                  L3
6524   COI                     L3
6587   COI                     L3
6680   COI                     L3
7424   COI                     L3
7618   COII                    L3
8616   ATPase6                 L3
8618   ATPase6                 L3
8650   ATPase6                 L3
9554   COIII                   L3
10086  ND3                     L3
10373  ND3                     L3
10667  ND4L                    L3
10819  ND4                     L3
11800  ND4                     L3
13101  ND5         A>C         L3
13886  ND5                     L3
13914  ND5         C>A         L3
14152  ND6                     L3
14212  ND6                     L3
14284  ND6                     L3
15099  CYT B                   L3
15311  CYT B                   L3
15670  CYT B                   L3
15824  CYT B                   L3
15942  tRNA-T                  L3
15944  tRNA-T      DEL         L3
16124  D-LOOP                  L3
16327  D-LOOP                  L3
930    12S rRNA                T
2141   16S rRNA                T
2850   16S rRNA                T
4688   ND2                     T
4917   ND2                     T
5277   ND2                     T
6489   COI         C>A         T
7022   COI                     T
8572   ATPase6                 T
8697   ATPase6                 T
9117   ATPase6                 T
9899   COIII                   T
10463  tRNA-R                  T
11242  ND4                     T
11812  ND4                     T
12633  ND5         C>A         T
13368  ND5                     T
13758  ND5         C>A         T
13965  ND5                     T
14233  ND6                     T
14687  tRNA-E                  T
14905  CYT B                   T
15028  CYT B                   T
15274  CYT B                   T
15607  CYT B                   T
15928  tRNA-T                  T
16163  D-LOOP                  T
16182  D-LOOP      A>C         T
16186  D-LOOP                  T
16294  D-LOOP                  T
16296  D-LOOP                  T
16324  D-LOOP                  T
988    12S rRNA                U
1700   16S rRNA                U
1721   16S rRNA                U
2294   16S rRNA                U
3116   16S rRNA                U
3197   16S rRNA                U
3348   ND1                     U
3720   ND1                     U
3849   ND1                     U
4553   ND2                     U
4646   ND2                     U
4703   ND2                     U
4732   ND2                     U
4811   ND2                     U
5319   ND2                     U
5390   ND2                     U
5495   ND2                     U
5656   NON-CODING              U
5999   COI                     U
6045   COI                     U
6047   COI                     U
6146   COI                     U
6518   COI                     U
6629   COI                     U
6719   COI                     U
7109   COI                     U
7385   COI                     U
7768   COII                    U
7805   COII                    U
8473   ATPase8                 U
9070   ATPase6     T>G         U
9266   COIII                   U
9477   COIII                   U
9667   COIII                   U
10506  ND4L                    U
10876  ND4                     U
10907  ND4                     U
10927  ND4                     U
11197  ND4                     U
11332  ND4                     U
11732  ND4                     U
12346  ND5                     U
12557  ND5                     U
12618  ND5                     U
13020  ND5                     U
13617  ND5                     U
13637  ND5                     U
14139  ND5                     U
14179  ND6                     U
14182  ND6                     U
14620  ND6                     U
14793  CYT B                   U
14866  CYT B                   U
15191  CYT B                   U
15218  CYT B                   U
15454  CYT B                   U
15693  CYT B                   U
15907  tRNA-T                  U
16051  D-LOOP                  U
16256  D-LOOP                  U
16270  D-LOOP                  U
16399  D-LOOP                  U
4580   ND2                     V
15904  tRNA-T                  V
194    D-LOOP                  W
1243   12S rRNA                W
1406   12S rRNA                W
3505   ND1                     W
6528   COI                     W
8994   ATPase6                 W
10097  ND3                     W
11674  ND4                     W
11947  ND4                     W
12414  ND5                     W
13263  ND5                     W
15775  CYT B                   W
15884  CYT B       G>C         W
16292  D-LOOP                  W
225    D-LOOP                  X
226    D-LOOP                  X
6371   COI                     X
8393   ATPase8                 X
8705   ATPase6                 X
14470  ND6                     X
15927  tRNA-T                  X
−73    D-LOOP                  not present in [H]
−7028  COI                     not present in [H]
#####  ND4                     not present in [H]
14766  CYT B                   not present in [H]

B. Haplogroup-associated polymorphisms

16362  D-LOOP                  A, C, D, E, H, L3
12705  ND5                     A, C, D, E, I, L1, L2, L3, W, X
1888   16S rRNA                A, C, T
13708  ND5                     A, J, X
8027   COII                    A, L1
153    D-LOOP                  A, X
207    D-LOOP                  B, I, L2, W
13590  ND5                     B, L2
9449   COIII                   B, L3
499    D-LOOP                  B, U
16325  D-LOOP                  C, D
10400  ND3                     C, D, E
14783  CYT B                   C, D, E
10398  ND3                     C, D, E, I, J, K, L1, L2, L3
15043  CYT B                   C, D, E, I, T
489    D-LOOP                  C, D, E, J
8701   ATPase6                 C, D, E, L1, L2, L3
9540   COIII                   C, D, E, L1, L2, L3
10873  ND4                     C, D, E, L1, L2, L3
15301  CYT B                   C, D, E, L2, L3
11914  ND4                     C, K, L1, L2
16298  D-LOOP                  C, V
3010   16S rRNA                D, H, J, U
16311  D-LOOP                  H, K, L1, L3
93     D-LOOP                  H, L1
16304  D-LOOP                  H, T
16291  D-LOOP                  H, U
16356  D-LOOP                  H, U
16482  D-LOOP                  H, U
15924  tRNA-T                  I, K, U
10915  ND4                     I, L1
204    D-LOOP                  I, L2, W
8251   COII                    I, W
1719   16S rRNA                I, X
14798  CYT B                   J, K
15257  CYT B                   J, K
5460   ND2                     J, L1, W
185    D-LOOP                  J, L3
11002  ND4                     J, L3
4216   ND1                     J, T
11251  ND4                     J, T
15452  CYT B       C>A         J, T
16126  D-LOOP                  J, T
9548   COIII                   J, U
13934  ND5                     J, U
5231   ND2                     K, L1
709    12S rRNA                K, L1, T, W
1811   16S rRNA                K, U
11467  ND4                     K, U
12308  tRNA-L                  K, U
12372  ND5                     K, U
182    D-LOOP                  L1, L2
198    D-LOOP                  L1, L2
769    12S rRNA                L1, L2
1018   12S rRNA                L1, L2
3594   ND1                     L1, L2
4104   ND1                     L1, L2
7256   COI                     L1, L2
7521   tRNA-D                  L1, L2
13650  ND5                     L1, L2
16278  D-LOOP                  L1, L2, L3, X
189    D-LOOP      A>C         L1, L3
2352   16S rRNA                L1, L3
13105  ND5                     L1, L3
5046   ND2                     L1, W
6152   COI                     L2, U
15784  CYT B                   L2, U, W
5147   ND2                     L3, T
6221   COI                     L3, X
5426   ND2                     T, U
13734  ND5                     T, U
13966  ND5                     T, X

[0059] TV/DEL/INS column: all nucleotide substitutions are transitions unless indicated otherwise. NP: nucleotide position; TV: transversion; DEL: deletion; INS: insertion.

[0060] The data of Table 3 are also depicted in Table 4, wherein the frequencies of occurrence of particular “haplogroup-specific” (i.e., characteristic of only a single haplogroup) or “haplogroup-associated” (i.e., detected in two or more identified haplogroups) mitochondrial single nucleotide polymorphisms among the 560 unrelated individuals analyzed are presented; full length mtDNA sequences of these 560 individuals are set forth at SEQ ID NOS:2-561 in the Sequence Listing.
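
For purposes of illustration, the distinction between haplogroup-specific and haplogroup-associated polymorphisms may be expressed as a simple rule applied to per-haplogroup occurrence counts, as in the following sketch. The function name and the counts shown are hypothetical and are not taken from Table 4.

```python
def classify_polymorphism(counts_by_haplogroup):
    """Label a polymorphism by the number of haplogroups in which it occurs.

    counts_by_haplogroup maps a haplogroup name to the number of
    individuals of that haplogroup carrying the polymorphism.
    Returns (label, sorted list of haplogroups with at least one carrier).
    """
    carriers = sorted(
        name for name, count in counts_by_haplogroup.items() if count > 0
    )
    if not carriers:
        return ("not observed", carriers)
    if len(carriers) == 1:
        return ("haplogroup-specific", carriers)
    return ("haplogroup-associated", carriers)


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(classify_polymorphism({"H": 0, "J": 12, "T": 0}))
    print(classify_polymorphism({"H": 0, "J": 9, "T": 7}))
```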

6TABLE 4HAPLOGROUP-SPECIFIC AND HAPLOGROUP-ASSOCIATED POLYMORPHISMSIN MTDNA OF 560 UNRELATED INDIVIDUALSHAPLOGROUPABCDEHIJKL1L2L3MTUVWXTV/NNNNNNNNNNNNNNNNNNDEL/==================NPGENEINS25181393226143347132320146428811REF64D-LOOP20114−73D-LOOP25181393191223441023201383571093D-LOOP153114D-LOOP8143D-LOOP4153D-LOOP19117182D-LOOP94185D-LOOP112215185D-LOOPG > T16186D-LOOP4189D-LOOP4189D-LOOPA > C111477194D-LOOP51198D-LOOP33199D-LOOP21121200D-LOOP171204D-LOOP39237207D-LOOP317126225D-LOOP7226D-LOOP15228D-LOOP113235D-LOOP241236D-LOOP13239D-LOOP10247D-LOOP12249D-LOOPDEL12250D-LOOP10289D-LOOPDEL10290D-LOOPDEL10295D-LOOP123297D-LOOP4357D-LOOP6456D-LOOP161462D-LOOP18477D-LOOP131489D-LOOP1393231497D-LOOP25499D-LOOP1713593tRNA-F166312S25 3-6,rRNA14-1770912S2397146181rRNA71012S7rRNA75012S25181393218143347132317146428811rRNA76912S1323rRNA82512S13rRNA82712S17rRNA93012S11161rRNA95112S31rRNA96112S4rRNA98812S3rRNA101812S1323rRNA104812S2:L1arRNA118912S341rRNA124312S81rRNA140612S5rRNA143812S2518139320812334762320146428811rRNA144212S3rRNA159812S211rRNA170016S4rRNA170616S3rRNA171916S21411111, 3,rRNA5, 6,9, 12,14172116S41rRNA173616S25rRNA173816S7rRNA181116S146151rRNA188816S261461rRNA200016S210rRNA209216S914rRNA214116S3rRNA215816S211rRNA221716S15rRNA224516S2:L1arRNA229416S3rRNA233216S42rRNA235216S7112, 3,rRNA6, 8,10235816S3rRNA238716S2rRNA239516SDEL4rRNA241616S23rRNA248316S21rRNA270616S2518139313143147132320146428811rRNA275816S132, 10rRNA276816S7rRNA278916S18rRNA285016S3rRNA288516S13rRNA301016S973271131rRNA302716S11rRNA311616S21rRNA319716S1241, 4rRNA3277tRNA-L23308ND1713316ND1211113333ND123348ND123394ND111213438ND11133447ND1413450ND163480ND14713495ND1C > A33505ND1813516ND1C > A2:L1a3547ND11423552ND11213591ND1123594ND113232, 3,5, 6,8, 103666ND112113693ND1713705ND121113720ND1713777ND123796ND1133796ND1A > T123843ND123849ND123915ND11913918ND117113990ND1313992ND194024ND1104104ND1132314158ND1324185ND1124216ND12331461, 44248ND1254295tRNA-I34310tRNA-I24312tRNA-I2:L1a4, 104336tRNA-Q101, 
12,144370tRNA-Q34491ND224529ND2A > T121, 3,4, 9,12, 144531ND224553ND244561ND21814580ND281,3-5,94586ND22:L1a4646ND215:U41, 44688ND2124703ND234715ND2134727ND254732ND2214745ND2214767ND2134769ND2251813932091433471323201464288114772ND224793ND2514811ND224820ND21714824ND22514883ND294917ND21461, 44970ND224977ND2145004ND21115027ND235036ND275046ND271815147ND21131615178ND2C > A92-6,11,13-175198ND235231ND252:L1a5277ND2515285ND275319ND2125331ND2C > A35390ND2715393ND275426ND27715442ND212:L1a15460ND21122:L1a11815495ND2215581tRNA-W315603tRNA-A2:L1a5633tRNA-A315655tRNA-A75656141574425773tRNA-C65814tRNA-C35913COI1415951COI45999COI15:U416023COI32116045COI716047COI5:U416071COI46146COI26150COI216152COI21716185COI2:L1a6216COI36221COI11111116253COI126260COI121616308COI46365COI76371COI1116413COI316464COIC > A26473COI146489COIC > A56518COI26524COI26528COI26548COI76554COI26587COI56629COI26671COI1126680COI126713COI26719COI126734COI416755COI216776COI2316827COI1176869COI46989COI77013COI37022COI3−7028COI25181393121433471323201464288113, 97055COI111107076COI27109COI37112COI37146COI137175COI187196COIC > A137241COI27256COI12237274COI187337COI27385COI217389COI117424COI137476tRNA-S617521tRNA-D13237598COII1137618COII27624COIIT > A37694COII27711COII27724COIIA > T27768COII817771COII187805COII27867COII78027COII2548080COII28206COII238248COII78251COII13181, 3,5, 9,12, 148269COII818387ATPase388393ATPase588414ATPase91488428ATPase2:L1a88448ATPase388473ATPase4288541ATPase288572ATPase1268584ATPase1368602ATPase368616ATPaseG > T4168616ATPase211, 2,6108618ATPase231, 3,66, 8,108650ATPase468655ATPase1368697ATPase1146168701ATPase13931323201168705ATPase3168784ATPase268790ATPase368794ATPase2568803ATPase268839ATPase268843ATPase3168860ATPase2518139322014334713232014642881168877ATPase268898ATPase268925ATPase268994ATPase81, 3,64, 99006ATPase269042ATPase2:L1a69055ATPase1471,63-5,9, 149070ATPaseT > G269072ATPase42, 3,68, 
109117ATPase1369123ATPase1169150ATPase369221COIII239266COIII1139347COIII2:L1a9380COIII69449COIII1269477COIII2319540COIII139313232019545COIII112119548COIII229554COIII11129667COIII1819698COIII4719716COIII139755COIII12:L1a9804COIII59818COIII2:L1a9899COIII69947COIII3119950COIII1219962COIII310034tRNA-G121, 4,5, 9,12, 1410044tRNA-G5710084ND3411110086ND361-3,6, 8,1010097ND3210115ND323110172ND3310192ND3210238ND3141, 410289ND3310321ND341010373ND316110398ND3139313293313232011, 2,5, 6,8, 9,11-1710400ND3139314-6,11,13-1710454tRNA-R2110463tRNA-R461, 410499ND4L3110506ND4L310550ND4L4710586ND4L1410589ND4L22:L1a110598ND4L11210664ND4L2:L1a10667ND4L210685ND4L113110688ND4L1310792ND4210793ND4210810ND42131, 2,5, 6,8, 1010819ND411110873ND413931323201610876ND47110907ND412:U410915ND4132:L1a110927ND42110978ND4711002ND4236, 1011065ND42111176ND42:L1a1111177ND41411197ND42111242ND4211251ND43346111299ND447111314ND4211332ND45:U41, 411353ND42111377ND431111467ND44642111470ND411111485ND4711560ND4211593ND4A > T211641ND42:L1a1311654ND4211674ND481#####ND42518139361433471323201464281111732ND42111800ND42311812ND41:a35111840ND4711869ND4C > A511899ND4211914ND4123133181211923ND4311944ND42211947ND48112007ND422122:L1a112049ND4212127ND4212236tRNA-S312308tRNA-L47421,3-5,912346ND5111212372ND5147421, 412414ND511812468ND5412501ND514112519ND5712557ND5212570ND5212579ND5212612ND5331, 412618ND52112630ND52312633ND5C > A11112642ND52212693ND51812705ND52513931413231918111, 412720ND52:L1a12810ND543, 8,1012811ND52112948ND5312954ND5712978ND5313020ND5271113101ND5A > C213105ND521310113135ND5513149ND5213263ND51243, 4,5, 11,13-1713276ND52:L1a13281ND5313368ND5461,3-5,913404ND5413485ND5413506ND51313590ND5172313617ND523113637ND54113650ND5132313680ND5213681ND5213708ND5314333151,3-5,9, 12,1413734ND527113740ND5713758ND5C > A213759ND5313780ND513113789ND51113803ND5182, 1013855ND5213879ND52113880ND5C > A713886ND51313914ND5C > A613933ND5313934ND55313965ND5813966ND5311113967ND5313980ND51214000ND5T > 
A414002ND5214037ND5514040ND5214059ND5314125ND5214139ND5314148ND5214152ND6314167ND647114178ND61114179ND6214182ND6118114203ND6614212ND621114233ND6135114284ND6314308ND6112:L1a14318ND61214350ND6214364ND641114365ND6914470ND61111111, 414470ND6T > A314544ND62114560ND611114566ND61714569ND632114582ND61014599ND6214620ND65:U4114668ND6914687tRNA-E814766CYT B25171393211334613232014642811314769CYT B16114783CYT B1393114793CYT B16114798CYT B2447114866CYT B2:U414872CYT B214905CYT B3461, 414911CYT B415028CYT B5115043CYT B139313151, 415099CYT B1215110CYT B13115115CYT B1715136CYT B2:L1a15191CYT B215217CYT B315218CYT B213115229CYT B215236CYT B2115244CYT B1715257CYT B631, 1415274CYT B215301CYT B113931123201115311CYT B615326CYT B2518139322014334713212014642881115391CYT B215452CYT BC > A3346115454CYT B315466CYT B215487CYT BA > T1215535CYT B1715607CYT B461, 4,5, 915629CYT B715670CYT B12515679CYT B215693CYT B5:U4115758CYT B611115775CYT B1215784CYT B1111173415789CYT B215808CYT B215812CYT B31115824CYT B615833CYT B9115884CYT B13415884CYT BG > C8115904tRNA-T81, 415907tRNA-T71, 415924tRNA-T311112211, 415927tRNA-T5115928tRNA-T461, 4,915930tRNA-T62115942tRNA-T415944tRNA-TDEL615945tRNA-TT-INS215946tRNA-T216051D-LOOP1511616069D-LOOP2316092D-LOOP322116111D-LOOP233116114D-LOOPC > A3116124D-LOOP716126D-LOOP22363816148D-LOOP2:L1a16162D-LOOP14116163D-LOOP716172D-LOOP22132216182D-LOOPA > C811516186D-LOOP816187D-LOOP1116188D-LOOPC > G12:L1a16213D-LOOP316217D-LOOP1416224D-LOOP14116230D-LOOP2:L1a16234D-LOOP1212162116256D-LOOP17116261D-LOOP112416264D-LOOP26116265D-LOOP216270D-LOOP1241816278D-LOOP11411123621016290D-LOOP23116291D-LOOP1811416292D-LOOP111716293D-LOOP8516294D-LOOP2417138216296D-LOOP12716298D-LOOP11235316304D-LOOP1611916309D-LOOP111316311D-LOOP111421431217131116319D-LOOP25112316320D-LOOP2122:L1a316324D-LOOP1516325D-LOOP11109116327D-LOOP12516356D-LOOP9416360D-LOOP416362D-LOOP2429214113613116390D-LOOP111121223116391D-LOOP11016399D-LOOP12121016463D-LOOP216482D-LOOP916527D-LOOP121TV/DEL/INS column: all 
nucleotide substitutions are transitions unless indicated otherwise. NP: nucleotide position; TV: transversion; DEL: deletion; INS: insertion.

References:
1: Finnilä et al. 2001 Am J Hum Genet 68:1475-1484
2: Chen et al. 2000 Am J Hum Genet 66:1362-1383
3: Alves-Silva et al. 2000 Am J Hum Genet 67:444-461
4: Macaulay et al. 1999 Am J Hum Genet 64:232-249
5: Wallace et al. 1999 Gene 238:211-230
6: Quintana-Murci et al. 1999 Nat Genet 23:437-441
7: Lehtonen et al. 1999s in mtDNA
8: Rando et al. 1998 Am J Hum Genet 62:531-550
9: Torroni et al. 1996 Genetics 144:1835-1850
10: Chen et al. 1995 Am J Hum Genet 57:133-149
11: Torroni et al. 1994 Am J Phys Anthropol 93:189-199
12: Torroni et al. 1994 Am J Hum Genet 55:760-776
13: Torroni et al. 1994 Am J Phys Anthropol 93:189-199
14: Torroni et al. 1994 J Bioenerg Biomembr 26:261-271
15: Torroni et al. 1993 Am J Hum Genet 53:563-590
16: Torroni et al. 1992 Genetics 130:153-162
17: Wallace et al. 1992 Hum Biol 64:403-416

[0061] A mitochondrial single nucleotide polymorphism or homoplasmic mtDNA point mutation, which includes a deviation in the identity of the nucleotide base situated at a specific position in a mtDNA sequence relative to the “wildtype” human mtDNA sequence (CRS) disclosed by Anderson et al. (1981), may fall into at least one of the following categories: An “error” refers to sequencing mistakes in the human mtDNA sequence reported by Anderson et al. (1981), as corrected by Andrews et al. (1999 Nature Genetics 23:147). A “polymorphism” refers to a known polymorphism in a human mtDNA sequence that is not associated with a particular human disease, but that has been detected and described as a result of naturally occurring variability in the identity of the nucleotide base situated at a given position in a human mtDNA sequence (see, e.g., “Mitomap”, Emory University School of Medicine, available at http://www.gen.emory.edu/mitomap.html). A “rare polymorphism” refers to a mtDNA nucleotide that differs from the base situated at the corresponding position in the Cambridge Reference Sequence (CRS) of Anderson et al. (1981) but which, upon subsequent accumulation of human mtDNA sequence data from a plurality of subjects (and in contrast to the reliance of Anderson et al. upon the mtDNA sequence of a single donor to generate the CRS), suggests the presence of a low frequency allele in the CRS donor, relative to the larger sample population (see Andrews et al., 1999 Nature Genetics 23:147 and references cited therein). Particularly useful mutations that segregate with AD according to the present invention include homoplasmic mtDNA point mutations (e.g., single nucleotide polymorphisms) that are not errors, polymorphisms or rare polymorphisms as just described, and additionally, the homoplasmic mtDNA point mutations (e.g., single nucleotide polymorphisms). A number of polymorphisms are identified in Tables 3 and 4, and in FIGS. 1-6; full length mtDNA sequences of 560 unrelated human subjects are set forth at SEQ ID NOS:2-561 in the Sequence Listing.

[0062] In certain embodiments of the invention, the presence or absence of a specific genetic mutation or variation, such as, for example, a single nucleotide polymorphism or a deletion, that correlates with a specific haplogroup, disease or individual may be sufficient to determine the haplogroup, presence or risk of disease or identity of the individual from whom the biological sample being tested was obtained. In these situations, the association or correlation of a particular genetic mutation or variation with a haplogroup, disease or individual may be determined by means generally acceptable to those with skill in the relevant or a related art. For example, an association or correlation may be established by the presence of a statistically significant increase or decrease in the presence or absence of a single nucleotide polymorphism or other genetic alteration or marker in samples from subjects with a disease or haplogroup. An association or correlation with a specific individual may also be determined by a statistically significant presence or absence of a specific genetic mutation or alteration within mtDNA derived from the individual compared to the general population or a sample of other individuals.
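By way of illustration only, one generally accepted way to test whether a variant is over- or under-represented in one group of samples relative to another is Fisher's exact test on a 2x2 table of carrier counts. The sketch below is not taken from the specification: the function name and the counts in the usage line are hypothetical, and any other suitable statistical test could be substituted.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: cases carrying the variant      b: cases without the variant
    c: controls carrying the variant   d: controls without the variant
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that exactly x of the carriers are cases.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Hypothetical counts, for illustration only (not data from the examples):
# 12 of 40 cases and 3 of 60 controls carry the variant.
p_value = fisher_exact_two_sided(12, 28, 3, 57)
print(p_value)
```

A small p-value would suggest that the variant frequency differs between the two groups; the threshold used to call significance is a design choice left to the practitioner.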

[0063] In other embodiments of the invention, the determination of the presence or risk of a disease, the haplogroup or identity of an individual, or the genetic relationship between individuals may be determined by analyzing one or more genetic markers, including one or more mtDNA single nucleotide polymorphisms or deletions. In these situations, the correlation or association of one specific mtDNA mutation or alteration with a certain phenotype or individual need not be statistically significant in isolation. Rather, the overall analysis of multiple markers may be used to establish the association or correlation. The presence or absence of one or more markers together may be statistically associated or correlated with a disease, haplogroup or individual.

[0064] As presented below, the accompanying examples reveal that among individuals of the general population in the United States there are large differences in the frequencies of single nucleotide polymorphisms (typically nucleotide substitutions) that are carried in the mtDNA genome (see also Table 3). While distinct sets of SNPs are associated with particular mtDNA haplogroups, individuals in the examples described below were found to carry up to 8 additional non-synonymous and 22 additional synonymous nucleotide substitutions in protein-coding genes, up to 8 nucleotide changes each in the ribosomal RNA and tRNA genes, and up to 13 changes in the D-loop region. Most of these nucleotide substitutions are likely to be of a systemic nature, as suggested by their detection following analyses of paired blood and skeletal muscle samples. Additionally, while the samples collected in the attached examples were non-neoplastic tissues, they exhibited some of the same nucleotide changes that were reported as acquired mutations in cancer tissue (Fliss, M. S. et al. (2000) Science 287:2017-2019). Thus, as also noted above and as shown in Table 3, the present invention provides non-haplogroup associated mitochondrial single nucleotide polymorphisms that may be used to determine the unique identity of an individual and/or to determine the presence of or risk for having a disease (e.g., Alzheimer's disease, diabetes), in addition to providing improved profiles of mitochondrial single nucleotide polymorphisms for identifying a mitochondrial haplogroup and/or a mitochondrial haplogroup subgroup.

[0065] The present invention also contemplates compositions and methods for the detection of potentially pathogenic mtDNA mutations involved in human disease, either in the background of, or independent of, polymorphisms associated with mtDNA haplogroups. As noted above, the invention also provides materials and methods for haplogroup identification and genealogical and forensic analyses.

[0066] The following examples are offered by way of illustration, and not by way of limitation.

EXAMPLES

Example 1

DNA Isolation From Blood, Muscle and Brain Samples

[0067] Blood samples, muscle biopsies, and frozen brain samples were collected from 560 maternally unrelated individuals (as determined from family-history information) of European, African and Asian descent after institutional review board (IRB) approval and informed consent. The sampled population consisted of 38% females (age range 33-93 years, mean age 72) and 62% males (age range 24-103 years, mean age 61). Total cellular DNA was prepared from white blood cells and frozen brain tissue by homogenization and cell lysis in TE buffer (10 mM Tris-HCl, pH 7.5, 1 mM EDTA) containing proteinase K (400 μg/ml) and 1% SDS at 37° C. for 12 hr, followed by phenol:chloroform:isoamyl alcohol (50:48:2) and chloroform:isoamyl alcohol (24:1) extractions. Alternatively, mitochondria were isolated from frozen brain tissue and mtDNA was extracted. DNA was precipitated with ethanol and resuspended in TE buffer. DNA concentrations were determined by UV absorption at 260 nm.

Example 2

[0068] PCR Amplification, Sequencing and Sequence Analysis

[0069] MtDNA was amplified in 68 fragments of approximately 500 bp each in length with 50% overlap between neighboring fragments. PCR primers were 16-26 nucleotides in length and designed to be complementary to the mitochondrial light and heavy strands.

[0070] PCR amplification was performed as described below. The oligonucleotide primer set presented in Table 5 was used to generate 68 PCR product fragments spanning the complete mtDNA molecule, each fragment having approximately 50% sequence overlap with each neighboring product fragment. This strategy permitted direct mtDNA sequencing in both forward and reverse directions with four-fold redundancy in the identification of each nucleotide base, resulting in error-free sequencing. Thus, for each patient sample approximately 68,000 nucleotides were sequenced and analysis of homoplasmic mutations was verified.
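The figures quoted above are mutually consistent, as the following short calculation illustrates. It is a sketch only: the genome length of 16,569 bp is an assumption (the commonly cited length of the human mtDNA reference sequence) and the 500 bp fragment length is the approximate value given in the text.

```python
GENOME_LENGTH = 16569   # assumed length of the circular reference mtDNA (bp)
N_FRAGMENTS = 68        # number of PCR fragments, from the text
FRAGMENT_LENGTH = 500   # approximate fragment length (bp), from the text
DIRECTIONS = 2          # each fragment is sequenced forward and reverse

# Total number of nucleotides read per sample.
total_read = N_FRAGMENTS * FRAGMENT_LENGTH * DIRECTIONS

# Average number of independent reads covering each base of the genome.
mean_redundancy = total_read / GENOME_LENGTH

print(total_read)                 # 68000
print(round(mean_redundancy, 1))  # 4.1
```

Put differently, with about 50% overlap each base lies in two neighboring fragments, and each fragment is read in two directions, which gives the four-fold redundancy described.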

TABLE 5
OLIGONUCLEOTIDE PRIMERS SPECIFIC FOR INDICATED REGIONS OF MITOCHONDRIAL GENOME

FRAGMENT	PRIMER	GENE	NUCLEOTIDE	PRIM. LENGTH	PRIMER SEQUENCE 5′ −> 3′	SEQ ID NO.	FRAG. LENGTH
201	201F	D-Loop	16225L	24	CAACTATCACACATCAACTGCAAC		512
201	201R		168H	21	TCGCCTGTAATATTGAACGTA
202	202F	D-Loop	16477L	23	GCTAAAGTGAACTGTATCCGACA		379
202	202R		287H	21	GACTGTTAAAAGTGCATACCG
203	203F	D-Loop	155L	21	TATTTATCGCACCTACGTTCA		407
203	203R		562H	18	AACTGTGGGGGGTGTCTT
204	204F	D-Loop/tRNA Phe/12S	275L	23	GACATGATAACAAAAAATTTCCA		510
204	204R		785H	18	GTGTGGCTAGGCTAAGCG
205	205F	D-Loop/tRNA Phe/12S	498L	16	CGCCCATCCTACCCAG		541
205	205R		1039H	25	TCTTAGCTATTGTGTGTTCAGATAT
206	206F	12S	773L	19	TGCAGCTCAAAACGCTTAG		525
206	206R		1298H	18	CGTGGGTACTTGCGCTTA
207	207F	12S	1031L	24	GCTTTAACATATCTGAACACACAA		505
207	207R		1536H	24	CTTGTCTCCTCTATATAAATGCGT
208	208F	12S/tRNA Val/16S	1285L	20	GAAGGCTACAAAGTAAGCGC		500
208	208R		1785H	17	TCATCTTTCCCTTGCGG
209	209F	12S/tRNA Val/16S	1535L	24	TACGCATTTATATAGAGGAGACAA		461
209	209R		1996H	18	ATCACCAGGCTCGGTAGG
210	210F	16S	1780L	19	TAGTACCGCAAGGGAAAGA		417
210	210R		2197H	19	GTTGAGCTTGAACGCTTTC
211	211F	16S	1986L	17	AGGCGACAAACCTACCG		411
211	211R		2397H	23	GTGAGGGTAATAATGACTTGTTG
212	212F	16S	2165L	21	CCATAGTAGGCCTAAAAGCAG		420
212	212R		2585H	22	AGTGATTATGCTACCTTTGCAC
213	213F	16S	2380L	24	CAATATCTACAARCAACCAACAAG		413
213	213R		2793H	20	ACCGAAATTTTTAATGCAGG
214	214F	16S	2580L	19	TAACCGTGCAAAGGTAGCA		405
214	214R		2985H	18	CCTGATCCAACATCGAGG
215	215F	16S	2779L	21	CCTAAACTACCAAACCTGCAT		442
215	215R		3221H	20	GCCATCTTAACAAACCTGT
216	216F	16S/tRNA Leu/ND1	2974L	19	AGGGTTTACGACCTCGATG		517
216	216R		3491H	17	GGTAGATGTGGCGGGTT
217	217F	16S/tRNA Leu/ND1	3228L	19	TTGTTAAGATGGCAGAGCC		506
217	217R		3734H	20	AATGATGGCTAGGGTGACTT
218	218F	ND1	3482L	16	AGCCCCTAAAACCCGC		498
218	218R		3980H	22	ATAATGTTTGTGTATTCGGCTA
219	219F	ND1	3718L	23	CAAACAATCTCATATGAAGTCAC		520
219	219R		4238H	17	AGGGGGAATGCTGGAGA
220	220F	ND1/tRNAs Ile/Gln/Met/ND2	3967L	19	GCCCTATTCTTCATAGCCG		514
220	220R		4481H	15	ATGACGGGTTGGGCC
221	221F	ND1/tRNAs Ile/Gln/Met/ND2	4224L	21	CATACCCATTACAATCTCCAG		511
221	221R		4735H	26	TATTAATGATGAGTATTGATTGGTAG
222	222F	ND2	4481L	15	GGCCCAACCCGTCAT		508
222	222R		4989H	20	GCTAAGATTTTGCGTAGCTG
223	223F	ND2	4691L	23	CCTCTTCAACAATATACTCTCCG		548
223	223R		5239H	16	GGCAAAAAGCCGGTTA
224	224F	ND2	4979L	18	AAACCAGACCCAGCTACG		506
224	224R		5485H	25	AAGATTATTAGTATAAAAGGGGAGA
225	225F	ND2/tRNAs Trp/Ala/Asn	5234L	15	CCCGCTAACCGGCTT		480
225	225R		5714H	22	GAGAAGTAGATTGAAGCCAGTT
226	226F	ND2/tRNAs Trp/Ala/Asn/Cys/Tyr+O-L	5455L	17	TCATCGCCCTTACCACG		537
226	226R		5992H	19	AGGAGGCTTAGAGCTGTGC
227	227F	tRNAs Ala/Asn/Cys/Tyr+O-L/CO1	5700L	20	TAAGCACCCTAATCAACTGG		542
227	227R		6242H	20	CCTCCACTATAGCAGATGCG
228	228F	COI	5995L	21	CAGCTCTAAGCCTCCTTATTC		498
228	228R		6493H	21	CAGCTAGGACTGGGAGAGATA
229	229F	COI	6230L	19	CCTACTCCTGCTCGCATCT		527
229	229R		6757H	21	TATGGTGTGCTCACACGATAA
230	230F	COI	6476L	24	AGCAGTCCTACTTCTCCTATCTCT		510
230	230R		6986H	25	GTCGTGTAGTACGATGTCTAGTGAT
231	231F	COI	6713L	23	CATAGGTATGGTCTGAGCTATGA		517
231	231R		7230H	18	GGTGTATGCATCGGGGTA
232	232F	CO1/tRNA Ser	6979L	24	CAAACTCATCACTAGACATCGTAC		501
232	232R		7480H	17	ATGGGGTTGGCTTGAAA
233	233F	CO1/tRNAs Ser/Asp/COII	7225L	16	CGGACTACCCCGATGC		519
233	233R		7744H	20	CCTGAGCGTCTGAGATGTTA
234	234F	tRNAs Ser/Asp/COII	7469L	18	CCCAAAGCTGGTTTCAAG		510
234	234R		7979H	18	GTCAAGGAGTCGCAGGTC
235	235F	COII	7690L	20	CTTCCTAGTCCTGTATGCCC		557
235	235R		8247H	18	GGGTAAATACGGGCCCTA
236	236F	COII/tRNA Lys/ATPase 8	7975L	16	AGGCGACCTGCGACTC		504
236	236R		8479H	21	TTTATTTTTATGGGCTTTGGT
237	237F	COII/tRNA Lys/ATPase8/ATPase 6	8225L	22	ATTCCCCTAAAAATCTTTGAAA		516
237	237R		8741H	25	GCAATAAAAATGATTAAGGATACTA
238	238F	ATPase 8/ATPase 6	8499L	24	AAAATTATAACAAACCCTGAGAAC		497
238	238R		8996H	16	GCGGTTAGGCGTACGG
239	239F	ATPase 8/ATPase 6/COIII	8722L	26	CGAACCTGATCTCTTATACTAGTATC		515
239	239R		9237H	18	TCATGGGCTGGGTTTTAC
240	240F	ATPase 6/COIII	8978L	18	TTCAACCAATAGCCCTGG		506
240	240R		9484H	19	CAGAAAAATCCTGCGAAGA
241	241F	COIII	9229L	22	ATCATATAGTAAAACCCAGCCC		505
241	241R		9734H	21	TCGAAGTACTCTGAGGCTTGT
242	242F	COIII/tRNA Glu	9470L	21	CTCAGAAGTTTTTTTCTTCGC		537
242	242R		10007H	25	AGTTAATTGGAAGTTAACGGTACTA
243	243F	COIII/tRNA Glu/ND3	9732L	22	CTACAAGCCTCAGAGTACTTCG		520
243	243R		10252H	21	GGAGGGCAATTTCTAGATCAA
244	244F	COIII/tRNA Glu/ND3/tRNA Arg	9976L	24	ATTGATGAGGGTCTTACTCTTTTA		523
244	244R		10499H	23	CTAGAAGTGAGATGGTAAATGCT
245	245F	ND3/tRNA Arg/ND4L	10209L	26	TTCTCCATAAAATTCTTCTTAGTAGC		522
245	245R		10731H	26	AGTAGGTTTAGGTTATGTACGTAGTC
246	246F	tRNA Arg/ND4L/ND4	10450L	24	ATGATAATCATATTTACCAAATGC		536
246	246R		10986H	18	GTTGGCTTGCCATGATTG
247	247F	ND4L/ND4	10719L	23	ACATATGGCCTAGACTACGTACA		511
247	247R		11230H	18	GCGATGAGTAGGGGAAGG
248	248F	ND4	10979L	17	CCCCTCACAATCATGGC		502
248	248R		11481H	22	AGTGTGAGGCGTATTATACCAT
249	249F	ND4	11213L	20	TACACCCTAGTAGGCTCCCT		524
249	249R		11737H	21	TTTGAGTTTGCTAGGCAGAAT
250	250F	ND4	11474L	20	GGCGGCTATGGTATAATACG		510
250	250R		11984H	24	CCATTGTGTTGTGGTAAATATGTA
251	251F	ND4/tRNAs His/Ser	11721L	24	TTACATCCTCATTACTATTCTGCC		511
251	251R		12232H	18	TTAGACATGGGGGCATGA
252	252F	ND4/tRNAs His/Ser/Leu/ND5	11968L	23	AGCCCTATACTCCCTCTACATAT		538
252	252R		12506H	25	TTCGAGATAATAACTTCTTGGTCTA
253	253F	tRNAs Ser/Leu/ND5	12213L	22	GCTCACAAGAACTGCTAACTCA		516
253	253R		12729H	22	GAATAGGTTGTTAGCGGTAACT
254	254F	ND5	12458L	24	CATCCACCTTTATTATCAGTCTCT		532
254	254R		12990H	17	ATTTGCCTGCTGCTGCT
255	255F	ND5	12714L	25	TACCATACTAATCTTAGTTACCGCT		520
255	255R		13234H	21	AGTGGAGAAGGCTACGATTTT
256	256F	ND5	12979L	17	GGCCTCCTCCTAGCAGC		501
256	256R		13480H	21	ACCTGTGAGGAAAGGTATTCC
257	257F	ND5	13225L	21	GACATCAAAAAAATCGTAGCC		504
257	257R		13729H	25	AAATGTTGTTAGTAATGAGAAATCC
258	258F	ND5	13472L	21	CATTAGCAGGAATACCTTTCC		506
258	258R		13978H	19	CTAGGAGGAGTAGGGGCAG
259	259F	ND5/ND6	13720L	21	CTATTCGCAGGATTTCTCATT		517
259	259R		14237H	16	GTGCGGGGGCTTTGTA
260	260F	ND5/ND6	13939L	19	CGCACAATCCCCTATCTAG		545
260	260R		14484H	22	TTAATTTATTTAGGGGGAATGA
261	261F	ND6/tRNA Glu	14209L	22	AACTACTACTAATCAACGCCCA		522
261	261R		14731H	22	GGTCATTGGTGTTCTTGTAGTT
262	262F	ND6/tRNA Glu/Cyt b	14471L	20	CCAAAGACAACCATCATTCC		508
262	262R		14979H	18	CGTGAAGGTAGCGGATGA
263	263F	tRNA Glu/Cyt b	14715L	23	ACCATCGTTGTATTTCAACTACA		514
263	263R		15229H	21	TAGCCTCCTCAGATTCATTGA
264	264F	Cyt b	14961L	22	ACGTAAATTATGGCTGAATCAT		518
264	264R		15479H	20	CCTAGGAGGTCTGGTGAGAA
265	265F	Cyt b	15223L	22	CCTAGTTCAATGAATCTGAGGA		509
265	265R		15732H	17	AATGAGGAGGTCTGCGG
266	266F	Cyt b/tRNA Thr/Phe	15449L	24	TTCCTTCTCTCCTTAATGACATTA		530
266	266R		15979H	20	TAGCTTTGGGTGCTAATGGT
267	267F	Cyt b/tRNAs Thr/Phe/D-Loop	15723L	18	GACTCCTAGCCGCAGACC		509
267	267R		16232H	20	GGAGTTGCAGTTGATGTGTG
268	268F	tRNA Phe/D-Loop	15968L	22	TCTTTAACTCCACCATTAGCAC		517
268	268R		16485H	24	GGAACCAGATGTCGGATACAGTTC
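The fragment lengths listed in Table 5 appear to equal the distance from the forward (L-strand) primer coordinate to the reverse (H-strand) primer coordinate, measured around the circular genome. The following minimal sketch checks this for a few rows. The genome length of 16,569 bp is an assumption (it is not stated in the table), and the helper name `fragment_span` is hypothetical.

```python
GENOME_LENGTH = 16569  # assumed length of the circular reference sequence (bp)

def fragment_span(forward_pos, reverse_pos, genome_length=GENOME_LENGTH):
    """Distance from the forward primer coordinate to the reverse primer
    coordinate, walking forward around the circular genome."""
    return (reverse_pos - forward_pos) % genome_length

# (fragment, forward coordinate, reverse coordinate, length listed in Table 5)
rows = [
    (201, 16225, 168, 512),    # crosses the origin of the numbering
    (202, 16477, 287, 379),    # crosses the origin of the numbering
    (203, 155, 562, 407),
    (268, 15968, 16485, 517),
]

for fragment, fwd, rev, listed in rows:
    print(fragment, fragment_span(fwd, rev), listed)
```

The modulo operation is what handles fragments 201 and 202, whose forward primers lie near the end of the numbering and whose reverse primers lie near its start.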

[0071] PCR amplifications were performed in triplicate, each containing 5-50 ng total cellular DNA or 1 ng mtDNA, 100 ng each of forward and reverse primers, and 12.5 μl of Taq PCR Master Mix (Qiagen) in a reaction volume of 25 μl. After denaturation at 95° C. for 2 min, amplification was carried out for 30 cycles at 95° C. for 10 sec, 60° C. for 10 sec, and 72° C. for 1 min, followed by 72° C. for 4 min and cooling to 4° C. Triplicate reactions were pooled and purified with the QIAquick 96 PCR Purification Kit (Qiagen). Sequencing reactions were performed using 3 μl of PCR product, forward or reverse PCR primer, and BigDyeTerminator chemistry (Perkin-Elmer). Sequencing reactions were purified using Centri-Sep 96 plates (Princeton Separations). Electrophoresis and base calling were performed using a 3700 DNA Analyzer (Perkin-Elmer). Sequence data for the PCR fragments were built into contiguous mtDNA sequences using extensive source code modifications in the Contig Assembly Program (CAP; Thompson, J. D. et al., (1994) Nucl Acids Res 22:4673-4680) and aligned with the published sequence of mitochondrial DNA (e.g., SEQ ID NO:1) using publicly available FASTA (Pearson et al., 1990 Proc. Nat. Acad. Sci. USA 85:2444) and CLUSTALW (Thompson et al., 1994 Nucl. Ac. Res. 22:4673) software that was modified in order to identify nucleotide substitutions. Data analysis included categorization of sample sequences according to various parameters, including: patient haplogroup, mtDNA gene region in which an identified SNP resided and, for protein encoding mtDNA genes in which a SNP was identified, whether the SNP was a synonymous substitution (i.e., resulted in no change in the amino acid sequence of the encoded protein) or a non-synonymous substitution (i.e., resulted in a different amino acid sequence for the encoded protein). Full length mtDNA sequences from 560 unrelated human subjects are set forth as SEQ ID NOS:2-561.
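The synonymous versus non-synonymous categorization described above can be sketched as follows. This is an illustrative example rather than the modified software actually used: it assumes the vertebrate mitochondrial genetic code (NCBI translation table 2), assumes the codon is given on the coding strand, and uses a hypothetical helper name, `classify_substitution`.

```python
BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order under the vertebrate
# mitochondrial genetic code (NCBI translation table 2); "*" marks a stop.
MITO_AA = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: MITO_AA[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def classify_substitution(codon, offset, new_base):
    """Classify a single-base change within one codon.

    codon:    reference codon on the coding strand, e.g. "CTA"
    offset:   0-based position of the changed base within the codon
    new_base: the substituted base
    Returns (kind, reference amino acid, variant amino acid).
    """
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    kind = "synonymous" if before == after else "non-synonymous"
    return kind, before, after

print(classify_substitution("CTA", 2, "G"))  # leucine to leucine
print(classify_substitution("GCC", 0, "A"))  # alanine to threonine
```

Using the mitochondrial rather than the standard nuclear code matters here: in this table TGA encodes tryptophan, ATA encodes methionine, and AGA and AGG are stop codons.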

Example 3

Single Nucleotide Polymorphisms in Mitochondrial DNA that Segregate with Specific Haplogroups

[0072] The distribution of mtDNA haplogroups amongst the 435 individual samples of European origin was 52%, 10.8%, 9.7%, 10.6%, 7.6%, 1.8%, 3.2%, 1.8% and 2.5% for the haplogroups H (n=226), K (n=47), U (n=42), T (n=46), J (n=33), W (n=8), I (n=14), V (n=8) and X (n=11), respectively. The African mtDNA haplogroups L1, L2, and L3 were represented by 3.3%, 4.1%, and 7.4%, respectively, in our population before collection of additional African American individuals in order to expand data sets for the L haplogroups, which totaled 56 individual subjects; 69 samples were obtained from individual subjects of Asian descent. Novel polymorphisms were identified that were associated, either as “haplogroup-specific” or “haplogroup-associated” polymorphisms as described above, with one (haplogroup-specific) or more (haplogroup-associated) of mtDNA haplogroups A, B, C, D, E, H, I, J, K, L1, L2, L3, T, U, V, W and X (Tables 3 and 4). Novel nucleotide substitutions that were specific for individual European haplogroups are C114T, C497T, T1189C, A3480G, T9698C, A10550G, A10978C, T11299C, A11470G, T12954C, C14167T and T14798C for haplogroup K; T3197C, A7768G, G9477A, T13617C, T14182, A14793G, A15218G and C16256T for haplogroup U; G228A, C295T, C462T and G15257A for haplogroup J; G930A, G1888A, T5426C, C6489A, G8697A, A11812G, T13965C, A14233G and C16296T for haplogroup T; T1243C, A3505G, G5046A, G5460A, C11674T and G15884C for haplogroup W; T250C, G12501A, and A13780G for haplogroup I; T239C, C456T, T4336C, T677C, A16162G for haplogroup H; T16298C for haplogroup V; G225A, T226C, G6371T, C8393T, A13966G, and G15927A for haplogroup X. Other novel SNPs are shared by two of the European haplogroups, i.e., A1811G and A11467G, which occurred in haplogroups U and K; G207A in haplogroups I and W; T4216C, A11251G, and C15452A in haplogroups J and T; T16304C in haplogroups H and T; A15924G in haplogroups I and K; G13708A in haplogroups J and X. Many of these novel SNPs allowed for the identification of novel subgroups for haplotypes H, K, U, J, and T based on CLUSTALW/NJPLOT7 analysis (FIG. 1).
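The European haplogroup percentages quoted above follow directly from the per-haplogroup sample counts, as this short check illustrates. The counts are those given in the text; the variable names are illustrative.

```python
# Sample counts per European haplogroup, as given in the text.
counts = {"H": 226, "K": 47, "U": 42, "T": 46, "J": 33,
          "W": 8, "I": 14, "V": 8, "X": 11}

total = sum(counts.values())

# Percentage of the European samples in each haplogroup, to one decimal place.
percent = {hg: round(100 * n / total, 1) for hg, n in counts.items()}

print(total)    # 435
print(percent)
```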

[0073] The nucleotide substitutions T489C, G709A, G3010A, T6221C, G11914A, C12705T, G14905A, G15043A, C16223T, C16294T and T16362C were found in both European and African haplogroups. In this regard, they are “haplogroup-associated” rather than “haplogroup-specific”. The finding that all individuals not belonging to haplogroup H carried the nucleotide substitutions A73G, C7028T, and G11719A, which were absent in most of the H haplotypes (93-99%), confirmed a previous report (Andrews, R. M. et al. (1999) Nature Genetics 23:147).
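The distinction between “haplogroup-specific” and “haplogroup-associated” polymorphisms amounts to counting the haplogroups in which a given substitution is observed. The sketch below illustrates this with a small subset of the substitutions named in this example; the function name `label_markers` and the data layout are hypothetical, and the subset is far from a complete marker list.

```python
from collections import defaultdict

# Haplogroup -> a few of the substitutions reported for it in this example.
observed = {
    "V": ["T16298C"],
    "U": ["A1811G", "T3197C"],
    "K": ["A1811G", "C114T"],
    "I": ["G207A", "T250C"],
    "W": ["G207A", "T1243C"],
}

def label_markers(observed):
    """Label each marker by the number of haplogroups that carry it."""
    carriers = defaultdict(set)
    for haplogroup, markers in observed.items():
        for marker in markers:
            carriers[marker].add(haplogroup)
    return {
        marker: ("haplogroup-specific" if len(groups) == 1
                 else "haplogroup-associated", sorted(groups))
        for marker, groups in carriers.items()
    }

labels = label_markers(observed)
print(labels["T16298C"])  # carried by a single haplogroup
print(labels["A1811G"])   # shared by two haplogroups
```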

Example 4

Single Nucleotide Polymorphisms and Deletions in Mitochondrial DNA that Segregate with African Haplogroups

[0074] Nucleotide changes that were novel and unique to all members of the three African mtDNA haplogroups are A8701G, T9540C, and T10873C. In addition to the unique changes previously identified for the L1 and L2 haplogroups at T10810C and G16390A, respectively, the substitutions G247A, T825A, G2758A, T2885C, G3666A, T7146, C8468T, C8655T, G10688A, C13506T, T13789C, T14178C, G14560A, and C16187T were found in all (or all but one) haplogroup L1 mtDNAs, whereas the substitutions T2416C, G8206A, A9221G, T10155C, T11944C, and G13590A were present in all (or all but one) haplogroup L2 mtDNAs. Changes at nucleotide positions 2758, 2768, 3308, and 16187, previously reported and not associated with any mtDNA haplogroup, were found in haplogroup L1 mtDNA (Table 3). Nucleotide substitutions that were found both in haplogroups L1 and L2 in addition to C3594T were C182T, G769A, G1018A, A4104G, C7256T, T7521C, and C13650T. A nucleotide substitution found in haplogroups L1 and L3 was A13105G, and a substitution found in L2 and L3 was G15301A.
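The substitution names used throughout these examples (reference base, nucleotide position, variant base or “del”) can be parsed and sorted into transitions and transversions mechanically. The following sketch is illustrative only; the function name `parse_variant` is hypothetical, and names that lack a variant base, such as T7146 above, are deliberately rejected rather than guessed.

```python
import re

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
_PATTERN = re.compile(r"^([ACGT])(\d+)([ACGT]|del)$")

def parse_variant(name):
    """Split a name such as 'G247A' or 'A249del' into its parts and
    classify it as a transition, transversion or deletion."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"cannot parse variant name: {name!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if alt == "del":
        kind = "deletion"
    elif ref == alt:
        raise ValueError(f"reference and variant base are identical: {name!r}")
    elif {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES:
        kind = "transition"      # purine<->purine or pyrimidine<->pyrimidine
    else:
        kind = "transversion"    # purine<->pyrimidine
    return ref, pos, alt, kind

for name in ["G247A", "T825A", "A249del"]:
    print(name, parse_variant(name))
```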

[0075] The African L3 haplotypes were identified based on the absence of the HpaI restriction site at nucleotide position 3592, which corresponds to the absence of a C to T transition at nucleotide position 3594, and also the absence of substitutions at positions 182, 769, 1018, 4104, 7256, 7521, and 13650. Nucleotide changes that occurred in L3 mtDNA only were A249del, A289del, A290del, C2092T, T2352C, C4883T, C5178A, C6587T, C8650T, A9545G, A14152G, C14668T, T15670C, T15942C, C3450T, T3552A, A4715G, G5733A, T6221C, C7196A, G8584A, C9449T, A10086G, G10373A, C10400T, A10819G, A13263G, C13914A, T14212C, T14318C, T14783C, A15311G, A15487T, A15824G, G15930A, T15944del, T16124C, T16325C, and C16327T. Many of these nucleotide changes were clustered in subgroups of L3 (FIG. 2). All of the L3 haplotypes were characterized by the absence of polymorphisms that were present in one or both of the other African haplogroups, except for substitutions A13105 and G15301A, which were also present in some of L1 and all of L2 mtDNAs, respectively. Of the three African haplogroups identified to date, the L3 mtDNA haplogroup that was found in the population sample described herein appeared to be of most recent origin, based, according to non-limiting theory, on the observation that the overall level of sequence divergence was lower than for the L1 and L2 haplogroups.
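The marker logic described in this example for distinguishing the African haplogroups can be sketched as a simple rule-based classifier. This is a deliberately simplified illustration, not the analysis actually performed: it uses only the three substitutions described as common to all African haplogroups, the C3594T transition, and the L1 and L2 markers T10810C and G16390A, and it ignores the "all but one" exceptions noted in the text. The function name `classify_african` is hypothetical.

```python
# Substitutions described as common to all three African haplogroups.
AFRICAN_SHARED = {"A8701G", "T9540C", "T10873C"}

def classify_african(variants):
    """Assign L1, L2 or L3 from a set of substitution names (simplified)."""
    variants = set(variants)
    if not AFRICAN_SHARED <= variants:
        return "not L1/L2/L3"
    # L3 is recognized by the ABSENCE of the C3594T transition.
    if "C3594T" not in variants:
        return "L3"
    has_l1 = "T10810C" in variants   # marker described for haplogroup L1
    has_l2 = "G16390A" in variants   # marker described for haplogroup L2
    if has_l1 and not has_l2:
        return "L1"
    if has_l2 and not has_l1:
        return "L2"
    return "L1 or L2 (unresolved)"

sample = {"A8701G", "T9540C", "T10873C", "C3594T", "T10810C"}
print(classify_african(sample))  # L1
```

In practice a classifier of this kind would weigh many more of the listed substitutions together, since individual markers may be missing from a given mtDNA.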

[0076] Additional clusters of polymorphisms (Tables 3 and 4) were detected and their relationships plotted using reduced median networks (Bandelt et al. 1995 Genetics 141:743) for European mtDNA haplogroups (FIG. 3), for European H and V mtDNA haplogroups (FIG. 4), for African mtDNA haplogroups (FIG. 5) and for Asian mtDNA haplogroups (FIG. 6).

Example 5

Single Nucleotide Polymorphisms in Mitochondrial DNA that Correlate with Cancer and Tumors

[0077] None of the somatic changes previously reported in colorectal cancer tissue were found in tissues from the population analyzed here. More than one-third of the somatic changes previously found in lung, bladder, and head and neck tumors were also detected in the herein described population which included non-neoplastic blood, brain, and skeletal muscle samples (Table 2). The A1811G polymorphism was associated with mtDNA haplogroups K and U and was detected in 100% of the haplogroup K and in 29% of the haplogroup U mtDNAs for which the entire mtDNA was sequenced, and in an additional 24 blood samples from individuals for which only the ribosomal RNA genes were analyzed. The G2758A, A2768G, T3308C, G10688A, T10810C, and T16187C changes were associated specifically with haplogroup L1 mtDNAs and appeared together with a number of other polymorphisms (Table 1, FIG. 1). The A to C substitution (not C to A as previously reported) was found at nucleotide 16183 in 25 individual blood samples (12% of our sample population) and in four paired skeletal muscle samples from the same individuals. Furthermore, nucleotide substitutions at positions 150 (C to T), 195 (T to C) and 16519 (T to C) were common systemic polymorphisms which were found in 13%, 25% and 70%, respectively, of the herein characterized population. These substitutions were present in blood and brain, and in all skeletal muscle tissues paired with blood samples. Nucleotide substitution A4917G was found in all of haplogroup T mtDNAs where it was associated with other T-specific polymorphisms (Table 3, FIG. 1).

[0078] From the foregoing it will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention. Accordingly, the invention is not limited except as by the appended claims.

Number	Date	Country
60369539	Apr 2002	US
60369131	Mar 2002	US
60333622	Nov 2001	US
