Nucleic acids containing single nucleotide polymorphisms and methods of use thereof

Information

  • Patent Grant
  • 6670464
  • Patent Number
    6,670,464
  • Date Filed
    Tuesday, November 16, 1999
  • Date Issued
    Tuesday, December 30, 2003
Abstract
The invention provides nucleic acids containing single-nucleotide polymorphisms identified for transcribed human sequences, as well as methods of using the nucleic acids.
Description




BACKGROUND OF THE INVENTION




Sequence polymorphism-based analysis of nucleic acid sequences has led to novel approaches for determining the identity and relatedness of individuals. The approach is generally based on alterations in nucleic acid sequences between related individuals. This analysis has been widely used in a variety of genetic, diagnostic, and forensic applications. For example, polymorphism analyses are used in identity and paternity analysis, and in genetic mapping studies.




Several different types of polymorphisms in nucleic acids have been described. One such type of variation is a restriction fragment length polymorphism (RFLP). RFLPs can create or delete a recognition sequence for a restriction endonuclease in one nucleic acid relative to a second nucleic acid. The result of the variation is an alteration in the relative length of restriction enzyme-generated DNA fragments in the two nucleic acids.




Other polymorphisms take the form of short tandem repeat (STR) sequences, which are also referred to as variable number of tandem repeat (VNTR) sequences. STR sequences typically include tandem repeats of 2, 3, or 4 nucleotide sequences that are present in a nucleic acid from one individual but absent from a second, related individual at the corresponding genomic location.




Other polymorphisms take the form of single nucleotide variations, termed single nucleotide polymorphisms (SNPs), between individuals. A SNP can, in some instances, be referred to as a “cSNP” to denote that the nucleotide sequence containing the SNP originates as a cDNA.




SNPs can arise in several ways. A single nucleotide polymorphism may arise due to a substitution of one nucleotide for another at the polymorphic site. Substitutions can be transitions or transversions. A transition is the replacement of one purine nucleotide by another purine nucleotide, or one pyrimidine by another pyrimidine. A transversion is the replacement of a purine by a pyrimidine, or the converse.
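By way of illustration only, the distinction between transitions and transversions can be sketched in a few lines of Python. The function name `classify_substitution` and the treatment of U as a pyrimidine are assumptions made for this example and are not part of the disclosure.

```python
# Illustrative sketch: classify a single-base substitution as a
# transition (purine<->purine or pyrimidine<->pyrimidine) or a
# transversion (purine<->pyrimidine).

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}  # U included so RNA bases are accepted (assumption)


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a one-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError(f"unrecognized base(s): {ref!r}, {alt!r}")
    if ref == alt:
        raise ValueError("reference and variant bases are identical")
    # Both purines or both pyrimidines: the base class is unchanged.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"
```

For example, `classify_substitution("A", "G")` is a purine-to-purine change and is reported as a transition, whereas `classify_substitution("A", "C")` is reported as a transversion.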




Single nucleotide polymorphisms can also arise from a deletion of a nucleotide or an insertion of a nucleotide relative to a reference allele. Thus, the polymorphic site is a site at which one allele bears a gap with respect to a single nucleotide in another allele. Some SNPs occur within or near genes. One such class includes SNPs falling within regions of genes encoding a polypeptide product. These SNPs may result in an alteration of the amino acid sequence of the polypeptide product and give rise to the expression of a defective or other variant protein. Such variant products can, in some cases, result in a pathological condition, e.g., genetic disease. Examples of genetic diseases in which a polymorphism within a coding sequence gives rise to the disease include sickle cell anemia and cystic fibrosis. Other SNPs do not result in alteration of the polypeptide product. Of course, SNPs can also occur in noncoding regions of genes.




SNPs tend to occur with great frequency and are spaced uniformly throughout the genome. The frequency and uniformity of SNPs mean that there is a greater probability that such a polymorphism will be found in close proximity to a genetic locus of interest.




SUMMARY OF THE INVENTION




The invention is based in part on the discovery of novel single nucleotide polymorphisms (SNPs) in regions of human DNA.




Accordingly, in one aspect, the invention provides an isolated polynucleotide which includes one or more of the SNPs described herein. The polynucleotide can be, e.g., a nucleotide sequence which includes one or more of the polymorphic sequences shown in Table 1 (SEQ ID NOS: 1-1192) and which includes a polymorphic sequence, or a fragment of the polymorphic sequence, as long as it includes the polymorphic site. The polynucleotide may alternatively contain a nucleotide sequence which includes a sequence complementary to one or more of the sequences (SEQ ID NOS: 1-1192), or a fragment of the complementary nucleotide sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.




The polynucleotide can be, e.g., DNA or RNA, and can be between about 10 and about 100 nucleotides, e.g., 10-90, 10-75, 10-51, 10-40, or 10-30, nucleotides in length.
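The requirement that a fragment include the polymorphic site can be illustrated with a short Python sketch. It enumerates every contiguous fragment of a chosen length that still covers a given site. The function name `fragments_with_site`, the 0-based `site_index`, and the toy sequence in the usage note are hypothetical and serve only as an example.

```python
from typing import List


def fragments_with_site(sequence: str, site_index: int, length: int) -> List[str]:
    """Return all contiguous fragments of `length` bases that cover `site_index`.

    `site_index` is a 0-based position of the polymorphic site in `sequence`
    (an assumption of this sketch).
    """
    if not 0 <= site_index < len(sequence):
        raise ValueError("site_index lies outside the sequence")
    if not 1 <= length <= len(sequence):
        raise ValueError("length must be between 1 and len(sequence)")
    # Earliest start that still reaches the site, and latest start that
    # begins at the site without running past the end of the sequence.
    first_start = max(0, site_index - length + 1)
    last_start = min(site_index, len(sequence) - length)
    return [sequence[s:s + length] for s in range(first_start, last_start + 1)]
```

With the toy input `fragments_with_site("ACGTACGTAC", 4, 3)`, the three fragments returned each contain the base at index 4; fragments that do not overlap the site are never produced.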




In some embodiments, the polymorphic site in the polymorphic sequence includes a nucleotide other than the nucleotide listed in Table 1, column 5 for the polymorphic sequence, e.g., the polymorphic site includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence.




In other embodiments, the complement of the polymorphic site includes a nucleotide other than the complement of the nucleotide listed in Table 1, column 5 for the complement of the polymorphic sequence, e.g., the complement of the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
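As a non-limiting illustration of the relationship between a polymorphic sequence and its complement, the following Python sketch computes the reverse complement of a DNA sequence and the position of the polymorphic site on the complementary strand. The helper names `reverse_complement` and `complement_site_index` are assumptions of this example, and only unambiguous DNA bases are handled.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def complement_site_index(sequence_length: int, site_index: int) -> int:
    """Map a 0-based site position onto the reverse-complement strand."""
    if not 0 <= site_index < sequence_length:
        raise ValueError("site_index lies outside the sequence")
    return sequence_length - 1 - site_index
```

Under this sketch, a site occupied by G in the original strand is occupied by C at the mapped position of the complementary strand, which mirrors the way the complement of the nucleotide in column 5 or column 6 is referred to above.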




In some embodiments, the polymorphic sequence is associated with a polypeptide related to one of the protein families disclosed herein. For example, the nucleic acid may be associated with a polypeptide related to angiopoietin, 4-hydroxybutyrate dehydrogenase, or any of the other proteins identified in Table 1, column 10.




In another aspect, the invention provides an isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide containing a polymorphic site. The first polynucleotide can be, e.g., a nucleotide sequence comprising one or more polymorphic sequences (SEQ ID NOS:1-1192), provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence. Alternatively, the first polynucleotide can be a nucleotide sequence that is a fragment of the polymorphic sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence, or a complementary nucleotide sequence which includes a sequence complementary to one or more polymorphic sequences (SEQ ID NOS:1-1192), provided that the complementary nucleotide sequence includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5. The first polynucleotide may in addition include a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.




In some embodiments, the oligonucleotide does not hybridize under stringent conditions to a second polynucleotide. The second polynucleotide can be, e.g., (a) a nucleotide sequence comprising one or more polymorphic sequences (SEQ ID NOS:1-1192), wherein the polymorphic sequence includes the nucleotide listed in Table 1, column 5 for the polymorphic sequence; (b) a nucleotide sequence that is a fragment of any of the polymorphic sequences; (c) a complementary nucleotide sequence including a sequence complementary to one or more polymorphic sequences (SEQ ID NOS:1-1192), wherein the polymorphic sequence includes the complement of the nucleotide listed in Table 1, column 5; and (d) a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.




The oligonucleotide can be, e.g., between about 10 and about 100 bases in length. In some embodiments, the oligonucleotide is between about 10 and 75 bases, 10 and 51 bases, 10 and about 40 bases, or about 15 and 30 bases in length.




The invention also provides a method of detecting a polymorphic site in a nucleic acid. The method includes contacting the nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-1192, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5. The method also includes determining whether the nucleic acid and the oligonucleotide hybridize. Hybridization of the oligonucleotide to the nucleic acid sequence indicates the presence of the polymorphic site in the nucleic acid.




In preferred embodiments, the oligonucleotide does not hybridize to the polymorphic sequence when the polymorphic sequence includes the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or when the complement of the polymorphic sequence includes the complement of the nucleotide recited in Table 1, column 5 for the polymorphic sequence.




The oligonucleotide can be, e.g., between about 10 and about 100 bases in length. In some embodiments, the oligonucleotide is between about 10 and 75 bases, 10 and 51 bases, 10 and about 40 bases, or about 15 and 30 bases in length.




In some embodiments, the polymorphic sequence identified by the oligonucleotide is associated with a nucleic acid encoding a polypeptide related to one of the protein families disclosed herein. For example, the nucleic acid may be associated with a polypeptide related to angiopoietin, 4-hydroxybutyrate dehydrogenase, or any of the other proteins identified in Table 1, column 10.




In a further aspect, the invention provides a method of determining the relatedness of a first and second nucleic acid. The method includes providing a first nucleic acid and a second nucleic acid and contacting the first nucleic acid and the second nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-1192, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5. The method also includes determining whether the first nucleic acid and the second nucleic acid hybridize to the oligonucleotide, and comparing hybridization of the first and second nucleic acids to the oligonucleotide. Hybridization of the first and second nucleic acids to the oligonucleotide indicates that the first and second subjects are related.




In preferred embodiments, the oligonucleotide does not hybridize to the polymorphic sequence when the polymorphic sequence includes the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or when the complement of the polymorphic sequence includes the complement of the nucleotide recited in Table 1, column 5 for the polymorphic sequence.




The oligonucleotide can be, e.g., between about 10 and about 100 bases in length. In some embodiments, the oligonucleotide is between about 10 and 75 bases, 10 and 51 bases, 10 and about 40 bases, or about 15 and 30 bases in length.




The method can be used in a variety of applications. For example, the first nucleic acid may be isolated from physical evidence gathered at a crime scene, and the second nucleic acid may be obtained from a person suspected of having committed the crime. Matching the two nucleic acids using the method can establish whether the physical evidence originated from the person.




In another example, the first sample may be from a human male suspected of being the father of a child and the second sample may be from the child. Establishing a match using the described method can establish whether the male is the father of the child.




In another aspect, the method includes determining if a sequence polymorphism is present in a subject, such as a human. The method includes providing a nucleic acid from the subject and contacting the nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-1192, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for said polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5. Hybridization between the nucleic acid and the oligonucleotide is then determined. Hybridization of the oligonucleotide to the nucleic acid sequence indicates the presence of the polymorphism in said subject.




In another aspect, the invention provides an isolated polypeptide comprising a polymorphic site at one or more amino acid residues, and wherein the protein is encoded by a polynucleotide including one of the polymorphic sequences SEQ ID NOS:1-1192, or their complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.




The polypeptide can be, e.g., related to one of the protein families disclosed herein. For example, the polypeptide can be related to angiopoietin, 4-hydroxybutyrate dehydrogenase, ATP-dependent RNA helicase, MHC Class I histocompatibility antigen, or phosphoglycerate kinase.




In some embodiments, the polypeptide is translated in the same open reading frame as is a wild type protein whose amino acid sequence is identical to the amino acid sequence of the polymorphic protein except at the site of the polymorphism.




In some embodiments, the polypeptide is encoded by a polymorphic sequence, or its complement, wherein the polymorphic sequence includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence, or the complement includes the complement of the nucleotide listed in Table 1, column 6.




The invention also provides an antibody that binds specifically to a polypeptide encoded by a polynucleotide comprising a nucleotide sequence selected from the group consisting of polymorphic sequences SEQ ID NOS:1-1192, or its complement. The polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.




In some embodiments, the antibody binds specifically to a polypeptide encoded by a polymorphic sequence which includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence.




Preferably, the antibody does not bind specifically to a polypeptide encoded by a polymorphic sequence which includes the nucleotide listed in Table 1, column 5 for the polymorphic sequence.




The invention further provides a method of detecting the presence of a polypeptide having one or more amino acid residue polymorphisms in a subject. The method includes providing a protein sample from the subject and contacting the sample with the above-described antibody under conditions that allow for the formation of antibody-antigen complexes. The antibody-antigen complexes are then detected. The presence of the complexes indicates the presence of the polypeptide.




The invention also provides a method of treating a subject suffering from, at risk for, or suspected of suffering from a pathology ascribed to the presence of a sequence polymorphism in a subject, e.g., a human, non-human primate, cat, dog, rat, mouse, cow, pig, goat, or rabbit. The method includes providing a subject suffering from a pathology associated with aberrant expression of a first nucleic acid comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS:1-1192, or its complement, and treating the subject by administering to the subject an effective dose of a therapeutic agent. Aberrant expression can include qualitative alterations in expression of a gene, e.g., expression of a gene encoding a polypeptide having an altered amino acid sequence with respect to its wild-type counterpart. Qualitatively different polypeptides can include shorter, longer, or altered polypeptides relative to the amino acid sequence of the wild-type polypeptide. Aberrant expression can also include quantitative alterations in expression of a gene. Examples of quantitative alterations in gene expression include lower or higher levels of expression of the gene relative to its wild-type counterpart, or alterations in the temporal or tissue-specific expression pattern of a gene. Finally, aberrant expression may also include a combination of qualitative and quantitative alterations in gene expression.




The therapeutic agent can include, e.g., a second nucleic acid comprising the polymorphic sequence, provided that the second nucleic acid comprises the nucleotide present in the wild type allele. In some embodiments, the second nucleic acid sequence comprises a polymorphic sequence which includes the nucleotide listed in Table 1, column 5 for the polymorphic sequence.




Alternatively, the therapeutic agent can be a polypeptide encoded by a polynucleotide comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS:1-1192, or by a polynucleotide comprising a nucleotide sequence that is complementary to any one of polymorphic sequences SEQ ID NOS:1-1192, provided that the polymorphic sequence includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence.




The therapeutic agent may further include an antibody as herein described, or an oligonucleotide comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS:1-1192, or a polynucleotide comprising a nucleotide sequence that is complementary to any one of polymorphic sequences SEQ ID NOS:1-1192, provided that the polymorphic sequence includes the nucleotide listed in Table 1, column 5 or Table 1, column 6 for the polymorphic sequence.




In another aspect, the invention provides an oligonucleotide array comprising one or more oligonucleotides hybridizing to a first polynucleotide at a polymorphic site encompassed therein. The first polynucleotide can be, e.g., a nucleotide sequence comprising one or more polymorphic sequences (SEQ ID NOS:1-1192); a nucleotide sequence that is a fragment of any of the nucleotide sequences, provided that the fragment includes a polymorphic site in the polymorphic sequence; a complementary nucleotide sequence comprising a sequence complementary to one or more polymorphic sequences (SEQ ID NOS:1-1192); or a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.




In preferred embodiments, the array comprises 10; 100; 1,000; 10,000; 100,000 or more oligonucleotides.




The invention also provides a kit comprising one or more of the herein-described nucleic acids. The kit can include, e.g., a polynucleotide which includes one or more of the SNPs described herein. The polynucleotide can be, e.g., a nucleotide sequence which includes one or more of the polymorphic sequences shown in Table 1 (SEQ ID NOS: 1-1192) and which includes a polymorphic sequence, or a fragment of the polymorphic sequence, as long as it includes the polymorphic site. The polynucleotide may alternatively contain a nucleotide sequence which includes a sequence complementary to one or more of the sequences (SEQ ID NOS:1-1192), or a fragment of the complementary nucleotide sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence. Alternatively, or in addition, the kit can include an isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide containing a polymorphic site. The first polynucleotide can be, e.g., a nucleotide sequence comprising one or more polymorphic sequences (SEQ ID NOS:1-1192), provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence. Alternatively, the first polynucleotide can be a nucleotide sequence that is a fragment of the polymorphic sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence, or a complementary nucleotide sequence which includes a sequence complementary to one or more polymorphic sequences (SEQ ID NOS:1-1192), provided that the complementary nucleotide sequence includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5. The first polynucleotide may in addition include a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.




Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In the case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.




Other features and advantages of the invention will be apparent from the following detailed description and claims.











DETAILED DESCRIPTION OF THE INVENTION




The invention provides human SNPs in sequences which are transcribed, i.e., are cSNPs, and which have not been previously described. As is explained in more detail below, many SNPs have been identified in genes related to polypeptides of known function. If desired, SNPs associated with various polypeptides can be used together. For example, SNPs can be grouped according to whether they are derived from a nucleic acid encoding a polypeptide related to a particular protein family or involved in a particular function. Thus, SNPs related to ATPase associated protein may be used together, as may SNPs associated with cadherin, or ephrin (EPH), or any of the other proteins recited in Table 1, column 10. Similarly, SNPs can be grouped according to the functions played by their gene products. Such functions include structural proteins, proteases, and proteins associated with metabolic pathways such as fatty acid metabolism, glycolysis, intermediary metabolism, calcium metabolism, and amino acid metabolism.




The SNPs are shown in Table 1. Table 1 provides a summary of the polymorphic sequences disclosed herein. In the Table, a “SNP” is a polymorphic site embedded in a polymorphic sequence. The polymorphic site is occupied by a single nucleotide, which is the position of nucleotide variation between the wild type and polymorphic allelic sequences. The site is usually preceded by and followed by relatively highly conserved sequences of the allele (e.g., sequences that vary in fewer than 1/100 or 1/1000 members of the populations). Thus, a polymorphic sequence can include one or more of the following sequences: (1) a sequence having the nucleotide denoted in Table 1, column 5 at the polymorphic site in the polymorphic sequence; and (2) a sequence having a nucleotide other than the nucleotide denoted in Table 1, column 5 at the polymorphic site in the polymorphic sequence. An example of the latter sequence is a polymorphic sequence having the nucleotide denoted in Table 1, column 6 at the polymorphic site in the polymorphic sequence.




Nucleotide sequences for a reference-polymorphic pair are presented in Table 1. The choice of designating one sequence of the cognate pair as a “reference” sequence and the second cognate of the pair as a “polymorphic” sequence is arbitrary. Each cSNP entry provides information concerning both the reference nucleotide sequence and the cognate polymorphic sequence occurring at a given polymorphic site. Each row of the Table provides this information for a given reference-polymorphism cognate pair. A reference to the sequence identifier numbers providing the sequences of both alleles is also provided. In addition, references to the SEQ ID NOS: giving the translated amino acid sequences are given if appropriate.




Table 1 includes thirteen columns that provide descriptive information for each cSNP, each of which occupies one row in the Table. The column headings, and an explanation for each, are given below.




“SEQ ID” provides the cross-references to the two nucleotide SEQ ID NOS: for the cognate pair, which are numbered consecutively, and, as explained below, amino acid SEQ ID NOS: as well, in the Sequence Listing of the application. Conversely, each sequence entry in the Sequence Listing also includes a cross-reference to the CuraGen sequence ID, under the label “Accession number”. The pair of SEQ ID NOS: given in the first column of each row of the Table identifies the nucleic acid sequences for the polymorphism. If a polymorphism carries an entry for the amino acid portion of the row, a third SEQ ID NO: appears in parentheses in the column “Amino acid before” (see below) for the reference amino acid sequence, and a fourth SEQ ID NO: appears in parentheses in the column “Amino acid after” (see below) for the polymorphic amino acid sequence. The latter SEQ ID NOS: refer to amino acid sequences giving the cognate reference and polymorphic amino acid sequences that are the translation of the nucleotide polymorphism. If a polymorphism carries no entry for the protein portion of the row, only one pair of SEQ ID NOS: is provided, in the first column.




“CuraGen sequence ID” provides CuraGen Corporation's accession number.




“Base pos. of SNP” gives the numerical position of the nucleotide in the nucleic acid at which the cSNP is found, as identified in this invention.




“Polymorphic sequence” provides a 51-base sequence with the polymorphic site at the 26th base in the sequence, as well as 25 bases from the reference sequence on the 5′ side and the 3′ side of the polymorphic site. The designation at the polymorphic site is enclosed in square brackets, and provides first, the reference nucleotide; second, a “slash (/)”; and third, the polymorphic nucleotide. In certain cases the polymorphism is an insertion or a deletion. In that case, the position that is “unfilled” (i.e., the reference or the polymorphic position) is indicated by the word “gap”.
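The bracketed designation described above lends itself to mechanical parsing. The following Python sketch is offered by way of example only: it assumes an entry is written as flanking bases, then `[reference/polymorphic]`, then flanking bases, with the literal word `gap` marking an unfilled position. Because Table 1 is not reproduced here, the exact spelling and spacing of entries are assumptions, as are the names `PolymorphicSequence`, `parse_polymorphic_sequence`, and `alleles`.

```python
import re
from typing import NamedTuple, Tuple


class PolymorphicSequence(NamedTuple):
    upstream: str    # bases 5' of the polymorphic site
    reference: str   # reference nucleotide, "" when the entry says "gap"
    variant: str     # polymorphic nucleotide, "" when the entry says "gap"
    downstream: str  # bases 3' of the polymorphic site


# Assumed layout: flank, "[", base-or-gap, "/", base-or-gap, "]", flank.
_ENTRY = re.compile(
    r"^([ACGTacgt]*)\[([ACGTacgt]|gap)/([ACGTacgt]|gap)\]([ACGTacgt]*)$"
)


def parse_polymorphic_sequence(text: str) -> PolymorphicSequence:
    """Split a bracketed entry into flanks, reference base, and variant base."""
    match = _ENTRY.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized polymorphic sequence: {text!r}")
    up, ref, alt, down = match.groups()
    to_base = lambda token: "" if token == "gap" else token.upper()
    return PolymorphicSequence(up.upper(), to_base(ref), to_base(alt), down.upper())


def alleles(entry: PolymorphicSequence) -> Tuple[str, str]:
    """Return the (reference allele, polymorphic allele) as plain sequences."""
    return (
        entry.upstream + entry.reference + entry.downstream,
        entry.upstream + entry.variant + entry.downstream,
    )
```

For a substitution entry with 25 flanking bases on each side, both alleles returned by `alleles` are 51 bases long and differ only at the 26th base (index 25). For an insertion or deletion, one allele is a single base shorter than the other.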




“Base before” provides the nucleotide present in the reference sequence at the position at which the polymorphism is found.




“Base after” provides the altered nucleotide at the position of the polymorphism.




“Amino acid before” provides the amino acid in the reference protein, if the polymorphism occurs in a coding region. This column also includes the SEQ ID NO: in parentheses for the translated reference amino acid sequence if the polymorphism occurs in a coding region.




“Amino acid after” provides the amino acid in the polymorphic protein, if the polymorphism occurs in a coding region. This column also includes the SEQ ID NO: in parentheses for the translated polymorphic amino acid sequence if the polymorphism occurs in a coding region.




“Type of change” provides information on the nature of the polymorphism. “SILENT-NONCODING” is used if the polymorphism occurs in a noncoding region of a nucleic acid. “SILENT-CODING” is used if the polymorphism occurs in a coding region of a nucleic acid and results in no change of amino acid in the translated polymorphic protein. “CONSERVATIVE” is used if the polymorphism occurs in a coding region of a nucleic acid and provides a change in which the altered amino acid falls in the same class as the reference amino acid. The classes are: 1) Aliphatic: Gly, Ala, Val, Leu, Ile; 2) Aromatic: Phe, Tyr, Trp; 3) Sulfur-containing: Cys, Met; 4) Aliphatic OH: Ser, Thr; 5) Basic: Lys, Arg, His; 6) Acidic: Asp, Glu, Asn, Gln; 7) Pro falls in none of the other classes; and 8) End defines a termination codon.




“NONCONSERVATIVE” is used if the polymorphism occurs in a coding region of a nucleic acid and provides a change in which the altered amino acid falls in a different class than the reference amino acid.




“FRAMESHIFT” relates to an insertion or a deletion. If the frameshift occurs in a coding region, the Table provides the translation of the frameshifted codons 3′ to the polymorphic site.
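The labeling scheme for substitutions can be sketched as a simple lookup, shown here in Python for illustration only. The class membership is taken from the list above; the function name `type_of_change`, the `coding` flag, and the use of three-letter codes as keys are assumptions of this example. Frameshifts are not modeled, and treating any change to or from a termination codon (`End`) as nonconservative is a consequence of `End` being its own class in this sketch rather than a rule stated in the text.

```python
from typing import Optional

# Amino acid classes as enumerated in the description of "Type of change".
AMINO_ACID_CLASSES = {
    "Aliphatic": {"Gly", "Ala", "Val", "Leu", "Ile"},
    "Aromatic": {"Phe", "Tyr", "Trp"},
    "Sulfur-containing": {"Cys", "Met"},
    "Aliphatic OH": {"Ser", "Thr"},
    "Basic": {"Lys", "Arg", "His"},
    "Acidic": {"Asp", "Glu", "Asn", "Gln"},
    "Pro": {"Pro"},   # Pro falls in none of the other classes
    "End": {"End"},   # termination codon
}

# Reverse lookup: three-letter code -> class name.
CLASS_OF = {
    residue: name
    for name, members in AMINO_ACID_CLASSES.items()
    for residue in members
}


def type_of_change(before: Optional[str], after: Optional[str],
                   coding: bool = True) -> str:
    """Label a substitution using the categories described above."""
    if not coding:
        return "SILENT-NONCODING"
    if before not in CLASS_OF or after not in CLASS_OF:
        raise ValueError(f"unrecognized residue(s): {before!r}, {after!r}")
    if before == after:
        return "SILENT-CODING"
    if CLASS_OF[before] == CLASS_OF[after]:
        return "CONSERVATIVE"
    return "NONCONSERVATIVE"
```

Under these classes a Leu-to-Ile change is labeled CONSERVATIVE, while a Lys-to-Glu change is labeled NONCONSERVATIVE. Because Asn and Gln are grouped with Asp and Glu in the list above, an Asp-to-Asn change is also labeled CONSERVATIVE.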




“Protein classification of CuraGen gene” provides a generic class into which the protein is classified. Multiple classes of proteins were identified. The classes include the following:




Examples of possible disease correlations between the claimed SNPs and members of the genes of each classification are listed below for representative protein families.




Amylases




Amylase is responsible for endohydrolysis of 1,4-alpha-glucosidic linkages in oligosaccharides and polysaccharides. Variations in the amylase gene may be indicative of delayed maturation and of various amylase-producing neoplasms and carcinomas.




Amyloid




The serum amyloid A (SAA) proteins comprise a family of vertebrate proteins that associate predominantly with high-density lipoproteins (HDL). The synthesis of certain members of the family is greatly increased in inflammation. Prolonged elevation of plasma SAA levels, as in chronic inflammation, results in a pathological condition, called amyloidosis, which affects the liver, kidney and spleen and which is characterized by the highly insoluble accumulation of SAA in these tissues. Amyloid selectively inhibits insulin-stimulated glucose utilization and glycogen deposition in muscle, while not affecting adipocyte glucose metabolism. Deposition of fibrillar amyloid proteins intraneuronally, as neurofibrillary tangles, extracellularly, as plaques and in blood vessels, is characteristic of both Alzheimer's disease and aged Down's syndrome. Amyloid deposition is also associated with type II diabetes mellitus.




Angiopoietin




Members of the angiopoietin/fibrinogen family have been shown to stimulate the generation of new blood vessels, inhibit the generation of new blood vessels, and perform several roles in blood clotting. This generation of new blood vessels, called angiogenesis, is also an essential step in tumor growth in order for the tumor to get the blood supply that it needs to expand. Variation in these genes may be predictive of any form of heart disease, numerous blood clotting disorders, stroke, hypertension and predisposition to tumor formation and metastasis. In particular, these variants may be predictive of the response to various antihypertensive drugs and chemotherapeutic and anti-tumor agents.




Apoptosis-Related Proteins




Active cell suicide (apoptosis) is induced by events such as growth factor withdrawal and toxins. It is controlled by regulators, which have either an inhibitory effect on programmed cell death (anti-apoptotic) or block the protective effect of inhibitors (pro-apoptotic). Many viruses have found a way of countering defensive apoptosis by encoding their own anti-apoptosis genes, preventing their target cells from dying too soon. Variants of apoptosis-related genes may be useful in the formulation of anti-aging drugs.




Cadherin, Cyclin, Polymerase, Oncogenes, Histones, Kinases




Members of the cell division/cell cycle pathways such as cyclins, many transcription factors and kinases, DNA polymerases, histones, helicases and other oncogenes play a critical role in carcinogenesis where the uncontrolled proliferation of cells leads to tumor formation and eventually metastasis. Variation in these genes may be predictive of predisposition to any form of cancer, from increased risk of tumor formation to increased rate of metastasis. In particular, these variants may be predictive of the response to various chemotherapeutic and anti-tumor agents.




Colony-Stimulating Factor-Related Proteins




Granulocyte/macrophage colony-stimulating factors are cytokines that act in hematopoiesis by controlling the production, differentiation, and function of 2 related white cell populations of the blood, the granulocytes and the monocytes-macrophages.




Complement-Related Proteins




Complement proteins are immune associated cytotoxic agents, acting in a chain reaction to exterminate target cells that were opsonized (primed) with antibodies, by forming a membrane attack complex (MAC). The mechanism of killing is by opening pores in the target cell membrane. Variations in complement genes or their inhibitors are associated with many autoimmune disorders. Modified serum levels of complement products cause edemas of various tissues, lupus (SLE), vasculitis, glomerulonephritis, renal failure, hemolytic anemia, thrombocytopenia, and arthritis. They interfere with mechanisms of ADCC (antibody dependent cell cytotoxicity), severely impair immune competence and reduce phagocytic ability. Variants of complement genes may also be indicative of type I diabetes mellitus, meningitis, neurological disorders such as Nemaline myopathy and Neonatal hypotonia, muscular disorders such as congenital myopathy, and other diseases.




Cytochrome




The respiratory chain is a key biochemical pathway which is essential to all aerobic cells. There are five different cytochromes involved in the chain. These are heme bound proteins which serve as electron carriers. Modifications in these genes may be predictive of ataxia areflexia, dementia and myopathic and neuropathic changes in muscles. They may also be associated with various types of solid tumors.




Kinesins




Kinesins are tubulin molecular motors that function to transport organelles within cells and to move chromosomes along microtubules during cell division. Modifications of these genes may be indicative of neurological disorders such as Pick disease of the brain and tuberous sclerosis.




Cytokines, Interferon, Interleukin




Members of the cytokine families are known for their potent ability to stimulate cell growth and division even at low concentrations. Cytokines such as erythropoietin are cell-specific in their growth stimulation; erythropoietin is useful for the stimulation of the proliferation of erythroblasts. Variants in cytokines may be predictive for a wide variety of diseases, including cancer predisposition.




G-protein Coupled Receptors




G-protein coupled receptors (also called R7G) are an extensive group of receptors for hormones, neurotransmitters, odorants and light, which transduce extracellular signals by interaction with guanine nucleotide-binding (G) proteins. Alterations in genes coding for G-protein coupled receptors may be involved in and indicative of a vast number of physiological conditions. These include blood pressure regulation, renal dysfunctions, male infertility, dopamine associated cognitive, emotional, and endocrine functions, hypercalcemia, chondrodysplasia and osteoporosis, pseudohypoparathyroidism, growth retardation and dwarfism.




Thioesterases




Eukaryotic thiol proteases are a family of proteolytic enzymes which contain an active site cysteine. Catalysis proceeds through a thioester intermediate and is facilitated by a nearby histidine side chain; an asparagine completes the essential catalytic triad. Variants of thioester associated genes may be predictive of neuronal disorders and mental illnesses such as Ceroid Lipofuscinosis, Neuronal 1, Infantile, Santavuori disease and more.




“Name of protein identified following a BLASTX analysis of the CuraGen sequence” provides the database reference for the protein found to resemble the novel reference-polymorphism cognate pair most closely.




“Similarity (pvalue) following a BLASTX analysis” provides the pvalue, a statistical measure from the BLASTX analysis that the polymorphic sequence is similar to, and therefore an allele of, the reference, or wild-type, sequence. In the present application, a cutoff of pvalue >1×10⁻⁵⁰ (entered, for example, as 1.0E-50 in the Table) is used to establish that the reference-polymorphic cognate pairs are novel. A pvalue <1×10⁻⁵⁰ defines proteins considered to be already known.
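The cutoff described above can be sketched in a few lines of Python. This is only an illustration of the stated rule; the function name, the example pair names and the example pvalues are hypothetical and are not taken from Table 1. The text does not say how a pvalue exactly equal to the cutoff is treated, so the sketch treats it as not novel.

```python
# Cutoff stated in the text: 1×10^-50, entered in the Table as 1.0E-50.
PVALUE_CUTOFF = float("1.0E-50")


def is_novel(pvalue: float) -> bool:
    """Return True when the BLASTX pvalue is above the cutoff (pair treated as novel).

    A pvalue exactly equal to the cutoff is not addressed in the text;
    this sketch classifies it as not novel (an assumption).
    """
    return pvalue > PVALUE_CUTOFF


# Hypothetical pvalues written the way the Table enters them.
example_pvalues = {"pair_a": float("1.0E-12"), "pair_b": float("3.2E-87")}

labels = {
    name: ("novel" if is_novel(p) else "already known")
    for name, p in example_pvalues.items()
}
```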




“Map location” provides any information available at the time of filing related to localization of a gene on a chromosome.




The polymorphisms are arranged in Table 1 in the following order:




SEQ ID NOS: 1 to 1112, in consecutive pairs, are SNPs that are silent;




SEQ ID NOS: 1113-1128, in consecutive pairs, are SNPs that lead to conservative amino acid changes;




SEQ ID NOS: 1129-1186, in consecutive pairs, are SNPs that lead to nonconservative amino acid changes; and




SEQ ID NOS: 1187-1192, in consecutive pairs, are SNPs that involve a gap.




With respect to the reference or wild-type sequence at the position of the polymorphism, the allelic cSNP introduces an additional nucleotide (an insertion) or deletes a nucleotide (a deletion). A SNP that involves a gap generates a frame shift.
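As an illustration of the ordering just described, the following Python sketch looks up the class of a nucleotide SEQ ID NO and the consecutive pair it belongs to. The ranges are copied from the text; the function names are hypothetical, and the pairing rule (odd number first, even number second) is an assumption drawn from the phrase "in consecutive pairs" and from each range starting on an odd number. The sketch does not say which member of a pair is the reference sequence, because the text here does not.

```python
# Ranges of nucleotide SEQ ID NOS as listed for Table 1.
SNP_CLASSES = [
    (1, 1112, "silent"),
    (1113, 1128, "conservative amino acid change"),
    (1129, 1186, "nonconservative amino acid change"),
    (1187, 1192, "gap (insertion or deletion)"),
]


def snp_class(seq_id: int) -> str:
    """Return the SNP class for a nucleotide SEQ ID NO."""
    for first, last, label in SNP_CLASSES:
        if first <= seq_id <= last:
            return label
    raise ValueError(f"SEQ ID NO {seq_id} is outside the nucleotide ranges 1-1192")


def cognate_pair(seq_id: int) -> tuple[int, int]:
    """Return the consecutive (odd, even) pair containing seq_id (assumed pairing)."""
    if not 1 <= seq_id <= 1192:
        raise ValueError(f"SEQ ID NO {seq_id} is outside the nucleotide ranges 1-1192")
    first = seq_id if seq_id % 2 == 1 else seq_id - 1
    return (first, first + 1)
```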




Also presented in the sequence listing filed herewith are predicted amino acid sequences encoded by the polymorphic sequences shown in Table 1. SEQ ID NOS: 1193-1208, in consecutive pairs, are the amino acid sequences centered at the polymorphic amino acid residue for the protein products provided by SNPs that lead to conservative amino acid changes between the reference and the polymorphic sequences. 7 or 8 amino acids on either side of the polymorphic site are shown. The order in which these sequences appear mirrors the order of presentation of the cognate nucleotide sequences, and is set forth in Table 1.




SEQ ID NOS: 1209-1266, in consecutive pairs, are the amino acid sequences centered at the polymorphic amino acid residue for the protein products provided by SNPs that lead to nonconservative amino acid changes between the reference and the polymorphic sequences. 7 or 8 amino acids on either side of the polymorphic site are shown. The order in which these sequences appear mirrors the order of presentation of the cognate nucleotide sequences, and is set forth in the Table.




SEQ ID NOS: 1267-1272, in consecutive pairs, are the amino acid sequences centered at the polymorphic amino acid residue for the protein products provided by SNPs that lead to frameshift-induced amino acid changes between the reference and the polymorphic sequences. 7 or 8 amino acids on either side of the polymorphic site are shown. The order in which these sequences appear mirrors the order of presentation of the cognate nucleotide sequences, and is set forth in Table 1.
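The amino acid ranges above are the same size as the corresponding nucleotide ranges (16, 58 and 6 sequences), and the text states that their order mirrors that of the cognate nucleotide sequences. Under the assumption that the correspondence is strictly one-to-one, each amino acid SEQ ID NO would be the nucleotide SEQ ID NO plus 80. The Python sketch below encodes that inference; the constant and function name are hypothetical, and Table 1 remains the authoritative source for the actual correspondence.

```python
# Assumed constant offset, inferred from 1193 - 1113 = 1209 - 1129 = 1267 - 1187 = 80.
AA_OFFSET = 80


def amino_acid_seq_id(nt_seq_id: int):
    """Map a nucleotide SEQ ID NO to its presumed amino acid SEQ ID NO.

    Returns None for SEQ ID NOS 1-1112, the silent SNPs, for which no
    amino acid sequence range is listed in this section.
    """
    if 1 <= nt_seq_id <= 1112:
        return None
    if 1113 <= nt_seq_id <= 1192:
        return nt_seq_id + AA_OFFSET
    raise ValueError(f"SEQ ID NO {nt_seq_id} is not a nucleotide SEQ ID NO")
```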




Provided herein are compositions which include, or are capable of detecting, nucleic acid sequences having these polymorphisms, as well as methods of using nucleic acids.




Identification of Individuals Carrying SNPs




Individuals carrying polymorphic alleles of the invention may be detected at either the DNA, the RNA, or the protein level using a variety of techniques that are well known in the art. Strategies for identification and detection are described in e.g., EP 730,663, EP 717,113, and PCT US97/02102. The present methods usually employ pre-characterized polymorphisms. That is, the genotyping location and nature of polymorphic forms present at a site have already been determined. The availability of this information allows sets of probes to be designed for specific identification of the known polymorphic forms.




Many of the methods described below require amplification of DNA from target samples. This can be accomplished by, e.g., PCR. See generally PCR Technology: Principles and Applications for DNA Amplification (ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (eds. Innis, et al., Academic Press, San Diego, Calif., 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Pat. No. 4,683,202.




The phrase “recombinant protein” or “recombinantly produced protein” refers to a peptide or protein produced using non-native cells that do not have an endogenous copy of DNA able to express the protein. In particular, as used herein, a recombinantly produced protein relates to the gene product of a polymorphic allele, i.e., a “polymorphic protein” containing an altered amino acid at the site of translation of the nucleotide polymorphism. The cells produce the protein because they have been genetically altered by the introduction of the appropriate nucleic acid sequence. The recombinant protein will not be found in association with proteins and other subcellular components normally associated with the cells producing the protein. The terms “protein” and “polypeptide” are used interchangeably herein.




The phrase “substantially purified” or “isolated” when referring to a nucleic acid, peptide or protein, means that the chemical composition is in a milieu containing fewer, or preferably, essentially none, of other cellular components with which it is naturally associated. Thus, the phrase “isolated” or “substantially pure” refers to nucleic acid preparations that lack at least one protein or nucleic acid normally associated with the nucleic acid in a host cell. It is preferably in a homogeneous state although it can be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as gel electrophoresis or high performance liquid chromatography. Generally, a substantially purified or isolated nucleic acid or protein will comprise more than 80% of all macromolecular species present in the preparation. Preferably, the nucleic acid or protein is purified to represent greater than 90% of all macromolecular species present. More preferably the nucleic acid or protein is purified to greater than 95%, and most preferably the nucleic acid or protein is purified to essential homogeneity, wherein other macromolecular species are not detected by conventional analytical procedures.




The genomic DNA used for the diagnosis may be obtained from any nucleated cells of the body, such as those present in peripheral blood, urine, saliva, buccal samples, surgical specimens, and autopsy specimens. The DNA may be used directly or may be amplified enzymatically in vitro through use of PCR (Saiki et al. Science 239:487-491 (1988)) or other in vitro amplification methods such as the ligase chain reaction (LCR) (Wu and Wallace Genomics 4:560-569 (1989)), strand displacement amplification (SDA) (Walker et al. Proc. Natl. Acad. Sci. U.S.A. 89:392-396 (1992)), or self-sustained sequence replication (3SR) (Fahy et al. PCR Methods Appl. 1:25-33 (1992)), prior to mutation analysis.




Methods for preparing nucleic acids in a form that is suitable for mutation detection are well known in the art. A “nucleic acid” is a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, including known analogs of natural nucleotides unless otherwise indicated. The term “nucleic acids”, as used herein, refers to either DNA or RNA. “Nucleic acid sequence” or “polynucleotide sequence” refers to a single-stranded sequence of deoxyribonucleotide or ribonucleotide bases read from the 5′ end to the 3′ end. The direction of 5′ to 3′ addition of nascent RNA transcripts is referred to as the transcription direction; sequence regions on the DNA strand having the same sequence as the RNA and which are beyond the 5′ end of the RNA transcript in the 5′ direction are referred to as “upstream sequences”; sequence regions on the DNA strand having the same sequence as the RNA and which are beyond the 3′ end of the RNA transcript in the 3′ direction are referred to as “downstream sequences”. The term includes self-replicating plasmids, infectious polymers of DNA or RNA, and nonfunctional DNA or RNA. The complement of any nucleic acid sequence of the invention is understood to be included in the definition of that sequence. “Nucleic acid probes” may be DNA or RNA fragments.




The detection of polymorphisms in specific DNA sequences can be accomplished by a variety of methods including, but not limited to, restriction-fragment-length-polymorphism detection based on allele-specific restriction-endonuclease cleavage (Kan and Dozy Lancet ii:910-912 (1978)), hybridization with allele-specific oligonucleotide probes (Wallace et al. Nucl. Acids Res. 6:3543-3557 (1978)), including immobilized oligonucleotides (Saiki et al. Proc. Natl. Acad. Sci. U.S.A. 86:6230-6234 (1989)) or oligonucleotide arrays (Maskos and Southern Nucl. Acids Res. 21:2269-2270 (1993)), allele-specific PCR (Newton et al. Nucl. Acids Res. 17:2503-2516 (1989)), mismatch-repair detection (MRD) (Faham and Cox Genome Res. 5:474-482 (1995)), binding of MutS protein (Wagner et al. Nucl. Acids Res. 23:3944-3948 (1995)), denaturing-gradient gel electrophoresis (DGGE) (Fisher and Lerman Proc. Natl. Acad. Sci. U.S.A. 80:1579-1583 (1983)), single-strand-conformation-polymorphism detection (Orita et al. Genomics 5:874-879 (1989)), RNAase cleavage at mismatched base-pairs (Myers et al. Science 230:1242 (1985)), chemical (Cotton et al. Proc. Natl. Acad. Sci. U.S.A. 85:4397-4401 (1988)) or enzymatic (Youil et al. Proc. Natl. Acad. Sci. U.S.A. 92:87-91 (1995)) cleavage of heteroduplex DNA, methods based on allele-specific primer extension (Syvanen et al. Genomics 8:684-692 (1990)), genetic bit analysis (GBA) (Nikiforov et al. Nucl. Acids Res. 22:4167-4175 (1994)), the oligonucleotide-ligation assay (OLA) (Landegren et al. Science 241:1077 (1988)), the allele-specific ligation chain reaction (LCR) (Barany Proc. Natl. Acad. Sci. U.S.A. 88:189-193 (1991)), gap-LCR (Abravaya et al. Nucl. Acids Res. 23:675-682 (1995)), radioactive and/or fluorescent DNA sequencing using standard procedures well known in the art, and peptide nucleic acid (PNA) assays (Orum et al., Nucl. Acids Res. 21:5332-5336 (1993); Thiede et al., Nucl. Acids Res. 24:983-984 (1996)).




“Specific hybridization” or “selective hybridization” refers to the binding, or duplexing, of a nucleic acid molecule only to a second particular nucleotide sequence to which the nucleic acid is complementary, under suitably stringent conditions when that sequence is present in a complex mixture (e.g., total cellular DNA or RNA). “Stringent conditions” are conditions under which a probe will hybridize to its target subsequence, but to no other sequences. Stringent conditions are sequence-dependent and are different in different circumstances. Longer sequences hybridize specifically at higher temperatures than shorter ones. Generally, stringent conditions are selected such that the temperature is about 5° C. lower than the thermal melting point (Tm) for the specific sequence to which hybridization is intended to occur at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the target sequence hybridizes to the complementary probe at equilibrium. Typically, stringent conditions include a salt concentration of at least about 0.01 to about 1.0 M Na ion concentration (or other salts), at pH 7.0 to 8.3. The temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides). Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. For example, conditions of 5× SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30° C. are suitable for allele-specific probe hybridizations.
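The relation between Tm and the hybridization temperature can be illustrated with a rough calculation. The Python sketch below estimates Tm with the Wallace rule of thumb for short oligonucleotides, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C), and then subtracts the "about 5° C." stated above. The Wallace rule is not part of this specification and is used here only as an assumed, approximate stand-in for an empirically determined Tm; the function names and the example probe are hypothetical.

```python
def wallace_tm(probe: str) -> int:
    """Estimate Tm in degrees C with the Wallace rule (an assumption, not from the text)."""
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    if at + gc != len(probe):
        raise ValueError("probe must contain only A, C, G and T")
    return 2 * at + 4 * gc


def stringent_temperature(probe: str, offset: int = 5) -> int:
    """Temperature about 5 degrees C below the estimated Tm, as described in the text."""
    return wallace_tm(probe) - offset


# Hypothetical 15-mer probe: 7 A/T and 8 G/C bases.
example_probe = "ACGTACGTACGTACG"
example_tm = wallace_tm(example_probe)
example_hyb = stringent_temperature(example_probe)
```

Actual Tm depends on ionic strength, pH and nucleic acid concentration, as noted above, so an estimate of this kind is only a starting point for empirical optimization.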




“Complementary” or “target” nucleic acid sequences refer to those nucleic acid sequences which selectively hybridize to a nucleic acid probe. Proper annealing conditions depend, for example, upon a probe's length, base composition, and the number of mismatches and their position on the probe, and must often be determined empirically. For discussions of nucleic acid probe design and annealing conditions, see, for example, Sambrook et al., or Current Protocols in Molecular Biology, F. Ausubel et al., ed., Greene Publishing and Wiley-Interscience, New York (1987).




A perfectly matched probe has a sequence perfectly complementary to a particular target sequence. The test probe is typically perfectly complementary to a portion of the target sequence. A “polymorphic” marker or site is the locus at which a sequence difference occurs with respect to a reference sequence. Polymorphic markers include restriction fragment length polymorphisms, variable number of tandem repeats (VNTR's), hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu. The reference allelic form may be, for example, the most abundant form in a population, or the first allelic form to be identified, and other allelic forms are designated as alternative, variant or polymorphic alleles. The allelic form occurring most frequently in a selected population is sometimes referred to as the “wild type” form, and herein may also be referred to as the “reference” form. Diploid organisms may be homozygous or heterozygous for allelic forms. A diallelic polymorphism has two distinguishable forms (i.e., base sequences), and a triallelic polymorphism has three such forms.




As used herein, an “oligonucleotide” is a single-stranded nucleic acid ranging in length from 2 to about 60 bases. Oligonucleotides are often synthetic but can also be produced from naturally occurring polynucleotides. A probe is an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing via hydrogen bond formation. Oligonucleotide probes are often between 5 and 60 bases, and, in specific embodiments, may be between 10-40, or 15-30 bases long. An oligonucleotide probe may include natural (i.e. A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in an oligonucleotide probe may be joined by a linkage other than a phosphodiester bond, such as a phosphoramidate linkage or a phosphorothioate linkage, or they may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than by phosphodiester bonds, so long as the linkage does not interfere with hybridization.




As used herein, the term “primer” refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis under appropriate conditions (e.g., in the presence of four different nucleoside triphosphates and a polymerization agent, such as DNA polymerase, RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature. The appropriate length of a primer depends on the intended use of the primer, but typically ranges from 15 to 30 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not be perfectly complementary to the exact sequence of the template, but should be sufficiently complementary to hybridize with it. The term “primer site” refers to the sequence of the target DNA to which a primer hybridizes. The term “primer pair” refers to a set of primers including a 5′ (upstream) primer that hybridizes with the 5′ end of the DNA sequence to be amplified and a 3′ (downstream) primer that hybridizes with the complement of the 3′ end of the sequence to be amplified.




DNA fragments can be prepared, for example, by digesting plasmid DNA, or by use of PCR. Oligonucleotides for use as primers or probes are chemically synthesized by methods known in the field of the chemical synthesis of polynucleotides, including by way of non-limiting example the phosphoramidite method described by Beaucage and Carruthers, Tetrahedron Lett. 22:1859-1862 (1981) and the triester method provided by Matteucci, et al., J. Am. Chem. Soc., 103:3185 (1981), both incorporated herein by reference. These syntheses may employ an automated synthesizer, as described in Needham-VanDevanter, D. R., et al., Nucleic Acids Res. 12:6159-6168 (1984). Purification of oligonucleotides may be carried out by either native acrylamide gel electrophoresis or by anion-exchange HPLC as described in Pearson, J. D. and Regnier, F. E., J. Chrom., 255:137-149 (1983). A double stranded fragment may then be obtained, if desired, by annealing appropriate complementary single strands together under suitable conditions or by synthesizing the complementary strand using a DNA polymerase with an appropriate primer sequence. Where a specific sequence for a nucleic acid probe is given, it is understood that the complementary strand is also identified and included. The complementary strand will work equally well in situations where the target is a double-stranded nucleic acid.




The sequence of the synthetic oligonucleotide or of any nucleic acid fragment can be obtained using either the dideoxy chain termination method or the Maxam-Gilbert method (see Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd Ed.), Vols. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989), which is incorporated herein by reference and is hereinafter referred to as “Sambrook et al.”; and Zyskind et al., Recombinant DNA Laboratory Manual, Acad. Press, New York (1988)). Oligonucleotides useful in diagnostic assays are typically at least 8 consecutive nucleotides in length, and may range upwards of 18 nucleotides in length to greater than 100 or more consecutive nucleotides.




Another aspect of the invention pertains to isolated antisense nucleic acid molecules that are hybridizable to or complementary to the nucleic acid molecule comprising the SNP-containing nucleotide sequences of the invention, or fragments, analogs or derivatives thereof. An “antisense” nucleic acid comprises a nucleotide sequence that is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic acid molecules are provided that comprise a sequence complementary to at least about 10, about 25, about 50, or about 60 nucleotides or an entire SNP coding strand, or to only a portion thereof.




In one embodiment, an antisense nucleic acid molecule is antisense to a “coding region” of the coding strand of a polymorphic nucleotide sequence of the invention. The term “coding region” refers to the region of the nucleotide sequence comprising codons which are translated into amino acids. In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence of the invention. The term “noncoding region” refers to 5′ and 3′ sequences which flank the coding region that are not translated into amino acids (i.e., also referred to as 5′ and 3′ untranslated regions).




Given the coding strand sequences disclosed herein, antisense nucleic acids of the invention can be designed according to the rules of Watson and Crick or Hoogsteen base pairing. For example, the antisense nucleic acid molecule can generally be complementary to the entire coding region of an mRNA, but more preferably as embodied herein, it is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of the mRNA. An antisense oligonucleotide can range in length between about 5 and about 60 nucleotides, preferably between about 10 and about 45 nucleotides, more preferably between about 15 and 40 nucleotides, and still more preferably between about 15 and 30 in length. An antisense nucleic acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
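By way of illustration, designing an antisense oligonucleotide by Watson and Crick base pairing amounts to taking the reverse complement of the chosen portion of the coding strand. The Python sketch below shows this for a DNA oligonucleotide with unmodified bases. The function names and the example coding sequence are hypothetical, the length bounds are taken from the "about 5 to about 60 nucleotides" range stated above, and Hoogsteen pairing and modified nucleotides are not modeled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a DNA sequence (5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_oligo(coding_strand: str, start: int, length: int) -> str:
    """Return a DNA oligonucleotide antisense to coding_strand[start:start+length].

    start is a 0-based index into the coding strand; length is limited to the
    5-60 nucleotide range given in the text.
    """
    if not 5 <= length <= 60:
        raise ValueError("length should be between about 5 and about 60 nucleotides")
    target = coding_strand[start:start + length]
    if start < 0 or len(target) != length:
        raise ValueError("target region falls outside the coding strand")
    return reverse_complement(target)


# Hypothetical coding strand, for illustration only.
example_coding = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
example_antisense = antisense_oligo(example_coding, 0, 15)
```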




Examples of modified nucleotides that can be used to generate the antisense nucleic acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).




The antisense nucleic acid molecules of the invention are typically administered to a subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a polymorphic protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in the major groove of the double helix. An example of a route of administration of antisense nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For example, for systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.




In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15: 6625-6641). The antisense nucleic acid molecule can also comprise a 2′-o-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215: 327-330).




The following terms are used to describe the sequence relationships between two or more nucleic acids or polynucleotides: “reference sequence”, “comparison window”, “sequence identity”, “percentage of sequence identity”, and “substantial identity”. A “reference sequence” is a defined sequence used as a basis for a sequence comparison; a reference sequence may be a subset of a larger sequence, for example, as a segment of a full-length cDNA or gene sequence given in a sequence listing, or may comprise a complete cDNA or gene sequence. Optimal alignment of sequences for aligning a comparison window may, for example, be conducted by the local homology algorithm of Smith and Waterman Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman Proc. Natl. Acad. Sci. U.S.A. 85:2444 (1988), or by computerized implementations of these algorithms (for example, GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison, Wis.).
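To make the local homology approach concrete, the following Python sketch computes the best local alignment score between two sequences by Smith-Waterman style dynamic programming. It is a minimal illustration and not a reimplementation of GAP, BESTFIT, FASTA or TFASTA: the scoring values (match +2, mismatch -1, linear gap penalty -2) are arbitrary assumptions, only the score is returned rather than the alignment itself, and the example sequences are hypothetical.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score of a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical reference sequence and a variant differing at one base.
reference = "ACGTACGT"
variant = "ACGTTCGT"
score_identical = smith_waterman_score(reference, reference)
score_one_mismatch = smith_waterman_score(reference, variant)
```

With these assumed scores, a single-base difference lowers the score of an otherwise identical 8-base alignment from 16 to 13.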




Techniques for nucleic acid manipulation of the nucleic acid sequences harboring the cSNPs of the invention, such as subcloning nucleic acid sequences encoding polypeptides into expression vectors, labeling probes, DNA hybridization, and the like, are described generally in Sambrook et al. The phrase “nucleic acid sequence encoding” refers to a nucleic acid which directs the expression of a specific protein, peptide or amino acid sequence. The nucleic acid sequences include both the DNA strand sequence that is transcribed into RNA and the RNA sequence that is translated into protein, peptide or amino acid sequence. The nucleic acid sequences include both the full length nucleic acid sequences disclosed herein as well as non-full length sequences derived from the full length protein. It is further understood that the sequence includes the degenerate codons of the native sequence or sequences which may be introduced to provide codon preference in a specific host cell. Consequently, the principles of probe selection and array design can readily be extended to analyze more complex polymorphisms (see EP 730,663). For example, to characterize a triallelic SNP polymorphism, three groups of probes can be designed tiled on the three polymorphic forms as described above. As a further example, to analyze a diallelic polymorphism involving a deletion of a nucleotide, one can tile a first group of probes based on the undeleted polymorphic form as the reference sequence and a second group of probes based on the deleted form as the reference sequence.




For assay of genomic DNA, virtually any biological sample can be used. Convenient tissue samples include whole blood, semen, saliva, tears, urine, fecal material, sweat, buccal, skin and hair. Genomic DNA is typically amplified before analysis. Amplification is usually effected by PCR using primers flanking a suitable fragment, e.g., of 50-500 nucleotides, containing the locus of the polymorphism to be analyzed. Target is usually labeled in the course of amplification. The amplification product can be RNA or DNA, single stranded or double stranded. If double stranded, the amplification product is typically denatured before application to an array. If genomic DNA is analyzed without amplification, it may be desirable to remove RNA from the sample before applying it to the array. Such can be accomplished by digestion with DNase-free RNAase.




Detection of Polymorphisms in a Nucleic Acid Sample




The SNPs disclosed herein can be used to determine which forms of a characterized polymorphism are present in individuals under analysis.




The design and use of allele-specific probes for analyzing polymorphisms is described by e.g., Saiki et al., Nature 324, 163-166 (1986); Dattagupta, EP 235,726, Saiki, WO 89/11548. Allele-specific probes can be designed that hybridize to a segment of target DNA from one individual but do not hybridize to the corresponding segment from another individual due to the presence of different polymorphic forms in the respective segments from the two individuals. Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, and preferably an essentially binary response, whereby a probe hybridizes to only one of the alleles. Some probes are designed to hybridize to a segment of target DNA such that the polymorphic site aligns with a central position (e.g., in a 15-mer at the 7 position; in a 16-mer, at either the 7, 8 or 9 position) of the probe. This design of probe achieves good discrimination in hybridization between different allelic forms.
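The central-position design described above can be sketched as follows in Python. The function builds a pair of probe sequences, one matching the reference form and one matching a variant form, with the polymorphic site placed at a chosen position of the probe. The function name, parameter names and example sequence are hypothetical. The sketch assumes that "the 7 position" is counted from 1, and it returns sequences in the same sense as the input strand; a probe complementary to that strand would be the reverse complement.

```python
def allele_specific_probes(reference: str, snp_index: int, variant_base: str,
                           length: int = 15, position: int = 7) -> tuple[str, str]:
    """Return (reference_probe, variant_probe) with the SNP at `position` in the probe.

    snp_index is the 0-based index of the polymorphic base in `reference`;
    position is the 1-based position of that base within the probe (assumed).
    """
    offset = position - 1
    start = snp_index - offset
    end = start + length
    if start < 0 or end > len(reference):
        raise ValueError("not enough flanking sequence around the polymorphic site")
    ref_probe = reference[start:end]
    var_probe = ref_probe[:offset] + variant_base + ref_probe[offset + 1:]
    return ref_probe, var_probe


# Hypothetical reference sequence with a T-to-G polymorphism at index 10.
example_reference = "GATTACAGATTACAGGCCTTAAGGCC"
ref_probe, var_probe = allele_specific_probes(example_reference, 10, "G")
```

The two probes differ only at the polymorphic position, which is what allows a pair of them to report the reference and variant forms separately.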




Allele-specific probes are often used in pairs, one member of a pair showing a perfect match to a reference form of a target sequence and the other member showing a perfect match to a variant form. Several pairs of probes can then be immobilized on the same support for simultaneous analysis of multiple polymorphisms within the same target sequence.




The polymorphisms can also be identified by hybridization to nucleic acid arrays, some examples of which are described in published PCT application WO 95/11995. WO 95/11995 also describes subarrays that are optimized for detection of a variant form of a precharacterized polymorphism. Such a subarray contains probes designed to be complementary to a second reference sequence, which is an allelic variant of the first reference sequence. The second group of probes is designed by the same principles, except that the probes exhibit complementarity to the second reference sequence. The inclusion of a second group (or further groups) can be particularly useful for analyzing short subsequences of the primary reference sequence in which multiple mutations are expected to occur within a short distance commensurate with the length of the probes (e.g., two or more mutations within 9 to 21 bases).




An allele-specific primer hybridizes to a site on target DNA overlapping a polymorphism and only primes amplification of an allelic form to which the primer exhibits perfect complementarity. See Gibbs, Nucleic Acid Res. 17, 2427-2448 (1989). This primer is used in conjunction with a second primer which hybridizes at a distal site. Amplification proceeds from the two primers, resulting in a detectable product which indicates that the particular allelic form is present. A control is usually performed with a second pair of primers, one of which shows a single-base mismatch at the polymorphic site and the other of which exhibits perfect complementarity to a distal site. The single-base mismatch prevents amplification and no detectable product is formed. The method works best when the mismatch is included in the 3′-most position of the oligonucleotide aligned with the polymorphism because this position is most destabilizing to elongation from the primer (see, e.g., WO 93/22456).
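
As a rough illustration of placing the polymorphic base at the 3′-most position, the following Python sketch builds an allele-specific primer and applies a deliberately simplified extension rule. The names `allele_specific_primer` and `is_extendable` are hypothetical, the primer is written on the same strand as the supplied upstream sequence, and the all-or-nothing rule is an assumption made for clarity; real polymerases show some residual extension from mismatched 3′ ends.

```python
def allele_specific_primer(upstream: str, allele: str, length: int = 20) -> str:
    """Return a primer whose 3'-most base is the allele-specific base.

    `upstream` is the sequence immediately 5' of the polymorphic site.
    """
    if len(allele) != 1:
        raise ValueError("allele must be a single base")
    if len(upstream) < length - 1:
        raise ValueError("upstream sequence too short for the requested primer length")
    return upstream[len(upstream) - (length - 1):] + allele


def is_extendable(primer: str, template_allele: str) -> bool:
    """Simplified model: extension occurs only if the 3' base matches the allele."""
    return primer[-1].upper() == template_allele.upper()


upstream = "ACGTACGTACGTACGTACGTACG"   # made-up example sequence
primer_t = allele_specific_primer(upstream, "T")
print(primer_t, is_extendable(primer_t, "T"), is_extendable(primer_t, "C"))
```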




Amplification products generated using the polymerase chain reaction can be analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can be identified based on the different sequence-dependent melting properties and electrophoretic migration of DNA in solution. See Erlich, ed., PCR Technology, Principles and Applications for DNA Amplification (W. H. Freeman and Co., New York, 1992), Chapter 7.




Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single stranded PCR products, as described in Orita et al., Proc. Nat. Acad. Sci. 86, 2766-2770 (1989). Amplified PCR products can be generated and heated or otherwise denatured, to form single stranded amplification products. Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence. The different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence differences between alleles of target sequences.




The genotype of an individual with respect to a pathology suspected of being caused by a genetic polymorphism may be assessed by association analysis. Phenotypic traits suitable for association analysis include diseases that have known but hitherto unmapped genetic components (e.g., agammaglobulinemia, diabetes insipidus, Lesch-Nyhan syndrome, muscular dystrophy, Wiskott-Aldrich syndrome, Fabry's disease, familial hypercholesterolemia, polycystic kidney disease, hereditary spherocytosis, von Willebrand's disease, tuberous sclerosis, hereditary hemorrhagic telangiectasia, familial colonic polyposis, Ehlers-Danlos syndrome, osteogenesis imperfecta, and acute intermittent porphyria).




Phenotypic traits also include symptoms of, or susceptibility to, multifactorial diseases of which a component is or may be genetic, such as autoimmune diseases, inflammation, cancer, diseases of the nervous system, and infection by pathogenic microorganisms. Some examples of autoimmune diseases include rheumatoid arthritis, multiple sclerosis, diabetes (insulin-dependent and non-insulin-dependent), systemic lupus erythematosus and Graves disease. Some examples of cancers include cancers of the bladder, brain, breast, colon, esophagus, kidney, oral cavity, ovary, pancreas, prostate, skin, stomach, leukemia, liver, lung, and uterus. Phenotypic traits also include characteristics such as longevity, appearance (e.g., baldness, obesity), strength, speed, endurance, fertility, and susceptibility or receptivity to particular drugs or therapeutic treatments.




Such correlations can be exploited in several ways. In the case of a strong correlation between a polymorphic form and a disease for which treatment is available, detection of the polymorphic form set in a human or animal patient may justify immediate administration of treatment, or at least the institution of regular monitoring of the patient. Detection of a polymorphic form correlated with serious disease in a couple contemplating a family may also be valuable to the couple in their reproductive decisions. For example, the female partner might elect to undergo in vitro fertilization to avoid the possibility of transmitting such a polymorphism from her husband to her offspring. In the case of a weaker, but still statistically significant correlation between a polymorphic set and human disease, immediate therapeutic intervention or monitoring may not be justified. Nevertheless, the patient can be motivated to begin simple life-style changes (e.g., diet, exercise) that can be accomplished at little cost to the patient but confer potential benefits in reducing the risk of conditions to which the patient may have increased susceptibility by virtue of variant alleles. After determining polymorphic form(s) present in an individual at one or more polymorphic sites, this information can be used in a number of methods.




Determination of which polymorphic forms occupy a set of polymorphic sites in an individual identifies a set of polymorphic forms that distinguishes the individual. See generally National Research Council, The Evaluation of Forensic DNA Evidence (Eds. Pollard et al., National Academy Press, DC, 1996). Since the polymorphic sites are within a 50,000 bp region in the human genome, the probability of recombination between these polymorphic sites is low. That low probability means the haplotype (the set of all 10 polymorphic sites) set forth in this application should be inherited without change for at least several generations. The more sites that are analyzed the lower the probability that the set of polymorphic forms in one individual is the same as that in an unrelated individual. Preferably, if multiple sites are analyzed, the sites are unlinked. Thus, polymorphisms of the invention are often used in conjunction with polymorphisms in distal genes. Preferred polymorphisms for use in forensics are diallelic because the population frequencies of two polymorphic forms can usually be determined with greater accuracy than those of multiple polymorphic forms at multi-allelic loci.




The capacity to identify a distinguishing or unique set of forensic markers in an individual is useful for forensic analysis. For example, one can determine whether a blood sample from a suspect matches a blood or other tissue sample from a crime scene by determining whether the set of polymorphic forms occupying selected polymorphic sites is the same in the suspect and the sample. If the set of polymorphic markers does not match between a suspect and a sample, it can be concluded (barring experimental error) that the suspect was not the source of the sample. If the set of markers does match, one can conclude that the DNA from the suspect is consistent with that found at the crime scene. If frequencies of the polymorphic forms at the loci tested have been determined (e.g., by analysis of a suitable population of individuals), one can perform a statistical analysis to determine the probability that a match of suspect and crime scene sample would occur by chance.
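
The comparison and the chance-match calculation described above can be sketched in Python as follows. This is a minimal illustration, not a validated forensic procedure: the names `profiles_match` and `random_match_probability` are hypothetical, the loci are assumed to be unlinked, genotype frequencies are assumed to follow the x², 2xy, y² pattern used in this description, and the allele frequencies are made-up example values.

```python
def profiles_match(profile_a: dict, profile_b: dict) -> bool:
    """Two profiles match if every shared locus carries the same unordered genotype."""
    shared = set(profile_a) & set(profile_b)
    return bool(shared) and all(
        sorted(profile_a[locus]) == sorted(profile_b[locus]) for locus in shared
    )


def random_match_probability(profile: dict, allele_freqs: dict) -> float:
    """Probability that a random individual carries the same genotype at every locus."""
    probability = 1.0
    for locus, (a1, a2) in profile.items():
        f1, f2 = allele_freqs[locus][a1], allele_freqs[locus][a2]
        # Homozygote frequency is f^2; heterozygote frequency is 2 * f1 * f2.
        probability *= f1 * f2 if a1 == a2 else 2 * f1 * f2
    return probability


suspect = {"snp1": ("A", "G"), "snp2": ("C", "C")}
sample = {"snp1": ("G", "A"), "snp2": ("C", "C")}
freqs = {"snp1": {"A": 0.5, "G": 0.5}, "snp2": {"C": 0.2, "T": 0.8}}

if profiles_match(suspect, sample):
    print("consistent; chance match probability:", random_match_probability(suspect, freqs))
else:
    print("suspect excluded as the source (barring experimental error)")
```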




p(ID) is the probability that two random individuals have the same polymorphic or allelic form at a given polymorphic site. In diallelic loci, four genotypes are possible: AA, AB, BA, and BB. If alleles A and B occur in a haploid genome of the organism with frequencies x and y, the probabilities of each genotype in a diploid organism are (see WO 95/12607):






Homozygote: $p(AA) = x^2$

Homozygote: $p(BB) = y^2 = (1-x)^2$

Single Heterozygote: $p(AB) = p(BA) = xy = x(1-x)$

Both Heterozygotes: $p(AB+BA) = 2xy = 2x(1-x)$






The probability of identity at one locus (i.e., the probability that two individuals, picked at random from a population, will have identical polymorphic forms at a given locus) is given by the equation:








$$p(\mathrm{ID}) = (x^2)^2 + (2xy)^2 + (y^2)^2.$$






These calculations can be extended for any number of polymorphic forms at a given locus. For example, the probability of identity p(ID) for a 3-allele system where the alleles have the frequencies in the population of x, y and z, respectively, is equal to the sum of the squares of the genotype frequencies:








$$p(\mathrm{ID}) = x^4 + (2xy)^2 + (2yz)^2 + (2xz)^2 + z^4 + y^4$$








In a locus of n alleles, the appropriate binomial expansion is used to calculate p(ID) and p(exc).
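
The extension to a locus with any number of alleles can be sketched in Python. The function names `genotype_frequencies` and `p_identity` are hypothetical; the sketch assumes the allele frequencies sum to 1 and that genotype frequencies follow the same pattern as the diallelic case above (the square of the allele frequency for a homozygote, twice the product for a heterozygote).

```python
from itertools import combinations_with_replacement


def genotype_frequencies(allele_freqs):
    """Map each unordered genotype (i, j) to its expected frequency."""
    freqs = {}
    for i, j in combinations_with_replacement(range(len(allele_freqs)), 2):
        product = allele_freqs[i] * allele_freqs[j]
        freqs[(i, j)] = product if i == j else 2 * product
    return freqs


def p_identity(allele_freqs):
    """p(ID): sum of the squares of the genotype frequencies at one locus."""
    return sum(g * g for g in genotype_frequencies(allele_freqs).values())


print(p_identity([0.5, 0.5]))        # diallelic locus with x = y = 0.5
print(p_identity([0.5, 0.3, 0.2]))   # 3-allele locus with made-up frequencies
```

For a diallelic locus with x = y = 0.5, the genotype frequencies are 0.25, 0.5 and 0.25, so p(ID) = 0.0625 + 0.25 + 0.0625 = 0.375.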




The cumulative probability of identity (cum p(ID)) for each of multiple unlinked loci is determined by multiplying the probabilities provided by each locus:








$$\mathrm{cum}\ p(\mathrm{ID}) = p(\mathrm{ID}_1)\, p(\mathrm{ID}_2)\, p(\mathrm{ID}_3) \cdots p(\mathrm{ID}_n)$$






The cumulative probability of non-identity for n loci (i.e. the probability that two random individuals will be different at 1 or more loci) is given by the equation:








$$\mathrm{cum}\ p(\mathrm{nonID}) = 1 - \mathrm{cum}\ p(\mathrm{ID}).$$






If several polymorphic loci are tested, the cumulative probability of non-identity for random individuals becomes very high (e.g., one billion to one). Such probabilities can be taken into account together with other evidence in determining the guilt or innocence of the suspect.
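
A short Python sketch shows how quickly the cumulative probability of identity falls as unlinked loci are added. The function names are hypothetical, and the per-locus value of 0.375 is simply the p(ID) of a diallelic locus with equal allele frequencies, used here as an illustrative assumption.

```python
from math import prod


def cumulative_p_identity(per_locus_p_id):
    """cum p(ID): product of the per-locus probabilities of identity (unlinked loci)."""
    return prod(per_locus_p_id)


def cumulative_p_non_identity(per_locus_p_id):
    """cum p(nonID) = 1 - cum p(ID)."""
    return 1 - cumulative_p_identity(per_locus_p_id)


for n_loci in (1, 10, 20, 30):
    p_ids = [0.375] * n_loci
    print(n_loci, cumulative_p_identity(p_ids), cumulative_p_non_identity(p_ids))
```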




The object of paternity testing is usually to determine whether a male is the father of a child. In most cases, the mother of the child is known and thus, the mother's contribution to the child's genotype can be traced. Paternity testing investigates whether the part of the child's genotype not attributable to the mother is consistent with that of the putative father. Paternity testing can be performed by analyzing sets of polymorphisms in the putative father and the child.




If the set of polymorphisms in the child attributable to the father does not match the putative father, it can be concluded, barring experimental error, that the putative father is not the real father. If the set of polymorphisms in the child attributable to the father does match the set of polymorphisms of the putative father, a statistical calculation can be performed to determine the probability of coincidental match.




The probability of parentage exclusion (representing the probability that a random male will have a polymorphic form at a given polymorphic site that makes him incompatible as the father) is given by the equation (see WO 95/12607):








$$p(\mathrm{exc}) = xy(1 - xy)$$






where x and y are the population frequencies of alleles A and B of a diallelic polymorphic site. (At a triallelic site, p(exc) = xy(1−xy) + yz(1−yz) + xz(1−xz) + 3xyz(1−xyz), where x, y and z are the respective population frequencies of alleles A, B and C.) The probability of non-exclusion is:








$$p(\text{non-exc}) = 1 - p(\mathrm{exc})$$






The cumulative probability of non-exclusion (representing the value obtained when n loci are used) is thus:








$$\mathrm{cum}\ p(\text{non-exc}) = p(\text{non-exc}_1)\, p(\text{non-exc}_2)\, p(\text{non-exc}_3) \cdots p(\text{non-exc}_n)$$






The cumulative probability of exclusion for n loci (representing the probability that a random male will be excluded) is:








$$\mathrm{cum}\ p(\mathrm{exc}) = 1 - \mathrm{cum}\ p(\text{non-exc}).$$






If several polymorphic loci are included in the analysis, the cumulative probability of exclusion of a random male is very high. This probability can be taken into account in assessing the liability of a putative father whose polymorphic marker set matches the child's polymorphic marker set attributable to his/her father.
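
The exclusion formulas above can be combined in a brief Python sketch for diallelic sites. The function names are hypothetical, each locus is described only by the frequency x of allele A (with y = 1 − x), and the loci are assumed to be unlinked.

```python
def p_exclusion(x: float) -> float:
    """Per-locus probability of parentage exclusion at a diallelic site: xy(1 - xy)."""
    y = 1 - x
    return x * y * (1 - x * y)


def cumulative_p_exclusion(allele_a_freqs) -> float:
    """cum p(exc) = 1 - product over loci of (1 - p(exc))."""
    cumulative_non_exclusion = 1.0
    for x in allele_a_freqs:
        cumulative_non_exclusion *= 1 - p_exclusion(x)
    return 1 - cumulative_non_exclusion


print(p_exclusion(0.5))                      # 0.25 * 0.75 = 0.1875
print(cumulative_p_exclusion([0.5] * 20))    # twenty equally informative loci
```

Because xy is largest when x = y = 0.5, a diallelic site contributes at most 0.1875 to exclusion, which is why several loci are needed before the cumulative probability becomes high.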




The polymorphisms of the invention may contribute to the phenotype of an organism in different ways. Some polymorphisms occur within a protein coding sequence and contribute to phenotype by affecting protein structure. The effect may be neutral, beneficial or detrimental, or both beneficial and detrimental, depending on the circumstances. For example, a heterozygous sickle cell mutation confers resistance to malaria, but a homozygous sickle cell mutation is usually lethal. Other polymorphisms occur in noncoding regions but may exert phenotypic effects indirectly via influence on replication, transcription, and translation. A single polymorphism may affect more than one phenotypic trait. Likewise, a single phenotypic trait may be affected by polymorphisms in different genes. Further, some polymorphisms predispose an individual to a distinct mutation that is causally related to a certain phenotype.




Phenotypic traits include diseases that have known but hitherto unmapped genetic components. Phenotypic traits also include symptoms of, or susceptibility to, multifactorial diseases of which a component is or may be genetic, such as autoimmune diseases, inflammation, cancer, diseases of the nervous system, and infection by pathogenic microorganisms. Some examples of autoimmune diseases include rheumatoid arthritis, multiple sclerosis, diabetes (insulin-dependent and non-insulin-dependent), systemic lupus erythematosus and Graves disease. Some examples of cancers include cancers of the bladder, brain, breast, colon, esophagus, kidney, leukemia, liver, lung, oral cavity, ovary, pancreas, prostate, skin, stomach and uterus. Phenotypic traits also include characteristics such as longevity, appearance (e.g., baldness, obesity), strength, speed, endurance, fertility, and susceptibility or receptivity to particular drugs or therapeutic treatments.




Correlation is performed for a population of individuals who have been tested for the presence or absence of a phenotypic trait of interest and for polymorphic marker sets. To perform such analysis, the presence or absence of a set of polymorphisms (i.e., a polymorphic set) is determined for a set of the individuals, some of whom exhibit a particular trait, and some of whom lack the trait. The alleles of each polymorphism of the set are then reviewed to determine whether the presence or absence of a particular allele is associated with the trait of interest. Correlation can be performed by standard statistical methods such as a chi-squared (χ²) test, and statistically significant correlations between polymorphic form(s) and phenotypic characteristics are noted. For example, it might be found that the presence of allele A1 at polymorphism A correlates with heart disease. As a further example, it might be found that the combined presence of allele A1 at polymorphism A and allele B1 at polymorphism B correlates with increased milk production of a farm animal.
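
One way to carry out such a test for a single allele and a single trait is a chi-squared test on a 2×2 table of counts, sketched below in Python using only the standard library. The function name `chi_squared_2x2` and the counts are hypothetical. The p-value uses the identity that, for one degree of freedom, the upper-tail probability equals erfc(√(χ²/2)); no continuity correction is applied.

```python
from math import erfc, sqrt


def chi_squared_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test of association for a 2x2 table.

    a: allele present, trait present    b: allele present, trait absent
    c: allele absent,  trait present    d: allele absent,  trait absent
    Returns (statistic, p_value) with one degree of freedom.
    """
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one observation")
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected
    p_value = erfc(sqrt(statistic / 2))   # upper tail of chi-squared, 1 d.f.
    return statistic, p_value


stat, p = chi_squared_2x2(30, 20, 20, 30)   # made-up counts for allele A1 vs. trait
print(f"chi-squared = {stat:.3f}, p = {p:.4f}")
```

When many polymorphisms are tested against the same trait, some correction for multiple comparisons is normally needed before a correlation is called significant.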




Such correlations can be exploited in several ways. In the case of a strong correlation between a set of one or more polymorphic forms and a disease for which treatment is available, detection of the polymorphic form set in a human or animal patient may justify immediate administration of treatment, or at least the institution of regular monitoring of the patient. Detection of a polymorphic form correlated with serious disease in a couple contemplating a family may also be valuable to the couple in their reproductive decisions. For example, the female partner might elect to undergo in vitro fertilization to avoid the possibility of transmitting such a polymorphism from her husband to her offspring. In the case of a weaker, but still statistically significant correlation between a polymorphic set and human disease, immediate therapeutic intervention or monitoring may not be justified. Nevertheless, the patient can be motivated to begin simple life-style changes (e.g., diet, exercise) that can be accomplished at little cost to the patient but confer potential benefits in reducing the risk of conditions to which the patient may have increased susceptibility by virtue of variant alleles. Identification of a polymorphic set in a patient correlated with enhanced receptiveness to one of several treatment regimes for a disease indicates that this treatment regime should be followed.




For animals and plants, correlations between characteristics and phenotype are useful for breeding for desired characteristics. For example, Beitz et al., U.S. Pat. No. 5,292,639 discuss use of bovine mitochondrial polymorphisms in a breeding program to improve milk production in cows. To evaluate the effect of mtDNA D-loop sequence polymorphism on milk production, each cow was assigned a value of 1 if variant or 0 if wild type with respect to a prototypical mitochondrial DNA sequence at each of 17 locations considered.




The previous section concerns identifying correlations between phenotypic traits and polymorphisms that directly or indirectly contribute to those traits. The present section describes identification of a physical linkage between a genetic locus associated with a trait of interest and polymorphic markers that are not associated with the trait, but are in physical proximity with the genetic locus responsible for the trait and co-segregate with it. Such analysis is useful for mapping a genetic locus associated with a phenotypic trait to a chromosomal position, and thereby cloning gene(s) responsible for the trait. See Lander et al., Proc. Natl. Acad. Sci. (USA) 83, 7353-7357 (1986); Lander et al., Proc. Natl. Acad. Sci. (USA) 84, 2363-2367 (1987); Donis-Keller et al., Cell 51, 319-337 (1987); Lander et al., Genetics 121, 185-199 (1989). Genes localized by linkage can be cloned by a process known as directional cloning. See Wainwright, Med. J. Australia 159, 170-174 (1993); Collins, Nature Genetics 1, 3-6 (1992) (each of which is incorporated by reference in its entirety for all purposes).




Linkage studies are typically performed on members of a family. Available members of the family are characterized for the presence or absence of a phenotypic trait and for a set of polymorphic markers. The distribution of polymorphic markers in an informative meiosis is then analyzed to determine which polymorphic markers co-segregate with a phenotypic trait. See, e.g., Kerem et al., Science 245, 1073-1080 (1989); Monaco et al., Nature 316, 842 (1985); Yamoka et al., Neurology 40, 222-226 (1990); Rossiter et al., FASEB Journal 5, 21-27 (1991).




Linkage is analyzed by calculation of LOD (log of the odds) values. A lod value is the relative likelihood of obtaining observed segregation data for a marker and a genetic locus when the two are located at a recombination fraction θ, versus the situation in which the two are not linked, and thus segregating independently (Thompson & Thompson, Genetics in Medicine (5th ed., W. B. Saunders Company, Philadelphia, 1991); Strachan, “Mapping the human genome” in The Human Genome (BIOS Scientific Publishers Ltd, Oxford), Chapter 4). A series of likelihood ratios is calculated at various recombination fractions (θ), ranging from θ=0.0 (coincident loci) to θ=0.50 (unlinked). Thus, the likelihood ratio at a given value of θ is the probability of the data if the loci are linked at θ divided by the probability of the data if the loci are unlinked. The computed likelihood ratio is usually expressed as its log10 (i.e., a lod score). For example, a lod score of 3 indicates 1000:1 odds against an apparent observed linkage being a coincidence. The use of logarithms allows data collected from different families to be combined by simple addition. Computer programs are available for the calculation of lod scores for differing values of θ (e.g., LIPED, MLINK (Lathrop, Proc. Nat. Acad. Sci. (USA) 81, 3443-3446 (1984))). For any particular lod score, a recombination fraction may be determined from mathematical tables. See Smith et al., Mathematical tables for research workers in human genetics (Churchill, London, 1961); Smith, Ann. Hum. Genet. 32, 127-150 (1968). The value of θ at which the lod score is the highest is considered to be the best estimate of the recombination fraction.




Positive lod score values suggest that the two loci are linked, whereas negative values suggest that linkage is less likely (at that value of θ) than the possibility that the two loci are unlinked. By convention, a combined lod score of +3 or greater (equivalent to greater than 1000:1 odds in favor of linkage) is considered definitive evidence that two loci are linked. Similarly, by convention, a negative lod score of −2 or less is taken as definitive evidence against linkage of the two loci being compared. Negative linkage data are useful in excluding a chromosome or a segment thereof from consideration. The search focuses on the remaining non-excluded chromosomal locations.
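
The lod score calculation can be illustrated with a minimal Python sketch for the simplest case, in which every meiosis is fully informative and phase-known, so that each meiosis can be scored as recombinant or non-recombinant. This is a simplifying assumption of the sketch and is not how programs such as LIPED or MLINK handle general pedigrees; the function names are hypothetical. Under this assumption the likelihood at recombination fraction θ is θ^R (1 − θ)^(N − R) for R recombinants among N meioses, and the unlinked likelihood is 0.5^N.

```python
from math import log10


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """log10 of L(theta) / L(0.5) for phase-known, fully informative meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0.0 and 0.5")
    linked = theta ** recombinants * (1 - theta) ** (meioses - recombinants)
    unlinked = 0.5 ** meioses
    if linked == 0.0:
        return float("-inf")   # theta = 0 is impossible once a recombinant is seen
    return log10(linked / unlinked)


def best_theta(recombinants: int, meioses: int, steps: int = 50) -> float:
    """Grid search over theta in [0, 0.5] for the highest lod score."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: lod_score(recombinants, meioses, t))


# One family with 1 recombinant in 10 meioses, evaluated at several thetas.
for theta in (0.0, 0.05, 0.1, 0.2, 0.5):
    print(theta, lod_score(1, 10, theta))

# Scores from different families at the same theta are combined by addition.
combined = lod_score(1, 10, 0.1) + lod_score(0, 8, 0.1)
print("combined lod at theta = 0.1:", combined, "meets +3 threshold:", combined >= 3)
```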




The invention further provides transgenic nonhuman animals capable of expressing an exogenous variant gene and/or having one or both alleles of an endogenous variant gene inactivated. Expression of an exogenous variant gene is usually achieved by operably linking the gene to a promoter and optionally an enhancer, and microinjecting the construct into a zygote. See Hogan et al., “Manipulating the Mouse Embryo, A Laboratory Manual,” Cold Spring Harbor Laboratory (1989). Inactivation of endogenous variant genes can be achieved by forming a transgene in which a cloned variant gene is inactivated by insertion of a positive selection marker. See Capecchi, Science 244, 1288-1292 (1989). The transgene is then introduced into an embryonic stem cell, where it undergoes homologous recombination with an endogenous variant gene. Mice and other rodents are preferred animals. Such animals provide useful drug screening systems.




The invention further provides methods for assessing the pharmacogenomic susceptibility of a subject harboring a single nucleotide polymorphism to a particular pharmaceutical compound, or to a class of such compounds. Genetic polymorphisms in drug-metabolizing enzymes, drug transporters, receptors for pharmaceutical agents, and other drug targets have been correlated with individual differences in the efficacy and toxicity of the pharmaceutical agent administered to a subject. Pharmacogenomic characterization of a subject's susceptibility to a drug enhances the ability to tailor a dosing regimen to the particular genetic constitution of the subject, thereby enhancing and optimizing the therapeutic effectiveness of the therapy.




In cases in which a cSNP leads to a polymorphic protein that is ascribed to be the cause of a pathological condition, a method of treating such a condition includes administering to a subject experiencing the pathology the wild type cognate of the polymorphic protein. Once administered in an effective dosing regimen, the wild type cognate provides complementation or remediation of the defect due to the polymorphic protein. The subject's condition is ameliorated by this protein therapy.




A subject suspected of suffering from a pathology ascribable to a polymorphic protein that arises from a cSNP is to be diagnosed using any of a variety of diagnostic methods capable of identifying the presence of the cSNP in the nucleic acid, or of the cognate polymorphic protein, in a suitable clinical sample taken from the subject. Once the presence of the cSNP has been ascertained, and the pathology is correctable by administering a normal or wild-type gene, the subject is treated with a pharmaceutical composition that includes a nucleic acid that harbors the correcting wild-type gene, or a fragment containing a correcting sequence of the wild-type gene. Non-limiting examples of ways in which such a nucleic acid may be administered include incorporating the wild-type gene in a viral vector, such as an adenovirus or adeno associated virus, and administration of a naked DNA in a pharmaceutical composition that promotes intracellular uptake of the administered nucleic acid. Once the nucleic acid that includes the gene coding for the wild-type allele of the polymorphism is incorporated within a cell of the subject, it will initiate de novo biosynthesis of the wild-type gene product. If the nucleic acid is further incorporated into the genome of the subject, the treatment will have long-term effects, providing de novo synthesis of the wild-type protein for a prolonged duration. The synthesis of the wild-type protein in the cells of the subject will contribute to a therapeutic enhancement of the clinical condition of the subject.




A subject suffering from a pathology ascribed to a SNP may be treated so as to correct the genetic defect. (See Kren et al., Proc. Natl. Acad. Sci. USA 96:10349-10354 (1999)). Such a subject is identified by any method that can detect the polymorphism in a sample drawn from the subject. Such a genetic defect may be permanently corrected by administering to such a subject a nucleic acid fragment incorporating a repair sequence that supplies the wild-type nucleotide at the position of the SNP. This site-specific repair sequence encompasses an RNA/DNA oligonucleotide which operates to promote endogenous repair of a subject's genomic DNA. Upon administration in an appropriate vehicle, such as a complex with polyethylenimine or encapsulated in anionic liposomes, a genetic defect leading to an inborn pathology may be overcome, as the chimeric oligonucleotide induces incorporation of the wild-type sequence into the subject's genome. Upon incorporation, the wild-type gene product is expressed, and the replacement is propagated, thereby engendering a permanent repair.




The invention further provides kits comprising at least one allele-specific oligonucleotide as described above. Often, the kits contain one or more pairs of allele-specific oligonucleotides hybridizing to different forms of a polymorphism. In some kits, the allele-specific oligonucleotides are provided immobilized to a substrate. For example, the same substrate can comprise allele-specific oligonucleotide probes for detecting at least 10, 100, 1000 or all of the polymorphisms shown in the Table. Optional additional components of the kit include, for example, restriction enzymes, reverse-transcriptase or polymerase, the substrate nucleoside triphosphates, means used to label (for example, an avidin-enzyme conjugate and enzyme substrate and chromogen if the label is biotin), and the appropriate buffers for reverse transcription, PCR, or hybridization reactions. Usually, the kit also contains instructions for carrying out the hybridizing methods.




Several aspects of the present invention rely on having available the polymorphic proteins encoded by the nucleic acids comprising a SNP of the invention. There are various methods of isolating these nucleic acid sequences. For example, DNA is isolated from a genomic or cDNA library using labeled oligonucleotide probes having sequences complementary to the sequences disclosed herein.




Such probes can be used directly in hybridization assays. Alternatively probes can be designed for use in amplification techniques such as PCR.




To prepare a cDNA library, mRNA is isolated from tissue such as heart or pancreas, preferably a tissue wherein expression of the gene or gene family is likely to occur. cDNA is prepared from the mRNA and ligated into a recombinant vector. The vector is transfected into a recombinant host for propagation, screening and cloning. Methods for making and screening cDNA libraries are well known. See Gubler, U. and Hoffman, B. J., Gene 25:263-269 (1983) and Sambrook et al.




For a genomic library, for example, the DNA is extracted from tissue and either mechanically sheared or enzymatically digested to yield fragments of about 12-20 kb. The fragments are then separated by gradient centrifugation from undesired sizes and are constructed in bacteriophage lambda vectors. These vectors and phage are packaged in vitro, as described in Sambrook, et al. Recombinant phage are analyzed by plaque hybridization as described in Benton and Davis, Science 196:180-182 (1977). Colony hybridization is carried out as generally described in M. Grunstein et al., Proc. Natl. Acad. Sci. USA 72:3961-3965 (1975). DNA of interest is identified in either cDNA or genomic libraries by its ability to hybridize with nucleic acid probes, for example on Southern blots, and these DNA regions are isolated by standard methods familiar to those of skill in the art. See Sambrook, et al.




In PCR techniques, oligonucleotide primers complementary to the two 3′ borders of the DNA region to be amplified are synthesized. The polymerase chain reaction is then carried out using the two primers. See PCR Protocols: a Guide to Methods and Applications (Innis, M., Gelfand, D., Sninsky, J. and White, T., eds.), Academic Press, San Diego (1990). Primers can be selected to amplify the entire regions encoding a full-length sequence of interest or to amplify smaller DNA segments as desired. PCR can be used in a variety of protocols to isolate cDNAs encoding a sequence of interest. In these protocols, appropriate primers and probes for amplifying DNA encoding a sequence of interest are generated from analysis of the DNA sequences listed herein. Once such regions are PCR-amplified, they can be sequenced and oligonucleotide probes can be prepared from the sequence.




Once DNA encoding a sequence comprising a cSNP is isolated and cloned, one can express the encoded polymorphic proteins in a variety of recombinantly engineered cells. It is expected that those of skill in the art are knowledgeable in the numerous expression systems available for expression of DNA encoding a sequence of interest. No attempt to describe in detail the various methods known for the expression of proteins in prokaryotes or eukaryotes is made here.




In brief summary, the expression of natural or synthetic nucleic acids encoding a sequence of interest will typically be achieved by operably linking the DNA or cDNA to a promoter (which is either constitutive or inducible), followed by incorporation into an expression vector. The vectors can be suitable for replication and integration in either prokaryotes or eukaryotes. Typical expression vectors contain initiation sequences, transcription and translation terminators, and promoters useful for regulation of the expression of a polynucleotide sequence of interest. To obtain high level expression of a cloned gene, it is desirable to construct expression plasmids which contain, at the minimum, a strong promoter to direct transcription, a ribosome binding site for translational initiation, and a transcription/translation terminator. The expression vectors may also comprise generic expression cassettes containing at least one independent terminator sequence, sequences permitting replication of the plasmid in both eukaryotes and prokaryotes, i.e., shuttle vectors, and selection markers for both prokaryotic and eukaryotic systems. See Sambrook et al.




A variety of prokaryotic expression systems may be used to express the polymorphic proteins of the invention. Examples include E. coli, Bacillus, Streptomyces, and the like.




It is preferred to construct expression plasmids which contain, at the minimum, a strong promoter to direct transcription, a ribosome binding site for translational initiation, and a transcription/translation terminator. Examples of regulatory regions suitable for this purpose in E. coli are the promoter and operator region of the E. coli tryptophan biosynthetic pathway as described by Yanofsky, C., J. Bacteriol. 158:1018-1024 (1984) and the leftward promoter of phage lambda (PL) as described by Herskowitz, I. and Hagen, D., Ann. Rev. Genet. 14:399-445 (1980). The inclusion of selection markers in DNA vectors transformed in E. coli is also useful. Examples of such markers include genes specifying resistance to ampicillin, tetracycline, or chloramphenicol. See Sambrook et al. for details concerning selection markers for use in E. coli.






To enhance proper folding of the expressed recombinant protein during purification from E. coli, the expressed protein may first be denatured and then renatured. This can be accomplished by solubilizing the bacterially produced proteins in a chaotropic agent such as guanidine HCl and reducing all the cysteine residues with a reducing agent such as beta-mercaptoethanol. The protein is then renatured, either by slow dialysis or by gel filtration. See U.S. Pat. No. 4,511,503. Detection of the expressed antigen is achieved by methods known in the art, such as radioimmunoassay, Western blotting, or immunoprecipitation. Purification from E. coli can be achieved following procedures such as those described in U.S. Pat. No. 4,511,503.




Any of a variety of eukaryotic expression systems, such as yeast, insect cell lines, and bird, fish, and mammalian cells, may also be used to express a polymorphic protein of the invention. As explained briefly below, a nucleotide sequence harboring a cSNP may be expressed in these eukaryotic systems. Synthesis of heterologous proteins in yeast is well known. Methods in Yeast Genetics, Sherman, F., et al., Cold Spring Harbor Laboratory (1982), is a well recognized work describing the various methods available to produce the protein in yeast. Suitable vectors usually have expression control sequences, such as promoters, including 3-phosphoglycerate kinase or other glycolytic enzymes, and an origin of replication, termination sequences and the like as desired. For instance, suitable vectors are described in the literature (Botstein, et al., Gene 8:17-24 (1979); Broach, et al., Gene 8:121-133 (1979)).




Two procedures are used in transforming yeast cells. In one case, yeast cells are first converted into protoplasts using zymolyase, lyticase or glusulase, followed by addition of DNA and polyethylene glycol (PEG). The PEG-treated protoplasts are then regenerated in a 3% agar medium under selective conditions. Details of this procedure are given in the papers by J. D. Beggs, Nature (London) 275:104-109 (1978); and Hinnen, A., et al., Proc. Natl. Acad. Sci. USA 75:1929-1933 (1978). The second procedure does not involve removal of the cell wall. Instead, the cells are treated with lithium chloride or acetate and PEG and put on selective plates (Ito, H., et al., J. Bact. 153:163-168 (1983)). The expressed proteins can then be isolated from yeast by lysing the cells and applying standard protein isolation techniques to the lysates.




The purification process can be monitored by using Western blot techniques or radioimmunoassay or other standard techniques. The sequences encoding the proteins of the invention can also be ligated to various immunoassay expression vectors for use in transforming cell cultures of, for instance, mammalian, insect, bird or fish origin. Illustrative of cell cultures useful for the production of the polypeptides are mammalian cells. Mammalian cell systems often will be in the form of monolayers of cells although mammalian cell suspensions may also be used. A number of suitable host cell lines capable of expressing intact proteins have been developed in the art, and include the HEK293, BHK21, and CHO cell lines, and various other cell lines such as COS cell lines, HeLa cells, myeloma cell lines, Jurkat cells, etc. Expression vectors for these cells can include expression control sequences, such as an origin of replication, a promoter (e.g., the CMV promoter, an HSV tk promoter or pgk (phosphoglycerate kinase) promoter), an enhancer (Queen et al., Immunol. Rev. 89:49 (1986)) and necessary processing information sites, such as ribosome binding sites, RNA splice sites, polyadenylation sites (e.g., an SV40 large T Ag poly A addition site), and transcriptional terminator sequences.




Other animal cells are available, for instance, from the American Type Culture Collection Catalogue of Cell Lines and Hybridomas (7th edition, (1992)). Appropriate vectors for expressing the proteins of the invention in insect cells are usually derived from baculovirus. Insect cell lines include mosquito larvae, silkworm, armyworm, moth and Drosophila cell lines such as a Schneider cell line (see Schneider, J. Embryol. Exp. Morphol. 27:353-365 (1987)). As indicated above, the vector, e.g., a plasmid, which is used to transform the host cell, preferably contains DNA sequences to initiate transcription and sequences to control the translation of the protein. These sequences are referred to as expression control sequences. As with yeast, when higher animal host cells are employed, polyadenylation or transcription terminator sequences from known mammalian genes need to be incorporated into the vector. An example of a terminator sequence is the polyadenylation sequence from the bovine growth hormone gene. Sequences for accurate splicing of the transcript may also be included. An example of a splicing sequence is the VP1 intron from SV40 (Sprague, J. et al., J. Virol. 45:773-781 (1983)). Additionally, gene sequences to control replication in the host cell may be incorporated into the vector (see Saveria-Campo, M., 1985, “Bovine Papilloma virus DNA a Eukaryotic Cloning Vector” in DNA Cloning Vol. II a Practical Approach, Ed. D. M. Glover, IRL Press, Arlington, Va., pp. 213-238). The host cells are competent or rendered competent for transformation by various means. There are several well-known methods of introducing DNA into animal cells. These include: calcium phosphate precipitation, fusion of the recipient cells with bacterial protoplasts containing the DNA, treatment of the recipient cells with liposomes containing the DNA, DEAE dextran, electroporation and micro-injection of the DNA directly into the cells.




The transformed cells are cultured by means well known in the art (Biochemical Methods in Cell Culture and Virology, Kuchler, R. J., Dowden, Hutchinson and Ross, Inc., (1977)). The expressed polypeptides are isolated from cells grown as suspensions or as monolayers. The latter are recovered by well known mechanical, chemical or enzymatic means.




General methods of expressing recombinant proteins are also known and are exemplified in R. Kaufman, Methods in Enzymology 185, 537-566 (1990). As defined herein, “operably linked” refers to linkage of a promoter upstream from a DNA sequence such that the promoter mediates transcription of the DNA sequence. Specifically, “operably linked” means that the isolated polynucleotide of the invention and an expression control sequence are situated within a vector or cell in such a way that the gene encoding the protein is expressed by a host cell which has been transformed (transfected) with the ligated polynucleotide/expression sequence. The term “vector” refers to viral expression systems and autonomous self-replicating circular DNA (plasmids), and includes both expression and nonexpression plasmids.




The term “gene” as used herein is intended to refer to a nucleic acid sequence which encodes a polypeptide. This definition includes various sequence polymorphisms, mutations, and/or sequence variants wherein such alterations do not affect the function of the gene product. The term “gene” is intended to include not only coding sequences but also regulatory regions such as promoters, enhancers, termination regions and similar untranslated nucleotide sequences. The term further includes all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites.




A number of types of cells may act as suitable host cells for expression of the protein. Mammalian host cells include, for example, monkey COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast or in prokaryotes such as bacteria. Potentially suitable yeast strains include Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida or any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it may be necessary to modify the protein produced therein, for example by phosphorylation or glycosylation of the appropriate sites, in order to obtain the functional protein.




The protein may also be produced by operably linking the isolated polynucleotide of the invention to suitable control sequences in one or more insect expression vectors, and employing an insect expression system. Materials and methods for baculovirus/insect cell expression systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. (the MaxBac© kit), and such methods are well known in the art, as described in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by reference. As used herein, an insect cell capable of expressing a polynucleotide of the present invention is “transformed.” The protein of the invention may be prepared by culturing transformed host cells under culture conditions suitable to express the recombinant protein.




The polymorphic protein of the invention may also be expressed as a product of transgenic animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized by somatic or germ cells containing a nucleotide sequence encoding the protein. The protein may also be produced by known conventional chemical synthesis. Methods for constructing the proteins of the present invention by synthetic means are known to those skilled in the art.




The polymorphic proteins produced by recombinant DNA technology may be purified by techniques commonly employed to isolate or purify recombinant proteins. Recombinantly produced proteins can be directly expressed or expressed as a fusion protein. The protein is then purified by a combination of cell lysis (e.g., sonication) and affinity chromatography. For fusion products, subsequent digestion of the fusion protein with an appropriate proteolytic enzyme releases the desired polypeptide. The polypeptides of this invention may be purified to substantial purity by standard techniques well known in the art, including selective precipitation with such substances as ammonium sulfate, column chromatography, immunopurification methods, and others. See, for instance, R. Scopes, Protein Purification: Principles and Practice, Springer-Verlag: New York (1982), incorporated herein by reference. For example, in an embodiment, antibodies may be raised to the proteins of the invention as described herein. Cell membranes are isolated from a cell line expressing the recombinant protein, the protein is extracted from the membranes and immunoprecipitated. The proteins may then be further purified by standard protein chemistry techniques as described above.




The resulting expressed protein may then be purified from such culture (i.e., from culture medium or cell extracts) using known purification processes, such as gel filtration and ion exchange chromatography. The purification of the protein may also include an affinity column containing agents which will bind to the protein; one or more column steps over such affinity resins as concanavalin A-agarose, heparin-Toyopearl® or Cibacron blue 3GA Sepharose B; one or more steps involving hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl ether; or immunoaffinity chromatography. Alternatively, the protein of the invention may also be expressed in a form which will facilitate purification. For example, it may be expressed as a fusion protein, such as those of maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX). Kits for expression and purification of such fusion proteins are commercially available from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, respectively. The protein can also be tagged with an epitope and subsequently purified by using a specific antibody directed to such epitope. One such epitope (“Flag”) is commercially available from Kodak (New Haven, Conn.). Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing purification steps, in various combinations, can also be employed to provide a substantially homogeneous isolated recombinant protein. The protein thus purified is substantially free of other mammalian proteins and is defined in accordance with the present invention as an “isolated protein.”




The term “antibody” as used herein refers to immunoglobulin molecules and immunologically active portions of immunoglobulin molecules, i.e., molecules that contain an antigen binding site that specifically binds (immunoreacts with) an antigen, such as a polymorphic protein. Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab and F(ab′)2 fragments, and an Fab expression library. In a specific embodiment, antibodies to human polymorphic proteins are disclosed.




The phrases “specifically binds to”, “immunospecifically binds to” and “specifically immunoreactive with”, when referring to the interaction of an antibody with a protein or peptide, refer to a binding reaction which is determinative of the presence of the protein in the presence of a heterogeneous population of proteins and other biological materials. Thus, for example, under designated immunoassay conditions, the specified antibodies bind to a particular protein and do not bind in a significant amount to other proteins present in the sample. Specific binding to an antibody under such conditions may require an antibody that is selected for its specificity for a particular protein. Of particular interest in the present invention is an antibody that binds immunospecifically to a polymorphic protein but not to its cognate wild type allelic protein, or vice versa. A variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select monoclonal antibodies specifically immunoreactive with a protein. See Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York, for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity.




Polyclonal and/or monoclonal antibodies that immunospecifically bind to polymorphic gene products but not to the corresponding prototypical or “wild-type” gene products are also provided. Antibodies can be made by injecting mice or other animals with the variant gene product or synthetic peptide. Monoclonal antibodies are screened as described, for example, in Harlow & Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Press, New York (1988); Goding, Monoclonal Antibodies, Principles and Practice (2d ed.) Academic Press, New York (1986). Monoclonal antibodies are tested for specific immunoreactivity with a variant gene product and lack of immunoreactivity to the corresponding prototypical gene product.
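The selection criterion described above, reactivity with the variant gene product together with lack of reactivity with the prototypical product, can be sketched as a simple filter over paired assay readings. The function name, the clone names, and both signal cutoffs below are illustrative assumptions; suitable thresholds depend on the assay and its background and are not specified in this description.

```python
from typing import Dict, List, Tuple


def select_variant_specific(
    readings: Dict[str, Tuple[float, float]],
    min_variant_signal: float = 1.0,
    max_wildtype_signal: float = 0.2,
) -> List[str]:
    """Return clones that react with the variant but not the wild-type product.

    `readings` maps a clone name to a pair:
    (signal against the variant product, signal against the wild-type product),
    for example background-subtracted ELISA absorbances.
    Both cutoffs are hypothetical defaults chosen for the example.
    """
    return sorted(
        clone
        for clone, (variant, wildtype) in readings.items()
        if variant >= min_variant_signal and wildtype <= max_wildtype_signal
    )


if __name__ == "__main__":
    # Made-up readings for three hypothetical hybridoma clones.
    example = {
        "clone_A": (1.8, 0.05),  # variant-specific
        "clone_B": (1.6, 1.40),  # binds both forms
        "clone_C": (0.1, 0.08),  # binds neither
    }
    print(select_variant_specific(example))
```

Swapping the two elements of each pair selects the converse case, antibodies that bind the wild-type product but not the variant.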




An isolated polymorphic protein, or a portion or fragment thereof, can be used as an immunogen to generate antibodies that bind the polymorphic protein using standard techniques for polyclonal and monoclonal antibody preparation. The full-length polymorphic protein can be used or, alternatively, the invention provides antigenic peptide fragments of the polymorphic protein for use as immunogens. The antigenic peptide of a polymorphic protein of the invention comprises at least 8 amino acid residues of the amino acid sequence encompassing the polymorphic amino acid and encompasses an epitope of the polymorphic protein such that an antibody raised against the peptide forms a specific immune complex with the polymorphic protein. Preferably, the antigenic peptide comprises at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues. Preferred epitopes encompassed by the antigenic peptide are regions of the polymorphic protein that are located on the surface of the protein, e.g., hydrophilic regions.
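One way to enumerate candidate antigenic peptides of at least 8 residues that encompass the polymorphic amino acid, and to rank them by hydrophilicity, is sketched below. The use of the Kyte-Doolittle hydropathy scale as a stand-in for surface exposure is an assumption made for this example (the description only says that hydrophilic regions are preferred), and the function name and toy protein are hypothetical.

```python
# Kyte-Doolittle hydropathy values; lower (more negative) means more hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def candidate_peptides(protein: str, snp_index: int, length: int = 8):
    """List every window of `length` residues containing protein[snp_index].

    `snp_index` is the 0-based position of the polymorphic amino acid.
    Returns (mean hydropathy, start, peptide) tuples sorted so that the most
    hydrophilic window comes first.
    """
    if length < 8:
        raise ValueError("the description calls for at least 8 residues")
    if not 0 <= snp_index < len(protein):
        raise IndexError("snp_index is outside the protein")
    first = max(0, snp_index - length + 1)
    last = min(snp_index, len(protein) - length)
    windows = []
    for start in range(first, last + 1):
        peptide = protein[start:start + length]
        score = sum(KYTE_DOOLITTLE[aa] for aa in peptide) / length
        windows.append((score, start, peptide))
    return sorted(windows)


if __name__ == "__main__":
    # Toy sequence, assumed for demonstration; index 7 is the polymorphic residue.
    for score, start, peptide in candidate_peptides("MKTLLDEKRSAVILG", 7):
        print(f"{start:2d} {peptide} {score:+.2f}")
```

Mean hydropathy is only a crude proxy for surface location; a structure-based or experimentally validated epitope prediction would be preferable where available.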




For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, goat, mouse or other mammal) may be immunized by injection with the polymorphic protein. An appropriate immunogenic preparation can contain, for example, recombinantly expressed polymorphic protein or a chemically synthesized polymorphic polypeptide. The preparation can further include an adjuvant. Various adjuvants used to increase the immunological response include, but are not limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, dinitrophenol, etc.), human adjuvants such as Bacille Calmette-Guerin and Corynebacterium parvum, or similar immunostimulatory agents. If desired, the antibody molecules directed against polymorphic proteins can be isolated from the mammal (e.g., from the blood) and further purified by well known techniques, such as protein A chromatography, to obtain the IgG fraction.




The term “monoclonal antibody” or “monoclonal antibody composition”, as used herein, refers to a population of antibody molecules that originates from the clone of a single hybridoma cell, and that contains only one type of antigen binding site capable of immunoreacting with a particular epitope of a polymorphic protein. A monoclonal antibody composition thus typically displays a single binding affinity for a particular polymorphic protein with which it immunoreacts. For preparation of monoclonal antibodies directed towards a particular polymorphic protein, or derivatives, fragments, analogs or homologs thereof, any technique that provides for the production of antibody molecules by continuous cell line culture may be utilized. Such techniques include, but are not limited to, the hybridoma technique (see Kohler & Milstein, 1975 Nature 256: 495-497); the trioma technique; the human B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal antibodies may be utilized in the practice of the present invention and may be produced by using human hybridomas (see Cote, et al., 1983 Proc Natl Acad Sci USA 80: 2026-2030) or by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).




According to the invention, techniques can be adapted for the production of single-chain antibodies specific to a polymorphic protein (see e.g., U.S. Pat. No. 4,946,778). In addition, methodologies can be adapted for the construction of Fab expression libraries (see e.g., Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of monoclonal Fab fragments with the desired specificity for a polymorphic protein or derivatives, fragments, analogs or homologs thereof. Non-human antibodies can be “humanized” by techniques well known in the art. See e.g., U.S. Pat. No. 5,225,539. Antibody fragments that contain the idiotypes to a polymorphic protein may be produced by techniques known in the art including, but not limited to: (i) an F(ab′)2 fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab′)2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments.




Additionally, recombinant anti-polymorphic protein antibodies, such as chimeric and humanized monoclonal antibodies, comprising both human and non-human portions, which can be made using standard recombinant DNA techniques, are within the scope of the invention. Such chimeric and humanized monoclonal antibodies can be produced by recombinant DNA techniques known in the art, for example using methods described in PCT International Application No. PCT/US86/02269; European Patent Application No. 184,187; European Patent Application No. 171,496; European Patent Application No. 173,494; PCT International Publication No. WO 86/01533; U.S. Pat. No. 4,816,567; European Patent Application No. 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al. (1987) J Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al. (1987) Cancer Res 47:999-1005; Wood et al. (1985) Nature 314:446-449; Shaw et al. (1988) J Natl Cancer Inst 80:1553-1559; Morrison (1985) Science 229:1202-1207; Oi et al. (1986) BioTechniques 4:214; U.S. Pat. No. 5,225,539; Jones et al. (1986) Nature 321:522-525; Verhoeyen et al. (1988) Science 239:1534; and Beidler et al. (1988) J Immunol 141:4053-4060.




In one embodiment, methodologies for the screening of antibodies that possess the desired specificity include, but are not limited to, enzyme-linked immunosorbent assay (ELISA) and other immunologically-mediated techniques known within the art.




Anti-polymorphic protein antibodies may be used in methods known within the art relating to the detection, quantitation and/or cellular or tissue localization of a polymorphic protein (e.g., for use in measuring levels of the polymorphic protein within appropriate physiological samples, for use in diagnostic methods, for use in imaging the protein, and the like). In a given embodiment, antibodies for polymorphic proteins, or derivatives, fragments, analogs or homologs thereof, that contain the antibody-derived CDR, are utilized as pharmacologically-active compounds in therapeutic applications intended to treat a pathology in a subject that arises from the presence of the cSNP allele in the subject.




An anti-polymorphic protein antibody (e.g., monoclonal antibody) can be used to isolate polymorphic proteins by a variety of immunochemical techniques, such as immunoaffinity chromatography or immunoprecipitation. An anti-polymorphic protein antibody can facilitate the purification of natural polymorphic protein from cells and of recombinantly produced polymorphic proteins expressed in host cells. Moreover, an anti-polymorphic protein antibody can be used to detect polymorphic protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the polymorphic protein. Anti-polymorphic antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include 125I, 131I, 35S or 3H.
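Table 1 below writes each polymorphic site as flanking sequence with the two alleles in square brackets, for example CAGCT[G/A]CC, and uses the word “gap” where one allele lacks the base. The following minimal sketch expands that notation into the two allele sequences. The function name and the regular expression are assumptions made for illustration and are not part of the disclosure.

```python
import re

# LEFT[before/after]RIGHT, where each allele is one base or the word "gap".
_SNP_PATTERN = re.compile(r"^([ACGT]*)\[([ACGT]|gap)/([ACGT]|gap)\]([ACGT]*)$")


def expand_alleles(polymorphic_sequence: str):
    """Return (sequence with the 'before' base, sequence with the 'after' base).

    A "gap" allele contributes no base, so the corresponding sequence is one
    nucleotide shorter than the other.
    """
    match = _SNP_PATTERN.match(polymorphic_sequence)
    if match is None:
        raise ValueError(f"unrecognized notation: {polymorphic_sequence!r}")
    left, before, after, right = match.groups()

    def build(base: str) -> str:
        return left + ("" if base == "gap" else base) + right

    return build(before), build(after)


if __name__ == "__main__":
    print(expand_alleles("CAGCT[G/A]CC"))
    print(expand_alleles("CGGGA[G/gap]C"))
```

Expanding both alleles in this way gives the literal sequences needed when designing allele-specific probes or primers against a listed site.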
























TABLE 1

Seq ID | CuraGen sequence ID | Base pos. of SNP | Polymorphic sequence | Base before | Base after | Amino acid before | Amino acid after | Type of change | Protein classification | Name of protein identified following a BLASTX analysis of the CuraGen sequence | Similarity (pValue) following a BLASTX analysis | Map location
1-2 | cg34158766 | 254 | TCGGGCTCGTCCAGCCAAAGCAGCT[G/A]CCACTCAAAAGAAAGTAGAAAGAAA | G | A | | | SILENT-NONCODING | angiopoietin | Human Gene Similar to SPTREMBL-ID: O08538 ANGIOPOIETIN-1 - MUS MUSCULUS (MOUSE), 498 aa. | 6.00E−34 | 4
3-4 | cg43299481 | 218 | GTCCCGGTCAGGGTGGGCTGCGGGA[G/gap]CCTCAGGCCACGGGGAAGTAGTGGG | G | gap | | | SILENT-NONCODING | dehydrogenase | Human Gene Similar to SPTREMBL-ID: Q44020 4-HYDROXYBUTYRATE DEHYDROGENASE (GBD), ORF 2 AND 4-10 GENES, COMPLETE CDS, AND ORF3 AND 11, 3′ END - ALCALIGENES EUTROPHUS, 173 aa. | 9.30E−33 |
5-6 | cg43933536 | 1628 | ACACCGGGGGGGCGCGGGGGTCTCC[C/T]TGGTCCGCAGAGACAGCTAGCTAGC | C | T | | | SILENT-NONCODING | eph | Human Gene Similar to TREMBLNEW-ID: G2735762 HEAT SHOCK PROTEIN DNAJ - LEPTOSPIRA INTERROGANS, 369 aa. | 7.70E−37 | 3
7-8 | cg44021557 | 391 | TGATATCCAGGCTGAATCTATCCCA[T/C]TGATCTTAGGAGGAGGTGATGTACT | T | C | Leu | Leu | SILENT-CODING | helicase | Human Gene Similar to SWISSPROT-ID: Q11039 ATP-DEPENDENT RNA HELICASE DEAD HOMOLOG - MYCOBACTERIUM TUBERCULOSIS, 563 aa. | 1.50E−42 | 2
9-10 | cg44021557 | 516 | GAAAGACCAACAGGAAGGCAAAAAA[gap/A]GGAAAAACAACAATTAAAACTGGTG | gap | A | | | SILENT-NONCODING | helicase | Human Gene Similar to SWISSPROT-ID: Q11039 ATP-DEPENDENT RNA HELICASE DEAD HOMOLOG - MYCOBACTERIUM TUBERCULOSIS, 563 aa. | 1.50E−42 | 2
11-12 | cg43305492 | 323 | TCACCATCACCAAGGACACCTCCAA[A/G]AACCAGGTGGTCCTTACAATGACCA | A | G | Lys | Lys | SILENT-CODING | immunoglob | Human Gene Similar to TREMBLNEW-ID: G2734101 IMMUNOGLOBULIN HEAVY CHAIN, VD(5)J(4) LIKE GENE PRODUCT - HOMO SAPIENS (HUMAN), 151 aa. | 740E−49 |
13-14 | cg43941918 | 896 | GATTAGAACCCAAATAGATCAGGAG[C/T]TGTGGAGACTGCCCTGGCTTCTGCA | C | T | Gln | Gln | SILENT-CODING | immunoglob | Human Gene Similar to TREMBLNEW-ID: G240581 IMMUNOGLOBULIN G2B VARIABLE REGION LIGHT CHAIN, AUTOANTOBODY BV04-01 VARIABLE REGION LIGHT CHAIN - MUSSP, 113 aa. | 2.80E−42 |
15-16 | cg42312510 | 235 | CTACTTTGGAAAGTGGGGTCCCATC[A/G]AGATTCAGCGGCAGTGGATCTGGGA | A | G | Ser | Ser | SILENT-CODING | immunoglob | Human Gene Similar to TREMBLNEW-ID: G300839 IMMUNOGLOBULIN LIGHT CHAIN VARIABLE REGION {CLONE ALPHA FOG1-A4} - HOMO SAPIENS, 107 aa. | 1.20E−39 |
17-18 | cg43147217 | 2161 | TAACCTATTTATTATTTATGTATTT[A/T]TTTAAGCATCAAATATTTGTGCAAG | A | T | | | SILENT-NONCODING | interleukin | Human Gene Similar to SWISSPROT-ID: P10145 INTERLEUKIN-8 PRECURSOR (IL-8) (MONOCYTE-DERIVED NEUTROPHIL CHEMOTACTIC FACTOR) (MDNCF) (T-CELL CHEMOTACTIC FACTOR) (NEUTROPHIL-ACTIVATING PROTEIN 1) (NAP-1) (LYMPHOCYTE-DERIVED NEUTROPHIL-ACTIVATING FACTOR) (LYNAP) (PROTEIN 3-10C) (NEUTROPHIL-ACTIVATING FACTOR) (NAF) (GRANULOCYTE CHEMOTACTIC PROTEIN 1) (GCP-1) (EMOCTAKIN) - HOMO SAPIENS | 8.30E−48 | 4 (4q12)
19-20 | cg20725546 | 402 | AGGCCGCTGACACTCTCGTTATTGG[T/C]GGCGGTATGGCGTACACCTTCCTCA | T | C | Gly | Gly | SILENT-CODING | kinase | Human Gene Similar to SWISSPROT-ID: O06821 PHOSPHOGLYCERATE KINASE (EC 2.7.2.3) - MYCOBACTERIUM TUBERCULOSIS, 412 aa. | 9.40E−42 |
21-22 | cg32160481 | 218 | GGGAAATGGCCTCTGTGGGGAGGAG[C/T]GAGGGGCCCGCCCGGCGGGGGCGCA | C | T | | | SILENT-NONCODING | MHC | Human Gene Similar to SWISSPROT-ID: P30508 HLA CLASS HISTOCOMPATIBILITY ANTIGEN, CW*1201 ALPHA CHAIN PRECURSOR (HLA-CX52) - HOMO SAPIENS (HUMAN), 366 aa. | 5.60E−37 |
23-24 | cg29691725 | 109 | CTCGATGTCGCCCGAGACATGGAGA[T/C]CGTCGGCCCCTCGCCAAGGCATTGC | T | C | | | SILENT-NONCODING | nuclease | Human Gene Similar to TREMBLNEW-ID: E1264534 ENDONUCLEASE III - MYCOBACTERIUM TUBERCULOSIS, 226 aa. | 1.90E−34 |
25-26 | cg29691725 | 122 | GAGACATGGAGATCGTCGGCCCCTC[G/A]CCAAGGCATTGCAAGCCGTGATCGG | G | A | | | SILENT-NONCODING | nuclease | Human Gene Similar to TREMBLNEW-ID: E1264534 ENDONUCLEASE III - MYCOBACTERIUM TUBERCULOSIS, 226 aa. | 1.90E−34 |
27-28 | cg29691725 | 146 | CGCCAAGGCATTGCAAGCCGTGATC[G/A]GGCATGGGCCGGCCCTCTGTGGCGT | G | A | | | SILENT-NONCODING | nuclease | Human Gene Similar to TREMBLNEW-ID: E1264534 ENDONUCLEASE III - MYCOBACTERIUM TUBERCULOSIS, 226 aa. | 1.90E−34 |
29-30 | cg29691725 | 169 | CGGGCATGGGCCGGCCCTCTGTGGC[gap/C]GTCCCGGAACTTTTCGCAATCGGCC | gap | C | | | SILENT-NONCODING | nuclease | Human Gene Similar to TREMBLNEW-ID: E1264534 ENDONUCLEASE III - MYCOBACTERIUM TUBERCULOSIS, 226 aa. | 1.90E−34 |
31-32 | cg29691725 | 204 | CTTTTCGCAATCGGCCCCGACGGCA[A/G]ATGCGAGCACCCCGGTATTCCGGCA | A | G | | | SILENT-NONCODING | nuclease | Human Gene Similar to TREMBLNEW-ID: E1264534 ENDONUCLEASE III - MYCOBACTERIUM TUBERCULOSIS, 226 aa. | 1.90E−34 |
33-34




cg29691725




 231




TGCGAGCACC




T




C






SILENT-




nuclease




Human Gene Similar to TREMBLNEW-ID:




1.90E−34









CCGGTATTCCG








NONCODING





E1264534 ENDONUCLEASE III -









GCAT[T/C]TTCA












MYCOBACTERIUM TUBERCULOSIS


, 226 aa.









CGCTGATGAG









CTATGCCCGCA






35-36




cg29691725




 232




GCGAGCACCC




T




C






SILENT-




nuclease




Human Gene Similar to TREMBLNEW-ID:




1.90E−34









CGGTATTCCGG








NONCODING





E1264534 ENDONUCLEASE III -









CATT[T/C]TCAC












MYCOBACTERIUM TUBERCULOSIS


, 226 aa.









GCTGATGAGCT









ATGCCCGCAT






37-38




cg29691725




 247




TTCCGGCATTT




T




C






SILENT-




nuclease




Human Gene Similar to TREMBLNEW-ID:




1.90E−34









TCACGCTGATG








NONCODING





E1264534 ENDONUCLEASE III -









AGC[T/C]ATGC












MYCOBACTERIUM TUBERCULOSIS


, 226 aa.









CCGCATCTCCC









CCGCAGCCAG






39-40




cg29691725




 300




AACATCAAGGC




C




T






SILENT-




nuclease




Human Gene Similar to TREMBLNEW-ID:




1.90E−34









GGCCATCACC








NONCODING





E1264534 ENDONUCLEASE III -









GCGC[C/T]GAA












MYCOBACTERIUM TUBERCULOSIS


, 226 aa.









AACGTTCATCC









CCCTCATCGAC






41-42




cg29691725





 80




ACGCCCAGATC




G




A






SILENT-




nuclease




Human Gene Similar to TREMBLNEW-ID:




1.90E−34









GTCACGACGG








NONCODING





E1264534 ENDONUCLEASE III -









TCAC[G/A]CCC












MYCOBACTERIUM TUBERCULOSIS


, 226 aa.









CTCGATGTCGC









CCGAGACATG









G






43-44




cg29691725




 89




TCGTCACGACG




T




C






SILENT-




nuclease




Human Gene Similar to TREMBLNEW-ID:




1.90E−34









GTCACGCCCCT








NONCODING





E1264534 ENDONUCLEASE III -









CGA[T/C]GTCG












MYCOBACTERIUM TUBERCULOSIS


, 226 aa.









CCCGAGACAT









GGAGATCGTC









G






45-46




cg43986329




1496




GGGTTTCCCAT




C




T






SILENT-




protease




Human Gene Similar to SWISSNEW-ID: P55032




2.20E−48




20









CAGCATTGCCG








NONCODING





MATRILYSIN PRECURSOR (EC 3.4.24.23)





(16q13)









TCC[C/T]GGGT










(PUMP 1 PROTEASE) (UTERINE METALLO-









GTAGAGTCTCT










PROTEINASE) (MATRIX METALLO-









CGCTGGGGCA










PROTEINASE-7) (MMP-7) (MATRIN) -


FELIS




















SILVESTRIS CATUS


(CAT), 262 aa (fragment).|
















pcls: SWISSPROT-ID: P55032 MATRILYSIN
















PRECURSOR (EC 3.4.24.23) (PUMP 1
















PROTEASE) (UTERINE METALLOPRO-
















TEINASE) (MATRIX METALLOPROTEINASE-7)
















(MMP-7) (MATRIN) -


FELIS SILVESTRIS CATUS


















(CAT), 262 aa (fragment).






47-48




cg20438082




 137




CTTCAACTGCT




A




T




Thr




Thr




SILENT-




ribosomalprot




Human Gene Similar to SWISSPROT-ID: P04447




2.80E−40









TTAGCAACATC








CODING





50S RIBOSOMAL PROTEIN L1 -


BACILLUS











CAT[A/T]GTTAC












STEAROTHERMOPHILUS


, 232 aa.









AGTACCAGTTT









TAGGGTTTG






49-50




cg20438082




 206




CAAGGACACGT




A




T




Leu




Leu




SILENT-




ribosomalprot




Human Gene Similar to SWISSPROT-ID: P04447




2.80E−40









CCAAGACGTCC








CODING





50S RIBOSOMAL PROTEIN L1 -


BACILLUS











AAC[A/T]AGAGC












STEAROTHERMOPHILUS


, 232 aa.









CATCATATCAG









GTGTAGCGA






51-52




cg39547655




 545




GCTGGATCCAC




A




G




Gly




Gly




SILENT-




ribosomalprot




Human Gene Similar to SWISSPROT-ID: P94977




2.30E−38









AGCGAGCGGA








CODING





50S RIBOSOMAL PROTEIN L20 -









AGTC[A/G]CCC












MYCOBACTERIUM TUBERCULOSIS


, 129 aa.|









TTCTTAGCACG










pcls: SPTREMBL-ID: P94977 50S RIBOSOMAL









ACGGTCACGG










PROTEIN L20 - MYCOBACTERIUM









A






53-54




cg29358731




 181




TCTCGCCTAGG




G




T




Arg




Arg




SILENT-




struct




Human Gene Similar to SWISSPROT-ID: P93087




2.90E−47









TTGGTCATGAC








CODING





CALMODULIN -


CAPSICUM ANNUUM


(BELL









ATG[G/T]CGAA










PEPPER), 148 aa.









GCTCAGCAGC









GGAGATGAAG









C






55-56




cg29358731




 196




TCATGACATGG




G




A




Ser




Ser




SILENT-




struct




Human Gene Similar to SWISSPROT-ID: P93087




2.90E−47









CGAAGCTCAG








CODING





CALMODULIN -


CAPSICUM ANNUUM


(BELL









CAGC[G/A]GAG










PEPPER), 148 aa.









ATGAAGCCGTT









CTGGTCCTTGT






57-58




cg29358731




 229




AGCCGTTCTGG




A




G




Arg




Arg




SILENT-




struct




Human Gene Similar to SWISSPROT-ID: P93087




2.90E−47









TCCTTGTCGAA








CODING





CALMODULIN -


CAPSICUM ANNUUM


(BELL









CAC[A/G]CGGA










PEPPER), 148 aa.









AGGCCTCCTTG









AGCTCCTCCT






59-60




cg29358731




 301




TGCGTGCCATC




A




G




Pro




Pro




SILENT-




struct




Human Gene Similar to SWISSPROT-ID: P93087




2.90E−47









AGGTTGAGAAA








CODING





CALMODULIN -


CAPSICUM ANNUUM


(BELL









CTC[A/G]GGAA










PEPPER), 148 aa.









AGTCGATGGTT









CCATTGCCAT






61-62




cg29358731




 304




GTGCCATCAG




A




G




Phe




Phe




SILENT-




struct




Human Gene Similar to SWISSPROT-ID: P93087




2.90E−47









GTTGAGAAACT








CODING





CALMODULIN -


CAPSICUM ANNUUM


(BELL









CAGG[A/G]AAG










PEPPER), 148 aa.









TCGATGGTTCC









ATTGCCATCAG






63-64




cg44127439




 554




TGCTGCGAAGA




C




T




Glu




Glu




SILENT-




synthase




Human Gene Similar to SWISSPROT-ID: P19206




7.90E−45









ATCTCGAGGG








CODING





BIOTIN SYNTHASE (EC 2.8.1.6) (BIOTIN









CCTC[C/T]TCGC










SYNTHETASE) -


BACILLUS SPHAERICUS


,









GGGTAATGGA










332 aa.









CTCCCCCCTCA






65-66




cg44127439




 704




AGTCCATGAGG




A




G






SILENT-




synthase




Human Gene Similar to SWISSPROT-ID: P19206




7.90E−45









CAGCACCAATG








NONCODING





BIOTIN SYNTHASE (EC 2.8.1.6) (BIOTIN









TGG[A/G]CTAC










SYNTHETASE) -


BACILLUS SPHAERICUS


,









CCCCGGAAAT










332 aa.









GGTGTCGTCGT






67-68




cg25321479




 101




AGAAAGTGTTG




G




A




Ile




Ile




SILENT-




transport




Human Gene Similar to TREMBLNEW-ID:




9.80E−47









GCTCCCAGGG








CODING





E1245760 PUTATIVE COBALT TRANSPORT









TGGA[G/A]ATTC










PROTEIN -


STREPTOMYCES COELICOLOR


,









CCCCGTGAGC










257 aa.









CAGGAGCAGG









G






69-70




cg25321479




 113




CTCCCAGGGT




A




G




Ala




Ala




SILENT-




transport




Human Gene Similar to TREMBLNEW-ID:




9.80E−47









GGAGATTCCCC








CODING





E1245760 PUTATIVE COBALT TRANSPORT









CGTG[A/G]GCC










PROTEIN -


STREPTOMYCES COELICOLOR


,









AGGAGCAGGG










257 aa.









CCTGGAAGATG









A






71-72




cg25321479




 197




GTTTGAACAGT




C




T




Thr




Thr




SILENT-




transport




Human Gene Similar to TREMBLNEW-ID:




9.80E−47









GCCGCCCCAA








CODING





E1245760 PUTATIVE COBALT TRANSPORT









CTCC[C/T]GTTC










PROTEIN -


STREPTOMYCES COELICOLOR


,









CGGTGGGGTG










257 aa.









CGAGGACGAT









C






73-74




cg25321479




 245




ATCCCGTAACG




G




A




Ala




Ala




SILENT-




transport




Human Gene Similar to TREMBLNEW-ID:




9.80E−47









CTGGGCAGTTT








CODING





E1245760 PUTATIVE COBALT TRANSPORT









GAT[G/A]GCCG










PROTEIN -


STREPTOMYCES COELICOLOR


,









AGAGCACAAAG










257 aa.









GTGAAAGCGC






75-76




cg25321479




 35




GCCCACGTAGT




A




G




Tyr




Tyr




SILENT-




transport




Human Gene Similar to TREMBLNEW-ID:




9.80E−47









TTCTTGGTGAG








CODING





E1245760 PUTATIVE COBALT TRANSPORT









CTT[A/G]TAAAA










PROTEIN -


STREPTOMYCES COELICOLOR


,









GGCGTACCCA










257 aa.









GCCCACGGTC






77-78




cg25321479




 38




CACGTAGTTTC




A




G




Phe




Phe




SILENT-




transport




Human Gene Similar to TREMBLNEW-ID:




9.80E−47









TTGGTGAGCTT








CODING





E1245760 PUTATIVE COBALT TRANSPORT









ATA[A/G]AAGG










PROTEIN -


STREPTOMYCES COELICOLOR


,









CGTACCCAGC










257 aa.









CCACGGTCCC









G






79-80




cg25321479




 83




GTCCCGCGATA




G




A




Asn




Asn




SILENT-




transport




Human Gene Similar to TREMBLNEW-ID:




9.80E−47









GCCATCGAGAA








CODING





E1245760 PUTATIVE COBALT TRANSPORT









AGT[G/A]TTGG










PROTEIN -


STREPTOMYCES COELICOLOR


,









CTCCCAGGGT










257 aa.









GGAGATTCCCC






81-82




cg25321479




 14




ACGCGTCAGG




A




G






SILENT-




transport




Human Gene Similar to TREMBLNEW-ID:




9.80E−47









CCC[A/G]CGTA








NONCODING





E1245760 PUTATIVE COBALT TRANSPORT









GTTTCTTGGTG










PROTEIN -


STREPTOMYCES COELICOLOR


,









AGCTTATAAA










257 aa.






83-84




cg25321479




 17




ACGCGTCAGG




T




A






SILENT-




transport




Human Gene Similar to TREMBLNEW-ID:




9.80E−47









CCCACG[T/A]A








NONCODING





E1245760 PUTATIVE COBALT TRANSPORT









GTTTCTTGGTG










PROTEIN -


STREPTOMYCES COELICOLOR


,









AGCTTATAAAA










257 aa.









GG






85-86




cg39548335




 298




CTTCTGTATTC




G




A




Ala




Ala




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC: P43618




2.90E−48









TCGTTGTTACC








CODING




FIED




HYPOTHETICAL 41.3 KD PROTEIN IN SAP155-









GGC[G/A]GCCT










YMR31 INTERGENIC REGION -


Saccharomyces











TCCTGGGAGT












cerevisiae


(Baker's yeast), 361 aa.









GCTCATTATTC






87-88




cg43335190




 341




GTAGTTCATAT




C




T






SILENT-




UNCLASSI-




Human Gene Similar to SWISSNEW-ACC: P32803




7.60E−48









CTATTTACTTTT








NONCODING




FIED




ENDOSOMAL P24B PROTEIN PRECURSOR









GC[C/T]TACATA










(24 KD ENDOMEMBRANE PROTEIN) (BASIC









CGATTACATAC










24 KD LATE ENDOCYTIC INTERMEDIATE









ACGATTGG










COMPONENT) -


Saccharomyces cerevisiae


















(Baker's yeast), 203 aa.






89-90




cg43335190




 411




AATTCTCGGTT




T




C






SILENT-




UNCLASSI-




Human Gene Similar to SWISSNEW-ACC: P32803




7.60E−48









TCATACTTTTTA








NONCODING




FIED




ENDOSOMAL P24B PROTEIN PRECURSOR









CC[T/C]TGATCC










(24 KD ENDOMEMBRANE PROTEIN) (BASIC









TTCCACTGTTT










24 KD LATE ENDOCYTIC INTERMEDIATE









TTCCCTGT










COMPONENT) -


Saccharomyces cerevisiae


















(Baker's yeast), 203 aa.






91-92




cg43298420




 625




GGTGACCGTG




G




T




Pro




Pro




SILENT-




UNCLASSI-




Human Gene Similar to TREMBLNEW-ACC:




5.30E−47









GTCTGAAAGAA








CODING




FIED




MD27784 PTD001 -


HOMO SAPIENS


(HUMAN),









GGCT[G/T]GGT










218 aa.









TGAACTGGTAC









AGCTTCAGGAC






93-94




cg44928880




 161




ACTGAGGTTGG




C




T






SILENT-




UNCLASSI-




Human Gene Similar to SPTREMBL-ACC: O17549




5.30E−47









GTTTCAGACCA








NONCODING




FIED




M18.8 PROTEIN -


CAENORHABDITIS ELEGANS


,









AGA[C/T]ACTG










447 aa.









GATTCTCCTAG









TTAAGATAAA






95-96




cg39410689




 99




ATCACCACCAC




C




T




His




His




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC: P43638




5.50E−46









CGCCACCACC








CODING




FIED




MAP-HOMOLOGOUS PROTEIN 1 -









ATCA[C/T]ACGG












Saccharomyces cerevisiae


(Baker's yeast), 1398 aa.









AAGATGCTCCT









GCACCTAAGA






97-98




cg20297086




 176




TGCACGACAAG




T




C




Asn




Asn




SILENT-




UNCLASSI-




Human Gene Similar to SPTREMBL-ACC: P95543




9.00E−45









TACCCGGAGCT








CODING




FIED




ELONGATION FACTOR TU1 -


PLANOBISPORA











GAA[T/C]GAGG












ROSEA


, 397 aa.









AGTCGCCGTTC









GACCAGATCG






99-100




cg20297086




 239




AGGAGCGTCA




C




T




Ile




Ile




SILENT-




UNCLASSI-




Human Gene Similar to SPTREMBL-ACC: P95543




9.00E−45









GCGCGGCATC








CODING




FIED




ELONGATION FACTOR TU1 -


PLANOBISPORA











ACCAT[C/T]CG












ROSEA


, 397 aa.









ATCGCCCACAT









CGAGTACCAGA






101-102




cg20297086




 248




AGCGCGGCAT




C




T




Ala




Ala




SILENT-




UNCLASSI-




Human Gene Similar to SPTREMBL-ACC: P95543




9.00E−45









CACCATCTCGA








CODING




FIED




ELONGATION FACTOR TU1 -


PLANOBISPORA











TCGC[C/T]CACA












ROSEA


, 397 aa.









TCGAGTACCAG









ACCGAGAAGC






103-104




cg39386301




1177




TATGTTAATGG




T




C






SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC: P32612




1.00E−43









TGAAGAAATTC








NONCODING




FIED




PAU2 PROTEIN -


Saccharomyces cerevisiae











ACC[T/C]CCGA










(Baker's yeast), 120 aa.









CCGTGGTATGT









CAATGTGAGA






105-106




cg39386301




 958




ATAATGATAAA




T




C






SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC: P32612




1.00E−43









ATAGTTCGTTC








NONCODING




FIED




PAU2 PROTEIN -


Saccharomyces cerevisiae











ATA[T/C]ACTCC










(Baker's yeast), 120 aa.









GGTGGGATCAT









TGCAGAAAT






107-108




cg39515238




 192




TCAAGATCTAT




C




A






SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC: P53845




3.60E−40









ATTGCACACCA








NONCODING




FIED




HYPOTHETICAL 35.5 KD PROTEIN IN PIK1-









GAG[C/A]TGTT










POL2 INTERGENIC REGION -


Saccharomyces











GTTTTATACTA












cerevisiae


(Baker's yeast), 314 aa.









CAACTCATCT






109-110




cg29693502




 181




CAGCGGGGTA




G




A




Tyr




Tyr




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




3.00E−39









GCGCACCTGAT








CODING




FIED




Q10379 PROBABLE GLUTAMATE-AMMONIA-









CGAC[G/A]TATT










LIGASE ADENYLYLTRANSFERASE









CCAACAGCTCA










(EC 2.7.7.42) (GLUTAMINE-SYNTHETASE









TTCGTCAACT










ADENYLYLTRANSFERASE) (ATASE) -
















Mycobacterium






111-112




cg29693502




 226




TCAACTCCCGG




A




G




Ser




Ser




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




3.00E−39









TCGCCGGCAC








CODING




FIED




Q10379 PROBABLE GLUTAMATE-AMMONIA-









CGTG[A/G]CTA










LIGASE ADENYLYLTRANSFERASE









GCCCGCAGCA










(EC 2.7.7.42) (GLUTAMINE-SYNTHETASE









GGGCCTGGGA










ADENYLYLTRANSFERASE) (ATASE) -









TT










Mycobacterium






113-114




cg29693502




 319




CCAGGCTCCG




T




C




Arg




Arg




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




3.00E−39









TACCATCGGAC








CODING




FIED




Q10379 PROBABLE GLUTAMATE-AMMONIA-









CGGA[T/C]CGG










LIGASE ADENYLYLTRANSFERASE









CCTTCCGGGC










(EC 2.7.7.42) (GLUTAMINE-SYNTHETASE









GCAGATCGGC










ADENYLYLTRANSFERASE) (ATASE) -









AT










Mycobacterium






115-116




cg29693502




 385




GGTCCGCGCC




G




A




Asn




Asn




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




3.00E−39









GTGTTTGCCGA








CODING




FIED




Q10379 PROBABLE GLUTAMATE-AMMONIA-









GCAG[G/A]TTG










LIGASE ADENYLYLTRANSFERASE









CGTAGCTTAGT










(EC 2.7.7.42) (GLUTAMINE-SYNTHETASE









GACGATTTTCA










ADENYLYLTRANSFERASE) (ATASE) -
















Mycobacterium






117-118




cg29693502




 388




CCGCGCCGTG




G




A




Arg




Arg




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




3.00E−39









TTTGCCGAGCA








CODING




FIED




Q10379 PROBABLE GLUTAMATE-AMMONIA-









GGTT[G/A]CGT










LIGASE ADENYLYLTRANSFERASE









AGCTTAGTGAC










(EC 2.7.7.42) (GLUTAMINE-SYNTHETASE









GATTTTCAGCG










ADENYLYLTRANSFERASE) (ATASE) -
















Mycobacterium






119-120




cg29693502




 397




GTTTGCCGAGC




A




G




Thr




Thr




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




3.00E−39









AGGTTGCGTAG








CODING




FIED




Q10379 PROBABLE GLUTAMATE-AMMONIA-









CTT[A/G]GTGAC










LIGASE ADENYLYLTRANSFERASE









GATTTTCAGCG










(EC 2.7.7.42) (GLUTAMINE-SYNTHETASE









CCTTCTCCC










ADENYLYLTRANSFERASE) (ATASE) -
















Mycobacterium






121-122




cg29693502




 403




CGAGCAGGTT




G




A




Ile




Ile




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




3.00E−39









GCGTAGCTTAG








CODING




FIED




Q10379 PROBABLE GLUTAMATE-AMMONIA-









TGAC[G/A]ATTT










LIGASE ADENYLYLTRANSFERASE









TCAGCGCCTTC










(EC 2.7.7.42) (GLUTAMINE-SYNTHETASE









TCCCCGACTC










ADENYLYLTRANSFERASE) (ATASE) -
















Mycobacterium






123-124




cg43935044




 511




GAGTCTGGGG




G




T






SILENT-




UNCLASSI-




Human Gene Similar to SPTREMBL-ACC: P70429




3.50E−39









CTGGCTGGGC








NONCODING




FIED




ENA-VASODILATOR STIMULATED PHOSPHO-









TTCTG[G/T]CTG










PROTEIN (ENA-VASP LIKE PROTEIN) -


MUS











TCCTCTGTCGC












MUSCULUS


(MOUSE), 393 aa.









CGGATGGGCT









C






125-126




cg27850036




 116




TGAATGGTGGC




G




A




Asp




Asp




SILENT-




UNCLASSI-




Human Gene Similar to SWISSNEW-ACC: P11653




1.00E−38









AGACCGGCGT








CODING




FIED




METHYLMALONYL-COA MUTASE ALPHA-









AGGT[G/A]TCTA










SUBUNIT (EC 5.4.99.2) (MCM-ALPHA) -









GCCAGTCCATG












Propionibacterium freudenreichii











TCACCGTAGA












shermanii


, 727 aa.






127-128




cg27850036




 14




ACGCGTTGGA




T




C




Lys




Lys




SILENT-




UNCLASSI-




Human Gene Similar to SWISSNEW-ACC: P11653




1.00E−38









CTC[T/C]TTAGC








CODING




FIED




METHYLMALONYL-COA MUTASE ALPHA-









GGTGGAGAAT










SUBUNIT (EC 5.4.99.2) (MCM-ALPHA) -









CCGGCGTACT












Propionibacterium freudenreichii shermanii


, 727 aa.






129-130




cg27850036




 41




TAGCGGTGGA




G




A




Arg




Arg




SILENT-




UNCLASSI-




Human Gene Similar to SWISSNEW-ACC: P11653




1.00E−38









GAATCCGGCG








CODING




FIED




METHYLMALONYL-COA MUTASE ALPHA-









TACTG[G/A]CG










SUBUNIT (EC 5.4.99.2) (MCM-ALPHA) -









AATCGTCCACG












Propionibacterium freudenreichii shermanii


, 727 aa.









GCCGGAAGGC









AT






131-132




cg42331882




 466




TGTGCGTAGG




G




A






SILENT-




UNCLASSI-




Human Gene Similar to SPTREMBL-ACC: Q15407




1.10E−37




18









GAAAGTCAGTG








NONCODING




FIED




RTSBETA -


HOMO SAPIENS


(HUMAN), 416 aa.









TCGT[G/A]CAG









CTCCCAGGAG









CCTCCTGAGC









GT






133-134




cg43945926




 203




ACAACGCGAG




A




G






SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




1.10E−37









CCAGGAGTACT








NONCODING




FIED




Q06003 GOLIATH PROTEIN (G1 PROTEIN) -









ACAC[A/G]GCG












Drosophila melanogaster


(Fruit fly), 284 aa.









CTCATCAACGT









GACGGTGCAG









G






135-136




cg39575634




 12




AGATCTGGGAC




A




G




Thr




Thr




SILENT-




UNCLASSI-




Human Gene Similar to SPTREMBL-ACC: Q63965




3.00E−37









[A/G]ATGTCTG








CODING




FIED




TRICARBOXYLATE CARRIER -


RATTUS











GGGAAGTGCC












NORVEGICUS


(RAT), 357 aa (fragment).









ACCCAACA






137-138




cg39575634




 120




TCTTCACGGTT




C




T




Asn




Asn




SILENT-




UNCLASSI-




Human Gene Similar to SPTREMBL-ACC: Q63965




3.00E−37









ACTGATCCCAG








CODING




FIED




TRICARBOXYLATE CARRIER -


RATTUS











AAA[C/T]ATCCT












NORVEGICUS


(RAT), 357 aa (fragment).









TTTAACGAACG









AACAGCTAG






139-140




cg39575634




 165




AGCTAGAGAAT




A




G




Val




Val




SILENT-




UNCLASSI-




Human Gene Similar to SPTREMBL-ACC: Q63965




3.00E−37









GCGAGGAAAG








CODING




FIED




TRICARBOXYLATE CARRIER -


RATTUS











TGGT[A/G]CAC












NORVEGICUS


(RAT), 357 aa (fragment).









GATTACAGGCA









AGGAATCGTTC






141-142




cg39575634




 214




TCCTGCCGGC




T




C




Leu




Leu




SILENT-




UNCLASSI-




Human Gene Similar to SPTREMBL-ACC: Q63965




3.00E−37









CTCACGGAAAA








CODING




FIED




TRICARBOXYLATE CARRIER -


RATTUS











TGAG[T/C]TATG












NORVEGICUS


(RAT), 357 aa (fragment).









GAGAGCGAAG









TACGCGT






143-144




cg27826036




 118




GTAGGGCGAC




C




T




Glu




Glu




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




3.40E−36









GGCGTATTTAT








CODING




FIED




Q10776 PUTATIVE LONG-CHAIN-FATTY-









GTCC[C/T]TCG










ACID--COA LIGASE (EC 6.2.1.3) (LONG-CHAIN









CCAAGCACGA










ACYL-COA SYNTHETASE) (LACS) -









CAGCGTTAGAC









A






145-146




cg27826036




 163




TAGACAGGTAC




G




C




Leu




Leu




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




3.40E−36









GGGCAGTTGG








CODING




FIED




Q10776 PUTATIVE LONG-CHAIN-FATTY-









CCAT[G/C]AGA










ACID--COA LIGASE (EC 6.2.1.3) (LONG-CHAIN









GTGGCCTCGA










ACYL-COA SYNTHETASE) (LACS) -









CCTTCTGTGGG









G






147-148




cg42538578




 207




AGAGACATTGC




G




A




Cys




Cys




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




7.40E−34




 8









CCTCCACTGCT








CODING




FIED




Q09753 BETA-DEFENSIN 1 PRECURSOR









GAC[G/A]CAATT










(HBD-1) (DEFENSIN, BETA 1) -


Homo sapiens











GTAATGATCAG










(Human), 68 aa.









ATCTGTGGC






149-150




cg42538578




 365




GGATTTCAGGA




G




A






SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




7.40E−34




 8









ACTGGGGAGA








NONCODING




FIED




Q09753 BETA-DEFENSIN 1 PRECURSOR









GGCT[G/A]GCT










(HBD-1) (DEFENSIN, BETA 1) -


Homo sapiens











CCTTTGGAGGC










(Human), 68 aa.









TGAGCTGACAG






151-152




cg44002673




1167




AGCAAGCTCTT




A




G






SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC: P32740




7.40E−34




13









TGAAACCTGAG








NONCODING




FIED




HYPOTHETICAL 31.0 KD PROTEIN R107.2 IN









CCC[A/G]CGCA










CHROMOSOME III -


Caenorhabditis elegans


,









GACCAGAAGTA










285 aa.









AACAGGCACC






153-154




cg39710199




 535




AGCCGGTGCG




C




T






SILENT-




UNCLASSI-




Human Gene Similar to TREMBLNEW-ACC:




1.20E−33









GCCTGAGGTG








NONCODING




FIED




CAB50754 PUTATIVE INTEGRAL MEMBRANE









CGGGG[C/T]GG










TRANSPORT PROTEIN -


STREPTOMYCES











AGATCGAGTGT












COELICOLOR


, 269 aa.









CGTCATGTCAA









T






155-156




cg39710199




 736




TGTGGGTAGTG




C




A






SILENT-




UNCLASSI-




Human Gene Similar to TREMBLNEW-ACC:




1.20E−33









AGCACGACGG








NONCODING




FIED




CAB50754 PUTATIVE INTEGRAL MEMBRANE









AGAC[C/A]CCG










TRANSPORT PROTEIN -


STREPTOMYCES











TCATGACGCAT












COELICOLOR


, 269 aa.









TTGCTCAACGA






157-158




cg38821538




 155




ATCACCTGAGG




T




C




Asp




Asp




SILENT-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC: P39195




3.50E−33









TCCGGAGTTCA








CODING




FIED




!!!! ALU SUBFAMILY SX WARNING ENTRY -









AGA[T/C]CAGC












Homo sapiens


(Human), 591 aa.









CTGGCCAACAT









GATGAAACCC






159-160




cg21632104




 261




CGTCGCAGTAC




A




G




Leu




Leu




SILENT-




UNCLASSI-




Human Gene Similar to TREMBLNEW-ACC:




3.00E−32









GTTCTGGCCTG








CODING




FIED




CAB38487 PUTATIVE HELICASE -









TCA[A/G]CGTTT












STREPTOMYCES COELICOLOR


, 815 aa.









TGCATATCCCG









GCAAAGGCC






161-162




cg20370177




 370




CCCAAGCCAAT




A




C




Val




Val




SILENT-




UNCLASSI-




Human Gene Similar to TREMBLNEW-ACC:




1.50E−31









ACCAAGATGAT








CODING




FIED




CAB49211 HYPOTHETICAL 57.7 KD PROTEIN -









CGC[A/C]ACTG












PYROCOCCUS ABYSSI


, 533 aa.









GCATCATGTCT









CCCATGCCTT






163-164




cg14810223




 103




CGAGCAGCCC




A




G






SILENT-









GCCAGGACTCT








NONCODING









GGCT[A/G]CTG









GAGATGGGCG









CCCGGCTATC









GC






165-166




cg19882105




 125




GAGATCAGTGT




C




T






SILENT-









GATGATGCACA








NONCODING









GGA[C/T]GGAT









GCGGGAATCC









CAGCTCTTCAT






167-168




cg19882105




 131




AGTGTGATGAT




C




T






SILENT-









GCACAGGACG








NONCODING









GATG[C/T]GGG









AATCCCAGCTC









TTCATATGGCT






169-170




cg19885950




 67




TCCTCAAGTCT




G




T






SILENT-









TTGTTCAAATAT








NONCODING









CA[G/T]CTTTTC









AGCAAGACCTT









CATTAACT






171-172




cg20452710




 118




TGGCCTGGAG




A




G






SILENT-









AGAGGCGGGA








NONCODING









GGGAC[A/G]CT









GGCCTGGAGA









GAGGCGGGAG









GGA






173-174




cg20452710




 64




GACAGGGGAG




A




G






SILENT-









AGAGGCGGGA








NONCODING









GGGAC[A/G]CT









GGCCTGGAGA









GAGGCGGGAG









GGA






175-176




cg20454325




 152




ACATCCCTGCA




A




G






SILENT-









CTGTCACCAGC








NONCODING









CCG[A/G]CCCC









TTGTACCATGG









CAGGGTTGGG






177-178




cg20454325




 159




TGCACTGTCAC




G




A






SILENT-









CAGCCCGACC








NONCODING









CCTT[G/A]TACC









ATGGCAGGGTT









GGGCTGACTG






179-180




cg20595730




 225




CTCCAAAGACT




A




G






SILENT-









TGATTCCAAGA








NONCODING









AAC[A/G]TCTGT









GAAATTCACTA









AGTTTAAGA






181-182




cg20595730




 247




AACATCTGTGA




A




C






SILENT-









AATTCACTAAG








NONCODING









TTT[A/C]AGATA









TGAAGAGACAG









ACTAGTTAT






183-184




cg20610793




 156




GGGCCGACCC




C




T






SILENT-









GAGCAGATGT








NONCODING









GTCGT[C/T]ATC









GAGGACTCCG









CTTTCGGATTG









C






185-186




cg20610793




 165




CGAGCAGATGT




C




T






SILENT-









GTCGTCATCGA








NONCODING









GGA[C/T]TCCG









CTTTCGGATTG









CGTGCCGGAC






187-188




cg20610793




 198




TCGGATTGCGT




C




T






SILENT-









GCCGGACGGG








NONCODING









CTGC[C/T]GGA









GCGTGGGTTCT









CACGGTCGGA









C






189-190




cg20610793




 228




CGTGGGTTCTC




G




A






SILENT-









ACGGTCGGAC








NONCODING









GCAG[G/A]CTC









AAGGGCCAGG









GGGACATGTG









GG






191-192




cg20610793




 237




TCACGGTCGG




C




A






SILENT-









ACGCAGGCTC








NONCODING









AAGGG[C/A]CA









GGGGGACATG









TGGGTTCCCG









GGC






193-194




cg20610793




 267




GGGACATGTG




T




C






SILENT-









GGTTCCCGGG








NONCODING









CTGGA[T/C]GAT









GAGCGGGTGA









CCTTCTGGGAA









C






195-196




cg20610793




 284




GGGCTGGATG




T




C






SILENT-









ATGAGCGGGT








NONCODING









GACCT[T/C]CTG









GGAACCCCATC









GATGAGGGCG









T






197-198




cg20610793




 296




GAGCGGGTGA




A




G






SILENT-









CCTTCTGGGAA








NONCODING









CCCC[A/G]TCG









ATGAGGGCGT









GCGAGCTGAC









AC






199-200




cg20610793




 43




CGGATTGGCCT




A




G






SILENT-









CACAAGGCTG








NONCODING









GCTG[A/G]AAC









TGTTCGACACC









GTCCTTGGGGT






201-202




cg20711459




 261




CCCGACTGAA




A




G






SILENT-









GGCACGGATG








NONCODING









AGTTC[A/G]CC









GATCCCATATT









TGGAGTGGAG









AG






203-204




cg20723460




 148




TTCCCCGGCG




C




T






SILENT-









AAGAAAAAGGC








NONCODING









GTCG[C/T]CCAT









TCCTCTTCCAA









AACGCTACAA






205-206




cg20723460




 193




CTACAACAAAA




C




T






SILENT-









ACCACCACGCT








NONCODING









TCC[C/T]TTCCT









TCTTCCTTGCC









CCTTTCCCT






207-208




cg20724182




 184




GATCATTGTAG




G




A






SILENT-









GCTATTTCAAA








NONCODING









ACC[G/A]CCAA









ACAAGCCATGA









ACGCAGCAAA






209-210




cg20724182




 197




TATTTCAAAAC




T




C






SILENT-









CGCCAAACAAG








NONCODING









CCA[T/C]GAAC









GCAGCAAAACA









ATTCCACTGG






211-212




cg20724182




 95




TGCTGGGGGC




T




A






SILENT-









GCTTCACAGAC








NONCODING









AACA[T/A]CAAA









TACGCTGTAGC









TGCCCAATAT






213-214




cg20724182




 99




GGGGGCGCTT




A




G






SILENT-









CACAGACAACA








NONCODING









TCAA[A/G]TACG









CTGTAGCTGCC









CAATATTGGA






215-216




cg20724478




 247




ACTATCTGGGA




C




T






SILENT-









GTTGGGGCCC








NONCODING









TGCA[C/T]GGC









ACTGGAACCAA









ACCTGAGGCT









G






217-218




cg20724478




 250




ATCTGGGAGTT




C




T






SILENT-









GGGGCCCTGC








NONCODING









ACGG[C/T]ACT









GGAACCAAACC









TGAGGCTGGG









G






219-220




cg20724478




 298




GGGAGCTCGG




C




T






SILENT-









CCTGGCTGGG








NONCODING









ATACG[C/T]GAT









GTCGTCAACGC









CAGCCCGTGG









C






221-222




cg20724478




 90




GTTGCCGAAAT




C




T






SILENT-









TGGGGCCGAT








NONCODING









GGTG[C/T]CCA









TGTTGGGCAGT









CTGACATGCCG






223-224




cg20726641




 101




TCAAGAAATTT




C




T






SILENT-









GCCATTCTTGA








NONCODING









CCA[C/T]GACCT









GACCGAGGATT









CTCACTCAG






225-226




cg20726641




 119




TTGACCACGAC




T




A






SILENT-









CTGACCGAGG








NONCODING









ATTC[T/A]CACT









CAGTGACGAC









CAGTCTCAAGG






227-228




cg20726641




 224




CTGACGAAAAC




T




C






SILENT-









GATCAACCGG








NONCODING









GCGC[T/C]TCA









AAGGGAAGCG









ACGCTTCATGA









C






229-230




cg20726641




 280




CTGTCGTCGTC




C




T






SILENT-









GATATTCCACT








NONCODING









GCG[C/T]TGGT









CCGATATGGAT









GCGCAGGGAC






231-232




cg20726641




 307




GGTCCGATATG




T




C






SILENT-









GATGCGCAGG








NONCODING









GACA[T/C]GTTA









ATAACGTTCGT









ATTAGCGAGC






233-234




cg20726641




 310




CCGATATGGAT




T




C






SILENT-









GCGCAGGGAC








NONCODING









ATGT[T/C]AATA









ACGTTCGTATT









AGCGAGCTCG






235-236




cg20726641




 316




TGGATGCGCA




C




T






SILENT-









GGGACATGTTA








NONCODING









ATAA[C/T]GTTC









GTATTAGCGAG









CTCGAACA






237-238




cg20728487




 53




GGAAACTCATC




T




C






SILENT-









GGCAATATCGT








NONCODING









TGC[T/C]GCTTG









GGAGACTGGC









TTCATGCTGG






239-240




cg20730743




 23




ACGCGTACTG




A




G






SILENT-









GCGGATCTCA








NONCODING









GT[A/G]CGATAA









CCCACCAGATT









GCCGGTGA






241-242




cg20730743




 26




ACGCGTACTG




A




G






SILENT-









GCGGATCTCA








NONCODING









GTACG[A/G]TAA









CCCACCAGATT









GCCGGTGAAC









T






243-244




cg20730743




 95




GCTGTCCGGC




A




G






SILENT-









CCCACCGGCG








NONCODING









AGTTT[A/G]TCG









AGCTGGGAGG









GATCGATTTTC









C






245-246




cg20744814




 84




GGCACCCGGG




C




gap






SILENT-









TGCTGCTGGC








NONCODING









CATGG[C/gap]C









ACCCACGAAG









CTCTCCCTGCC









CCC






247-248




cg21147609




 219




GTTCCATGCCT




C




T






SILENT-









TTCTAGACCCC








NONCODING









AGG[C/T]CCTTT









CCTGCATGATT









TTATCAGCA






249-250




cg21147791




 282




AATAAAGTGTT




G




A






SILENT-









TCCTTGAGTCC








NONCODING









TGT[G/A]AGTTG









CTCTAGCAAAT









TTATCAATC






251-252




cg21148047




 203




ATGGTGATTCC




A




G






SILENT-









TCAAGAAATTA








NONCODING









GAA[A/G]CAGA









ATTACCCTATG









ATCCAGCATT






253-254




cg21148203




 236




TATGTTTGCTG




T




C






SILENT-









GGGGAGTGGG








NONCODING









TGGG[T/C]TGC









AGAACTTAAGA









CCAGGACAATT






255-256




cg21150589




 135




CCTTTGAAATT




A




G






SILENT-









CGATTTCCTTC








NONCODING









CCC[A/G]GGTG









AAAGAGGAGAA









CAGATTCTAC






257-258




cg21395558




 63




ATTGACAGAGT




G




T






SILENT-









GACATTTGGGC








NONCODING









AAC[G/T]CGTG









AAGGAAGTGG









GTGGAGGAGG









T






259-260




cg21395558




 65




TGACAGAGTGA




G




T






SILENT-









CATTTGGGCAA








NONCODING









CGC[G/T]TGAA









GGAAGTGGGT









GGAGGAGGTG









G






261-262




cg21395558




 72




GTGACATTTGG




A




G






SILENT-









GCAACGCGTG








NONCODING









AAGG[A/G]AGT









GGGTGGAGGA









GGTGGCAGCC









AG






263-264




cg21415668




 138




AGGACTGGTCA




C




G






SILENT-









GGGAGGAGTT








NONCODING









AGGG[C/G]AGG









AGGACTGGTCA









GGGAGGAGTT









A






265-266




cg21417734




 43




CAACGGGTTAC




C




G






SILENT-









CCCGGCGCAC








NONCODING









CTGG[C/G]TTT









GCCCGATCACA









GCGGCACGCA









T






267-268




cg21428517




 142




CCATGCCCATC




A




G






SILENT-









CCGGTGCCGC








NONCODING









AGAA[A/G]AAG









ATTCCTCGATC









GGCTTTTCCGT






269-270




cg21428762




 113




TGGTCGTGGTC




A




G






SILENT-









TCATCAGAGGT








NONCODING









GAA[A/G]ACGA









TGAGCGGGGT









GCTCGGACGC









A






271-272




cg21428762




 149




GGGTGCTCGG




A




G






SILENT-









ACGCAGACGA








NONCODING









GCGAT[A/G]CG









ACGGGCGGTG









TCACCGGACTT









GG






273-274




cg21428762




 65




ACACCGGGGT




G




A






SILENT-









AACGACGGCG








NONCODING









TGAGC[G/A]CC









CCAGACCCAG









GCGAGGGTCT









TGG






275-276




cg21429119




 356




TTGGTAGGCCA




C




T






SILENT-









AGGCAGGACG








NONCODING









ACCA[C/T]TTGA









GCCTGGGAATT









TGAAACCAGC






277-278




cg21429119




 393




AATTTGAAACC




A




G






SILENT-









AGCCTGGGCA








NONCODING









ACAT[A/G]GTGA









GTCTTTGTTTC









TACAAGAAAT






279-280




cg21429119




 408




TGGGCAACATA




C




T






SILENT-









GTGAGTCTTTG








NONCODING









TTT[C/T]TACAA









GAAATTTAAAA









AAAAAATTA






281-282




cg21429803




 373




TGCACCCGGC




C




T






SILENT-









GTGCCCTGAAA








NONCODING









CACA[C/T]GCG









TGTGCCCCGAA









ATACCTGCATT






283-284




cg21433543




 205




GGTGGATCTG




C




T






SILENT-









GTCGGGATCG








NONCODING









GTGAC[C/T]ACT









CTGGTCATCGT









CGATTATGCGA






285-286




cg21433543




 239




CATCGTCGATT




C




T






SILENT-









ATGCGACGAC








NONCODING









CTTC[C/T]TACC









ACTGAAGTTAT









GGCGTCGCTG






287-288




cg21433543




 269




ACTGAAGTTAT




G




gap






SILENT-









GGCGTCGCTG








NONCODING









CGTA[G/gap]CC









GAGGCTGGGG









TAGCGCTCCTG









GG






289-290




cg21433543




 293




AGCCGAGGCT




G




A






SILENT-









GGGGTAGCGC








NONCODING









TCCTG[G/A]GC









GGAATCGTCCT









GACGCGGCCG









CC






291-292




cg21435199




 96




AAATTTAATAAAA




G




A






SILENT-









TAAATTATAAA








NONCODING









GA[G/A]CTCCT









CTTACCTAGAA









ATAATTATT






293-294




cg21637172




 37




CGGTTGGCCA




C




T






SILENT-









AGCCTGGCACT








NONCODING









CAAA[C/T]GTCC









GCCTAACCTGG









GGTCTTTATT






295-296




cg21643872




 325




GGTTGAGTGG




A




G






SILENT-









GACGCCTTCTA








NONCODING









CGAG[A/G]AGC









ACCCTGAGCTT









GACCTGGAAA









G






297-298




cg21657573




 102




AAAAAGGTTAA




G




C






SILENT-









AGATCAGACAG








NONCODING









ACA[G/C]CTGA









CCTTACTGCCC









TCAATGGCCA






299-300




cg21657879




 270




CCAGGGAAAG




C




G






SILENT-









GCAGTCCCCCT








NONCODING









CCCC[C/G]ACA









GCAGTCACGAA









CCTCAGAAGCC






301-302




cg21659205




 482




CCAAACAATCC




G




A






SILENT-









AGCTTGCTCCC








NONCODING









CTC[G/A]ACCA









CTCAGAACAAA









CGCCCTAAGT






303-304




cg21660634




 198




CCACGTGACG




T




C






SILENT-









ACCGGAACATC








NONCODING









ACTG[T/C]GAC









GCTTCACTCGG









GCAACCGGTC









G






305-306




cg21660634




 92




GCCACGGCTC




C




T






SILENT-









GGTGAATCCGA








NONCODING









CTCG[C/T]GGG









GCCAACACAAC









GGCCTCACCC









A






307-308




cg21661814




 164




CGCCGAAATC




C




G






SILENT-









GGTGACGATG








NONCODING









GCCTT[C/G]GC









GTGGCCAATGT









GGAGGTAGCC









GT






309-310




cg21661814




 239




GGACGCGCCC




T




G






SILENT-









GCCGTAGGTG








NONCODING









TCCTG[T/G]TG









GATGTCCGCG









CGAACAACCTG









AT






311-312




cg21661814




 29




CGTGGACAAC




G




A






SILENT-









GTGGGCCGGG








NONCODING









GAGTA[G/A]CC









TAACCACTCAA









TGTCTGCAATG









A






313-314




cg21661814




 52




TAGCCTAACCA




T




C






SILENT-









CTCAATGTCTG








NONCODING









CAA[T/C]GATCG









ACTCGACATAC









TCGGTTTCC






315-316




cg21661814




 98




TTTCCTCGGTG




A




G






SILENT-









CCTGGATTAGT








NONCODING









ATC[A/G]TCAAG









TCTCAGGTTGC









AGGTGCCGC






317-318




cg24113982




 145




GCTCGGCTGC




C




T






SILENT-









TGCAGAAGTCT








NONCODING









CCTT[C/T]CCTC









CTTTGTGGCTG









GTATATAGAA






319-320




cg25268133




 209




GGCCGTCATC




G




A






SILENT-









GCGGTCACGA








NONCODING









CTCCC[G/A]TG









ATCACCATGAT









CGTGGGCATG









AC






321-322




cg25309388




 332




TGAGAGGGTAA




C




T






SILENT-









AGTGCCAGTCT








NONCODING









GTG[C/T]TAAAA









GAACGTGAAAA









GGAAACCTA






323-324




cg25339094




 215




TTGCCCAGACC




T




C






SILENT-









AATGCGATGG








NONCODING









GTCG[T/C]CTC









CGCCACCATC









GAGAAACGAG









AA






325-326




cg25339094




 345




TGCGAGCCTG




C




A






SILENT-









CACACCAACAA








NONCODING









CCCC[C/A]AGA









TCGGCGAGTC









GACCTCTCATC









G






327-328




cg25339094




 351




CCTGCACACCA




G




A






SILENT-









ACAACCCCCAG








NONCODING









ATC[G/A]GCGA









GTCGACCTCTC









ATCGTGCCAG






329-330




cg27778388




 118




GTCCCAGGGT




A




G






SILENT-









GACGCGAGGT








NONCODING









TGGGG[A/G]CT









GAGCAACCAG









GAATAGACCTT









CA






331-332




cg27802892




 239




ATCTAACCGGT




A




G






SILENT-









TCTAGACAGCT








NONCODING









TAA[A/G]CAAAC









AGATACAGTGC









CCTTTTCTC






333-334




cg27802892




 241




CTAACCGGTTC




A




C






SILENT-









TAGACAGCTTA








NONCODING









AAC[A/C]AACA









GATACAGTGCC









CTTTTCTCAG






335-336




cg27805688




 354




AATCCCGTTGC




A




G






SILENT-









TGTCGTGATGT








NONCODING









GAA[A/G]CCAG









CACCAGTTCTG









CTGGCCACGC






337-338




cg27825173




 254




GGCCACCGCG




C




T






SILENT-









GGCACCGCAC








NONCODING









GGACA[C/T]CC









CGACACACGA









GCACCCACAC









CCC






339-340




cg27827050




 71




CCGATGGCAA




G




A






SILENT-









GTGGGACAGC








NONCODING









CTGGA[G/A]GG









CTTGCTCACCT









GCGAGCCCGG









CC






341-342




cg27828294




 111




GTACAAAAACT




T




C






SILENT-









AGTAGATGTGT








NONCODING









GAA[T/C]GCAAT









AAAAGTGCTCA









GAAACACAC






343-344




cg27845127




 25




ACGCGTCCTGA




A




G






SILENT-









AGCCGCCGAC








NONCODING









GCG[A/G]CGAG









AACAGCAGGC









CAGCAGCTCG









A






345-346




cg27845127




 78




TCAGTGGCAGA




T




C






SILENT-









TAGCCAGCGG








NONCODING









CGAC[T/C]GAG









CGTGCGCCAT









GATGCCGCGA









CT






347-348




cg27845127




 95




GCGGCGACTG




G




A






SILENT-









AGCGTGCGCC








NONCODING









ATGAT[G/A]CC









GCGACTGACA









CCACCTGCGG









TCC






349-350




cg27922064




 325




CAAAAATGCTC




A




G






SILENT-









ATTTAGTTTTCCT








NONCODING









CA[A/G]CACCC









CCAGACTGACC









TTCAAAACT






351-352




cg27928117




 11




GCTAGCAGCT




C




G






SILENT-









[C/G]TGGCCCTG








NONCODING









CAGCTGAGCA









CAGGCCA






353-354




cg27928117




 160




TATTCAGTAGG




C




T






SILENT-









GAAAAGGGCA








NONCODING









AGGA[C/T]CTG









AAAAAAGTGTA









TTAAGAATCGT






355-356




cg27929704




 206




GTCACTGGGC




G




gap






SILENT-









CTGATGCCACC








NONCODING









GGAG[G/gap]CT









GAGCTACTGG









GCACCTTCGG









CCA






357-358




cg27956615




 73




TTCTCCATGCT




A




G






SILENT-









CCTAGATGGAA








NONCODING









AAC[A/G]CAGT









CATTCTGATCA









CTTTCTCTCT






359-360




cg27957329




 105




GGAGCTATGGT




C




T






SILENT-









TTTCGCCAAGT








NONCODING









CAA[C/T]TCACT









GATTGTGGGAC









GGGTGGTGG






361-362




cg27962799




 121




TGCTCCTCCCG




C




T






SILENT-









CGTGCTTCCGC








NONCODING









CGC[C/T]GGTG









GCTTGGACCC









GTCGGGGCTG









G






363-364




cg28315794




 188




GGTTTAGGAAT




C




T






SILENT-









GCTAGCTTTTG








NONCODING









AAA[C/T]TTCAT









TCAAAATGTCT









TTGAAGCCA






365-366




cg28389525




 109




TCGTGTTAGAA




A




G






SILENT-









AACTTTCGACC








NONCODING









TGG[A/G]GTCA









CGAAGCGTTTG









GGAGTGGATG






367-368




cg28389525




 145




GTTTGGGAGTG




A




G






SILENT-









GATGCGGAAA








NONCODING









GTGT[A/G]CATA









AAACCAATCCG









CGAATAATAT






369-370




cg28389525




 187




GAATAATATAC




T




C






SILENT-









GCCAGCATTTC








NONCODING









GGG[T/C]TTCG









GTCAAGAGGG









GCCGTTCCGAA






371-372




cg28389525




 91




TCGTTGAGCAT




A




G






SILENT-









GCAGACGTCG








NONCODING









TGTT[A/G]GAAA









ACTTTCGACCT









GGAGTCACGA






373-374




cg28397602




 82




CCTCTCTGCAC




A




G






SILENT-









GGCTGTGTGTG








NONCODING









TGC[A/G]TGTC









CATGCCTGTCC









AGGTCAGGAC






375-376




cg28459036




 106




GCGACGAGGG




C




T






SILENT-









TACGACCGTCG








NONCODING









GTAG[C/T]CGT









GTAGATCATAC









GTCGGGGCCG









G






377-378




cg28459036




 142




ATACGTCGGG




A




G






SILENT-









GCCGGGTGAC








NONCODING









GCGCC[A/G]GA









GGGCTTGCTGT









TCGGTGGCGG









TC






379-380




cg28459036




 266




ATCCCGATCCA




G




A






SILENT-









AATCCAGCTAG








NONCODING









ACC[G/A]ACCAT









AATCGTCAATG









CGATCACCA






381-382




cg28459036




 80




AGGCGGCCAA




C




G






SILENT-









GTCAGCGCAG








NONCODING









GAGGC[C/G]GC









GACGAGGGTA









CGACCGTCGG









TAG






383-384




cg28473092




 278




CACCCTCGAG




G




A






SILENT-









CATCGTCACCT








NONCODING









CGAT[G/A]CTAA









TTAGAGCCATG









TGCCGATGAG






385-386




cg28473092




 376




GGGGTGCGCC




A




G






SILENT-









ATACCAACTCC








NONCODING









CGAC[A/G]CAG









GACACCCTCG









CGGAAGTCGAT









C






387-388




cg28486260




 209




TATTATTTGCTA




A




G






SILENT-









TTACCCAAGCT








NONCODING









GT[A/G]GGGGC









TGTCCATTTTTA









TGCGAAGT






389-390




cg29195033




 109




GTGCAGGCTAA




T




C






SILENT-









TCCACGACATG








NONCODING









TAT[T/C]GACTT









CCGTCGCGGA









TCTTGCCGCC






391-392




cg29195033




 333




CAAGCCATTCA




A




G






SILENT-









TCGCCGTGCG








NONCODING









GACC[A/G]TAG









TAACCGACCGC









CGAACCATTGA






393-394




cg29195033




 339




ATTCATCGCCG




A




G






SILENT-









TGCGGACCATA








NONCODING









GTA[A/G]CCGA









CCGCCGAACC









ATTGAGGAAGA






395-396




cg29195033




 348




CGTGCGGACC




C




T






SILENT-









ATAGTAACCGA








NONCODING









CCGC[C/T]GAA









CCATTGAGGAA









GATCCTGCAGC






397-398




cg29195033




 368




ACCGCCGAAC




T




G






SILENT-









CATTGAGGAAG








NONCODING









ATCC[T/G]GCA









GCGCGGCGAG









GATGCTAAGGC









G






399-400




cg29195033




 44




ACGAGGTTCAC




T




C






SILENT-









CGGCCCGCTC








NONCODING









ATAG[T/C]GTCG









TCAGTCAGAAT









CTTCATCATT






401-402




cg29195033




 72




CGTCAGTCAGA




C




T






SILENT-









ATCTTCATCATT








NONCODING









GC[C/T]GATAC









GTGATCGTGCA









GGCTAATCC






403-404




cg29204207




 203




GACACTCCCCT




G




T






SILENT-







14









CGACGCAGCC








NONCODING







(14q22)









TCCG[G/T]AGC









GGCGCGCACT









CTCCAGAGGC









CA






405-406




cg29207528




 213




CGGGCCCGCC




C




T






SILENT-









CTAGCCCTCCT








NONCODING









CGAT[C/T]CAG









CGTGGGGACG









CCAGATCCACG









T






407-408




cg29207528




 245




GGGGACGCCA




G




A






SILENT-









GATCCACGTG








NONCODING









GAGAC[G/A]AC









AGGGTGCCCC









AGCGCCGTGG









TCT






409-410




cg29207528




 254




AGATCCACGTG




C




T






SILENT-









GAGACGACAG








NONCODING









GGTG[C/T]CCC









AGCGCCGTGG









TCTGGAATCCA









C






411-412




cg29207528




 260




ACGTGGAGAC




C




T






SILENT-









GACAGGGTGC








NONCODING









CCCAG[C/T]GC









CGTGGTCTGG









AATCCACGCTC









CT






413-414




cg29207528




 413




GCGACCTCAC




C




T






SILENT-









CATGTCCACAC








NONCODING









GGAT[C/T]AGC









GTCGAAACGTT









GTGATCGCTGC






415-416




cg29214234




 79




GACGCGTACCT




C




T






SILENT-









GCCATCAGGAT








NONCODING









CCT[C/T]GTTTG









TTTCTGAAGCA









ACCCCCTTC






417-418




cg29216983




 91




CCGTTGGGCC




T




C






SILENT-









ATACCCGTCTC








NONCODING









GTGA[T/C]CGA









GGAAGGCTCA









ACGGAATGCAT









T






419-420




cg29234854




 193




GACGCTGTGC




A




G






SILENT-









CGTGGGATTTC








NONCODING









CTCA[A/G]CGA









GGCTCAAGAG









AGGCACGCGT









CG






421-422




cg29235319




 75




CACACACACAC




G




A






SILENT-









ACACACACACA








NONCODING









CAC[G/A]CACG









CACGCACGCA









CGCACGCAAT









G






423-424




cg29242513




 250




TAACGGTTGAG




A




C






SILENT-









TAACACATCAA








NONCODING









AAC[A/C]CCGTT









CGAGGTCAAG









CCTGGCGTGT






425-426




cg29254804




 134




TACTTCATTTTT




A




G






SILENT-









TTTCCTATTTG








NONCODING









CA[A/G]CAACCT









GTAATGAGTAA









CTGTATTA






427-428




cg29345947




 298




TAGTGACAGGC




A




G






SILENT-









GCAATGCACAC








NONCODING









CGA[A/G]CGGG









CGCCAACAGA









GCAGCCACGC









T






429-430




cg29352964




 401




CGTCTTGCCCA




C




T






SILENT-









TATTGACGCCC








NONCODING









CGA[C/T]GCTG









CTGTCGGTGTG









GGGGAGTGAC






431-432




cg29357657




 392




CATGTGAGGA




C




T






SILENT-









GGCCGGCCAT








NONCODING









GAGGT[C/T]GT









CGTGCTGGAC









GACCTATCCGC









GG






433-434




cg29360558




 158




CGTTTATTTAT




T




C






SILENT-









GCTTTTGGTTG








NONCODING









GTT[T/C]TCCTT









TGATAAATGCG









GCCCTTGCT






435-436




cg29522548




 137




CCATAATCATT




A




G






SILENT-









TCCAACTCTTT








NONCODING









CAA[A/G]GTTTT









TTTAAATTTCAG









CTCAAAAT






437-438




cg30177683




 471




GGCTCTATATC




C




T






SILENT-









ATTGAAAACGA








NONCODING









ATT[C/T]TCCAC









GCAAAACCCAC









TTCACACCA






439-440




cg30377599




 144




TGAGGCATCTA




G




T






SILENT-









AATTTTCACAT








NONCODING









CCT[G/T]CCTGT









GGAGCAGCAA









GCTGAAGAAA






441-442




cg30790712




 273




CAGCACCGTG




T




C






SILENT-









GGGTCCAGGG








NONCODING









TCCAC[T/C]GTC









CACCAGGACCT









ACTGCGTGGG









G






443-444




cg32119538




 286




TTCCCTAAATC




A




C






SILENT-









CAAAGCGAGC








NONCODING









AAAC[A/C]GGA









GAGAGAAACC









CTGAAAATGGG









C






445-446




cg32120712




 608




CTCAACAGGTG




G




gap






SILENT-









GTGCACTGGG








NONCODING









ACCG[G/gap]CA









GCCGCCCGGG









GTCCCGCACG









ACC






447-448




cg32128189




 81




TCAGCTTTATT




C




T






SILENT-









ATAATCTTATG








NONCODING









GGA[C/T]CATCA









TCATGTATGTG









GTCCACTGG






449-450




cg32153241




 106




CGGTGCGGCA




C




T






SILENT-









CCATTCCACCG








NONCODING









CGAT[C/T]GAC









CCGGCTCCGG









TCCCGAGGTC









CC






451-452




cg32153241




 121




CCACCGCGAT




C




A






SILENT-









CGACCCGGCT








NONCODING









CCGGT[C/A]CC









GAGGTCCCAC









AGCAGTTGACC









AG






453-454




cg32153241




 175




TGGGCCGCAG




A




G






SILENT-









GGCTGCCAGC








NONCODING









GCGAC[A/G]GC









TCGTACCGCGT









GCTTGGTGATA









A






455-456




cg32153241




 348




AACGCGGGGT




T




C






SILENT-









GGGGGAGCCG








NONCODING









AAGCC[T/C]GT









GTGACACAATC









AAGGGGACTC









GC






457-458




cg32153241




 44




GGTACGTGCG




A




G






SILENT-









TTGGCGGCCC








NONCODING









GCTCA[A/G]CC









TTGCGTTCTAG









CCCGATCGCC









CG






459-460




cg32153241




 55




TGGCGGCCCG




T




C






SILENT-









CTCAACCTTGC








NONCODING









GTTC[T/C]AGCC









CGATCGCCCG









ACAGGTCGGG









T






461-462




cg32153241




 76




GTTCTAGCCCG




C




T






SILENT-









ATCGCCCGACA








NONCODING









GGT[C/T]GGGT









CGGTGCGGCA









CCATTCCACCG






463-464




cg32153241




 82




GCCCGATCGC




G




A






SILENT-









CCGACAGGTC








NONCODING









GGGTC[G/A]GT









GCGGCACCATT









CCACCGCGAT









CG






465-466




cg32168828




 367




TAGGGCAACG




T




C






SILENT-









CCAAGTTCGAA








NONCODING









GACG[T/C]CCC









CCTGTGCTTTT









CCGCCTCACTC






467-468




cg32177197




 596




TGACGAGAGTC




G




A






SILENT-









CCCTGGGACG








NONCODING









AGGG[G/A]AAG









GAATGGAAAGC









GGTGGGGTCG









T






469-470




cg32177197




 716




ATGTACCGTCC




A




G






SILENT-









GTCCCCTTCCA








NONCODING









CAT[A/G]ATCTG









GAGGCCGTAC









CAATCGGGTG






471-472




cg32180618




 404




TGTGTGTGAAA




C




T






SILENT-









ATCAGCACGGT








NONCODING









GCG[C/T]GTGA









GGGGCGGGCG









CGCTTCTCACA






473-474




cg33206207




 74




CAGGGATGAC




G




T






SILENT-









GCCGCCATGA








NONCODING









GTTGG[G/T]TG









ACGTGGGCCT









GCCGACTGTCT









CC






475-476




cg34715517




 291




TGATGCCTCAG




A




gap






SILENT-









CAAAAAATTGT








NONCODING









GCT[A/gap]AAA









AAAAAAACTGG









AAGAAAAGTAC






477-478




cg35050153




 456




CGACTTGTTCA




C




T






SILENT-









GCACCAGGAG








NONCODING









GAGG[C/T]GCT









GGCTGCTGTCA









CTGGGGCTCT









G






479-480




cg35066497




 449




TGTCGATCTGA




G




A






SILENT-









TCGGAGAACTT








NONCODING









GCC[G/A]CCGG









TCTTGTCGTCG









ACAGTGTTGC






481-482




cg35066497




 469




TTGCCGCCGG




T




C






SILENT-









TCTTGTCGTCG








NONCODING









ACAG[T/C]GTTG









CCAGCCTTGTC









GATGCCCTGC






483-484




cg35068462




 289




GAGGACTCGG




C




T






SILENT-









CCTGACGACG








NONCODING









GTCAC[C/T]GTC









ATTCATGACCT









CGACTTGGCTG






485-486




cg35341776




 413




TCAATTGCTTTT




A




G






SILENT-









TGTCCGACATC








NONCODING









TC[A/G]GACACT









CTCTTCACCAT









GACTCAGT






487-488




cg36508718




 517




TGGAGCCCGG




C




T






SILENT-









CGACAACCTTG








NONCODING









ACAT[C/T]ACCG









TGCATAGCGCC









CTCAACGATG






489-490




cg36517624




 234




GAACTGGCCC




G




A






SILENT-









AGCCAAACTCT








NONCODING









TCAA[G/A]CTGC









TGCCTAAAGCC









TGGGTTGGGG






491-492




cg36517624




 307




TGATGGCTTCA




G




A






SILENT-









AGCACGTCCC








NONCODING









GCCA[G/A]CCT









AGCCCCGTCA









CAGTCATCACA









T






493-494




cg36517624




 453




CATTCTTTGAA




G




A






SILENT-









GTGCTTTTTGA








NONCODING









TGG[G/A]TACCT









CAGGGGTATCA









GCGACCGGG






495-496




cg36517624




 475




TGGGTACCTCA




C




T






SILENT-









GGGGTATCAG








NONCODING









CGAC[C/T]GGG









ATGCGAAGGTA









GGTGATATCCT






497-498




cg36618790




 360




GGCGAAGCAC




C




T






SILENT-









CTGAGGTCAG








NONCODING









GAGTT[C/T]GA









GACCAGCCTG









GCCAACATGAC









AA






499-500




cg37003369




 306




TCAGTTGATTT




gap




A






SILENT-









AAGGAATAAAA








NONCODING









AAA[gap/A]GAC









CATTTTGCTAA









ACACTATTAAA






501-502




cg37003369




 404




TGTGACCTGTG




G




A






SILENT-









TTCATAGCTAA








NONCODING









CAT[G/A]AGCTC









TGACCTCCCTA









CGCCGGGCG






503-504




cg37026709




 344




GGCCGTGTGC




C




G






SILENT-









TGGTACCAGG








NONCODING









GATAC[C/G]CA









GGAGCTCAGC









AGATTTTGGCC









TC






505-506




cg38206730




 335




AGGCGTACAC




G




A






SILENT-









GTGCAGGTGT








NONCODING









GTTAC[G/A]TGT









TCATTTTCGGC









TCAAGGCGTAC






507-508




cg38278821




 212




CGACGGTACT




A




G






SILENT-









GGTTGCTCCTC








NONCODING









GTAT[A/G]GAAA









GCGGTGAATG









CGAATGCAATG






509-510




cg38403377




1024




CGACGATCTCC




A




G






SILENT-









CCGCGGTGTC








NONCODING









GTCC[A/G]TAG









GCGATACCGC









GAGCATTGACG









A






511-512




cg38403377




1084




TCCCCGAACC




A




G






SILENT-









GTTGACGCCG








NONCODING









AGAAG[A/G]AC









ATTGTTGTTGT









AAATCTTCTTG









A






513-514




cg38403377




1171




TCGTTGCGTGC




A




G






SILENT-









CGTGAGCGAT








NONCODING









CCGG[A/G]CGT









TGCACCGGGT









CATCCTGCGGT









G






515-516




cg38403377




 934




CCTCGGTGTGA




C




T






SILENT-









GTGGCGTTCGT








NONCODING









TAG[C/T]AGTGA









TGCGATGGCG









GTCGTGCGAT






517-518




cg38446357




 292




TTTCCTCGCTG




C




T






SILENT-









GGTATCTCCGC








NONCODING









GAC[C/T]CCTC









GGGCCCGTAG









ACCGTCCTCGA






519-520




cg38446677




 312




CACCTCCCATC




G




C






SILENT-









GGAATGACGTA








NONCODING









CGT[G/C]GTCA









GAGACGATCC









GACCGTCGTG









C






521-522




cg38446677




 439




TCGTCGCCGAA




G




C






SILENT-









AGCAGAATGG








NONCODING









CCAT[G/C]ACTT









CGACGGCGGT









GGCCGAGTCG









A






523-524




cg38446677




 447




GAAAGCAGAAT




C




T






SILENT-









GGCCATGACTT








NONCODING









CGA[C/T]GGCG









GTGGCCGAGT









CGAGGGCTCC









G






525-526




cg38869031




 345




ACATTGATTGT




gap




T






SILENT-









TCACATTTTTTT








NONCODING









TT[gap/T]CTCTT









CTCAATTTCCC









TTGATTATA






527-528




cg38925867




 196




GCAACGGAGA




T




C






SILENT-









TGACTCACAAG








NONCODING









CTCG[T/C]TACG









AGGCGGTAAG









GCTCACCCGC









G






529-530




cg38925867




 306




AAGGCCAGTA




G




C






SILENT-









GCACTCGTGCA








NONCODING









GTTG[G/C]GAC









GATGAGACGTT









GGCCTCGCGG









C






531-532




cg39373569




 271




AACATCTTGAA




G




A






SILENT-









AATACACAAGT








NONCODING









GGT[G/A]CAAA









GATGTGTCACG









TTCTGGACCT






533-534




cg39380084




 120




CCTCGTTGCCT




T




C






SILENT-









TATCTCCAGAT








NONCODING









TCC[T/C]CAATT









TCTGTGAAACG









TAAACATTA






535-536




cg39380084




 130




TTATCTCCAGA




T




G






SILENT-









TTCCTCAATTT








NONCODING









CTG[T/G]GAAA









CGTAAACATTA









TGGGAATAGT






537-538




cg39402442




 172




TTCTTCTCTGC




T




C






SILENT-









CATTCCTGGAG








NONCODING









ATT[T/C]TGAAA









AGAGTTGGTAA









TGTGTTTCA






539-540




cg39404391




 113




TTAGCCCAACA




A




G






SILENT-









GCCTGGCACA








NONCODING









AAGG[A/G]CAA









CAATGAGGAGA









GGAAAGGGGA









G






541-542




cg39404391




 48




GATTCCTACAC




C




T






SILENT-









TATCCCCAAAA








NONCODING









TGG[C/T]AGAG









CTGGGCTCTCC









CTGCAGTGGC






543-544




cg39435025




 186




AAACTTTAACG




A




T






SILENT-









TGTTATATCATT








NONCODING









CA[A/T]GGCGT









AACTTATACGC









GCGGGGTAC






545-546




cg39485034




 346




CCGCCCAGGA




A




G






SILENT-









ATTCGTGAGTT








NONCODING









TCCA[A/G]GTTG









CTGAGCCATTG









CCCGGATTCC






547-548




cg39485034




 400




AGTACAGGCG




A




G






SILENT-









ATCGCTGCCAC








NONCODING









CGTT[A/G]GAC









CATGGGAGCA









GGTAGGACCC









GC






549-550




cg39485034




 406




GGCGATCGCT




T




G






SILENT-









GCCACCGTTAG








NONCODING









ACCAT[T/G]GGG









AGCAGGTAGG









ACCCGCCTTCC









T






551-552




cg39485034




 508




GACCCGAGAT




A




G






SILENT-









GGACTTCTTCG








NONCODING









ACGA[A/G]GTT









GCCACGGCTC









CGTTAGCGAAA









G






553-554




cg39485034




 519




GACTTCTTCGA




C




T






SILENT-









CGAAGTTGCCA








NONCODING









CGG[C/T]TCCG









TTAGCGAAAGC









GATACGGCCG






555-556




cg39515553




 205




CGCACTAAACC




G




gap






SILENT-









TTAAAAATGCG








NONCODING









AGA[G/gap]CGC









ATGCACGGCG









GACGTCGTGG









AA






557-558




cg39515553




 88




TGTGTCCGGA




A




G






SILENT-









GAAACCTCGTG








NONCODING









CGGA[A/G]AAC









AGCGCAAACC









GCAAAAACCCC









G






559-560




cg39516001




 168




CGAAATGGAAA




A




G






SILENT-









ATCGTCAATGA








NONCODING









AGA[A/G]CCGA









AATACGATGCT









AAAGTTATTC






561-562




cg39517070




 164




GCCGCCCAAA




C




T






SILENT-









CAACGCTCACC








NONCODING









GTTA[C/T]ACGC









CGCAATCGAG









GCGTCTCAACC






563-564




cg39517070




 293




TGTTTGCGCTC




C




T






SILENT-









CACAGGGGAC








NONCODING









CGCA[C/T]CGC









CACTCCATCCA









CAGCGCAAACA






565-566




cg39517875




 406




GCTGACGGAT




A




G






SILENT-









GAACGTCTCG








NONCODING









GACAC[A/G]GC









GGACACCTCA









GTGAATCTCCC









TT






567-568




cg39517875




 426




GACACAGCGG




T




G






SILENT-









ACACCTCAGTG








NONCODING









AATC[T/G]CCCT









TTGAGTAGATA









CTGGACGAGA






569-570




cg39523703




 182




GCACAGGAGA




C




T






SILENT-









GCGGCCGAGA








NONCODING









CCGGG[C/T]GC









AGCCCCTCCG









ACATCATGCGC









AC






571-572




cg39524728




 343




ATATGCATATG




C




T






SILENT-









TATGCACTCAT








NONCODING









ACA[C/T]TCATA









CATATGTGCCC









CCTCAGAGA






573-574




cg39527111




 101




GGCGACATGG




C




T






SILENT-









GCGTCCCCAC








NONCODING









GGCCC[C/T]GG









AGGCCGGCAG









CTGGCGCTGG









GGA






575-576




cg39530245




 307




GACATTCTCAT




T




A






SILENT-









TCAGAGCGGA








NONCODING









GAAA[T/A]TTCA









GCAGATTTCAA









GGAGCCGCCA






577-578




cg39530249




 729




ACAAGCCTTCA




C




gap






SILENT-









CTTTCTTTTCTT








NONCODING









TT[C/gap]TTTTT









TTTTTATCTAAC









AACTGAAG






579-580




cg39535150




 137




CGGCTCTCCTG




A




G






SILENT-









GATGTGCCCC








NONCODING









CGCA[A/G]CAA









TGCCAGGTAAG









CCTTGGTCACG






581-582




cg39536028




 735




CTGGCACACA




G




A






SILENT-









GTATCCCAACC








NONCODING









ATGG[G/A]TTTA









GTGTCCACCAG









ACTTAAAGGA






583-584




cg39543172




 580




ATGTGTCTCCC




A




T






SILENT-









ACACTGGCCG








NONCODING









CTGC[A/T]CAAG









CTGAGAAGCTG









GGACGGCCCG






585-586




cg39545648




 532




GCAATTTTACT




C




T






SILENT-









CTACAGCTGAG








NONCODING









ACA[C/T]TGCCA









AAGAGTCCAGA









ATTGTGAAG






587-588




cg39547799




 846




AAAGTTGGAGA




C




T






SILENT-









AACAGAAACCA








NONCODING









AGG[C/T]GAGG









TGGTCCTTGGT









TAAGTCTGCA






589-590




cg39550340




 579




TTTTTTTAAATA




A




G






SILENT-









ACATCGTTGAT








NONCODING









TA[A/G]AACAAT









CCTATTCACTG









CAGTCACA






591-592




cg39568672




 164




GCTGGGTGGC




C




T






SILENT-









TGACCAGCGCT








NONCODING









TTGG[C/T]CAGT









CAGGGGTTCG









GTGGAATGTTC






593-594




cg39568672




 248




TACCACGGCAT




C




T






SILENT-









CATGGTCGCTT








NONCODING









TCG[C/T]GCTC









GTTGGGTACG









GATGGCTTGC









G






595-596




cg39568672




 398




TCGGGTATCAG




A




G






SILENT-









GCCGTTGACAT








NONCODING









GGC[A/G]CCCG









CTTGTTATCGA









TTCTCTCATC






597-598




cg39570661




 415




ACGAGGGGAG




G




A






SILENT-









CAAGCACGAG








NONCODING









CCGGG[G/A]AG









AGAGCTCTGC









GCTCGCACAC









GGG






599-600




cg39575840




 320




CCCATGGTCCT




C




T






SILENT-









CCCCATGTAAA








NONCODING









GAG[C/T]TCTG









GCCAATCAACA









AGGAGTGGAC






601-602




cg39575840




 361




AGGAGTGGAC




C




G






SILENT-









AGCTCATACAA








NONCODING









GGAC[C/G]ACC









AAGTGGCCAAC









AAACATAAAGC






603-604




cg39602316




 178




TAGCAGGAGG




A




G






SILENT-









AAGCTGATGAA








NONCODING









TTGA[A/G]GTCC









GATATAGGCAG









TTTGTGCTCC






605-606




cg39704218




 348




CAAACTCCTGG




C




T






SILENT-









GCTCAAGCGAT








NONCODING









CCT[C/T]CAACC









CCGGCCTCCC









AAAGTGCTGG






607-608




cg41085637




 479




AAGACGGCCAT




A




G






SILENT-









GAGGAGGCGA








NONCODING









TGGA[A/G]ACG









GAGGCCAGCA









CATCAGGGGA









GG






609-610




cg41591473




 190




AAATTGACATT




G




A






SILENT-









TAAGTGGACCT








NONCODING









GCC[G/A]TATTT









GTATTTGCTAA









ATCTGGCCA






611-612




cg41592212




 65




ACGCGTGAGC




G




T






SILENT-









CACCATGCCC








NONCODING









GACGT[G/T]AA









GACAGGAATTT









ATACCCATGGA









G






613-614




cg41618657




 555




CAATTGGCTGT




C




T






SILENT-









CCTATTTACAC








NONCODING









TTA[C/T]GTGTC









ATGTTAAAATA









ATCATTTCT






615-616




cg42267484




 619




GAACATCCCAT




A




G






SILENT-









GCAAAAGACTT








NONCODING









TTC[A/G]AAGG









GAAGGGCCTG









GTTTGAGAATG






617-618




cg42312996




 675




TCACCTCCTTC




T




G






SILENT-









CCTTTATTCTA








NONCODING









CCG[T/G]CCCA









AGGGCCTGAG









ATTGGGCGACT






619-620




cg42322469




 385




TCCTAAAACCA




A




T






SILENT-









TTAGTATCTAC








NONCODING









TAA[A/T]TTGAC









GCTGAAATTTT









GTATTTTTG






621-622




cg42327033




 94




GGCTTCTCAGG




G




T






SILENT-









GGTCAGGTGC








NONCODING









ATTT[G/T]GGCA









GATGCGCTTGA









GTGGGGGGGC






623-624




cg42462046




 142




GGCCACAGCG




C




T






SILENT-









CCCTGCCCCA








NONCODING









GAGAA[C/T]GG









CGGGTGGGCT









GGGTCCGGCT









GCG






625-626




cg42462046




 98




CCTGCCGCGC




C




T






SILENT-









GCGGCGGGGC








NONCODING









TCCTC[C/T]TCG









CTGTGGGAAAA









GTGGGGCCAC









A






627-628




cg42468895




 296




GCCTGGGCAA




G




gap






SILENT-









CAGAGCAGCA








NONCODING









AGACT[G/gap]T









CTTTACACTCG









GGGTGAGTAG









TCG






629-630




cg42518152




 110




GACACACATGC




C




G






SILENT-









ACACGGTTTCA








NONCODING









CCA[C/G]CACG









GCTTCTCTCCA









GCCTTCTCTT






631-632




cg42534385




 391




GCCGCCTACC




G




gap






SILENT-









ACAAGTCGGTG








NONCODING









TGGC[G/gap]GG









GGGTGGAGGC









CAAGCTGCACC









TG






633-634




cg42534385




 396




TACCACAAGTC




gap




G






SILENT-









GGTGTGGCGG








NONCODING









GGGG[gap/G]TG









GAGGCCAAGC









TGCACCTGCG









CCG






635-636




cg42560726




 616




CGCCTGTAATC




A




G






SILENT-









TCAGCACTCTG








NONCODING









GGA[A/G]GTCA









AGGCAGGTGG









ATCACTTGAGC






637-638




cg42656733




 325




GGGACCACGA




T




C






SILENT-









TGGACTGAGC








NONCODING









CAGCT[T/C]TGC









CCGCCCGCCC









CCGCGCCCAG









GG






639-640




cg42658258




 510




GGCCTGCAGA




C




gap






SILENT-









CTCCGGGCCC








NONCODING









AGGGC[C/gap]A









CCGGCCTCTC









CTACCTGCTCC









TGC






641-642




cg42673467




 656




ATAGACCAACA




C




T






SILENT-









ATCATGTATCC








NONCODING









TGC[C/T]ACTTG









GGATGCCAGC









ACCCATGCCA






643-644




cg42691712




 697




CCGAGCAGTG




C




A






SILENT-









GCCGCGTGCA








NONCODING









GGAGT[C/A]CA









GAGTGGAGCC









GTGACTCACAA









TT






645-646




cg42705180




 89




CTTTGGCAAAT




G




T






SILENT-









TGGGGACTGA








NONCODING









AGAC[G/T]GGA









AGGGTGGAGA









GTAGGCGGAA









CC






647-648




cg42705180




 96




AAATTGGGGAC




G




A






SILENT-









TGAAGACGGG








NONCODING









AAGG[G/A]TGG









AGAGTAGGCG









GAACCAGGTG









GT






649-650




cg42718789




 107




GAATTCAACAA




A




G






SILENT-









CACTATAGAGT








NONCODING









CAA[A/G]AGGA









AACGAGTCGA









GTGAAACCAGT






651-652




cg42719781




 228




GATTCTGATTT




G




A






SILENT-









TAGTGATGATG








NONCODING









AAC[G/A]CTGT









GGAGAATCCA









GCAAAAGGAAA






653-654




cg42848362




 34




CATGCAGGCG




C




T






SILENT-









CGCCTGTAGTC








NONCODING









CCAG[C/T]TACT









CGGGAGGCTG









ACGCAGGAGA









A






655-656




cg42848627




 156




TCCTCCATCAC




A




G






SILENT-









CAGGCTGTCTA








NONCODING









ACG[A/G]GGCT









GAAGAAGTACC









ATCCATGAGT






657-658




cg42895723




 781




CAGGGTGCCA




T




G






SILENT-









GGCACTTCTTT








NONCODING









AATG[T/G]GTTC









TTTCTTTATGTG









ATTATTTGA






659-660




cg42910590




 365




GCCCGGGACG




T




C






SILENT-









AGGGAGAATCT








NONCODING









GCAG[T/C]AGC









TGAGGACCCC









ACATGGGGTG









AG






661-662




cg42913480




 419




ATGTGGGGAAA




A




G






SILENT-









AGCAAGAGAG








NONCODING









ATCA[A/G]ATTG









TTACTGTGTCT









GTGTAGAAAG






663-664




cg42919036




1141




GAGGCAGATG




C




T






SILENT-









ATTCCCAAGAG








NONCODING









AACT[C/T]ACCA









AATCAAGACAA









ATGTCCTAGA






665-666




cg42919304




 313




GAGGAAGGCA




G




A






SILENT-









AACAGAAAGGC








NONCODING









AAGG[G/A]CAG









CAAACCTTTAA









TGCCTACCTCC






667-668




cg42922781




 404




GAGGCGGGTT




G




A






SILENT-







 9









AGTGCCCATG








NONCODING









GATCC[G/A]GT









GTCTGGGAAG









GGGCCCACAG









AAG






669-670




cg42924993




 364




TTTTAGGCCAG




A




G






SILENT-









GTGCAGTGGC








NONCODING









TCAC[A/G]CCCT









TAATCCCAGCA









CTTTGGGAGG






671-672




cg42924993




 450




AGACCAGCCT




A




G






SILENT-









GGCCAACATG








NONCODING









GTGAA[A/G]CC









CCGTCTCTACT









AAAAATACAAA









A






673-674




cg42925336




 402




TCATGAGGAAG




G




T






SILENT-









GCCAGGACAA








NONCODING









GTGT[G/T]GCA









GAGCGGCTTA









CCCCCATGGC









AC






675-676




cg42943021




 16




GCGCGCCAGG




C




T






SILENT-









ACGCC[C/T]GG








NONCODING









CTGTTTTGTATT









TTTAGTAGAGA






677-678




cg42943021




 23




GCGCGCCAGG




T




C






SILENT-









ACGCCCGGCT








NONCODING









GT[T/C]TTGTAT









TTTTAGTAGAG









ACAGGGTT






679-680




cg42943021




 27




CGCGCCAGGA




T




C






SILENT-









CGCCCGGCTG








NONCODING









TTTTG[T/C]ATT









TTTAGTAGAGA









CAGGGTTTTGA






681-682




cg42943021




 68




AGGGTTTTGAG




T




C






SILENT-









TGATCTGTCCA








NONCODING









CCT[T/C]AGCCT









CCCAAAGTGCT









GGGATTACA






683-684




cg42943021




 90




CCTTAGCCTCC




T




C






SILENT-









CAAAGTGCTGG








NONCODING









GAT[T/C]ACAG









GCATGAGCCA









CTGTGCCCCG









C






685-686




cg43008177




 233




ACATCAAATTA




C




A






SILENT-









GCAATTACCAT








NONCODING









AGA[C/A]ATGTA









TTTCATTGAATA









AATAGCTT






687-688




cg43008177




 280




CTTTTGTTTGTT




gap




G






SILENT-









TGTTTGTTTGTT








NONCODING









T[gap/G]CAGGG









AAATTTAGAAC









AATTATTAG






689-690




cg43008177




 313




ATTTAGAACAA




A




T






SILENT-









TTATTAGATGTT








NONCODING









AT[A/T]GTGCCT









CTTCTCGTGTT









GATACGTG






691-692




cg43040173




 591




GGGAAACCCC




C




gap






SILENT-









AGGGGAGGCG








NONCODING









GAGGC[C/gap]A









GCGGGGATTT









CTGAAGCCAAG









TGG






693-694




cg43054295




 281




GAGGCTGGCG




C




G






SILENT-







X









GGCTAGGGCT








NONCODING









GAGTG[C/G]AG









CGCCTGCTTAG









AGACCTTCGG









GA






695-696




cg43054295




 317




TAGAGACCTTC




C




T






SILENT-







X









GGGAGAACTTC








NONCODING









TGC[C/T]GGAA









CCCCGACGGC









TCAGAGGCGC









C






697-698




cg43054295




 355




GCTCAGAGGC




A




C






SILENT-







X









GCCCTGGTGC








NONCODING









TTCAC[A/C]CTG









CGGCCCGGCA









CGCGCGTGGG









CT






699-700




cg43060167




 673




TTGCCCCCCG




C




T






SILENT-







 4









CCAACCTACTC








NONCODING









AACC[C/T]CTTC









CAGATAAAGAC









AGTGGGCACT






701-702




cg43076634




 27




GCGCGCCTAC




T




G






SILENT-









CACGTCAAGCT








NONCODING









AATT[T/G]TTTG









TATTTTAGTAG









AGGTGGGGTT






703-704




cg43089031




 202




AAATGGGCCA




A




G






SILENT-









GGCGCGGTGA








NONCODING









CTCAC[A/G]CCT









ATAATCCCAGC









ACTTTGGGAGG






705-706




cg43089031




 244




TTTGGGAGGC




A




G






SILENT-









CGAGGAGGGT








NONCODING









GGATC[A/G]CC









TGAGGTCAGG









AATTCCAGACC









AG






707-708




cg43149120




 317




CCAAGAAGAC




T




C






SILENT-







17









GCTGGAGGGA








NONCODING









GGCTG[T/C]TA









GGAGGGACTC









TGAGCTTCACA









CC






709-710




cg43256880




 128




GGGAATCTTCT




A




G






SILENT-









CTTTGACGTAT








NONCODING









GGG[A/G]AGCC









TCAGAAAGACA









TTTTCCTAAT






711-712




cg43256880




 191




GAATGGCCCTA




A




G






SILENT-









TGATGTTTCCT








NONCODING









TCG[A/G]AACT









GGTACTGCTCA









GCCCTGATCA






713-714




cg43261262




 805




GTGTTTGGATT




G




A






SILENT-







16









TGATCATGGAT








NONCODING









GTA[G/A]CATAC









ACCAAAATCCA









CCGAGACCT






715-716




cg43276309




1051




TTGGCTTGGGG




C




T






SILENT-







19









GGTCCACAGT








NONCODING







(19q13.3)









GAGG[C/T]AGA









TGCTGGGCGT









GAAGAATCTGC









T






717-718




cg43276309




1149




GGCCTGGAGG




G




T






SILENT-







19









GGCCACCAAG








NONCODING







(19q13.3)









ATGCA[G/T]GA









GCTGGGCCTG









GAGAGGCTGC









AAA






719-720




cg43276380




 686




GTCCACTGTGA




G




C






SILENT-









GGCAGAGGCT








NONCODING









GGGC[G/C]TGA









AGAATCTGCTG









TGAGGCAGAT









G






721-722




cg43304080




 347




GCCGGGCAGA




G




A






SILENT-









GTGGAGCAGC








NONCODING









TTGGG[G/A]CC









GTGCCCAGGG









CGGTGGCTGT









GAG






723-724




cg43323676




 303




GGTGTGGTGT




G




A






SILENT-









GGATTGTAGCT








NONCODING









TCCC[G/A]AAAC









TCATGGCGCCT









CCCCTCGGAC






725-726




cg43326623




 188




GGATAACCAGA




C




T






SILENT-









ATTATCACAGC








NONCODING









ACC[C/T]TCTCA









TTCCCAGCGC









GTCCTTCTGA






727-728




cg43328092




 168




TTGCTGTGTAA




A




G






SILENT-







 8









GAATAGGTCCC








NONCODING









TCA[A/G]CATGA









AGATGTGTCTG









CGTCTGAGC






729-730




cg43328259




 499




AGCCTTCAGG




C




T






SILENT-









GAGCGTGGAG








NONCODING









ACAGG[C/T]TTT









GAAGACAAGAT









TCCCAAAAGGA






731-732




cg43328259




 771




ACGGCCATGG




T




C






SILENT-









AGACTGCAATT








NONCODING









CCAT[T/C]CAGA









TCACGTTTCTT









CAATCTGAAG






733-734




cg43336005




 591




GATGTTTGGCT




T




G






SILENT-







22









GCACGCGAGC








NONCODING









CCAC[T/G]CGG









GACAGACCCAA









GAACACGAATT






735-736




cg43916688




3263




GCGGCGAGAG




G




gap






SILENT-







11









CGGCTCCTCTG








NONCODING









CGCA[G/gap]CC









GGCGCCGGCT









CCGCTTCCCCT









TC






737-738




cg43917746




 251




TGGATAGAAGT




T




C






SILENT-







20









GCTTCACTAAT








NONCODING









TGC[T/C]TTATT









TAAGCATACAA









GAAAAAAAG






739-740




cg43917746




 471




CAAGCATGGC




A




G






SILENT-







20









CCAGCCAGCA








NONCODING









CCACC[A/G]CC









CCCAAAACGAA









CAAAACAAGAG









A






741-742




cg43918370




 648




TCACCAGGAAA




A




G






SILENT-









GCCTGCCTCCT








NONCODING









CCG[A/G]GGAC









CTGCCCGCCT









CCGGGAGCAG









C






743-744




cg43919788




1419




AGGGACATAGA




C




G






SILENT-









ACCAAGCCCCA








NONCODING









GGG[C/G]TGCC









CAGCTACACGA









CCGCCGCTGG






745-746




cg43919798




 780




ACTCACCTTAT




C




G






SILENT-









TCTTCATTTCC








NONCODING









CCT[C/G]GTGA









ATCCTCCAGGC









CTTTCTCTAC






747-748




cg43921083




 318




CAGGCCGGTG




T




C






SILENT-







17









ACCCCCCATG








NONCODING









GAGCC[T/C]CA









CATGGCGAAG









AGGATGAGGA









AGG






749-750




cg43921103




 471




CAGGGAGACG




gap




A






SILENT-









CCAGCATTAAA








NONCODING









AAAA[gap/A]GA









GAGATGTGTTT









ATTCCATGATC









A






751-752




cg43921107




 386




GGGTGCTGGC




C




T






SILENT-









CTTCCTGGCCT








NONCODING









CGGC[C/T]TTCT









TCTTGGTGGTC









GACGCGTATT






753-754




cg43921619




 466




AGAGCTGGAG




G




T






SILENT-







17









GAGGTATTTGT








NONCODING









GAAG[G/T]AGC









AGGGAGAAGA









GGAGCTGCTG









AG






755-756




cg43925523




 173




GGATTCTTGGC




G




A






SILENT-









TGAAGAATTCC








NONCODING









TCT[G/A]GAAAT









ATTTCCCAGGC









CAAAGGTGG






757-758




cg43926586




 75




ACACCATGCTG




G




gap






SILENT-









GCCGTGGTGC








NONCODING









GAAA[G/gap]CC









TCCCGAGTCCA









TACAGATACCA









C






759-760




cg43926872




 697




TTCTGCGGCC




T




G






SILENT-







19









GCATCCTGAGC








NONCODING









ATGG[T/G]GAA









CACAGATGATG









TCAACGCCATC






761-762




cg43927587




4712




GCGGGATGTTT




G




A






SILENT-







 2









CTTGGGGGCA








NONCODING









GCTC[G/A]GGT









GGAGACACGA









CACTTTCTACT









G






763-764




cg43927587




4793




CTTCCTCCAGC




C




T






SILENT-







 2









GGCACAGGGT








NONCODING









ATTG[C/T]AGAT









GGGTCACCAG









CCCCATGTTTT






765-766




cg43927733




 274




CTACACAGGAA




A




G






SILENT-









GCTGGAAAGAT








NONCODING









GAC[A/G]AGAT









GAATGGTTTTG









GAAGACTTGA






767-768




cg43927893




1071




AGCATTGGTGC




T




C






SILENT-









CCTCATGGCTC








NONCODING









ATG[T/C]GGAC









GCAGTAATCCG









CCACTGTGCA






769-770




cg43927893




1329




GCCAACATCGA




G




A






SILENT-









GCCACTCTTTG








NONCODING









ACC[G/A]GTTG









CTCATTTTCTG









GTCTGACCGG






771-772




cg43929036




 299




TGGGCAGCCG




G




gap






SILENT-









ACCCCTTGCAG








NONCODING









CGCT[G/gap]GC









CAGCGGCCGC









CACCACCACAC









CG






773-774




cg43929036




 300




GGGCAGCCGA




G




gap






SILENT-









CCCCTTGCAGC








NONCODING









GCTG[G/gap]CC









AGCGGCCGCC









ACCACCACACC









GC






775-776




cg43929139




2789




TAATCTGTTCA




A




G






SILENT-









ACCCGAGGTCT








NONCODING









TTG[A/G]AAACG









AAGATCAAAAC









AATAATGAA






777-778




cg43929139




2828




AACAATAATGA




C




T






SILENT-









ACCAGAGAGC








NONCODING









GAAT[C/T]TTGG









GATGATTTCTA









TCATTGCACA






779-780




cg43929139




2834




AATGAACCAGA




A




G






SILENT-









GAGCGAATCTT








NONCODING









GGG[A/G]TGAT









TTCTATCATTG









CACATACAAA






781-782




cg43929139




2861




GATTTCTATCA




A




G






SILENT-









TTGCACATACA








NONCODING









AAT[A/G]ATGG









GAATTTTAGTA









TGTTTTATCA






783-784




cg43929652




 984




CACTTGTTAGG




G




A






SILENT-







17









CTCTTGTCAGC








NONCODING









ATT[G/A]ATAAC









TGGCATGTTTT









ATTGCAGCC






785-786




cg43929990




1931




GTGTTAACTTG




C




T






SILENT-







10









AAGAGATTCAA








NONCODING









CTT[C/T]TTCTT









TGACTGGTACT









TCCGCCCTA






787-788




cg43931352




 111




GCGTCGTTCCT




T




C






SILENT-









CCGGTCCATCT








NONCODING









CGC[T/C]CATG









CTCAGGGCGG









TGGCAAAGGG









G






789-790




cg43931759




2018




AGCTGGGATGT




G




C






SILENT-







 6









ACCTGGAGAG








NONCODING









ATAG[G/C]GGG









TAGTTCTCCCT









ACTGCCCAGG









C






791-792




cg43931795




1456




AGAATGGCAG




G




gap






SILENT-







 7









GCGGACCGTG








NONCODING









GCGAA[G/gap]G









CTCTGCCCTGG









TTGAACATTTC









TG






793-794




cg43933469




 689




TGTGCCAGGT




C




A






SILENT-









GCCCGTCTGA








NONCODING









GCTGG[C/A]TC









CATCATGACGC









GTCACTTTGTC









C






795-796




cg43934707




 651




CGGTGGGACC




G




A






SILENT-









AGCGCCATCAC








NONCODING









CTCC[G/A]TAC









GGATGTTCTCC









CTCCGGAAGC









C






797-798




cg43934839




1383




CCAACCATGCA




A




G






SILENT-







13









TTAAGTTTAAC







NONCODING









CAA[A/G]AGCT









GCAATATTCCA









GATTCTTAAA






799-800




cg43934839




1547




GGTGCCTGTTT




C




T






SILENT-







13









ACTTCTGGTCT








NONCODING









GCG[C/T]GGGC









TCAGGTTTCAA









AGAGCTTGCT






801-802




cg43934938




4214




TGAGGAAACCT




C




T






SILENT-







16









AGGAAATCTCG








NONCODING









GTG[C/T]ACTAG









GAAGTGAATCC









CGCAGGACA






803-804




cg43934938




4231




TCTCGGTGCAC




C




T






SILENT-







16









TAGGAAGTGAA








NONCODING









TCC[C/T]GCAG









GACAGCTGCA









CTCAGGGATAC






805-806




cg43934938




4351




GATTGTCTTTC




A




T






SILENT-







16









TGCCACAGAAC








NONCODING









AGC[A/T]GCAG









ACGTGTCGGG









AGGTTAGCTGC






807-808




cg43934938




4391




AGGTTAGCTGC




A




C






SILENT-







16









GGAAAGAAATC








NONCODING









GGG[A/C]TGCC









GCGGAGCACA









GAGTGATTTGG






809-810




cg43934938




4395




TAGCTGCGGAA




C




T






SILENT-







16









AGAAATCGGGA








NONCODING









TGC[C/T]GCGG









AGCACAGAGT









GATTTGGAACT






811-812




cg43935748




1437




ATGCTATGGGT




C




T






SILENT-







14









ATCTGTTTCAG








NONCODING









AAG[C/T]TCTGT









TGGTATCTTGT









GGTGTCTGC






813-814




cg43935826




 266




TCATTACGGTC




A




G






SILENT-









ACAATGACGAT








NONCODING









GTC[A/G]GAAA









CCATGCAATGA









AACCAATAAA






815-816




cg43936051




 705




TCAGAAAAAAA




C




T






SILENT-







 4









GCTATCCAGCT








NONCODING









TTT[C/T]GTGGA









ATCTGGTGAAG









TTTACACTT






817-818




cg43939976




1317




CCGTCGGTCCT




C




gap






SILENT-







 6









GGCGTAGCGC








NONCODING









CTCC[C/gap]GT









GTCCGGGGTA









GATCTTGTACC









CG






819-820




cg43941070




1254




ATGTTCCCCGG




G




A






SILENT-







17









CCTGCGACCAA








NONCODING









GAC[G/A]CTTTT









TCCTGACTACT









TCTTCAACT






821-822




cg43941070




1278




CGCTTTTTCCT




C




T






SILENT-







17









GACTACTTCTT








NONCODING









CAA[C/T]TCTGA









CATAGGTTTTG









CTGATATAA






823-824




cg43941070




1284




TTCCTGACTAC




C




T






SILENT-







17









TTCTTCAACTC








NONCODING









TGA[C/T]ATAGG









TTTTGCTGATA









TAAACGCAA






825-826




cg43941070




1305




CTGACATAGGT




C




T






SILENT-







17









TTTGCTGATAT








NONCODING









AAA[C/T]GCAAA









CCCGGCTCTAT









ACCTACCAA






827-828




cg43942215




5435




AGCCAATATAG




C




T






SILENT-







12









GGCCTCGTCTC








NONCODING









ACT[C/T]AGGTG









TCAGTGCTGCA









GTATGGAAG






829-830




cg43942920




1632




TGCGTGGTGA




T




G






SILENT-







X









CGGGCAGTGA








NONCODING









GGACA[T/G]GT









GCGTGCACTTC









TTTTGATGTGGA









G






831-832




cg43942990




 826




GGCTGTAGAAT




G




A






SILENT-







 1









ACCTTCTCCTT








NONCODING







(1p11)









GAC[G/A]GGGT









ACAGCAGCTCC









ACATCCCTCT






833-834




cg43943163




 881




CTGAGCTATAA




T




C






SILENT-







 1









ACTTGTCATAG








NONCODING









ATT[T/C]GCTGT









GTCATCGCAG









GCTGCAACTG






835-836




cg43944408




1076




AGGAGGCAAT




gap




G






SILENT-







17









GAGATGATGG








NONCODING









GGTGA[gap/G]G









GAAACATGAAA









GTAACACTTGA









TT






837-838




cg43944446




 906




ACTCTGATCGG




T




gap






SILENT-







16









TTATTATCCCC








NONCODING









TCA[T/gap]TTTT









TGTAGGAAATA









AGTTTGCTTG






839-840




cg43946473




 880




CACAGAAAAGA




C




T






SILENT-







11









CAACAGATGTG








NONCODING









TTT[C/T]TAAGG









CACGATTTACA









TACTAAATT






841-842




cg43946992




 136




ACATCTGGGTT




A




G






SILENT-









GACCAGGAGC








NONCODING









CACA[A/G]AAGT









TCCCATCATGA









GAAAGGGGGC






843-844




cg43947759




 876




TCAACTTCTGT




gap




G






SILENT-







22









AAGATGGGGG








NONCODING









GGGG[gap/G]A









CAAAAAGAGAA









GTAAAGTTAAG









AA






845-846




cg43948617




1024




AGAGGTTATCA




A




G






SILENT-









AGGACATTTAA








NONCODING









GGA[A/G]TCCT









GATCCTCAGAA









CTTCTCTGGG






847-848




cg43949585




1172




GTAATGTTAAA




A




G






SILENT-







 8









ACTAAATACAG








NONCODING









ATG[A/G]TAATA









ATTGCTATTTC









ACAGTGATG






849-850




cg43949806




 460




TTACCAACCCT




C




T






SILENT-







16









GGGGCTTTATA








NONCODING







(16p13.1)









CTC[C/T]CTCTC









CACCAATCCCT









GATGACCCC






851-852




cg43950348




1308




GGCATTGCAG




A




C






SILENT-







 9









CGGCTCGGGG








NONCODING









TCCAA[A/C]GC









CTCACTACCAG









TCTGGGTCCG









GC






853-854




cg43950620




 262




AGCAAGGAATG




A




G






SILENT-







 8









TTCTTATTCTTT








NONCODING









GT[A/G]GGAGC









TCCTTCTTTAC









ACTGTCAGG






855-856




cg43951104




 679




TAGTGAAAACC




C




T






SILENT-







17









AAGTGACAAAC








NONCODING









ACA[C/T]TCCTC









GACCCCAAGTT









CTTCCACAT






857-858




cg43951505




 603




AAAGGTGGAAA




A




C






SILENT-









ATGAGGTTGAT








NONCODING









CGC[A/C]GCAT









TCAGAAAGTGT









ATAAGACCTA






859-860




cg43952456




 394




AATGCTAAACT




T




C






SILENT-









GCTTTCATGCT








NONCODING









AAT[T/C]TTCTG









ACTGTTTACTT









ACCGGGTAA






861-862




cg43952456




 401




AACTGCTTTCA




C




A






SILENT-









TGCTAATTTTCT








NONCODING









GA[C/A]TGTTTA









CTTACCGGGTA









AGAGCGAT






863-864




cg43952456




 416




AATTTTCTGAC




G




C






SILENT-









TGTTTACTTAC








NONCODING









CGG[G/C]TAAG









AGCGATGGGA









CTGTTTTCATT






865-866




cg43953844




1867




CTCAACCTGCC




T




C






SILENT-







 7









TTAGCTGCACT








NONCODING









CTC[T/C]TACCT









ACAGCTGGACA









GTACCTGTC






867-868




cg43955665




 184




CTGCCAGCTGA




G




A






SILENT-







16









CAGGATCTTTT








NONCODING







(16q22)









GCT[G/A]GGCC









CCCTTCTCTGT









GCTGAGTGGA






869-870




cg43955829




 593




ATAAGGCCATT




C




T






SILENT-









CAGCGAGGGA








NONCODING









CCAT[C/T]AAGT









GCAACTTTGCG









GGGGTTGCCT






871-872




cg43955863




 290




AGGACACCAA




T




C






SILENT-







 7









GGTACCCAATG








NONCODING









CCTG[G/C]TTAT









TCACCATCAAC









AAAGAAGACC






873-874




cg43955871




1011




AGCGCTCCTCC




A




G






SILENT-







 7









AGCAGGGACA








NONCODING









GCTC[A/G]CTG









ATGAGGTCGGT









GATGGCGTTG









G






875-876




cg43957194




 591




CCATTTTTAGC




G




A






SILENT-

















ATCAGAAACAC








NONCODING









AAG[G/A]AAATA









AAATTCGTGGT









TAGATTGAT






877-878




cg43957205




 782




ATACCTGAGGT




T




C






SILENT-




13













TTCATGTCTTTA








NONCODING









GT[T/C]GCCTTA









TCATAATCCCA









AATATACA






879-880




cg43958108




 777




CAGCAAACCTG




A




G






SILENT-







 5









AATGGCACAAT








NONCODING









GGA[A/G]CACA









GACTTAAAAGA









TGCTTCAGTG






881-882




cg43959150




 709




ACTTCCACGCG




C




gap






SILENT-







22









GTGAACGTGG








NONCODING









CGCA[C/gap]CC









GTTCGCTTCAG









CAGTTTCCTAG









G






883-884




cg43959363




 614




CTTCATTTCTTT




G




A






SILENT-







17









GGTTTCTTGGG








NONCODING









TA[G/A]TGGGC









GCCGGAACAG









CAAGATGTGA






885-886




cg43960242




 16




ACTGCAACGC




A




G






SILENT-







19









GGAGG[A/G]GC








NONCODING









AGGATGGAGAT









CCCTGTGCCTG









T






887-888




cg43960953




2143




AACAGTGGGC




C




A






SILENT-









ATGTCTTCTCG








NONCODING









CGGT[C/A]GAT









CGGTTTCTCTG









GCTCCTTCTTA






889-890




cg43962392




 910




AAAGAAGGTAG




C




T






SILENT-







 1









AGGAACTTGG








NONCODING









GAGA[C/T]TGA









GGGAAAGATA









GGAGAGAGGA









AG






891-892




cg43964611




 319




GCTGGGCTTC




G




A






SILENT-









CCCGAGCTGG








NONCODING









AGAGC[G/A]GG









GAGGACCAGC









CCTTCTCCAGG









CT






893-894




cg43965993




2007




GTCGTTTTCTC




T




C






SILENT-







 4









AAAAAAATATC








NONCODING









GTA[T/C]AAGTG









ACTCATCCTGT









CTGCTAACT






895-896




cg43966536




 659




CCCAGTATGTA




G




A






SILENT-







 7









CCACCCCGTTT








NONCODING









CTC[G/A]TAAAT









GAAGGCAGCA









GCTCCAGCCA






897-898




cg43967276




 314




CACGTGCGTG




T




C






SILENT-







 9









GGGTGTTGGC








NONCODING









ATTCT[T/C]GTT









ATTTAACACGG









GAAGGAGGTG









A






899-900




cg43967511




 276




TAGTCCTGTTG




G




C






SILENT-







12









ACCTGGAAATG








NONCODING









GTG[G/C]CAGG









TGAAGTCTCTC









CACAGCATGC






901-902




cg43968814




3546




GCCAGAGCTTC




C




T






SILENT-







17









CGCGCCCTCG








NONCODING







(17q22)









CCTG[C/T]CCA









GGTGTCCTGCT









CGCCTCCATCT






903-904




cg43969044




 734




CCTCTGATGTT




A




G






SILENT-







 5









CAGTGAAGAG








NONCODING









GACC[A/G]GAA









AAGTCTGCTAG









AGCAGTACCAT






905-906




cg43970408




 949




CTCTTTGGCTT




G




gap






SILENT-







17









GTTTTGGCGCT








NONCODING









GGC[G/gap]CTG









GCAGAGGCTG









AGACACGGCG









AG






907-908




cg43970722




 769




CCACCATCTCC




C




T






SILENT-









GTTGTTTCTGG








NONCODING









AAG[C/T]ACCCT









CCAGGCAGGC









CAGCCAGCAT






909-910




cg43971702




 705




CTCAGCTCCTC




C




T






SILENT-







20









CAAATGGTTGT








NONCODING









CCA[C/T]CCCA









GACGACTGGG









GGGTTGGTGG









A






911-912




cg43971764




4293




CCTCAGCCTCC




T




C






SILENT-







15









CAAAGTGCTGG








NONCODING









GAT[T/C]ACAG









GCATGAGCCA









CCACGCCCGG









C






913-914




cg43972482




 412




TTAGTAGAGAC




T




C






SILENT-







 8









GGGGTTTCACC




NONCODING









ATG[T/C]TGGTC









AGGCTGGTCTC









GAACTCCTG






915-916




cg43973408




 205




ACCGAGGAGC




A




G






SILENT-







19









AGGAATATGAG








NONCODING







(19q13.4)









GAGG[A/G]GCA









GCCGGAAGAG









GAGGCTGCGG









AG






917-918




cg43974489




 666




GGCTAGCCCA




T




C






SILENT-







 1









CCTGCCATGGT








NONCODING









TGCC[T/C]TTCT









GCTTGGGGAT









GCCCTGTCT









G






919-920




cg43974987




1386




CATCCTCAGAG




C




gap






SILENT-







22









TCTGAGCGGC








NONCODING









ACCG[C/gap]AG









ACCTTCTTTTTC









AAGTTCACTAA






921-922




cg43977577




 590




CGTAGTGTAAA




G




gap






SILENT-









GAACGTAAATT








NONCODING









GAA[G/gap]GCC









CCGGGCCAATT









CTGGGAAGA









G






923-924




cg43977577




 591




GTAGTGTAAAG




G




gap






SILENT-









AACGTAAATTG








NONCODING









AAG[G/gap]CCC









CGGGCCAATTC









TGGGAAGAGG









A






925-926




cg43977954




2556




TGTGAACGGC




T




A






SILENT-









CCGGAGAGAG








NONCODING









CTGGG[T/A]GG









TGTATGGGGTG









ACCTCCTGGG









GG






927-928




cg43979039




 674




CGCCCCTCACA




C




G






SILENT-







 2









GTGAAGAATCA








NONCODING









GGA[C/G]AGCC









ACTCTCTGGTT









TTCTCACAAC






929-930




cg43979039




 769




GTCCTGGTCCC




A




G






SILENT-







 2









ACACAGAGAGA








NONCODING









GGA[A/G]GCGC









CACAACCCACT









CTGCCAACCT






931-932




cg43980463




 970




AGAAGCTGGCT




G




gap






SILENT-







16









GGTAGGACCC








NONCODING









GCAG[G/gap]GA









CCAGCTGACCA









GGCTTGTGCTC









A






933-934




cg43980508




 349




TTCTTGCAATT




C




T






SILENT-







 5









ATCCAGGCAG








NONCODING







(5q13.3)









GTGA[C/T]GAC









AACTTGATGCA









GGAAATCAACC






935-936




cg43980653




1001




CCCCCTCAGC




C




gap






SILENT-







18









CATTGCCCATG








NONCODING









AGGG[C/gap]CT









CCACGTTGTCT









GATGGTCGCT









GG






937-938




cg43981661




 577




TCATGAGCCCG




C




T






SILENT-







17









CTGACCCGGT








NONCODING









GGGC[C/T]GAG









GGCAGGTAGG









GAGACTTCCTC









G






939-940




cg43982025




 488




TGGGAAGGCG




G




gap






SILENT-







19









CATATCCTGGC








NONCODING









GGCA[G/gap]CA









GCACGTGGCA









CCAGGTGCCA









GGC






941-942




cg43982782




1142




GGTCCCTGCC




G




C






SILENT-







 4









ACAGCCGTGG








NONCODING







(4p16.3)









AGGGC[G/C]GA









CGTGACCTACG









CGGCCATGGT









GG






943-944




cg43983035




1104




CAGGCGGACC




T




C






SILENT-







 5









TGGAGGTGTCA








NONCODING







(5q31.3)









GCCA[T/C]GGC









CCTGACCACTT









CAGAAGCCAAC






945-946




cg43983194




 196




GGCCAACCAT




T




C






SILENT-







17









GAGGAGTGCA








NONCODING









AAGGG[T/C]GG









CACCGGCCAG









TGCCCCTGGA









CAC






947-948




cg43983194




 451




GCAGCCATGC




G




A






SILENT-







17









GCACCTCCAC








NONCODING









GCACG[G/A]CC









GAGCTCAACCC









GAAGACCACG









CG






949-950




cg43983314




 730




CAAATGGCAAC




T




C






SILENT-









ATAATGAAATC








NONCODING









ACT[T/C]TCTGC









ATCCAGAATTA









CACCCCCAA






951-952




cg43983314




 733




ATGGCAACATA




T




A






SILENT-









ATGAAATCACT








NONCODING









TTC[T/A]GCATC









CAGAATTACAC









CCCCAAGGT






953-954




cg43983314




 856




CCGCGAGGTG




C




T






SILENT-









CCCTATGCCTA








NONCODING









CATC[C/T]GTGA









GGGCCATGAG









AAGCAGGCCG









A






955-956




cg43984242




 307




AGCTCATCCAG




C




gap






SILENT-









CTGGACCAGG








NONCODING









CGAA[C/gap]CC









CTGGCCGCTG









TGCTGAAGGA









GGT






957-958




cg43986540




1009




GACACCGCCT




G




A






SILENT-









GGCCTGGTGC








NONCODING









TCCAG[G/A]GG









TGAAGCAGGC









CAGAATCCTGG









GG






959-960




cg43987682




1309




AGCCCTTCTCC




A




G






SILENT-







17









AGCACCTTGGC








NONCODING









AAA[A/G]ATGTC









CGTCAGCACCT









CTTTGATGG






961-962




cg43991793




 915




CCCCTGTGCCA




C




T






SILENT-







 9









TCATTTGGGCC








NONCODING









CCC[C/T]AGAC









ACTGGAGGAC









AGCGTGAGATA






963-964




cg43991835




 194




AAAAGCGGCG




A




G






SILENT-







19









AGGAACGCTTG








NONCODING







(19q13.1)









AAGG[A/G]AAT









GGAGGCGGAG









ATGGCCCTGTT









T






965-966




cg43994222




 230




CACAGAGATTT




G




A






SILENT-







18









TACATCACCTT








NONCODING









TCA[G/A]AACG









CAACAGGGTC









CGGGACAGGG









A






967-968




cg43995297




 288




AGGATCCCTG




C




T






SILENT-









GCTCGCGTCC








NONCODING









CCAAC[C/T]GG









TTCCGTGTCTC









ACCTGGGTCCT









G






969-970




cg43995517




 460




TGGCCGTAGG




A




G






SILENT-







12









AGAAAATACAA








NONCODING









CCCT[A/G]CCC









GGAACAGCAAT









AGCTCCCGCC









A






971-972




cg43995517




 516




TTACCTTGGAA




C




G






SILENT-







12









CCCAGCCCTAC








NONCODING









AGC[C/G]CGAG









CAGCTGTCCCT









CTGCCTCCCC






973-974




cg43995517




 532




CCCTACAGCCC




C




T






SILENT-







12









GAGCAGCTGT








NONCODING









CCCT[C/T]TGCC









TCCCCGGGCC









CGCCCTGGCC









G






975-976




cg43997174




 735




GGTCATCACAC




A




G






SILENT-







 1









ACAAGTGGACC








NONCODING







(1p13.1)









ACC[A/G]GCCT









GAGTGCAAAAT









TCAAGTGCAC






977-978




cg43997490




 511




AACATGAGGAC




G




A






SILENT-







 1









CCTCTGGATTA








NONCODING









ATC[G/A]AATTA









CAGCTGCTAGC









CAGGAACAT






979-980




cg43997710




 242




ATGTGAGGTGT




G




A






SILENT-









GACCTCACGAA








NONCODING









GAA[G/A]CAAAT









TTAATATTATAA









TGGGAAGC






981-982




cg43997768




 261




AGATCCCAAAG




A




G






SILENT-







 8









CCCAGTACACA








NONCODING









AGT[A/G]TCTAC









GGAGCCCTCA









AGAAAATCAT






983-984




cg43999766




 367




CATGTGAAGAG




gap




T






SILENT-







17









ACCCAGCCTCT








NONCODING









TCA[gap/T]AGG









GTATCCAAGAT









AAACTTCCGTT






985-986




cg44000102




 879




CTTATTCTTCTT




T




gap






SILENT-







X









TGAATACAATG








NONCODING









AC[T/gap]TCTG









GCACTGATCG









GGTCAGTTTCT






987-988




cg44000102




 880




TTATTCTTCTTT




T




gap






SILENT-







X









GAATACAATGA








NONCODING









CT[T/gap]CTGG









CACTGATCGG









GTCAGTTTCTT






989-990




cg44000102




 882




ATTCTTCTTTGA




T




gap






SILENT-







X









ATACAATGACT




NONCODING









TC[T/gap]GGCA









CTGATCGGGTC









AGTTTCTTCC






991-992




cg44000241




 412




CTCCCTCACGG




gap




T






SILENT-









AGCCAGCGGC








NONCODING









CGGG[gap/T]AA









TGCAGACATCA









GAACGTGAGG









GG






993-994




cg44001933




 566




CTCCGCGCAC




C




G






SILENT-









AGTGGTGGCC








NONCODING









ACCGC[C/G]AC









TGGTGCTGAAG









TGTCGGCGTGT






995-996




cg44002491




 17




GCGCGCCCTT




T




C






SILENT-









CTTCCC[T/C]TA








NONCODING









CTGCGAGGAG









CCACCGCCTCT









TT






997-998




cg44003839




 280




CCCCATCACCA




A




G






SILENT-









GGCGGTTCTC








NONCODING









CCCG[A/G]TCT









CCAGCGACAG









CCCCAGGGCT









CC






999-1000




cg44003987




1374




CACACGCACAC




A




C






SILENT-









GTACATTCACT








NONCODING









ACA[A/C]ACGT









GCAGCCTCCT









GCACACGTGC









A






1001-1002




cg44005542




 627




TCCTAATCCCA




G




gap






SILENT-









TGCCAGAACC








NONCODING









GAAG[G/gap]CT









AATGGCCACAT









TCTTCTTTTAAA






1003-1004




cg44007769




1257




GGGTCTGCTG




A




T






SILENT-









AGTTGGAGGA








NONCODING









GTGCA[A/T]TGT









CGCCCTGGGA









GCCCTCCTGG









AG






1005-1006




cg44009153




 928




AGCCATGGGTT




A




G






SILENT-









TGGGTAATAAG








NONCODING









AAG[A/G]GAGA









GCATTTGGGGT









TCAAGAGAGG






1007-1008




cg44009645




5337




CTTTTCCATGT




T




C






SILENT-









GGCTCAATATC








NONCODING









AAC[T/C]TTTCC









CGTCTAATGAT









GACAAATCT






1009-1010




cg44012500




 579




GGCATCTCATC




G




gap






SILENT-







19









TTGCTGGGGCT








NONCODING







(19q13.2)









GTT[G/gap]GGT









CCCCTGGACCT









CAAATCCCAAT






1011-1012




cg44014720




 482




CAGAGCCCTGT




T




C






SILENT-









CGTGGCCCTG








NONCODING









TCCA[T/C]CTCC









TGCGCCAGGA









AACACAGGTTC






1013-1014




cg44020161




 854




CTCCAAGAGTT




G




gap






SILENT-







17









CTGGTCTCCCG








NONCODING









CGA[G/gap]GG









GCGGAGTTCC









CTCCCCAGTCC









CG






1015-1016




cg44021014




 561




GAGGCACACA




G




A






SILENT









GACAGATGATG








NONCODING









AGCA[G/A]CTCT









TCTCCTTAAAG









AAGTCTGTGT






1017-1018




cg44024536




1149




AGGGTGTGTGT




G




A






SILENT-







 9









GTGTGTGTGTG








NONCODING









TGT[G/A]TGTGT









GTGTGTGTGCG









TGTGCG






1019-1020




cg44024536




1153




TGTGTGTGTGT




G




A






SILENT-







 9









GTGTGTGTGTG








NONCODING









TGT[G/A]TGTGT









GTGTGCGTGTG









CG






1021-1022




cg44026832




 299




TGGCTGCAGA




G




A






SILENT







X









GGTTGAGCCTC








NONCODING









CTGA[G/A]CCC









CTGCTTGGTGA









CAAGGGACCT









G






1023-1024




cg44026832




 446




CTTCTGGGCTG




C




T






SILENT-







X









TGGCCCTGCTC








NONCODING









TTG[C/T]TGGCT









ACTCTCATGGA









GCAGGGCTT






1025-1026




cg44029982




1149




GGCTGTAGGG




G




A






SILENT-









GATGTTGGTCT








NONCODING









CCTG[G/A]AAAA









AGGCGCTGAG









GGCTGTCTCGA






1027-1028




cg44030164




 245




GGGTGTAGAC




G




T






SILENT-







14









GCTGCTGGCC








NONCODING









AGCCC[G/T]CC









GCAGCCGAGG









TTCTCGGCACC









GC






1029-1030




cg44031677




 249




CAGACCCGAG




C




T






SILENT-









GTGCCCAGGG








NONCODING









CATTC[C/T]GGA









GGCAGCCGAG









GGCAGCAGCT









CC






1031-1032




cg44031863




 401




GGACTTCTGCT




C




T






SILENT-







 1









GCGTCTTCGG




NONCODING









CCAC[C/T]TCTC









CTCTTGCCTTT









TGGTGGACCC






1033-1034




cg44033624




1409




GAAAAGGGATA




G




gap






SILENT-







10









CTTTGATAATTA








NONCODING







(7q31)









AG[G/gap]CCAG









AGGCCCATTAG









TTGAGAAAGT






1035-1036




cg44033878




1552




AAGGAGGGAT




T




G






SILENT-







 1









ATGTTCCACGT








NONCODING









AACT[T/G]GCTG









GGACTGTACCC









AAGAATTAAA






1037-1038




cg44034830




 701




GCAGGAGCCT




G




A






SILENT-







 5









GCAGGAGGCT








NONCODING









GGAAA[G/A]TC









AGGCTAGGGA









TATAGCAGGGA









TG






1039-1040




cg44127556




 521




TGGCGACGAC




A




G






SILENT-









TCTGGAGTGG








NONCODING









CGGAT[A/G]CG









GGGGAGGCGG









ATGTCCCTGGG









TC






1041-1042




cg44128344




 137




CCCCAGGATTC




T




C






SILENT-







19









TGGCCTGCTTC








NONCODING









ACC[T/C]CTGG









AGCACCAGGC









CAGGCGGTGT









C






1043-1044




cg44131644




 894




TGGCCCCCGA




G




C






SILENT-







21









CGAGCTGTACA








NONCODING









CGCC[G/C]CAC









AGCTGGCTGCT









CCGCGTGGTAT






1045-1046




cg44911913




 967




AGGGCGGCTG




C




T






SILENT-







15









CGGGGCTGCC








NONCODING









CCTGG[C/T]CC









CCCGGCCCCT









CCTGGGCGCC









CTC






1047-1048




cg44911913




 992




CCCCCCGGCC




C




G






SILENT-







15









CCTCCTGGGC








NONCODING









GCCCT[C/G]GT









CCCGCTCCTG









GCCCTGCTCC









CTG






1049-1050




cg44912347




 853




TCTTTACATTA




A




G






SILENT-







11









GAGCCAATTTA








NONCODING









AGA[A/G]GGCG









CTGGTATGTAT









GAGCTTTTTG






1051-1052




cg44913333




 159




CTCCCCAGAAT




A




G






SILENT-







12









TCCTAGACTGG








NONCODING









GTT[A/G]ATAGG









GTCATATTGTG









AATGTCTCA






1053-1054




cg44914547




 175




GGTCCCCCTG




C




T






SILENT-









CTTCTTCCCTG








NONCODING









CAGA[C/T]ATG









GTGGAGCTGC









TGCTGCTGCAG









A






1055-1056




cg44914547




 86




GGACCTGCAC




G




C






SILENT-









AGTGTACAGAC








NONCODING









ACAC[G/C]TGTT









CTCTGGTCCTA









TAATGCTCTA






1057-1058




cg44914864




 641




GCTCCGATGC




T




C






SILENT-









GCGATGCATTC








NONCODING









ATAG[T/C]GTCG









CCTTTCAGGAA









AGTTCGGTGT






1059-1060




cg44915149




 857




GGGTCATCGA




C




T






SILENT-









CCCCATTGATG








NONCODING









GCAC[C/T]AAG









AACTTCGTGCA









CGGGTCTGTTG






1061-1062




cg44916019




 445




CCCCTACCTGG




A




G






SILENT-







22









CCTGGCTGGC








NONCODING









CTTC[A/G]CGA









CCACACTCAAC









TACTGCGTATG






1063-1064




cg44916367




 200




GGTGCCACCA




A




G






SILENT-









GGCTCTTTTTA








NONCODING









ACAA[A/G]CAGT









TCTCACAGAAA









CTAATCAAGT






1065-1066




cg44916367




 248




AGTGAGAACTC




A




G






SILENT-









ACTTGTTACCA








NONCODING









CAA[A/G]GATG









GCACCAGGCT









ATTCATGAAGG






1067-1068




cg44919480




 413




GCCACGAAGT




A




G






SILENT-









GATTGTGTCTG








NONCODING









CAGC[A/G]TGT









GGGCGGAACC









ACACCTTGGCC









T






1069-1070




cg44919480




 504




GATGGGGCAG




C




T






SILENT-









CTGGGCCTTG








NONCODING









GCAAC[C/T]AG









ACAGACGCTGT









TCCCAGCCCC









GC






1071-1072




cg44919623




 302




GCCCGCTGCG




G




A






SILENT-









ATATGTCGTCC








NONCODING









TTTG[G/A]GAAC









CTCAGCAGCCA









GCCGTAAGTC






1073-1074




cg44919623




 307




CTGCGATATGT




C




T






SILENT-









CGTCCTTTGGG








NONCODING









AAC[C/T]TCAGC









AGCCAGCCGT









AAGTCCTCAC






1075-1076




cg44920877




 23




ACGCGTTCTGC




G




A






SILENT-









GAGGCCATGC








NONCODING









G[G/A]GTCTAT









GCCCCGCGGC









CGTTGGCCT






1077-1078




cg44920877




 27




CGCGTTCTGC




T




C






SILENT-









GAGGCCATGC








NONCODING









GGGTC[T/C]AT









GCCCCGCGGC









CGTTGGCCTC









GCC






1079-1080




cg44920877




 45




GCGGGTCTAT




G




A






SILENT-









GCCCCGCGGC








NONCODING









CGTTG[G/A]CC









TCGCCCACACC









CCCGGCCCCA









CT






1081-1082




cg44920877




 58




CCGCGGCCGT




C




T






SILENT-









TGGCCTCGCC








NONCODING









CACAC[C/T]CC









CGGCCCCACT









GCGGGTGGAG









AGA






1083-1084




cg44920877




 68




TGGCCTCGCC




A




G






SILENT-









CACACCCCCG








NONCODING









GCCCC[A/G]CT









GCGGGTGGAG









AGACGTCGGG









CCC






1085-1086




cg44923068




 71




CGGACACCCG




G




gap






SILENT-







 1









GCGGGAGCTG








NONCODING









GCGGA[G/gap]C









TCGTGAAGCG









GAAGCAGGAG









CTGG






1087-1088




cg44928274




 452




AGCAAGGGGA




G




gap






SILENT-









AGATCATCAGC








NONCODING









GGCA[G/gap]CA









GCGGCAGCCT









GCTGTCTTCAG









GT






1089-1090




cg44928732




 299




GTGTACCAGC




T




G






SILENT-









GCCTGATCCG








NONCODING









GGACA[T/G]TC









CCTGCCGCAC









GGTCACGCCT









GAC






1091-1092




cg44932136




 964




ACAGACGCGC




gap




C






SILENT-









ACACACACGC








NONCODING









GCACA[gap/C]G









ACGCACACACA









GACGCACACA









CGC






1093-1094




cg44932136




 976




ACACACGCGC




gap




A






SILENT-









ACAGACGCACA








NONCODING









CACA[gap/A]GA









CGCACACACG









CACAGACACAC









AC






1095-1096




cg44932184




 644




TGGGCACGAG




G




gap






SILENT-









CGTGGCTTCG








NONCODING









GCGGA[G/gap]C









AGGATGAACTG









TCTCAGAGACT









GG






1097-1098




cg44938456




2706




TGACCTTGACC




C




T






SILENT-







 2









AGTTTGATCAG








NONCODING









TTA[C/T]TGCCC









ACGCTGGAGA









AGGCAGCACA






1099-1100




cg44938456




2729




TACTGCCCACG




A




G






SILENT-







 2









CTGGAGAAGG








NONCODING









CAGC[A/G]CAG









TTGCCAGGCTT









ATGTGAGACAG






1101-1102




cg44938456




2738




CGCTGGAGAA




A




G






SILENT-







 2









GGCAGCACAG








NONCODING









TTGCC[A/G]GG









CTTATGTGAGA









CAGACAGGAT









GG






1103-1104




cg44938456




2744




AGAAGGCAGC




A




G






SILENT-







 2









ACAGTTGCCAG








NONCODING









GCTT[A/G]TGTG









AGACAGACAG









GATGGATGGT






1105-1106




cg44938456




2775




GACAGACAGG




A




C






SILENT-







 2









ATGGATGGTGC








NONCODING









GGTC[A/C]CCA









GTGTAACCATC









AAATCGGAGAT






1107-1108




cg44938456




2792




GTGCGGTCAC




A




G






SILENT-







 2









CAGTGTAACCA








NONCODING









TCAA[A/G]TCG









GAGATCCTGCC









AGCTTCACTTC






1109-1110




cg44938456




2807




TAACCATCAAA




A




C






SILENT-







 2









TCGGAGATCCT








NONCODING









GCC[A/C]GCTT









CACTTCAGTCC









GCCACTGCCA






1111-1112




cg5633308




 186




ACAAGGGGAA




A




T






SILENT-









CATCAACACAA








NONCODING









GAGA[A/T]GCC









TACAAAGGGC









GAGCGTGCAA









GC






1113-1114




cg32160481




 124




GAGATGGGGG




C




G




Thr




Ser




CONSERVA-




MHC




Human Gene Similar to SWISSPROT-ID: P30508




5.60E−37









TCATGGCGCC






(1193)




(1194)




TIVE





HLA CLASS I HISTOCOMPATIBILITY









CCGAA[C/G]CC










ANTIGEN, CW*1201 ALPHA CHAIN









TCCTCCTGCTG










PRECURSOR (HLA-CX52) -


HOMO SAPIENS











CTCTCAGGGG










(HUMAN), 366 aa.









CC






1115-1116




cg32160481




 164




TCTCAGGGGC




G




T




Glu




Asp




CONSERVA-




MHC




Human Gene Similar to SWISSPROT-ID: P30508




5.60E−37









CCTGGCCCTG






(1195)




(1196)




TIVE





HLA CLASS I HISTOCOMPATIBILITY









ACCGA[G/T]AC










ANTIGEN, CW*1201 ALPHA CHAIN









CTGGGCGGGT










PRECURSOR (HLA-CX52) -


HOMO SAPIENS











GAGTGCGGGG










(HUMAN), 366 aa.









TCG






1117-1118




cg27790564




 65




AGCTTGAGCAG




A




G




Val




Ala




CONSERVA-




protease




Human Gene Similar to SWISSNEW-ID: P53616




9.70E−31









AGCCTTGACCT






(1197)




(1198)




TIVE





PROTEASOME COMPONENT SUN4 -









TCT[A/G]CACAC












SACCHAROMYCES CEREVISIAE











AAACTGTTAGT










(BAKER'S YEAST), 420 aa.|









GTCGGTGTT










pcls: SWISSPROT-ID: P53616
















PROTEASOME COMPONENT SUN4 -


















SACCHAROMYCES CEREVISIAE


















(BAKER'S YEAST), 420 aa.






1119-1120




cg43967452




 418




GCGTTACCGG




G




A




Asp




Asn




CONSERVA-




ribosomalprot




Human Gene Similar to SWISSPROT-ID: P32899




2.80E−49




15









CTGCAGCGGC






(1199)




(1200)




TIVE





PUTATIVE 40S RIBOSOMAL PROTEIN









GGGAG[G/A]AC










YHR148W -


SACCHAROMYCES CEREVISIAE











TACACGCGCTA










(BAKER'S YEAST), 183 aa.









CAACCAGCTGA






1121-1122




cg39548335




 302




TGTATTCTCGT




T




C




Lys




Arg




CONSERVA-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




2.90E−48









TGTTACCGGCG






(1201)




(1202)




TIVE




FIED





P43618 HYPOTHETICAL 41.3 KD PROTEIN









GCC[T/C]TCCT











IN SAP155-YMR31 INTERGENIC REGION -









GGGAGTGCTC













Saccharomyces cerevisiae


(Baker's yeast), 361 aa.









ATTATTCATTT






1123-1124




cg38435145




 132




TGCTGGAGAAT




C




T




Ala




Val




CONSERVA-




UNCLASSI-




Human Gene Similar to TREMBLNEW-ACC:




4.10E−46









TTGGCCACAAA






(1203)




(1204)




TIVE




FIED




BM74914 KIAA0891 PROTEIN -


HOMO SAPIENS











GAG[C/T]TGCC










(HUMAN), 1371 aa (fragment).









AAGATAGCTGG









GCCAGGAAGA






1125-1126




cg42894694




1008




TTTTGTATTTTT




T




C




Val




Ala




CONSERVA-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




8.30E−34









AGTAGAGACG






(1205)




(1206)




TIVE




FIED




P39194 !!!! ALU SUBFAMILY SQ WARNING









GGG[T/C]TTCA










ENTRY -


Homo sapiens


(Human), 593 aa.









CCATGTTGGCC









AGGCTGGTCT






1127-1128




cg27847601




 281




TGAGGTGGGA




A




G




Lys




Arg




CONSERVA-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




1.00E−31









GGACTGCCTG






(1207)




(1208)




TIVE




FIED




P39191 !!!! ALU SUBFAMILY SB2 WARNING









AACCA[A/G]GG










ENTRY -


Homo sapiens


(Human), 603 aa.









AGGTGGAGGC









TGCAGTGAGC









CAA






1129-1130




cg43976973




1025




GTGAGCATCAT




C




G




Lys




Asn




NON-




ATPase_asso-




Human Gene Similar to TREMBLNEW-ID:




3.50E−33









GATGCTGCTGT






(1209)




(1210)




CONSERVA-




ciated




G263099 TAT BINDING PROTEIN 7, TBP-









CGG[C/G]TTCG








TIVE





7 = TRANSCRIPTIONAL ACTIVATOR -









GGGGGCAGCA












HOMO SAPIENS


, 458 aa.









CGTCCACCAGT






1131-1132




cg43925450




 848




GCAACTGGTG




G




T




Arg




Leu




NON-




glycoprotein




Human Gene Similar to TREMBLNEW-ID:




3.80E−37




17









CTCCAGGCCG






(1211)




(1212)




CONSERVA-





E1249608 MEMBRANE-BOUND SMALL GTP-









AAACC[G/T]AG








TIVE





BINDING - LIKE PROTEIN -


ARABIDOPSIS











GTGTGGACCTC










THALIANA (MOUSE-EAR CRESS), 217 aa.









CAGGAGAACAA









C






1133-1134




cg43305492




 342




CTCCAAAAACC




A




C




Met




Leu




NON-




immunoglob




Human Gene Similar to TREMBLNEW-ID:




7.40E−49









AGGTGGTCCTT






(1213)




(1214)




CONSERVA-





G2734101 IMMUNOGLOBULIN HEAVY CHAIN,









ACA[A/C]TGACC








TIVE





VD(5)J(4) LIKE GENE PRODUCT -


HOMO











AACATGGACCC










SAPIENS (HUMAN), 151 aa.









TGTGGACAC






1135-1136




cg34758710




 463




GGGCATCATTG




C




T




Leu




Phe




NON-




MHC




Human Gene Similar to SPTREMBL-ID: Q30916




2.20E−39









CTGGCCTGGTT






(1215)




(1216)




CONSERVA-





MHC CLASS A -


PAN TROGLODYTES











GTC[C/T]TTGCA








TIVE





(CHIMPANZEE), 357 aa (fragment).









GCTGTAGTCAC









TGGAGCTGC






1137-1138




cg43923640




 511




TCCGCTGAGG




T




C




Phe




Leu




NON-




oncogene




Human Gene Similar to SWISSPROT-ID: P24407




1.30E−34




 1









AAATTCAAGCT






(1217)




(1218)




CONSERVA





RAS-RELATED PROTEIN RAB-8 (ONCOGENE









GGTG[T/C]TCCT








TIVE





C-MEL) -


HOMO SAPIENS


(HUMAN), AND









GGGGGAGCAA












CANIS FAMILIARIS


(DOG), 207 aa.









AGCGTTGGAAA






1139-1140




cg27790564




 15




GGTACCACAGA




C




T




Ala




Thr




NON-




protease




Human Gene Similar to SWISSNEW-ID: P53616




9.70E−31









TAG[C/T]AATGG






(1219)




(1220)




CONSERVA-





PROTEASOME COMPONENT SUN4 -









AGCCGGAGAC








TIVE







SACCHAROMYCES CEREVISIAE


(BAKER'S









CTTGTTAACA










YEAST), 420 aa.|pcls: SWISSPROT-ID: P53616
















PROTEASOME COMPONENT SUN4 -


















SACCHAROMYCES CEREVISIAE


(BAKER'S
















YEAST), 420 aa.






1141-1142




cg39517733




 241




CCCTCATCAAA




G




A




Pro




Ser




NON-




struct




Human Gene Similar to SPTREMBL-




3.90E−36









GATGGGGGCT






(1221)




(1222)




CONSERVA-





ID: Q14425 GASTRIC MUCIN -


HOMO SAPIENS











GTTG[G/A]TGG








TIVE





(HUMAN), 850 aa (fragment).









GCACTTGGGG









TAGCAGCCTTC









C






1143-1144




cg39517733




 88




CCAGCATCAT




T




C




Ser




Gly




NON-




struct




Human Gene Similar to SPTREMBL-ID: Q14425




3.90E−36









GGGTACAGTT






(1223)




(1224)




CONSERVA-





GASTRIC MUCIN -


HOMO SAPIENS


(HUMAN),









CACAC[T/C]A








TIVE





850 aa (fragment).









CTCTCCGTAC









AAACGCAGG









AATAA






1145-1146




cg39565684




 240




TTCTGGGATAA




C




T




Arg




Gln




NON-




synthase




Human Gene Similar to SWISSNEW-ID: P53167




1.40E−33









GGTATTGGTAC






(1225)




(1226)




CONSERVA-





PSEUDOURIDYLATE SYNTHASE 2









CAT[C/T]GGGA








TIVE





(EC 4.2.1.70) (PSEUDOURIDINE SYNTHASE 2) -









ATCGCACGCG












SACCHAROMYCES CEREVISIAE


(BAKER'S









GATCTAGCATT










YEAST), 370 aa.|pcls: SWISSPROT-ID: P53167
















PSEUDOURIDYLATE SYNTHASE 2
















(EC 4.2.1.70) (PSEUDOURIDINE SYNTHASE 2) -


















SACCHAROMYCES CEREVISIAE


(BAKER'S
















YEAST), 370 aa.|pcls: SPTREMBL-ID: Q06713
















PSEUDOURIDINE SYNTHASE 2 -


















SACCHAROMYCES CEREVISIAE


(BAKER'S
















YEAST), 370 aa.






1147-1148




cg32152874




 143




CAAGGTGTACG




A




C




Met




Leu




NON-




transcriptfactor




Human Gene Similar to SPTREMBL-ID: Q24140




1.50E−39









TGTCCATGCCG






(1227)




(1228)




CONSERVA-





NEURON SPECIFIC ZINC FINGER TRANSCRIP-









GCC[A/C]TGGC








TIVE





TION FACTOR -


DROSOPHILA











CATGCACCTGC












MELANOGASTER (FRUIT FLY), 664 aa.











TCACGCACGA













1149-1150




cg43025141




 686




CTACGCCAAGC




G




A




Ala




Thr




NON-




transferase




Human Gene Similar to SPTREMBL-ID: O08832




1.10E−49









GCAACGCCCT






(1229)




(1230)




CONSERVA-





POLYPEPTIDE GALNAC TRANSFERASE-T4 -









GCGC[G/A]CCG








TIVE







MUS MUSCULUS


(MOUSE), 578 aa.









CCGAGGTGTG









GATGGATGACT









T






1151-1152




cg44028935




 80




GGCAGGATGA




T




A




Leu




Gln




NON-




ubiquitin




Human Gene Similar to SWISSPROT-ID: P52491




3.00E−36









TCAAGCTGTTC






(1231)




(1232)




CONSERVA-





UBIQUITIN-CONJUGATING ENZYME E2-21.2









TCGC[T/A]GAA








TIVE





KD (EC 6.3.2.19) (UBIQUITIN-PROTEIN









GCAGCAGAAG










LIGASE) (UBIQUITIN CARRIER PROTEIN) -









AAGGAGGAGG












SACCHAROMYCES CEREVISIAE


(BAKER'S









AG










YEAST), 188 aa.






1153-1154




cg44028935




 82




CAGGATGATCA




A




C




Lys




Gln




NON-




ubiquitin




Human Gene Similar to SWISSPROT-ID: P52491




3.00E−36




 8









AGCTGTTCTCG






(1233)




(1234)




CONSERVA-





UBIQUITIN-CONJUGATING ENZYME E2-21.2









CTG[A/C]AGCA








TIVE





KD (EC 6.3.2.19) (UBIQUITIN-PROTEIN









GCAGAAGAAG










LIGASE) (UBIQUITIN CARRIER PROTEIN) -









GAGGAGGAGT












SACCHAROMYCES CEREVISIAE


(BAKER'S









C










YEAST), 188 aa.






1155-1156




cg44028935




 83




AGGATGATCAA




A




T




Lys




Met




NON-




ubiquitin




Human Gene Similar to SWISSPROT-ID: P52491




3.00E−36




 8









GCTGTTCTCGC






(1235)




(1236)




CONSERVA-





UBIQUITIN-CONJUGATING ENZYME E2-21.2









TGA[A/T]GCAG








TIVE





KD (EC 6.3.2.19) (UBIQUITIN-PROTEIN









CAGAAGAAGG










LIGASE) (UBIQUITIN CARRIER PROTEIN) -









AGGAGGAGTC












SACCHAROMYCES CEREVISIAE


(BAKER'S
















YEAST), 188 aa.






1157-1158




cg44028935




 86




ATGATCAAGCT




A




G




Gln




Arg




NON-




ubiquitin




Human Gene Similar to SWISSPROT-ID: P52491




3.00E−36




 8









GTTCTCGCTGA






(1237)




(1238)




CONSERVA-





UBIQUITIN-CONJUGATING ENZYME E2-21.2









AGC[A/G]GCAG








TIVE





KD (EC 6.3.2.19) (UBIQUITIN-PROTEIN









AAGAAGGAGG










LIGASE) (UBIQUITIN CARRIER PROTEIN) -









AGGAGTCGGC












SACCHAROMYCES CEREVISIAE


(BAKER'S
















YEAST), 188 aa.






1159-1160




cg39548335




 228




ATCAGTACCTC




G




A




Gln




End




NON-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC: P43618




2.90E−48









ATCCTTGAGAC






(1239)




(1240)




CONSERVA-




FIED




HYPOTHETICAL 41.3 KD PROTEIN IN SAP155-









GTT[G/A]TTCTT








TIVE





YMR31 INTERGENIC REGION -


Saccharomyces











CAAGGGCTCTC












cerevisiae


(Baker's yeast), 361 aa.









TCCCGAAAG






1161-1162




cg39548335




 48




TCATCTATTAG




T




C




Thr




Ala




NON-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC: P43618




2.90E−48









ATTCCGTGCTT






(1241)




(1242)




CONSERVA-




FIED




HYPOTHETICAL 41.3 KD PROTEIN IN SAP155-









GAG[T/C]TTTAT








TIVE





YMR31 INTERGENIC REGION -


Saccharomyces











TAGTAGTTGTA












cerevisiae


(Baker's yeast), 361 aa.









TCGTTGCCT






1163-1164




cg39510144




 109




ATTGGGTTGTA




C




T




Pro




Leu




NON-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC:




2.00E−47









TGCAGAACTCT






(1243)




(1244)




CONSERVA-




FIED




Q12284 ERV2 PROTEIN PRECURSOR -









ATC[C/T]ATGCG








TIVE







Saccharomyces cerevisiae


(Baker's yeast), 196 aa.









GGGAATGTTCA









TATCACTTT






1165-1166




cg39565075




 28




CGGAGAAGAG




T




G




Phe




Val




NON-




UNCLASSI-




Human Gene Similar to SWISSPROT-ACC: P36121




1.40E−46









GAGTCACTGCA






(1245)




(1246)




CONSERVA-




FIED




HYPOTHETICAL 32.1 KD PROTEIN IN DBP7-









CGTG[T/G]TTCA








TIVE





GCN3 INTERGENIC REGION -


Saccharomyces











GTATGCTAATA












cerevisiae (Baker's yeast), 282 aa.











GACCAAGGCT






1167-1168 | cg21427396 | 193 | TCAAACCCATGGTCGAGAGTGGGAG[C/T]GTTATGCCATGGTTAAAGCCCGTGT | C | T | Arg (1247) | Cys (1248) | NON-CONSERVATIVE | UNCLASSIFIED | Human Gene Similar to SWISSPROT-ACC: P30870 GLUTAMATE-AMMONIA-LIGASE ADENYLYLTRANSFERASE (EC 2.7.7.42) (GLUTAMINE-SYNTHETASE ADENYLYLTRANSFERASE) (ATASE) - Escherichia coli, 946 aa. | 7.60E−44






1169-1170 | cg34394308 | 431 | GTTGATGCTGCTCAGCCGGCGCACG[G/A]CACGCCTCCAGAATTGTGCTACTCG | G | A | Gly (1249) | Asp (1250) | NON-CONSERVATIVE | UNCLASSIFIED | Human Gene Similar to SWISSPROT-ACC: P50442 GLYCINE AMIDINOTRANSFERASE PRECURSOR (EC 2.1.4.1) (L-ARGININE:GLYCINE AMIDINOTRANSFERASE) (TRANSAMIDINASE) (AT) - Rattus norvegicus (Rat), 423 aa. | 1.50E−42






1171-1172 | cg43129081 | 415 | CCTCGATCTCCTGACCTCGTGATCC[G/A]CCCACCTTGGCCTCCCAAAGTGCTG | G | A | Ala (1251) | Thr (1252) | NON-CONSERVATIVE | UNCLASSIFIED | Human Gene Similar to SWISSPROT-ACC: P39189 !!!! ALU SUBFAMILY SB WARNING ENTRY - Homo sapiens (Human), 587 aa. | 3.70E−41






1173-1174 | cg29693502 | 197 | CTGATCGACGTATTCCAACAGCTCA[T/G]TCGTCAACTCCCGGTCGCCGGCACC | T | G | Asn (1253) | Thr (1254) | NON-CONSERVATIVE | UNCLASSIFIED | Human Gene Similar to SWISSPROT-ACC: Q10379 PROBABLE GLUTAMATE-AMMONIA-LIGASE ADENYLYLTRANSFERASE (EC 2.7.7.42) (GLUTAMINE-SYNTHETASE ADENYLYLTRANSFERASE) (ATASE) - Mycobacterium | 3.00E−39






1175-1176 | cg27850036 | 136 | TAGGTGTCTAGCCAGTCCATGTCAC[C/T]GTAGACGTCCTCGGAGTAGAGGTGA | C | T | Gly (1255) | Ser (1256) | NON-CONSERVATIVE | UNCLASSIFIED | Human Gene Similar to SWISSNEW-ACC: P11653 METHYLMALONYL COA MUTASE ALPHA-SUBUNIT (EC 5.4.99.2) (MCM-ALPHA) - Propionibacterium freudenreichii shermanii, 727 aa. | 1.00E−38






1177-1178 | cg43936560 | 1406 | TTGGGCAGCTTATCTGTGTGCCCAG[G/C]CGGCATATCTGTGCATGTGCGTGTG | G | C | Ala (1257) | Pro (1258) | NON-CONSERVATIVE | UNCLASSIFIED | Human Gene Similar to SPTREMBL-ACC: Q69566 (HHV-6) U1102, VARIANT A DNA, COMPLETE VIRION GENOME - HUMAN HERPESVIRUS-6, 413 aa. | 4.70E−36






1179-1180 | cg42538578 | 142 | GCAGCACTTGGCCTTCCCTCTGTAA[C/T]AGGTGCCTTGAATTTTGGTAAAGAT | C | T | Cys (1259) | Tyr (1260) | NON-CONSERVATIVE | UNCLASSIFIED | Human Gene Similar to SWISSPROT-ACC: Q09753 BETA-DEFENSIN 1 PRECURSOR (HBD-1) (DEFENSIN, BETA 1) - Homo sapiens (Human), 68 aa. | 7.40E−34 | 8






1181-1182 | cg42475469 | 376 | CTGACCTCAAGTGATCCACCTGCCT[C/T]AGCCTCCCAAAGTGCTGGGATTACA | C | T | Glu (1261) | Lys (1262) | NON-CONSERVATIVE | UNCLASSIFIED | Human Gene Similar to SWISSPROT-ACC: P39194 !!!! ALU SUBFAMILY SQ WARNING ENTRY - Homo sapiens (Human), 593 aa. | 5.70E−32






1183-1184 | cg27847601 | 280 | CTGAGGTGGGAGGACTGCCTGAACC[A/G]AGGAGGTGGAGGCTGCAGTGAGCCA | A | G | Lys (1263) | Glu (1264) | NON-CONSERVATIVE | UNCLASSIFIED | Human Gene Similar to SWISSPROT-ACC: P39191 !!!! ALU SUBFAMILY SB2 WARNING ENTRY - Homo sapiens (Human), 603 aa. | 1.00E−31






1185-1186 | cg27847601 | 298 | CTGAACCAAGGAGGTGGAGGCTGCA[G/A]TGAGCCAAGATCACAGCACTATGCT | G | A | Val (1265) | Met (1266) | NON-CONSERVATIVE | UNCLASSIFIED | Human Gene Similar to SWISSPROT-ACC: P39191 !!!! ALU SUBFAMILY SB2 WARNING ENTRY - Homo sapiens (Human), 603 aa. | 1.00E−31






1187-1188 | cg44931270 | 1970 | CGGCGCCGGGAGGCTGTGGGTCTGG[C/gap]TGCGCACGGTCTCGGTCAGCAGAGC | C | gap | Gln (1267) | His (1268) | FRAMESHIFT | collagen | Human Gene Similar to SPTREMBL-ID: Q28396 TYPE II COLLAGEN - EQUUS CABALLUS (HORSE), 1418 aa. | 2.10E−37






1189-1190 | cg43925450 | 333 | GATCTGTCAATTTAAGCTGGTTCTG[gap/T]CTGGGGGAGTCTGCGGTAGGCAAAT | gap | T | Leu (1269) | Ser (1270) | FRAMESHIFT | glycoprotein | Human Gene Similar to TREMBLNEW-ID: E1249608 MEMBRANE-BOUND SMALL GTP-BINDING - LIKE PROTEIN - ARABIDOPSIS THALIANA (MOUSE-EAR CRESS), 217 aa. | 3.80E−37 | 17






1191-1192 | cg43941918 | 724 | CAGCCACAGTTCGTTTGATCTCCAC[gap/T]CTTGGTCCCTTGGCCGAAAGTGCGC | gap | T | Val (1271) | Ser (1272) | FRAMESHIFT | immunoglob | Human Gene Similar to TREMBLNEW-ID: G240581 IMMUNOGLOBULIN G2B VARIABLE REGION LIGHT CHAIN, AUTOANTIBODY BV04-01 VARIABLE REGION LIGHT CHAIN - MUSSP, 113 aa. | 2.80E−42
























SEQUENCE LISTING











The patent contains a lengthy “Sequence Listing” section. A copy of the “Sequence Listing” is available in electronic form from the USPTO web site (http://seqdata.uspto.gov/sequence.html?DocID=06670464B1). An electronic copy of the “Sequence Listing” will also be available from the USPTO upon request and payment of the fee set forth in 37 CFR 1.19(b)(3).












Claims
  • 1. An isolated polynucleotide consisting of the polymorphic nucleotide sequence of SEQ ID NO: 1142.
  • 2. An isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide at a polymorphic site encompassed therein, wherein the first polynucleotide comprises the polymorphic nucleotide sequence of SEQ ID NO: 1142.
  • 3. An isolated nucleic acid molecule of between 10 and 50 bases of which at least 10 contiguous bases are from SEQ ID NO: 1141 wherein said isolated nucleic acid molecule comprises the nucleotide corresponding to position 26 of SEQ ID NO: 1141 wherein said nucleotide is not a guanosine.
  • 4. The isolated nucleic acid molecule of claim 3, wherein the nucleotide at position 26 is an adenosine.
  • 5. An isolated nucleic acid molecule consisting of a sequence complementary to the isolated nucleic acid molecule of claim 3.
  • 6. An isolated nucleic acid molecule of between 10 and 50 bases of which at least 10 contiguous bases are from SEQ ID NO: 1144 wherein said isolated nucleic acid molecule comprises the nucleotide corresponding to position 26 of SEQ ID NO: 1144 wherein said nucleotide is not a cytosine.
  • 7. The isolated nucleic acid molecule of claim 6, wherein the nucleotide at position 26 is a thymidine.
  • 8. An isolated nucleic acid molecule consisting of a sequence complementary to the isolated nucleic acid molecule of claim 6.
  • 9. An isolated nucleic acid molecule of between 10 and 50 bases of which at least 10 contiguous bases are from SEQ ID NO: 1149 wherein said isolated nucleic acid molecule comprises the nucleotide corresponding to position 26 of SEQ ID NO: 1149 wherein said nucleotide is not a guanosine.
  • 10. The isolated nucleic acid molecule of claim 9, wherein the nucleotide at position 26 is an adenosine.
  • 11. An isolated nucleic acid molecule consisting of a sequence complementary to the isolated nucleic acid molecule of claim 9.
  • 12. An isolated nucleic acid molecule of between 10 and 50 bases of which at least 10 contiguous bases are from SEQ ID NO: 1179 wherein said isolated nucleic acid molecule comprises the nucleotide corresponding to position 26 of SEQ ID NO: 1179 wherein said nucleotide is not a cytosine.
  • 13. The isolated nucleic acid molecule of claim 12, wherein the nucleotide at position 26 is a thymidine.
  • 14. An isolated nucleic acid molecule consisting of a sequence complementary to the isolated nucleic acid molecule of claim 12.
  • 15. An isolated nucleic acid molecule consisting of between 10-50 contiguous bases from SEQ ID NO: 1141 wherein said isolated nucleic acid molecule comprises the nucleotide at position 26 of SEQ ID NO: 1141 wherein said nucleotide is not a guanosine, and wherein said isolated nucleic acid molecule encodes a polypeptide which has gastric mucin activity.
  • 16. The isolated nucleic acid molecule of claim 15, wherein the nucleotide at position 26 is an adenosine.
  • 17. An isolated nucleic acid molecule consisting of between 10-50 contiguous bases from SEQ ID NO: 1144 wherein said isolated nucleic acid molecule comprises the nucleotide at position 26 of SEQ ID NO: 1144 wherein said nucleotide is not a cytosine, and wherein said isolated nucleic acid molecule encodes a polypeptide which has gastric mucin activity.
  • 18. The isolated nucleic acid molecule of claim 17, wherein the nucleotide at position 26 is a thymidine.
  • 19. An isolated nucleic acid molecule consisting of between 10-50 contiguous bases from SEQ ID NO: 1149 in which the nucleotide at position 26 is not a guanosine, wherein said isolated nucleic acid molecule encodes a polypeptide which has N-acetylgalactosaminyltransferase (GALNAC transferase) activity.
  • 20. An isolated nucleic acid molecule consisting of between 10-50 contiguous bases from SEQ ID NO: 1149 wherein said isolated nucleic acid molecule comprises the nucleotide at position 26 of SEQ ID NO: 1149 wherein said nucleotide is not a guanosine, and wherein said isolated nucleic acid molecule encodes a polypeptide which has N-acetylgalactosaminyltransferase (GALNAC transferase) activity.
  • 21. An isolated nucleic acid molecule consisting of between 10-50 contiguous bases from SEQ ID NO: 1179 wherein said isolated nucleic acid molecule comprises the nucleotide at position 26 of SEQ ID NO: 1179 wherein said nucleotide is not a cytosine, and wherein said isolated nucleic acid molecule encodes a polypeptide which has defensin beta 1 (DEFB1) activity.
  • 22. The isolated nucleic acid molecule of claim 21, wherein the nucleotide at position 26 is a thymidine.
  • 23. An isolated polynucleotide consisting of a polymorphic nucleotide sequence of SEQ ID NO: 1143.
  • 24. An isolated polynucleotide consisting of a polymorphic nucleotide sequence of SEQ ID NO: 1150.
  • 25. An isolated polynucleotide consisting of a polymorphic nucleotide sequence of SEQ ID NO: 1180.
  • 26. An isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide at a polymorphic site encompassed therein, wherein the first polynucleotide comprises the polymorphic nucleotide sequence of SEQ ID NO: 1143.
  • 27. An isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide at a polymorphic site encompassed therein, wherein the first polynucleotide comprises the polymorphic nucleotide sequence of SEQ ID NO: 1150.
  • 28. An isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide at a polymorphic site encompassed therein, wherein the first polynucleotide comprises the polymorphic nucleotide sequence of SEQ ID NO: 1180.
RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 60/109,024, filed Nov. 17, 1998, which is incorporated herein by reference in its entirety.

US Referenced Citations (1)
Number Name Date Kind
5856104 Chee et al. Jan 1999 A
Foreign Referenced Citations (13)
Number Date Country
717113 Jun 1996 EP
0 785 280 Jul 1997 EP
WO 9322456 Nov 1993 WO
WO 9511995 May 1995 WO
WO 9729212 Aug 1997 WO
WO 9814466 Apr 1998 WO
WO 9814470 Apr 1998 WO
WO 9818967 May 1998 WO
WO 9820165 May 1998 WO
WO 9821316 May 1998 WO
WO 9830717 Jul 1998 WO
WO 9838846 Sep 1998 WO
WO 9856954 Dec 1998 WO
Non-Patent Literature Citations (33)
Entry
Iyer et al., “Quod erat demonstrandum? The mystery of experimental validation of apparently erroneous computational analysis of protein sequences,” Genome Biology, 2001, vol. 2, No. 12, pp. 1-11.*
Accession No. Z50788 on GenBank Database, Bensch et al., Oct. 24, 1995.*
Accession No. AA918951 on GenBank Database, NCI-CGAP, Jun. 10, 1998.*
Sulston et al., “Toward a complete human genome sequence,” Genome Research, Nov. 1998, vol. 8, No. 11, pp. 1097-1108.*
Abravaya, K., J. J. Carrino, et al. (1995). “Detection of point mutations with a modified ligase chain reaction (Gap-LCR).” Nucleic Acids Res 23(4): 675-82.
Adams, M. D., J. M. Kelley, et al. (1991). “Complementary DNA sequencing: expressed sequence tags and human genome project.” Science 252(5013): 1651-6.
Barany, F. (1991). “Genetic disease detection and DNA amplification using cloned thermostable ligase.” Proc Natl Acad Sci U S A 88(1): 189-93.
Cotton, R. G., N. R. Rodrigues, et al. (1988). “Reactivity of cytosine and thymine in single-base-pair mismatches with hydroxylamine and osmium tetroxide and its application to the study of mutations.” Proc Natl Acad Sci U S A 85(12): 4397-401.
Evans, W. E. and M. V. Relling (1999). “Pharmacogenomics: translating functional genomics into rational therapeutics.” Science 286(5439): 487-91.
Faham, M. and D. R. Cox (1995). “A novel in vivo method to detect DNA sequence variation.” Genome Res 5(5): 474-82.
Fischer, S. G. and L. S. Lerman (1983). “DNA fragments differing by single base-pair substitutions are separated in denaturing gradient gels: correspondence with melting theory.” Proc Natl Acad Sci U S A 80(6): 1579-83.
Gibbs, R. A., P. N. Nguyen, et al. (1989). “Detection of single DNA base differences by competitive oligonucleotide priming.” Nucleic Acids Res 17(7): 2437-48.
Kren, B. T., B. Parashar, et al. (1999). “Correction of the UDP-glucuronosyltransferase gene defect in the Gunn rat model of Crigler-Najjar syndrome type I with a chimeric oligonucleotide.” Proc Natl Acad Sci U S A 96(18): 10349-54.
Landegren, U., R. Kaiser, et al. (1988). “A ligase-mediated gene detection technique.” Science 241(4869): 1077-80.
Maskos, U. and E. M. Southern (1993). “A novel method for the parallel analysis of multiple mutations in multiple samples.” Nucleic Acids Res 21(9): 2269-70.
Myers, R. M., Z. Larin, et al. (1985). “Detection of single base substitutions by ribonuclease cleavage at mismatches in RNA:DNA duplexes.” Science 230(4731): 1242-6.
Newton, C. R., A. Graham, et al. (1989). “Analysis of any point mutation in DNA. The amplification refractory mutation system (ARMS).” Nucleic Acids Res 17(7): 2503-16.
Nikiforov, T. T., R. B. Rendle, et al. (1994). “Genetic Bit Analysis: a solid phase method for typing single nucleotide polymorphisms.” Nucleic Acids Res 22(20): 4167-75.
Orita, M., Y. Suzuki, et al. (1989). “Rapid and sensitive detection of point mutations and DNA polymorphisms using the polymerase chain reaction.” Genomics 5(4): 874-9.
Orita, M., H. Iwahana, et al. (1989). “Detection of polymorphisms of human DNA by gel electrophoresis as single-strand conformation polymorphisms.” Proc Natl Acad Sci U S A 86(8): 2766-70.
Orum, H., P. E. Nielsen, et al. (1993). “Single base pair mutation analysis by PNA directed PCR clamping.” Nucleic Acids Res 21(23): 5332-6.
Rhodes, M., R. Straw, et al. (1998). “A high-resolution microsatellite map of the mouse genome.” Genome Res 8(5): 531-42.
Saiki, R. K., P. S. Walsh, et al. (1989). “Genetic analysis of amplified DNA with immobilized sequence-specific oligonucleotide probes.” Proc Natl Acad Sci U S A 86(16): 6230-4.
Syvanen, A. C., K. Aalto-Setala, et al. (1990). “A primer-guided nucleotide incorporation assay in the genotyping of apolipoprotein E.” Genomics 8(4): 684-92.
Taillon-Miller, P., Z. Gu, et al. (1998). “Overlapping genomic sequences: A treasure trove of single-nucleotide polymorphisms [In Process Citation].” Genome Res 8(7): 748-54.
Thiede, C., E. Bayerdorffer, et al. (1996). “Simple and sensitive detection of mutations in the ras proto-oncogenes using PNA-mediated PCR clamping.” Nucleic Acids Res 24(5): 983-4.
Wagner, R., P. Debbie, et al. (1995). “Mutation detection using immobilized mismatch binding protein (MutS).” Nucleic Acids Res 23(19): 3944-8.
Wallace, R. B., J. Shaffer, et al. (1979). “Hybridization of synthetic oligodeoxyribonucleotides to phi chi 174 DNA: the effect of single base pair mismatch.” Nucleic Acids Res 6(11): 3543-57.
Youil, R., B. W. Kemper, et al. (1995). “Screening for mutations by enzyme mismatch cleavage with T4 endonuclease VII.” Proc Natl Acad Sci U S A 92(1): 87-91.
Xiong and Jin (1997). “Biallelic markers in genetics studies of human diseases: Their power, accuracy, and density in population-based linkage analysis.” American Journal of Human Genetics 61(4) suppl.:1759.
Fan et al. (1997). “Genetic mapping: Finding and analyzing single-nucleotide polymorphisms with high-density DNA arrays.” American Journal of Human Genetics 61(4) suppl.: 1601.
Wang et al. (1998). “Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome.” Science 280:1077-1082.
International Search Report for PCT/US 99/27293. Mailed on Dec. 21, 2000.
Provisional Applications (1)
Number Date Country
60/109024 Nov 1998 US