IL-4 receptor sequence variation associated with type 1 diabetes

1. FIELD OF THE INVENTION

[0001] The present invention relates to the fields of immunology and molecular biology. More specifically, it relates to methods and reagents for detecting nucleotide sequence variability in the IL4 receptor locus that is associated with type 1 diabetes.

2. DESCRIPTION OF RELATED ART

[0002] The immunological response to an antigen is mediated through the selective differentiation of CD4+ T helper precursor cells (Th0) to T helper type 1 (Th1) or T helper type 2 (Th2) effector cells, with functionally distinct patterns of cytokine (also described as lymphokine) secretion. Th1 cells secrete interleukin 2 (IL-2), IL-12, tumor necrosis factor (TNF), lymphotoxin (LT), and interferon gamma (IFN-γ) upon activation, and are primarily responsible for cell-mediated immunity such as delayed-type hypersensitivity. Th2 cells secrete IL-4, IL-5, IL-6, IL-9, and IL-13 upon activation, and are primarily responsible for extracellular defense mechanisms. The role of Th1 and Th2 cells is reviewed in Peltz, 1991, Immunological Reviews 123: 23-35, incorporated herein by reference.

[0003] IL4 and IL13 play a central role in IgE-dependent inflammatory reactions. IL4 induces IgE antibody production by B cells and further provides a regulatory function in the differentiation of Th0 to Th1 or Th2 effector cells by both promoting differentiation into Th2 cells and inhibiting differentiation into Th1 cells. IL13 also induces IgE antibody production by B cells.

[0004] IL4 and IL13 operate through the IL4 receptor (IL4R), found on both B and T cells, and the IL13R, found on B cells, respectively. The human IL4 receptor (IL4R) is a heterodimer comprising the IL4R α chain and γc chain. The α-chain of the IL4 receptor also serves as the α-chain of the IL13 receptor. IL4 binds to both IL4R and IL13R through the IL4R α-chain and can activate both B and T cells, whereas IL13 binds only to IL13R through the IL13R α1 chain and activates only B cells.

3. SUMMARY OF INVENTION

[0005] The present invention relates to a newly discovered association between sequence variants within the IL-4 receptor (IL4R) and type 1 diabetes. Identification of the allelic sequence variant(s) present provides information that assists in characterizing individuals according to their risk of type 1 diabetes.

[0006] Several single-nucleotide polymorphisms within the IL4R gene have been identified and are indicated in Table 2, below. Although several million sequence variants are possible from the SNPs in Table 2, not all of the possible variants have been observed.

[0007] In the methods of the invention, the genotype of the IL4R is determined in order to provide information useful for assessing an individual's risk for particular Th1-mediated diseases, in particular, type 1 diabetes. Individuals who have at least one allele statistically associated with type 1 diabetes possess a factor contributing to the risk of type 1 diabetes. The statistical association of IL4R alleles (sequence variants) is shown in the examples.

[0008] As IL4R is but one component of the complex system of genes involved in an immune response, the effect of the IL4R locus is expected to be small. Other factors, such as an individual's HLA genotype, may exert dominating effects which, in some cases, may mask the effect of the IL4R genotype. For example, particular HLA genotypes are known to have a major effect on the likelihood of type 1 diabetes (see Noble et al., 1996, Am. J. Hum. Genet. 59:1134-1148, incorporated herein by reference). The IL4R genotype is likely to be more informative as an indicator of predisposition towards type 1 diabetes among individuals who have HLA genotypes that confer neither increased nor decreased risk. Furthermore, because allele frequencies at other loci relevant to immune system-related diseases differ between populations and, thus, populations exhibit different risks for immune system-related diseases, it is expected that the effect of the IL4R genotype may be of different magnitude in some populations. Although the contribution of the IL4R genotype may be relatively minor by itself, genotyping at the IL4R locus will contribute information that is, nevertheless, useful for a characterization of an individual's predisposition towards type 1 diabetes. The IL4R genotype information may be particularly useful when combined with genotype information from other loci.

[0009] The present invention provides preferred methods, reagents, and kits for IL4R genotyping. The genotype can be determined using any method capable of identifying nucleotide variation consisting of single nucleotide polymorphic sites. The particular method used is not a critical aspect of the invention. A number of suitable methods are described below.

[0010] In a preferred embodiment of the invention, genotyping is carried out using oligonucleotide probes specific to variant sequences. Preferably, a region of the IL4R gene which encompasses the probe hybridization region is amplified prior to, or concurrent with, the probe hybridization. Probe-based assays for the detection of sequence variants are well known in the art.

[0011] Alternatively, genotyping is carried out using allele-specific amplification or extension reactions, wherein allele-specific primers are used which support primer extension only if the targeted variant sequence is present. Typically, an allele-specific primer hybridizes to the IL4R gene such that the 3′ terminal nucleotide aligns with a polymorphic position. Allele-specific amplification reactions and allele-specific extension reactions are well known in the art.

4. BRIEF DESCRIPTION OF THE FIGURE

[0012] FIG. 1 provides a schematic of a molecular haplotyping method.

5. BRIEF DESCRIPTION OF THE TABLES

[0013] Table 1 provides the nucleotide sequence of the coding region of an IL4R (SEQ ID NO: 2);

[0014] Table 2 provides IL4R SNPs useful in the methods of the invention;

[0015] Table 3 provides probes used to identify IL4R polymorphisms (SEQ ID NO: 3-19);

[0016] Table 4 provides computationally estimated haplotype frequencies compared between Filipino controls and diabetics (SEQ ID NO: 20-24);

[0017] Table 5 provides genotypes of affected and nonaffected individuals;

[0018] Table 6 provides single nucleotide polymorphisms detected;

[0019] Table 7 provides amplicon primers and lengths (SEQ ID NO: 25-36);

[0020] Table 8 provides hybridization probes and titers (SEQ ID NO: 37-53);

[0021] Table 9 provides allele frequency of wildtype allele in HBDI founders;

[0022] Table 10 provides D′ and Δ values for pairs of IL4R SNPs;

[0023] Table 11A provides results of single locus TDT analysis;

[0024] Table 11B provides results of single locus TDT analysis;

[0025] Table 12 provides allele-specific PCR primers (SEQ ID NO: 54-62);

[0026] Table 13 provides IBD distributions for IL4R haplotypes;

[0027] Table 14A provides haplotype transmissions;

[0028] Table 14B provides haplotype transmissions;

[0029] Table 14C provides haplotype transmissions;

[0030] Table 15A provides SNP by SNP allele transmissions;

[0031] Table 15B provides SNP by SNP allele transmissions;

[0032] Table 16A provides a TDT analysis;

[0033] Table 16B provides a TDT analysis;

[0034] Table 16C provides a TDT analysis;

[0035] Table 17A provides a TDT analysis;

[0036] Table 17B provides a TDT analysis;

[0037] Table 18 provides allele frequencies in Filipino controls and diabetics;

[0038] Table 19 provides estimated haplotype frequencies; and

[0039] Table 20 provides observed haplotype frequencies.

6. DETAILED DESCRIPTION OF THE INVENTION

[0040] To aid in understanding the invention, several terms are defined below.

[0041] The term “IL4R gene” refers to the genomic nucleic acid sequence that encodes the interleukin 4 receptor protein. The nucleotide sequence of a gene, as used herein, encompasses coding regions, referred to as exons, intervening, non-coding regions, referred to as introns, and upstream or downstream regions. Upstream or downstream regions can include regions of the gene that are transcribed but not part of an intron or exon, or regions of the gene that comprise, for example, binding sites for factors that modulate gene transcription. The sequence of a human mRNA for IL4R is provided at GenBank accession number X52425 (SEQ ID NO: 1). The coding region is provided as SEQ ID NO: 2.

[0042] The term “allele”, as used herein, refers to a sequence variant of the gene. Alleles are identified with respect to one or more polymorphic positions, with the rest of the gene sequence unspecified. For example, an IL4R allele may be defined by the nucleotide present at a single SNP, or by the nucleotides present at a plurality of SNPs. In certain embodiments of the invention, an IL4R allele is defined by the genotypes of 6, 7 or 8 IL4R SNPs. Examples of such IL4R SNPs are provided in Table 2, below.

[0043] For convenience, the allele present at the higher or highest frequency in the population will be referred to as the wild-type allele; less frequent allele(s) will be referred to as mutant allele(s). This designation of an allele as a mutant is meant solely to distinguish the allele from the wild-type allele and is not meant to indicate a change or loss of function.

[0044] The term “predisposing allele” refers to an allele that is positively associated with an autoimmune disease such as type 1 diabetes. The presence of a predisposing allele in an individual could be indicative that the individual has an increased risk for the disease relative to an individual without the allele.

[0045] The term “protective allele” refers to an allele that is negatively associated with an autoimmune disease such as type 1 diabetes. The presence of a protective allele in an individual could be indicative that the individual has a decreased risk for the disease relative to an individual without the allele.

[0046] The terms “polymorphic” and “polymorphism”, as used herein, refer to the condition in which two or more variants of a specific genomic sequence, or the encoded amino acid sequence, can be found in a population. The terms refer either to the nucleic acid sequence or the encoded amino acid sequence; the use will be clear from the context. The polymorphic region or polymorphic site refers to a region of the nucleic acid where the nucleotide difference that distinguishes the variants occurs, or, for amino acid sequences, a region of the amino acid where the amino acid difference that distinguishes the protein variants occurs. As used herein, a “single nucleotide polymorphism”, or SNP, refers to a polymorphic site consisting of a single nucleotide position.

[0047] The term “genotype” refers to a description of the alleles of a gene or genes contained in an individual or a sample. As used herein, no distinction is made between the genotype of an individual and the genotype of a sample originating from the individual. Although, typically, a genotype is determined from samples of diploid cells, a genotype can be determined from a sample of haploid cells, such as a sperm cell.

[0048] The term “haplotype” refers to a description of the variants of a gene or genes contained on a single chromosome, i.e., the genotype of a single chromosome.
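The distinction between a genotype and a haplotype can be illustrated with a minimal sketch that enumerates the haplotype pairs consistent with an unphased diploid genotype. The function name and the two-SNP example are hypothetical and serve only as an illustration; they are not part of the disclosed methods.

```python
from itertools import product

def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    genotype: list of 2-tuples, one per SNP, giving the two alleles observed
    at that SNP in a diploid sample (hypothetical input format).
    """
    pairs = set()
    # For each SNP, choose which observed allele sits on the first chromosome;
    # the other allele is then assigned to the second chromosome.
    for choice in product((0, 1), repeat=len(genotype)):
        hap1 = "".join(g[c] for g, c in zip(genotype, choice))
        hap2 = "".join(g[1 - c] for g, c in zip(genotype, choice))
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)

# A sample heterozygous at two hypothetical SNPs (A/G and T/C) is consistent
# with two different haplotype pairs, so the genotype alone does not fix phase.
double_het = [("A", "G"), ("T", "C")]
print(compatible_haplotype_pairs(double_het))

# A sample heterozygous at only one SNP has a single consistent pair.
single_het = [("A", "G"), ("T", "T")]
print(compatible_haplotype_pairs(single_het))
```

A sample heterozygous at two or more SNPs is therefore ambiguous with respect to phase unless the haplotypes are determined directly or estimated.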

[0049] The term “target region” refers to a region of a nucleic acid which is to be analyzed and usually includes a polymorphic region.

[0050] Individual amino acids in a sequence are represented herein as AN or NA, wherein A is the amino acid in the sequence and N is the position in the sequence. In the case that position N is polymorphic, it is convenient to designate the more frequent variant as A1N and the less frequent variant as NA2. Alternatively, the polymorphic site, N, is represented as A1NA2, wherein A1 is the amino acid in the more common variant and A2 is the amino acid in the less common variant. Either the one-letter or three-letter codes are used for designating amino acids (see Lehninger, Biochemistry 2nd ed., 1975, Worth Publishers, Inc. New York, N.Y.: pages 73-75, incorporated herein by reference). For example, I50V represents a single-amino-acid polymorphism at amino acid position 50, wherein isoleucine is present in the more frequent protein variant in the population and valine is present in the less frequent variant. The amino acid positions are numbered based on the sequence of the mature IL4R protein, as described below.

[0051] Representations of nucleotides and single nucleotide changes in DNA sequences are analogous. For example, A398G represents a single nucleotide polymorphism at nucleotide position 398, wherein adenine is present in the more frequent (wild-type) allele in the population and guanine is present in the less frequent (mutant) allele. The nucleotide positions are numbered based on the IL4R coding region sequence provided as SEQ ID NO: 2, shown below. It will be clear that in a double stranded form, the complementary strand of each allele will contain the complementary base at the polymorphic position.
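The A1NA2 notation described above can be handled mechanically. The following minimal sketch parses a variant name into its more frequent residue, position, and less frequent residue, and applies a nucleotide substitution to a reference sequence. The function names and the short reference string are hypothetical; the string is a toy sequence, not a portion of SEQ ID NO: 2.

```python
import re

# One or more letters, a position, one or more letters (e.g. A398G, I50V, Ile50Val).
_VARIANT = re.compile(r"^([A-Za-z]+)(\d+)([A-Za-z]+)$")

def parse_variant(name):
    """Split a name of the form A1NA2 into (common, position, rare)."""
    match = _VARIANT.match(name)
    if match is None:
        raise ValueError(f"not in A1NA2 form: {name!r}")
    common, position, rare = match.groups()
    return common, int(position), rare

def apply_snp(reference, name):
    """Return the variant sequence for a single-nucleotide change.

    Positions are 1-based. The reference must carry the more frequent
    nucleotide at the named position, otherwise an error is raised.
    """
    common, position, rare = parse_variant(name)
    reference = reference.lower()
    if reference[position - 1] != common.lower():
        raise ValueError(f"reference has {reference[position - 1]!r} at {position}")
    return reference[:position - 1] + rare.lower() + reference[position:]

print(parse_variant("A398G"))  # nucleotide polymorphism
print(parse_variant("I50V"))   # amino acid polymorphism

toy_reference = "ccgatcaa"     # hypothetical sequence with 'a' at position 4
print(apply_snp(toy_reference, "A4G"))
```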

[0052] Conventional techniques of molecular biology and nucleic acid chemistry, which are within the skill of the art, are fully explained in the literature. See, for example, Sambrook et al., 1989, Molecular Cloning—A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York; Oligonucleotide Synthesis (M. J. Gait, ed., 1984); Nucleic Acid Hybridization (B. D. Hames and S. J. Higgins, eds., 1984); the series, Methods in Enzymology (Academic Press, Inc.); and the series, Current Protocols in Human Genetics (Dracopoli et al., eds., 1984 with quarterly updates, John Wiley & Sons, Inc.); all of which are incorporated herein by reference. All patents, patent applications, and publications mentioned herein, both supra and infra, are incorporated herein by reference.

METHODS OF THE INVENTION

[0053] The present invention provides methods of determining an individual's risk for any autoimmune disease or condition or any Th1-mediated disease. Such diseases or conditions include, but are not limited to, rheumatoid arthritis, multiple sclerosis, type 1 diabetes mellitus (insulin dependent diabetes mellitus or IDDM), inflammatory bowel diseases, systemic lupus erythematosus, psoriasis, scleroderma, Graves' disease, systemic sclerosis, myasthenia gravis, Guillain-Barré syndrome and Hashimoto's thyroiditis. In certain embodiments of the invention, the methods are used to determine an individual's risk for IDDM. Preferably, the individual is a human.

[0054] IL4R mRNA Sequence

[0055] The nucleotide sequence of the coding region of an IL4R mRNA is available from GenBank under accession number X52425, nucleotides 176-2653, and is provided as SEQ ID NO: 2, shown in a 5′ to 3′ orientation in Table 1, below. The IL4R mRNA is provided as SEQ ID NO: 1. Although only one strand of the nucleic acid is shown in Table 1, those of skill in the art will recognize that SEQ ID NO: 1 and SEQ ID NO: 2 identify regions of double-stranded genomic nucleic acid, and that the sequences of both strands are fully specified by the sequence information provided.

TABLE 1 (SEQ ID NO: 2)

   1 atggggtggc tttgctctgg gctcctgttc cctgtgagct gcctggtcct gctgcaggtg
  61 gcaagctctg ggaacatgaa ggtcttgcag gagcccacct gcgtctccga ctacatgagc
 121 atctctactt gcgagtggaa gatgaatggt cccaccaatt gcagcaccga gctccgcctg
 181 ttgtaccagc tggtttttct gctctccgaa gcccacacgt gtatccctga gaacaacgga
 241 ggcgcggggt gcgtgtgcca cctgctcatg gatgacgtgg tcagtgcgga taactataca
 301 ctggacctgt gggctgggca gcagctgctg tggaagggct ccttcaagcc cagcgagcat
 361 gtgaaaccca gggccccagg aaacctgaca gttcacacca atgtctccga cactctgctg
 421 ctgacctgga gcaacccgta tccccctgac aattacctgt ataatcatct cacctatgca
 481 gtcaacattt ggagtgaaaa cgacccggca gatttcagaa tctataacgt gacctaccta
 541 gaaccctccc tccgcatcgc agccagcacc ctgaagtctg ggatttccta cagggcacgg
 601 gtgagggcct gggctcagtg ctataacacc acctggagtg agtggagccc cagcaccaag
 661 tggcacaact cctacaggga gcccttcgag cagcacctcc tgctgggcgt cagcgtttcc
 721 tgcattgtca tcctggccgt ctgcctgttg tgctatgtca gcatcaccaa gattaagaaa
 781 gaatggtggg atcagattcc caacccagcc cgcagccgcc tcgtggctat aataatccag
 841 gatgctcagg ggtcacagtg ggagaagcgg tcccgaggcc aggaaccagc caagtgccca
 901 cactggaaga attgtcttac caagctcttg ccctgttttc tggagcacaa catgaaaagg
 961 gatgaagatc ctcacaaggc tgccaaagag atgcctttcc agggctctgg aaaatcagca
1021 tggtgcccag tggagatcag caagacagtc ctctggccag agagcatcag cgtggtgcga
1081 tgtgtggagt tgtttgaggc cccggtggag tgtgaggagg aggaggaggt agaggaagaa
1141 aaagggagct tctgtgcatc gcctgagagc agcagggatg acttccagga gggaagggag
1201 ggcattgtgg cccggctaac agagagcctg ttcctggacc tgctcggaga ggagaatggg
1261 ggcttttgcc agcaggacat gggggagtca tgccttcttc caccttcggg aagtacgagt
1321 gctcacatgc cctgggatga gttcccaagt gcagggccca aggaggcacc tccctggggc
1381 aaggagcagc ctctccacct ggagccaagt cctcctgcca gcccgaccca gagtccagac
1441 aacctgactt gcacagagac gcccctcgtc atcgcaggca accctgctta ccgcagcttc
1501 agcaactccc tgagccagtc accgtgtccc agagagctgg gtccagaccc actgctggcc
1561 agacacctgg aggaagtaga acccgagatg ccctgtgtcc cccagctctc tgagccaacc
1621 actgtgcccc aacctgagcc agaaacctgg gagcagatcc tccgccgaaa tgtcctccag
1681 catggggcag ctgcagcccc cgtctcggcc cccaccagtg gctatcagga gtttgtacat
1741 gcggtggagc agggtggcac ccaggccagt gcggtggtgg gcttgggtcc cccaggagag
1801 gctggttaca aggccttctc aagcctgctt gccagcagtg ctgtgtcccc agagaaatgt
1861 gggtttgggg ctagcagtgg ggaagagggg tataagcctt tccaagacct cattcctggc
1921 tgccctgggg accctgcccc agtccctgtc cccttgttca cctttggact ggacagggag
1981 ccacctcgca gtccgcagag ctcacatctc ccaagcagct ccccagagca cctgggtctg
2041 gagccggggg aaaaggtaga ggacatgcca aagcccccac ttccccagga gcaggccaca
2101 gacccccttg tggacagcct gggcagtggc attgtctact cagcccttac ctgccacctg
2161 tgcggccacc tgaaacagtg tcatggccag gaggatggtg gccagacccc tgtcatggcc
2221 agtccttgct gtggctgctg ctgtggagac aggtcctcgc cccctacaac ccccctgagg
2281 gccccagacc cctctccagg tggggttcca ctggaggcca gtctgtgtcc ggcctccctg
2341 gcaccctcgg gcatctcaga gaagagtaaa tcctcatcat ccttccatcc tgcccctggc
2401 aatgctcaga gctcaagcca gacccccaaa atcgtgaact ttgtctccgt gggacccaca
2461 tacatgaggg tctcttag

[0056] In the methods of the present invention, the genotype of one or more SNPs in the IL4R gene are determined. The SNPs can be any SNPs in the IL4R locus including SNPs in exons, introns or upstream or downstream regions. Examples of such SNPs are provided in Table 2, below, and discussed in detail in the Examples.

[0057] In certain embodiments, the genotype of one IL4R SNP can be used to determine an individual's risk for an autoimmune disease. In other embodiments, the genotypes of a plurality of IL4R SNPs are used. For example, in certain embodiments, the genotypes of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or 26 of the SNPs in Table 2 can be used to determine an individual's risk for an autoimmune disease.

TABLE 2
IL4R SNPs

dbSNP ID   Exon  Variation   WT      Var     Acc X52425.1  Acc AC004525.1  Formal
(rs#)                        allele  allele  (cDNA)        (genomic)       SNP name
-          P     C(−3223)T   G       A       NA            128387          G128387A
-          P     T(−1914)C   A       G       NA            127078          A127078G
-          3     I50V        A       G       398           94272           A398G
-          4     N142N       C       T       676           92548           C676T
-          4     C92516T     C       T       Na            92516           C92516T
-          4     A92417T     A       T       Na            92417           A92417T
2234896    7     P249P       C       G       997           80189           C997G
2234897    9     F288F       T       C       1114          76868           T1114C
1805011    9     E375A       A       C       1374          76608           A1374C
-          9     E375E       G       A       1375          76607           G1375A
2234898    9     L389L       G       T       1417          76565           G1417T
1805012    9     C406R       T       C       1466          76516           T1466C
2234899    9     C406C       C       T       1468          76514           C1468T
2234900    9     L408L       T       C       1474          76508           T1474C
1805013    9     S411L       C       T       1482          76500           C1482T
1805015    9     S478P       T       C       1682          76300           T1682C
-          9     V554I       G       A       1910          76072           G1910A
-          9     P650S       C       T       2198          75784           C2198T
1805016    9     S727A       T       G       2429          75553           T2429G
-          9     G759G       C       T       2567          75455           C2567T
1805014    9     S761P       T       C       2531          75451           T2531C
-          9     P774P       T       C       2572          75410           T2572C
1049631    9     3′UTR       G       A       3044          74938           G3044A
8832       9     3′UTR       A       G       3289          74693           A3289G
8674       9     3′UTR       C       T       3391          74581           C3391T

[0058] Genotyping Methods

[0059] In the methods of the present invention, the alleles present in a sample are identified by identifying the nucleotide present at one or more of the polymorphic sites. Any type of tissue containing IL4R nucleic acid may be used for determining the IL4R genotype of an individual. A number of methods are known in the art for identifying the nucleotide present at a single nucleotide polymorphism. The particular method used to identify the genotype is not a critical aspect of the invention. Although considerations of performance, cost, and convenience will make particular methods more desirable than others, it will be clear that any method that can identify the nucleotide present will provide the information needed to identify the genotype. Preferred genotyping methods involve DNA sequencing, allele-specific amplification, or probe-based detection of amplified nucleic acid.

[0060] IL4R alleles can be identified by DNA sequencing methods, such as the chain termination method (Sanger et al., 1977, Proc. Natl. Acad. Sci. 74:5463-5467, incorporated herein by reference), which are well known in the art. In one embodiment, a subsequence of the gene encompassing the polymorphic site is amplified and either cloned into a suitable plasmid and then sequenced, or sequenced directly. PCR-based sequencing is described in U.S. Pat. No. 5,075,216; Brow, in PCR Protocols, 1990, (Innis et al., eds., Academic Press, San Diego), chapter 24; and Gyllensten, in PCR Technology, 1989 (Erlich, ed., Stockton Press, New York), chapter 5; each incorporated herein by reference. Typically, sequencing is carried out using one of the automated DNA sequencers which are commercially available from, for example, PE Biosystems (Foster City, Calif.), Pharmacia (Piscataway, N.J.), Genomyx Corp. (Foster City, Calif.), LI-COR Biotech (Lincoln, Nebr.), GeneSys Technologies (Sauk City, Wis.), and Visible Genetics, Inc. (Toronto, Canada).

[0061] IL4R alleles can be identified using amplification-based genotyping methods. A number of nucleic acid amplification methods have been described which can be used in assays capable of detecting single base changes in a target nucleic acid. A preferred method is the polymerase chain reaction (PCR), which is now well known in the art, and described in U.S. Pat. Nos. 4,683,195; 4,683,202; and 4,965,188; each incorporated herein by reference. Examples of the numerous articles published describing methods and applications of PCR are found in PCR Applications, 1999, (Innis et al., eds., Academic Press, San Diego); PCR Strategies, 1995, (Innis et al., eds., Academic Press, San Diego); PCR Protocols, 1990, (Innis et al., eds., Academic Press, San Diego); and PCR Technology, 1989, (Erlich, ed., Stockton Press, New York); each incorporated herein by reference. Commercial vendors, such as PE Biosystems (Foster City, Calif.), market PCR reagents and publish PCR protocols.

[0062] Other suitable amplification methods include the ligase chain reaction (Wu and Wallace 1988, Genomics 4:560-569); the strand displacement assay (Walker et al., 1992, Proc. Natl. Acad. Sci. USA 89:392-396, Walker et al. 1992, Nucleic Acids Res. 20:1691-1696, and U.S. Pat. No. 5,455,166); and several transcription-based amplification systems, including the methods described in U.S. Pat. Nos. 5,437,990; 5,409,818; and 5,399,491; the transcription amplification system (TAS) (Kwoh et al., 1989, Proc. Natl. Acad. Sci. USA 86:1173-1177); and self-sustained sequence replication (3SR) (Guatelli et al., 1990, Proc. Natl. Acad. Sci. USA 87:1874-1878 and WO 92/08800); each incorporated herein by reference. Alternatively, methods that amplify the probe to detectable levels can be used, such as Qβ-replicase amplification (Kramer and Lizardi, 1989, Nature 339:401-402, and Lomeli et al., 1989, Clin. Chem. 35:1826-1831, both of which are incorporated herein by reference). A review of known amplification methods is provided in Abramson and Myers, 1993, Current Opinion in Biotechnology 4:41-47, incorporated herein by reference.

[0063] Genotyping also can be carried out by detecting IL4R mRNA. Amplification of RNA can be carried out by first reverse-transcribing the target RNA using, for example, a viral reverse transcriptase, and then amplifying the resulting cDNA, or using a combined high-temperature reverse-transcription-polymerase chain reaction (RT-PCR), as described in U.S. Patent Nos. 5,310,652; 5,322,770; 5,561,058; 5,641,864; and 5,693,517; each incorporated herein by reference (see also Myers and Sigua, 1995, in PCR Strategies, supra, chapter 5).

[0064] IL4R alleles can be identified using allele-specific amplification or primer extension methods, which are based on the inhibitory effect of a terminal primer mismatch on the ability of a DNA polymerase to extend the primer. To detect an allele sequence using an allele-specific amplification- or extension-based method, a primer complementary to the IL4R gene is chosen such that the 3′ terminal nucleotide hybridizes at the polymorphic position. In the presence of the allele to be identified, the primer matches the target sequence at the 3′ terminus and the primer is extended. In the presence of only the other allele, the primer has a 3′ mismatch relative to the target sequence and primer extension is either eliminated or significantly reduced. Allele-specific amplification- or extension-based methods are described in, for example, U.S. Pat. Nos. 5,137,806; 5,595,890; 5,639,611; and U.S. Pat. No. 4,851,331, each incorporated herein by reference.
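The placement of the 3′ terminal nucleotide at the polymorphic position can be sketched as follows. This is a minimal illustration only: the function name, the primer length, and the toy template are hypothetical, and practical primer selection also depends on factors such as melting temperature and secondary structure, which are not modeled here.

```python
def allele_specific_primers(sense, position, wt, mut, length=20):
    """Return (wt_primer, mut_primer), written 5' to 3'.

    The primers are copies of the sense strand, so they hybridize to the
    complementary strand; each ends at the 1-based polymorphic position,
    so that only the matching allele gives a perfectly paired 3' terminus.
    """
    sense = sense.lower()
    wt, mut = wt.lower(), mut.lower()
    if sense[position - 1] != wt:
        raise ValueError("sense strand does not carry the wild-type base")
    start = max(0, position - length)
    shared = sense[start:position - 1]  # identical 5' portion of both primers
    return shared + wt, shared + mut

# Toy template (hypothetical, not from SEQ ID NO: 2) with 'g' at position 15.
toy_template = "gattacagattacagattaca"
wt_primer, mut_primer = allele_specific_primers(toy_template, 15, "G", "A", length=8)
print(wt_primer, mut_primer)
```

The two primers differ only in their final base, which is the property the extension-based methods rely on.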

[0065] Using allele-specific amplification-based genotyping, identification of the alleles requires only detection of the presence or absence of amplified target sequences. Methods for the detection of amplified target sequences are well known in the art. For example, gel electrophoresis (see Sambrook et al., 1989, supra.) and the probe hybridization assays described above have been used widely to detect the presence of nucleic acids.

[0066] Allele-specific amplification-based methods of genotyping can facilitate the identification of haplotypes, as described in the examples. Essentially, the allele-specific amplification is used to amplify a region encompassing multiple polymorphic sites from only one of the two alleles in a heterozygous sample. The SNP variants present within the amplified sequence are then identified, such as by probe hybridization or sequencing.

[0067] An alternative probe-less method, referred to herein as a kinetic-PCR method, in which the generation of amplified nucleic acid is detected by monitoring the increase in the total amount of double-stranded DNA in the reaction mixture, is described in Higuchi et al., 1992, Bio/Technology 10:413-417; Higuchi et al., 1993, Bio/Technology 11:1026-1030; Higuchi and Watson, in PCR Applications, supra, Chapter 16; U.S. Pat. Nos. 5,994,056 and 6,171,785; and European Patent Publication Nos. 487,218 and 512,334, each incorporated herein by reference. The detection of double-stranded target DNA relies on the increased fluorescence that DNA-binding dyes, such as ethidium bromide, exhibit when bound to double-stranded DNA. The increase of double-stranded DNA resulting from the synthesis of target sequences results in an increase in the amount of dye bound to double-stranded DNA and a concomitant detectable increase in fluorescence. For genotyping using the kinetic-PCR methods, amplification reactions are carried out using a pair of primers specific for one of the alleles, such that each amplification can indicate the presence of a particular allele. By carrying out two amplifications, one using primers specific for the wild-type allele and one using primers specific for the mutant allele, the genotype of the sample with respect to that SNP can be determined. Similarly, by carrying out four amplifications, each using one of the four possible combinations of an allele-specific upstream primer and an allele-specific downstream primer, the genotype of the sample with respect to two SNPs can be determined. This gives haplotype information for a pair of SNPs.
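The logic of the two-reaction and four-reaction designs can be summarized in a short sketch. The function names and the boolean encoding of each reaction outcome (amplification detected or not) are assumptions made for illustration; how a reaction is scored as positive is not addressed here.

```python
def call_genotype(wt_amplified, mut_amplified):
    """Genotype at one SNP from two allele-specific amplifications."""
    if wt_amplified and mut_amplified:
        return "heterozygous"
    if wt_amplified:
        return "homozygous wild-type"
    if mut_amplified:
        return "homozygous mutant"
    return "no call"  # neither reaction amplified

def call_haplotypes(amplified):
    """Haplotypes for a pair of SNPs from four amplifications.

    amplified maps (upstream_allele, downstream_allele) to True/False.
    A product forms only when both primers match the same chromosome,
    so each positive reaction identifies one haplotype in the sample.
    """
    return sorted(pair for pair, positive in amplified.items() if positive)

print(call_genotype(True, True))

# Hypothetical double heterozygote whose alleles are in cis (wt-wt and mut-mut).
results = {
    ("wt", "wt"): True,
    ("wt", "mut"): False,
    ("mut", "wt"): False,
    ("mut", "mut"): True,
}
print(call_haplotypes(results))
```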

[0068] Alleles can be identified using probe-based methods, which rely on the difference in stability of hybridization duplexes formed between the probe and the IL4R alleles, which differ in the degree of complementarity. Under sufficiently stringent hybridization conditions, stable duplexes are formed only between the probe and the target allele sequence. The presence of stable hybridization duplexes can be detected by any of a number of well known methods. In general, it is preferable to amplify the nucleic acid prior to hybridization in order to facilitate detection. However, this is not necessary if sufficient nucleic acid can be obtained without amplification.

[0069] A probe suitable for use in the probe-based methods of the present invention, which contains a hybridizing region either substantially complementary or exactly complementary to a target region of SEQ ID NO: 2 or the complement of SEQ ID NO: 2, wherein the target region encompasses the polymorphic site, and exactly complementary to one of the two allele sequences at the polymorphic site, can be selected using the guidance provided herein and well known in the art. Similarly, suitable hybridization conditions, which depend on the exact size and sequence of the probe, can be selected empirically using the guidance provided herein and well known in the art. The use of oligonucleotide probes to detect single base pair differences in sequence is described in, for example, Conner et al., 1983, Proc. Natl. Acad. Sci. USA 80:278-282, and U.S. Pat. Nos. 5,468,613 and 5,604,099, each incorporated herein by reference.
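As a rough illustration of probe selection, the following sketch builds a pair of allele-specific probes centered on a polymorphic site and estimates a starting melting temperature for each. The function names, flank length, and toy sequence are hypothetical. The temperature estimate uses the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is an approximation for short oligonucleotides and is an assumption of this sketch rather than a method taken from the present description; as noted above, actual hybridization conditions are selected empirically.

```python
_COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.lower().translate(_COMPLEMENT)[::-1]

def probe_pair(sense, position, wt, mut, flank=8):
    """Return (wt_probe, mut_probe) spanning the 1-based polymorphic position."""
    sense = sense.lower()
    wt, mut = wt.lower(), mut.lower()
    if sense[position - 1] != wt:
        raise ValueError("sense strand does not carry the wild-type base")
    left = sense[max(0, position - 1 - flank):position - 1]
    right = sense[position:position + flank]
    return left + wt + right, left + mut + right

def wallace_tm(oligo):
    """Rule-of-thumb melting temperature in degrees Celsius."""
    oligo = oligo.lower()
    at = oligo.count("a") + oligo.count("t")
    gc = oligo.count("g") + oligo.count("c")
    return 2 * at + 4 * gc

# Toy sequence (hypothetical, not from SEQ ID NO: 2) with 'g' at position 15.
toy = "gattacagattacagattaca"
wt_probe, mut_probe = probe_pair(toy, 15, "G", "A", flank=3)
print(wt_probe, wallace_tm(wt_probe))
print(mut_probe, wallace_tm(mut_probe))
print(reverse_complement(wt_probe))  # probe against the opposite strand
```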

[0070] In preferred embodiments of the probe-based methods for determining the IL4R genotype, multiple nucleic acid sequences from the IL4R gene which encompass the polymorphic sites are amplified and hybridized to a set of probes under sufficiently stringent hybridization conditions. The IL4R alleles present are inferred from the pattern of binding of the probes to the amplified target sequences. In this embodiment, amplification is carried out in order to provide sufficient nucleic acid for analysis by probe hybridization. Thus, primers are designed such that regions of the IL4R gene encompassing the polymorphic sites are amplified regardless of the allele present in the sample. Allele-independent amplification is achieved using primers which hybridize to conserved regions of the IL4R gene. The IL4R gene sequence is highly conserved and suitable allele-independent primers can be selected routinely from SEQ ID NO: 1. One of skill will recognize that, typically, experimental optimization of an amplification system is helpful.
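Inferring the alleles present from the pattern of probe binding can be sketched as a simple lookup over probe signals. The function name, the normalized signal scale, the cutoff value, and the example readings are all hypothetical; the SNP names are taken from Table 2 only to label the example.

```python
def genotypes_from_probe_signals(signals, cutoff=0.5):
    """Call a genotype at each SNP from wild-type and mutant probe signals.

    signals maps a SNP name to (wt_signal, mut_signal), each assumed to be
    normalized to the range 0-1. A probe counts as bound at or above cutoff.
    """
    calls = {}
    for snp, (wt_signal, mut_signal) in signals.items():
        wt_bound = wt_signal >= cutoff
        mut_bound = mut_signal >= cutoff
        if wt_bound and mut_bound:
            calls[snp] = "wt/mut"
        elif wt_bound:
            calls[snp] = "wt/wt"
        elif mut_bound:
            calls[snp] = "mut/mut"
        else:
            calls[snp] = "no call"
    return calls

# Invented readings for three SNPs, for illustration only.
example = {
    "A398G": (0.90, 0.80),
    "C676T": (0.95, 0.05),
    "T1682C": (0.10, 0.90),
}
print(genotypes_from_probe_signals(example))
```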

[0071] Suitable assay formats for detecting hybrids formed between probes and target nucleic acid sequences in a sample are known in the art and include the immobilized target (dot-blot) format and immobilized probe (reverse dot-blot or line-blot) assay formats. Dot blot and reverse dot blot assay formats are described in U.S. Pat. Nos. 5,310,893; 5,451,512; 5,468,613; and 5,604,099; each incorporated herein by reference.

[0072] In a dot-blot format, amplified target DNA is immobilized on a solid support, such as a nylon membrane. The membrane-target complex is incubated with labeled probe under suitable hybridization conditions, unhybridized probe is removed by washing under suitably stringent conditions, and the membrane is monitored for the presence of bound probe. A preferred dot-blot detection assay is described in the examples.

[0073] In the reverse dot-blot (or line-blot) format, the probes are immobilized on a solid support, such as a nylon membrane or a microtiter plate. The target DNA is labeled, typically during amplification by the incorporation of labeled primers. One or both of the primers can be labeled. The membrane-probe complex is incubated with the labeled amplified target DNA under suitable hybridization conditions, unhybridized target DNA is removed by washing under suitably stringent conditions, and the membrane is monitored for the presence of bound target DNA. A preferred reverse line-blot detection assay is described in the examples.

[0074] Probe-based genotyping can be carried out using a “TaqMan” or “5′-nuclease assay”, as described in U.S. Pat. Nos. 5,210,015; 5,487,972; and 5,804,375; and Holland et al., 1991, Proc. Natl. Acad. Sci. USA 88:7276-7280, each incorporated herein by reference. In the TaqMan assay, labeled detection probes that hybridize within the amplified region are added to the amplification reaction mixture. The probes are modified so as to prevent the probes from acting as primers for DNA synthesis. The amplification is carried out using a DNA polymerase that possesses 5′ to 3′ exonuclease activity, e.g., Tth DNA polymerase. During each synthesis step of the amplification, any probe which hybridizes to the target nucleic acid downstream from the primer being extended is degraded by the 5′ to 3′ exonuclease activity of the DNA polymerase. Thus, the synthesis of a new target strand also results in the degradation of a probe, and the accumulation of degradation product provides a measure of the synthesis of target sequences.

[0075] Any method suitable for detecting degradation product can be used in the TaqMan assay. In a preferred method, the detection probes are labeled with two fluorescent dyes, one of which is capable of quenching the fluorescence of the other dye. The dyes are attached to the probe, preferably one to the 5′ terminus and the other to an internal site, such that quenching occurs when the probe is in an unhybridized state and such that cleavage of the probe by the 5′ to 3′ exonuclease activity of the DNA polymerase occurs between the two dyes. Amplification results in cleavage of the probe between the dyes with a concomitant elimination of quenching and an increase in the fluorescence observable from the initially quenched dye. The accumulation of degradation product is monitored by measuring the increase in reaction fluorescence. U.S. Pat. Nos. 5,491,063 and 5,571,673, both incorporated herein by reference, describe alternative methods for detecting the degradation of probe which occurs concomitant with amplification.

[0076] The TaqMan assay can be used with allele-specific amplification primers such that the probe is used only to detect the presence of amplified product. Such an assay is carried out as described for the kinetic-PCR-based methods described above. Alternatively, the TaqMan assay can be used with a target-specific probe.

[0077] The assay formats described above typically utilize labeled oligonucleotides to facilitate detection of the hybrid duplexes. Oligonucleotides can be labeled by incorporating a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. Useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (as commonly used in ELISAs), biotin, or haptens and proteins for which antisera or monoclonal antibodies are available. Labeled oligonucleotides of the invention can be synthesized and labeled using the techniques described above for synthesizing oligonucleotides. For example, a dot-blot assay can be carried out using probes labeled with biotin, as described in Levenson and Chang, 1989, in PCR Protocols: A Guide to Methods and Applications (Innis et al., eds., Academic Press, San Diego), pages 99-112, incorporated herein by reference. Following hybridization of the immobilized target DNA with the biotinylated probes under sequence-specific conditions, probes which remain bound are detected by first binding the biotin to avidin-horseradish peroxidase (A-HRP) or streptavidin-horseradish peroxidase (SA-HRP), which is then detected by carrying out a reaction in which the HRP catalyzes a color change of a chromogen.

[0078] Whatever the method for determining which oligonucleotides of the invention selectively hybridize to IL4R allelic sequences in a sample, the central feature of the typing method involves the identification of the IL4R alleles present in the sample by detecting the variant sequences present.

[0079] The present invention also relates to kits, container units comprising useful components for practicing the present method. A useful kit can contain oligonucleotide probes specific for the IL4R alleles. In some cases, detection probes may be fixed to an appropriate support membrane. The kit can also contain amplification primers for amplifying a region of the IL4R locus encompassing the polymorphic site, as such primers are useful in the preferred embodiment of the invention. Alternatively, useful kits can contain a set of primers comprising an allele-specific primer for the specific amplification of IL4R alleles. Other optional components of the kits include additional reagents used in the genotyping methods as described herein. For example, a kit additionally can contain an agent to catalyze the synthesis of primer extension products, substrate nucleoside triphosphates, means for labeling and/or detecting nucleic acid (for example, an avidin-enzyme conjugate and enzyme substrate and chromogen if the label is biotin), appropriate buffers for amplification or hybridization reactions, and instructions for carrying out the present method.

[0080] The examples of the present invention presented below are provided only for illustrative purposes and not to limit the scope of the invention. Numerous embodiments of the invention within the scope of the claims that follow the examples will be apparent to those of ordinary skill in the art from reading the foregoing text and following examples.

7 EXAMPLE 1

Genotyping Protocol

Probe-Based Identification of IL4R Alleles

[0081] This example describes a genotyping method in which six regions of the IL4R gene that encompass eight polymorphic sites are amplified simultaneously and the nucleotide present at each of the eight sites is identified by probe hybridization. The probe detection is carried out using an immobilized probe (line blot) format.

[0082] Amplification Primers

[0083] Amplification of six regions of the IL4R gene, which encompass eight polymorphic sites, is carried out using the primer pairs shown below. All primers are shown in the 5′ to 3′ orientation.

[0084] The following primers amplify a 114 base-pair region encompassing codon 398.

RR192B  CAGCCCCTGTGTCTGCAGA  (SEQ ID NO: 25)
RR193B  GTCCAGTGTATAGTTATCCGCACTGA  (SEQ ID NO: 31)

[0085] The following primers amplify a 163 base-pair region encompassing codon 676.

DBM0177B  CTGACCTGGAGCAACCCGTA  (SEQ ID NO: 26)
DBM0178B  ACTGGGCCTCTGCTGGTCA  (SEQ ID NO: 32)

[0086] The following primers amplify a 228 base-pair region encompassing codons 1374, 1417, and 1466.

DBM0023B  ATTGTGTGAGGAGGAGGAGGAGGTA  (SEQ ID NO: 27)
DBM0022B  GTTGGGCATGTGAGCACTCGTA  (SEQ ID NO: 33)

[0087] The following primers amplify a 129 base-pair region encompassing codon 1682.

DBM0097B  CTCGTCATCGCAGGCAA  (SEQ ID NO: 28)
DBM0098B  AGGGCATCTCGGGTTCTA  (SEQ ID NO: 34)

[0088] The following primers amplify a 198 base-pair region encompassing codon 1902.

RR200B  GCCGAAATGTCCTCCAGCA  (SEQ ID NO: 29)
RR178B  CCACATTTCTCTGGGGACACA  (SEQ ID NO: 35)

[0089] The following primers amplify a 177 base-pair region encompassing codon 2531.

DBM0112B  CCGGCCTCCCTGGCA  (SEQ ID NO: 30)
DBM0071B  GCAGACTCAGCAACAAGAGG  (SEQ ID NO: 36)

[0090] To facilitate detection in the probe detection format described below, the primers are labeled with biotin attached to the 5′ phosphate. Reagents for synthesizing oligonucleotides with a biotin label attached to the 5′ phosphate are commercially available from Clontech (Palo Alto, Calif.) and Glen Research (Sterling, Va.). A preferred reagent is Biotin-ON from Clontech.

[0091] Amplification

[0092] The PCR amplification is carried out in a total reaction volume of 25-100 μl containing the following reagents:

[0093] 0.2 ng/μl purified human genomic DNA

[0094] 0.2 μM each primer

[0095] 800 μM total dNTP (200 μM each dATP, dTTP, dCTP, dGTP)

[0096] 70 mM KCl

[0097] 12 mM Tris-HCl, pH 8.3

[0098] 3 mM MgCl2

[0099] 0.25 units/μl AmpliTaq Gold™ DNA polymerase*

[0100] * developed and manufactured by Hoffmann-La Roche and commercially available from Applera (Foster City, Calif.).
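The reagent list above gives final concentrations, so setting up a reaction requires converting each one into a volume of stock solution. The following is a minimal sketch of that conversion using the dilution relation C1·V1 = C2·V2. The function name `stock_volume_ul` and the stock concentrations used in the usage lines (25 mM MgCl2, 5 units/μl polymerase) are assumptions chosen for illustration and are not stated in this example.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_ul):
    """Volume of stock (in microliters) that brings a reaction of
    reaction_ul microliters to final_conc.

    final_conc and stock_conc must be expressed in the same unit
    (for example both in mM, or both in units per microliter).
    """
    if stock_conc <= 0 or reaction_ul <= 0:
        raise ValueError("stock concentration and reaction volume must be positive")
    if final_conc > stock_conc:
        raise ValueError("stock is more dilute than the requested final concentration")
    # C1 * V1 = C2 * V2, solved for the stock volume V1
    return final_conc * reaction_ul / stock_conc


# Hypothetical stocks for a 50 microliter reaction.
mgcl2_ul = stock_volume_ul(3.0, 25.0, 50.0)       # 3 mM final from an assumed 25 mM stock
polymerase_ul = stock_volume_ul(0.25, 5.0, 50.0)  # 0.25 U/ul final from an assumed 5 U/ul stock
```

The same function applies to every reagent in the list, provided the stock and final concentrations are entered in matching units.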

[0101] Amplification is carried out in a GeneAmp® PCR System 9600 thermal cycler (Applera, Foster City, Calif.), using the specific temperature cycling profile shown below.

Pre-reaction incubation: 94° C. for 12.5 minutes
33 cycles:
  denature: 95° C. for 45 seconds
  anneal: 61° C. for 30 seconds
  extend: 72° C. for 45 seconds
Final extension: 72° C. for 7 minutes
Hold: 10° C.-15° C.

[0102] Detection Probes

[0103] Preferred probes used to identify the nucleotides present at the 8 SNPs present in the amplified IL4R nucleic acids are described in Table 3. The probes are shown in the 5′ to 3′ orientation. Two probes are shown for the detection of T1466; a mixture of the two probes is used.

[0104] Probe Hybridization Assay, Immobilized Probe Format

[0105] In the immobilized probe format, the probes are immobilized to a solid support prior to being used in the hybridization. The probe-support complex is immersed in a solution containing denatured amplified nucleic acid (biotin labeled) to allow hybridization to occur. Unbound nucleic acid is removed by washing under stringent hybridization conditions, and nucleic acid remaining bound to the immobilized probes is detected using a chromogenic reaction. The details of the assay are described below.

[0106] For use in the immobilized probe detection format, described below, a moiety is attached to the 5′ phosphate of the probe to facilitate immobilization on a solid support. Preferably, Bovine Serum Albumin (BSA) is attached to the 5′ phosphate essentially as described by Tung et al., 1991, Bioconjugate Chem. 2:464-465, incorporated herein by reference. Alternatively, a poly-T tail is added to the 5′ end as described in U.S. Pat. No. 5,451,512, incorporated herein by reference.

[0107] The probes are applied in a linear format to sheets of nylon membrane (e.g., BioDyne B nylon filters, Pall Corp., Glen Cove, N.Y.) using a Linear Striper and Multispense2000™ controller (IVEK, N. Springfield, Vt.). Probe titers are chosen to achieve signal balance between the allelic variants; the titers used are provided in the table of probes, above. Each sheet is cut to strips between 0.35 and 0.5 cm in width. To denature the amplification products, 20 μl of amplification product (based on a 50 μl reaction) are added to 20 μl of denaturation solution (1.6% NaOH) and incubated at room temperature for 10 minutes to complete denaturation.

[0108] The denatured amplification product (40 μl) is added to the well of a typing tray containing 3 ml of hybridization buffer (4×SSPE, 0.5% SDS) and the membrane strip. Hybridization is allowed to proceed for 15 minutes at 55° C. in a rotating water bath. Following hybridization, the hybridization solution is aspirated, the strip is rinsed in 3 ml warm wash buffer (2×SSPE, 0.5% SDS) by gently rocking the strips back and forth, and the wash buffer is aspirated. Following rinsing, the strips are incubated in 3 ml enzyme conjugate solution (3.3 ml hybridization buffer and 12 μl of streptavidin-horseradish peroxidase (SA-HRP)) in the rotating water bath for 5 minutes at 55° C. Then the strips are rinsed with wash buffer, as above, incubated in wash buffer at 55° C. for 12 minutes (stringent wash), and finally rinsed with wash buffer again.

[0109] Target nucleic acid, now HRP-labeled, which remains bound to the immobilized probes is visualized as follows. A color development solution is prepared by mixing 100 ml of citrate buffer (0.1 M sodium citrate, pH 5.0), 5 ml 3,3′,5,5′-tetramethylbenzidine (TMB) solution (2 mg/ml TMB powder from Fluka, Milwaukee, Wis., dissolved in 100% EtOH), and 100 μl of 3% hydrogen peroxide. The strips first are rinsed in 0.1 M sodium citrate (pH 5.0) for 5 minutes, then incubated in the color development solution with gentle agitation for 8 to 10 minutes at room temperature in the dark. The TMB, initially colorless, is converted by the target-bound HRP, in the presence of hydrogen peroxide, into a colored precipitate. The developed strips are rinsed in water for several minutes and immediately photographed.

8 EXAMPLE 2

Association with Type 1 Diabetes

[0110] IL4R genotyping was carried out on individuals from 282 Caucasian families ascertained because they contained two offspring affected with type 1 diabetes. The IL4R genotypes of all individuals were determined. IL4R genotyping was carried out using a genotyping method essentially as described in Example 1. In addition to the 564 offspring (2 sibs in each of 282 families) in the affected sib pairs on which ascertainment was based, there were 26 other affected children. There were 270 unaffected offspring among these families.

[0111] The family-based samples were provided as purified genomic DNA from the Human Biological Data Interchange (HBDI), which is a repository for cell lines from families affected with type 1 diabetes. All of the HBDI families used in this study are nuclear families with unaffected parents (genetically unrelated) and at least two affected siblings. These samples are described further in Noble et al., 1996, Am. J. Hum. Genet. 59:1134-1148, incorporated herein by reference.

[0112] It is known that the HLA genotype can have a significant effect, either increased or decreased depending on the genotype, on the risk for type 1 diabetes. In particular, individuals with the HLA DR genotype DR3-DQB1*0201/DR4-DQB1*0302 (referred to as DR3/DR4 below) appear to be at the highest risk for type 1 diabetes (see Noble et al., 1996, Am. J. Hum. Genet. 59:1134-1148, incorporated herein by reference). These high-risk individuals have about a 1 in 15 chance of being affected with type 1 diabetes. Because of the strong effect of this genotype on the likelihood of type 1 diabetes, the presence of the DR3-DQB1*0201/DR4-DQB1*0302 genotype could mask the contribution from the IL4R allelic variants.

[0113] Individuals within these families also were genotyped at the HLA DRB1 and DQB1 loci. Of the affected sib pairs, both sibs have the DR3/DR4 genotype in 90 families. Neither affected sib has the DR3/DR4 genotype in 144 families. Exactly one of the affected pair has the DR3/DR4 genotype in the remaining 48 families.

9 EXAMPLE 3

Association with Type I Diabetes in Philippine Samples

[0114] Subjects

[0115] Samples from 183 individuals from the Philippines were genotyped using the reverse lineblot method essentially as described in Example 1. Among the 183 individuals, 89 individuals have type I diabetes and 94 are matched controls.

[0116] (Sample 91IDDM not typed)

[0117] Results

[0118] The genotypes of the affected and nonaffected individuals are shown in Table 4 (SEQ ID NO: 20-24). Both the actual numbers and the frequencies are provided for each genotype. The data (Table 5) confirm the presence of an association of IL4R SNP variants with type I diabetes.

10 EXAMPLE 4

Methods of Genotyping

[0119] Eight exemplary SNPs in the human IL4R gene are listed in Table 6. Each SNP is described by its position in the reference GenBank accession sequence X52425.1 (SEQ ID NO: 1). For example, SNP 1 is found at position 398 of X52425.1 (SEQ ID NO: 1), where an “A” nucleotide is present. The variant allele at this position has a “G” nucleotide. The SNPs will be referred to by the SNP # in the subsequent text.

[0120] The regions of the IL4R gene that encompass the SNPs are amplified and the nucleotide present identified by probe hybridization. The probe detection is carried out using an immobilized probe (line blot) format, to be described.

[0121] Amplicons and Primers

[0122] The pairs of primers used to amplify the regions encompassing the eight SNPs are listed in Table 7 (SEQ ID NO: 25-36). SNP numbers 3, 4, and 5 are co-amplified on the same 228 basepair fragment. The primers are modified at the 5′ phosphate by conjugation with biotin. Reagents for synthesizing oligonucleotides with a biotin label attached to the 5′ phosphate are commercially available from Clontech (Palo Alto, Calif.) and Glen Research (Sterling, Va.). A preferred reagent is Biotin-ON from Clontech.

[0123] Amplification Conditions

[0124] The six amplicons are amplified together in a single PCR reaction in a total reaction volume of 25-100 μl containing the following reagents:

[0125] 0.2 ng/μl purified human genomic DNA

[0126] 0.2 μM each primer

[0127] 800 μM total dNTP (200 μM each dATP, dTTP, dCTP, dGTP)

[0128] 70 mM KCl

[0129] 12 mM Tris-HCl, pH 8.3

[0130] 3 mM MgCl2

[0131] 0.25 units/μl AmpliTaq Gold™ DNA polymerase*

[0132] *developed and manufactured by Hoffmann-La Roche and commercially available from PE Biosystems (Foster City, Calif.).

[0133] Amplification is carried out in a GeneAmp® PCR System 9600 thermal cycler (PE Biosystems, Foster City, Calif.), using the specific temperature cycling profile shown below:

Pre-reaction incubation: 94° C. for 12.5 minutes
33 cycles:
  Denature: 95° C., 45 seconds
  Anneal: 61° C., 30 seconds
  Extend: 72° C., 45 seconds
Final Extension: 72° C., 7 minutes
Hold: 10° C.-15° C.

[0134] Hybridization Probes and Conditions

[0135] The probes are immobilized to a solid support prior to being used in the hybridization. The probe-support complex is immersed in a solution containing denatured amplified nucleic acid to allow hybridization to occur. Unbound nucleic acid is removed by washing under sequence-specific hybridization conditions, and nucleic acid remaining bound to the immobilized probes is detected. The detection is carried out using the chromogenic substrate TMB.

[0136] For use in the immobilized probe detection format, described below, a moiety is attached to the 5′ phosphate of the probe to facilitate immobilization on a solid support. Preferably, Bovine Serum Albumin (BSA) is attached to the 5′ phosphate essentially as described by Tung et al., 1991, Bioconjugate Chem. 2:464-465, incorporated herein by reference. Alternatively, a poly-T tail is added to the 3′ end as described in U.S. Pat. No. 5,451,512 incorporated herein by reference.

[0137] The probes are applied in a linear format to sheets of nylon membrane using a Linear Striper and Multispense2000™ controller (IVEK, N. Springfield, Vt.). The allele-specific probes and their titers are shown in Table 8. The detection of the wildtype allele of SNP #5 is carried out using a mixture of two probes as listed; this mixture enables the detection of SNP #5 indiscriminately of another nearby SNP (not relevant to this report). The probe titers listed are chosen to achieve signal balance between the allelic variants. Following probe application, each nylon sheet is cut widthwise into strips between 0.35 and 0.55 cm wide.

[0138] To denature the amplification products, 20 μl of amplification product is added to 20 μl of denaturation solution (1.6% NaOH) and incubated at room temperature. The denatured amplification product (40 μl) is added to the well of a typing tray containing 3 ml of hybridization buffer (3×SSPE, 0.5% SDS) and the membrane strip. Hybridization is allowed to proceed for 15 minutes at 55° C. in a rotating water bath. Following hybridization, the hybridization solution is aspirated, the strip is rinsed in 3 ml warm wash buffer (1.5×SSPE, 0.5% SDS) by gently rocking the strips back and forth, and the wash buffer is aspirated. Following rinsing, the strips are incubated in 3 ml enzyme conjugate solution (3.3 ml hybridization buffer and 12 μl of streptavidin-horseradish peroxidase (SA-HRP)) in the rotating water bath for 5 minutes at 55° C. Then the strips are rinsed with wash buffer, as above, incubated in wash buffer at 55° C. for 12 minutes (stringent wash), and finally rinsed with wash buffer again.

[0139] Target nucleic acids, now HRP-labeled, which remain bound to the immobilized probes are visualized as follows. The strips are rinsed in 0.1 M sodium citrate (pH 5.0) for 5 minutes at room temperature, then incubated in the color development solution with gentle agitation for 8-10 minutes at room temperature in the dark. The color development solution is prepared by mixing 100 ml of citrate buffer (0.1 M sodium citrate, pH 5.0), 5 ml 3,3′,5,5′-tetramethylbenzidine (TMB) solution (2 mg/ml TMB powder from Fluka (Milwaukee, Wis.) dissolved in 100% EtOH), and 100 μl of 3% hydrogen peroxide. The TMB, initially colorless, is converted by the target-bound HRP in the presence of hydrogen peroxide into a colored precipitate. The developed strips are rinsed in water for several minutes and immediately photographed.

11 EXAMPLE 5

Association with Type I Diabetes in HBDI Subjects

[0140] Subjects

[0141] IL4R genotyping was carried out on individuals from 282 Caucasian families ascertained because they contained two offspring affected with type I diabetes. The IL4R genotypes of all individuals were determined. IL4R genotyping was carried out using the reverse line-blot method described above. In addition to the 564 offspring (two sibs in each of 282 families) in the affected sib pairs on which ascertainment was based, there were 26 other affected children. There were 270 unaffected offspring among these families.

[0142] The family-based samples were provided as purified genomic DNA from the Human Biological Data Interchange (HBDI), which is a repository for cell lines from families affected with type I diabetes. All of the HBDI families used in this study are nuclear families with unaffected parents and at least two affected siblings. These samples are described further in Noble et al., 1996, Am. J. Hum. Genet. 59:1134-1148, incorporated herein by reference.

[0143] Statistical Analysis, Methods and Algorithms

[0144] Since the eight SNPs in IL4R are both physically and genetically very closely linked to each other, the presence of a particular allele at a particular SNP is correlated with the presence of another particular allele at a nearby SNP. This non-random association of two or more SNPs′ alleles is known as linkage disequilibrium (LD).

[0145] Linkage disequilibrium among the eight IL4R SNPs was assessed using the genotypes of the 282 pairs of parents. These 564 individuals are not related to each other except by marriage. A summary of the calculated frequency of the WT allele for each SNP in this group of 564 individuals (the “HBDI founders”) is shown in Table 9.

[0146] The calculation of LD can be performed in several ways. We used two complementary methods to assess LD between all pairs of IL4R SNP loci. In the first method, we calculated the values of two distinct but related metrics for LD, namely Δ and D′ (Devlin and Risch 1995), using the Maximum Likelihood Estimation algorithm of Hill (Hill 1974). The values for Δ and D′ for all pairs of IL4R SNPs are shown in Table 10, in the lower left triangular portion. Both Δ and D′ can have values that range between −1 and +1. Values near +1 or −1 suggest strong linkage disequilibrium; values near zero indicate the absence of LD.
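As an illustration of how normalized pairwise LD measures of this kind are computed, the following sketch derives the raw disequilibrium coefficient, Lewontin's normalized coefficient and the correlation coefficient for two biallelic loci. It assumes that haplotype and allele frequencies are already known, which is a simplification: the analysis described here estimated them from genotype data by maximum likelihood. The function name `pairwise_ld` and the example frequencies are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D_prime, r) for two biallelic loci.

    p_ab: frequency of haplotypes carrying allele A at locus 1 and allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    # Largest magnitude D can reach given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # Correlation coefficient between the two loci
    denom = (p_a * (1 - p_a) * p_b * (1 - p_b)) ** 0.5
    r = d / denom if denom > 0 else 0.0
    return d, d_prime, r


# Hypothetical frequencies: the two alleles always travel together.
d, d_prime, r = pairwise_ld(p_ab=0.5, p_a=0.5, p_b=0.5)
```

Both normalized values lie between −1 and +1 and equal zero when the two loci are independent.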

[0147] A second measure of LD uses a permutation test method implemented in the Arlequin program (L. Excoffier, University of Geneva, CH) (Excoffier and Slatkin 1995; Slatkin and Excoffier 1996). This method maximizes the likelihood ratio statistic (S=−2log (LH*/LH)) by permuting alleles and recalculating S over a large number of iterations until S is maximized. These iterations allow the determination of the null distribution of S, and thus the maximum S obtained can be converted into an exact P-value (significance level). These P-values are listed in the upper right triangular portion of Table 10.

[0148] Table 10 of pairwise LD shows that there is significant evidence for LD between SNPs 1 and 2, and among (all combinations of) SNPs 3, 4, 5, 6, 7 and 8. SNPs 3 through 8 are known to exist within 1200 basepairs of each other in a single exon (exon 9) of the IL4R gene, and the LD between these SNPs is evidence for very small genetic distances as well.

[0149] The Transmission Disequilibrium Test (TDT) of Spielman (Spielman and Ewens 1996; Spielman and Ewens 1998) was performed on the IL4R genotype data for the 282 affected sib pairs (viz., a family structure consisting of the two parents and the two affected children). The TDT was used to test for the association of the individual alleles of the eight IL4R SNPs with type I diabetes. The TDT assesses whether an allele is transmitted from heterozygous parents to their affected children at a frequency that is significantly different from that expected by chance. Under the null hypothesis of no association of an allele with disease, a heterozygous parent will transmit or will not transmit an allele with equal frequency to an affected child. The significance of deviation from the null hypothesis can be assessed using the McNemar chi-squared test statistic, χ² = (T−NT)²/(T+NT), where T is the observed number of transmissions and NT is the observed number of non-transmissions. The significance (P-value) of the McNemar chi-squared test statistic is obtained from the chi-squared distribution with one degree of freedom (Glantz 1997).
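The test statistic described above can be sketched as follows. The function name `tdt_statistic` and the transmission counts in the usage line are hypothetical and are not taken from the tables. The P-value uses the identity that, for one degree of freedom, the upper tail of the chi-squared distribution equals erfc(sqrt(x/2)).

```python
import math


def tdt_statistic(t, nt):
    """McNemar chi-squared statistic and P-value for the TDT.

    t:  number of transmissions of the allele from heterozygous parents
    nt: number of non-transmissions of the allele from heterozygous parents
    """
    if t + nt == 0:
        raise ValueError("no informative transmissions")
    chi2 = (t - nt) ** 2 / (t + nt)
    # Upper-tail probability of the chi-squared distribution, 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 60 transmissions and 40 non-transmissions.
chi2, p_value = tdt_statistic(60, 40)
```

With equal numbers of transmissions and non-transmissions the statistic is zero and the P-value is 1, matching the null hypothesis of equal transmission frequency.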

[0150] The results of the single-SNP-locus TDT are shown in Tables 11A and 11B. The TDT/S-TDT program (version 1.1) of Spielman was used to perform the counting of transmitted and non-transmitted alleles (Spielman, McGinnis et al. 1993; Spielman and Ewens 1998). The tables list the observed transmissions of the wildtype allele at each SNP locus. Since these are biallelic polymorphisms, the transmission counts of the variant allele are equal to the non-transmissions of the wildtype allele.

[0151] The counts of transmissions and non-transmissions of alleles to the probands only, shown in Table 11A, do not quite reach statistical significance at α=0.05. It is also valid to count transmission events to all affected children; however, when the TDT is used in this way (or, for that matter, with more than one child per family), a significant test statistic is evidence of linkage only, not of association and linkage. Table 11B shows the TDT analysis when 26 additional affected children are included. The results presented in Table 11B show that there is a significant deviation from the expected transmission frequencies for alleles of SNPs 3, 4, 5 and 6. Inspection of the “% transmission” values for these SNPs indicates that the wildtype allele is transmitted to affected children at frequencies greater than the expectation of 50%.

[0152] The evidence for strong LD among the eight IL4R SNPs suggested to us that we could detect the transmission of the ordered set of alleles from each parent to each affected child in the HBDI cohort. This ordered set of alleles corresponds physically to one of the two parental chromosomes, and is called a haplotype. By inferring the parental haplotypes and their transmission or non-transmission to affected children, we expect to obtain much more statistical information than from alleles alone.

[0153] We inferred IL4R haplotypes using a combination of two methods. As the first step, we used the GeneHunter program (Falling Rain Genomics, Palo Alto, Calif.) (Kruglyak, Daly et al. 1996), as it very rapidly calculates haplotypes from genotype data from pedigrees. We then inspected each HBDI family pedigree individually using the Cyrillic program (Cherwell Scientific Publishing, Palo Alto, Calif.), to resolve any ambiguous or unsupported haplotype assignments. Unambiguous and non-recombinant haplotypes could be confidently assigned in all but six of the 282 families. The haplotype data for these 276 families were used in subsequent data analysis.

[0154] The IL4R gene has the property that many of the SNPs reside within the 3′-most exon (exon 9), whose coding region is approximately 1.5 kb long. We have exploited this to develop a method for directly haplotyping up to five of these exon 9 alleles (viz., SNPs #3-7) without needing parental genotypes. As many of these SNPs direct changes to the amino acid sequence of the IL4R protein, different haplotypes encode different proteins with likely different functions.

[0155] Haplotypes in an individual for whom no parental genotypic information is known can be inferred unambiguously only when at most one of the SNP sites is heterozygous. In other cases, the ambiguity must be resolved experimentally.
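The rule stated above can be expressed as a short check on an unphased genotype. This is a sketch only: the representation of a genotype as a list of allele pairs, the allele codes, and the function names are assumptions made for illustration. The second function adds the standard count of phase configurations, 2^(k−1) for k heterozygous sites, to show why more than one heterozygous site creates ambiguity.

```python
def phase_is_unambiguous(genotype):
    """True when the two haplotypes can be read directly from the genotype.

    genotype: list of (allele_1, allele_2) pairs, one pair per SNP site.
    """
    heterozygous_sites = sum(1 for a, b in genotype if a != b)
    return heterozygous_sites <= 1


def possible_haplotype_pairs(genotype):
    """Number of distinct haplotype pairs consistent with the genotype."""
    heterozygous_sites = sum(1 for a, b in genotype if a != b)
    if heterozygous_sites <= 1:
        return 1
    return 2 ** (heterozygous_sites - 1)


# Hypothetical genotype over five sites with two heterozygous sites.
example = [(1, 1), (1, 2), (2, 2), (1, 2), (1, 1)]
needs_experiment = not phase_is_unambiguous(example)
```

A genotype for which `phase_is_unambiguous` returns False is a candidate for experimental resolution of the phase.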

[0156] We use two allele-specific primers with one common primer to perform PCR reactions (using Stoffel Gold™ polymerase) to separately amplify the DNA from each chromosome, as shown in FIG. 1 below. The alleles on each amplicon are then detected by the same strip hybridization procedure, and the linked alleles called directly. The choice of allele-specific (colored or shaded arrows) and common (black arrows) primers depends on which SNP loci are heterozygous. The primers are modified at the 5′ phosphate by conjugation with biotin, and are shown in Table 12 (SEQ ID NO: 54-62).

[0157] For each haplotyping assay, two PCR reactions are set up for each DNA to be tested. One reaction contains the common primer and the wildtype allele-specific primer; the other contains the common primer and the variant allele-specific primer. Each PCR reaction is made in a total reaction volume of 50-100 μl containing the following reagents:

0.2 ng/μl purified human genomic DNA

[0158] 0.2 μM each primer

[0159] 800 μM total dNTP (200 μM each dATP, dTTP, dCTP, dGTP)

[0160] 10 mM KCl

[0161] 10 mM Tris-HCl, pH 8.0

[0162] 2.5 mM MgCl2

[0163] 0.12 units/μl Stoffel Gold™ DNA polymerase*

[0164] *developed and manufactured by Roche Molecular Systems but not commercially available.

[0165] Amplification is carried out in a GeneAmp® PCR System 9600 thermal cycler (PE Biosystems, Foster City, Calif.), using the specific temperature cycling profile shown below:

Pre-reaction incubation: 94° C. for 12.5 minutes
33 cycles:
  Denature: 95° C., 45 seconds
  Anneal: 64° C., 30 seconds
  Extend: 72° C., 45 seconds
Final Extension: 72° C., 7 minutes
Hold: 10° C.-15° C.

[0166] Following amplification, each PCR reaction product is denatured and separately used for hybridization to the membrane-bound probes as described above.

[0167] Haplotype Sharing in Affected Sibs

[0168] Evidence for linkage of IL4R to type 1 diabetes (as opposed to association) can be assessed by the haplotype sharing method. This method assesses the distribution over all families of the number of chromosomes that are identical-by-descent (IBD) between the two affected siblings in each family. For example, if in a family, the father transmits the same one of his two IL4R haplotypes to both children, and the mother transmits the same one of her two IL4R haplotypes to both children, then the children are said to share two chromosomes IBD (or, to be IBD=2). If both parents transmit different IL4R haplotypes to their two children, the children are said to be IBD=0.
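The IBD count defined above can be sketched as a comparison of which parental haplotype each sib received. The labels ("F1", "F2" for the father's two haplotypes, "M1", "M2" for the mother's) and the function name `ibd_count` are hypothetical. The sketch also assumes that each parent's two haplotypes can be told apart, which is not the case for a homozygous parent.

```python
def ibd_count(sib1, sib2):
    """Number of parental chromosomes shared identical-by-descent (0, 1 or 2).

    sib1, sib2: (paternal_haplotype_label, maternal_haplotype_label) for each
    affected sib, naming which of the parent's two haplotypes was transmitted.
    """
    shared_paternal = int(sib1[0] == sib2[0])
    shared_maternal = int(sib1[1] == sib2[1])
    return shared_paternal + shared_maternal


# Hypothetical family: same paternal haplotype, different maternal haplotypes.
ibd = ibd_count(("F1", "M1"), ("F1", "M2"))
```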

[0169] Under the null hypothesis of no linkage of IL4R to type 1 diabetes, the proportion of families IBD=0 is 25%, IBD=1 is 50% and IBD=2 is 25%, as expected by random assortment (see Table 13). Evidence for a statistically significant difference from this expectation can be assessed using the chi-square statistic.
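A goodness-of-fit comparison of this kind can be sketched as follows. The text does not state the degrees of freedom used, so the sketch assumes the usual two degrees of freedom for three categories with fixed expected proportions, for which the upper tail of the chi-squared distribution is exp(−x/2). The function name `ibd_chi_square` and the family counts in the usage line are hypothetical and are not the values of Table 13.

```python
import math


def ibd_chi_square(n0, n1, n2):
    """Chi-squared test of observed IBD counts against 25% / 50% / 25%.

    n0, n1, n2: number of families with IBD=0, IBD=1 and IBD=2.
    Returns (chi2, p_value), assuming 2 degrees of freedom.
    """
    observed = (n0, n1, n2)
    total = sum(observed)
    if total == 0:
        raise ValueError("no families supplied")
    expected = (0.25 * total, 0.50 * total, 0.25 * total)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of the chi-squared distribution, 2 degrees of freedom
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value


# Hypothetical counts for 100 families.
chi2, p_value = ibd_chi_square(20, 50, 30)
```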

[0170] Identity-by-descent (IBD) values of parental IL4R haplotypes in the affected sibs could be determined unambiguously in 256 families. In the rest of the families, one or both parents were homozygous and/or the parental source of the child's chromosomes could not be determined. The distribution of IBD is shown in Table 13.

[0171] It is known that the HLA genotype can have a significant effect, either increased or decreased depending on the genotype, on the risk for type 1 diabetes. In particular, individuals with the HLA DR genotype DR3-DQB1*0201/DR4-DQB1*0302 (referred to as DR3/4 below) appear to be at the highest risk for type 1 diabetes (see Noble, Valdes et al., 1996, incorporated herein by reference). These high-risk individuals have about a 1 in 15 chance of being affected with type 1 diabetes. Because of the strong effect of this genotype on the likelihood of type 1 diabetes, the presence of the DR3/4 genotype could mask the contribution of IL4R alleles or haplotypes.

[0172] The distribution of IBD in families was stratified into two groups based on the DR3/4 genotype of the children. The first group contains the families in which one or both of the sibs are DR3/4 (“Either/both sib DR3/4”, n=119). The second group contains the families where neither child is DR3/4 (“Neither sib DR3/4”, n=137). The IBD distribution in these subgroups is shown in Table 13. There was no statistically significant departure from the expected distribution of IBD sharing in the “either/both sib DR3/4” subgroup of families. There is a statistically significant departure from the expected distribution of IBD sharing in the “neither sib DR3/4” subgroup of families (Table 13). This indicates that there is evidence for linkage of the IL4R loci to IDDM in the “neither sib DR3/4” families.

[0173] Association by AFBAC

[0174] Association of IL4R haplotypes with type 1 diabetes was assessed using the AFBAC (Affected Family Based Control) method (Thomson 1995). In essence, the haplotype frequencies in two groups of haplotypes are compared with each other, as in a case/control sampling scheme. These two groups are the case (transmitted) and the control (AFBAC) haplotypes.

[0175] The case haplotypes, namely those transmitted to the affected children, are collected and counted as follows. For every pair of siblings, regardless of the status of the parents (homozygote or heterozygote), all four transmitted chromosomes are counted. However, the haplotypes in the two siblings in a pair are not independent of each other. To make the enumeration statistically conservative and valid, all counts are divided by two.

[0176] The control (AFBAC) haplotypes are those that are never transmitted to the affected pair of children (Thomson 1995). The AFBAC haplotypes permit an unbiased estimate of control haplotype frequencies. AFBACs can only be determined from heterozygous parents, and furthermore, only when the parent transmits one haplotype to both children; the other, never-transmitted haplotype is counted in the AFBAC population. The AFBAC population serves as a well-matched set of control haplotypes for the study.
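The counting rules for transmitted and AFBAC haplotypes described above can be sketched as follows. This is a minimal illustration under stated assumptions: the data layout (each parent given as a pair of haplotypes plus the haplotype passed to each of the two affected sibs), the function name, and the haplotype labels "H1" to "H4" are hypothetical, and each never-transmitted haplotype is assumed to contribute a count of one to the AFBAC pool.

```python
from collections import Counter

def count_transmitted_and_afbac(families):
    """Count case (transmitted) and control (AFBAC) haplotypes.

    families: list of families; each family is a tuple of two parents;
    each parent is ((hap_a, hap_b), (to_sib1, to_sib2)).
    """
    transmitted = Counter()
    afbac = Counter()
    for family in families:
        for (hap_a, hap_b), (to_sib1, to_sib2) in family:
            # All four transmitted chromosomes per sib pair are counted,
            # each weighted by 1/2 because the two sibs are not independent.
            transmitted[to_sib1] += 0.5
            transmitted[to_sib2] += 0.5
            # An AFBAC exists only for a heterozygous parent who passed
            # the same haplotype to both affected children.
            if hap_a != hap_b and to_sib1 == to_sib2:
                never_transmitted = hap_b if to_sib1 == hap_a else hap_a
                afbac[never_transmitted] += 1
    return transmitted, afbac

# Hypothetical families for illustration only.
families = [
    ((("H1", "H2"), ("H1", "H1")), (("H3", "H3"), ("H3", "H3"))),
    ((("H1", "H2"), ("H1", "H2")), (("H2", "H4"), ("H4", "H4"))),
]
transmitted, afbac = count_transmitted_and_afbac(families)
print(dict(transmitted))
print(dict(afbac))
```

With this weighting each family contributes two transmitted haplotypes in total, while homozygous parents and parents who transmit different haplotypes to the two sibs contribute nothing to the AFBAC pool.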

[0177] Table 14A shows the comparison of transmitted and AFBAC frequencies for all HBDI haplotypes that were observed at least five times in the complete sample set. Each row represents data on an individual haplotype. In all, 16 distinct haplotypes were observed in the HBDI data set, some of them very rarely. The seven rarest haplotypes are grouped together in the “others” row. Each haplotype is listed by the allele present at each of the eight IL4R SNPs.

[0178] Tables 14B and 14C show the comparison of transmitted and AFBAC frequencies for all HBDI haplotypes seen in the “either/both sib DR3/4” and the “neither sib DR3/4” subgroups of families, respectively. These tables show that stratifying the families based on the DR3/4 genotype of the children permits the identification of haplotypes that are associated with IDDM. In particular, in the “neither sib DR3/4” subgroup one haplotype (labeled “2 1 2 2 2 2 2 1”) is significantly underrepresented in the pool of transmitted chromosomes (P<0.005).

[0179] From the transmitted and AFBAC haplotype frequency information in Tables 14B and 14C, one can derive, by counting, the frequencies of the transmitted and AFBAC alleles at each SNP. The locus-by-locus AFBAC analyses are shown in Tables 15A and 15B.
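Deriving per-SNP allele counts from haplotype counts amounts to summing each haplotype's count into the allele it carries at every position. A minimal sketch follows, assuming haplotypes are written as space-separated allele labels (as in “2 1 2 2 2 2 2 1”) and that the positions correspond to the SNPs in order; the function name, the other two haplotype labels, and all counts are hypothetical.

```python
from collections import Counter

def allele_counts_by_locus(haplotype_counts):
    """Collapse haplotype counts into per-locus allele counts.

    haplotype_counts: dict mapping a haplotype label such as
    "2 1 2 2 2 2 2 1" to its count (transmitted or AFBAC).
    Returns {locus_number: Counter({allele: count})}, loci numbered from 1.
    """
    per_locus = {}
    for haplotype, count in haplotype_counts.items():
        for locus, allele in enumerate(haplotype.split(), start=1):
            per_locus.setdefault(locus, Counter())[allele] += count
    return per_locus

# Hypothetical counts for illustration only (NOT the data of Tables 14B/14C).
example = {
    "1 1 1 1 1 1 1 1": 10,
    "2 1 2 2 2 2 2 1": 4,
    "1 2 1 1 1 1 1 2": 6,
}
for locus, counts in allele_counts_by_locus(example).items():
    print(locus, dict(counts))
```

Running the same function once on the transmitted counts and once on the AFBAC counts gives the two allele tables that a locus-by-locus comparison would use.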

[0180] The data presented in Tables 15A and 15B show that there is statistically significant evidence, in the “neither sib DR3/4” subgroup of families, that alleles of SNPs #3, #4, #5, #6, and #7 are associated with IDDM. The evidence for association is especially strong for SNP #6. In the “either/both sib DR3/4” subgroup, there is the same trend of allelic association, although the trend does not quite reach statistical significance.

[0181] Association by Haplotype-Based TDT

[0182] The TDT analysis can be utilized for determining the transmission (or non-transmission) of 8-locus haplotypes from parents to affected children, once the haplotypes have been inferred or assigned by molecular means. Tables 16A, B, and C summarize the TDT results for the HBDI families. Table 16A counts informative transmission events only to one child (the proband) per family, Table 16B counts informative transmissions to the two primary affected children per family, and Table 16C counts informative transmissions to all affected children. The 8-locus haplotype TDT results reach statistical significance when all affected children (2 or more per family) are included.
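For a single haplotype, the TDT compares how often heterozygous parents transmit it versus do not transmit it to affected children. The sketch below assumes the usual McNemar-type statistic of Spielman et al. (1993), (b - c)^2 / (b + c) with one degree of freedom; the function name and the counts are hypothetical and are not taken from Tables 16A to 16C.

```python
import math

def tdt_statistic(transmitted, not_transmitted):
    """Transmission/disequilibrium test for one haplotype.

    transmitted / not_transmitted: counts of informative transmissions
    from parents heterozygous for the haplotype.
    Returns (chi2, p_value), assuming 1 degree of freedom.
    """
    informative = transmitted + not_transmitted
    if informative == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - not_transmitted) ** 2 / informative
    # For 1 degree of freedom the chi-square survival function is
    # erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts for illustration only: a haplotype transmitted
# 10 times and not transmitted 30 times (under-transmission).
chi2, p = tdt_statistic(10, 30)
print(f"chi-square = {chi2:.2f}, p = {p:.5f}")
```

When more than one affected child per family is counted, the transmissions within a family are not independent, so the choice among the three counting schemes affects how the statistic should be interpreted.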

[0183] The TDT analyses can also be performed on families after stratifying for the DR3/4 genotype of the children. The counts of informative transmissions to the two primary affected children per family, in the “either/both sib DR3/4” and the “neither sib DR3/4” subgroups of families, are summarized in Tables 17A and 17B, respectively. As presented above, there is significant evidence of linkage of IDDM to IL4R in the “neither sib DR3/4” subgroup. The data in Table 17B indicate that there is significant evidence of association of IL4R haplotypes with IDDM, in the presence of this linkage. In particular, in the “neither sib DR3/4” subgroup one haplotype (labeled “2 1 2 2 2 2 2 1”) is significantly under-transmitted to affected children.

12. EXAMPLE 6

Association with Type 1 Diabetes in Philippine Samples

[0184] Samples from 183 individuals from the Philippines were genotyped using the reverse lineblot method. Of these, 89 individuals had type 1 diabetes and 94 were matched controls.

[0185] Genotyping Methods

[0186] These subjects were genotyped by the same methods as described above for the HBDI samples. Molecular haplotyping of IL4R SNPs was also performed as described above.

[0187] Statistical Methods & Algorithms

[0188] Allele and haplotype frequencies between groups were compared using the z-test. Haplotype compositions and frequencies were estimated from the genotype data using the Arlequin program (L. Excoffier, University of Geneva, CH) (Excoffier and Slatkin 1995, Slatkin and Excoffier 1996).
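A z-test comparing an allele (or haplotype) frequency between two groups can be sketched as below. The source does not state which form of the z-test was used, so this sketch assumes the common pooled-variance, two-sided version without continuity correction; the function name and the allele counts in the first example are hypothetical. The chromosome totals 178 and 188 follow from 89 diabetics and 94 controls, assuming every individual was fully genotyped.

```python
import math

def two_proportion_z_test(count_a, n_a, count_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    count_a / n_a: allele count and chromosome total in group A.
    count_b / n_b: allele count and chromosome total in group B.
    Returns (z, p_value). Uses the pooled standard error (an assumption).
    """
    p_a = count_a / n_a
    p_b = count_b / n_b
    pooled = (count_a + count_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if se == 0.0:
        raise ValueError("allele is fixed or absent in both groups")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical allele counts for illustration only (NOT the data of
# Table 18): 120 of 178 diabetic chromosomes vs 150 of 188 controls.
z, p = two_proportion_z_test(120, 178, 150, 188)
print(f"z = {z:.3f}, p = {p:.4f}")
```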

[0189] Results

[0190] The wildtype allele frequencies for each of the eight IL4R SNPs in the Filipino control and diabetic groups are shown in Table 18. Table 18 provides evidence that the allele frequencies for SNPs #3 and #4 are significantly different between the two groups, and suggests an association with IDDM.

[0191] It is also possible to infer and construct the multi-locus IL4R haplotypes in the Filipino subjects, either computationally by Maximum-likelihood estimation (MLE), or by using molecular haplotyping methods described previously. Table 19 lists the five most frequent computationally estimated haplotypes and their frequencies in the Filipino diabetics and controls, and presents the significance of the differences in frequencies.
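Computational haplotype inference of this kind is typically done with an Expectation-Maximization (EM) procedure, as described by Excoffier and Slatkin (1995). The sketch below is a deliberately reduced illustration for two biallelic SNPs, where only the double heterozygote has ambiguous phase; it is not the Arlequin implementation, it does not handle the multi-locus case, and the function name, genotype coding, and example data are hypothetical.

```python
def em_two_locus_haplotypes(genotypes, iterations=50):
    """Estimate two-locus haplotype frequencies from unphased genotypes.

    genotypes: list of (g1, g2), the number of copies (0, 1, or 2) of
    allele "2" at SNP A and SNP B for each individual.
    Returns a dict of frequencies for haplotypes "11", "12", "21", "22".
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    alleles = {0: ("1", "1"), 1: ("1", "2"), 2: ("2", "2")}
    haplotypes = ["11", "12", "21", "22"]
    freq = {h: 0.25 for h in haplotypes}  # uniform starting point
    n_chromosomes = 2 * len(genotypes)

    for _ in range(iterations):
        counts = {h: 0.0 for h in haplotypes}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: a double heterozygote is either 11/22 or 12/21;
                # split it in proportion to the current frequencies.
                cis = freq["11"] * freq["22"]
                trans = freq["12"] * freq["21"]
                total = cis + trans
                w = cis / total if total > 0 else 0.5
                counts["11"] += w
                counts["22"] += w
                counts["12"] += 1.0 - w
                counts["21"] += 1.0 - w
            else:
                # Phase is unambiguous when at most one SNP is heterozygous.
                a, b = alleles[g1], alleles[g2]
                counts[a[0] + b[0]] += 1.0
                counts[a[1] + b[1]] += 1.0
        # M-step: new frequencies are the expected haplotype proportions.
        freq = {h: counts[h] / n_chromosomes for h in haplotypes}
    return freq

# Hypothetical genotypes for illustration only.
sample = [(0, 0), (0, 0), (2, 2), (2, 2), (1, 1)]
print(em_two_locus_haplotypes(sample))
```

A fixed iteration count keeps the sketch short; a production implementation would normally stop on a convergence criterion and try several starting points.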

[0192] Table 20 lists the observed haplotypes as derived and inferred by molecular haplotyping; the unambiguous seven-locus haplotypes (SNP #1 allele not shown, as indicated by the “x”) are compiled. Tables 19 and 20 both provide evidence of a statistically significant difference in the frequency of one or more haplotypes between the Filipino control and diabetic populations, and support the presence of an association of IL4R with IDDM. In particular, the haplotype labeled “x 1 2 2 2 2 2 1” is significantly underrepresented in the Filipino diabetics group.

[0193] Citations

[0194] Devlin, B. and N. Risch (1995). “A comparison of linkage disequilibrium measures for fine-scale mapping.” Genomics 29(2): 311-22.

[0195] Excoffier, L. and M. Slatkin (1995). “Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population.” Mol Biol Evol 12(5): 921-7.

[0196] Glantz, S. A. (1997). Primer of biostatistics. New York, McGraw-Hill Health Professions Division.

[0197] Hill, W. G. (1974). “Estimation of linkage disequilibrium in randomly mating populations.” Heredity 33(2): 229-39.

[0198] Kruglyak, L., M. J. Daly, et al. (1996). “Parametric and nonparametric linkage analysis: a unified multipoint approach.” Am J Hum Genet 58(6): 1347-63.

[0199] Noble, J. A., A. M. Valdes, et al. (1996). “The role of HLA class II genes in insulin-dependent diabetes mellitus: molecular analysis of 180 Caucasian, multiplex families.” Am J Hum Genet 59(5): 1134-48.

[0200] Slatkin, M. and L. Excoffier (1996). “Testing for linkage disequilibrium in genotypic data using the Expectation-Maximization algorithm.” Heredity 76(Pt 4): 377-83.

[0201] Spielman, R. S. and W. J. Ewens (1996). “The TDT and other family-based tests for linkage disequilibrium and association.” Am J Hum Genet 59(5): 983-9.

[0202] Spielman, R. S. and W. J. Ewens (1998). “A sibship test for linkage in the presence of association: the sib transmission/disequilibrium test.” Am J Hum Genet 62(2): 450-8.

[0203] Spielman, R. S., R. E. McGinnis, et al. (1993). “Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM).” Am J Hum Genet 52(3): 506-16.

[0204] Thomson, G. (1995). “Mapping disease genes: family-based association studies.” Am J Hum Genet 57(2): 487-98.

[0205] Various embodiments of the invention have been described. The descriptions and examples are intended to be illustrative of the invention and not limiting. Indeed, it will be apparent to those of skill in the art that modifications may be made to the various embodiments of the invention described without departing from the spirit of the invention or scope of the appended claims set forth below.

[0206] All references cited herein are hereby incorporated by reference in their entireties.
