Marker Gene for Arthrorheumatism Test

Information

  • Patent Application
  • 20080113346
  • Publication Number
    20080113346
  • Date Filed
    March 29, 2005
  • Date Published
    May 15, 2008
Abstract
It is intended to identify rheumatoid arthritis susceptibility genes by a highly efficient, low-cost mapping method using microsatellites. In the present invention, novel rheumatoid arthritis susceptibility genes, that is, TNXB, NOTCH4, RAB6A, MPRL48, UCP2, and UCP3 genes, in the human genomic DNA sequence were identified by conducting case-control association analysis on rheumatoid arthritis by use of microsatellite polymorphic markers assigned at approximately 100-kb intervals to narrow down candidate regions and then conducting association analysis and linkage analysis with SNP as a marker.
Description
TECHNICAL FIELD

The present invention relates to rheumatoid arthritis susceptibility genes identified de novo by a gene mapping method using microsatellite polymorphic markers, and to use thereof.


BACKGROUND ART

Arthrorheumatism (rheumatoid arthritis: RA) is a chronic inflammatory disease characterized by autoimmunity. RA, which exhibits progressive inflammation with synovial cell overproliferation in joints, is pathologically classified as a joint tissue disease. The prevalence of RA is high, reaching approximately 1% of the population across various races. The familial aggregation and monozygotic twin concordance rates of RA have previously been reported to be relatively high, suggesting the presence of a hereditary factor in its pathogenesis. Indeed, it is known that, in the family of a proband with RA, closer relatives of the proband have a higher risk of recurrence. According to previous reports, the relative risk of the disease in siblings of the proband (λs) falls within the range of 2 to 10.


Among the RA susceptibility genes previously found, the HLA-DRB1 locus in the HLA class II region on 6p21.3 has been thought to contribute most strongly to RA and is estimated to account for 30 to 50% of the total genetic risk. Conversely, this also suggests the presence of other, undiscovered genes whose genetic contribution is as strong as that of HLA-DRB1. Some of these other genes have been considered to reside in the HLA region and to be in linkage with HLA-DRB1. Many researchers have continued to conduct studies to identify these other genes by various approaches, including genomewide linkage analysis such as sib-pair analysis (Non-Patent Documents 1 to 3) and genetic association analysis such as case-control analysis (case-control study) on candidate genes or chromosome regions (Non-Patent Documents 4 to 6). However, these studies fell short of identifying all RA susceptibility genes and of fully explaining the mechanisms of its onset.


An approach that examines the association between bases exhibiting single nucleotide polymorphisms (SNPs) in the human genomic DNA sequence and disease has received attention as a method for identifying novel disease-related genes and the like. However, SNPs are derived from one-nucleotide substitutions on the genome and therefore generally have only two alleles. In this approach, since only those SNPs present within approximately 5 kb of the disease-related gene to be mapped exhibit association, genome mapping with SNPs as polymorphic markers requires assigning an enormous number of SNPs as markers for analysis. Under present circumstances, this approach is therefore applied only to a limited region that has already been narrowed down to some extent. A microsatellite polymorphic marker, on the other hand, has many alleles and exhibits association even at some distance from the gene to be mapped. However, microsatellite polymorphic markers present a problem of their own: assigning too many markers makes analysis difficult in terms of time and labor, as with SNPs, while assigning too few makes the marker spacing too large, so that a disease-related gene might be overlooked.


The present inventors have developed a gene mapping method using microsatellite polymorphic markers assigned at approximately 50-kb to 150-kb intervals on average and have found that a region where a disease-related gene or gene relating to human phenotypes with genetic factors is present can be identified at high efficiency and low cost by using the method (Patent Document 1).


Non-Patent Document 1: Cornelis, F. et al., Proc. Natl. Acad. Sci. USA, 95, 10746 (1998)


Non-Patent Document 2: Shiozawa, S. et al., Int. Immunol., 10, 1891 (1998)


Non-Patent Document 3: Jawaheer, D. et al., Am. J. Hum. Genet., 68, 927 (2001)


Non-Patent Document 4: Okamoto, K., et al., Am. J. Hum. Genet., 72, 303 (2003)


Non-Patent Document 5: Suzuki, A. et al., Nat. Genet., 34, 395 (2003)


Non-Patent Document 6: Tokuhiro, S. et al., Nat. Genet., 35, 341 (2003)


Patent Document 1: International Publication of WO01/79482


DISCLOSURE OF THE INVENTION

Accordingly, an object of the present invention is to identify novel RA susceptibility genes by applying, for the first time to the multifactorial disorder RA, a precise mapping method using microsatellite markers that is capable of comprehensively identifying disease susceptibility genes at higher cost efficiency than conventional SNP association analysis. A further object of the present invention is to eventually develop effective prevention and treatment of RA by collecting data on RA pathogenesis or onset mechanisms on the basis of information on the identified RA susceptibility genes, or on the RA-related proteins that are the expression products of those genes, and by performing appropriate screening.


In the present invention, a gene mapping method using microsatellites (hereinafter referred to as “MS”) was used to identify novel RA susceptibility genes whose association with RA had not previously been known.


The RA susceptibility genes identified de novo by the present invention are TNXB and NOTCH4 genes (chromosome 6) as well as RAB6A, MPRL48, FLJ11848, UCP2, and UCP3 genes (chromosome 11) in the human genomic DNA sequence. The present inventors conducted the association analysis of RA with SNPs present in the genomic DNA sequences of these de novo-identified genes, and found statistically significant association for the first time.


Thus, in the first aspect, the present invention provides a marker gene for arthrorheumatism test consisting of a consecutive partial DNA sequence comprising at least one base exhibiting single nucleotide polymorphism present in a TNXB, NOTCH4, RAB6A, MPRL48, UCP2 or UCP3 gene in the human genomic DNA sequence, or of a complementary strand of the partial DNA sequence.


In the second aspect, the present invention provides a test method and test kit for RA using the marker gene.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a diagram showing the positions where MS markers used in Example of the present application are mapped on chromosomes;



FIG. 2 is a diagram showing the mapping and P-values of MS markers used in a first-phase screening. The P-values of 133 MS markers exhibiting significance are indicated by circles (∘);



FIG. 3 is a diagram showing the positions where MS and SNP markers selected in Example of the present application are mapped on chromosomes, blocks predicted by EM and Clark algorithms, and P-values for allele frequency;



FIG. 4 is a diagram showing the distribution of tissue expression of RA susceptibility genes identified by the present invention, and so on;



FIG. 5-1 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-2 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-3 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-4 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-5 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-6 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-7 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-8 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-9 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-10 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-11 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-12 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-13 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-14 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-15 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-16 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-17 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-18 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-19 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-20 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-21 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-22 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-23 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-24 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-25 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-26 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-27 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-28 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-29 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-30 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-31 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-32 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-33 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-34 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-35 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-36 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-37 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-38 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-39 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-40 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention;



FIG. 5-41 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention; and



FIG. 5-42 is a list showing information on the designations (left in each column) and Genbank registration numbers (right in each column) of microsatellite markers and primers used in the present invention.





BEST MODE FOR CARRYING OUT THE INVENTION

The gene mapping method used in the present invention is the method described in Patent Document 1. Specifically, this method comprises: using forward and reverse primers corresponding to each of the consecutive DNA sequences comprising MS polymorphic markers assigned at given intervals, preferably approximately 100-kb intervals, on the human genome to amplify the DNA sequence samples by polymerase chain reaction (PCR); performing electrophoresis on a high-resolution gel system such as a DNA sequencer; and measuring and analyzing the microsatellite polymorphic marker-containing DNA sequence fragments, which are the amplification products.


MS polymorphic markers giving false-positive results can be decreased drastically without forced correction by adopting multi-phased screening, which involves performing a first (first-phase) screening using forward and reverse primers corresponding to MS polymorphic markers assigned genomewide and then performing a second (second-phase) screening, in a different sample population, on the MS polymorphic markers that were positive in the first screening.


The position of a target gene is narrowed down by the multi-phased screening using MS. Candidate regions or gene loci can then be determined in further detail by another gene mapping method; analysis using SNPs is effective for this purpose. Specifically, the polymorphism frequencies of SNPs in the candidate regions that appear to contain the target gene are compared between populations of patients and normal individuals, for example by association analysis, and SNP markers in linkage disequilibrium can be identified by haplotype analysis and linkage disequilibrium analysis.
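The comparison of SNP allele frequencies between patients and normal individuals can be illustrated with a simple allelic chi-square test on a 2×2 table. This is a minimal sketch and not necessarily the statistic used by the inventors; the function name and the allele counts are hypothetical values chosen only for illustration.

```python
import math

def allele_chi_square(case_counts, control_counts):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele table.

    case_counts and control_counts are (allele_1, allele_2) counts.
    Returns (chi2, p_value).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical allele counts (not data from this application).
chi2, p_value = allele_chi_square((120, 80), (90, 110))
print(f"chi2={chi2:.3f}, P={p_value:.4f}")
```

A small P-value indicates that the allele frequencies differ between the two populations more than would be expected by chance.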


To identify RA susceptibility genes, the present invention adopted a previously reported pooled DNA method as a screening method with good cost efficiency using 27,158 MS markers including 20,755 newly established loci. The genome association analysis was conducted by a three-phased screening method involving three major steps as described above: (1) three-phased genomic screening for reducing a type I error rate; (2) the confirmation of association of pools by individual genotyping on positive MS loci; and (3) identification by detailed individual genotyping on SNP markers in the neighborhoods of candidate regions in screened and additional populations.
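As a rough illustration of why multi-phased screening reduces the type I error rate, the sketch below computes the expected number of false-positive markers surviving each phase. It assumes that each phase uses an independent sample population and a per-phase significance threshold of 0.05; that threshold is a hypothetical value chosen for illustration and is not stated in this section.

```python
def expected_false_positives(n_markers, alpha, n_phases):
    """Expected number of non-associated markers still positive after each phase.

    Assumes independent phases, each passing a null marker with probability alpha.
    """
    remaining = float(n_markers)
    counts = []
    for _ in range(n_phases):
        remaining *= alpha
        counts.append(remaining)
    return counts

# 27,158 markers is the figure given in the text; alpha = 0.05 is an assumption.
for phase, count in enumerate(expected_false_positives(27158, 0.05, 3), start=1):
    print(f"phase {phase}: about {count:.1f} expected false positives")
```

Under these assumptions the expected number of chance positives falls by a factor of 20 with each phase, without any multiple-testing correction being applied.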


The association analysis of the whole genome demonstrated the strongest association for the HLA-DRB1 gene, which has previously been known to be associated with RA (P=9.7×10^−20). Furthermore, strong association was observed, independently of HLA-DRB1, in the NOTCH4 (P=1.1×10^−11) and TNXB (P=7.6×10^−7) genes on chromosome 6, which also carries HLA-DRB1. Moreover, novel association was found in a mitochondria-related gene cluster on 11q13.4 containing mitochondrial ribosomal protein L48 (MPRL48) and two mitochondrial proteins called uncoupling proteins (UCP2 and UCP3). Weak association was seen on 10p13 and 14q23.1. In addition to these novel associations, association was confirmed in the IkBL (Non-Patent Document 4) and PADI4 (Non-Patent Document 5) genes, which, like HLA-DRB1, have already been reported to be associated with RA.


Namely, a statistically significant difference in the allele frequencies of SNPs present in the TNXB, NOTCH4, RAB6A, MPRL48, UCP2, and UCP3 genes, which were found de novo to have association, was observed between RA patients and normal individuals. Thus, a consecutive partial DNA sequence comprising at least one base exhibiting single nucleotide polymorphism present in any of these gene regions, or a complementary strand of the partial DNA sequence, can be utilized as a marker gene for arthrorheumatism test.


Specifically, it is preferred that the base exhibiting single nucleotide polymorphism should be selected from the group consisting of:


the 61st base in SEQ ID NO: 1 or a corresponding base on a complementary strand thereof;


the 61st base in SEQ ID NO: 2 or a corresponding base on a complementary strand thereof;


the 61st base in SEQ ID NO: 3 or a corresponding base on a complementary strand thereof;


the 61st base in SEQ ID NO: 4 or a corresponding base on a complementary strand thereof;


the 401st base in SEQ ID NO: 5 or a corresponding base on a complementary strand thereof;


the 495th base in SEQ ID NO: 6 or a corresponding base on a complementary strand thereof;


the 61st base in SEQ ID NO: 7 or a corresponding base on a complementary strand thereof;


the 61st base in SEQ ID NO: 8 or a corresponding base on a complementary strand thereof;


the 61st base in SEQ ID NO: 9 or a corresponding base on a complementary strand thereof;


the 61st base in SEQ ID NO: 10 or a corresponding base on a complementary strand thereof;


the 401st base in SEQ ID NO: 11 or a corresponding base on a complementary strand thereof;


the 401st base in SEQ ID NO: 12 or a corresponding base on a complementary strand thereof;


the 401st base in SEQ ID NO: 13 or a corresponding base on a complementary strand thereof;


the 503rd base in SEQ ID NO: 14 or a corresponding base on a complementary strand thereof;


the 201st base in SEQ ID NO: 15 or a corresponding base on a complementary strand thereof;


the 511th base in SEQ ID NO: 16 or a corresponding base on a complementary strand thereof;


the 201st base in SEQ ID NO: 17 or a corresponding base on a complementary strand thereof;


the 51st base in SEQ ID NO: 18 or a corresponding base on a complementary strand thereof;


the 61st base in SEQ ID NO: 19 or a corresponding base on a complementary strand thereof;


the 497th base in SEQ ID NO: 20 or a corresponding base on a complementary strand thereof;


the 201st base in SEQ ID NO: 21 or a corresponding base on a complementary strand thereof; and


the 201st base in SEQ ID NO: 22 or a corresponding base on a complementary strand thereof.


SEQ ID NOs: 1 to 5 represent partial sequences of the TNXB gene, SEQ ID NOs: 6 to 13 represent partial sequences of the NOTCH4 gene, SEQ ID NO: 14 represents a partial sequence of the RAB6A gene, SEQ ID NOs: 15 to 18 represent partial sequences of the MPRL48 gene, SEQ ID NOs: 19 and 20 represent partial sequences of the FLJ11848 gene, SEQ ID NO: 21 represents a partial sequence of the UCP2 gene, and SEQ ID NO: 22 represents a partial sequence of UCP3.
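The relationship between a polymorphic base at a given position and the "corresponding base on a complementary strand" can be sketched as follows. The toy sequence, the position, and the function names are hypothetical; the actual sequences are those of SEQ ID NOs: 1 to 22, which are not reproduced here.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the complementary strand written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def base_at(seq, position):
    """Return the base at a 1-based position (e.g. 61 for 'the 61st base')."""
    return seq[position - 1]

def corresponding_position(seq_length, position):
    """1-based position of the same site on the complementary strand (5' to 3')."""
    return seq_length - position + 1

toy = "GATTACAGG"   # hypothetical sequence, not a SEQ ID NO
pos = 3             # hypothetical polymorphic position
comp = reverse_complement(toy)
print(base_at(toy, pos), base_at(comp, corresponding_position(len(toy), pos)))
```

The two printed bases are complementary to each other, which is what allows either strand to serve as the marker.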


These marker genes can be used in genetic testing on RA.


For example, the consecutive DNA sequence comprising the base exhibiting single nucleotide polymorphism is amplified, for example by PCR, using forward and reverse primers positioned to keep the base exhibiting single nucleotide polymorphism in between them. Nucleotide sequences of the obtained DNA fragments can be determined and compared with a determined corresponding nucleotide sequence from a normal individual to thereby test the presence or absence of a genetic factor for RA.


The forward primer used in the test is a primer having the same nucleotide sequence as a sequence extending in the 3′-end direction from the 5′ end of the DNA sequence of the marker gene containing the base exhibiting single nucleotide polymorphism, which has been mapped on the human genome, and includes those of 15 to 100 bases, preferably 15 to 25 bases, more preferably 18 to 22 bases, in length. The reverse primer is a primer having a nucleotide sequence complementary to a sequence extending in the 5′-end direction from the 3′ end of the DNA sequence of the marker gene, and those of 15 to 100 bases, preferably 15 to 25 bases, more preferably 18 to 22 bases, in length can be used as the reverse primer.
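The primer definitions above can be sketched in code: the forward primer copies the 5′ end of the marker sequence, and the reverse primer is the reverse complement of its 3′ end. This is a minimal sketch; the function names and the toy sequence are hypothetical, and practical primer design would also weigh factors such as melting temperature, which are not covered here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def derive_primers(marker_seq, length=20):
    """Derive (forward, reverse) primers from the two ends of a marker sequence."""
    if not 15 <= length <= 100:
        raise ValueError("primer length must be 15 to 100 bases")
    if len(marker_seq) < 2 * length:
        raise ValueError("marker sequence too short for non-overlapping primers")
    forward = marker_seq[:length].upper()               # same as the 5' end
    reverse = reverse_complement(marker_seq[-length:])  # complementary to the 3' end
    return forward, reverse

# Hypothetical 40-base sequence, not one of SEQ ID NOs: 1 to 22.
toy_marker = "ATGCGTACCTGAAGTCCATGGACTTAGCCGATAGCTTAAC"
forward, reverse = derive_primers(toy_marker, length=18)
print(forward, reverse)
```

Because the primers are taken from the two ends, any polymorphic base lying between them is included in the amplified fragment.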


Examples of primers for amplifying the marker genes having the DNA sequences of SEQ ID NOs: 1 to 22 include those having DNA sequences represented by SEQ ID NOs: 23 to 66. The relationship of their correspondence is as follows:

















Marker gene       Forward primer    Reverse primer
SEQ ID NO: 1      SEQ ID NO: 23     SEQ ID NO: 24
SEQ ID NO: 2      SEQ ID NO: 25     SEQ ID NO: 26
SEQ ID NO: 3      SEQ ID NO: 27     SEQ ID NO: 28
SEQ ID NO: 4      SEQ ID NO: 29     SEQ ID NO: 30
SEQ ID NO: 5      SEQ ID NO: 31     SEQ ID NO: 32
SEQ ID NO: 6      SEQ ID NO: 33     SEQ ID NO: 34
SEQ ID NO: 7      SEQ ID NO: 35     SEQ ID NO: 36
SEQ ID NO: 8      SEQ ID NO: 37     SEQ ID NO: 38
SEQ ID NO: 9      SEQ ID NO: 39     SEQ ID NO: 40
SEQ ID NO: 10     SEQ ID NO: 41     SEQ ID NO: 42
SEQ ID NO: 11     SEQ ID NO: 43     SEQ ID NO: 44
SEQ ID NO: 12     SEQ ID NO: 45     SEQ ID NO: 46
SEQ ID NO: 13     SEQ ID NO: 47     SEQ ID NO: 48
SEQ ID NO: 14     SEQ ID NO: 49     SEQ ID NO: 50
SEQ ID NO: 15     SEQ ID NO: 51     SEQ ID NO: 52
SEQ ID NO: 16     SEQ ID NO: 53     SEQ ID NO: 54
SEQ ID NO: 17     SEQ ID NO: 55     SEQ ID NO: 56
SEQ ID NO: 18     SEQ ID NO: 57     SEQ ID NO: 58
SEQ ID NO: 19     SEQ ID NO: 59     SEQ ID NO: 60
SEQ ID NO: 20     SEQ ID NO: 61     SEQ ID NO: 62
SEQ ID NO: 21     SEQ ID NO: 63     SEQ ID NO: 64
SEQ ID NO: 22     SEQ ID NO: 65     SEQ ID NO: 66










Alternatively, the presence or absence of a hereditary factor for RA can also be examined by using the marker genes of the present invention as probes to screen a DNA sample from a test subject, determining the nucleotide sequence of the DNA obtained from the test subject, and then comparing that sequence with a sequence from a normal individual.


In this context, the probe used may be the marker gene of the present invention itself, the consecutive DNA sequence comprising the base exhibiting single nucleotide polymorphism present in the marker gene, a complementary strand thereof, or a sequence that hybridizes with any of them. A probe of 15 to 100 bases, preferably 15 to 25 bases, more preferably 18 to 22 bases, in length can be used.


On the other hand, coding regions encoded by the RA susceptibility genes can be determined by determining the full-length nucleotide sequences of the RA susceptibility genes on the basis of TNXB, NOTCH4, RAB6A, MPRL48, UCP2 and UCP3 genes found de novo to have association. As a result, amino acid sequences of proteins encoded by the genes can be identified. Since proteins with these amino acid sequences are highly likely to participate in RA pathogenesis or onset mechanisms, RA can be prevented or treated by promoting or inhibiting the functions of these proteins.


Thus, the present invention also relates to a screening method using the proteins. Substances promoting or inhibiting the functions of the proteins, that is, agonists or antagonists, can be identified by the screening method. The term antagonist as used herein encompasses not only small chemical molecules but also biologically relevant substances such as antibodies, antibody fragments, and antisense oligonucleotides. These agonists or antagonists are effective as diagnostic, preventive, and/or therapeutic drugs for RA.


The protein described above may be produced in a transformed cell obtained by preparing a vector comprising a DNA sequence containing at least the coding region of any of the marker genes identified by the present invention, and then transforming the vector into an appropriate host cell.


EXAMPLE

Microsatellite (MS) Detection and PCR Primer Design:


MS sequences with 2-, 3-, 4-, 5-, or 6-base repeat units were detected with the Apollo program applicable to Sputnik in four versions of the human genome draft sequences, from Golden Path October 2000 to NCBI build 30. PCR primers for amplifying these repeats under single reaction conditions were automatically designed with the Discover program applicable to Primer Express. To prevent differential amplification, these PCR primers were designed to contain no SNP in their sequences (Sham et al., 2002).
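The idea of scanning a sequence for tandem repeats with 2- to 6-base units can be sketched with a regular expression. This is only an illustration and does not reproduce the Apollo or Sputnik programs named above; the function name, the minimum repeat count of 5, and the toy sequence are assumptions.

```python
import re

def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, unit, repeat_count) for perfect tandem repeats in seq.

    start is a 0-based offset. The shortest repeating unit is preferred.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip single-base runs such as AAAAAAAAAA
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

# Hypothetical sequence: a (CA)6 repeat and an (AGAT)5 repeat with flanking bases.
toy = "GGT" + "CA" * 6 + "TTG" + "AGAT" * 5 + "ACC"
print(find_microsatellites(toy))
```

Only perfect repeats are reported here; real detection tools typically also tolerate interrupted repeats, which this sketch does not attempt.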


A pattern with a number of peaks exhibiting the polymorphisms of MS markers in a pool of Japanese individuals (Barcellos, L. F. et al., Am. J. Hum. Genet., 61, 734 (1997)) was compared with that of European pools. As a result, individual polymorphic MS markers in the Japanese pool exhibited a pattern different from that of the two European pools (data not shown). This comparison between the races showed that the multi-peak pattern in the Japanese pool reflects polymorphism in MS length and is not experimental error.


In the present invention, 27,158 polymorphic MS markers were assigned and mapped on the human draft sequence (NCBI build 30) (FIG. 1). Among these markers, 20,755 were assigned de novo by the present inventors, while the remaining 6,403 were known markers such as Genethon and CHLC markers. The average heterozygosity and average allele number of the 27,039 markers, excluding the 119 markers mapped on the Y chromosome, were 0.67±0.16 and 6.4±3.1, respectively. Their average marker spacing was 108.1 kb (SD=64.5 kb; max=930.1 kb) (see Table 1). At a rough estimate, these markers can detect linkage disequilibrium up to approximately 50 kb from a disease locus. Accordingly, these markers were used to conduct case-control association analysis on RA. These 27,039 microsatellites and the primer sequences used for their amplification were deposited in Genbank under the registration numbers listed in FIG. 5.
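Heterozygosity and allele number for a single marker can be computed from allele counts as sketched below. The sketch assumes that heterozygosity means expected heterozygosity, that is, one minus the sum of squared allele frequencies; the text does not state which definition was used, and the allele labels and counts are hypothetical.

```python
def expected_heterozygosity(allele_counts):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    total = sum(allele_counts.values())
    return 1.0 - sum((count / total) ** 2 for count in allele_counts.values())

# Hypothetical marker with three alleles labeled by fragment length in bases.
counts = {"180": 50, "182": 30, "184": 20}
print(len(counts), round(expected_heterozygosity(counts), 2))
```

Averaging these two quantities over all markers would give summary figures of the kind quoted above.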









TABLE 1

Supplementary Table 1. Microsatellite marker spacing

                  Spacing (kb)
Chromosome    Average    SD      Max.
1             104.7      64.5    581.7
2             103.6      61.2    521.4
3             103.0      67.0    766.7
4             113.0      68.4    522.4
5             107.2      61.9    461.8
6             108.6      58.3    428.5
7             100.2      62.4    634.6
8             108.3      68.6    587.3
9             106.4      65.5    930.1
10            106.6      62.6    510.2
11            104.8      64.6    463.0
12            106.8      66.6    674.2
13            108.3      57.4    356.6
14            115.9      52.6    350.2
15            119.1      67.4    625.9
16            114.1      77.3    526.0
17            112.1      70.9    546.4
18            106.5      63.6    563.7
19            110.8      69.9    421.8
20            108.6      56.9    337.0
21            108.2      58.0    378.7
22            120.3      68.1    505.7
X             123.5      65.5    443.5
Total         108.1      64.5    930.1

Microsatellite markers were mapped on the NCBI build 30.
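Spacing statistics of the kind shown in Table 1 can be derived from marker coordinates along a chromosome, as in the following sketch. The marker positions are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which form of SD was reported.

```python
import statistics

def spacing_stats(positions_bp):
    """Return (mean, SD, max) of the gaps between adjacent markers, in kb."""
    ordered = sorted(positions_bp)
    gaps_kb = [(b - a) / 1000.0 for a, b in zip(ordered, ordered[1:])]
    return statistics.mean(gaps_kb), statistics.stdev(gaps_kb), max(gaps_kb)

# Hypothetical marker positions in base pairs on one chromosome.
mean_kb, sd_kb, max_kb = spacing_stats([0, 100_000, 250_000, 300_000])
print(mean_kb, sd_kb, max_kb)
```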






In the present invention, 940 test subjects with RA (case population) and the same number of normal test subjects (control population) were enrolled. With the permission of the ethical committee of each organization associated with the present invention, informed consent was obtained from each test subject in the case and control populations used in this analysis. RA phenotypes were determined according to the American Rheumatism Association diagnostic criteria for RA. All personal data associated with medical information and blood samples were carefully discarded at the organization that collected them.


The average age at disease onset in the case population was 47.7±13.1 years, with a sex ratio of 1:4 (male:female). The average age and sex ratio of the case and control populations were matched as closely as possible. The sexes of all samples involved were confirmed by amelogenin (enamel protein) genotyping (Akane, A., et al., Forensic Sci. Int., 49, 81 (1991)). A preliminary PCR test for checking DNA levels was conducted by PCR direct sequencing as previously reported (Voorter, C. E. et al., Tissue Antigens, 49, 471 (1997)), and HLA-DRB1 genotypes were examined at the same time.


DNA Sample Preparation and Typing:

DNA was extracted with QIAamp DNA blood kit (QIAGEN) from the sample of each test subject in the populations under standardized conditions for preventing variations in DNA level. Subsequently, to check DNA degradation and RNA contamination, 0.8% agarose gel electrophoresis was performed. After optical density measurement for checking protein contamination, the DNA concentration was determined by three measurements using PicoGreen fluorescence assay (Molecular Probes) as previously described (Collins, H. E. et al., Hum. Genet., 106, 218 (2000)). The standardized pipetting and dispensation of the DNA samples were performed with robots such as Biomek 2000 and Multimek 96 (Beckman).


The pooled DNA template for typing two groups of approximately 30,000 MS markers was prepared simultaneously with or immediately after the DNA quantification. The pooled DNA level was further tested by comparing allelic distribution between individuals and pooled typing results using three MS markers. After this test, approximately 30,000 PCR reaction mixtures containing all the components except for the PCR primers were prepared and subsequently dispensed to 96-well PCR reaction plates, followed by storage until use.


After PCR reaction, pooled MS typing and individual genotyping were conducted according to standard protocols using ABI3700 DNA analyzer (Applied Biosystems). The pooled DNA typing could maintain constant accuracy throughout the experiment by using the standardized preparation method. Various data such as peak positions and heights were automatically read by the PickPeak and MultiPeaks programs developed by Applied Biosystems Japan, from the multipeak pattern in the chromatograph files, that is, ABI fsa files.


Three-Phased Genome Screening by Pooled DNA Method:

A population of 375 individuals with RA (case) and the same number of unaffected individuals (control) was divided equally into three pairs of case and control populations (125 individuals each). A population stratification test was conducted according to Pritchard's method (Pritchard, J. K. and Rosenberg, N. A., Am. J. Hum. Genet., 65, 220 (1999)) using 22 randomly selected microsatellites, a number at least sufficient for detecting stratification. The results showed the absence of any significant stratification in either the case or the control populations (Table 2). Preventing false association by means of the population stratification test is very important for late-onset diseases such as RA (Risch2000), for which the collection of internal controls is difficult.









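TABLE 2

Supplementary Table 2. Stratification test among case and control populations

Fisher's exact P values for the 2x2 and 2xm tests, with the number of alleles (# of alleles) per marker, in the 1st screening (n = 125), 2nd screening (n = 125), 3rd screening (n = 125), additional samples (n = 565), and total (n = 940).

| Chr. | Markers | 1st 2x2 | 1st # of alleles | 1st 2xm | 2nd 2x2 | 2nd # of alleles | 2nd 2xm | 3rd 2x2 | 3rd # of alleles | 3rd 2xm | Additional 2x2 | Additional # of alleles | Additional 2xm | Total 2x2 | Total # of alleles | Total 2xm |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | D1S0368i | 0.296 | 6 | 0.780 | 0.113 | 5 | 0.181 | 0.248 | 7 | 0.524 | 0.331 | 8 | 0.939 | 0.093 | 9 | 0.627 |
| 2 | D2S1336 | 0.171 | 8 | 0.555 | 0.070 | 8 | 0.034 | 0.059 | 9 | 0.480 | 0.250 | 10 | 0.585 | 0.453 | 10 | 0.860 |
| 3 | D3S2439 | 0.025 | 9 | 0.351 | 0.015 | 10 | 0.073 | 0.040 | 9 | 0.414 | 0.028 | 11 | 0.503 | 0.194 | 11 | 0.775 |
| 4 | G10243 | 0.387 | 7 | 0.884 | 0.268 | 7 | 0.906 | 0.095 | 7 | 0.233 | 0.076 | 8 | 0.202 | 0.020 | 8 | 0.063 |
| 5 | D5S0029i | 0.857 | 3 | 0.960 | 0.710 | 3 | 0.933 | 0.068 | 5 | 0.172 | 0.186 | 4 | 0.399 | 0.146 | 5 | 0.415 |
| 6 | G10114 | 0.373 | 6 | 0.852 | 0.038 | 6 | 0.272 | 0.282 | 7 | 0.464 | 0.012 | 6 | 0.036 | 0.064 | 8 | 0.255 |
| 7 | D7S1802 | 0.157 | 8 | 0.446 | 0.157 | 9 | 0.664 | 0.258 | 8 | 0.882 | 0.009 | 11 | 0.230 | 0.013 | 12 | 0.232 |
| 8 | HUMUT1239 | 0.022 | 7 | 0.034 | 0.399 | 7 | 0.814 | 0.123 | 7 | 0.466 | 0.098 | 6 | 0.453 | 0.282 | 8 | 0.815 |
| 9 | D9S01471 | 0.372 | 7 | 0.717 | 0.125 | 6 | 0.341 | 0.238 | 7 | 0.279 | 0.156 | 7 | 0.733 | 0.226 | 8 | 0.513 |
| 10 | G08808 | 0.210 | 6 | 0.541 | 0.074 | 7 | 0.123 | 0.123 | 6 | 0.475 | 0.030 | 9 | 0.046 | 0.166 | 10 | 0.220 |
| 11 | D11S0689i | 0.215 | 10 | 0.595 | 0.062 | 10 | 0.124 | 0.553 | 9 | 0.986 | 0.308 | 13 | 0.902 | 0.148 | 13 | 0.699 |
| 12 | G08964 | 0.248 | 8 | 0.286 | 0.071 | 7 | 0.109 | 0.229 | 8 | 0.673 | 0.146 | 9 | 0.920 | 0.333 | 9 | 0.820 |
| 13 | D13S0102i | 0.089 | 6 | 0.476 | 0.135 | 6 | 0.368 | 0.187 | 6 | 0.256 | 0.001 | 7 | 0.016 | 0.002 | 7 | 0.020 |
| 14 | D14S608 | 0.015 | 10 | 0.163 | 0.253 | 8 | 0.600 | 0.340 | 10 | 0.667 | 0.051 | 10 | 0.164 | 0.028 | 10 | 0.114 |
| 15 | G07912 | 0.194 | 6 | 0.306 | 0.179 | 6 | 0.602 | 0.248 | 9 | 0.624 | 0.207 | 7 | 0.627 | 0.076 | 9 | 0.284 |
| 16 | D16S0026i | 0.107 | 8 | 0.696 | 0.358 | 9 | 0.782 | 0.136 | 10 | 0.658 | 0.037 | 11 | 0.389 | 0.047 | 11 | 0.443 |
| 17 | D17S0044i | 0.104 | 16 | 0.888 | 0.361 | 15 | 0.999 | 0.061 | 12 | 0.415 | 0.064 | 23 | 0.706 | 0.053 | 24 | 0.590 |
| 18 | D18S0013i | 0.123 | 9 | 0.299 | 0.499 | 8 | 0.933 | 0.172 | 8 | 0.280 | 0.315 | 10 | 0.961 | 0.382 | 10 | 0.920 |
| 19 | D19S0019i | 0.498 | 8 | 0.977 | 0.216 | 7 | 0.584 | 0.078 | 8 | 0.365 | 0.036 | 8 | 0.183 | 0.290 | 8 | 0.581 |
| 20 | D20S0030i | 0.247 | 7 | 0.557 | 0.047 | 7 | 0.149 | 0.339 | 7 | 0.724 | 0.503 | 8 | 0.970 | 0.125 | 9 | 0.534 |
| 21 | D21S0036i | 0.036 | 13 | 0.558 | 0.109 | 11 | 0.339 | 0.050 | 11 | 0.099 | 0.060 | 13 | 0.365 | 0.059 | 15 | 0.626 |
| 22 | D22S0155i | 0.246 | 4 | 0.443 | 0.003 | 4 | 0.010 | 0.124 | 4 | 0.207 | 8.07E−05 | 4 | 3.44E−04 | 0.001 | 4 | 0.010 |
| X | HUMUT1223 | 0.115 | 5 | 0.515 | 0.140 | 6 | 0.416 | 0.122 | 5 | 0.182 | 0.006 | 6 | 0.069 | 0.021 | 6 | 0.081 |

| Statistic | 1st screening | 2nd screening | 3rd screening | Additional samples | Total |
|---|---|---|---|---|---|
| Pritchard's chi square | 69.4 | 81.8 | 65.4 | 70.0 | 54.6 |
| df | 65 | 64 | 62 | 65 | 65 |
| P value | 0.331 | 0.066 | 0.361 | 0.314 | 0.818 |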
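The summary rows of Table 2 combine per-marker evidence into one chi-square statistic with summed degrees of freedom. The following is a minimal sketch of that kind of summed test, not the procedure actually used in this Example: the per-locus Pearson statistic, the Wilson-Hilferty approximation for the upper-tail P value, and all function names are assumptions introduced for illustration.

```python
import math


def pearson_chi2(case_counts, control_counts):
    """Pearson chi-square and df for a 2 x m table of allele counts (one marker)."""
    # Drop alleles that were never observed; they carry no information.
    cols = [(a, b) for a, b in zip(case_counts, control_counts) if a + b > 0]
    n_case = sum(a for a, _ in cols)
    n_ctrl = sum(b for _, b in cols)
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("both groups need at least one observed allele")
    total = n_case + n_ctrl
    chi2 = 0.0
    for a, b in cols:
        for observed, row_total in ((a, n_case), (b, n_ctrl)):
            expected = row_total * (a + b) / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2, len(cols) - 1


def chi2_upper_tail(x, df):
    """Approximate P(chi-square with df degrees of freedom >= x), Wilson-Hilferty."""
    if df <= 0:
        raise ValueError("df must be positive")
    if x <= 0:
        return 1.0
    v = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - v)) / math.sqrt(v)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def summed_chi2_test(per_locus):
    """per_locus: list of (chi_square, df) pairs, one per unlinked marker."""
    total = sum(c for c, _ in per_locus)
    df = sum(d for _, d in per_locus)
    return total, df, chi2_upper_tail(total, df)
```

Under these assumptions, feeding the first-screening summary values from Table 2 (chi-square 69.4 with 65 degrees of freedom) into `chi2_upper_tail` gives a P value close to the tabulated 0.331. A large P value is read as no detectable stratification.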









After the population stratification test, three pooled DNA templates from each case or control population were used in three-phased genomic screening. This screening method amounts to replication in three independent sample populations and is known to be suitable for excluding many false positives due to Type I errors caused by multiple testing (Barcellos, L. et al., Am. J. Hum. Genet., 61, 724 (1997)). The first (first-phase) screening indicated that 2,847 MS markers were statistically significant (P<0.05) by Fisher's exact test for either the 2×2 or the 2×m contingency table (m=the number of alleles). The subsequent second (second-phase) screening indicated that, of these 2,847 markers, 372 MS markers were significant. After the third (third-phase) screening, 133 positive MS markers were obtained. These results are shown in Table 3.









TABLE 3

Supplementary Table 3. Summary of the phased genome screen by the pooled DNA method

| Chromosome | 1st (n = 125 each): Number of Marker | 1st: Number of Positive* | 2nd (n = 125 each): Number of Marker | 2nd: Number of Positive | 3rd (n = 125 each): Number of Marker | 3rd: Number of Positive |
|---|---|---|---|---|---|---|
| 1 | 2,241 | 232 | 232 | 27 | 25 | 10 |
| 2 | 2,373 | 249 | 249 | 29 | 25 | 7 |
| 3 | 1,991 | 204 | 204 | 33 | 30 | 8 |
| 4 | 1,740 | 184 | 184 | 23 | 23 | 11 |
| 5 | 1,733 | 168 | 168 | 22 | 20 | 11 |
| 6 | 1,619 | 170 | 170 | 36 | 29 | 10 |
| 7 | 1,599 | 201 | 201 | 25 | 23 | 9 |
| 8 | 1,375 | 124 | 124 | 14 | 11 | 4 |
| 9 | 1,101 | 135 | 135 | 9 | 9 | 4 |
| 10 | 1,281 | 127 | 127 | 18 | 16 | 7 |
| 11 | 1,303 | 139 | 139 | 16 | 13 | 5 |
| 12 | 1,260 | 144 | 144 | 16 | 12 | 3 |
| 13 | 893 | 99 | 99 | 12 | 9 | 6 |
| 14 | 762 | 79 | 79 | 19 | 19 | 4 |
| 15 | 689 | 71 | 71 | 9 | 7 | 4 |
| 16 | 732 | 57 | 57 | 12 | 13 | 5 |
| 17 | 725 | 77 | 77 | 6 | 4 | 2 |
| 18 | 750 | 86 | 86 | 8 | 7 | 5 |
| 19 | 503 | 67 | 67 | 10 | 11 | 3 |
| 20 | 565 | 50 | 50 | 7 | 7 | 2 |
| 21 | 324 | 37 | 37 | 4 | 4 | 1 |
| 22 | 293 | 33 | 33 | 6 | 5 | 4 |
| X | 1,187 | 114 | 114 | 11 | 13 | 8 |
| Total | 27,039 | 2,847 (1,377) | 2,847 | 372 (215) | 335 | 133 (53) |

*Number of positive markers by the Fisher exact test for the 2x2 or 2xm contingency tables. The number of positive markers by 2xm is indicated in parentheses.
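The phased screen summarized in Table 3 keeps a marker only if it is significant in every phase. As an illustration, the sketch below implements a two-sided Fisher's exact test for a 2x2 allele-count table using only the Python standard library, together with a helper for the "significant in all phases" rule. The function names, the two-sided convention (summing all tables no more probable than the observed one), and the example counts in the usage note are assumptions; the software actually used for Table 3 is not named in the text.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]].

    Rows could be case/control pools; columns could be the tested allele
    versus all other alleles. The margins are held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x tested alleles in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is at most as probable as the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def passes_all_phases(p_values, alpha=0.05):
    """True if a marker is significant (P < alpha) in every screening phase."""
    return all(p < alpha for p in p_values)
```

For example, `fisher_exact_2x2(3, 1, 1, 3)` evaluates the textbook table with margins of 4 in each row and column, and `passes_all_phases([0.01, 0.03, 0.04])` returns `True` for a hypothetical marker that is significant in all three phases.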






The number of positive MS markers was larger than statistically expected, suggesting that the set included experimental errors caused by the pooled DNA method, as previously reported (Shaw, S. H. et al., Genome Res., 8, 111 (1998); and Shaw, P. et al., Nat. Rev. Genet., 3, 862 (2002)). Thus, we carefully verified these positive markers by individual genotyping in the screened populations. As a result, 47 markers were significant. Of these markers, 25 were excluded due to their low positive allele frequencies (<0.05), resulting in a list of 23 positive MS markers (Table 4).









TABLE 4

Table 1. Twenty-five positive microsatellite markers from individual genotyping

| Markers | Cytobands | Number of alleles | Positive allele | Allele frequency (Control) | Allele frequency (Case) | Fisher's exact P (2x2) | PC (2x2) | Fisher's exact P (2xm) | PC (2xm) | Odds Ratio | 95% CI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| D6S0588i | 6p21.3 | 10 | 5 | 0.430 | 0.572 | 0.000000055 | 0.000014 | 0 | 0 | 1.78 | 1.45-2.18 |
| D6S0483i | 6p21.3 | 18 | 7 | 0.089 | 0.176 | 0.00000092 | 0.00024 | 0 | 0 | 2.18 | 1.59-2.98 |
| D6S1061 | 6p21.3 | 24 | 16 | 0.095 | 0.183 | 0.000001 | 0.00026 | 0 | 0 | 2.14 | 1.57-2.90 |
| D11S0497i | 11q13.4 | 5 | 2 | 0.513 | 0.613 | 0.000031 | 0.008 | 0.00052 | 0.012 | 1.55 | 1.26-1.91 |
| D6S0025i | 6p21.3 | 6 | 2 | 0.125 | 0.185 | 0.002 | 0.51 | 0.0005 | 0.012 | 1.59 | 1.20-2.11 |
| D10S0168i | 10p13 | 4 | 2 | 0.408 | 0.499 | 0.0005 | 0.13 | 0.001 | 0.024 | 1.44 | 1.18-1.77 |
| D14S0452i | 14q23.1 | 9 | 4 | 0.370 | 0.452 | 0.001 | 0.26 | 0.0006 | 0.014 | 1.40 | 1.14-1.72 |
| D8S0127i | 8q13.3 | 16 | 3 | 0.116 | 0.069 | 0.002 | 1 | 0.009 | 0.25 | 0.57 | 0.40-0.81 |
| D7S0086i | 7p21.1 | 11 | 4 | 0.095 | 0.053 | 0.002 | 1 | 0.03 | 0.75 | 0.54 | 0.36-0.80 |
| D10S0607i | 10q26.13 | 5 | 1 | 0.827 | 0.882 | 0.003 | 1 | 0.02 | 0.5 | 1.59 | 1.19-2.14 |
| D13S0561i | 13q31.1 | 10 | 8 | 0.130 | 0.183 | 0.005 | 1 | 0.16 | 1 | 1.50 | 1.13-2.00 |
| G08462 | 5q14.1 | 9 | 4 | 0.190 | 0.136 | 0.005 | 1 | 0.09 | 1 | 0.67 | 0.51-0.89 |
| D16S0496i | 16q12.2 | 10 | 7 | 0.204 | 0.267 | 0.005 | 1 | 0.07 | 1 | 1.41 | 1.11-1.79 |
| D5S0228i | 5q12.1 | 11 | 7 | 0.305 | 0.371 | 0.004 | 1 | 0.02 | 0.5 | 1.35 | 1.09-1.67 |
| D5S400 | 5q34 | 18 | 2 | 0.063 | 0.101 | 0.008 | 1 | 0.03 | 0.75 | 1.69 | 1.15-2.46 |
| D6S0811i | 6q22.33 | 6 | 3 | 0.445 | 0.515 | 0.008 | 1 | 0.01 | 0.25 | 1.31 | 1.07-1.61 |
| D20S910 | 20p12.1 | 14 | 7 | 0.301 | 0.365 | 0.009 | 1 | 0.18 | 1 | 1.34 | 1.08-1.66 |
| D4S0017i | 4q25 | 22 | 5 | 0.071 | 0.111 | 0.009 | 1 | 0.12 | 1 | 1.64 | 1.14-2.35 |
| D16S0232i | 16q24.1 | 4 | 2 | 0.444 | 0.380 | 0.01 | 1 | 0.06 | 1 | 0.77 | 0.63-0.95 |
| D3S1500i | 3p24.3 | 4 | 1 | 0.781 | 0.725 | 0.01 | 1 | 0.005 | 0.13 | 0.74 | 0.58-0.94 |
| D20S470 | 20p12.1 | 14 | 7 | 0.111 | 0.073 | 0.02 | 1 | 0.59 | 1 | 0.64 | 0.45-0.91 |
| DXS0486i | Xq25 | 8 | 1 | 0.118 | 0.090 | 0.09 | 1 | 0.19 | 1 | 0.68 | 0.51-1.04 |
| D18S0090i | 18q12.1 | 20 | 13 | 0.193 | 0.153 | 0.05 | 1 | 0.54 | 1 | 0.76 | 0.58-0.99 |

PC means P values corrected by Bonferroni's correction.

The Fisher's exact test was carried out in the case and control populations (n = 375 each).

The allele frequencies shown are those of the allele with the lowest P value at the locus.
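The text does not state how the odds ratios and 95% confidence intervals in Table 4 were computed. The sketch below shows one common way to obtain them from allele frequencies: allele counts are reconstructed from the rounded frequencies (two chromosomes per individual) and the interval is Woolf's logit interval. Both choices, and the function name, are assumptions made for illustration.

```python
import math


def allelic_odds_ratio(freq_control, freq_case, n_control, n_case, z=1.96):
    """Odds ratio and Woolf (logit) confidence interval for one allele.

    freq_* are frequencies of the tested allele; n_* are numbers of
    diploid individuals, so each group contributes 2 * n chromosomes.
    """
    a = freq_case * 2 * n_case            # case chromosomes carrying the allele
    b = 2 * n_case - a                    # case chromosomes without it
    c = freq_control * 2 * n_control      # control chromosomes carrying it
    d = 2 * n_control - c                 # control chromosomes without it
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the logit interval")
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of ln(odds ratio)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

Calling `allelic_odds_ratio(0.430, 0.572, 375, 375)` with the D6S0588i frequencies gives values close to the tabulated odds ratio and interval. Small differences are expected because the published frequencies are rounded to three decimals.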






The genotype data underlying Table 4 are shown in Table 5. For each MS marker, the table classifies alleles as positive (+), that is, judged to have significant disease association in the rheumatoid arthritis group (P) as compared with the normal individual group (C), or as negative (−). Under this classification, “+/+” means that both alleles are positive, and “+/−” means that one of the alleles is positive. The table can be used, for example, to quantify the likelihood of rheumatoid arthritis onset by scoring each test subject with a specific algorithm based on these values. In the table, “∘” denotes mistyping.

















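TABLE 5

Number of individuals in each genotype class for the control (C) and rheumatoid arthritis (P) groups.

| Marker | −/− C | −/− P | +/+ C | +/+ P | +/− C | +/− P | ∘ C | ∘ P | Total C | Total P |
|---|---|---|---|---|---|---|---|---|---|---|
| D6S0588i | 274 | 192 | 199 | 316 | 450 | 421 | 15 | 52 | 938 | 981 |
| D6S1061 | 311 | 254 | 5 | 11 | 57 | 110 | 565 | 606 | 938 | 981 |
| D6S0483i | 314 | 255 | 14 | 18 | 43 | 101 | 567 | 607 | 938 | 981 |
| D6S0025i | 317 | 290 | 37 | 54 | 20 | 31 | 564 | 606 | 938 | 981 |
| D11S0497i | 219 | 158 | 277 | 340 | 428 | 435 | 14 | 48 | 938 | 981 |
| D10S0168i | 299 | 287 | 190 | 202 | 446 | 450 | 3 | 42 | 938 | 981 |
| D14S0452i | 350 | 311 | 155 | 174 | 409 | 448 | 24 | 48 | 938 | 981 |
| D8S0127i | 727 | 763 | 17 | 12 | 163 | 155 | 31 | 9 | 938 | 939 |
| D7S0086i | 793 | 798 | 8 | 5 | 125 | 117 | 12 | 19 | 938 | 939 |
| D10S0607i | 20 | 16 | 672 | 692 | 238 | 227 | 8 | 4 | 938 | 939 |
| D13S0561 | 650 | 647 | 42 | 54 | 215 | 220 | 31 | 18 | 938 | 939 |
| G08462 | 617 | 662 | 40 | 18 | 272 | 245 | 9 | 14 | 938 | 939 |
| D16S0496i | 568 | 514 | 61 | 61 | 292 | 355 | 17 | 9 | 938 | 939 |
| D5S0228i | 189 | 152 | 43 | 55 | 142 | 168 | 564 | 564 | 938 | 939 |
| D5S400 | 789 | 754 | 8 | 7 | 130 | 175 | 11 | 3 | 938 | 939 |
| D6S0811i | 270 | 250 | 209 | 241 | 449 | 437 | 10 | 11 | 938 | 939 |
| D20S910 | 215 | 194 | 65 | 93 | 93 | 88 | 565 | 564 | 938 | 939 |
| D4S0017i | 324 | 301 | 3 | 9 | 47 | 65 | 564 | 564 | 938 | 939 |
| D16S0232 | 286 | 352 | 170 | 152 | 479 | 430 | 3 | 5 | 938 | 939 |
| D3S1500i | 68 | 64 | 528 | 471 | 318 | 351 | 24 | 53 | 938 | 939 |
| D20S470 | 296 | 324 | 5 | 4 | 73 | 47 | 564 | 564 | 938 | 939 |
| D18S0090i | 244 | 268 | 13 | 8 | 117 | 99 | 564 | 564 | 938 | 939 |
| DXS0486i | 296 | 318 | 13 | 7 | 63 | 50 | 566 | 564 | 938 | 939 |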
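The text mentions that Table 5 could be used to grade each test subject "according to specific algorithm" but does not define that algorithm. Purely as a hypothetical illustration, the sketch below counts the positive alleles a subject carries across the typed markers. The scoring rule, the function name, and the genotype encoding are all assumptions; the rule ignores effect sizes and the direction of each association (several markers in Table 4 have odds ratios below 1), so it is not a validated risk score.

```python
def positive_allele_score(genotypes):
    """Count positive alleles over all successfully typed markers.

    genotypes maps a marker name to "+/+", "+/-", "-/-", or None
    (None stands for a mistyped or untyped marker and is skipped).
    Returns (number of positive alleles, number of typed markers).
    """
    score = 0
    typed = 0
    for marker, call in genotypes.items():
        if call is None:
            continue
        typed += 1
        score += call.split("/").count("+")
    return score, typed


# Hypothetical subject typed at four of the markers listed in Table 5.
subject = {
    "D6S0588i": "+/+",
    "D6S1061": "+/-",
    "D11S0497i": "-/-",
    "D10S0168i": None,
}
```

Reporting the number of typed markers alongside the score makes it possible to compare subjects with different amounts of missing data, for example by dividing the score by twice the number of typed markers.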









The seven most significant markers in the list of Table 4 were also significant after Bonferroni's correction (Pc<0.05). Therefore, in this Example, SNP genotyping was focused on these candidate regions.
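Bonferroni's correction multiplies each raw P value by the number of tests and caps the result at 1. The sketch below shows that calculation; the number of tests used in this Example is not stated explicitly in the text, so `n_tests` is left as a parameter and the value in the usage note is only illustrative.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P value: min(1, p * number of tests)."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


def significant_after_correction(p_values, n_tests, alpha=0.05):
    """Return the indices of P values that stay below alpha after correction."""
    return [i for i, p in enumerate(p_values)
            if bonferroni(p, n_tests) < alpha]
```

With an illustrative `n_tests` of 25, a raw P value of 0.01 becomes 0.25 and is no longer significant, whereas a raw P value of 0.0005 becomes 0.0125 and remains below 0.05.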


SNP Genotyping:

Among the seven most significant markers, four (i.e., the first, second, third, and fifth) were located in the HLA region on 6p21.3 (FIG. 3), whereas the fourth, sixth, and seventh significant markers were located on 11q13.4, 10p13, and 14q23.1, respectively (cytobands are designated under the NCBI build 30).


SNPs in the neighborhoods of these candidate regions were selected from dbSNP database of NCBI homepage and JSNP database of the homepage of The Institute of Medical Science, The University of Tokyo. These SNPs were genotyped using TaqMan assay or direct sequencing. The TaqMan assay was conducted using the standard protocol of ABI PRISM 7900HT Sequence Detection System (Applied Biosystems) equipped with 384-Well Block Module and Automation Accessory. The direct sequencing of the PCR products was conducted according to a standard approach using ABI3700 DNA analyzer (Applied Biosystems). In the HLA region, additional SNPs were selected from IkBL to C4B genes in order to verify previously reported RA association around the centromeric end of the HLA class III region. See Table 6 for the details of the selected SNPs.


Genotyping was conducted on 165 SNPs in the case and control populations used in the MS typing. Of these SNPs, 41 were neither polymorphic nor STSs (sequence tagged sites) (see Table 6) and were therefore excluded from subsequent analysis. Among the remaining 124 SNPs, 54 were statistically significant by case-control association analysis (P<0.05) (Table 7). LD block structures were predicted for these 124 SNPs by the EM algorithm (FIG. 2), and case-control association analysis using haplotypes in each block was conducted according to this algorithm (Table 8). To reproduce these SNP allelic associations, these 54 positive SNPs were genotyped in additional populations composed of 565 case individuals and 565 control individuals. Finally, 45 positive SNPs were obtained in the combined (n=2×940) population consisting of all the samples used in this experiment. Among these positive SNPs, 24 were also significant (Pc<0.05) after Bonferroni's correction (Table 7).
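The text names only "EM algorithm" for predicting LD blocks and does not identify the software or settings. As a minimal sketch of the underlying idea, the code below runs the textbook expectation-maximization estimate of haplotype frequencies for two biallelic SNPs from unphased genotype counts and derives the r² measure of LD. The data layout, function names, starting values, and convergence settings are assumptions.

```python
def em_haplotype_freqs(g, tol=1e-10, max_iter=1000):
    """Estimate two-SNP haplotype frequencies from unphased genotypes.

    g[i][j] is the number of individuals carrying i copies of allele A
    at SNP 1 and j copies of allele B at SNP 2 (i, j in 0..2).
    Returns frequencies for the haplotypes "AB", "Ab", "aB", "ab".
    """
    n = sum(sum(row) for row in g)
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Haplotype counts that are unambiguous (everything but double heterozygotes).
    base = {
        "AB": 2 * g[2][2] + g[2][1] + g[1][2],
        "Ab": 2 * g[2][0] + g[2][1] + g[1][0],
        "aB": 2 * g[0][2] + g[1][2] + g[0][1],
        "ab": 2 * g[0][0] + g[1][0] + g[0][1],
    }
    double_het = g[1][1]
    freqs = {h: 0.25 for h in base}
    for _ in range(max_iter):
        # E-step: split double heterozygotes between AB/ab and Ab/aB phases.
        cis = freqs["AB"] * freqs["ab"]
        trans = freqs["Ab"] * freqs["aB"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate frequencies from expected haplotype counts.
        new = {
            "AB": (base["AB"] + double_het * w) / (2 * n),
            "ab": (base["ab"] + double_het * w) / (2 * n),
            "Ab": (base["Ab"] + double_het * (1 - w)) / (2 * n),
            "aB": (base["aB"] + double_het * (1 - w)) / (2 * n),
        }
        converged = max(abs(new[h] - freqs[h]) for h in freqs) < tol
        freqs = new
        if converged:
            break
    return freqs


def r_squared(freqs):
    """Squared correlation (r^2) between the two SNPs from haplotype frequencies."""
    p_a = freqs["AB"] + freqs["Ab"]
    p_b = freqs["AB"] + freqs["aB"]
    d = freqs["AB"] - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0
```

Pairs of neighboring SNPs with high r² would typically be grouped into the same block. Real block-finding tools handle many SNPs at once and missing data, which this two-SNP sketch does not attempt.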


Hereinafter, the analysis result of each chromosome will be described.


6p21.3


In the HLA region on 6p21.3, 28 of 71 polymorphic SNPs were statistically significant (Pc<0.05) in the first test. Preliminary genotyping on HLA-DRB1 revealed that the HLA-DRB1*0405 allele was most significant (P=1.3×10−12). The result was, as expected, consistent with many previous reports on Japanese populations (Wakitani, S. et al., Br. J. Rheumatol., 36, 630 (1997); and Shibue, T. et al., Arthritis Rheum., 43, 753 (2000)) and demonstrated that the method used in the present invention is effective for detecting the association of susceptibility genes with RA. In addition to HLA-DRB1, the association of the IkBL gene (MIM*601022) promoter SNP rs3219185 was also reproduced (P=5.4×10−5), albeit with a relatively low frequency of the minor allele. Moreover, strong association was seen around the NOTCH4 (MIM*164951) and TNXB (MIM*600985) genes, which were approximately 250 kb and 300 kb, respectively, distant from HLA-DRB1.


The NOTCH4 gene is a proto-oncogene with epidermal growth factor (EGF) repeats. NOTCH4 encodes a large transmembrane receptor predicted to be involved in the signal transduction of cell proliferation, cell differentiation, and angiogenesis (Yung Yu, C. et al., Immunol. Today, 21, 320 (2000)). In NOTCH4, nine SNPs were statistically significant, among which two caused amino acid exchange. Among these nine SNPs, rs2071282, the SNP in exon 4, exhibited the strongest association (P=3.1×10−8) and caused Leu203Pro exchange at the fourth EGF repeat in the extracellular domain of NOTCH4. On the other hand, rs915894 in exon 3 was moderately significant (P=0.044) and caused Lys116Gln exchange at the third EGF repeat.


The TNXB gene encodes an extracellular matrix protein with 34 fibronectin type III-like (FNIII) and 18 EGF repeats and participates in at least one essential function, collagen deposition in connective tissues (Mao, J. R. et al., Nat. Genet. 30, 421 (2002)). In TNXB, five SNPs were statistically significant, of which four caused amino acid exchange. Among these five SNPs, rs185819 in exon 10 exhibited the strongest association (P=6.8×10−5) and caused His1248Arg exchange at the seventh FNIII repeat. Other SNPs, rs2075563 (Glu3260Lys) in exon 29, rs2269428 (His2363Pro) in exon 21, and rs3749960 (Phe2300Tyr) in exon 20, were also significant and located in the 26th, 18th, and 17th FNIII repeats, respectively.


These six positive SNPs were finally confirmed in the combined (n=2×940) population (Table 6). Further, haplotype analysis demonstrated these results for IkBL, NOTCH4, and TNXB (Table 7), indicating the absence of, in all blocks of each gene, common haplotypes with greater risks than that of single SNP in each gene. When multiple logistic regression analysis was conducted for the SNPs in IkBL, TNXB, and NOTCH4 with those in HLA-DRB1, three genes, DRB1*0405 (ORs=2.29-8.84), rs3219185 in IkBL (ORs=1.16-2.67), and rs185819 in TNXB (ORs=1.00-1.62), were significant (P<0.05) in a partially recessive model. Two SNPs, DRB1 (ORs=2.16-4.69) and TNXB (ORs=1.02-1.84), were significant in a partially dominant model. On the other hand, when the analysis was limited to the shared epitope (SE) of DRB1, SE (ORs=1.79-3.85), IkBL (ORs=1.11-2.54), and rs2071282 in NOTCH4 (ORs=1.13-7.14) were significant only in the partially recessive model. These results suggested that these loci independently correspond to RA in the partially recessive model.


11q13.4


The candidate region on 11q13.4 contained nine genes including three mitochondrial-related genes MRPL48, UCP2, and UCP3. Although MRPL48 was recently found as a gene having homology to mammalian mitochondrial ribosomal proteins (MRPs) (Zhang, Z. and Gerstein, M., Genomics, 81, 468 (2003)), its function is still unknown. UCP2 (MIM*601693) and UCP3 (MIM*602044) encode transporter proteins on the inner mitochondrial membrane and participate in energy consumption. UCP2 is also known as a susceptibility gene for obesity and diabetes. RAS-associated protein RAB6A (MIM*179513) was found on the centromeric side of MRPL48. Further, three novel genes were located in regions FLJ11848, LOC374407, and DKFZP586P0123. FLJ11848 has WD40 repeats and widely participates in cell-cell interaction (Smith, T. F. et al., Trends Biochem. Sci., 24, 181 (1999)). LOC374407 has been found to have homology to heat shock protein 40 homolog (HSP40 homolog) and structural similarity to spermatogenesis apoptosis-related protein. DKFZP586P0123 has one protein kinase C conserved region.


In these genes, 16 of 25 polymorphic SNPs were statistically significant in the first test. Although these positive SNPs were scattered over the region tested, the most significant associations (P=0.00015) were observed in two SNPs, rs1792174 in the 5′-UTR and rs1792160 in intron 3 of MRPL48. MRPL48 also had two other positive SNPs, rs1792193 (P=0.003) in intron 5 and rs1051090 (P=0.007) in the 3′-UTR. Positive SNPs were also observed in all of the other genes, UCP2, UCP3, RAB6A, and FLJ11848. However, only one common haplotype in the block b2 containing MRPL48 and FLJ11848 exhibited significant association as strong as the single SNP in MRPL48. These positive SNPs in MRPL48 were finally confirmed after Bonferroni's correction in the combined population (Table 7). On the other hand, rs1527302 in DKFZP586P0123 was significant (P=0.00078) both in the first test and after haplotype analysis. However, the SNP allelic association was not confirmed in the combined population. These results suggested that other causative SNPs are present in the block b2.


10p13, 14q23.1, and PADI4


The candidate region on 10p13 had two genes, DKFZP761F241 and optineurin (OPTN). Three SNPs in the DKFZP761F241 gene were statistically significant in the first test; however, they were not confirmed after correction in the combined population. No common haplotype existed in regions that remained after Bonferroni's correction in each population.


On the other hand, the candidate region on 14q23.1 contained only the reticulon 1 gene (RTN1; MIM*600865), which encodes the neuroendocrine-specific protein group. Even after Bonferroni's correction in the combined samples, rs2182138 in intron 3 of RTN1 was still statistically significant (P=0.0002). No common haplotype was observed in either region that remained after correction.


Further, in the PADI4 gene that appeared to be a susceptibility gene for RA by the candidate gene approach (Non-Patent Document 5), four positive SNPs, padi89 (P=0.002), padi90 (P=0.004), rs874881 (P=0.002), and rs2240340 (P=0.002), were replicated in the populations of this Example. D1S1144i, a CA microsatellite marker in intron 6 of the PADI4 gene, was confirmed to be included in the RA marker set and exhibit slight significance (P=0.008) but low associated allele frequency (P=0.037 in the control population).









TABLE 6

Supplementary Table 4. List of all SNPs

| Cytobands | SNPs | Method | g/cSNP | Location: gene name | Location: portion | Note |
|---|---|---|---|---|---|---|
| 6p21.3 | rs1615839 | TaqMan(AbO)*1 | cSNP | | | |
| | rs2242955 | TaqMan(AbO) | cSNP | MICS | intron | |
| | rs2071595 | TaqMan(AbO) | cSNP | BAT1 | intron | |
| | rs3219156 | Sequencing*2 | cSNP | IkBL | promoter | |
| | rs3219185 | Sequencing | cSNP | IkBL | promoter | |
| | rs3219184 | Sequencing | cSNP | IkBL | promoter | |
| | rs2071592 | Sequencing | cSNP | IkBL | promoter | |
| | rs2239708 | Sequencing | cSNP | IkBL | intron | |
| | rs2071591 | Sequencing | cSNP | IkBL | intron | |
| | rs769178 | TaqMan(AoD)*3 | gSNP | | | |
| | rs2269475 | TaqMan(AbO) | gSNP | AIF1 | intron | |
| | rs2857 93 | TaqMan(AoD) | gSNP | | | |
| | rs3130071 | TaqMan(AoD) | cSNP | BAT2 | syn | Not polymorphic |
| | rs1048069 | TaqMan(AoD) | cSNP | BAT2 | nonsyn | |
| | rs2242 6 | TaqMan(AoD) | cSNP | BAT2 | intron | |
| | rs 06299 | TaqMan(AoD) | cSNP | BAT3 | intron | |
| | rs 05263 | TaqMan(AoD) | cSNP | C6orf5B | 5′UTR | |
| | rs2142234 | TaqMan(AoD) | cSNP | LY6G5B | intron | Not polymorphic |
| | rs 052 7 | TaqMan(AbO) | cSNP | LY6G5B | nonsyn | |
| | rs605273 | TaqMan(AoD) | cSNP | BAT5 | intron | |
| | rs2242653 | TaqMan(AbO) | cSNP | LY6GBD | nonsyn | |
| | rs400547 | TaqMan(AoD) | cSNP | CUC1 | intron | |
| | rs1150793 | TaqMan(AoD) | cSNP | MSH5 | intron | |
| | rs707936 | TaqMan(AoD) | cSNP | C6orf27 | nonsyn | |
| | rs707929 | TaqMan(AoD) | cSNP | C6orf27 | intron | |
| | rs2242 68 | TaqMan(AbO) | gSNP | LSM2 | intron | |
| | rs2075800 | TaqMan(AoD) | cSNP | HSPA1L | nonsyn | |
| | rs2227955 | TaqMan(AoD) | cSNP | HSPA1L | nonsyn | |
| | rs2242 7 | TaqMan(AbO) | gSNP | HSPA1A | 5′UTR | |
| | rs605203 | TaqMan(AoD) | gSNP | | | |
| | rs2072579 | TaqMan(AbO) | cSNP | C6orf46 | 5′UTR | Not polymorphic |
| | rs1042563 | TaqMan(AoD) | cSNP | C2 | syn | |
| | rs3763303 | TaqMan(AbO) | cSNP | C2 | syn | Not polymorphic |
| | rs1048709 | TaqMan(AoD) | cSNP | BF | syn | |
| | rs444 21 | TaqMan(AoD) | cSNP | S V2L | intron | |
| | rs474534 | TaqMan(AoD) | cSNP | DOM32 | intron | |
| | rs2072564 | TaqMan(AbO) | cSNP | TNXB | intron | Only heterozygote |
| | rs2242569 | TaqMan(AbO) | cSNP | TNXB | syn | |
| | rs2075563 | TaqMan(AbO) | cSNP | TNXB | nonsyn | |
| | rs22 9428 | TaqMan(AbO) | cSNP | TNXB | nonsyn | |
| | rs3749960 | TaqMan(AbO) | cSNP | TNXB | nonsyn | |
| | rs3749962 | TaqMan(AbO) | cSNP | TNXB | syn | Only heterozygote |
| | rs204877 | TaqMan(AoD) | cSNP | TNXB | intron | |
| | rs165619 | TaqMan(AoD) | cSNP | TNXB | nonsyn | |
| | rs204900 | TaqMan(AoD) | cSNP | TNXB | nonsyn | |
| | rs204896 | TaqMan(AbO) | cSNP | TNXB | nonsyn | |
| | rs429150 | TaqMan(AoD) | cSNP | TNXB | intron | |
| | rs204999 | TaqMan(AoD) | gSNP | | | |
| | rs2071299 | TaqMan(AbO) | cSNP | EGFL8 | nonsyn | |
| | rs406359 | TaqMan(AbO) | gSNP | AGPAT1 | intron | |
| | rs2070 00 | TaqMan(AbO) | cSNP | AGER | nonsyn | |
| | rs2071267 | TaqMan(AoD) | cSNP | NOTCH4 | intron | |
| | rs206018 | TaqMan(AoD) | gSNP | NOTCH4 | intron | |
| | rs2849012 | TaqMan(AoD) | cSNP | NOTCH4 | intron | |
| | rs422951 | TaqMan(AoD) | cSNP | NOTCH4 | nonsyn | |
| | rs520692 | Sequencing | cSNP | NOTCH4 | nonsyn | |
| | rs520688 | Sequencing | cSNP | NOTCH4 | syn | |
| | rs2071284 | Sequencing | cSNP | NOTCH4 | intron | |
| | rs2071263 | Sequencing | cSNP | NOTCH4 | syn | |
| | rs2071262 | TaqMan(AbO) | cSNP | NOTCH4 | nonsyn | |
| | rs2071281 | Sequencing | cSNP | NOTCH4 | syn | |
| | rs415009 | Sequencing | cSNP | NOTCH4 | syn | |
| | rs915894 | TaqMan(AoD) | cSNP | NOTCH4 | nonsyn | |
| | rs443196 | TaqMan(AbO) | cSNP | NOTCH4 | syn | |
| | rs367396 | Sequencing | cSNP | NOTCH4 | 5′UTR | |
| | rs3132953 | TaqMan(AbO) | gSNP | | | Only heterozygote |
| | rs999575 | TaqMan(AoD) | gSNP | | | |
| | rs391233 | TaqMan(AoD) | cSNP | C6orf10 | intron | |
| | rs2273019 | TaqMan(AbO) | cSNP | C6orf10 | intron | |
| | rs2073044 | TaqMan(AoD) | cSNP | C6orf10 | intron | |
| | rs2294876 | TaqMan(AoD) | cSNP | BTNL2 | intron | |
| | rs2076523 | TaqMan(AoD) | cSNP | BTNL2 | nonsyn | |
| | rs3135344 | TaqMan(AoD) | gSNP | | | |
| | rs3129 55 | TaqMan(AbO) | gSNP | | | Not polymorphic |
| | rs2227139 | TaqMan(AbO) | gSNP | | | |
| | rs13454556 | TaqMan(AbO) | gSNP | | | Not polymorphic |
| | rs3830130 | TaqMan(AbO) | cSNP | HLA-DRB3 | intron | Multi cluster |
| | rs3826616 | TaqMan(AbO) | cSNP | HLA-DRB3 | nonsyn | Multi cluster |
| | rs382 540 | TaqMan(AbO) | gSNP | | | Multi cluster |
| | rs3830121 | TaqMan(AbO) | cSNP | HLA-DRB1 | intron | Multi cluster |
| | rs2858664 | TaqMan(AbO) | gSNP | | | Multi cluster |
| | rs2269799 | TaqMan(AbO) | cSNP | HLA-DQA1 | intron | Multi cluster |
| | rs3135000 | TaqMan(AbO) | gSNP | | | Multi cluster |
| | rs2647012 | TaqMan(AbO) | gSNP | | | |
| | rs2655559 | TaqMan(AbO) | gSNP | | | Multi cluster |
| | rs2071796 | TaqMan(AbO) | gSNP | | | |
| | rs1049110 | TaqMan(AbO) | cSNP | HLA-DQB2 | nonsyn | |
| | rs2071560 | TaqMan(AoD) | cSNP | HLA-DQB2 | intron | |
| 11q13.4 | rs3781900 | TaqMan(AbO) | gSNP | | | |
| | rs2006734 | TaqMan(AbO) | cSNP | PLEKHB1 | intron | |
| | rs6590 | TaqMan(AbO) | cSNP | PLEKHB1 | 3′UTR | |
| | rs3182799 | Sequencing | cSNP | RAB6A | 3′UTR | Not polymorphic |
| | rs3741142 | Sequencing | cSNP | RAB6A | 3′UTR | |
| | rs3182792 | Sequencing | cSNP | RAB6A | syn | Not polymorphic |
| | rs3182790 | Sequencing | cSNP | RAB6A | syn | Not polymorphic |
| | rs3182790 | Sequencing | cSNP | RAB6A | syn | Not polymorphic |
| | rs3182768 | Sequencing | cSNP | RAB6A | nonsyn | Not polymorphic |
| | rs1464906 | TaqMan(AbO) | cSNP | RAB6A | intron | |
| | rs3203705 | TaqMan(AbO) | cSNP | RAB6A | nonsyn | Only heterozygote |
| | rs2140693 | TaqMan(AbO) | cSNP | RAB6A | intron | |
| | rs1043234 | TaqMan(AbO) | cSNP | RAB6A | 5′UTR | |
| | rs1621654 | TaqMan(AbO) | gSNP | | | Not polymorphic |
| | rs1792174 | TaqMan(AbO) | cSNP | MPRL48 | 5′UTR | |
| | rs1723634 | TaqMan(AoD) | cSNP | MPRL48 | intron | Only heterozygote |
| | rs1792160 | TaqMan(AoD) | cSNP | MPRL48 | intron | |
| | rs1453825 | TaqMan(AbO) | cSNP | MPRL48 | intron | Not polymorphic |
| | rs1792193 | TaqMan(AoD) | cSNP | MPRL48 | intron | |
| | rs1051090 | TaqMan(AbO) | cSNP | MPR143 | 3′UTR | |
| | rs2010583 | TaqMan(AbO) | gSNP | | | |
| | rs1870681 | TaqMan(AoD) | cSNP | FLJ11849 | Intron | Not polymorphic |
| | rs2057912 | TaqMan(AbO) | cSNP | FLJ11848 | nonsyn | |
| | rs3741138 | TaqMan(AoD) | cSNP | FLJ11848 | nonsyn | |
| | rs1818529 | TaqMan(AbO) | cSNP | FLJ11848 | Intron | Not polymorphic |
| | rs935985 | TaqMan(AbO) | cSNP | FLJ11848 | Intron | |
| | rs837028 | TaqMan(AbO) | gSNP | | | |
| | rs653263 | TaqMan(AoD) | cSNP | LOC374407 | syn | |
| | rs655717 | TaqMan(AbO) | gSNP | | | |
| | rs680339 | TaqMan(AoD) | cSNP | UCP2 | nonsyn | |
| | rs668514 | TaqMan(AoD) | gSNP | | | |
| | rs2075677 | TaqMan(AoD) | cSNP | UCP3 | syn | |
| | rs2229706 | TaqMan(AbO) | cSNP | UCP3 | nonsyn | Not polymorphic |
| | rs1800849 | TaqMan(AoD) | gSNP | | | |
| | rs1320428 | TaqMan(AoD) | gSNP | | | Not polymorphic |
| | rs1685343 | TaqMan(AoD) | gSNP | | | Not polymorphic |
| | rs626072 | TaqMan(AoD) | cSNP | DKFZP560P0123 | Intron | |
| | rs528032 | TaqMan(AoD) | cSNP | DKFZP586P0124 | Intron | Not polymorphic |
| | rs888650 | TaqMan(AoD) | cSNP | DKFZP586P0123 | Intron | |
| | rs1527302 | TaqMan(AoD) | cSNP | DKFZP586P0123 | Intron | |
| 10p13 | rs2493762 | TaqMan(AbO) | gSNP | | | |
| | rs963335 | TaqMan(AbO) | gSNP | | | Not polymorphic |
| | rs2280076 | TaqMan(AbO) | cSNP | DKFZP761F241 | 3′UTR | |
| | rs1439915 | TaqMan(AoD) | cSNP | DKFZP761F241 | Intron | |
| | rs988762 | TaqMan(AoD) | cSNP | DKFZP761F241 | Intron | |
| | rs2668002 | TaqMan(AoD) | cSNP | DKFZP761F241 | Intron | Not polymorphic |
| | rs2658907 | TaqMan(AoD) | cSNP | DKFZP761F241 | Intron | |
| | rs662141 | TaqMan(AoD) | cSNP | DKFZP761F241 | Intron | |
| | rs920409 | TaqMan(AoD) | cSNP | DKFZP761F241 | Intron | |
| | rs3957005 | TaqMan(AbO) | gSNP | | | Not polymorphic |
| | rs585850 | TaqMan(AbO) | gSNP | | | |
| | rs1347979 | TaqMan(AoD) | gSNP | | | |
| | rs571066 | TaqMan(AoD) | gSNP | | | |
| | rs2580915 | TaqMan(AbO) | cSNP | OPTN | 5′UTR | Not polymorphic |
| | rs860592 | TaqMan(AoD) | cSNP | OPTN | Intron | |
| | rs2244380 | TaqMan(AoD) | cSNP | OPTN | Intron | |
| | rs785884 | TaqMan(AoD) | cSNP | OPTN | Intron | |
| | rs1802343 | TaqMan(AbO) | cSNP | OPTN | nonsyn | Not polymorphic |
| | OPTN-1 | TaqMan(AbO) | cSNP | OPTN | nonsyn | |
| | rs1324252 | TaqMan(AoD) | gSNP | | | |
| 14q23.1 | rs725951 | TaqMan(AbO) | gSNP | | | |
| | rs2073318 | TaqMan(AbO) | gSNP | | | |
| | rs1980579 | TaqMan(AoD) | gSNP | | | |
| | rs1950789 | TaqMan(AoD) | cSNP | RTN1 | Intron | |
| | rs1884737 | TaqMan(AoD) | cSNP | RTN1 | Intron | |
| | rs2349898 | TaqMan(AbO) | cSNP | RTN1 | Intron | |
| | rs1952043 | TaqMan(AoD) | cSNP | RTN1 | Intron | |
| | rs1957989 | TaqMan(AoD) | cSNP | RTN1 | Intron | |
| | rs2182139 | TaqMan(AoD) | cSNP | RTN1 | Intron | |
| | rs1952041 | TaqMan(AoD) | cSNP | RTN1 | Intron | Multi cluster |
| | rs1957996 | TaqMan(AbO) | cSNP | RTN1 | Intron | Not polymorphic |
| | rs1952032 | TaqMan(AbO) | cSNP | RTN1 | Intron | |
| | rs1957983 | TaqMan(AoD) | cSNP | RTN1 | Intron | Multi cluster |
| | rs1253288 | TaqMan(AbO) | cSNP | RTN1 | Intron | |
| | rs927325 | TaqMan(AbO) | cSNP | RTN1 | Intron | |
| | rs2064992 | TaqMan(AoD) | cSNP | RTN1 | Intron | |
| | rs1951363 | TaqMan(AoD) | gSNP | | | Multi cluster |





*1 Typed by TaqMan systems. Primers and probes were prepared by Assays-on-Demand™.

*2 Typed by direct sequencing.

Used primers

SNPs on IkBL (6p21.3)
Template PCR forward 5′-QCAAGAGATGAGGCCTAACCTAAC-3′
Template PCR reverse 5′-CATCCTACGATAGTCTTCTTCCGTC-3′
Sequencing primer 5′-TACCTGGGCTCCTGAGCCT-3′
Sequencing primer 5′-AGAAGCTCGGAGACGGGAG-3′

SNPs (rs520092-rs2071283, rs2071251, rs415929) on NOTCH4 (6p21.3)
Template PCR forward 5′-TCCTTCTCTACCTCCCACCTCCTGA-3′
Template PCR reverse 5′-CACTGCTGCCGCCATTACCAC-3′
Sequencing primer 5′-GCCTCAGGTGAGCAGTGCCAG-3′

SNPs (rs357394) on NOTCH4 (6p21.3)
Template PCR forward 5′-GCCTGACCTTTCATGTCCCCATC-3′
Template PCR reverse 5′-GGTGTCCAGGACATTGTGTGACACA-3′
Sequencing was performed using the reverse primer.

SNPs on RAB6A (11q13.4)
Template PCR forward 5′-CAGGCAGCAATGATGAATTG-3′
Template PCR reverse 5′-TCCATTTGAGCACCTTATATGG-3′
Sequencing was performed using the reverse primer.

*3 Typed by TaqMan systems. Primers and probes were prepared by Assays-by-Design™.

A blank within an entry indicates data missing or illegible when filed.














TABLE 7

Table 2. SNP allelic association

Samples for pooled screens (Control:Case = 375:375)

| Cytobands | SNPs | Gene name | Portion | Amino Acid | Allele | Frequency (Control) | Frequency (Case) | P-value* | Pc | Odds Ratio | 95% CI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 6p21.3 | rs3219185 | | promoter | | G | 0.026 | 0.972 | 0.000064 | 0.0028 | 2.79 | 1.07-4.06 |
| | rs769178 | | | | A | 0.176 | 0.231 | 0.0090 | 0.47 | 1.40 | 1.09-1.81 |
| | rs2242656 | BAT3 | intron8 | | A | 0.880 | 0.898 | 0.033 | 1 | 1.42 | 1.04-1.95 |
| | rs605273 | BAT5 | intron4 | | C | 0.800 | 0.809 | 0.026 | 1 | 1.44 | 1.05-1.96 |
| | rs2242668 | LSM2 | intron2 | | A | 0.874 | 0.911 | 0.024 | 1 | 1.48 | 1.08-2.06 |
| | rs474834 | DOM3Z | intron5 | | T | 0.908 | 0.948 | 0.0020 | 0.10 | 1.90 | 1.27-2.85 |
| | rs2242589 | TNXB | exon29 | | G | 0.085 | 0.094 | 0.045 | 1 | 1.50 | 1.02-2.19 |
| | rs2075563 | TNXB | exon29* | Glu3260Lys | G | 0.101 | 0.170 | 0.00012 | 0.0060 | 1.62 | 1.35-2.47 |
| | rs2269428 | TNXB | exon21* | His2363Pro | A | 0.101 | 0.165 | 0.00034 | 0.018 | 1.76 | 1.29-2.36 |
| | rs3749980 | TNXB | exon20* | Phe2300Tyr | T | 0.101 | 0.165 | 0.00034 | 0.018 | 1.76 | 1.29-2.36 |
| | rs185819 | TNXB | exon10* | His1248Arg | A | 0.631 | 0.727 | 0.000068 | 0.0036 | 1.58 | 1.28-1.94 |
| | rs204900 | | | | G | 0.928 | 0.973 | 0.000084 | 0.0033 | 2.83 | 1.68-4.78 |
| | rs2071289 | EGFL8 | exon6* | Glu204Ala | A | 0.016 | 0.038 | 0.022 | 1 | 2.30 | 1.15-4.57 |
| | rs2849012 | NOTCH4 | intron7 | | G | 0.873 | 0.774 | 0.000015 | 0.00077 | 1.86 | 1.32-2.09 |
| | rs620688 | NOTCH4 | exon8 | | G | 0.338 | 0.414 | 0.0030 | 0.18 | 1.38 | 1.12-1.70 |
| | rs2071284 | NOTCH4 | intron4 | | A | 0.104 | 0.203 | 0.0000000098 | 0.0000061 | 2.21 | 1.64-2.96 |
| | rs2071283 | NOTCH4 | exon4 | | A | 0.104 | 0.203 | 0.000000008 | 0.0000051 | 2.21 | 1.64-2.98 |
| | rs2071282 | NOTCH4 | exon4* | Leu203Pro | T | 0.103 | 0.207 | 0.000000031 | 0.0000016 | 2.26 | 1.69-3.04 |
| | rs2071281 | NOTCH4 | exon4 | | T | 0.104 | 0.203 | 0.000000098 | 0.0000061 | 2.21 | 1.64-2.96 |
| | rs415929 | NOTCH4 | exon4 | | G | 0.336 | 0.414 | 0.0030 | 0.16 | 1.38 | 1.12-1.70 |
| | rs915894 | NOTCH4 | exon3* | Lys118Gln | A | 0.499 | 0.552 | 0.044 | 1 | 1.24 | 1.01-1.82 |
| | rs443198 | NOTCH4 | exon3 | | T | 0.501 | 0.584 | 0.015 | 0.78 | 1.29 | 1.05-1.58 |
| | rs2273019 | C8orf10 | intron11 | | A | 0.382 | 0.471 | 0.00067 | 0.030 | 1.44 | 1.17-1.77 |
| | rs2294878 | BTNL2 | intron2 | | C | 0.644 | 0.734 | 0.00019 | 0.0097 | 1.53 | 1.23-1.90 |
| | rs2227139 | | | | A | 0.596 | 0.723 | 0.00000022 | 0.000011 | 1.77 | 1.43-2.20 |
| | | HLA-DRB1 | | | D406 | 0.129 | 0.276 | 0.00000000000013 | 0.000000000067 | 2.67 | 2.04-3.49 |
| | rs2847012 | | | | A | 0.617 | 0.898 | 0.0000087 | 0.00045 | 1.96 | 1.46-2.64 |
| | rs2071798 | | | | T | 0.691 | 0.779 | 0.00014 | 0.0073 | 1.57 | 1.25-1.99 |
| | rs1049110 | HLA-DOB2 | exon5* | Gln161Arg | A | 0.761 | 0.807 | 0.033 | 1 | 1.31 | 1.03-1.66 |
| 11q13.4 | rs3781909 | | | | C | 0.411 | 0.484 | 0.0040 | 0.21 | 1.35 | 1.10-1.05 |
| | rs2006734 | PLEKHB1 | intron5 | | T | 0.412 | 0.479 | 0.0090 | 0.47 | 1.31 | 1.07-1.61 |
| | rs2140693 | RAB6A | intron1 | | C | 0.492 | 0.873 | 0.0020 | 0.10 | 1.39 | 1.13-1.70 |
| | rs1792174 | MPRL48 | 5′UTR | | A | 0.500 | 0.598 | 0.00015 | 0.0080 | 1.49 | 1.21-1.83 |
| | rs1792160 | MPRL48 | intron3 | | A | 0.500 | 0.598 | 0.00015 | 0.0080 | 1.49 | 1.21-1.83 |
| | rs1792193 | MPRL48 | intron5 | | T | 0.541 | 0.617 | 0.0030 | 0.16 | 1.37 | 1.11-1.08 |
| | rs1051090 | MPRL48 | 3′UTR | | C | 0.980 | 0.984 | 0.0070 | 0.36 | 2.58 | 1.30-5.04 |
| | rs3741138 | FLJ11348 | exon7* | Ala209Gly | C | 0.829 | 0.888 | 0.05 | 1 | 1.33 | 1.00-1.76 |
| | rs935985 | FLJ11348 | intron11 | | C | 0.789 | 0.948 | 0.0030 | 0.16 | 1.50 | 1.15-1.98 |
| | rs637028 | | | | T | 0.823 | 0.888 | 0.022 | 1 | 1.40 | 1.05-1.85 |
| | rs863283 | LOC374407 | exon3 | | A | 0.393 | 0.485 | 0.00034 | 0.017 | 1.45 | 1.19-1.79 |
| | rs865717 | | | | T | 0.456 | 0.535 | 0.0020 | 0.10 | 1.37 | 1.12-1.68 |
| | rs880339 | UCP2 | exon4* | Ala56Val | G | 0.480 | 0.536 | 0.0060 | 0.26 | 1.36 | 1.10-1.65 |
| | rs2075877 | UCP3 | exon5 | | G | 0.427 | 0.487 | 0.020 | 1 | 1.27 | 1.04-1.58 |
| | rs1800849 | | | | G | 0.657 | 0.734 | 0.0010 | 0.062 | 1.44 | 1.15-1.79 |
| | rs1527302 | DKFZP586P0123 | intron2 | | T | 0.648 | 0.730 | 0.00078 | 0.041 | 1.47 | 1.18-1.83 |
| 10p13 | rs2280078 | DKFZP761F241 | 3′UTR | | A | 0.722 | 0.779 | 0.012 | 0.62 | 1.36 | 1.07-1.72 |
| | rs2668907 | DKFZP761F241 | intron2 | | A | 0.410 | 0.497 | 0.00076 | 0.039 | 1.43 | 1.16-1.75 |
| | rs662141 | DKFZP761F241 | intron2 | | T | 0.444 | 0.533 | 0.00088 | 0.034 | 1.43 | 1.17-1.75 |
| | rs1347979 | | | | G | 0.824 | 0.876 | 0.0080 | 0.31 | 1.61 | 1.13-2.01 |
| 14q23.1 | rs725951 | | | | T | 0.777 | 0.840 | 0.0020 | 0.10 | 1.51 | 1.17-1.96 |
| | rs2073318 | | | | G | 0.786 | 0.834 | 0.021 | 1 | 1.37 | 1.06-1.77 |
| | rs1980579 | | | | T | 0.783 | 0.835 | 0.013 | 0.66 | 1.40 | 1.06-1.82 |
| | rs1950789 | RTN1 | intron8 | | C | 0.807 | 0.880 | 0.0070 | 0.36 | 1.47 | 1.12-1.94 |
| | rs2182138 | RTN1 | intron3 | | C | 0.791 | 0.842 | 0.014 | 0.73 | 1.40 | 1.06-1.53 |
| | rs927326 | RTN1 | intron1 | | A | 0.459 | 0.513 | 0.039 | 1 | 1.24 | 1.02-1.52 |












All samples tested (Control:Case = 940:940)

Frequencies are given for controls and cases; the odds ratio is followed by its 95% CI.

Cytoband SNP        Control  Case   P-value                  Pc                     Odds ratio  95% CI
6p21.3   rs3219185  0.929    0.984  0.0000038                0.00020                2.01        1.49-2.71
         rs769178   0.186    0.227  0.002                    0.10                   1.29        1.10-1.61
         rs2242656  0.898    0.866  0.07                     1                      1.20        0.99-1.46
         rs605273   0.808    0.806  0.07                     1                      1.20        0.99-1.46
         rs2242668  0.882    0.901  0.08                     1                      1.21        0.99-1.49
         rs474834   0.912    0.935  0.008                    0.42                   1.39        1.09-1.78
         rs2242589  0.073    0.083  0.03                     1                      1.30        1.03-1.64
         rs2075563  0.106    0.162  0.00000076               0.00004                1.62        1.34-1.96
         rs2269428  0.107    0.159  0.000003                 0.00016                1.58        1.30-1.91
         rs3749980  0.107    0.180  0.0000024                0.00013                1.59        1.31-1.92
         rs185819   0.647    0.711  0.000037                 0.0019                 1.34        1.17-1.53
         rs204900   0.936    0.985  0.000042                 0.0022                 1.90        1.40-2.59
         rs2071289  0.018    0.036  0.002                    0.10                   1.92        1.26-2.89
         rs2849012  0.092    0.762  0.0000016                0.000062               1.43        1.23-1.65
         rs620688   0.326    0.408  0.00000022               0.000011               1.42        1.25-1.63
         rs2071284  0.113    0.189  0.000000000080           0.0000000042           1.63        1.52-2.20
         rs2071283  0.112    0.189  0.000000000057           0.000000003            1.54        1.53-2.21
         rs2071282  0.113    0.193  0.000000000011           0.00000000058          1.67        1.56-2.25
         rs2071281  0.113    0.189  0.00000000011            0.0000000068           1.82        1.52-2.19
         rs415929   0.229    0.408  0.00000055               0.000029               1.41        1.23-1.61
         rs915894   0.503    0.558  0.001                    0.052                  1.24        1.09-1.41
         rs443198   0.504    0.569  0.000076                 0.0039                 1.30        1.14-1.47
         rs2273019  0.406    0.470  0.000091                 0.0047                 1.30        1.14-1.48
         rs2294878  0.644    0.733  0.0000000039             0.0000002              1.52        1.32-1.75
         rs2227139  0.607    0.718  0.000000000000097        0.000000000045         1.66        1.44-1.49
         (blank)    0.147    0.267  0.000000000000000000097  0.0000000000000000051  2.11        1.79-2.49
         rs2847012  0.827    0.887  0.00000013               0.0000057              1.85        1.37-1.98
         rs2071798  0.713    0.768  0.00015                  0.0076                 1.33        1.15-1.54
         rs1049110  0.778    0.802  0.07                     1                      1.18        0.99-1.36
11q13.4  rs3781909  0.428    0.480  0.001                    0.052                  1.24        1.09-1.41
         rs2006734  0.425    0.476  0.002                    0.10                   1.22        1.06-1.39
         rs2140693  0.509    0.564  0.00078                  0.039                  1.25        1.10-1.42
         rs1792174  0.522    0.580  0.00045                  0.023                  1.26        1.11-1.44
         rs1792160  0.522    0.680  0.00035                  0.018                  1.27        1.11-1.44
         rs1792193  0.561    0.608  0.00075                  0.039                  1.25        1.10-1.43
         rs1051090  0.971    0.978  0.4                      1                      1.21        0.81-1.80
         rs3741138  0.833    0.862  0.01                     0.73                   1.28        1.05-1.50
         rs935985   0.804    0.838  0.007                    0.36                   1.26        1.07-1.49
         rs637028   0.831    0.867  0.003                    0.15                   1.32        1.10-1.58
         rs863283   0.428    0.471  0.008                    0.42                   1.19        1.05-1.35
         rs865717   0.483    0.526  0.009                    0.47                   1.19        1.05-1.36
         rs880339   0.487    0.528  0.01                     0.68                   1.18        1.04-1.34
         rs2075877  0.445    0.477  0.05                     1                      1.14        1.00-1.29
         rs1800849  0.867    0.711  0.004                    0.21                   1.23        1.07-1.41
         rs1527302  0.867    0.703  0.003                    0.18                   1.24        1.06-1.42
10p13    rs2280078  0.768    0.768  1                        1                      1.00        0.86-1.16
         rs2668907  0.438    0.453  0.4                      1                      1.00        0.93-1.21
         rs662141   0.481    0.499  0.3                      1                      1.08        0.95-1.22
         rs1347979  0.828    0.868  0.01                     0.73                   1.25        1.06-1.50
14q23.1  rs725951   0.787    0.823  0.006                    0.31                   1.26        1.07-1.48
         rs2073318  0.784    0.823  0.002                    0.10                   1.29        1.10-1.51
         rs1980579  0.782    0.822  0.002                    0.10                   1.29        1.10-1.51
         rs1950789  0.810    0.348  0.002                    0.10                   1.31        1.11-1.56
         rs2182138  0.788    0.835  0.0002                   0.012                  1.36        1.16-1.61
         rs927326   0.462    0.502  0.02                     0.83                   1.17        1.03-1.33



*1 gSNPs
*2 Nonsynonymous SNPs
*3 Fisher's exact test P-value in 2x2 table of




indicates data missing or illegible when filed














TABLE 8

Table 3. LD blocks and haplotype association with RA

Part 1: block definition and haplotype frequencies in controls

Cytoband Block* Size (kb)  SNPs (end-start)     Included genes   Number of SNPs  Number of haplotypes  Positive haplotype  Control frequency  95% CI
6p21.3   a1     8.26       rs2071595-rs2071592  BAT1-IkBL        5               5                     4                   0.074              0.055-0.093
         a2     0.04       rs2239708-rs2071591  IkBL             2               3                     1                   0.451              0.415-0.485
         a3     19.03      rs2269475-rs1046089  BAT2             3               4                     4                   0.008              0.003-0.015
         a4     127.97     rs2242656-rs707929   BAT3-C6orf27     10              11                    4                   0.072              0.056-0.090
         a5     4.13       rs2242668-rs2075800  LSM2-HSPA1L      2               3                     3                   0.126              0.102-0.150
         a6     58.45      rs2242569-rs429150   TNXB             9               9                     2                   0.362              0.326-0.399
         a7     9.75       rs206018-rs2849012   NOTCH4           2               3                     1                   0.673              0.640-0.706
         a8     0.65       rs422951-rs415929    NOTCH4           8               3                     3                   0.103              0.083-0.126
         a9     0.02       rs915894-rs443198    NOTCH4           2               4                     2                   0.493              0.460-0.531
         DRB1   0.26       rs2308754-rs1141742  DRB1             64              29                    *0405               0.129
         a10    16.35      rs2071798-rs2071550  DQB2             3               4                     4                   0.069              0.052-0.089
11q13.4  b1     6.86       rs2008734-rs6590     PLEKHB1          2               3                     2                   0.412              0.376-0.447
         b2     139.37     rs1792174-rs935985   MRPL48-FLJ11848  8               6                     1                   0.500              0.461-0.533
         b3     4.69       rs655717-rs660339    UCP2             2               2                     2                   0.454              0.417-0.489
         b4     9.02       rs668514-rs2075577   UCP3             2               3                     2                   0.352              0.320-0.387
         b5     17.81      rs886650-rs1527302   DKFZP586P0123    2               3                     2                   0.353              0.320-0.386
10p13    c1     3.89       rs1347979-rs571066                    2               4                     3                   0.175              0.150-0.203
         c2     6.07       rs2244380-rs765884   OPTN             2               3                     3                   0.101              0.079-0.123
         c3     13.64      rs999999-rs1324252   OPTN             2               3                     3                   0.039              0.027-0.052
14q23.1  d1     28.86      rs1952043-rs2182138  RTN1             3               5                     3                   0.207              0.180-0.238
         d2     19.22      rs927326-rs2064992   RTN1             2               4                     2                   0.457              0.423-0.493

Part 2: haplotype frequencies in cases, Fisher's exact P values, and odds ratios

Cytoband Block* Size (kb)  Case frequency  95% CI       P (2x2)          Pc             Odds ratio  95% CI
6p21.3   a1     8.26       0.028           0.018-0.040  0.000033         0.0037         0.35        0.21-0.59
         a2     0.04       0.495           0.459-0.531  0.10             1              1.19        0.97-1.46
         a3     19.03      0.012           0.005-0.020  0.45             1              1.51        0.53-4.26
         a4     127.97     0.030           0.019-0.041  0.00014          0.016          0.39        0.23-0.64
         a5     4.13       0.089           0.070-0.109  0.024            1              0.68        0.49-0.94
         a6     58.45      0.264           0.232-0.296  0.000046         0.0051         0.63        0.50-0.78
         a7     9.75       0.774           0.745-0.805  0.000015         0.0017         1.66        1.32-2.09
         a8     0.65       0.203           0.175-0.231  0.00000010       0.000011       2.20        1.64-2.95
         a9     0.02       0.432           0.400-0.465  0.020            1              0.76        0.64-0.96
         DRB1   0.26       0.276                        0.0000000000013  0.00000000014  2.67        2.04-3.49
         a10    16.35      0.025           0.014-0.037  0.000077         0.0086         0.35        0.20-0.60
11q13.4  b1     6.86       0.479           0.445-0.513  0.011            1              1.31        1.07-1.61
         b2     139.37     0.595           0.565-0.629  0.00019          0.021          1.48        1.21-1.82
         b3     4.69       0.535           0.501-0.569  0.0027           0.31           1.38        1.12-1.69
         b4     9.02       0.273           0.240-0.306  0.0010           0.11           0.69        0.55-0.86
         b5     17.81      0.267           0.238-0.297  0.00044          0.049          0.67        0.54-0.84
10p13    c1     3.89       0.120           0.097-0.145  0.0035           0.39           0.64        0.48-0.86
         c2     6.07       0.121           0.100-0.145  0.25             1              1.22        0.89-1.69
         c3     13.64      0.047           0.032-0.063  0.52             1              1.22        0.74-2.01
14q23.1  d1     28.86      0.158           0.133-0.185  0.014            1              0.72        0.55-0.93
         d2     19.22      0.509           0.472-0.545  0.050            1              1.23        1.01-1.51

*LD blocks were inferred by the EM algorithm.
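The 95% confidence intervals attached to the haplotype frequencies in Table 8 are described in the Statistical Analysis section as coming from bootstrap resampling (up to 2000 replicates) based on the estimated haplotype frequencies. The following is only a minimal sketch of that idea as a percentile bootstrap for a single haplotype frequency; it is not the Right program cited in the text, and the function name, the number of chromosomes, and the seed are assumptions made for illustration.

```python
import random


def bootstrap_frequency_ci(freq, n_chromosomes, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for one haplotype frequency.

    freq          -- estimated haplotype frequency (e.g. from an EM estimate)
    n_chromosomes -- hypothetical number of chromosomes in the sample
    n_boot        -- number of bootstrap replicates (the text mentions up to 2000)
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Redraw a sample of the same size from the estimated frequency.
        hits = sum(rng.random() < freq for _ in range(n_chromosomes))
        estimates.append(hits / n_chromosomes)
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[int(tail * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - tail) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Illustrative only: a frequency of 0.451 with an assumed 1000 chromosomes.
    print(bootstrap_frequency_ci(0.451, 1000))
```

The interval narrows as the assumed number of chromosomes grows, which is why the rarer haplotypes in the table carry proportionally wider intervals.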






EXPRESSION ANALYSIS

To study the expression patterns of these genes in various tissues including synovial cells, we performed quantitative reverse transcription-PCR (QRT-PCR) using RNA from these tissues.


Total RNA was isolated with ISOGEN (Nippon Gene) from synovial membranes surgically obtained from eight RA and four osteoarthritis (OA) patients. We also isolated total RNA from a synovial cell line (SW982) provided by the American Type Culture Collection (ATCC). RNAs from other tissues were obtained commercially from Clontech, Invitrogen, Origene, and Stratagene. We evaluated the quality and quantity of these RNAs using an Agilent 2100 Bioanalyzer (Agilent) and confirmed their quantities by the RiboGreen RNA fluorescence assay (Molecular Probes). Complementary DNAs were synthesized from these total RNAs using random hexamers and the TaqMan reverse transcription reagents kit (Applied Biosystems). We obtained cDNA-specific primers and probes through ‘Assay-by-Design (AbD)’ for the ten genes tested and through ‘Assay-on-Demand (AoD)’ for GAPD, which was used as a housekeeping control gene; all were provided by Applied Biosystems. After preliminary experiments, 210 nM probes, 756 nM primers, and 0.48 ng/μl cDNA (final concentrations) in a 50 μl reaction volume were used in 96-well reaction plates on an ABI PRISM 7900 according to the standard approach recommended by Applied Biosystems. Each plate was processed three times to calculate the average and SD for each sample. The estimated quantity was calculated each time using a standard curve in each well. All quantity data normalized to GAPD were tested by Smirnov's test at a 5% significance level. After reciprocal transformation of all the normalized quantity data, Student's t-test was conducted to compare expression levels between RA and OA synovial tissues.
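The quantification steps described above (standard-curve quantity, normalization to GAPD, reciprocal transformation, Student's t-test) can be sketched as follows. This is a minimal illustration under stated assumptions, not the instrument software's procedure: the standard curve is assumed to be linear in log10 quantity, the slope and intercept are hypothetical, and the function names are invented for the example. Only the t statistic and degrees of freedom are computed; a P-value would need a t distribution table or a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def quantity_from_ct(ct, slope, intercept):
    """Invert an assumed standard curve Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def normalized_quantity(target_qty, gapd_qty):
    """Express a target gene quantity relative to the GAPD housekeeping control."""
    return target_qty / gapd_qty


def reciprocal(values):
    """Reciprocal transformation applied before the t-test."""
    return [1.0 / v for v in values]


def student_t(sample_a, sample_b):
    """Pooled-variance Student's t statistic and its degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


if __name__ == "__main__":
    # Hypothetical normalized quantities for RA and OA samples (not study data).
    ra = [4.1, 5.0, 3.2, 6.3]
    oa = [2.4, 2.7, 2.3, 2.6]
    print(student_t(reciprocal(ra), reciprocal(oa)))
```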


Consistently high expression of NOTCH4 in the lung and of TNXB in the adrenal gland was observed (FIG. 3a). Our results also showed that all the genes were expressed in the RA synovial cells. TNXB and NOTCH4 had significantly high expression levels in the RA synovial cells, whereas RTN1 had the lowest level. We also compared the expression levels of these genes between RA and OA synovial cells. The expression levels of the MRPL48 (P=0.049) and DKFZP761F241 (P=0.027) genes differed between the RA and OA synovial cells at a modest level of significance by Student's t-test (Table 9 and FIG. 3b). MRPL48 expression in the RA synovial tissue was approximately twice that in the OA tissue. Three-quarters of the RA tissue donors were homozygous for a positive haplotype in block b2 of the MRPL48 locus.









TABLE 9

Table 4. Expression levels of RA candidate genes in OA and RA synovial cells

              OA synovial cell     RA synovial cell
Gene          Average   S.D.       Average   S.D.
IkBL          3.1       2.4        1.6       0.8
TNXB          339.3     349.7      80.7      35.1
NOTCH4        36.7      0.4        39.7      30.8
MRPL48        2.5*      0.2        4.4       1.6
FLJ11848      0.9       0          1.3       0.7
UCP2          1.5       0.9        2.8       2.1
DKFZP761F241  17.5*     8.1        7.1       0.8
OPTN          7.4       2.4        12.7      3.4
RTN1          0.8       0.1        0.9       0.6
BLT2**        51.2      70.7       11.1      6.9

*Expression levels of MRPL48 (P = 0.049) and DKFZP761F241 (P = 0.027) genes showed relatively significant difference between RA and OA synovial tissues.

**The BLT2 (leukotriene B4 receptor subtype 2) gene was employed as a positive control, which has been known to have strong expression in the RA






Statistical Analysis:

To calculate P-values, two types of Fisher's exact test were used: one for the 2×2 contingency table of each individual allele and one for the 2×m contingency table of each locus. In this context, m refers to the number of markers observed in a population. To perform Fisher's exact test on the 2×m contingency tables, a Markov chain Monte Carlo simulation method was adopted. For the 2×2 contingency tables for MS, SNP, and haplotype, we present only “allelic”, not phenotypic, association. These P-values were corrected by Bonferroni's correction, in which the coefficient was the total number of contingency tables tested. These analyses were conducted with the software package MCFishman. Other basic statistical analyses, including multiple logistic regression analysis and the Mantel-Haenszel test, were performed using the SPSS program package and Microsoft Excel (trade name). We predicted LD block structures for these SNPs by using the confidence intervals of the D′ value as an LD measure (Gabriel, S. B. et al., Science, 296, 2225 (2002); and Dawson, E. et al., Nature, 418, 544 (2002)). Moreover, haplotypes in each block and their frequencies were estimated by the EM and Clark algorithms. Finally, to evaluate the reliability of the haplotypes in each block, the 95% confidence interval of each haplotype frequency was calculated by bootstrap resampling of up to 2000 times on the basis of the estimated haplotype frequencies, as implemented in the Right program (Mano, S. et al., Ann. Hum. Genet., in press).
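As an illustration of the allelic 2×2 analysis described above, the sketch below computes a two-sided Fisher's exact P-value, an odds ratio with a 95% confidence interval, and a Bonferroni-corrected P-value (Pc) using only the Python standard library. It is not the MCFishman package and does not cover the 2×m Markov chain Monte Carlo test. The function names, the Woolf (log-based) interval, the example allele counts, and the number of tests are all assumptions made for illustration; the text does not state how the odds-ratio intervals were obtained.

```python
from math import comb, exp, log, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(q for q in map(prob, range(lo, hi + 1)) if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-normal) confidence interval; needs nonzero cells."""
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)


def bonferroni(p, n_tests):
    """Bonferroni-corrected P-value (Pc), capped at 1."""
    return min(1.0, p * n_tests)


if __name__ == "__main__":
    # Hypothetical allele counts: rows = cases/controls, columns = allele present/absent.
    case_with, case_without = 260, 680
    ctrl_with, ctrl_without = 120, 820
    p = fisher_exact_two_sided(case_with, case_without, ctrl_with, ctrl_without)
    print(p, bonferroni(p, n_tests=50))  # n_tests is an assumed number of tables
    print(odds_ratio_ci(case_with, case_without, ctrl_with, ctrl_without))
```

Capping Pc at 1 matches the way the tables report corrected values of 1 for weak associations.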


In this Example, strong association was found in the TNXB and NOTCH4 genes, 250 kb distant from HLA-DRB1, in the candidate region narrowed down by the MS markers. These genes are known to be located in LD blocks evidently different from that of HLA-DRB1 (Cullen, M. et al., Am. J. Hum. Genet., 71, 759 (2002); and Walsh, E. C. et al., Am. J. Hum. Genet., 73, 580 (2003)). In agreement with the result of the multiple logistic regression analysis, the result of the Mantel-Haenszel test also showed that positive SNPs in TNXB and NOTCH4 are independent of HLA-DRB1*0405 or SE in both partially dominant and partially recessive models (data not shown). Further, the candidate region closely matched one of the additional susceptibility regions previously predicted (Jawaheer, D. et al., Am. J. Hum. Genet., 71, 585 (2002)). TNXB is known as a causative gene of one type of Ehlers-Danlos syndrome (MIM*600985), which is characterized by dysfunction in connective tissues including joints. Its gene products participate in connective tissue function and structure via the deposition of collagens of various types (Mao, J. R. et al., Nat. Genet., 30, 421 (2002)), probably including in the synovial tissues shown here. Type II collagen-induced arthritis in mice is known to mimic rheumatoid arthritis (Moore, A. R., Methods Mol. Biol., 225, 175 (2003)).


The present inventors believe that the amino acid substitutions in the TNXB gene product serve as functional factors for RA via a hypothetical pathway associated with collagen metabolism. In recent years, it was reported that the NOTCH4 gene product might participate in the overproliferation of synovial cells via tumor necrosis factor (TNF) and in RA (Ando, K. et al., Oncogene, 22, 7796 (2003)). However, much of NOTCH4 function remains unclear.


On 11q13.4, although MRPL48 function is still unknown, its expression pattern indicated the association of this gene with RA. The candidate region 11q13.4 contains other interesting genes: RAB6A, FLJ11848, UCP2, and UCP3. As with this region, further association analysis for 10p13 and 14q23.1 requires higher-density SNP markers; even so, it is noteworthy that candidate regions on other chromosomes were found by the method of the present invention. These results chiefly suggest that our marker set and method are highly practicable and applicable to other complex diseases, at least to oligogenic diseases with major genes such as HLA-DRB1 in RA.


Interestingly, our data suggested that the seven most significant MS markers tend to be positioned individually in particular LD blocks (FIG. 3). These markers were observed on the “Clark blocks” rather than the “EM blocks”. In many cases, positive MS alleles were clearly associated with positive SNP haplotypes in these blocks.

Claims
  • 1. A marker gene for rheumatoid arthritis test consisting of a consecutive partial DNA sequence comprising at least one base exhibiting single nucleotide polymorphism present in a TNXB, NOTCH4, RAB6A, MPRL48, UCP2 or UCP3 gene in the human genomic DNA sequence, or of a complementary strand of the partial DNA sequence.
  • 2. The marker gene according to claim 1, wherein the base exhibiting single nucleotide polymorphism is characterized by being selected from the group consisting of: the 61st base in SEQ ID NO: 1 or a corresponding base on a complementary strand thereof; the 61st base in SEQ ID NO: 2 or a corresponding base on a complementary strand thereof; the 61st base in SEQ ID NO: 3 or a corresponding base on a complementary strand thereof; the 61st base in SEQ ID NO: 4 or a corresponding base on a complementary strand thereof; the 401st base in SEQ ID NO: 5 or a corresponding base on a complementary strand thereof; the 495th base in SEQ ID NO: 6 or a corresponding base on a complementary strand thereof; the 61st base in SEQ ID NO: 7 or a corresponding base on a complementary strand thereof; the 61st base in SEQ ID NO: 8 or a corresponding base on a complementary strand thereof; the 61st base in SEQ ID NO: 9 or a corresponding base on a complementary strand thereof; the 61st base in SEQ ID NO: 10 or a corresponding base on a complementary strand thereof; the 401st base in SEQ ID NO: 11 or a corresponding base on a complementary strand thereof; the 401st base in SEQ ID NO: 12 or a corresponding base on a complementary strand thereof; the 401st base in SEQ ID NO: 13 or a corresponding base on a complementary strand thereof; the 503rd base in SEQ ID NO: 14 or a corresponding base on a complementary strand thereof; the 201st base in SEQ ID NO: 15 or a corresponding base on a complementary strand thereof; the 511th base in SEQ ID NO: 16 or a corresponding base on a complementary strand thereof; the 201st base in SEQ ID NO: 17 or a corresponding base on a complementary strand thereof; the 51st base in SEQ ID NO: 18 or a corresponding base on a complementary strand thereof; the 61st base in SEQ ID NO: 19 or a corresponding base on a complementary strand thereof; the 497th base in SEQ ID NO: 20 or a corresponding base on a complementary strand thereof; the 201st base in SEQ ID NO: 21 or a corresponding base on a complementary strand thereof; and the 201st base in SEQ ID NO: 22 or a corresponding base on a complementary strand thereof.
  • 3. The marker gene according to claim 1, wherein the marker gene is 50 to 1500 bp in length.
  • 4. The marker gene according to claim 3, wherein the marker gene is 100 to 1000 bp in length.
  • 5. A method for testing rheumatoid arthritis comprising collecting a partial DNA sequence corresponding to a marker gene according to claim 1 from a test subject, determining a nucleotide sequence of the partial DNA sequence, and comparing the nucleotide sequence with a corresponding nucleotide sequence obtained from a normal individual.
  • 6. A test kit for rheumatoid arthritis comprising a marker gene according to claim 1 or a primer thereof.
  • 7. The test kit according to claim 6, wherein the primer has a DNA sequence represented by any of SEQ ID NOs: 23 to 66.
  • 8. A vector comprising a DNA sequence of a marker gene according to claim 1.
  • 9. A host cell transformed with a vector according to claim 8.
  • 10. A polypeptide encoded by a marker gene according to claim 1.
  • 11. A method for producing a polypeptide, comprising incubating a host cell according to claim 9 under conditions suitable for expression.
  • 12. A screening method using a polypeptide according to claim 10.
  • 13. An agonist and/or antagonist obtained by a screening method according to claim 12.
  • 14. A diagnostic, preventive, and/or therapeutic drug for rheumatoid arthritis comprising an agonist and/or antagonist according to claim 13.
Priority Claims (1)
Number Date Country Kind
2004-096989 Mar 2004 JP national
PCT Information
Filing Document Filing Date Country Kind 371c Date
PCT/JP05/05904 3/29/2005 WO 00 8/20/2007