Polynucleotide associated with a type II diabetes mellitus comprising single nucleotide polymorphism, microarray and diagnostic kit comprising the same and method for analyzing polynucleotide using the same

Abstract
Provided is a polynucleotide for diagnosis or treatment of type II diabetes mellitus, including at least 10 contiguous nucleotides of a nucleotide sequence selected from the group consisting of nucleotide sequences of SEQ ID NOS: 1-80 and including a nucleotide at position 101 of the nucleotide sequence, or a complementary polynucleotide thereof.
Description
TECHNICAL FIELD

The present invention relates to a polynucleotide associated with type II diabetes mellitus, a microarray and a diagnostic kit including the same, and a method of analyzing polynucleotides associated with type II diabetes mellitus.


BACKGROUND ART


The genomes of all organisms undergo spontaneous mutation in the course of their continuing evolution, generating variant forms of progenitor nucleic acid sequences (Gusella, Ann. Rev. Biochem. 55, 831-854 (1986)). The variant forms may confer an evolutionary advantage or disadvantage, relative to a progenitor form, or may be neutral. In some instances, a variant form confers a lethal disadvantage and is not transmitted to subsequent generations of the organism. In other instances, a variant form confers an evolutionary advantage to the species and is eventually incorporated into the DNA of most members of the species and effectively becomes the progenitor form. In many instances, both progenitor and variant form(s) survive and co-exist in a species population. The coexistence of multiple forms of a sequence gives rise to polymorphisms.


Several different types of polymorphisms are known, including restriction fragment length polymorphisms (RFLPs), short tandem repeats (STRs), variable number tandem repeats (VNTRs), and single-nucleotide polymorphisms (SNPs). Among them, SNPs take the form of single-nucleotide variations between individuals of the same species. When SNPs occur in protein-coding sequences, any one of the polymorphic forms may give rise to the expression of a defective or variant protein. When SNPs occur in non-coding sequences, some of these polymorphisms may nevertheless result in the expression of defective or variant proteins (e.g., as a result of defective splicing). Other SNPs have no phenotypic effects.


It is known that human SNPs occur at a frequency of about 1 in 1,000 bp. When such SNPs give rise to a phenotype such as a disease, polynucleotides containing the SNPs can be used as primers or probes for diagnosis of the disease. Monoclonal antibodies specifically binding to the SNPs can also be used in diagnosis of the disease. Research into the nucleotide sequences and functions of SNPs is currently under way at many research institutes. The nucleotide sequences and other experimental results for the identified human SNPs have been compiled into databases so that they are easily accessible.


Even though findings available to date show that specific SNPs exist in human genomes or cDNAs, the phenotypic effects of such SNPs have not been revealed. With a few exceptions, the functions of most SNPs have not yet been disclosed.


It is known that 90-95% of all diabetes patients suffer from type II diabetes mellitus. Type II diabetes mellitus is a disorder that develops in persons who produce insulin abnormally or have low sensitivity to insulin, resulting in large changes in blood glucose level. When a disorder of insulin secretion leads to type II diabetes mellitus, blood glucose cannot be transferred to body cells, which makes the conversion of food into energy difficult. It is known that genetic causes play a role in type II diabetes mellitus. Other risk factors for type II diabetes mellitus are age over 45, family history of diabetes mellitus, obesity, hypertension, and high cholesterol level. Currently, diagnosis of diabetes mellitus is mainly made by measuring a pathological phenotypic change, i.e., blood glucose level, using the fasting blood glucose (FSB) test, the oral glucose tolerance test (OGTT), and the like [National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health, http://www.niddk.nih.gov, 2003]. When a diagnosis of type II diabetes mellitus is made, the disease can be prevented or its onset delayed by exercise, special diet, body weight control, drug therapy, and the like. In this regard, type II diabetes mellitus is a disease for which early diagnosis is highly desirable. Millennium Pharmaceuticals Inc. reported that diagnosis and prognosis of type II diabetes mellitus can be made based on genotypic variations present in the HNF1 gene [PR Newswire, Sep. 1, 1998]. Sequenom Inc. reported that the FOXA2 (HNF3β) gene is highly associated with type II diabetes mellitus [PR Newswire, Oct. 28, 2003]. Even though there are reports of some genes associated with type II diabetes mellitus, research into the incidence of type II diabetes mellitus has focused on specific genes of some chromosomes in specific populations. For this reason, research results may vary between human populations. Furthermore, not all causative genes responsible for type II diabetes mellitus have yet been identified. Diagnosis of type II diabetes mellitus by such molecular biological techniques is still uncommon. In addition, early diagnosis before the onset of type II diabetes mellitus is currently unavailable. Therefore, there is an increasing need to find, across the whole human genome, new SNPs and related genes that are highly associated with type II diabetes mellitus, and to make early diagnosis of type II diabetes mellitus using the SNPs and the related genes.







DETAILED DESCRIPTION OF THE INVENTION
Technical Goal of the Invention

The present invention provides a polynucleotide containing single-nucleotide polymorphism associated with type II diabetes mellitus.


The present invention also provides a microarray and a type II diabetes mellitus diagnostic kit, each of which includes the polynucleotide containing single-nucleotide polymorphism associated with type II diabetes mellitus.


The present invention also provides a method of analyzing polynucleotides associated with type II diabetes mellitus.


Disclosure Of The Invention

The present invention provides a polynucleotide for diagnosis or treatment of type II diabetes mellitus, including at least 10 contiguous nucleotides of a nucleotide sequence selected from the group consisting of nucleotide sequences of SEQ ID NOS: 1-80 and including a nucleotide of a polymorphic site (position 101) of the nucleotide sequence, or a complementary polynucleotide thereof.


The polynucleotide includes a contiguous span of at least 10 nucleotides containing the polymorphic site of a nucleotide sequence selected from the nucleotide sequences of SEQ ID NOS: 1-80. The polynucleotide is 10 to 400 nucleotides in length, preferably 10 to 100 nucleotides in length, and more preferably 10 to 50 nucleotides in length. Here, the polymorphic site of each nucleotide sequence of SEQ ID NOS: 1-80 is at position 101.
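The length and position constraints described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the polymorphic sequence is held as a plain string whose polymorphic site is at 1-based position 101, and the function name `fragments_containing_site` and the placeholder sequence are hypothetical and do not correspond to any actual SEQ ID NO.

```python
def fragments_containing_site(sequence, length, site=101):
    """Return every contiguous fragment of `length` nucleotides that
    includes the 1-based position `site` of `sequence`."""
    if length < 10:
        raise ValueError("fragment must be at least 10 nucleotides long")
    idx = site - 1  # convert the 1-based site to a 0-based index
    if not 0 <= idx < len(sequence):
        raise ValueError("site lies outside the sequence")
    # A fragment may start anywhere from (idx - length + 1) up to idx,
    # provided that it stays inside the sequence.
    first_start = max(0, idx - length + 1)
    last_start = min(idx, len(sequence) - length)
    return [sequence[start:start + length]
            for start in range(first_start, last_start + 1)]


# Placeholder 201-nt sequence (hypothetical, not a real SEQ ID NO):
# 100 flanking bases on each side of a polymorphic base "T" at position 101.
example = "A" * 100 + "T" + "C" * 100
ten_mers = fragments_containing_site(example, 10)
print(len(ten_mers), ten_mers[0], ten_mers[-1])
```

Every fragment returned contains the polymorphic base, so any of them would satisfy the "at least 10 contiguous nucleotides including position 101" condition; the preferred upper limits (400, 100, or 50 nucleotides) can be applied by restricting `length`.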


Each of the nucleotide sequences of SEQ ID NOS: 1-80 is a polymorphic sequence. The polymorphic sequence refers to a nucleotide sequence containing a polymorphic site at which single-nucleotide polymorphism (SNP) occurs. The polymorphic site refers to a position of a polymorphic sequence at which SNP occurs. The nucleotide sequences may be DNAs or RNAs.


In the present invention, the polymorphic sites (position 101) of the polymorphic sequences of SEQ ID NOS: 1-80 are associated with type II diabetes mellitus. This is confirmed by DNA sequence analysis of blood samples derived from type II diabetes mellitus patients and normal persons. Association of the polymorphic sequences of SEQ ID NOS: 1-80 with type II diabetes mellitus and characteristics of the polymorphic sequences are summarized in Tables 1-1, 1-2, 2-1, and 2-2.

TABLE 1-1

               SNP sequence Allele frequency       Genotype frequency
ASSAY_ID SNP   (SEQ ID NO.) cas_A2  con_A2  Delta  cas_A1A1  cas_A1A2  cas_A2A2
DMX_001  C→T   1 and 2      0.592   0.492   0.1    54        136       109
DMX_003  A→G   3 and 4      0.292   0.202   0.09   157       108       33
DMX_005  A→G   5 and 6      0.871   0.913   0.042  3         71        224
DMX_008  G→A   7 and 8      0.218   0.158   0.06   180       103       13
DMX_009  T→G   9 and 10     0.664   0.737   0.073  31        138       129
DMX_011  A→G   11 and 12    0.866   0.931   0.065  7         66        225
DMX_012  C→G   13 and 14    0.527   0.614   0.087  72        140       88
DMX_014  A→C   15 and 16    0.903   0.837   0.066  0         58        240
DMX_016  A→G   17 and 18    0.275   0.209   0.066  158       116       24
DMX_019  T→C   19 and 20    0.961   0.924   0.037  1         21        275
DMX_027  G→C   21 and 22    0.844   0.89    0.046  3         87        208
DMX_028  T→C   23 and 24    0.945   0.977   0.032  0         33        266
DMX_029  C→A   25 and 26    0.057   0.104   0.047  268       28        3
DMX_030  C→T   27 and 28    0.077   0.129   0.052  251       41        2
DMX_031  A→T   29 and 30    0.916   0.86    0.056  2         46        251
DMX_032  T→A   31 and 32    0.718   0.593   0.125  26        117       157
DMX_033  T→C   33 and 34    0.816   0.9     0.084  10        89        198
DMX_044  A→T   35 and 36    0.846   0.787   0.059  7         78        213
DMX_049  T→G   37 and 38    0.107   0.06    0.047  236       60        2
DMX_052  C→T   39 and 40    0.94    0.907   0.033  3         29        261

         Genotype frequency            df = 2
ASSAY_ID con_A1A1  con_A1A2  con_A2A2  Chi_value  Chi_exact_p-value
DMX_001  77        151       72        12.384     2.05E−03
DMX_003  190       97        12        13.527     1.16E−03
DMX_005  4         44        251       8.015      1.82E−02
DMX_008  205       80        6         7.051      2.94E−02
DMX_009  19        119       161       7.814      2.01E−02
DMX_011  1         39        258       13.698     1.06E−03
DMX_012  44        139       111       9.361      9.28E−03
DMX_014  10        77        211       14.539     6.97E−04
DMX_016  182       95        13        6.947      3.10E−02
DMX_019  2         41        254       7.619      2.22E−02
DMX_027  3         60        237       6.842      3.27E−02
DMX_028  0         14        284       8.268      5.72E−03
DMX_029  241       52        5         9.131      1.04E−02
DMX_030  221       70        3         9.683      7.89E−03
DMX_031  6         72        221       9.636      8.08E−03
DMX_032  51        142       107       20         4.54E−05
DMX_033  4         51        239       16.718     2.34E−04
DMX_044  15        93        181       6.687      3.53E−02
DMX_049  263       36        0         9.459      8.83E−03
DMX_052  1         52        238       8.584      1.37E−02

         Odds ratio (OR): multiple model    HWE                       Sample call rate
ASSAY_ID Risk allele  OR    CI              con_HW       cas_HW       cas_call_rate  con_call_rate
DMX_001  A2 T         1.49  (1.193, 1.887)  .027, HWE    1.195, HWE   1              1
DMX_003  A2 G         1.61  (1.245, 2.123)  .01, HWE     4.819, HWE   0.99           1
DMX_005  A1 A         1.56  (1.074, 2.259)  2.208, HWE   .646, HWE    0.99           1
DMX_008  A2 A         1.49  (1.104, 1.996)  .265, HWE    .167, HWE    0.99           0.97
DMX_009  A1 T         1.42  (1.106, 1.82)   .195, HWE    .424, HWE    0.99           1
DMX_011  A1 A         2.10  (1.414, 3.115)  .026, HWE    .948, HWE    0.99           0.99
DMX_012  A1 C         1.43  (1.135, 1.8)    .032, HWE    1.218, HWE   1              0.98
DMX_014  A2 C         1.82  (1.274, 2.551)  1.527, HWE   1.834, HWE   0.99           0.99
DMX_016  A2 G         1.45  (1.1, 1.883)    .089, HWE    .241, HWE    0.99           0.97
DMX_019  A2 C         2.04  (1.215, 3.413)  1.004, HWE   .549, HWE    0.99           0.99
DMX_027  A1 G         1.50  (1.067, 2.098)  .069, HWE    3.4, HWE     0.99           1
DMX_028  A1 T         2.43  (1.286, 4.585)  .077, HWE    .133, HWE    1              0.99
DMX_029  A1 C         1.93  (1.247, 2.975)  1.514, HWE   HWD          1              0.99
DMX_030  A1 C         1.79  (1.215, 2.64)   .51, HWE     1.004, HWE   0.98           0.98
DMX_031  A2 T         1.79  (1.238, 2.591)  .214, HWE    .004, HWE    1              1
DMX_032  A2 A         1.75  (1.374, 2.227)  .148, HWE    .582, HWE    1              1
DMX_033  A1 T         2.02  (1.434, 2.831)  2.023, HWE   .005, HWE    0.99           0.98
DMX_044  A2 T         1.47  (1.099, 1.996)  .452, HWE    .013, HWE    0.99           0.96
DMX_049  A2 G         1.89  (1.227, 2.874)  .527, HWE    .623, HWE    0.99           1
DMX_052  A2 T         1.61  (1.035, 2.506)  .688, HWE    4.52, HWE    0.98           0.97













TABLE 1-2

               SNP sequence Allele frequency       Genotype frequency
ASSAY_ID SNP   (SEQ ID NO.) cas_A2  con_A2  Delta  cas_A1A1  cas_A1A2  cas_A2A2
DMX_054  C→A   41 and 42    0.14    0.09    0.05   222       70        7
DMX_056  A→G   43 and 44    0.362   0.273   0.089  123       137       40
DMX_060  G→A   45 and 46    0.957   0.925   0.032  0         25        267
DMX_061  T→C   47 and 48    0.758   0.81    0.052  11        121       164
DMX_062  C→T   49 and 50    0.421   0.508   0.087  106       133       59
DMX_063  G→C   51 and 52    0.902   0.953   0.051  2         55        243
DMX_065  C→T   53 and 54    0.92    0.958   0.038  4         39        250
DMX_067  G→A   55 and 56    0.903   0.941   0.038  2         54        243
DMX_068  A→G   57 and 58    0.081   0.133   0.052  252       42        3
DMX_069  T→C   59 and 60    0.44    0.498   0.058  96        143       60
DMX_104  T→C   61 and 62    0.274   0.204   0.07   158       115       24
DMX_105  A→C   63 and 64    0.769   0.838   0.069  19        100       180
DMX_116  T→C   65 and 66    0.6     0.668   0.068  41        157       101
DMX_117  T→C   67 and 68    0.188   0.251   0.063  199       89        12
DMX_120  A→G   69 and 70    0.818   0.871   0.053  7         95        197
DMX_136  T→C   71 and 72    0.211   0.263   0.052  188       96        15
DMX_139  A→G   73 and 74    0.17    0.105   0.065  205       88        7
DMX_150  A→G   75 and 76    0.926   0.958   0.032  0         44        252
DMX_152  A→C   77 and 78    0.562   0.64    0.078  62        136       99
DMX_154  A→G   79 and 80    0.269   0.199   0.07   153       131       15

         Genotype frequency            df = 2
ASSAY_ID con_A1A1  con_A1A2  con_A2A2  Chi_value  Chi_exact_p-value
DMX_054  248       48        3         7.14       2.82E−02
DMX_056  160       116       24        10.581     5.04E−03
DMX_060  3         39        257       6.171      4.57E−02
DMX_061  13        87        198       8.911      1.16E−02
DMX_062  72        146       77        9.468      8.79E−03
DMX_063  1         26        272       12.347     2.08E−03
DMX_065  0         24        263       7.84       1.98E−02
DMX_067  2         31        264       7.087      2.89E−02
DMX_068  227       59        10        7.934      1.89E−02
DMX_069  66        164       65        7.165      2.78E−02
DMX_104  184       95        12        7.821      2.00E−02
DMX_105  9         79        212       8.646      1.33E−02
DMX_116  29        139       129       6.554      3.77E−02
DMX_117  171       103       23        6.582      3.72E−02
DMX_120  3         70        222       6.853      3.25E−02
DMX_136  156       123       16        6.311      4.26E−02
DMX_139  239       59        2         11.102     3.88E−03
DMX_150  0         25        270       5.851      2.06E−02
DMX_152  41        129       123       7.034      2.97E−02
DMX_154  187       100       9         9.045      1.09E−02

         Odds ratio (OR): multiple model    HWE                       Sample call rate
ASSAY_ID Risk allele  OR    CI              con_HW       cas_HW       cas_call_rate  con_call_rate
DMX_054  A2 A         1.64  (1.145, 2.364)  .504, HWE    .819, HWE    1              1
DMX_056  A2 G         1.52  (1.179, 1.923)  .283, HWE    .041, HWE    1              1
DMX_060  A2 A         1.82  (1.1, 3.012)    4.113, HWE   .042, HWE    0.97           1
DMX_061  A1 T         1.36  (1.031, 1.798)  1.122, HWE   3.894, HWE   0.99           0.99
DMX_062  A1 C         1.42  (1.131, 1.788)  .034, HWE    2.43, HWE    0.99           0.98
DMX_063  A1 G         2.22  (1.395, 3.534)  .504, HWE    .08, HWE     1              1
DMX_065  A1 C         2.00  (1.205, 3.314)  .043, HWE    9.409, HWE   0.98           0.96
DMX_067  A1 G         1.72  (1.109, 2.653)  1.047, HWE   .077, HWE    1              0.99
DMX_068  A1 A         1.75  (1.2, 2.557)    6.304, HWE   4.107, HWE   0.99           0.99
DMX_069  A1 T         1.27  (1.007, 1.59)   3.708, HWE   .364, HWE    1              0.98
DMX_104  A2 C         1.47  (1.122, 1.927)  .011, HWE    .284, HWE    0.99           0.97
DMX_105  A1 A         1.56  (1.165, 2.077)  .64, HWE     1.497, HWE   1              1
DMX_116  A1 T         1.34  (1.059, 1.7)    .838, HWE    2.473, HWE   1              0.99
DMX_117  A1 T         1.44  (1.095, 1.902)  2.116, HWE   .464, HWE    1              0.99
DMX_120  A1 A         1.51  (1.097, 2.072)  .497, HWE    .894, HWE    1              0.98
DMX_136  A1 T         1.33  (1.02, 1.746)   1.611, HWE   .42, HWE     1              0.98
DMX_139  A2 G         1.75  (1.247, 2.445)  .498, HWE    .32, HWE     1              1
DMX_150  A1 A         1.81  (1.095, 3.006)  .174, HWE    .654, HWE    0.99           0.98
DMX_152  A1 A         1.38  (1.095, 1.748)  .774, HWE    1.715, HWE   0.99           0.98
DMX_154  A2 G         1.47  (1.129, 1.942)  .768, HWE    3.616, HWE   1              0.99























TABLE 2-1

                              SNP sequence               Chromosome
ASSAY_ID rs         SNP site  (SEQ ID NO)  Chromosome #  position    Band      Gene
DMX_001  rs502612   C→T       1 and 2      1             167373461   1q24.2    PRRX1
DMX_003  rs1483     A→G       3 and 4      1             223672376   1q42.13   CDC42BPA
DMX_005  rs632585   A→G       5 and 6      1             228802209   1q42.2    between genes
DMX_008  rs177560   G→A       7 and 8      11            16911751    11p15.1   between genes
DMX_009  rs1394720  T→G       9 and 10     11            4533242     11p15.4   between genes
DMX_011  rs488115   A→G       11 and 12    11            74409538    11q13.4   between genes
DMX_012  rs2063728  C→G       13 and 14    11            77863284    11q14.1   FLJ23441
DMX_014  rs725834   A→C       15 and 16    13            99254859    13q32.3   CLYBL
DMX_016  rs767837   A→G       17 and 18    13            48218663    13q14.2   between genes
DMX_019  rs929703   T→C       19 and 20    14            77691031    14q24.3   between genes
DMX_027  rs739637   G→C       21 and 22    17            37534470    17q21.2   RAB5C
DMX_028  rs1990936  T→C       23 and 24    17            44307486    17q21.32  between genes
DMX_029  rs2051672  C→A       25 and 26    17            5847149     17p13.2   between genes
DMX_030  rs1038308  C→T       27 and 28    18            44538585    18q21.1   KIAA0427
DMX_031  rs655080   A→T       29 and 30    18            57917416    18q21.33  PIGN
DMX_032  rs1943317  T→A       31 and 32    18            62419479    18q22.1   between genes
DMX_033  rs929476   T→C       33 and 34    19            33499519    19q12     between genes
DMX_044  rs1984388  A→T       35 and 36    22            30658575    22q12.3   between genes
DMX_049  rs1707709  T→G       37 and 38    3             166922235   3q26.1    between genes
DMX_052  rs1786     C→T       39 and 40    4             15340722    4p15.33   between genes

ASSAY_ID Description                                     SNP function   Amino acid change  Remarks
DMX_001  Paired related homeobox 1                       Intron         No change
DMX_003  CDC42 binding protein kinase alpha (DMPK-like)  Intron         No change
DMX_005  -                                               Between genes  No change
DMX_008  -                                               Between genes  No change
DMX_009  -                                               Between genes  No change
DMX_011  -                                               Between genes  No change
DMX_012  Hypothetical protein FLJ23441                   Intron         No change
DMX_014  Citrate lyase beta like                         Intron         No change
DMX_016  -                                               Between genes  No change
DMX_019  -                                               Between genes  No change
DMX_027  RAB5C, RAS oncogene family                      Intron         No change
DMX_028  -                                               Between genes  No change
DMX_029  -                                               Between genes  No change
DMX_030  KIAA0427                                        Coding-synon   No change
DMX_031  Phosphatidylinositol glycan, class N            Intron         No change
DMX_032  -                                               Between genes  No change
DMX_033  -                                               Between genes  No change
DMX_044  -                                               Between genes  No change
DMX_049  -                                               Between genes  No change
DMX_052  -                                               Between genes  No change























TABLE 2-2

                              SNP sequence               Chromosome
ASSAY_ID rs         SNP site  (SEQ ID NO)  Chromosome #  position    Band      Gene
DMX_054  rs872883   C→A       41 and 42    4             6582619     4p16.1    PPP2R2C
DMX_056  rs752139   A→G       43 and 44    5             175943870   5q35.2    PC-LKC
DMX_060  rs1769972  G→A       45 and 46    6             106782512   6q21      APG5L
DMX_061  rs1322532  T→C       47 and 48    6             19175693    6p22.3    Between genes
DMX_062  rs2058501  C→T       49 and 50    7             120274187   7q31.31   FLJ21986
DMX_063  rs1563047  G→C       51 and 52    7             134030698   7q33      CALD1
DMX_065  rs38809    C→T       53 and 54    7             91792235    7q21.2    PEX1
DMX_067  rs1054748  G→A       55 and 56    8             37837626    8p12      RAB11FIP
DMX_068  rs1434940  A→G       57 and 58    8             69660204    8q13.3    VEST1
DMX_069  rs1059033  T→C       59 and 60    9             77736025    9q21.2    GNAQ
DMX_104  rs492220   T→C       61 and 62    1             94254590    1p22.1    ABCA4
DMX_105  rs685328   A→C       63 and 64    10            117138050   10q25.3   ATRNL1
DMX_116  rs1461986  T→C       65 and 66    13            75506683    13q22.2   Between genes
DMX_117  rs1815620  T→C       67 and 68    14            50995615    14q22.1   Between genes
DMX_120  rs293398   A→G       69 and 70    15            87459425    15q26.1   ABHD2
DMX_136  rs1686492  T→C       71 and 72    2             10915411    2p25.1    Between genes
DMX_139  rs1237905  A→G       73 and 74    2             168278137   2q24.3    Between genes
DMX_150  rs589682   A→G       75 and 76    3             122172648   3q13.33   STXBP5L
DMX_152  rs607209   A→C       77 and 78    4             16808165    4p15.32   Between genes
DMX_154  rs197367   A→G       79 and 80    7             36219096    7p14.2    ANLN

ASSAY_ID Description                                                                     SNP function     Amino acid change  Remarks
DMX_054  Protein phosphatase 2 (former 2A), regulatory subunit B (PR 52), gamma isoform  Intron           No change
DMX_056  Protocadherin LKC                                                               Intron           No change
DMX_060  APG5 autophagy 5-like (S. cerevisiae)                                           Intron           No change
DMX_061  -                                                                               Between genes    No change
DMX_062  Hypothetical protein FLJ21986                                                   Intron           No change
DMX_063  Caldesmon 1                                                                     Intron           No change
DMX_065  Peroxisome biogenesis factor 1                                                  Intron           No change
DMX_067  RAB11 family interacting protein 1 (class I)                                    Not classified   No change          Between genes in NCBI build 119
DMX_068  Vestibule-1 protein                                                             Intron           No change
DMX_069  Guanine nucleotide binding protein (G protein), q polypeptide                   Intron           No change
DMX_104  ATP-binding cassette, sub-family A (ABC1), member 4                             Intron           No change
DMX_105  Attractin-like 1                                                                Intron           No change          KIAA0534 in NCBI build 119
DMX_116  -                                                                               Between genes    No change
DMX_117  -                                                                               Between genes    No change
DMX_120  Abhydrolase domain containing 2                                                 mrna-utr         No change
DMX_136  -                                                                               Between genes    No change
DMX_139  -                                                                               Between genes    No change
DMX_150  Syntaxin binding protein 5-like                                                 Intron           No change          KIAA 1006 in NCBI build 119
DMX_152  -                                                                               Between genes    No change
DMX_154  Anillin, actin binding protein (scraps homolog, Drosophila)                     coding-nonsynon  K→R









In Tables 1-1 and 1-2, the contents in columns are as defined below.

    • Assay_ID represents a marker name.
    • SNP is the polymorphic base at a SNP polymorphic site. Here, A1 and A2 represent, respectively, the low-mass allele and the high-mass allele obtained by sequence analysis according to the homogeneous MassExtension (hME) technique (Sequenom), and are designated arbitrarily for convenience of experiments.
    • SNP sequence represents a sequence containing a SNP site, i.e., a sequence containing allele A1 or A2 at position 101.
    • In the allele frequency column, cas_A2, con_A2, and Delta respectively represent the allele A2 frequency of the case group, the allele A2 frequency of the normal group, and the absolute value of the difference between cas_A2 and con_A2. Here, cas_A2 is (genotype A2A2 frequency×2+genotype A1A2 frequency)/(the number of samples×2) in the case group and con_A2 is (genotype A2A2 frequency×2+genotype A1A2 frequency)/(the number of samples×2) in the normal group.
    • Genotype frequency represents the frequency of each genotype. Here, cas_A1A1, cas_A1A2, and cas_A2A2 are the numbers of persons with genotypes A1A1, A1A2, and A2A2, respectively, in the case group, and con_A1A1, con_A1A2, and con_A2A2 are the numbers of persons with genotypes A1A1, A1A2, and A2A2, respectively, in the normal group.
    • df=2 indicates that the chi-square test has two degrees of freedom. Chi_value represents the chi-square statistic, and the p-value is determined based on the chi_value. Chi_exact_p-value represents the p-value of Fisher's exact test. When the count for any genotype is less than 5, the results of the chi-square test may be inaccurate, so a more accurate determination of statistical significance (p-value) by Fisher's exact test is required. The chi_exact_p-value is the variable used for Fisher's exact test. In the present invention, when the p-value ≤ 0.05, it is considered that the genotype distribution of the case group differs from that of the normal group, i.e., there is a significant difference between the case group and the normal group.
    • In the risk allele column, when the reference allele is A2 and the allele A2 frequency of the case group is larger than the allele A2 frequency of the normal group (i.e., cas_A2>con_A2), allele A2 is regarded as the risk allele. In the opposite case, allele A1 is regarded as the risk allele.
    • Odds ratio represents the ratio of the odds of the risk allele in the case group to the odds of the risk allele in the normal group. In the present invention, the Mantel-Haenszel odds ratio method was used. CI represents the 95% confidence interval for the odds ratio and is given as (lower limit of the confidence interval, upper limit of the confidence interval). When 1 falls within the confidence interval, the association of the risk allele with the disease is considered insignificant.
    • HWE represents Hardy-Weinberg Equilibrium. Here, con_HW and cas_HW represent the degree of deviation from Hardy-Weinberg Equilibrium in the normal group and the case group, respectively. Based on chi_value = 6.63 (p-value = 0.01, df = 1) in a chi-square (df = 1) test, a value larger than 6.63 was regarded as Hardy-Weinberg Disequilibrium (HWD) and a value smaller than 6.63 was regarded as Hardy-Weinberg Equilibrium (HWE).
    • Call rate represents the ratio of the number of genotype-interpretable samples to the total number of samples used in experiments. Here, cas_call_rate and con_call_rate represent the ratio of the number of genotype-interpretable samples to the total number (300 persons) of samples used in the case group and the normal group, respectively.
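The column definitions above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used to produce the tables: the function names are hypothetical, the p-value is that of the ordinary Pearson chi-square test with two degrees of freedom (not Fisher's exact test), and the Hardy-Weinberg check is a textbook goodness-of-fit test that is not guaranteed to reproduce the tabulated con_HW and cas_HW values. The genotype counts are those listed for marker DMX_001 in Table 1-1.

```python
import math


def a2_frequency(a1a1, a1a2, a2a2):
    """Allele A2 frequency from the three genotype counts."""
    samples = a1a1 + a1a2 + a2a2
    return (2 * a2a2 + a1a2) / (2 * samples)


def risk_allele(cas_a2, con_a2):
    """A2 is the risk allele when it is more frequent in the case group."""
    return "A2" if cas_a2 > con_a2 else "A1"


def genotype_chi_square(case, control):
    """Pearson chi-square on the 2 x 3 genotype table (df = 2)."""
    column_totals = [c + k for c, k in zip(case, control)]
    if min(column_totals) == 0:
        raise ValueError("df = 2 assumes every genotype is observed")
    total = sum(column_totals)
    chi2 = 0.0
    for row in (case, control):
        row_total = sum(row)
        for observed, column_total in zip(row, column_totals):
            expected = row_total * column_total / total
            chi2 += (observed - expected) ** 2 / expected
    # With two degrees of freedom the upper-tail probability is exp(-x / 2).
    return chi2, math.exp(-chi2 / 2)


def hwe_label(a1a1, a1a2, a2a2, threshold=6.63):
    """Chi-square (df = 1) against Hardy-Weinberg proportions."""
    samples = a1a1 + a1a2 + a2a2
    p = a2_frequency(a1a1, a1a2, a2a2)
    q = 1.0 - p
    expected = (q * q * samples, 2 * p * q * samples, p * p * samples)
    chi2 = sum((o - e) ** 2 / e
               for o, e in zip((a1a1, a1a2, a2a2), expected) if e > 0)
    return chi2, ("HWD" if chi2 > threshold else "HWE")


# Genotype counts (A1A1, A1A2, A2A2) of marker DMX_001 in Table 1-1.
case = (54, 136, 109)
control = (77, 151, 72)

cas_a2 = a2_frequency(*case)
con_a2 = a2_frequency(*control)
delta = abs(cas_a2 - con_a2)
chi_value, p_value = genotype_chi_square(case, control)
print(round(cas_a2, 3), round(con_a2, 3), round(delta, 3),
      risk_allele(cas_a2, con_a2))
print(round(chi_value, 3), f"{p_value:.2E}")
print(hwe_label(*control))
```

For this marker the sketch gives allele frequencies and a chi-square value that agree, after rounding, with the cas_A2, con_A2, and Chi_value entries of Table 1-1. Markers with a genotype count below 5 would call for an exact test, as noted in the definitions.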


Tables 2-1 and 2-2 present characteristics of SNP markers based on the NCBI build 123.


As shown in Tables 1-1, 1-2, 2-1, and 2-2, according to the chi-square test of the polymorphic markers of SEQ ID NOS: 1-80 of the present invention, the chi_exact_p-value ranges from 4.54×10⁻⁴ to 0.0104 at the 95% confidence level. This shows that there are significant differences between expected values and measured values of the allele occurrence frequencies for the polymorphic markers of SEQ ID NOS: 1-80. The odds ratio ranges from 1.34 to 2.43, which shows that the polymorphic markers of SEQ ID NOS: 1-80 are associated with type II diabetes mellitus.
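As an illustration of how an odds ratio and its 95% confidence interval relate to the genotype counts, the following Python sketch may be helpful. It is a simplification and an assumption on our part: the specification names the Mantel-Haenszel method, whereas the sketch computes the plain (unstratified) allele-count odds ratio with a Woolf (log-based) interval. The function names are hypothetical, and the counts are those of marker DMX_001 in Table 1-1.

```python
import math


def allele_counts(a1a1, a1a2, a2a2):
    """Return (number of A1 alleles, number of A2 alleles)."""
    return 2 * a1a1 + a1a2, 2 * a2a2 + a1a2


def allelic_odds_ratio(case, control, risk="A2", z=1.96):
    """Unstratified odds ratio of the risk allele with a Woolf interval."""
    case_a1, case_a2 = allele_counts(*case)
    ctrl_a1, ctrl_a2 = allele_counts(*control)
    if risk == "A2":
        a, b, c, d = case_a2, case_a1, ctrl_a2, ctrl_a1
    else:
        a, b, c, d = case_a1, case_a2, ctrl_a1, ctrl_a2
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR); requires all four counts to be non-zero.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Genotype counts (A1A1, A1A2, A2A2) of marker DMX_001; risk allele A2.
case = (54, 136, 109)
control = (77, 151, 72)
odds_ratio, lower, upper = allelic_odds_ratio(case, control, risk="A2")
print(round(odds_ratio, 2), round(lower, 3), round(upper, 3))

# The association is regarded as significant when 1 lies outside the CI.
significant = not (lower <= 1.0 <= upper)
```

For DMX_001 this simplified calculation lands close to the tabulated OR and CI, but small differences from the Mantel-Haenszel values reported in the tables are to be expected.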


The SNPs of SEQ ID NOS: 1-80 of the present invention occur at significantly different frequencies in a type II diabetic patient group and a normal group. Therefore, the polynucleotide according to the present invention can be efficiently used in diagnosis, fingerprinting analysis, or treatment of type II diabetes mellitus. In detail, the polynucleotide of the present invention can be used as a primer or a probe for diagnosis of type II diabetes mellitus. Furthermore, the polynucleotide of the present invention can be used as antisense DNA or a composition for treatment of type II diabetes mellitus.


The present invention also provides an allele-specific polynucleotide for diagnosis of type II diabetes mellitus, which is hybridized with a polynucleotide including a contiguous span of at least 10 nucleotides containing a nucleotide of a polymorphic site of a nucleotide sequence selected from the group consisting of the nucleotide sequences of SEQ ID NOS: 1-80, or a complement thereof.


The allele-specific polynucleotide refers to a polynucleotide that hybridizes specifically with each allele. That is, the allele-specific polynucleotide is able to distinguish the nucleotides at the polymorphic sites within the polymorphic sequences of SEQ ID NOS: 1-80 and hybridizes specifically with each of the nucleotides. The hybridization is performed under stringent conditions, for example, a salt concentration of 1 M or less and a temperature of 25° C. or more. For example, conditions of 5×SSPE (750 mM NaCl, 50 mM Na Phosphate, 5 mM EDTA, pH 7.4) and 25-30° C. are suitable for allele-specific probe hybridization.


In the present invention, the allele-specific polynucleotide may be a primer. As used herein, the term “primer” refers to a single-stranded oligonucleotide that acts as a starting point of template-directed DNA synthesis under appropriate conditions, for example in a buffer containing four different nucleoside triphosphates and a polymerase such as DNA or RNA polymerase or reverse transcriptase, at an appropriate temperature. The appropriate length of the primer may vary according to the purpose of use, but is generally 15 to 30 nucleotides. Generally, a shorter primer molecule requires a lower temperature to form a stable hybrid with a template. A primer sequence is not necessarily completely complementary to a template but must be complementary enough to hybridize with the template. Preferably, the 3′ end of the primer is aligned with the nucleotide of each polymorphic site (position 101) of SEQ ID NOS: 1-80. The primer hybridizes with a target DNA containing a polymorphic site and initiates amplification of the allelic form with which the primer exhibits complete homology. The primer is used in a pair with a second primer hybridizing with the opposite strand. Amplified products are obtained by amplification using the two primers, which indicates that the specific allelic form is present. The primer of the present invention includes a polynucleotide fragment used in a ligase chain reaction (LCR).
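The preferred primer geometry, in which the 3′ end of the primer falls on the polymorphic site, can be sketched in Python as follows. This is an illustrative sketch only: it builds forward-strand primers from the 100 nucleotides upstream of position 101, ignores melting temperature and secondary structure, and uses a hypothetical function name and a placeholder flanking sequence rather than any actual SEQ ID NO.

```python
def allele_specific_primers(upstream, alleles, length=20):
    """Build one forward primer per allele, each ending (3' end) on the
    polymorphic base.

    upstream: the nucleotides immediately 5' of the polymorphic site
              (positions 1-100 of a polymorphic sequence).
    alleles:  the alternative bases at the polymorphic site, e.g. "CT".
    """
    if not 15 <= length <= 30:
        raise ValueError("primer length is generally 15 to 30 nucleotides")
    if len(upstream) < length - 1:
        raise ValueError("not enough upstream sequence for this length")
    flank = upstream[-(length - 1):]  # bases just before the site
    return {allele: flank + allele for allele in alleles}


# Placeholder upstream flank (hypothetical, 100 nt) for a C→T marker.
upstream = "ACGT" * 25
primers = allele_specific_primers(upstream, "CT", length=20)
for allele, primer in primers.items():
    print(allele, primer)
```

The two primers differ only in their last base, so each would be expected to extend efficiently only on the template carrying the matching allele; each would be paired with a common second primer on the opposite strand.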


In the present invention, the allele-specific polynucleotide may be a probe. As used herein, the term “probe” refers to a hybridization probe, that is, an oligonucleotide capable of sequence-specific binding to a complementary strand of a nucleic acid. Such a probe may be a peptide nucleic acid as disclosed by Nielsen et al., Science 254, 1497-1500 (1991). The probe according to the present invention is an allele-specific probe. That is, when there are polymorphic sites in nucleic acid fragments derived from two members of the same species, the probe hybridizes with DNA fragments derived from one member but not with DNA fragments derived from the other member. In this case, hybridization conditions should be stringent enough that the difference in hybridization strength between alleles is significant and the probe hybridizes with only one allele. Preferably, the central portion of the probe, that is, position 7 for a 15 nucleotide probe, or position 8 or 9 for a 16 nucleotide probe, is aligned with each polymorphic site of the nucleotide sequences of SEQ ID NOS: 1-80, which produces a significant difference in hybridization between alleles. The probe of the present invention can be used in diagnostic methods for detecting alleles. The diagnostic methods include nucleic acid hybridization-based detection methods, e.g., Southern blotting. Where DNA chips are used for the nucleic acid hybridization-based detection methods, the probe may be provided in a form immobilized on a substrate of a DNA chip.
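The placement of the polymorphic site near the center of the probe can be illustrated with the following Python sketch. It is offered under stated assumptions: the function name is hypothetical, the sequence is a placeholder rather than an actual SEQ ID NO, and the default of position 7 in a 15-nucleotide probe simply follows the preference stated above.

```python
def allele_specific_probes(sequence, alleles, probe_len=15,
                           site_in_probe=7, site=101):
    """Return one probe per allele, with the polymorphic base placed at
    1-based position `site_in_probe` of each probe."""
    idx = site - 1                       # 0-based index of the site
    start = idx - (site_in_probe - 1)    # 0-based start of the probe
    end = start + probe_len
    if start < 0 or end > len(sequence):
        raise ValueError("probe would extend beyond the sequence")
    probes = {}
    for allele in alleles:
        # Substitute the allele at the polymorphic site, then cut the probe.
        variant = sequence[:idx] + allele + sequence[idx + 1:]
        probes[allele] = variant[start:end]
    return probes


# Placeholder 201-nt sequence (hypothetical) for a C→T marker.
sequence = "A" * 100 + "C" + "G" * 100
probes = allele_specific_probes(sequence, "CT")
for allele, probe in probes.items():
    print(allele, probe)
```

The probes for the two alleles differ at a single internal position, which is the arrangement intended to maximize the difference in hybridization strength between alleles under stringent conditions.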


The present invention also provides a microarray for diagnosis of type II diabetes mellitus, including the polynucleotide according to the present invention or the complementary polynucleotide thereof. The polynucleotide of the microarray may be DNA or RNA. The microarray is the same as a common microarray except that it includes the polynucleotide of the present invention.


The present invention also provides a type II diabetes mellitus diagnostic kit including the polynucleotide of the present invention. The type II diabetes mellitus diagnostic kit may include reagents necessary for polymerization, e.g., dNTPs, various polymerases, and a colorant, in addition to the polynucleotide according to the present invention.


The present invention also provides a method of diagnosing type II diabetes mellitus in an individual, which includes: isolating a nucleic acid sample from the individual; and determining a nucleotide of at least one polymorphic site (position 101) within polynucleotides of SEQ ID NOS: 1-80 or complementary polynucleotides thereof. Here, when the nucleotide of the at least one polymorphic site of the sample nucleic acid is the same as at least one risk allele presented in Tables 1-1, 1-2, 2-1, 2-2, 3, 4, and 5, it may be determined that the individual has a higher likelihood of being diagnosed as at risk of developing type II diabetes mellitus.


The step of isolating the nucleic acid sample from the individual may be carried out by a common DNA isolation method. For example, the nucleic acid sample can be obtained by amplifying a target nucleic acid by polymerase chain reaction (PCR) followed by purification. In addition to PCR, LCR (Wu and Wallace, Genomics 4, 560 (1989); Landegren et al., Science 241, 1077 (1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 (1989)), self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad. Sci. USA 87, 1874 (1990)), or nucleic acid sequence based amplification (NASBA) may be used. The last two methods are isothermal reactions based on isothermal transcription and produce single-stranded RNA and double-stranded DNA as amplification products at 30- or 100-fold amplification.


According to an embodiment of the present invention, the step of determining a nucleotide of a polymorphic site includes hybridizing the nucleic acid sample onto a microarray on which polynucleotides for diagnosis or treatment of type II diabetes mellitus, including at least 10 contiguous nucleotides derived from the group consisting of nucleotide sequences of SEQ ID NOS: 1-80 and including a nucleotide of a polymorphic site (position 101), or complementary polynucleotides thereof are immobilized; and detecting the hybridization result.


A microarray and a method of preparing a microarray by immobilizing a probe polynucleotide on a substrate are well known in the pertinent art. Immobilization of a probe polynucleotide associated with type II diabetes mellitus of the present invention on a substrate can be easily performed using a conventional technique. Hybridization of nucleic acids on a microarray and detection of the hybridization result are also well known in the pertinent art. For example, the detection of the hybridization result can be performed by labeling a nucleic acid sample with a labeling material generating a detectable signal, such as a fluorescent material (e.g., Cy3 and Cy5), hybridizing the labeled nucleic acid sample onto a microarray, and detecting a signal generated from the labeling material.


According to another embodiment of the present invention, when the determination of the nucleotide sequence of a polymorphic site detects at least one nucleotide sequence containing a risk allele, selected from SEQ ID NOS: 2, 4, 5, 8, 9, 11, 13, 16, 18, 20, 21, 23, 25, 27, 30, 32, 33, 36, 38, 40, 42, 44, 46, 47, 49, 51, 53, 55, 57, 59, 62, 63, 65, 67, 69, 71, 75, 77, and 80, it may be determined that the individual has a higher likelihood of being diagnosed as at risk of developing type II diabetes mellitus. If more nucleotide sequences containing risk alleles are detected in an individual, it may be determined that the individual has a much higher likelihood of being diagnosed as at risk of developing type II diabetes mellitus.
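The determination described above amounts to checking which risk-allele sequences are present in an individual and counting them. A minimal Python sketch, assuming that genotyping has already produced the set of SEQ ID NOs detected in the sample, is given below. The names `RISK_SEQ_IDS` and `risk_sequences_detected` and the example input are hypothetical, and no numerical threshold is implied by the specification.

```python
# SEQ ID NOs listed above as containing risk alleles.
RISK_SEQ_IDS = frozenset({
    2, 4, 5, 8, 9, 11, 13, 16, 18, 20, 21, 23, 25, 27, 30, 32, 33, 36,
    38, 40, 42, 44, 46, 47, 49, 51, 53, 55, 57, 59, 62, 63, 65, 67, 69,
    71, 75, 77, 80,
})


def risk_sequences_detected(detected_seq_ids):
    """Return the sorted risk-allele SEQ ID NOs found in one individual."""
    return sorted(RISK_SEQ_IDS.intersection(detected_seq_ids))


# Hypothetical genotyping result: SEQ ID NOs detected in one sample.
detected = {1, 2, 5, 6, 80}
found = risk_sequences_detected(detected)
print(found, len(found))
```

A larger count corresponds to the situation in which more risk-allele sequences are detected, which the description associates with a higher likelihood of being at risk.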


Hereinafter, the present invention will be described more specifically with reference to an Example. However, the following Example is provided only for illustration and thus the present invention is not limited thereto.


Effect of the Invention


The polynucleotide according to the present invention can be used in diagnosis, treatment, or fingerprinting analysis of type II diabetes mellitus.


The microarray and diagnostic kit including the polynucleotide according to the present invention can be used for efficient diagnosis of type II diabetes mellitus.


The method of analyzing polynucleotides associated with type II diabetes mellitus according to the present invention can efficiently detect the presence or risk of type II diabetes mellitus.


BEST MODE FOR CARRYING OUT THE INVENTION
EXAMPLE
Example 1

In this Example, DNA samples were extracted from the blood of a patient group consisting of 300 Korean persons who had been diagnosed with type II diabetes mellitus and were under treatment, and of a normal group consisting of 300 persons who were free from symptoms of type II diabetes mellitus and of the same age as the patient group, and the occurrence frequencies of specific SNPs were evaluated. The SNPs were selected from known databases (NCBI dbSNP: http://www.ncbi.nlm.nih.gov/SNP/ or Sequenom: http://www.realsnp.com/). Primers hybridizing with sequences around the selected SNPs were used to assay the nucleotide sequences of the SNPs in the DNA samples.


1. Preparation of DNA Samples


DNA samples were extracted from the blood of type II diabetes mellitus patients and normal persons. DNA extraction was performed according to a known extraction method (Molecular cloning: A Laboratory Manual, p 392, Sambrook, Fritsch and Maniatis, 2nd edition, Cold Spring Harbor Press, 1989) and the specification of a commercial kit manufactured by Centra system. Among the extracted DNA samples, only those having a purity (A260/A280 ratio) of at least 1.7 were used.
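The purity criterion can be sketched as a simple filter. The sketch assumes that absorbance readings at 260 nm and 280 nm are available for each sample; the function names are hypothetical.

```python
MIN_PURITY = 1.7  # minimum A260/A280 ratio stated in the text


def purity_ratio(a260, a280):
    """Return the A260/A280 absorbance ratio of a DNA sample."""
    return a260 / a280


def passes_purity(a260, a280, minimum=MIN_PURITY):
    """Return True when the sample meets the minimum purity ratio."""
    return purity_ratio(a260, a280) >= minimum
```

A sample with readings of 0.90 and 0.50 has a ratio of 1.8 and is kept, whereas readings of 0.80 and 0.50 give 1.6 and the sample is excluded.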


2. Amplification of Target DNAs


Target DNAs, which are predetermined DNA regions containing the SNPs to be analyzed, were amplified by PCR. The PCR was performed by a common method under the following conditions. First, 2.5 ng/ml of target genomic DNAs were prepared. Then, the following PCR mixture was prepared.

Component                                     Volume
Water (HPLC grade)                            2.24 μl
10x buffer (15 mM MgCl2, 25 mM MgCl2)         0.5 μl
dNTP Mix (GIBCO) (25 mM for each)             0.04 μl
Taq pol (HotStar) (5 U/μl)                    0.02 μl
Forward/reverse primer Mix (1 μM for each)    0.02 μl
DNA                                           1.00 μl
Total volume                                  5.00 μl


Here, the forward and reverse primers were designed based on the upstream and downstream sequences of the SNPs in the known databases. These primers are listed in Table 3 below.


The thermal cycles of the PCR were as follows: incubation at 95° C. for 15 minutes; 45 cycles of 95° C. for 30 seconds, 56° C. for 30 seconds, and 72° C. for 1 minute; incubation at 72° C. for 3 minutes; and storage at 4° C. As a result, amplified DNA fragments of 200 or fewer nucleotides in length were obtained.
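As a rough check on run length, the programmed hold times of this thermal profile can be added up. The sketch below counts hold times only; temperature ramping and the final 4° C. storage step are not included, and the function name is hypothetical.

```python
# Hold times taken from the thermal profile in the text, in seconds.
INITIAL_DENATURATION_S = 15 * 60   # 95 C for 15 minutes
CYCLES = 45
CYCLE_STEPS_S = (30, 30, 60)       # 95 C, 56 C, and 72 C steps of one cycle
FINAL_EXTENSION_S = 3 * 60         # 72 C for 3 minutes


def total_hold_time_s(initial=INITIAL_DENATURATION_S, cycles=CYCLES,
                      steps=CYCLE_STEPS_S, final=FINAL_EXTENSION_S):
    """Return the sum of all programmed hold times in seconds."""
    return initial + cycles * sum(steps) + final
```

With the values above, `total_hold_time_s()` gives 6480 seconds, that is, 108 minutes of hold time.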


3. Analysis of SNPs in Amplified Target DNA Fragments


Analysis of SNPs in the amplified target DNA fragments was performed using the homogeneous MassExtension (hME) technique available from Sequenom. The principle of the MassExtension technique is as follows. First, primers (also called “extension primers”) ending immediately before the SNPs within the target DNA fragments are designed. Then, the primers are hybridized with the target DNA fragments and DNA polymerization is performed. The polymerization solution contains a reagent (e.g., ddTTP) that terminates polymerization immediately after the incorporation of a nucleotide complementary to a first allelic nucleotide (e.g., A allele). Accordingly, when the first allele (e.g., A allele) is present in the target DNA fragments, products in which the primers are extended by only the nucleotide complementary to the first allele (e.g., T nucleotide) are obtained. On the other hand, when a second allele (e.g., G allele) is present in the target DNA fragments, a nucleotide (e.g., C nucleotide) complementary to the second allele is added to the 3′-ends of the primers, and the primers are then extended until a nucleotide (e.g., T nucleotide) complementary to the closest first allele nucleotide (e.g., A nucleotide) is added. The lengths of the products extended from the primers are determined by mass spectrometry, so that the alleles present in the target DNA fragments can be identified. Illustrative experimental conditions were as follows.
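The extension principle described above can be illustrated with a simplified model. The sketch assumes a single terminating nucleotide (ddT in the example from the text) and treats all other nucleotides as freely extendable; the function name and the example template sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def extend_primer(template_from_snp, terminator="T"):
    """Return the nucleotides added to the 3' end of an extension primer.

    template_from_snp lists the template bases in the order the polymerase
    reads them, starting at the polymorphic site. Extension stops right
    after the terminating nucleotide has been incorporated.
    """
    added = []
    for base in template_from_snp:
        incorporated = COMPLEMENT[base]
        added.append(incorporated)
        if incorporated == terminator:
            break
    return "".join(added)
```

With an A allele at the polymorphic site, `extend_primer("ACGA")` returns `"T"`, a one-nucleotide extension. With a G allele, `extend_primer("GCGA")` returns `"CGCT"`, because extension continues until the next A in the template. The two products differ in length and therefore in mass.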


First, unreacted dNTPs were removed from the PCR products. For this, 1.53 μl of deionized water, 0.17 μl of hME buffer, and 0.30 μl of shrimp alkaline phosphatase (SAP) were added and mixed in 1.5 ml tubes to prepare SAP enzyme solutions. The tubes were centrifuged at 5,000 rpm for 10 seconds. Thereafter, the PCR products were added to the SAP solution tubes, and the tubes were sealed, incubated at 37° C. for 20 minutes and then at 85° C. for 5 minutes, and stored at 4° C.


Next, homogeneous extension was performed using the amplified target DNA fragments as templates. The compositions of the reaction solutions for the extension were as follows.

Component                                                      Volume
Water (nanoscale deionized water)                              1.728 μl
hME extension mix (10x buffer containing 2.25 mM d/ddNTPs)     0.200 μl
Extension primers (100 μM for each)                            0.054 μl
Thermosequenase (32 U/μl)                                      0.018 μl
Total volume                                                   2.00 μl
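When many samples are processed together, the per-reaction volumes of the extension mix can be scaled into a master mix. The following is a minimal sketch; the `overage` parameter (extra volume to cover pipetting loss) is an assumption that does not appear in the text, and the function name is hypothetical.

```python
# Per-reaction volumes of the extension mix from the text, in microliters.
EXTENSION_MIX_UL = {
    "water": 1.728,
    "hME extension mix": 0.200,
    "extension primers": 0.054,
    "Thermosequenase": 0.018,
}


def scale_mix(per_reaction_ul, n_reactions, overage=0.0):
    """Return component volumes (in microliters) for n_reactions reactions.

    overage is a fractional excess, e.g. 0.1 for 10 % extra volume.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 3) for name, vol in per_reaction_ul.items()}
```

The per-reaction volumes add up to the stated total of 2.00 μl, so scaling to a 96-well plate without overage gives a master mix of 192 μl.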


The reaction solutions were thoroughly mixed with the previously prepared target DNA solutions and subjected to spin-down centrifugation. Tubes or plates containing the reaction solutions were tightly sealed and incubated at 94° C. for 2 minutes, followed by homogeneous extension for 40 cycles of 94° C. for 5 seconds, 52° C. for 5 seconds, and 72° C. for 5 seconds, and storage at 4° C. The homogeneous extension products thus obtained were washed with a resin (SpectroCLEAN). The extension primers used in the extension are listed in Table 3 below.

TABLE 3
Primers for amplification and extension primers for homogeneous extension for target DNAs

            Amplification primer (SEQ ID NO)      Extension primer
Marker      Forward primer    Reverse primer      (SEQ ID NO)
DMX_001     81                82                  83
DMX_003     84                85                  86
DMX_005     87                88                  89
DMX_008     90                91                  92
DMX_009     93                94                  95
DMX_011     96                97                  98
DMX_012     99                100                 101
DMX_014     102               103                 104
DMX_016     105               106                 107
DMX_019     108               109                 110
DMX_027     111               112                 113
DMX_028     114               115                 116
DMX_029     117               118                 119
DMX_030     120               121                 122
DMX_031     123               124                 125
DMX_032     126               127                 128
DMX_033     129               130                 131
DMX_044     132               133                 134
DMX_049     135               136                 137
DMX_052     138               139                 140
DMX_054     141               142                 143
DMX_056     144               145                 146
DMX_060     147               148                 149
DMX_061     150               151                 152
DMX_062     153               154                 155
DMX_063     156               157                 158
DMX_065     159               160                 161
DMX_067     162               163                 164
DMX_068     165               166                 167
DMX_069     168               169                 170
DMX_104     171               172                 173
DMX_105     174               175                 176
DMX_116     177               178                 179
DMX_117     180               181                 182
DMX_120     183               184                 185
DMX_136     186               187                 188
DMX_139     189               190                 191
DMX_150     192               193                 194
DMX_152     195               196                 197
DMX_154     198               199                 200


Nucleotides of the polymorphic sites in the extension products were assayed by MALDI-TOF (Matrix Assisted Laser Desorption and Ionization-Time of Flight) mass spectrometry. MALDI-TOF operates according to the following principle. When an analyte is exposed to a laser beam, it flies in a vacuum, together with the ionized matrix, toward a detector positioned at the opposite side. The time taken for the analyte to reach the detector is measured. A material with a smaller mass reaches the detector more rapidly. The nucleotides of the SNPs in the target DNA fragments are determined by comparing the measured masses of the extension products with the masses expected from the known nucleotide sequences of the SNPs.
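The dependence of flight time on mass can be illustrated with the idealized relation for a linear time-of-flight analyzer, in which an ion of mass m and charge z accelerated through a voltage U crosses a field-free tube of length L in time t = L·sqrt(m / (2·z·e·U)). The sketch below is only an illustration: the accelerating voltage and tube length are assumed values that are not given in the text, and the function name is hypothetical.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19   # elementary charge in coulombs
DALTON_KG = 1.66053906660e-27           # one dalton in kilograms


def flight_time_s(mass_da, charge=1, voltage_v=20000.0, length_m=1.0):
    """Return the idealized flight time of an ion in a linear TOF analyzer.

    voltage_v and length_m are assumed instrument parameters chosen only
    for illustration.
    """
    mass_kg = mass_da * DALTON_KG
    kinetic_energy_j = charge * ELEMENTARY_CHARGE_C * voltage_v
    return length_m * math.sqrt(mass_kg / (2.0 * kinetic_energy_j))
```

Under this model, flight time grows with the square root of mass, so a shorter extension product reaches the detector before a longer one, which is the basis on which the two alleles are distinguished.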


The results of determining the nucleotides of the polymorphic sites of the target DNAs by MALDI-TOF are shown in Tables 1-1, 1-2, 2-1, and 2-2. Each allele may exist in the form of a homozygote or a heterozygote in an individual. However, the distribution of heterozygote and homozygote frequencies in a given population does not deviate at a statistically significant level. According to Mendel's laws of inheritance and the Hardy-Weinberg law, the allelic makeup of a population is maintained at constant frequencies. When a difference in genetic makeup is statistically significant, it can be considered to be biologically meaningful. The SNPs according to the present invention occur in type II diabetes mellitus patients at a statistically significant level, as shown in Tables 1-1, 1-2, 2-1, and 2-2, and thus can be efficiently used in the diagnosis of type II diabetes mellitus.
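The Hardy-Weinberg expectation mentioned above can be checked with a chi-square statistic computed from observed genotype counts. The sketch below is a generic textbook calculation and is not necessarily the statistical procedure used to obtain Tables 1-1 to 2-2; the function name and the example counts are hypothetical.

```python
def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Return the chi-square statistic for deviation from Hardy-Weinberg.

    n_aa and n_bb are the counts of the two homozygotes and n_ab is the
    count of heterozygotes. Expected counts are p^2*n, 2pq*n, and q^2*n,
    where p and q are the allele frequencies estimated from the sample.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expected count to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```

Genotype counts of 25, 50, and 25 match the expected proportions exactly and give a statistic of 0, whereas counts of 50, 0, and 50 give 100. With one degree of freedom, a statistic above 3.841 corresponds to a deviation at the 0.05 significance level.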

Claims
  • 1. A polynucleotide for diagnosis or treatment of type II diabetes mellitus, comprising at least 10 contiguous nucleotides of a nucleotide sequence selected from the group consisting of nucleotide sequences of SEQ ID NOS: 1-80 and comprising a nucleotide at position 101 of the nucleotide sequence, or a complementary polynucleotide thereof.
  • 2. A polynucleotide for diagnosis or treatment of type II diabetes mellitus, which is hybridized with the polynucleotide of claim 1 or the complementary polynucleotide thereof.
  • 3. The polynucleotide of claim 1, which is 10 to 100 nucleotides in length.
  • 4. The polynucleotide of claim 1, which is a primer or a probe.
  • 5. A microarray for diagnosis of type II diabetes mellitus, which comprises the polynucleotide of claim 1 or the complementary polynucleotide thereof.
  • 6. A kit for diagnosis of type II diabetes mellitus, which comprises the polynucleotide of claim 1 or the complementary polynucleotide thereof.
  • 7. A method of diagnosing type II diabetes mellitus in an individual, which comprises: (a) isolating a nucleic acid sample from the individual; and (b) determining a nucleotide of at least one polymorphic site (position 101) within polynucleotides of SEQ ID NOS: 1-80 or complementary polynucleotides thereof.
  • 8. The method of claim 7, wherein the step of determining the nucleotide of the at least one polymorphic site comprises: hybridizing the nucleic acid sample onto a microarray on which the polynucleotide of claim 1 or its complementary polynucleotide is immobilized; and detecting a hybridization result.
  • 9. The method of claim 7, wherein when at least one nucleotide sequence selected from SEQ ID NOS: 2, 4, 5, 8, 9, 11, 13, 16, 18, 20, 21, 23, 25, 27, 30, 32, 33, 36, 38, 40, 42, 44, 46, 47, 49, 51, 53, 55, 57, 59, 62, 63, 65, 67, 69, 71, 75, 77, and 80 containing risk alleles is detected, it is determined that the individual has a higher likelihood of being diagnosed as at risk of developing type II diabetes mellitus.
  • 10. The polynucleotide of claim 2, which is 10 to 100 nucleotides in length.
Priority Claims (2)
Number Date Country Kind
10-2003-0096191 Dec 2003 KR national
10-2004-0111102 Dec 2004 KR national
PCT Information
Filing Document Filing Date Country Kind 371c Date
PCT/KR04/03441 12/24/2004 WO 8/23/2005