The present invention relates to methods for determining and differentiating between several rheumatic conditions.
Arthritis is a symptom of many rheumatic conditions including rheumatoid arthritis (RA), osteoarthritis (OA), seronegative arthritis (SA), infectious arthritis (INF), microcrystalline arthritis (MIC), systemic lupus erythematosus (SLE) and other systemic disorders. In many cases, the diagnosis can be established based on the clinical presentation and additional laboratory or radiological tests. However, in some cases, in particular in early disease, it can be very hard to discriminate between these disorders and the correct identification can only be made after several weeks or months of evolution.
Expression profiling has already shown its usefulness in identifying genes in specific cell types under defined conditions and in establishing characteristic patterns of gene expression in a variety of diseases. WO 2004/110244 describes a method for detecting a predisposition to developing established rheumatoid arthritis (RA) in a subject by obtaining a biological sample from the subject, determining expression levels of at least two genes in the biological sample, and comparing the expression level of each gene with a standard, wherein the comparing detects a predisposition to developing established RA in the subject. WO 2004/035827 describes libraries of polynucleotide sequences and polynucleotide arrays useful for prognosticating or diagnosing rheumatoid arthritis or osteoarthritis.
However, these observations are not useful in daily medical practice, where the differential diagnosis must be made not only between RA and normal but across a larger spectrum of conditions. Yet it is very important to provide patients with early identification and classification of their rheumatic condition, and thereby with adequate therapy, in order to achieve early remission and avoid long-term damage.
The present inventors have designed an arthritis discrimination test, which allows early identification of several rheumatic conditions.
The present inventors have identified a series of genes useful in screening for and differentiating several arthritis diseases. The expression profiles of these genes can be used in identifying whether the individual has a rheumatic condition selected from, but not limited to systemic lupus erythematosus (SLE), osteoarthritis (OA), rheumatoid arthritis (RA), seronegative arthritis (SA) or microcrystalline arthritis (MIC).
The present invention therefore concerns a method for the determination and the classification of rheumatic conditions in at least one biological sample of a subject afflicted with said rheumatic condition, comprising the steps of
The present inventors have specifically selected 2059 genes listed above found by ANOVA to be differentially expressed between the five conditions SLE, OA, RA, SA, or MIC.
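By way of illustration only, the ANOVA-based selection mentioned above can be sketched as a one-way ANOVA F statistic computed per gene across the disease groups. The function name `anova_f`, the gene labels `GENE_A` and `GENE_B`, and all expression values are hypothetical; three groups are shown for brevity where the description uses five, and no p-value or multiple-testing correction is computed.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for one gene.

    `groups` is a list of lists, one list of expression values per condition.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Toy data (hypothetical values): gene -> condition -> expression levels.
expression = {
    "GENE_A": {"RA": [1.0, 2.0, 3.0], "OA": [2.0, 3.0, 4.0], "SLE": [5.0, 6.0, 7.0]},
    "GENE_B": {"RA": [1.0, 2.0, 3.0], "OA": [1.0, 2.0, 3.0], "SLE": [1.0, 2.0, 3.0]},
}

f_scores = {gene: anova_f(list(by_cond.values())) for gene, by_cond in expression.items()}
# Genes with the largest F statistic vary most between conditions.
ranked = sorted(f_scores, key=f_scores.get, reverse=True)
```

In such a sketch, genes would be retained when their F statistic exceeds a chosen threshold; the threshold actually applied to obtain the 2059 genes is not stated here.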
In a particular embodiment, the identification step comprises the comparison of the level of expression of said at least 100 genes with the level of expression of said genes in a reference sample of the same type obtained from subjects afflicted with determined rheumatic conditions, according to clustering analysis.
In particular, the present invention is performed using a synovial sample.
In an embodiment, said method comprises determining in said synovial sample the expression level of at least 100 up to 2059 genes or fragments thereof selected from the group of genes or fragments thereof listed above.
In an embodiment, said at least 100 genes or fragments thereof selected from the 2059 genes listed above, are the following genes which were found to be sufficient to define a specific signature in every disease: AI923633; ARPC4; ASH1L; AW008502; AW296081; AW612461; BAG2; BCL11B; BE676335; BG231773; BSG; BTBD1; C12orf30/FLJ13089; C14orf131; CALU; CCL5; CCND1; CD79A; CDC42BPA; CDC42SE1/SPEC1; CHD1; CHD9/FLJ12178; CPSF1; CST7; CTDSPL; DUT; EIF3EIP/EIF3S6IP; ETV6; EXTL2; FLJ21395 fis; FN1; FRMD4A; G3BP1; GPR4; GRID1; GYG1; HMGB1; IFI27; IFI6/G1P3; IFIT3/IFIT4; IL7R; JAK3; JOSD3/MGC5306; KIAA0090; KIAA1128; KIAA1377; KIAA1450; LCK; LOC129607; MYEOV2/LOC150678; NBL1; NELL2; OAS1; PARP12/ZC3HDC1; PDE5A; PGF; PHF21A/BHC80; PIK3C2A; PTBP1; PTEN; PTPN7; QKI; RAB8A; RALGPS2; RAP2A; RAPGEF2; RASGRP1; RBBP6; RGS5; RPL4; RSAD2/cig5; SFRS2B/SRP46; SFRS6; SIPA1L3; SLC15A2; SPARCL1; SUPT16H; SYNCOILIN; TARP /// TRGC2 /// TRGV9; TBC1D20/C20orf140; TBC1D24/KIAA1171; THRAP3; TI-227H /// TUG1; TLE2; TMEM43/MGC3222; TNFSF8; TOX; TRBC1 /// TRBV19; TSPAN3/TM4SF8; UCKL1/URKL1; WDR90/LOC197336; ZFHX3/ATBF1; T87730.
In an embodiment, said method comprises determining in said synovial sample the expression level of at least 264 genes or fragments thereof selected from the group comprising PLAC8, IRF4, SAMD3, HLA-DOB, EOMES, PDK1, BG548679, AK25048, AU146285, P2RY8, AI225238, JAK3, LAX, PTPN7, RP26, TRIM, SLAMF1, PTPRCAP, LCK, PTP4A3, AI825068, BCL11B, CD79A, IL7R, GPR18, STRBP, C20orf103, AA732944, ZAP70, SLC38A1, RUNX3, TOSO, BCL11A, NELL2, ICAM3, LTB, TCF7, TRD@; ANK3, CREB5, FAH, FKBP7, AF009316, KLF4, ANKH, NTN4, THBS4, SCARA3, CYP3A5, MGP, AMN, CMAH, FN1, GABRA4, ABCC6, BE503823, SLC24A6; CXCL11, LOC113730, cig5, STAT1, GM2A, G1P3, IFIT4, EIF5A, EPSTI1, MX1, IFI44L, IFI27, LOC129607, G1P2, IFIT1, FLJ39885, OAS2, FLJ20637, OAS1, MDA5, OAS3, LILRA2, TGFBRAP1, BST2, OASL, CEB1, HLA-DOA, KIS, GNB4, CLECSF12, AW262311, CALR, FPRL2, MAP3K2, FLJ20668, CYBB, SNX10, GRB2, GPR43, FLJ20035, C1QG, ARF6, IFI35, FLJ21069, BE674143; ZNF607, X07868, AU157716, CCDC3, CPSF1, AW029203, AK022838, CCNL2, C7orf10, SEC24D, AFG3L1, TLE2, PCSK5, AA706701, C18orf18, OSBPL6, BC042472, AUTS2, SOX4, PTPRD, AA572675, COL18A1, COL16A1, GPM6B, SCG2, TNC, TP53I11, COL12A1, AL832806, AA912540, MYST4, STEAP, ARNT2, AA854843, KIAA0484, AU147442, PKN3, SYNE1, DSPG3, AW903934, FBXO26, LOC200772, AF116709; SCRG1, DNAJC12, FNDC4, AK024204, NBL1, BF591996, DIO2, SGCD, T90703, SPOCK, CKLFSF4, AW162210, CDC42BPA, KIAA1171, FKSG17, N73742, ZNF515, PTPRS, CTDSPL, NRP2, EXTL2, CCND1, CALU, PVR, ATBF1, AFURS1, SYNPO, MYO7A, KIAA1450, SLC35B4; BCL2L11, MBNL2, TXNIP, RNASET2, DHX9, RUNX2; MSI2, FZD8, AW265065, DBC-1, ANGPTL2, FBI4, C14orf131, BF057799, SRPR, TTC3, COPA, PCDHGA11; CXCL13, AI823917, AA789123, CD209, IAN4L1, NOD3, KIAA1268, HRMT1L1, AI821404, COPG, FLJ33814, H963, TAP1, CUL5, AW504569, AA809449, CCL8, ZC3HDC1, SYNCOILIN, KIAA0090, GGA2, NAP1L, ETV6, AL037998, AK024712, AW612461, BM873997, AK000795, FLJ00133, FOXC1, MGC43690, BF590303, AI090764, BF221547, ID4, AW296081, PGF, RGS5, W80359, AF086069, FLJ32949, 
RAMP3, T87730, AI742685, GPR4, GRID1, FAM20A, FRMD4, FLJ13089, TBX2, RAPGEF2, BM353142; SLC1A4, LIG3, EMILIN2, ALDH1A1, TNFSF8, PB1, AA365670, YWHAZ, ZNF581, BAG4, CWF19L1, SLC15A2, PTPRC, STRN3, TCL1A, PKHD1L1, and CTLA4.
These 264 genes or fragments thereof are selected from the 2059 genes listed herein above and define a specific gene signature in each disease.
The present invention also relates to a method for the determination and the classification of a rheumatic condition in at least one synovial sample of a subject afflicted with said rheumatic condition, comprising the steps of
Said 20 genes or fragments thereof are selected from the 2059 genes listed above and were found to be sufficient to define a specific gene signature in every disease.
In a particular embodiment, said synovial sample is a synovial tissue. In another particular embodiment, said synovial sample is a synovial fluid. In yet another embodiment said method is performed on cells from the synovial fluid.
The present inventors have found that gene expression profiling in peripheral blood mononuclear cells (PBMC) is not adequate for making a correct diagnosis of joint disorders: the sensitivity and specificity of the gene expression profiles are too low. Therefore, the previous studies of gene profiling in PBMC are not suitable for the development of a diagnostic tool. By contrast, the present invention is based on the study of gene expression profiles in synovial tissue of patients with rheumatic conditions.
Preferably, the present invention provides a method for the determination and the classification of a rheumatic condition in at least one synovial sample of a subject afflicted with said rheumatic condition, comprising the steps of
The point of the present invention is that rheumatologists are not confronted with a differential diagnosis between only two conditions but with a differential diagnosis between several inflammatory conditions (in particular, RA, SLE, OA, MIC and SA). The present invention addresses the simultaneous differential diagnosis of all these conditions.
In a particular embodiment, the present invention provides a method for the determination and the classification of a rheumatic condition in at least one synovial sample of a subject afflicted with said rheumatic condition, comprising the steps of
In an embodiment, said identification step comprises a step of using a supervised hierarchical clustering algorithm to evaluate whether the general profile of expression of these at least 20 genes, preferably at least 100 genes, preferably at least 264 genes, yet more preferably up to 2059 genes, considered in relation to each other, fits into one diagnostic category.
The method of the present invention is not based on comparing the expression level of a single gene with standard values; it is based on the pattern of expression of all the genes listed herein, for example at least 20 genes, preferably at least 100 genes, preferably at least 264 genes, yet more preferably up to 2059 genes or fragments thereof.
The method of the invention is therefore based on the identification of gene signatures in the evaluated samples, determined from the analysis of the expression of said at least 20 genes, preferably said at least 100 genes or fragments thereof, preferably at least 264 genes, yet more preferably up to 2059 genes or fragments thereof. The pattern of expression of said genes allows the studied sample to cluster with a group of previously collected RA reference samples. The same is true for OA, SLE, MIC and SA. Levels of gene expression in healthy subjects are no longer needed in order to obtain a result.
In an embodiment, the expression profiles of these genes can be used in identifying whether the individual has a rheumatic condition selected from SLE, OA, RA, SA, or MIC. In particular, the pattern of expression of said genes allows the studied sample to cluster with a group of RA, OA, SLE, MIC or SA reference samples.
The way in which said at least 20 genes, preferably at least 100 genes, preferably at least 264 genes, yet more preferably up to 2059 genes or fragments thereof, cluster is what matters, independently of the disease to be determined. The samples are compared with reference samples using, for example, Pearson correlation.
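As a minimal sketch of the comparison with reference samples by Pearson correlation, the following assigns a test profile to the reference group with which it shows the highest mean correlation. This nearest-group rule is a simplification assumed for illustration and is not the supervised hierarchical clustering itself; the names `pearson` and `closest_class` and all expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def closest_class(sample, references):
    """Return (best_label, scores); scores maps each label to the mean
    correlation of `sample` with that label's reference profiles."""
    scores = {
        label: sum(pearson(sample, ref) for ref in profiles) / len(profiles)
        for label, profiles in references.items()
    }
    return max(scores, key=scores.get), scores


# Toy reference profiles (hypothetical values), same gene order in every list.
references = {
    "RA": [[5.0, 4.0, 1.0, 1.0], [6.0, 5.0, 2.0, 1.0]],
    "OA": [[1.0, 1.0, 5.0, 6.0], [2.0, 1.0, 4.0, 5.0]],
}
sample = [5.0, 5.0, 1.0, 2.0]
best_label, scores = closest_class(sample, references)
```

Because the correlation depends only on the shape of the profile across genes, no healthy-control baseline enters the calculation, which is consistent with the statement above that levels of expression in healthy subjects are not needed.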
In an embodiment, the present invention provides early discrimination between at least five rheumatic conditions based on the analysis of gene expression profiles in a biological sample, such as a synovial sample, of a subject with arthritis. Preferably said method uses low-density DNA-spotted microarrays.
Those skilled in the art will immediately recognize the many other effects and advantages of the present method and the numerous possibilities for end uses of the present invention from the detailed description and examples provided below.
Determination of the expression profile of at least 20 genes, preferably at least 100 genes listed herein provides a tool to screen for, diagnose, and also classify these diseases. The present invention allows determining or diagnosing whether subjects are afflicted with a particular form of arthritis.
Preferably, the method comprises determining in said biological sample, in particular in said synovial sample, the expression level of at least 20, at least 50, at least 100, at least 120, at least 150, at least 180, at least 200, at least 220, at least 240, at least 250, at least 260, at least 264, at least 270, or at least 300 genes or fragments thereof, preferably up to 2059 genes or fragments thereof.
In an embodiment, the method comprises determining in said biological sample, in particular in said synovial sample, the expression level of all the genes or fragments thereof listed in Table 1. In an embodiment, the method comprises determining in said biological sample, in particular in said synovial sample, the expression level of all the genes or fragments thereof listed in Table 2. In an embodiment, the method comprises determining in said biological sample, in particular in said synovial sample, the expression level of all the genes or fragments thereof listed in Table 3. In an embodiment, the method comprises determining in said biological sample, in particular in said synovial sample, the expression level of all the genes or fragments thereof listed in Table 4.
In an embodiment, said method comprises:
More preferably, said method comprises determining in said biological sample the expression level of at least 20, at least 50, at least 100, at least 120, at least 150, at least 180, at least 200, at least 220, at least 240, at least 250, at least 260, at least 264, at least 270, or at least 300 genes or fragments thereof (preferably up to 2059 genes or fragments thereof) selected from the 2059 genes listed in Table 1 that were found by ANOVA to be differentially expressed between the five conditions.
The method also comprises determining in said biological sample the expression level of at least 20, at least 50, at least 100, at least 120, at least 150, at least 180, at least 200, at least 220, at least 240, at least 250, at least 260, at least 270, or at least 300 genes or fragments thereof (preferably up to 2059 genes or fragments thereof) selected from the groups (a), (b), (c), (d), (e), (f), (g), (h), (i), and (j), wherein group (a) comprises PLAC8, IRF4, SAMD3, HLA-DOB, EOMES, PDK1, BG548679, AK25048, AU146285, P2RY8, AI225238, JAK3, LAX, PTPN7, RP26, TRIM, SLAMF1, PTPRCAP, LCK, PTP4A3, AI825068, BCL11B, CD79A, IL7R, GPR18, STRBP, C20orf103, AA732944, ZAP70, SLC38A1, RUNX3, TOSO, BCL11A, NELL2, ICAM3, LTB, TCF7, TRD@; wherein group (b) comprises ANK3, CREB5, FAH, FKBP7, AF009316, KLF4, ANKH, NTN4, THBS4, SCARA3, CYP3A5, MGP, AMN, CMAH, FN1, GABRA4, ABCC6, BE503823, SLC24A6; wherein group (c) comprises CXCL11, LOC113730, cig5, STAT1, GM2A, G1P3, IFIT4, EIF5A, EPSTI1, MX1, IFI44L, IFI27, LOC129607, G1P2, IFIT1, FLJ39885, OAS2, FLJ20637, OAS1, MDA5, OAS3, LILRA2, TGFBRAP1, BST2, OASL, CEB1, HLA-DOA, KIS, GNB4, GM2A, CLECSF12, AW262311, CALR, FPRL2, MAP3K2, FLJ20668, CYBB, SNX10, GRB2, GPR43, FLJ20035, C1QG, ARF6, IFI35, GM2A, FLJ21069, BE674143; wherein group (d) comprises ZNF607, X07868, AU157716, CCDC3, CPSF1, AW029203, AK022838, CCNL2, C7orf10, SEC24D, AFG3L1, TLE2, PCSK5, AA706701, C18orf18, OSBPL6, BC042472, AUTS2, SOX4, PTPRD, AA572675, COL18A1, COL16A1, GPM6B, SCG2, TNC, TP53I11, SCRG1, COL12A1, AL832806, AA912540, MYST4, STEAP, ARNT2, AA854843, KIAA0484, AU147442, PKN3, SYNE1, DSPG3, AW903934, FBXO26, LOC200772, AF116709; wherein group (e) comprises C7orf10, SCRG1, DNAJC12, FNDC4, AK024204, NBL1, BF591996, ANKH, DIO2, SGCD, T90703, SPOCK, CKLFSF4, AW162210, CDC42BPA, KIAA1171, FKSG17, N73742, ZNF515, PTPRS, CTDSPL, NRP2, EXTL2, CCND1, CALU, PVR, ATBF1, AFURS1, SYNPO, MYO7A, KIAA1450, SLC35B4; wherein group (f) comprises BCL2L11, MBNL2, LOC113730, TXNIP, SRPR, RNASET2, DHX9, RUNX2; wherein group (g) comprises CYP3A5, MSI2, FZD8, AW265065, DBC-1, FN1, ANGPTL2, FBI4, C14orf131, BF057799, KLF4, SRPR, TTC3, COPA, PCDHGA11; wherein group (h) comprises IRF4, CXCL13, AI823917, BCL11B, AA789123, cig5, PLAC8, CD209, IAN4L1, G1P3, NOD3, KIAA1268, HRMT1L1, AI821404, G1P2, COPG, FLJ33814, H963, TAP1, PTP4A3, CUL5, JAK3, AW504569, AA809449, CCL8, ZC3HDC1, SYNCOILIN, KIAA0090, GGA2, NAP1L, ETV6; wherein group (i) comprises AW903934, AL037998, AK024712, AW612461, BM873997, AK000795, FLJ00133, FOXC1, MGC43690, BF590303, AI090764, BF221547, ID4, AW296081, PGF, RGS5, W80359, AF086069, FLJ32949, RAMP3, T87730, AI742685, GPR4, GRID1, FAM20A, FRMD4, RUNX2, FLJ13089, TBX2, RAPGEF2, BM353142; wherein group (j) comprises SLC1A4, LIG3, EMILIN2, ALDH1A1, TNFSF8, PB1, SNX10, AA365670, YWHAZ, EIF5A, ZNF581, BAG4, ARF6, HLA-DOA, LILRA2, CWF19L1, SLC15A2, PTPRC, GM2A, STRN3, CLECSF12, TCL1A, PKHD1L1 and CTLA4.
Preferably said at least 100 genes are selected from these 2059 genes, based on their ability to define a specific gene signature for each disease. Preferably the method comprises determining the level of expression of 100 genes or fragments thereof selected from the group comprising AI923633; ARPC4; ASH1L; AW008502; AW296081; AW612461; BAG2; BCL11B; BE676335; BG231773; BSG; BTBD1; C12orf30/FLJ13089; C14orf131; CALU; CCL5; CCND1; CD79A; CDC42BPA; CDC42SE1/SPEC1; CHD1; CHD9/FLJ12178; CPSF1; CST7; CTDSPL; DUT; EIF3EIP/EIF3S6IP; ETV6; EXTL2; FLJ21395 fis; FN1; FRMD4A; G3BP1; GPR4; GRID1; GYG1; HMGB1; IFI27; IFI6/G1P3; IFIT3/IFIT4; IL7R; JAK3; JOSD3/MGC5306; KIAA0090; KIAA1128; KIAA1377; KIAA1450; LCK; LOC129607; MYEOV2/LOC150678; NBL1; NELL2; OAS1; PARP12/ZC3HDC1; PDE5A; PGF; PHF21A/BHC80; PIK3C2A; PTBP1; PTEN; PTPN7; QKI; RAB8A; RALGPS2; RAP2A; RAPGEF2; RASGRP1; RBBP6; RGS5; RPL4; RSAD2/cig5; SFRS2B/SRP46; SFRS6; SIPA1L3; SLC15A2; SPARCL1; SUPT16H; SYNCOILIN; TARP /// TRGC2 /// TRGV9; TBC1D20/C20orf140; TBC1D24/KIAA1171; THRAP3; TI-227H /// TUG1; TLE2; TMEM43/MGC3222; TNFSF8; TOX; TRBC1 /// TRBV19; TSPAN3/TM4SF8; UCKL1/URKL1; WDR90/LOC197336; ZFHX3/ATBF1; T87730.
Preferably said at least 20 genes are selected from these 2059 genes, based on their ability to define a specific gene signature for each disease. Preferably the method comprises determining the level of expression of 20 genes or fragments thereof selected from the group comprising FLJ21395 fis, TTC3, NELL2, PTPN7, HLA-DOA, GPR171, COPA, KIAA0484, TRAT1, FNDC4, BCL11B, C14orf131, FKBP7, TBC1D24, AL037998, AI225238, LOC113730, AA789123, KIAA1377, AW504569.
In a particular embodiment, the present invention provides a method of determining and classifying a rheumatic condition in at least one biological sample of a subject afflicted with said rheumatic condition, said method comprising:
According to the present invention, the clustering of said gene expression profiles is performed based on the information of differentially-expressed genes listed herein.
In a preferred embodiment, the analyses are based on the levels of expression of all the genes described in the categories (a), (b), (c), (d), (e), (f), (g), (h), (i) and (j) listed herein.
According to the present invention, group (a) comprises genes more specifically over-expressed in RA: PLAC8, IRF4, SAMD3, HLA-DOB, EOMES, PDK1, BG548679, AK25048, AU146285, P2RY8, AI225238, JAK3, LAX, PTPN7, RP26, TRIM, SLAMF1, PTPRCAP, LCK, PTP4A3, AI825068, BCL11B, CD79A, IL7R, GPR18, STRBP, C20orf103, AA732944, ZAP70, SLC38A1, RUNX3, TOSO, BCL11A, NELL2, ICAM3, LTB, TCF7, TRD@;
group (b) comprises genes more specifically down-regulated in RA: ANK3, CREB5, FAH, FKBP7, AF009316, KLF4, ANKH, NTN4, THBS4, SCARA3, CYP3A5, MGP, AMN, CMAH, FN1, GABRA4, ABCC6, BE503823, SLC24A6;
group (c) comprises genes more specifically over-expressed in SLE: CXCL11, LOC113730, cig5, STAT1, GM2A, G1P3, IFIT4, EIF5A, EPSTI1, MX1, IFI44L, IFI27, LOC129607, G1P2, IFIT1, FLJ39885, OAS2, FLJ20637, OAS1, MDA5, OAS3, LILRA2, TGFBRAP1, BST2, OASL, CEB1, HLA-DOA, KIS, GNB4, GM2A, CLECSF12, AW262311, CALR, FPRL2, MAP3K2, FLJ20668, CYBB, SNX10, GRB2, GPR43, FLJ20035, C1QG, ARF6, IFI35, GM2A, FLJ21069, BE674143;
group (d) comprises genes more specifically down-regulated in SLE: ZNF607, X07868, AU157716, CCDC3, CPSF1, AW029203, AK022838, CCNL2, C7orf10, SEC24D, AFG3L1, TLE2, PCSK5, AA706701, C18orf18, OSBPL6, BC042472, AUTS2, SOX4, PTPRD, AA572675, COL18A1, COL16A1, GPM6B, SCG2, TNC, TP53I11, SCRG1, COL12A1, AL832806, AA912540, MYST4, STEAP, ARNT2, AA854843, KIAA0484, AU147442, PKN3, SYNE1, DSPG3, AW903934, FBXO26, LOC200772, AF116709;
group (e) comprises genes more specifically over-expressed in OA: C7orf10, SCRG1, DNAJC12, FNDC4, AK024204, NBL1, BF591996, ANKH, DIO2, SGCD, T90703, SPOCK, CKLFSF4, AW162210, CDC42BPA, KIAA1171, FKSG17, N73742, ZNF515, PTPRS, CTDSPL, NRP2, EXTL2, CCND1, CALU, PVR, ATBF1, AFURS1, SYNPO, MYO7A, KIAA1450, SLC35B4;
group (f) comprises genes more specifically down-regulated in OA: BCL2L11, MBNL2, LOC113730, TXNIP, SRPR, RNASET2, DHX9, RUNX2;
group (g) comprises genes more specifically over-expressed in MIC: CYP3A5, MSI2, FZD8, AW265065, DBC-1, FN1, ANGPTL2, FBI4, C14orf131, BF057799, KLF4, SRPR, TTC3, COPA, PCDHGA11;
group (h) comprises genes more specifically down-regulated in MIC: IRF4, CXCL13, AI823917, BCL11B, AA789123, cig5, PLAC8, CD209, IAN4L1, G1P3, NOD3, KIAA1268, HRMT1L1, AI821404, G1P2, COPG, FLJ33814, H963, TAP1, PTP4A3, CUL5, JAK3, AW504569, AA809449, CCL8, ZC3HDC1, SYNCOILIN, KIAA0090, GGA2, NAP1L, ETV6;
group (i) comprises genes more specifically over-expressed in SA: AW903934, AL037998, AK024712, AW612461, BM873997, AK000795, FLJ00133, FOXC1, MGC43690, BF590303, AI090764, BF221547, ID4, AW296081, PGF, RGS5, W80359, AF086069, FLJ32949, RAMP3, T87730, AI742685, GPR4, GRID1, FAM20A, FRMD4, RUNX2, FLJ13089, TBX2, RAPGEF2, BM353142; and
group (j) comprises genes more specifically down-regulated in SA: SLC1A4, LIG3, EMILIN2, ALDH1A1, TNFSF8, PB1, SNX10, AA365670, YWHAZ, EIF5A, ZNF581, BAG4, ARF6, HLA-DOA, LILRA2, CWF19L1, SLC15A2, PTPRC, GM2A, STRN3, CLECSF12, TCL1A, PKHD1L1, CTLA4.
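The grouping above lends itself to a simple lookup structure. The following is a hedged sketch in which the mapping of groups (a) to (j) to a condition and a direction follows the text, while the scoring rule (mean of the over-expressed group minus mean of the down-regulated group), the function name `signature_scores`, the truncation of each gene list to three entries, and all expression values are assumptions made purely for illustration.

```python
# Group -> (condition, direction), as stated for groups (a) to (j).
GROUP_MEANING = {
    "a": ("RA", "up"), "b": ("RA", "down"),
    "c": ("SLE", "up"), "d": ("SLE", "down"),
    "e": ("OA", "up"), "f": ("OA", "down"),
    "g": ("MIC", "up"), "h": ("MIC", "down"),
    "i": ("SA", "up"), "j": ("SA", "down"),
}

# Gene lists truncated to three entries per group, and to four groups, for brevity.
GROUP_GENES = {
    "a": ["PLAC8", "IRF4", "SAMD3"],
    "b": ["ANK3", "CREB5", "FAH"],
    "c": ["CXCL11", "STAT1", "MX1"],
    "d": ["ZNF607", "CCDC3", "CPSF1"],
}


def signature_scores(profile):
    """Score each condition as mean(up genes) - mean(down genes).

    `profile` maps gene name -> centred log expression value.
    This scoring rule is an illustrative assumption, not the claimed method.
    """
    scores = {}
    for group, genes in GROUP_GENES.items():
        condition, direction = GROUP_MEANING[group]
        values = [profile[g] for g in genes if g in profile]
        if not values:
            continue
        group_mean = sum(values) / len(values)
        signed = group_mean if direction == "up" else -group_mean
        scores[condition] = scores.get(condition, 0.0) + signed
    return scores


# Hypothetical centred log-expression values for one sample.
profile = {
    "PLAC8": 1.0, "IRF4": 2.0, "SAMD3": 3.0,
    "ANK3": -1.0, "CREB5": -1.0, "FAH": -1.0,
    "CXCL11": 0.0, "STAT1": 0.0, "MX1": 0.0,
    "ZNF607": 1.0, "CCDC3": 1.0, "CPSF1": 1.0,
}
scores = signature_scores(profile)
```

Such a per-condition score is only a convenient summary; the method described herein relies on the clustering of the whole expression pattern rather than on any single score.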
As used herein the term “clustering” refers to the activity of collecting, assembling and/or uniting into a cluster or clusters items with the same or similar elements, a “cluster” referring to a group or number of the same or similar items, i.e. gene expression profiles, gathered or occurring closely together based on similarity of characteristics.
The process of clustering used in a method of the present invention may be any mathematical process known to compare items for similarity in characteristics, attributes, properties, qualities, effects, parameters, etc. Statistical analysis, such as for instance multivariate analysis, or other methods of analysis may be used. Preferred methods of analysis include self-organizing maps, hierarchical clustering, multidimensional scaling, principal component analysis, supervised learning, k-nearest neighbors, support vector machines and the like.
In an embodiment, the clustering step is performed according to a statistical procedure comprising hierarchical clustering selected from complete linkage clustering, average linkage clustering and/or single linkage clustering, using at least one metric selected from Euclidean distance; Manhattan distance; average dot product; Pearson correlation; Pearson uncentered; Pearson squared; cosine correlation; covariance value; Spearman rank correlation; Kendall's tau; or mutual information. Preferably the present method comprises performing supervised hierarchical clustering analysis. Pearson correlation coefficients are calculated between pairs of samples.
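For illustration, average linkage hierarchical clustering with a Pearson-based distance (1 minus the correlation coefficient) may be sketched as follows. This is plain agglomerative clustering written for readability, not the supervised variant preferred above; the function names, sample names and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation; assumes neither profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until `n_clusters` remain.

    `profiles` maps sample name -> expression profile.
    Returns a sorted list of sorted lists of sample names.
    """
    names = list(profiles)
    # Pairwise distance between samples: 1 - Pearson correlation.
    dist = {
        frozenset(pair): 1.0 - pearson(profiles[pair[0]], profiles[pair[1]])
        for pair in combinations(names, 2)
    }

    def linkage(c1, c2):
        # Average linkage: mean distance over all cross-cluster sample pairs.
        return sum(dist[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(sorted(c) for c in clusters)


# Toy profiles (hypothetical values).
profiles = {
    "ref_RA_1": [5.0, 4.0, 1.0, 1.0],
    "ref_RA_2": [6.0, 5.0, 2.0, 1.0],
    "ref_OA_1": [1.0, 1.0, 5.0, 6.0],
    "ref_OA_2": [2.0, 1.0, 4.0, 5.0],
}
clusters = average_linkage(profiles, n_clusters=2)
```

Complete or single linkage would be obtained by replacing the mean in `linkage` with the maximum or the minimum, respectively, and any of the other listed metrics could be substituted for the Pearson-based distance.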
In a preferred embodiment, the method of the invention comprises the steps of determining the level of expression of the following genes (that define a specific signature in each disorder): PLAC8, IRF4, SAMD3, HLA-DOB, EOMES, PDK1, BG548679, AK25048, AU146285, P2RY8, AI225238, JAK3, LAX, PTPN7, RP26, TRIM, SLAMF1, PTPRCAP, LCK, PTP4A3, AI825068, BCL11B, CD79A, IL7R, GPR18, STRBP, C20orf103, AA732944, ZAP70, SLC38A1, RUNX3, TOSO, BCL11A, NELL2, ICAM3, LTB, TCF7, TRD@; ANK3, CREB5, FAH, FKBP7, AF009316, KLF4, ANKH, NTN4, THBS4, SCARA3, CYP3A5, MGP, AMN, CMAH, FN1, GABRA4, ABCC6, BE503823, SLC24A6; CXCL11, LOC113730, cig5, STAT1, GM2A, G1P3, IFIT4, EIF5A, EPSTI1, MX1, IFI44L, IFI27, LOC129607, G1P2, IFIT1, FLJ39885, OAS2, FLJ20637, OAS1, MDA5, OAS3, LILRA2, TGFBRAP1, BST2, OASL, CEB1, HLA-DOA, KIS, GNB4, CLECSF12, AW262311, CALR, FPRL2, MAP3K2, FLJ20668, CYBB, SNX10, GRB2, GPR43, FLJ20035, C1QG, ARF6, IFI35, FLJ21069, BE674143; ZNF607, X07868, AU157716, CCDC3, CPSF1, AW029203, AK022838, CCNL2, C7orf10, SEC24D, AFG3L1, TLE2, PCSK5, AA706701, C18orf18, OSBPL6, BC042472, AUTS2, SOX4, PTPRD, AA572675, COL18A1, COL16A1, GPM6B, SCG2, TNC, TP53I11, COL12A1, AL832806, AA912540, MYST4, STEAP, ARNT2, AA854843, KIAA0484, AU147442, PKN3, SYNE1, DSPG3, AW903934, FBXO26, LOC200772, AF116709; SCRG1, DNAJC12, FNDC4, AK024204, NBL1, BF591996, DIO2, SGCD, T90703, SPOCK, CKLFSF4, AW162210, CDC42BPA, KIAA1171, FKSG17, N73742, ZNF515, PTPRS, CTDSPL, NRP2, EXTL2, CCND1, CALU, PVR, ATBF1, AFURS1, SYNPO, MYO7A, KIAA1450, SLC35B4; BCL2L11, MBNL2, TXNIP, RNASET2, DHX9, RUNX2; MSI2, FZD8, AW265065, DBC-1, ANGPTL2, FBI4, C14orf131, BF057799, SRPR, TTC3, COPA, PCDHGA11; CXCL13, AI823917, AA789123, CD209, IAN4L1, NOD3, KIAA1268, HRMT1L1, AI821404, COPG, FLJ33814, H963, TAP1, CUL5, AW504569, AA809449, CCL8, ZC3HDC1, SYNCOILIN, KIAA0090, GGA2, NAP1L, ETV6, AL037998, AK024712, AW612461, BM873997, AK000795, FLJ00133, FOXC1, MGC43690, BF590303, AI090764, BF221547, ID4, AW296081, PGF, RGS5, W80359, 
AF086069, FLJ32949, RAMP3, T87730, AI742685, GPR4, GRID1, FAM20A, FRMD4, FLJ13089, TBX2, RAPGEF2, BM353142; SLC1A4, LIG3, EMILIN2, ALDH1A1, TNFSF8, PB1, AA365670, YWHAZ, ZNF581, BAG4, CWF19L1, SLC15A2, PTPRC, STRN3, TCL1A, PKHD1L1, CTLA4; and identifying whether the subject's synovial sample has a pattern or profile of expression of said genes which correlates with the presence of a rheumatic condition such as SLE, OA, RA, MIC or SA by clustering analysis compared to reference samples.
In a preferred embodiment, the method of the invention comprises the steps of determining the level of expression of the following genes (that define a specific signature in each disorder): AI923633; ARPC4; ASH1L; AW008502; AW296081; AW612461; BAG2; BCL11B; BE676335; BG231773; BSG; BTBD1; C12orf30/FLJ13089; C14orf131; CALU; CCL5; CCND1; CD79A; CDC42BPA; CDC42SE1/SPEC1; CHD1; CHD9/FLJ12178; CPSF1; CST7; CTDSPL; DUT; EIF3EIP/EIF3S6IP; ETV6; EXTL2; FLJ21395 fis; FN1; FRMD4A; G3BP1; GPR4; GRID1; GYG1; HMGB1; IFI27; IFI6/G1P3; IFIT3/IFIT4; IL7R; JAK3; JOSD3/MGC5306; KIAA0090; KIAA1128; KIAA1377; KIAA1450; LCK; LOC129607; MYEOV2/LOC150678; NBL1; NELL2; OAS1; PARP12/ZC3HDC1; PDE5A; PGF; PHF21A/BHC80; PIK3C2A; PTBP1; PTEN; PTPN7; QKI; RAB8A; RALGPS2; RAP2A; RAPGEF2; RASGRP1; RBBP6; RGS5; RPL4; RSAD2/cig5; SFRS2B/SRP46; SFRS6; SIPA1L3; SLC15A2; SPARCL1; SUPT16H; SYNCOILIN; TARP /// TRGC2 /// TRGV9; TBC1D20/C20orf140; TBC1D24/KIAA1171; THRAP3; TI-227H /// TUG1; TLE2; TMEM43/MGC3222; TNFSF8; TOX; TRBC1 /// TRBV19; TSPAN3/TM4SF8; UCKL1/URKL1; WDR90/LOC197336; ZFHX3/ATBF1; T87730, and identifying whether the subject's synovial sample has a pattern or profile of expression of said genes which correlates with the presence of a rheumatic condition such as SLE, OA, RA, MIC or SA by clustering analysis compared to reference samples.
In a preferred embodiment, the method of the invention comprises the steps of determining the level of expression of the following genes (that define a specific signature in each disorder): FLJ21395 fis, TTC3, NELL2, PTPN7, HLA-DOA, GPR171, COPA, KIAA0484, TRAT1, FNDC4, BCL11B, C14orf131, FKBP7, TBC1D24, AL037998, AI225238, LOC113730, AA789123, KIAA1377, AW504569, and identifying whether the subject's synovial sample has a pattern or profile of expression of said genes which correlates with the presence of a rheumatic condition such as SLE, OA, RA, MIC or SA by clustering analysis compared to reference samples.
In an embodiment, the step of providing reference profiles for OA, RA, SA, SLE and MIC comprises the steps of: providing a plurality of reference samples from a plurality of reference subjects afflicted by OA, RA, SA, SLE and MIC; providing reference profiles by establishing a gene expression profile for each of said reference samples individually; clustering said individual reference profiles according to a statistical procedure comprising hierarchical clustering and Pearson correlation coefficient analysis; and assigning an OA, RA, SA, SLE or MIC class to each cluster.
In an embodiment, the method comprises determining the level of expression of the above listed genes, performing supervised hierarchical clustering analysis, measuring the correlation coefficient and identifying whether the subject's sample has a pattern or profile of expression of said genes which correlates with the presence of a rheumatic condition.
The present invention provides a method for diagnosing SLE, OA, RA, SA, or MIC in a subject afflicted by an undefined rheumatic condition, comprising: producing a classification for several SLE, OA, RA, SA, and MIC reference samples using the genes listed herein; defining cluster-specific genes for each cluster by selecting those genes of which the expression level characterizes the clustered position of the corresponding SLE, OA, RA, SA, or MIC class; determining the level of expression of at least 20, preferably at least 100, of said cluster-specific genes in a subject afflicted with a rheumatic condition; and establishing whether the level of expression of said cluster-specific genes in said subject shares sufficient similarity to the level of expression that characterizes an SLE, OA, RA, SA, or MIC class to thereby determine the presence of a specific rheumatic condition corresponding to said class in said subject.
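The final step, assigning a class to each cluster of reference profiles, can be sketched as follows under the assumption that each cluster receives the diagnosis most common among its members. The majority rule, the function name `label_clusters`, the `purity` field and the sample identifiers are hypothetical and are not taken from the description.

```python
from collections import Counter


def label_clusters(clusters, diagnosis):
    """Assign to each cluster the most frequent known diagnosis of its members.

    `clusters` is a list of lists of reference sample names;
    `diagnosis` maps reference sample name -> established clinical diagnosis.
    """
    labelled = []
    for members in clusters:
        counts = Counter(diagnosis[m] for m in members)
        label, votes = counts.most_common(1)[0]
        labelled.append({
            "label": label,
            "members": list(members),
            # Fraction of members that carry the assigned diagnosis.
            "purity": votes / len(members),
        })
    return labelled


# Toy clustering result and clinical diagnoses (hypothetical).
clusters = [["s1", "s2", "s3"], ["s4", "s5"]]
diagnosis = {"s1": "RA", "s2": "RA", "s3": "SA", "s4": "OA", "s5": "OA"}
labelled = label_clusters(clusters, diagnosis)
```

A low purity value would indicate that the chosen genes do not separate the reference conditions cleanly in that cluster.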
As used herein the term “biological sample” refers to a sample that comprises a biomolecule that permits the expression level of a gene to be determined. Representative biomolecules include, but are not limited to total RNA, mRNA, and polypeptides, and derivatives of these molecules such as cDNAs and ESTs. As such, a biological sample can comprise a cell or a group of cells. Preferably, said biological sample is a synovial sample, more preferably a knee synovial sample.
As used herein the term “subject” refers to any vertebrate species. Preferably, the term subject encompasses warm-blooded vertebrates, more preferably mammals. More particularly contemplated are mammals such as humans, as well as animals such as carnivores other than humans (such as cats and dogs), swine (pigs, hogs, and wild boars), poultry, ruminants (such as cattle, oxen, sheep, giraffes, deer, goats, bison, and camels), and horses.
For example, said rheumatic condition is determined as SLE when the gene expression profile of the at least 20, at least 50, at least 100, at least 120, at least 150, at least 180, at least 200, at least 220, at least 240, at least 250, at least 260, at least 264, at least 270, or at least 300 genes, preferably up to 2059 genes, is similar to that of known SLE samples. This includes, but is not limited to, up-regulation of interferon-induced genes, down-regulation of genes involved in ECM homeostasis and distinct other patterns of expression of all the genes present on a slide. The same is true for RA (up- and down-regulation of specific groups of genes compared to each other), OA, SA and MIC.
When describing the invention, the terms used are to be construed in accordance with the following definitions, unless a context dictates otherwise:
As used in the specification and the appended claims, the singular forms “a”, “an,” and “the” include plural referents unless the context clearly dictates otherwise. By way of example, “a method” means one method or more than one method.
The term “and/or” as used in the present specification and in the claims implies that the phrases before and after this term are to be considered either as alternatives or in combination.
As used herein, the term “profile” refers to a repository of the expression level data that can be used to compare the expression levels of different genes among various subjects.
As used herein the term “gene” encompasses sequences including, but not limited to, a coding sequence, a promoter region, a transcriptional regulatory sequence, a non-expressed DNA segment that is a specific recognition sequence for regulatory proteins, a non-expressed DNA segment that contributes to gene expression, a DNA segment designed to have desired parameters, sense and anti-sense strands of genomic DNA (i.e. including any introns occurring therein), EST, RNA generated by transcription of genomic DNA (i.e. prior to splicing), RNA generated by splicing of RNA transcribed from genomic DNA, and proteins generated by translation of spliced RNA (e.g. including proteins both before and after cleavage of normally cleaved regions such as transmembrane signal sequences), cDNA made by reverse transcription of an RNA generated by transcription of genomic DNA (including spliced RNA) and fragments thereof, or combinations thereof.
As used herein the term “fragment” shall be understood to mean a nucleic acid that is the same as part of, but not all of a nucleic acid that forms a gene. The term “fragment” also encompasses a part, but not all of an intergenic region.
The terms “increased expression” and “decreased expression” refer to expression of the gene in a sample at a greater or lesser level, respectively, than the level of expression of said gene (e.g. at least two-fold greater or lesser level) in a diseased control (reference sample). The gene is said to be up-regulated or over-expressed, or down-regulated or under-expressed, if the gene is present at a greater or lesser level, respectively, than the level in a diseased control. Expression of a gene in a sample is “significantly” higher or lower than the level of expression of a gene in a diseased control if the level of expression of the gene is greater or less, respectively, than the level by an amount greater than the standard error of the assay employed to assess expression, and preferably at least twice, and more preferably three, four, five or ten times that amount. Alternately, expression of the gene in the sample can be considered “significantly” higher or lower than the level of expression in a diseased control if the level of expression is at least about two, and preferably at least about three, four, or five times, higher or lower, respectively, than the level of expression of the gene in said diseased control.
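By way of illustration only, the fold-change criterion defined above can be sketched as follows. The function name, the default two-fold threshold and the numeric levels are hypothetical and are not taken from the examples; the sketch assumes strictly positive, linear-scale expression levels.

```python
def classify_expression(sample_level, control_level, min_fold=2.0):
    """Label a gene as 'increased', 'decreased' or 'unchanged'.

    sample_level and control_level are linear-scale expression levels
    (hypothetical values); min_fold is the assumed fold-change threshold.
    """
    if sample_level <= 0 or control_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = sample_level / control_level
    if ratio >= min_fold:
        return "increased"
    if ratio <= 1.0 / min_fold:
        return "decreased"
    return "unchanged"
```

A threshold of three, four or five can be passed as `min_fold` to apply the stricter criteria mentioned above.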
The present invention provides arrays comprising probes for detection of polynucleotides (transcriptional state) or for detection of proteins (translational state) in order to detect differentially-expressed genes of the invention. By “array” is intended a solid support or substrate with peptide or nucleic acid probes attached to said support or substrate. Arrays typically comprise a plurality of different nucleic acid or peptide capture probes that are coupled to a surface of a substrate in different, known locations. These arrays, also described as “microarrays” or colloquially “chips” have been generally described in the art. These arrays may generally be produced using mechanical synthesis methods or light directed synthesis methods which incorporate a combination of photolithographic methods and solid phase synthesis methods.
In one embodiment of the invention, microarrays are provided and used to measure the values to be included in the expression profiles. Microarrays are particularly well suited for this purpose because of the reproducibility between different experiments. In an embodiment, the step of determination of the level of expression is performed using a DNA-microarray (also referred to as a gene chip array), preferably a low-density DNA-spotted microarray. As used herein, a low-density DNA-spotted microarray comprises spotted probes suitable for hybridizing with from at least 20, or at least 100, to 5000 genes or fragments thereof, preferably from at least 20 or at least 100 to 3000 genes or fragments thereof, more preferably from at least 20 or at least 100 to 2050 genes or fragments thereof, even more preferably from at least 100 to 500 genes, even more preferably from at least 20 to 500 genes.
Preferably, said method involves clustering of gene expression profiles based on, for instance, DNA-microarray-acquired values for hybridization intensities for each gene.
The skilled person is capable of designing oligonucleotide probes that can be used in methods of the present invention. Preferably, such probes are immobilized on a solid surface so as to form an oligonucleotide microarray of the invention. The oligonucleotide probes useful in methods of the present invention are capable of hybridizing under stringent conditions to the at least 20, at least 50, at least 100, at least 120, at least 150, at least 180, at least 200, at least 220, at least 240, at least 250, at least 260, at least 264, at least 270, or at least 300 (preferably to up to 2059) rheumatic conditions-associated nucleic acids as described herein.
In some embodiments, each probe in the array detects a nucleic acid molecule selected from the nucleic acid molecules listed in Tables 1, 2, 3 or 4.
Although a planar array surface is preferred, the array may be fabricated on a surface of virtually any shape or even a multiplicity of surfaces. Arrays may be peptides or nucleic acids on beads, gels, polymeric surfaces, and fibers such as fiber optics, glass or any other appropriate substrate. Arrays may be packaged in such a manner as to allow for diagnostics or other manipulation of an all-inclusive device.
Suitable probes for said microarray comprise probes for genes or fragments thereof as listed in Tables 1, 2, 3 and 4. Preferably suitable probes for said microarray comprise probes for PLAC8, IRF4, SAMD3, HLA-DOB, EOMES, PDK1, BG548679, AK025048, AU146285, P2RY8, AI225238, JAK3, LAX, PTPN7, RP26, TRIM, SLAMF1, PTPRCAP, LCK, PTP4A3, AI825068, BCL11B, CD79A, IL7R, GPR18, STRBP, C20orf103, AA732944, ZAP70, SLC38A1, RUNX3, TOSO, BCL11A, NELL2, ICAM3, LTB, TCF7, TRD@; ANK3, CREB5, FAH, FKBP7, AF009316, KLF4, ANKH, NTN4, THBS4, SCARA3, CYP3A5, MGP, AMN, CMAH, FN1, GABRA4, ABCC6, BE503823, SLC24A6; CXCL11, LOC113730, cig5, STAT1, GM2A, G1P3, IFIT4, EIF5A, EPSTI1, MX1, IFI44L, IFI27, LOC129607, G1P2, IFIT1, FLJ39885, OAS2, FLJ20637, OAS1, MDA5, OAS3, LILRA2, TGFBRAP1, BST2, OASL, CEB1, HLA-DOA, KIS, GNB4, CLECSF12, AW262311, CALR, FPRL2, MAP3K2, FLJ20668, CYBB, SNX10, GRB2, GPR43, FLJ20035, C1QG, ARF6, IFI35, FLJ21069, BE674143; ZNF607, X07868, AU157716, CCDC3, CPSF1, AW029203, AK022838, CCNL2, C7orf10, SEC24D, AFG3L1, TLE2, PCSK5, AA706701, C18orf18, OSBPL6, BC042472, AUTS2, SOX4, PTPRD, AA572675, COL18A1, COL16A1, GPM6B, SCG2, TNC, TP53I11, COL12A1, AL832806, AA912540, MYST4, STEAP, ARNT2, AA854843, KIAA0484, AU147442, PKN3, SYNE1, DSPG3, AW903934, FBXO26, LOC200772, AF116709; SCRG1, DNAJC12, FNDC4, AK024204, NBL1, BF591996, DIO2, SGCD, T90703, SPOCK, CKLFSF4, AW162210, CDC42BPA, KIAA1171, FKSG17, N73742, ZNF515, PTPRS, CTDSPL, NRP2, EXTL2, CCND1, CALU, PVR, ATBF1, AFURS1, SYNPO, MYO7A, KIAA1450, SLC35B4; BCL2L11, MBNL2, TXNIP, RNASET2, DHX9, RUNX2; MSI2, FZD8, AW265065, DBC-1, ANGPTL2, FBI4, C14orf131, BF057799, SRPR, TTC3, COPA, PCDHGA11; CXCL13, AI823917, AA789123, CD209, IAN4L1, NOD3, KIAA1268, HRMT1L1, AI821404, COPG, FLJ33814, H963, TAP1, CUL5, AW504569, AA809449, CCL8, ZC3HDC1, SYNCOILIN, KIAA0090, GGA2, NAP1L, ETV6, AL037998, AK024712, AW612461, BM873997, AK000795, FLJ00133, FOXC1, MGC43690, BF590303, AI090764, BF221547, ID4, AW296081, PGF, RGS5, W80359, 
AF086069, FLJ32949, RAMP3, T87730, AI742685, GPR4, GRID1, FAM20A, FRMD4, FLJ13089, TBX2, RAPGEF2, BM353142; SLC1A4, LIG3, EMILIN2, ALDH1A1, TNFSF8, PB1, AA365670, YWHAZ, ZNF581, BAG4, CWF19L1, SLC15A2, PTPRC, STRN3, TCL1A, PKHD1L1, CTLA4. Preferably suitable probes for said microarray comprise probes for AI923633; ARPC4; ASH1L; AW008502; AW296081; AW612461; BAG2; BCL11B; BE676335; BG231773; BSG; BTBD1; C12orf30/FLJ13089; C14orf131; CALU; CCL5; CCND1; CD79A; CDC42BPA; CDC42SE1/SPEC1; CHD1; CHD9/FLJ12178; CPSF1; CST7; CTDSPL; DUT; EIF3EIP/EIF3S6IP; ETV6; EXTL2; FLJ21395 fis; FN1; FRMD4A; G3BP1; GPR4; GRID1; GYG1; HMGB1; IFI27; IFI6/G1P3; IFIT3/IFIT4; IL7R; JAK3; JOSD3/MGC5306; KIAA0090; KIAA1128; KIAA1377; KIAA1450; LCK; LOC129607; MYEOV2/LOC150678; NBL1; NELL2; OAS1; PARP12/ZC3HDC1; PDE5A; PGF; PHF21A/BHC80; PIK3C2A; PTBP1; PTEN; PTPN7; QKI; RAB8A; RALGPS2; RAP2A; RAPGEF2; RASGRP1; RBBP6; RGS5; RPL4; RSAD2/cig5; SFRS2B/SRP46; SFRS6; SIPA1L3; SLC15A2; SPARCL1; SUPT16H; SYNCOILIN; TARP /// TRGC2 /// TRGV9; TBC1D20/C20orf140; TBC1D24/KIAA1171; THRAP3; TI-227H /// TUG1; TLE2; TMEM43/MGC3222; TNFSF8; TOX; TRBC1 /// TRBV19; TSPAN3/TM4SF8; UCKL1/URKL1; WDR90/LOC197336; ZFHX3/ATBF1; T87730.
Analysis can be conducted using, for example, average or complete linkage clustering and Pearson's correlation. Expression files can be analyzed using, for example, open-source software such as TMEV (TIGR MultiExperiment Viewer) (www.tigr.org/software), J-Express (http://www.ii.uib.no/~bjarted/jexpress) or Genesis® (genome.tugraz.at/Software/Genesis/), or using support vector machine (SVM) or leave-one-out analyses in GeneSpring.
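As a non-limiting sketch of complete linkage clustering based on Pearson's correlation, the following standard-library Python code clusters samples using the distance 1 − r. It is not the implementation used by TMEV, Genesis® or GeneSpring; the function names and the dictionary input format are assumptions made for illustration, and profiles with zero variance are not handled.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # assumes neither profile is constant

def complete_linkage(profiles, n_clusters):
    """Agglomerative clustering of samples with distance 1 - r.

    profiles: dict mapping a sample name to its list of expression values.
    Returns a list of sets of sample names.
    """
    clusters = [{name} for name in profiles]

    def dist(a, b):
        return 1.0 - pearson(profiles[a], profiles[b])

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the two farthest members.
                d = max(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters
```

Average linkage can be obtained by replacing the `max` over member pairs with their mean.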
According to the invention, the present method is performed using a plurality of genes (e.g. from 20 to 2059 genes, e.g. at least 20, at least 100, or at least 264 genes). In such methods, the level of expression in the sample of said genes as described above can be compared with the level of expression of the plurality of genes in reference samples of the same type obtained from diseased controls afflicted with rheumatic conditions, for example RA, OA, SLE, MIC, or SA.
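One simple way to picture such a comparison with reference samples is a nearest-centroid assignment by Pearson correlation, sketched below. This is a deliberately simplified stand-in and not the clustering procedure of the examples; the function names, labels and the (label, profile) input format are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # assumes neither profile is constant

def class_centroids(references):
    """Average the reference profiles of each condition.

    references: list of (label, profile) pairs, all profiles of equal length.
    """
    sums, counts = {}, {}
    for label, profile in references:
        if label not in sums:
            sums[label] = [0.0] * len(profile)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], profile)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def assign_condition(sample, references):
    """Return the best-correlated condition label and all correlation scores."""
    centroids = class_centroids(references)
    scores = {lab: pearson(sample, c) for lab, c in centroids.items()}
    return max(scores, key=scores.get), scores
```

In practice the profiles would hold normalized values for the selected genes, in the same gene order for the sample and for every reference.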
The methods of the present invention are particularly useful for subjects with identified inflammatory synovitis or other symptoms associated with rheumatic conditions.
The sample can, of course, be subjected to a variety of well-known post-collection preparative and storage techniques (e. g. fixation, storage, freezing, lysis, homogenization, DNA or RNA extraction, ultrafiltration, concentration, evaporation, centrifugation, etc.) prior to determining the level of expression in the sample.
Expression of a gene according to the invention may be assessed by any of a wide variety of well-known methods for detecting expression of a protein or transcribed molecule. Non-limiting examples of suitable determination steps include immunological methods for detection of secreted, cell-surface, cytoplasmic, or nuclear proteins, protein purification methods, protein function or activity assays, nucleic acid hybridization methods, nucleic acid reverse transcription methods, and nucleic acid amplification methods. Such methods may also include physical methods such as liquid and gas chromatography, mass spectroscopy, nuclear magnetic resonance and other imaging technologies.
In a preferred embodiment, the step of determination of the level of expression is performed using microarray, preferably DNA-microarray, more preferably low-density DNA-spotted microarray. Suitable probes for said microarray are identified hereunder.
In particular, a mixture of transcribed polynucleotides obtained from the sample is contacted with a substrate having fixed thereto a polynucleotide complementary to or homologous with at least a portion (e.g. at least 7, 10, 15, 20, 25, 30, 40, 50, 100, 250, 296, or more nucleotide residues) of an RNA transcript encoded by a gene for use in the invention. If polynucleotides complementary to or homologous with an RNA transcript encoded by the gene for use in the invention are differentially detectable on the substrate (e.g. detectable using radioactivity, different chromophores or fluorophores) or are fixed to different selected positions, then the levels of expression of a plurality of genes can be assessed simultaneously using a single substrate.
When the assay has an internal control, which can be, for example, a known quantity of a nucleic acid derived from a gene for which the expression level is either known or can be accurately determined, unknown expression levels of other genes can be compared to the known internal control. More specifically, when the assay involves hybridizing labeled total RNA to a solid support comprising a known amount of nucleic acid derived from reference genes, an appropriate internal control could be a housekeeping gene (e. g. glucose-6-phosphate dehydrogenase or elongation factor-1), a housekeeping gene being defined as a gene for which the expression level in all cell types and under all conditions is substantially the same. Use of such an internal control allows a discrete expression level for a gene to be determined (e. g. relative to the expression of the housekeeping gene) both for the nucleic acids present on the solid support and also between different experiments using the same solid support. This discrete expression level can then be normalized to a value relative to the expression level of the control gene (for example, a housekeeping gene). As used herein, the term “normalized”, and grammatical derivatives thereof, refers to a manipulation of discrete expression level data wherein the expression level of a reference gene is expressed relative to the expression level of a control gene. For example, the expression level of the control gene can be set at 1, and the expression levels of all reference genes can be expressed in units relative to the expression of the control gene.
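The normalization described above, in which the control gene is set at 1 and all other genes are expressed relative to it, can be sketched as follows. The function name and the signal values are hypothetical; G6PD is used here only as a label for the housekeeping gene mentioned above.

```python
def normalize_to_control(levels, control_gene):
    """Express every gene relative to the control (housekeeping) gene.

    levels: dict mapping a gene name to its raw expression level.
    The control gene is set to 1.0 and all other genes are scaled accordingly.
    """
    control = levels[control_gene]
    if control <= 0:
        raise ValueError("control gene level must be positive")
    return {gene: value / control for gene, value in levels.items()}
```

For instance, with hypothetical raw signals of 200 for G6PD and 800 for STAT1, the normalized STAT1 level is 4.0 in units of the control gene.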
In one embodiment, nucleic acids isolated from a biological sample are hybridized to a microarray, wherein the microarray comprises nucleic acids corresponding to those genes to be tested as well as internal control genes. The genes are immobilized on a solid support, such that each position on the support identifies a particular gene. Solid supports include, but are not limited to nitrocellulose and nylon membranes. Solid supports can also be glass or silicon-based (i.e. gene “chips”). Any solid support can be used in the methods of the presently claimed subject matter, so long as the support provides a substrate for the localization of a known amount of a nucleic acid in a specific position that can be identified subsequent to the hybridization and detection steps.
A microarray can be assembled using any suitable method known to one of skill in the art, and any one microarray configuration or method of construction is not considered to be a limitation of the disclosure.
The present invention also encompasses a method for the determination and the classification of rheumatic conditions, said method comprising:
Also provided are kits for use in practicing the subject methods. The term “kit” as used herein refers to any combination of reagents or apparatus that can be used to perform a method of the invention.
The present invention also provides kits useful for diagnosing, treating, and monitoring the disease state in subjects affected by a rheumatic condition. In one embodiment, the invention provides a kit for the determination and the classification of rheumatic conditions, the kit comprising a low-density microarray comprising probes suitable for hybridizing with at least 20, preferably at least 100 genes or fragments thereof, selected from the genes listed in Tables 1, 2, 3 or 4. In one embodiment, said probes selectively hybridize to a sequence at least 95% identical to a sequence of a gene as shown in Tables 1, 2, 3 or 4.
In an embodiment, said microarray comprises probes suitable for hybridizing with at least 100 genes up to 2059 genes selected from the group listed in Table 1. Preferably said 100 genes are those listed in Table 3.
In a preferred embodiment, said microarray comprises probes suitable for hybridizing with at least 264 genes selected from the group comprising PLAC8, IRF4, SAMD3, HLA-DOB, EOMES, PDK1, BG548679, AK025048, AU146285, P2RY8, AI225238, JAK3, LAX, PTPN7, RP26, TRIM, SLAMF1, PTPRCAP, LCK, PTP4A3, AI825068, BCL11B, CD79A, IL7R, GPR18, STRBP, C20orf103, AA732944, ZAP70, SLC38A1, RUNX3, TOSO, BCL11A, NELL2, ICAM3, LTB, TCF7, TRD@; ANK3, CREB5, FAH, FKBP7, AF009316, KLF4, ANKH, NTN4, THBS4, SCARA3, CYP3A5, MGP, AMN, CMAH, FN1, GABRA4, ABCC6, BE503823, SLC24A6; CXCL11, LOC113730, cig5, STAT1, GM2A, G1P3, IFIT4, EIF5A, EPSTI1, MX1, IFI44L, IFI27, LOC129607, G1P2, IFIT1, FLJ39885, OAS2, FLJ20637, OAS1, MDA5, OAS3, LILRA2, TGFBRAP1, BST2, OASL, CEB1, HLA-DOA, KIS, GNB4, CLECSF12, AW262311, CALR, FPRL2, MAP3K2, FLJ20668, CYBB, SNX10, GRB2, GPR43, FLJ20035, C1QG, ARF6, IFI35, FLJ21069, BE674143; ZNF607, X07868, AU157716, CCDC3, CPSF1, AW029203, AK022838, CCNL2, C7orf10, SEC24D, AFG3L1, TLE2, PCSK5, AA706701, C18orf18, OSBPL6, BC042472, AUTS2, SOX4, PTPRD, AA572675, COL18A1, COL16A1, GPM6B, SCG2, TNC, TP53I11, COL12A1, AL832806, AA912540, MYST4, STEAP, ARNT2, AA854843, KIAA0484, AU147442, PKN3, SYNE1, DSPG3, AW903934, FBXO26, LOC200772, AF116709; SCRG1, DNAJC12, FNDC4, AK024204, NBL1, BF591996, DIO2, SGCD, T90703, SPOCK, CKLFSF4, AW162210, CDC42BPA, KIAA1171, FKSG17, N73742, ZNF515, PTPRS, CTDSPL, NRP2, EXTL2, CCND1, CALU, PVR, ATBF1, AFURS1, SYNPO, MYO7A, KIAA1450, SLC35B4; BCL2L11, MBNL2, TXNIP, RNASET2, DHX9, RUNX2; MSI2, FZD8, AW265065, DBC-1, ANGPTL2, FBI4, C14orf131, BF057799, SRPR, TTC3, COPA, PCDHGA11; CXCL13, AI823917, AA789123, CD209, IAN4L1, NOD3, KIAA1268, HRMT1L1, AI821404, COPG, FLJ33814, H963, TAP1, CUL5, AW504569, AA809449, CCL8, ZC3HDC1, SYNCOILIN, KIAA0090, GGA2, NAP1L, ETV6, AL037998, AK024712, AW612461, BM873997, AK000795, FLJ00133, FOXC1, MGC43690, BF590303, AI090764, BF221547, ID4, AW296081, PGF, RGS5, W80359, AF086069, FLJ32949, RAMP3, T87730, AI742685, 
GPR4, GRID1, FAM20A, FRMD4, FLJ13089, TBX2, RAPGEF2, BM353142; SLC1A4, LIG3, EMILIN2, ALDH1A1, TNFSF8, PB1, AA365670, YWHAZ, ZNF581, BAG4, CWF19L1, SLC15A2, PTPRC, STRN3, TCL1A, PKHD1L1, and CTLA4.
In a preferred embodiment, said microarray comprises probes suitable for hybridizing with at least 100 genes selected from the group comprising AI923633; ARPC4; ASH1L; AW008502; AW296081; AW612461; BAG2; BCL11B; BE676335; BG231773; BSG; BTBD1; C12orf30/FLJ13089; C14orf131; CALU; CCL5; CCND1; CD79A; CDC42BPA; CDC42SE1/SPEC1; CHD1; CHD9/FLJ12178; CPSF1; CST7; CTDSPL; DUT; EIF3EIP/EIF3S6IP; ETV6; EXTL2; FLJ21395 fis; FN1; FRMD4A; G3BP1; GPR4; GRID1; GYG1; HMGB1; IFI27; IFI6/G1P3; IFIT3/IFIT4; IL7R; JAK3; JOSD3/MGC5306; KIAA0090; KIAA1128; KIAA1377; KIAA1450; LCK; LOC129607; MYEOV2/LOC150678; NBL1; NELL2; OAS1; PARP12/ZC3HDC1; PDE5A; PGF; PHF21A/BHC80; PIK3C2A; PTBP1; PTEN; PTPN7; QKI; RAB8A; RALGPS2; RAP2A; RAPGEF2; RASGRP1; RBBP6; RGS5; RPL4; RSAD2/cig5; SFRS2B/SRP46; SFRS6; SIPA1L3; SLC15A2; SPARCL1; SUPT16H; SYNCOILIN; TARP /// TRGC2 /// TRGV9; TBC1D20/C20orf140; TBC1D24/KIAA1171; THRAP3; TI-227H /// TUG1; TLE2; TMEM43/MGC3222; TNFSF8; TOX; TRBC1 /// TRBV19; TSPAN3/TM4SF8; UCKL1/URKL1; WDR90/LOC197336; ZFHX3/ATBF1; T87730.
In a preferred embodiment, said microarray comprises probes suitable for hybridizing with at least 20 genes selected from the group comprising FLJ21395 fis, TTC3, NELL2, PTPN7, HLA-DOA, GPR171, COPA, KIAA0484, TRAT1, FNDC4, BCL11B, C14orf131, FKBP7, TBC1D24, AL037998, AI225238, LOC113730, AA789123, KIAA1377, AW504569.
The kit may comprise a plurality of reagents, each of which is capable of binding specifically with a nucleic acid or polypeptide corresponding to a gene for use in the invention. Suitable probes for binding with a nucleic acid (e.g. a genomic DNA, an mRNA, a spliced mRNA, a cDNA, or the like) include complementary nucleic acids. For example, the nucleic acid reagents may include oligonucleotides (labeled or non-labeled) fixed to a substrate, labeled oligonucleotides not bound with a substrate, pairs of PCR primers, molecular beacon probes, and the like.
In an embodiment, the kit comprises a nucleic acid probe that binds specifically with a gene nucleic acid or a fragment of the nucleic acid.
The kit may further comprise means for performing PCR reactions. The kit may further comprise media and solution suitable for taking a sample and for extracting RNA from said blood sample.
The kit can further comprise additional components for carrying out the method of the invention, such as RNA extraction solutions, purification columns, buffers and the like. The kit of the invention can further include any additional reagents, reporter molecules, buffers, excipients, containers and/or devices, as described herein or known in the art, required to practice a method of the invention.
The various components of the kit may be present in separate containers or certain compatible components may be pre-combined into a single container, as desired. In addition to the above components, the kits may further include instructions for practicing the present invention. These instructions may be present in the kits in a variety of forms, one or more of which may be present in the kit.
One form in which these instructions may be present is as printed information on a suitable medium or substrate, e. g., a piece or pieces of paper on which the information is printed, in the packaging of the kit, in a package insert, etc. Yet another means would be a computer readable medium, e. g., diskette, CD, etc., on which the information has been recorded. The invention also provides a computer-readable medium comprising one or more digitally encoded expression profiles, where each profile has one or more values representing the expression of said at least 100 genes that are differentially-expressed in a SLE, OA, RA, SA, or MIC disease. In an embodiment, said digitally encoded expression profiles are profiles of SLE, OA, RA, SA, or MIC reference samples. In some embodiments, the digitally-encoded expression profiles are comprised in a database.
The kits according to the invention may comprise a microarray as defined above and a computer readable medium as described above. The array comprises a substrate having addresses, where each address has a probe that can specifically bind a nucleic acid molecule (by using an oligonucleotide array) or a peptide (by using a peptide array) that is differentially-expressed in at least one SLE, OA, RA, SA, or MIC class. The results are converted into a computer-readable medium that has digitally-encoded expression profiles containing values representing the expression level of a nucleic acid molecule detected by the array. Any other convenient means may be present in the kits.
The invention also provides for the storage and retrieval of a collection of data relating to SLE, OA, RA, SA, or MIC specific gene expression data of the present invention, including sequences and expression levels in a computer data storage apparatus.
The present invention discloses at least 20, at least 50, at least 100, at least 120, at least 150, at least 180, at least 200, at least 220, at least 240, at least 250, at least 260, or at least 264 genes (preferably up to 2059, more preferably up to 2050, more preferably up to 1500 genes, more preferably up to 1000 genes, yet more preferably up to 500 genes, yet more preferably up to 300 genes, yet more preferably up to 264, 260, 250, 240, 230, 220, 210, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 50, up to 20 genes or fragments thereof) described herein that are differentially-expressed in SLE, OA, RA, SA, and MIC classes. Accordingly, these genes and their gene products are potential therapeutic targets that are useful in methods of screening test compounds to identify therapeutic compounds for the treatment of rheumatic conditions. The differentially-expressed genes of the invention may be used in cell-based screening assays involving recombinant host cells expressing the differentially-expressed gene product. The recombinant host cells are then screened to identify compounds that can activate the product of the differentially-expressed gene (i.e. agonists) or inactivate the product of the differentially-expressed gene (i.e. antagonists).
The following Tables and examples are intended to illustrate and to substantiate the present invention.
Table 1 lists about 2059 genes or fragments thereof used for classifying a rheumatic condition into defined clusters. These are suitable genes the expression profile of which can differentiate between SLE, MIC, SA, OA or RA. Accordingly, one can select at least 20, at least 100, at least 120, at least 150, at least 180, at least 200, at least 220, at least 240, at least 250, at least 260, or at least 264 or up to 2059 genes from this group to correctly classify the rheumatic condition in an individual as SLE, MIC, SA, OA or RA based on the expression profiles of the genes. The genes differentially expressed between the five conditions (n=2059) were found by ANOVA. Several sets of genes out of this list can be used to establish the correct diagnosis of biological samples. The fact that several sets of genes can be used means that the differential gene expression between the disorders is strong. The classification cannot be made by looking at absolute values of the expression of some genes. The classification uses algorithms that look at the expression of all the genes in the list in order to attribute one of the diagnoses to the biological sample.
Table 2 provides a list of preferred genes the expression profile of which can differentiate between SLE, MIC, SA, OA or RA. The genes are selected from the 2059 genes differentially expressed between the five conditions (Table 1). Accordingly, one can select all these listed genes to correctly classify the rheumatic condition in an individual as SLE, MIC, SA, OA or RA based on the expression profiles of the genes.
Table 3 provides a list of preferred genes the expression profile of which can discriminate between SLE, MIC, SA, OA or RA. Accordingly, one can select all these listed genes to correctly classify the rheumatic condition in an individual as SLE, MIC, SA, OA or RA based on the expression profiles of the genes.
Table 4 provides a list of preferred genes the expression profile of which can differentiate between SLE, MIC, SA, OA or RA. Accordingly, one can select these 20 listed genes to correctly classify the rheumatic condition in an individual as SLE, MIC, SA, OA or RA based on the expression profiles of these genes.
Patients and Synovial Biopsies:
Synovial biopsies were obtained by needle-arthroscopy from the knee of patients with SLE (n=4), RA (n=7), OA (n=5), MIC (n=5) and SA (n=4). For each patient, 4 to 8 synovial samples were snap frozen in liquid nitrogen and stored at −80° C. for later RNA extraction. The same amount of tissue was also kept at −80° C. for future immunostaining experiments on frozen sections. The remaining material was stored in formaldehyde and paraffin embedded for conventional optical evaluation and immunostaining of selected cell markers. All SLE patients met the American College of Rheumatology (ACR) revised criteria for the diagnosis of systemic lupus; they were all female, with an average age of 32.0 years (range 19-40 years). All of them had active articular disease at the time of synovial tissue sampling. None of the SLE patients was treated with immunosuppressive therapy; some of them were on non-steroidal anti-inflammatory drugs. All RA patients met the ACR criteria for the diagnosis of rheumatoid arthritis. They all had early (<1 year duration) active disease at the time of tissue sampling. They comprised 2 females and 5 males, with an average age of 51 years (range 37-69 years). Average CRP level was 25 mg/l (range 9-96 mg/l) and average DAS28-CRP score was 5.08 (range 3.76-5.82). They were not treated except with non-steroidal anti-inflammatory drugs. OA individuals were 4 females and 1 male; their average age was 63.2 years (range 51-73 years). All of them had a swollen knee at the time of the needle-arthroscopic procedure. Similarly, MIC and SA individuals were untreated at the time of the needle-arthroscopic procedure and had a swollen knee. The study was approved by the ethical committee of the Université catholique de Louvain, and informed consent was obtained from all patients.
Microarray Hybridization and Statistical Interpretation:
Total RNA was extracted from the synovial biopsies using the Nucleospin® RNA II extraction kit (Macherey-Nagel GmbH & Co, Düren, Germany), including DNase treatment of the samples. Labeling of RNA (cRNA synthesis) was performed according to a standard Affymetrix® procedure (One-Cycle Target Labeling kit, Affymetrix UK Ltd, High Wycombe, United Kingdom); briefly, total RNA was first reverse transcribed into single-stranded cDNA using a T7-Oligo(dT) Promoter Primer and Superscript II reverse transcriptase. Next, RNase H was added together with E. coli DNA polymerase I and E. coli DNA ligase, followed by a short incubation with T4 DNA polymerase in order to achieve synthesis of the second-strand cDNA. The double-stranded cDNA was purified and served as a template for the overnight in vitro transcription reaction, carried out in the presence of T7 RNA polymerase and a biotinylated nucleotide analog/ribonucleotide mix. At the end of this procedure, the biotinylated complementary RNA (cRNA) was cleaned and fragmented by a 35-minute incubation at 95° C.
GeneChip® Human genome U133 Plus 2.0 Arrays were hybridized overnight at 45° C. in monoplicates with 10 μg cRNA. The slides were then washed and stained using the EukGE-WS2v5 Fluidics protocol on the Genechip® Fluidics Station (Affymetrix) before being scanned on a Genechip® Scanner 3000. Data were retrieved on GCOS software for the initial normalization and analysis steps. The number of positive genes was between 48 and 55% on each slide. After scaling on all probe sets (to a value of 100), the amplification scale was reported between 1.1 and 2.5 for all the slides. The signals given by the poly-A RNA controls, hybridization controls and housekeeping/control genes (GAPDH 3′/5′ ratio<2) were indicative of the good quality of the amplification and hybridization procedures. Further statistical analyses were performed using the Genespring® software (Agilent Technologies Inc). For each slide, scaled data were normalized to the 50th percentile per chip and to the median per gene. The data were analyzed by ANOVA with or without Benjamini-Hochberg corrections for multiple comparisons, with a minimal fold change between RA or SLE versus OA set at two. Around 2059 genes displayed differences in expression patterns (Table 1) and can be used as synovial markers for the present invention that are useful for characterizing the five disorders. At least 310 probe sets for 264 genes that displayed significant differences in expression patterns among the samples were selected (Table 2). Supervised gene clustering studies were performed using these genes on Genesis® Software (Genesis 1.0, developed by Bioinformatics Graz) and on TMEV 3.1 Software (TIGR MultiExperiment Viewer) from The Institute for Genomic Research (TIGR). Complete linkage clustering was performed based on Pearson correlations between the selected genes. In another experiment, at least 100 probe sets that displayed significant differences in expression patterns among the samples were selected (Table 3).
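For orientation, the per-chip and per-gene normalization and the Benjamini-Hochberg correction mentioned above can be sketched in standard-library Python as follows. This is not the Genespring® implementation: it assumes that normalization means dividing by the relevant median, that all values are positive, and that the p-values come from a separately computed ANOVA; all function names are hypothetical.

```python
from statistics import median

def normalize_per_chip_and_gene(matrix):
    """matrix: dict mapping gene -> list of scaled values, one per chip.

    Step 1 divides each value by the median (50th percentile) of its chip.
    Step 2 divides each gene's row by the median of that row.
    """
    genes = list(matrix)
    n_chips = len(matrix[genes[0]])
    chip_medians = [median(matrix[g][c] for g in genes) for c in range(n_chips)]
    normalized = {}
    for g in genes:
        row = [matrix[g][c] / chip_medians[c] for c in range(n_chips)]
        gene_median = median(row)
        normalized[g] = [v / gene_median for v in row]
    return normalized

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Genes whose adjusted p-value falls below a chosen threshold and whose fold change meets the stated minimum would then be retained as candidate markers.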
GeneChip HGU133 Plus 2.0 is a high-density oligonucleotide spotted array covering the whole genome (about 50 000 genes). Its use for a diagnostic test is inappropriate because of the high cost of the procedure and the noise caused by the high number of genes. The invention relates to the development of a customized low-density array that can make a diagnosis based on the evaluation of the expression of a low number of selected genes (Tables 1, 2, 3 and 4). The selection of the genes and the development of a customized array make it possible to improve sensitivity and specificity (as compared to a high-density array) and to lower the cost, making its introduction into clinical practice possible.
Results:
Supervised clustering of the samples first distributed the samples into two clusters made of SLE and RA versus OA, SA and MIC samples. Inside the two groups, the samples also correctly clustered according to the diagnosis thereby indicating that the five disorders are characterized by distinct molecular signatures. The clustering results are shown in
A 53-year-old female patient presented with arthritis of both knees. X-rays and MRI showed severe degenerative changes of the internal femoro-tibial compartment and a severe inflammatory thickening of the synovial tissue. Biological work-up identified the presence of anti-citrulline antibodies in the serum of the patient, a marker that is associated with rheumatoid arthritis. Her rheumatologist hesitated between a diagnosis of severe OA versus atypical RA. A synovial biopsy was performed at that time.
RNA was extracted, labeled according to the Affymetrix procedure described above and hybridized on a GeneChip® Human Genome U133 Plus 2.0 Array. Data were retrieved on GCOS software for the initial normalization and analysis steps. The normalized data from this sample were used for a supervised clustering study on TMEV, together with the 25 reference samples from Example 1, using the specific selection of genes listed in Table 2 or 3.
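The principle of clustering a patient sample together with reference samples can be sketched as follows. This is an illustrative, simplified stand-in for the TMEV analysis, not a reproduction of it: the function names, the sample labels and the short expression vectors are hypothetical, and the distance between two samples is assumed to be one minus their Pearson correlation over the selected genes.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors.

    Assumes neither vector is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def complete_linkage(profiles, n_clusters):
    """Agglomerative clustering of samples with complete linkage.

    `profiles` maps a sample label to its normalized expression vector
    over the selected genes. Returns a list of sets of labels.
    """
    clusters = [{label} for label in profiles]

    def distance(a, b):
        return 1.0 - pearson(profiles[a], profiles[b])

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: the distance between two clusters is
                # the largest distance between any pair of their members.
                d = max(distance(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] |= clusters[j]  # merge the two closest clusters
        del clusters[j]
    return clusters


def cluster_of(clusters, label):
    """Return the cluster (set of labels) that contains `label`."""
    return next(c for c in clusters if label in c)


# Hypothetical toy data: two reference groups and one patient sample.
profiles = {
    "RA_ref1": [1, 2, 3, 4],
    "RA_ref2": [2, 4, 6, 9],
    "OA_ref1": [4, 3, 2, 1],
    "OA_ref2": [9, 6, 4, 2],
    "patient": [1, 3, 5, 8],
}
clusters = complete_linkage(profiles, n_clusters=2)
patient_group = cluster_of(clusters, "patient")
```

In this toy example, the reference labels found in `patient_group` indicate which disorder the patient sample clusters with.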
The results of that experiment are shown in
A 58-year-old male patient presented with chronic inflammatory knee arthritis. Synovial fluid examination had shown the presence of an inflammatory cell population (>4,000 elements/mm³); X-ray studies were not contributory. Because of the presence of an atypical rash, his rheumatologist raised the possibility that the patient suffered from seronegative (psoriatic) arthritis. Synovial biopsies were taken by needle-arthroscopy; RNA was extracted, labeled according to the Affymetrix procedure described above and hybridized on a GeneChip® Human Genome U133 Plus 2.0 Array. Data were retrieved on GCOS software for the initial normalization and analysis steps. The normalized data from this sample were used for a supervised clustering study on TMEV, together with the 25 reference samples from Example 1, using the specific selection of genes listed in Table 2 or 3. The results of the clustering study indicated that his synovial tissue did not cluster with SA samples, but with RA synovial tissue (
A 55-year-old patient with a 4-month history of undifferentiated arthritis affecting both knees and one ankle underwent several tests (biological work-up, X-rays, synovial fluid examination) in order to establish a correct diagnosis of his condition. None of these tests provided his rheumatologist with a definite diagnosis; in particular, blood tests indicated an elevated uric acid level (10 mg/dl) but were also positive for anti-citrulline antibodies (a specific marker of RA). For that reason, he underwent a needle-arthroscopic procedure allowing several knee synovial biopsies to be harvested. RNA is extracted, labeled and hybridized on a diagnostic low-density array that is spotted with a small number of probes, such as those listed in Tables 1, 2, 3 or 4, that define a specific gene signature for RA, OA, SLE, MIC or SA. Because of the small number of genes required, the low-density array according to the invention is cheap and can be routinely used in clinical practice for that purpose. The gene signature found in the synovial biopsies was that of microcrystalline arthritis. The patient was treated with local joint injections, colchicine, NSAIDs and allopurinol and went into remission.
Number | Date | Country | Kind |
---|---|---|---|
07447014.7 | Mar 2007 | EP | regional |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP08/52532 | 2/29/2008 | WO | 00 | 8/25/2009 |