The present invention relates to methods and products for the identification and diagnosis of Rheumatoid arthritis (RA), in particular for the diagnosis of anti-citrullinated peptide antibody (ACPA)-negative RA. Most particularly the invention relates to a gene expression signature comprising 12 biomarkers for use in the prognosis or diagnosis of RA.
Rheumatoid arthritis (RA) is a chronic, disabling autoimmune disease with a predilection for peripheral joints(1). The importance of prompt disease-modifying therapy in improving clinical outcomes is reinforced by international management guidelines(2). However, approximately 40% of patients with new-onset inflammatory arthritis have disease which is unclassifiable at inception, and are said to have an undifferentiated arthritis (UA)(3). Recently, a validated “prediction rule” has been developed for use amongst UA patients, whereby a composite score derived from clinical and serological data predicts risk of progression to RA(4). The scoring system relies heavily on autoantibody and, in particular, anti-citrullinated peptide antibody (ACPA) status, highlighting the specificity of circulating ACPA for RA(5). However, the diagnosis of ACPA-negative RA remains challenging in the early arthritis clinic, being frequently delayed despite application of the prediction rule(6).
Technological and computational advances have permitted high-throughput, “discovery-driven” routes to biomarker identification in clinical settings through whole-genome transcription profiling(7). Transcriptome analysis in RA has usually been limited to cross-sectional comparisons with normal controls(8, 9), with exceptions aiming to predict responsiveness to biologic agents in established disease(10). Recent work has demonstrated the potential for peripheral blood mononuclear cells (PBMCs) to yield clinically relevant prognostic “gene signatures” in autoimmune disease(11). The application of a similar, prospective, approach to the discovery of predictive biomarkers in UA should complement existing diagnostic algorithms, whilst providing new insights into disease pathogenesis(12). However, the use of PBMCs for transcriptional analysis may result in data that are biased by relative subset abundance(13). To address this, protocols for the rapid ex vivo positive selection of subsets for the purpose of transcription profiling have been validated(14), permitting scrutiny of pathophysiologically relevant cells in isolation.
Although no single cell-type is exclusively implicated in RA, many of the established and emerging genetic associations of the condition implicate the CD4+ T-cell as a key player, and anomalies in peripheral blood CD4+ T-cell phenotype are well-documented(15, 16). For example, in addition to the long-recognised association of the disease with particular MHC class II alleles that encode a conserved sequence within the peptide binding groove (“shared epitope”)(12), recent genome-wide association scans have implicated protein tyrosine phosphatase 22 (involved in T-cell receptor signalling), the IL2-receptor, the co-stimulatory molecules CD28, CTLA-4 and CD40, and the potentially lineage-defining signal transducer and activator of transcription 4 (STAT4) molecule(17). The inventors have therefore surmised that the peripheral blood (PB) CD4+ T-cell transcriptome might represent a plausible substrate for predictive biomarker discovery in early arthritis.
The following terms are used throughout this document.
A sample is any biological material obtained from an individual.
A polynucleotide is a polymeric form of nucleotides of any length. Nucleotides can be either ribonucleotides or deoxyribonucleotides. The term covers, but is not limited to, single-, double-, or multi-stranded deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), mRNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer comprising natural, chemically or biochemically modified, non-natural nucleotide bases. Such polynucleotides may include modifications such as those required to allow attachment to a solid support.
A gene is a polynucleotide sequence that comprises sequences that are expressed in a cell as RNA and control sequences necessary for the production of a transcript or precursor.
A gene expression product can be encoded by a full length coding sequence or by any portion of the coding sequence.
A probe can be a DNA molecule such as a genomic DNA or fragment thereof, an RNA molecule, a cDNA molecule or fragment thereof, a PCR product, a synthetic oligonucleotide, or any combination thereof. Said probe can be a derivative or variant of a nucleic acid molecule, such as, for example, a peptide nucleic acid molecule.
A probe can be specific for a target when it comprises a continuous stretch of nucleotides that are entirely complementary to a target nucleotide sequence (generally an RNA product of said gene, or a cDNA product thereof). However, a probe can also be considered to be specific if it comprises a continuous stretch of nucleotides that are partially complementary to a target nucleotide sequence. Partially in this instance can be taken to mean that a maximum of 10% of the nucleotides in a continuous stretch of at least 20 nucleotides differ from the corresponding nucleotide sequence of an RNA product of said gene. The term complementary is well known and refers to a sequence that is related by base-pairing rules to the target sequence. Probes will generally be designed to minimise non-specific hybridization.
Where reference is made to “one or more” or “12 or more” or “X or more” genes, this is to be understood as being for the purposes of illustration and as non-limiting (although it may illustrate best or preferred options).
Gene-lists 1-9 as referenced in the following description are provided at the end of the description. In lists 2-5, the “Illumina ID” column contains the probe address number on the Illumina WG6 (v3) BeadChip (http://www.illumina.com/support/annotation_files.ilmn). Where >1 “differentially expressed” Illumina probes appearing in a given list corresponded to a single gene entity, duplicates were removed. Uncorrected p-values are given in lists 2-5. Official gene symbols and RefSeq accession numbers are given for identification purposes. Non-linearised fold-change (FC) values are given; values >1 indicate genes up-regulated in RA relative to non-RA groups in any given comparison (values <1 indicate down-regulation). For down-regulated genes, linearised data may be obtained by taking the negative reciprocal of the non-linearised value; i.e. an FC of 0.75 corresponds to a linearised FC of −1.33. Gene lists are ranked according to FC. Any additional list-specific information is provided on the relevant pages.
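The partial-complementarity criterion described above can be illustrated with a minimal sketch. The function names, the DNA-only pairing table and the convention that the two strings are already aligned position by position are all assumptions made for illustration; they are not part of the described methods.

```python
# Watson-Crick pairing for a DNA probe facing a DNA (e.g. cDNA) target.
# Assumption for illustration: both strings are written so that position i
# of the probe faces position i of the target.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_fraction(probe: str, target: str) -> float:
    """Return the fraction of aligned positions that are not complementary."""
    if len(probe) != len(target):
        raise ValueError("probe and target stretches must be the same length")
    mismatches = sum(
        1 for p, t in zip(probe.upper(), target.upper()) if PAIR.get(p) != t
    )
    return mismatches / len(probe)


def is_specific(probe: str, target: str,
                min_len: int = 20, max_mismatch: float = 0.10) -> bool:
    """Apply the 'at least 20 nucleotides, at most 10% differing' criterion."""
    return len(probe) >= min_len and mismatch_fraction(probe, target) <= max_mismatch
```

Under these assumptions, a 20-nucleotide stretch with two non-complementary positions (10%) would still count as specific, whereas three (15%) would not.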
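The conversion between non-linearised and linearised fold-change values described above can be sketched as follows. The function name is hypothetical and the snippet is only an illustration of the negative-reciprocal rule.

```python
def linearised_fold_change(fc: float) -> float:
    """Convert a non-linearised fold change (a ratio > 0) to its linearised form.

    Values >= 1 (up-regulated or unchanged) are returned unchanged; values < 1
    (down-regulated) are replaced by their negative reciprocal.
    """
    if fc <= 0:
        raise ValueError("a fold-change ratio must be positive")
    return fc if fc >= 1 else -1.0 / fc


# Example from the text: an FC of 0.75 corresponds to about -1.33.
print(round(linearised_fold_change(0.75), 2))
```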
In order to provide further clarity to the reader, certain sequences are provided in full herein. In particular, the following sequence data is referred to herein;
According to the present invention there is provided a method of diagnosing Rheumatoid arthritis in a patient, the method comprising:
obtaining a sample comprising CD4+ T-cells from the patient; and
determining expression levels of one or more genes selected from the group consisting of
comparing said expression levels to reference expression levels, wherein a difference in expression of said one or more genes indicates an increased likelihood that the patient has Rheumatoid arthritis.
Optionally the group further comprises the gene CD40LG.
Generally the reference expression levels are representative of levels found in samples comprising cells from a patient who does not have RA.
It has been found that an increase in expression when compared to the reference expression levels indicates an increased likelihood that the patient has rheumatoid arthritis.
The inventors' work has confirmed the utility of the signature where CD4+ T-cells of >95% purity are used, and preliminary data suggest that there is some overlap where whole blood RNA (from unpurified cells) is used as substrate.
Most preferably the step of determining expression levels of one or more genes selected from the group consisting of
includes determining expression levels for all of the genes from the group.
The group may be referred to as a “12 gene signature”.
Optionally the group further comprises the gene CD40LG.
This group may be referred to as a “13 gene signature”.
It has been shown that a difference in expression when compared to the reference expression levels of all of said one or more genes indicates an increased likelihood that the patient has Rheumatoid arthritis.
According to the present invention there is provided an in vitro method for typing a sample from an individual classified as having undifferentiated arthritis, or suspected to suffer from rheumatoid arthritis, the method comprising:
obtaining a sample from the individual; and
determining expression levels of one or more genes selected from the group consisting of
typing said sample on the basis of the expression levels determined; wherein said typing provides prognostic information related to the risk that the individual has rheumatoid arthritis (RA).
Optionally the group further comprises the gene CD40LG.
Most preferably expression levels are determined by determining RNA levels.
Methods for determining mRNA levels are well established, some being described herein.
Preferably the sample comprises CD4+ T cells.
Preferably the sample is peripheral whole blood.
Preferably the methods include the step of separating CD4+ T cells from peripheral whole blood.
Preferably the methods include extracting RNA from the CD4+ T cells.
Most preferably the method is for diagnosing anti-citrullinated peptide antibody (ACPA)-negative rheumatoid arthritis.
Preferably expression levels of all of the genes in the group are determined and compared to a set of reference expression levels.
Optionally the method further comprises the step of combining the results of the 12 gene signature with the results of known prediction analysis. The 13 gene signature could be used instead of the 12 gene signature.
Preferably the known prediction analysis is the Leiden prediction rule (Reference; van der Helm-van Mil 2008 Arthritis and Rheumatism—
Using a composite of the 12 gene signature (or 13 gene signature)/Leiden prediction test maximises the specificity, precision and sensitivity of the test.
According to another aspect of the present invention there is provided a method of diagnosing rheumatoid arthritis in a patient, the method comprising:
obtaining a blood sample from the patient; and
determining expression/mRNA levels of 12 or more genes selected from the group defined in GENE LIST 2; and
comparing said expression/mRNA levels to a set of reference expression/mRNA levels, wherein a difference in expression of said 12 or more genes indicates an increased likelihood that the patient has Rheumatoid arthritis.
According to another aspect of the present invention there is provided a method of diagnosing Rheumatoid arthritis in a patient, the method comprising:
obtaining a blood sample from the patient; and
determining levels of Interleukin-6 (IL-6); and
comparing said levels to a set of reference IL-6 levels, wherein a difference in expression of IL-6 indicates an increased likelihood that the patient has Rheumatoid arthritis.
It has been found that an increase in expression of IL-6 indicates an increased likelihood that the patient has Rheumatoid arthritis.
Notably, serum IL-6 is notoriously sensitive to, for example, diurnal variation, and the inventors identified that it is useful to standardise the sampling procedure: for example, all the samples were taken between the hours of 1300 and 1630 and frozen to −80° C. within 4 hours of blood draw, undergoing no more than 1 freeze-thaw cycle.
Most preferably the method is for diagnosing anti-citrullinated peptide antibody (ACPA)-negative rheumatoid arthritis.
Preferably the results of the IL-6 expression analysis are combined with the results of known prediction analysis.
An array comprising (a) a substrate and (b) 12 or more different elements, each element comprising at least one polynucleotide that binds to a specific mRNA transcript, said mRNA transcript being of a gene selected from the group defined in GENE LIST 2.
An array comprising (a) a substrate and (b) one or more different elements, each element comprising at least one polynucleotide that binds to a specific mRNA transcript, said mRNA transcript being of a gene selected from the group comprising
Optionally the group further comprises the gene CD40LG.
An array comprising (a) a substrate and (b) 12 elements, each element comprising at least one polynucleotide that binds to an mRNA transcript, said array comprising a binding element for the mRNA of each of the following group of genes
Optionally the array further comprises an additional element comprising at least one polynucleotide that binds to an mRNA transcript for CD40LG.
Preferably the substrate is a solid substrate.
A kit comprising an array as described above and instructions for its use.
Use of a set of probes comprising polynucleotides specific for 12 or more of the genes listed in GENE LIST 2.
Use of a set of probes comprising polynucleotides specific for one or more of the genes selected from the list;
for determining the risk of an individual suffering from rheumatoid arthritis.
Optionally the set of probes further comprises a polynucleotide specific for CD40LG.
Use of a set of probes comprising polynucleotides specific for the genes selected from the list;
for determining the risk of an individual suffering from rheumatoid arthritis.
Optionally the set of probes further comprises a polynucleotide specific for CD40LG.
Use of a set of probes comprising primers specific for one or more of the genes selected from the list;
for determining the risk of an individual suffering from rheumatoid arthritis.
Optionally the set of probes further comprises a primer specific for CD40LG.
Use of a set of probes comprising primers specific for the genes selected from the list;
for determining the risk of an individual suffering from rheumatoid arthritis.
Optionally the set of probes further comprises a primer specific for CD40LG.
Most preferably the use of a set of probes is for determining the risk of an individual suffering from anti-citrullinated peptide antibody (ACPA)-negative rheumatoid arthritis.
According to a further aspect of the present invention there is provided an IL-6 receptor blocker for the treatment of RA.
This kind of biomarker would be expected to have utility in stratifying early RA patients into subgroups of therapeutic significance. For example, patients with high baseline IL-6 (and potentially also relatively highly dysregulated STAT3-inducible genes in circulating CD4+ T-cells, as a consequence) could potentially be managed more effectively using an IL-6 signalling blocker (such as tocilizumab) or a Jak1/3 inhibitor.
Optionally the IL-6 receptor blocker is tocilizumab.
Optionally the IL-6 receptor blocker is a Jak1/3 inhibitor.
In order to provide a better understanding of the present invention further details and examples will be provided below with reference to the following figures and tables;
FIG. 12—A ROC curve for ACPA-neg individuals, but also regardless of whether or not a diagnosis could be assigned at inception. 102 patients (all ACPA-negative EA clinic attendees, including those with defined outcomes at inception). Amongst all ACPA-negative early arthritis clinic attendees, an [IL-6] of ≧10 pg/ml has approx. 0.89 specificity and 0.65 sensitivity for an outcome of RA; and
FIG. 13—A ROC curve for UA patients, whether they be ACPA-pos or ACPA-neg. 61 patients (UA patients only; both ACPA+ and ACPA-); and
FIG. 14—A ROC curve for UA patients, ACPA-neg only. 48 patients (UA patients, ACPA-negative only). Amongst all ACPA-negative UA patients, an [IL-6] of ≧10 pg/ml has approx. 0.92 specificity and 0.58 sensitivity for an outcome of RA.
These examples are not to be considered as limiting.
Patients with recent onset arthritis symptoms who were naïve to disease-modifying antirheumatic drugs (DMARDs) and corticosteroids, were recruited from the Freeman Hospital early arthritis clinic (EAC), Newcastle upon Tyne, UK, between September 2006 and December 2008. A detailed clinical assessment of each patient was undertaken, including ascertainment of ACPA status (anti-CCP2 test, Axis-Shield), along with routine baseline peripheral blood sampling. An initial working diagnosis was assigned to each patient according to a “working diagnosis proforma” (Table 3). RA was diagnosed only where 1987 ACR classification criteria(18) were unequivocally fulfilled, and UA was defined as a “suspected inflammatory arthritis where RA remained a possibility, but where established classification criteria for any rheumatological condition remained unmet”. This working diagnosis was updated by the consulting rheumatologist at each subsequent clinic visit for the duration of the study—a median of 28 months and greater than 12 months in all cases. The diagnostic outcome of patients with UA at inception was thereby ascertained, with individuals whose arthritis remained undifferentiated at the end of the study being excluded. Patients benefitted from routine clinical care for the duration of the investigation, and all gave written informed consent before inclusion into the study, which was approved by the Local Regional Ethics Committee.
Between 1300 hrs and 1630 hrs during the patients' EAC appointment, 15 ml peripheral whole blood was drawn into EDTA tubes (Greiner Bio-One, Austria) and stored at room temperature for a maximum of 4 hours before processing. Monocytes were first depleted by immunorosetting (RosetteSep® Human Monocyte depletion cocktail, Stemcell Technologies Inc., Vancouver, Canada), and remaining cells underwent positive selection using EasySep® whole blood CD4+ positive selection kit reagents in conjunction with the RoboSep® automated cell separator (Stemcell). CD4+ T-cell purity was determined using standard flow cytometry techniques; FITC-conjugated anti-CD4 and PE-conjugated anti-CD14 antibodies were used (Becton Dickinson, New Jersey, USA). RNA was immediately extracted from CD4+ T-cell isolates using RNeasy MINI Kits® (Qiagen GmbH, Germany), incorporating an “on-column” DNA digestion step.
Microarray experiments were performed in 2 phases (phase I, 95 samples; phase II, 78 samples). In each case, total RNA quality was assessed using an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, Calif.) according to standard protocols(19). 250 ng RNA was reverse transcribed into cRNA, and biotin-UTP labeled, using the Illumina TotalPrep RNA Amplification Kit (Ambion, Texas). cRNA was hybridised to the Illumina Whole Genome 6 (version 3) BeadChip® (Illumina, San Diego, Calif.), following the manufacturer's protocol. Each BeadChip measured the expression of 48,804 genes (annotation file at http://www.illumina.com/support/annotation_files.ilmn) and was imaged using a BeadArray Reader (Illumina).
During baseline clinical assessment, blood was drawn into serum/gel tubes (Greiner Bio-One, Austria), and serum separated and frozen at −80° C. until use. Serum IL-6, sIL6R, TNF-α, leptin and G-CSF concentrations were measured using an immunosorbance assay platform that incorporates a highly sensitive electrochemiluminescence detection system (Meso Scale Discovery [MSD], Gaithersburg, Md.) according to the manufacturer's instructions. The potential for heterophilic rheumatoid factors (RFs) in sera to cross-link capture and detection antibodies and contribute to spurious read-outs (20, 21) was excluded during pilot work (pilot Methods;
qRT-PCR.
CD4+ T-cell total RNA samples were reverse transcribed using Superscript II® reverse transcriptase and random hexamers according to the manufacturer's instructions (Invitrogen, Carlsbad, Calif.). For replication of microarray findings, real-time PCR reactions for reported transcripts were performed as part of a custom-made TaqMan Low Density Array (7900HT real-time PCR system, Applied Biosystems, Foster City, Calif.). Raw data were normalized and expressed relative to the housekeeping gene beta-actin (BACT) as 2^(−ΔCt) values(22). BACT was selected from a panel of 9 potential housekeeping genes, having demonstrated optimal stability for this purpose.
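The expression of a transcript relative to a housekeeping gene can be sketched as below, assuming the conventional definition ΔCt = Ct(target) − Ct(housekeeping gene), so that relative expression is 2 raised to the power −ΔCt. The function name and the example Ct values are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Return 2^(-dCt), where dCt = Ct(target) - Ct(reference gene).

    A lower Ct means the transcript was detected earlier (more abundant), so
    a target with a higher Ct than the reference gives a value below 1.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values: target detected 5 cycles after beta-actin.
print(relative_expression(ct_target=25.0, ct_reference=20.0))  # 0.03125
```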
Raw microarray data were imported into GeneSpring GX 7.3.1 software (Agilent Technologies), with which all statistical analyses were performed except where indicated. Phases I and II of the study were independently normalised in 2 steps: each probe measurement was first divided by the 50th percentile of all measurements in its array, before being centred around its own median expression measurement across all samples in the phase. The anticipated batch-effect noted between phases on their combination, in addition to minor within-phase batch effects relating to one of the Illumina TotalPrep RNA Amplification steps, was corrected in the R statistical computing environment (http://www.r-project.org/) using the empirical Bayes method of Johnson et al(23). Raw and transformed data are available for review purposes at the Gene Expression Omnibus (GEO) address: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=bviftkociimgsnk&acc=GSE20098. Genes detectably expressed (detection p-value <0.01(24)) in ≧1 sample of each study phase passed filtering of the normalised and batch-corrected data, and were included in subsequent analyses (16,205 genes). To define differential expression in this study, an arbitrary fold-change cut-off of 1.2 between comparator groups was combined with a significance level cut-off of p<0.05 (Welch's t-test), corrected for multiple testing using the false-discovery-rate (FDR) method of Benjamini et al(25). Genes identified in this way were used to train a support vector machine (SVM) classification model (Gaussian kernel) based on known outcomes amongst a “training” sample set(26). The model's accuracy, sensitivity and specificity as a prediction tool were then assessed amongst an independent “validation” sample set. In order to obtain larger lists of differentially expressed genes for biological pathway analysis, significance thresholds were subsequently relaxed through the omission of multiple-test-correction.
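The combination of a fold-change cut-off with FDR-corrected significance can be sketched as follows. This is a plain-Python illustration of the Benjamini-Hochberg step-up adjustment, not the GeneSpring implementation used in the study. The function names, the example values and the assumption that the 1.2-fold cut-off applies in both directions (up- and down-regulation) are made for illustration only.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differentially_expressed(fold_changes, pvalues, fc_cutoff=1.2, alpha=0.05):
    """Flag genes passing both the fold-change and the FDR-corrected cut-off.

    Assumption: the cut-off is symmetric, i.e. a ratio >= 1.2 or <= 1/1.2.
    """
    adjusted = benjamini_hochberg(pvalues)
    return [
        (fc >= fc_cutoff or fc <= 1.0 / fc_cutoff) and q < alpha
        for fc, q in zip(fold_changes, adjusted)
    ]


# Hypothetical fold changes and uncorrected Welch's t-test p-values.
flags = differentially_expressed([1.5, 1.1, 0.7, 1.3], [0.01, 0.04, 0.03, 0.005])
print(flags)  # [True, False, True, True]
```

Omitting the call to `benjamini_hochberg` and comparing the raw p-values with `alpha` would correspond to the relaxed thresholds used to obtain the larger gene lists.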
Ingenuity Pathways Analysis software (Ingenuity Systems, Redwood City, Calif.) was used for the majority of these analyses. An objectively derived STAT3-inducible gene set was created for additional hypergeometric statistical testing by combining lists from two publicly available databases (full list given in Gene List 1; sources: http://www.broadinstitute.org/gsea/msigdb/geneset_page.jsp?geneSetName=V$STAT3_02&keywords=stat3 http://www.broadinstitute.org/gsea/msigdb/geneset_page.jsp?geneSetName=V$STAT3_01&keywords=stat3 http://rulai.cshl.edu/cgi-bin/TRED/tred.cgi?process=searchTFGene&sel_type=factor_name&factor_organism=any&tx_search_terms=STAT3&target_organism=human&prom_quality=1&prom_quality=2&prom_quality=3&prom_quality=4&prom_quality=5&bind_quality=0&submit=SEARCH). Hypergeometric testing in this case was performed using the Stat Trek on-line resource (http://stattrek.com). Parametric and non-parametric analyses of variance (ANOVAs), Mann-Whitney U tests, Pearson's correlation coefficients, intraclass correlations, multivariate analyses and the construction of receiver operator characteristic (ROC) curves were performed using SPSS version 15 (SPSS Inc., Chicago, Ill.).
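A hypergeometric over-representation test of the kind described can be sketched with the standard library alone. The function and parameter names are hypothetical, and the numbers in the usage line are toy values, not study data; the study itself used the Stat Trek on-line calculator.

```python
from math import comb


def hypergeom_upper_tail(population: int, successes: int,
                         draws: int, observed: int) -> float:
    """Return P(X >= observed) for a hypergeometric variable X.

    population: number of genes in the background set
    successes:  number of background genes in the gene set of interest
    draws:      number of genes in the differentially expressed list
    observed:   overlap between that list and the gene set of interest
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total


# Toy example: 10 genes, 4 in the set, 3 drawn, at least 2 overlapping.
print(hypergeom_upper_tail(10, 4, 3, 2))  # 0.333...
```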
Leiden prediction scores were calculated for each member of the training cohort according to baseline clinical and laboratory data as described in reference (5). Risk metrics based on the 12-gene RA “signature” were the sum of normalised expression values for the genes therein, assigning a negative sign to the value for NOG (which was down-regulated in RA). Within the training dataset, both scores were entered as independent continuous variables into a logistic regression analysis with RA versus non-RA outcomes as the dependent variable (Table 4). In the resultant model the probability of an outcome of RA is related to both variables via the modified metric: B1x1+B2x2, where B1 and B2 are the regression coefficients for the Leiden prediction score and 12-gene risk metric respectively (B values in Table 4), and x1 and x2 are the values for each amongst individual patients. Hence, for a given patient the modified metric is equal to: (0.98×[Leiden prediction score])+(0.36×[12-gene risk metric/signature]).
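The risk metric and the modified (composite) metric described above can be sketched as follows. The coefficients 0.98 and 0.36 are those stated in the text; the function names and the example expression values are hypothetical, and only three genes are shown for brevity.

```python
def gene_risk_metric(expression: dict, down_regulated=("NOG",)) -> float:
    """Sum normalised expression values, negating those of down-regulated genes."""
    return sum(
        -value if gene in down_regulated else value
        for gene, value in expression.items()
    )


def modified_metric(leiden_score: float, gene_metric: float,
                    b_leiden: float = 0.98, b_gene: float = 0.36) -> float:
    """Return B1*x1 + B2*x2 using the regression coefficients from the text."""
    return b_leiden * leiden_score + b_gene * gene_metric


# Hypothetical normalised expression values for a subset of the signature.
example = {"BCL3": 1.2, "SOCS3": 1.5, "NOG": 0.8}
risk = gene_risk_metric(example)            # 1.2 + 1.5 - 0.8
print(round(modified_metric(6.0, risk), 3))
```

The intercept of the logistic regression is not given in the text, so the sketch stops at the linear metric rather than converting it to a probability.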
The potential for heterophilic rheumatoid factors (RFs) in sera to cross-link capture and detection antibodies and contribute to spurious read-outs was investigated in pilot work to this study. We first confirmed that a commercially available, proprietary cocktail of non-human sera (Heteroblock, Omega Biologicals Inc., Bozeman, Mont.) could successfully neutralise the demonstrable heterophilic activity of native RF in human serum. A known final concentration of recombinant interferon-gamma (IFN-γ) was “spiked” into the sample and, by comparing the calculated difference in standard sandwich ELISA readout (BD Pharmingen, New Jersey, USA) between spiked and un-spiked samples with the actual spiked IFN-γ concentration, the extent of heterophilic activity could be ascertained, and the neutralising effect of varying concentrations of Heteroblock determined (FIG. 5A)(reference 21, main text). We next measured IL-6 concentration in 24 RF+ serum samples (median RF by nephelometry=165 IU) and 56 RF-negative samples, using the MSD platform, in each case running parallel assays with and without an optimised final concentration of Heteroblock. For the RF+ samples, excellent correlation was seen between assays performed with and without Heteroblock (intraclass correlation coefficient=0.98 [95% CI=0.95-0.99]). A Bland-Altman plot confirmed that any such discrepancy that did exist was no less evident in RF-negative samples, suggesting that interference by heterophilic RFs in sera analysed using this platform is inconsequential (
173 patient samples were retrospectively selected for microarray analysis. 111 of these originated from patients who could be assigned definitive diagnoses at inception, which were confirmed at a median follow-up of 28 months (minimum 1 year); an RA versus non-RA discriminatory “signature” was derived from this “training cohort” alone. The remaining 62 samples, all representing UA patients, formed an independent “validation cohort” for testing the utility of the “signature” according to diagnostic outcomes as they evolved during the same follow-up period. As expected, the characteristics of the UA cohort in respect of age, acute phase response, joint counts etc. fell between the equivalent measurements in the RA and control sample sets within the training cohort (Table 1). For subsequent pathway analysis, all 173 samples were pooled before being divided into four categories based on diagnostic outcome at the end of the study (Table 5).
(A) Statistical tests for significant variance between 3 inflammatory comparator groups (ACPA-negative RA, ACPA-positive RA and non-RA inflammatory arthritis); ANOVA, Kruskal-Wallis or Chi-square test for normally-distributed, skewed or dichotomous data respectively.
(B) Statistical tests for significant variance between all 4 inflammatory comparator groups; ANOVA, Kruskal-Wallis or Chi-square test for normally-distributed, skewed or dichotomous data respectively.
Flow cytometric analysis was completed for 148/173 (86%) of samples, and a median CD4+ CD14− purity of 98.9% was achieved (range 95-99.7%), with minimal CD4+ CD14+ monocyte contamination (median 0.32%; range 0.01-2.98%). Pilot work had demonstrated that incorporation of the monocyte depletion step described was required to achieve this (
Using a significance threshold robust to multiple test correction (false discovery rate p<0.05)(25), 12 non-redundant genes were shown to be differentially expressed (>1.2-fold) in PB CD4+ T-cells between 47 “training cohort” EAC patients with a confirmed diagnosis of RA, and 64 who could be assigned non-RA diagnoses (Table 2). An extended list, obtainable by omitting multiple-test correction, is given in Gene-List 2. Supervised hierarchical cluster analysis of the resultant multidimensional dataset (111 samples, 12 genes), demonstrated a clear tendency for EAC patients diagnosed with RA to cluster together based on this transcription profile (
BCL3 (NM_005178)
SOCS3 (NM_003955)
PIM1 (NM_002648)
SBNO2 (NM_014963)
LDHA (NM_005566)
CMAH (NR_002174)
NOG (NM_005450)
PDCD1 (NM_005018)
IGFL2 (NM_001002915)
LOC731186 (XM_001128760)
MUC1 (NM_001044391)
GPRIN3 (CR743148) (C)
(A) Calculations based on normalised expression values of array data.
(B) Calculations based on expression data normalised to the house-keeping gene beta-actin (2^(−ΔCt)); Mann-Whitney U test (see methods).
(C) Note that the transcript CR743148 (Illumina Probe ID 6370082) has been retired from NCBI, but the expressed sequence tag corresponds to splice variant(s) within the GPRIN3 gene (chromosome 4.90).
To derive a metric denoting risk of RA progression, the sum of normalised expression values for the 12-gene RA “signature” was calculated for each individual in the training cohort (see methods). A receiver operator characteristic (ROC) curve, plotting sensitivity versus [1-specificity] for a range of cut-offs of this risk metric, was then constructed, the area under which (0.85; standard error of mean [SEM]=0.04) suggested a promising discriminatory utility (
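The area under a ROC curve for such a risk metric can be sketched without any statistics package by using its equivalence to the probability that a randomly chosen RA patient scores higher than a randomly chosen non-RA patient (ties counting one half). The study itself used SPSS; the function names and the scores below are hypothetical.

```python
def roc_auc(scores_ra, scores_non_ra) -> float:
    """Area under the ROC curve via pairwise comparison of the two groups."""
    wins = 0.0
    for ra in scores_ra:
        for non_ra in scores_non_ra:
            if ra > non_ra:
                wins += 1.0
            elif ra == non_ra:
                wins += 0.5
    return wins / (len(scores_ra) * len(scores_non_ra))


def sensitivity_specificity(scores_ra, scores_non_ra, cutoff: float):
    """Sensitivity and specificity when 'score >= cutoff' is called positive."""
    sensitivity = sum(s >= cutoff for s in scores_ra) / len(scores_ra)
    specificity = sum(s < cutoff for s in scores_non_ra) / len(scores_non_ra)
    return sensitivity, specificity


# Hypothetical risk-metric values for three RA and three non-RA patients.
print(round(roc_auc([3, 4, 5], [1, 2, 4]), 3))
```

Sweeping `cutoff` over the observed scores and plotting sensitivity against 1 − specificity would trace out the ROC curve itself.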
The inventors have also found that a thirteenth gene could be included to make a 13 gene signature. Effectively, all genes are included as per the original 12 gene signature, but an additional down-regulated gene, CD40LG, is also included; this provides further specificity to the test, giving an area under the ROC curve of 0.835.
Next, we tested the potential additive diagnostic value of our 12-gene signature in comparison to the existing “Leiden prediction rule” as a predictor of RA amongst UA patients (4). Whilst the discriminatory utility achieved by the prediction rule in our UA cohort was comparable to that previously reported (n=62; AU ROC curve=0.86; SEM=0.05, data not shown), its performance diminished amongst the ACPA-negative sub-cohort (n=49; AU ROC curve=0.74; SEM=0.08;
All 173 patients studied were now grouped into 4 categories based on outcome diagnosis alone: ACPA-positive RA, ACPA-negative RA, inflammatory non-RA controls and osteoarthritis (OA); their demographic and clinical characteristics are presented for comparison in Table 5. Three lists of differentially expressed genes could then be generated by comparing each of the “inflammatory” groups (which themselves exhibited comparable acute phase responses) with the OA group (>1.2 fold change; uncorrected p<0.05; Gene-lists 3-5). The 3 lists were overlapped on a Venn diagram (
A highly significant over-representation of genes involved in the cell cycle was identified in association with ACPA-positive RA (24/46; p<1.0×10^−5);
Since one classical mechanism of STAT3 phosphorylation is via gp130 co-receptor ligation(36), we hypothesised that increased systemic levels of a key gp130 ligand and proinflammatory cytokine, IL-6, may be responsible for the STAT3-mediated transcriptional programme in early RA patients. Baseline serum IL-6 was measured in 131/173 EAC patients, subsequently grouped according to their ultimate diagnosis (ACPA-negative RA, ACPA-positive RA, non-RA inflammatory arthropathy or OA). IL-6 levels were low overall (generally <100 pg/ml), but were highest in the ACPA-negative RA group (
We present a unique analysis of the ex-vivo PB CD4+ T-cell transcriptome in a well-characterised inception cohort of early arthritis patients. We have minimised confounding by including only patients naive to disease-modifying therapy, focussing on a single PB cell subset, collecting and processing samples expeditiously under standardised conditions, and employing careful quality control. In terms of a potential diagnostic tool, it is pleasing that our 12-gene “RA expression signature” (Table 2) performed best amongst the diagnostically challenging ACPA-negative UA patient group. These findings support the involvement of CD4+ T-cells in both ACPA positive and negative disease. The observation that both RA serotypes differed from a non-inflammatory control group to a greater extent than a non-RA inflammatory control group (
The signature's sensitivity and specificity (0.85 and 0.75) for predicting subsequent RA in seronegative UA patients equate to a positive likelihood ratio (LR+) of 3.4, indicating that a prior probability of 25% for RA progression amongst this cohort (13/49 patients progressed to RA) doubles to 53% for an individual assigned a positive SVM classification (posterior probability; [3.4×{0.25/0.75}]/[1+{3.4×(0.25/0.75)}](37)). Moreover, of the 13 ACPA-negative UA patients who progressed to RA in our cohort, 8 fell into an “intermediate” risk category for RA progression according to the validated Leiden prediction score(4), thereby remaining subject to delayed diagnosis. Encouragingly, all but one of these patients were correctly classified based on their 12-gene expression profiles. Our proof-of-concept that this approach might add value to existing algorithms in the diagnosis of ACPA-negative UA is further supported by the construction of ROC curves comparing the Leiden prediction rule with a modified risk metric that amalgamates the features of our gene signature with those of the prediction rule (
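The arithmetic behind these figures can be reproduced with a short sketch. It assumes only the values quoted above (sensitivity 0.85, specificity 0.75, prior probability 0.25); the function names are illustrative and are not part of the original analysis.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Convert a prior probability to a posterior probability via odds."""
    prior_odds = prior / (1.0 - prior)            # 0.25 / 0.75
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Values quoted in the text for seronegative UA patients.
lr_plus = positive_likelihood_ratio(sensitivity=0.85, specificity=0.75)
posterior = posterior_probability(prior=0.25, likelihood_ratio=lr_plus)

print(f"LR+ = {lr_plus:.1f}, posterior = {posterior:.0%}")
# prints "LR+ = 3.4, posterior = 53%"
```

The same two functions could be reused with any other sensitivity, specificity and prior to see how a positive classification shifts the probability of progression.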
Our data indicate that PB CD4+ T-cells in early RA are characterised by a predominant up-regulation of biological pathways involved in cell cycle progression (ACPA-positive) and survival (ACPA-negative) (
Given the well-characterised importance of the STAT3 signalling pathway in both oncogenesis and T-cell survival pathways, it was notable that 5 genes from our statistically robust 12-gene RA signature are reportedly induced following STAT3 phosphorylation(27-32). This up-regulation was generally most pronounced in ACPA-negative RA (
Striking correlations were seen between PB CD4+ T-cell expression of several STAT3-inducible genes and paired, contemporaneous serum IL-6 concentrations (
The inventors also studied two other gp130 ligands, seeking a potential role for them in STAT3 pathway induction. Granulocyte colony stimulating factor (G-CSF) and leptin have both been implicated in RA pathogenesis(46, 47), but their levels in sera from the same subset of study patients correlated with neither diagnostic outcome nor STAT3 gene expression (
The inventors also reviewed ROC curves to assess the discriminatory utility of various scoring systems in their cohort, and subsets thereof. The 42 patients for whom no IL-6 measurements were available were excluded from the analysis; however, only one of these presented with UA.
FIG. 11: The whole cohort, including ACPA-positive individuals, regardless of whether or not a diagnosis could be assigned at inception.
FIG. 12: ACPA-negative individuals only, again regardless of whether or not a diagnosis could be assigned at inception.
FIG. 13: UA patients, whether ACPA-positive or ACPA-negative.
FIG. 14: UA patients, ACPA-negative only.
In each cohort or sub-cohort, 4 ROC curves are compared: the Leiden prediction rule, the 12-gene risk metric, the composite Leiden/12-gene metric, and serum IL-6 alone.
In FIGS. 12 and 14 (which exclude ACPA-positive patients), sensitivities and specificities are given for an example cut-off of 10 pg/ml serum IL-6.
The results suggest that IL-6 is a useful parameter for predicting outcome in early ACPA-negative disease in particular. The most effective prediction appears to be given by a composite of the Leiden prediction score and the 12-gene metric.
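By way of illustration only, the sensitivity and specificity at a single cut-off such as 10 pg/ml serum IL-6 could be computed as sketched below. The function name, the convention that a value at or above the cut-off counts as a positive test, and the six example values are all assumptions made for this sketch; the values are invented placeholders and are not study data.

```python
from typing import Sequence, Tuple


def sens_spec_at_cutoff(
    values: Sequence[float], is_ra: Sequence[bool], cutoff: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity), calling a value >= cutoff positive.

    Assumes at least one RA and one non-RA patient are present.
    """
    pairs = list(zip(values, is_ra))
    tp = sum(1 for v, ra in pairs if ra and v >= cutoff)      # RA, test positive
    fn = sum(1 for v, ra in pairs if ra and v < cutoff)       # RA, test negative
    tn = sum(1 for v, ra in pairs if not ra and v < cutoff)   # non-RA, test negative
    fp = sum(1 for v, ra in pairs if not ra and v >= cutoff)  # non-RA, test positive
    return tp / (tp + fn), tn / (tn + fp)


# Invented placeholder values (pg/ml) and outcomes, NOT taken from the cohort.
il6_pg_ml = [4.0, 12.5, 30.0, 8.0, 2.0, 15.0]
progressed_to_ra = [False, True, True, True, False, False]

sensitivity, specificity = sens_spec_at_cutoff(il6_pg_ml, progressed_to_ra, cutoff=10.0)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

Repeating the calculation over a range of cut-offs yields the points from which a ROC curve of the kind compared above is drawn.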
In conclusion, the data provide strong evidence for the induction of an IL-6-mediated STAT3 transcription programme in PB CD4+ T-cells of early RA patients, which is most prominent in ACPA-negative individuals, and which contributes to a gene expression “signature” that may have diagnostic utility. Such a pattern of gene expression amongst CD4+ T-cells at this critical early phase in the natural history of inflammatory arthritis could have a defining role in the switch from potentially self-limiting inflammation to T-cell-perpetuated chronic autoimmunity, a model which may not be limited to the example of RA. In any event, the findings could pave the way for a novel treatment paradigm in early arthritis, whereby drugs targeting the IL-6-gp130-STAT3 “axis” find a rational niche as first-choice biologic agents in the management of ACPA-negative RA. One such agent, already available in the clinic, is the IL-6 receptor blocker tocilizumab, whose efficacy is already established in RA(49); others include Janus kinase inhibitors currently undergoing phase III clinical trials for the disease(50). Studies such as ours should ultimately contribute to the realisation of true “personalised medicine” in early inflammatory arthritis, in which complex heterogeneity is stratified into pathophysiologically and therapeutically relevant subsets, with clear benefits in terms of clinical outcome and cost.
Number | Date | Country | Kind |
---|---|---|---|
1102563.2 | Feb 2011 | GB | national |
1108818.4 | May 2011 | GB | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/GB2012/050315 | 2/13/2012 | WO | 00 | 9/11/2013 |