The present invention relates to the field of diagnosis or prognosis of a responding or non-responding phenotype to various biological drugs useful for treatment of inflammatory diseases, as well as associated therapeutic uses and methods.
Cytokine targeting drugs and more generally anti-inflammatory biological drugs such as TNFα-blocking agents (hereinafter referred to as “TBA”) are increasingly used in the treatment of various inflammatory diseases. The first indications in which such TBA were approved are rheumatoid arthritis and Crohn's disease. Rheumatoid arthritis (RA) is a chronic, progressive, debilitating auto-immune disease of largely unknown etiology that affects approximately 1% of the population (1). RA is characterized by chronic inflammation of the synovium, which ultimately leads to joint damage, pain and disability (2). The clinical spectrum of RA is heterogeneous, ranging from mild to severe, with variability in secondary organ system involvement. Disease heterogeneity is further illustrated by the current variation in treatment response rates. First line treatment is usually initiated with so-called disease-modifying anti-rheumatic drugs (DMARDs), such as methotrexate (MTX). Approximately 30% of patients display a suboptimal response or intolerance to traditional DMARDs (3). In these patients, second line treatment is initiated with “biologics”, agents that block molecules or cells thought to be instrumental to disease progression, such as tumor necrosis factor-α (TNFα) and interleukin-1 (IL-1) or B and T-cells. There are indeed nine biologic agents currently available, each with overlapping or unique mechanisms of action (4). The response rates to such treatments vary widely, with a great number of patients remaining refractory to treatment or demonstrating only partial improvement (5). The incomplete understanding of drug mechanisms of action, together with disease heterogeneity, means that there are no methods of identifying patient suitability for the various biologics prior to the initiation of the treatment.
Establishing a rational basis on which to select patients for specific biologics would help patients to be treated more efficiently; those that would be likely to respond would initiate the biologic in question whereas those unlikely to respond could be provided with another treatment.
In the absence of reliable literature on the efficacy and safety of biologics, and given the percentage of patients who do not respond or who experience severe adverse effects, the destructive nature of RA, and the societal costs of inefficacious biological treatments, there is a strong need to predict treatment success before starting the therapy. A clinical or radiographic test will most probably assess conditions too late to protect joints from irreversible destruction. Ideally, a molecular biomarker signature as a predictor for therapy responsiveness should be obtained prior to the start of therapy in a readily available bio-sample, such as peripheral blood. Given the systemic nature of RA and the communication between the systemic and organ-specific compartments, the peripheral blood may not directly have implications for the understanding of disease pathogenesis, but it is especially suitable for analyzing gene expression profiles that provide a framework to select clinically relevant biomarkers. Furthermore, blood-based tests remain less invasive for the patients than synovial tissue-based tests.
Ultimately, this may lead to a personalized form of medicine, whereby the best suited therapy will be applied to an individual patient.
The same is true for other inflammatory diseases for which TBA have been approved or for which preliminary results indicate that TBA might be a useful treatment. To date, TBA have been approved, in addition to the treatment of RA or Crohn's disease, for the treatment of ankylosing spondylitis, psoriatic arthritis, plaque psoriasis, and ulcerative colitis (see notably FDA labels of infliximab and etanercept). In addition, preliminary results suggest that TBA may be useful in the treatment of several other inflammatory diseases, such as vasculitis (notably Behcet's disease, Churg-Strauss vasculitis, polyarteritis nodosa, and giant cell arteritis); Wegener's granulomatosis; sarcoidosis; adult-onset Still's disease; polymyositis/dermatomyositis; and systemic lupus erythematosus (SLE) (31 and 32).
In all cases, only a proportion (although sometimes a high proportion) of patients treated with TBA display a clinical response to the treatment (see notably FDA labels of infliximab (Remicade®) and etanercept (Enbrel®)). For all diseases in which TBA may be useful, it would thus be very helpful to be able to predict the capacity of a subject to respond or not to TBA treatment.
A very powerful way to gain insight into the molecular signatures underlying pathophysiological processes has arisen from DNA microarray technology, which allows the identification of the fraction of genes that are differentially expressed in blood and tissue samples among patients with clinically defined disease. These differentially expressed genes may provide insight into biological pathways contributing to disease and represent classifiers for early diagnosis, prognosis, and response prediction.
Several pitfalls have been encountered with this multistage and relatively expensive technology, which depends heavily on perfectly standardized conditions. Factors that might influence the sensitivity and reproducibility range from sample differences, variation in amount and quality of starting RNA material, amplification and labeling strategies and dyes, to probe sequence and hybridization conditions. In addition, the lack of standardized approaches for normalization and usage of data analysis algorithms could influence the outcome. Furthermore, most microarray studies are not prospectively planned and often do not have detailed protocols, but rather tend to make use of existing samples. Therefore, verification of results is an essential step in microarray studies and quality criteria have to be set.
Several groups have explored the possibility of identifying molecular traits (single nucleotide polymorphisms, gene expression etc.) capable of classifying patients according to their response to treatment based on retrospective analyses of biological samples (synovium or peripheral blood) collected at treatment baseline. In particular, much interest has been paid to the TNFα-blocking agent Infliximab, with the first report several years ago by T Lequerré and colleagues of gene expression-based prediction of response to therapy (6). Since then, several other groups have similarly reported on large-scale gene expression analyses of peripheral blood as a means to predict response to Infliximab (7) (8) (9). All of these studies reported on differentially expressed genes and combinations thereof for the prediction of response to therapy.
These studies provided important proof of concept for the prediction of response to Infliximab at baseline of therapy. Nevertheless, as with all studies of this kind, the use of microarray technology, measuring thousands of genes simultaneously in relatively small cohorts of patients, runs the risk of over-fitting data, leading to false positive results. Moreover, the mono-centric nature of these studies may limit the relevance of the genes identified to a wider and more demographically varied population.
The present invention overcomes these drawbacks by combining information from multiple existing studies. This approach can increase the reliability and generalizability of results. Quantitative approaches in which individual studies addressing a set of related research hypotheses are statistically integrated and analyzed to determine the effectiveness of interventions (meta-analysis) have shown the broad utility of applying meta-analytic approaches to genome-wide data for the purpose of biological discovery. Meta-analyses have already been used to identify genes differentially expressed between two groups, to compare results obtained on different microarray platforms (cross-platform classification), to identify overlaps between samples from heterologous datasets, to identify co-expressed genes or to reconstruct gene networks.
Meta-analyses of multiple gene expression microarray datasets provide discriminative gene expression signatures that are identified and validated on a large number of microarray samples, generated by different laboratories and microarray technologies. Predictive models generated by this approach are better validated than those generated on a single data set, while showing high predictive power and improved generalization performance.
In the present invention, the meta-analysis was performed according to the stepwise approach in conducting meta-analysis on microarray datasets (1-identify suitable microarray studies; 2-extract data from studies (this step also involved getting additional information from the authors of selected studies); 3-prepare the individual datasets; 4-annotate the individual datasets; 5-resolve the many-to-many relationship between probes and genes; 6-combine the study-specific estimates; 7-analyze, present, and interpret results) described in Ramasamy et al. (10).
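Step 6 of the stepwise approach above (combining the study-specific estimates) can be illustrated by the following minimal sketch. It assumes a classical inverse-variance weighted (fixed-effect) combination, which is one common choice and is not stated here to be the model actually used; the function name `combine_fixed_effect` and the input values in the usage note are hypothetical.

```python
import math


def combine_fixed_effect(estimates, std_errors):
    """Pool per-study effect estimates for one gene.

    Assumption: inverse-variance (fixed-effect) weighting; each study
    contributes a weight of 1 / SE^2. Returns (pooled_estimate, pooled_se).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / (se ** 2) for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se
```

For instance, two hypothetical studies reporting estimates of 1.0 and 3.0 with equal standard errors of 1.0 would receive equal weights, so the pooled estimate would be their mean.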
Forty-six distinct genes differentially expressed between future responders and non-responders to Infliximab therapy have thus been identified. Furthermore, one particular gene (MKNK1) has been found to be highly correlated to the Infliximab responsive or non-responsive phenotype of tested subjects, and several combinations of a minimum number of genes are proposed as being predictive of the primary (week 14, week 16 and week 22) response to anti-TNF treatment in RA patients. These combinations comprise genes that are known to be involved in inflammatory or immune processes rather than in the metabolism pathway of Infliximab, which clearly gives a rationale for their general usefulness for predicting the TNFα-blocking agent (TBA) responsive or non-responsive phenotype of subjects suffering from other inflammatory diseases, notably those for which TBA have been approved or have been shown to be useful in preliminary studies.
The invention thus relates to a method for the in vitro diagnosis or prognosis of a cytokine targeting drug (hereafter referred to as CyTD, such as a TNFα-blocking agent, hereafter referred to as TBA) or anti-inflammatory biological drug responding or non-responding phenotype, comprising:
(a) determining from a biological sample of a subject suffering from an inflammatory disease an expression profile comprising or consisting of:
(b) comparing the obtained expression profile with at least one reference expression profile, and
(c) determining the CyTD or anti-inflammatory biological drug responding or non-responding phenotype from said comparison.
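Steps (b) and (c) above can be sketched as follows. This is only an illustration under stated assumptions: the comparison rule (Euclidean distance to the nearest reference profile), the function name `classify_profile`, and all expression values are hypothetical placeholders and are not taken from the present description.

```python
import math


def classify_profile(profile, references):
    """Return the label of the reference profile closest to `profile`.

    profile:    dict mapping gene symbol -> expression level of the subject.
    references: dict mapping phenotype label -> reference profile (same shape).
    Assumption: Euclidean distance over the genes shared by both profiles.
    """
    def distance(a, b):
        shared = sorted(set(a) & set(b))
        if not shared:
            raise ValueError("profiles share no gene")
        return math.sqrt(sum((a[g] - b[g]) ** 2 for g in shared))

    return min(references, key=lambda label: distance(profile, references[label]))


# Placeholder values only; they carry no biological meaning.
REFERENCES = {
    "responding": {"MKNK1": 1.0, "GNLY": 4.0},
    "non-responding": {"MKNK1": 3.0, "GNLY": 2.0},
}
```

Calling `classify_profile({"MKNK1": 2.8, "GNLY": 2.1}, REFERENCES)` would return the label of whichever placeholder reference lies closer to the subject's profile.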
The diagnosis or prognosis method according to the invention makes it possible to determine whether a subject is non-responding or responding to a cytokine targeting drug or anti-inflammatory biological drug treatment. In a preferred embodiment, the diagnosis or prognosis methods according to the invention are intended for the diagnosis or prognosis of non-response (i.e. of a non-responding condition) to a CyTD (preferably a TBA, notably Infliximab) or anti-inflammatory biological drug (preferably a TBA, notably Infliximab). In this case, the methods according to the invention for the diagnosis or prognosis of a non-responding phenotype have the advantage of presenting a high negative predictive value (NPV) and high specificity.
For such diagnosis or prognosis of a non-responding phenotype, a non-responding condition or test outcome is considered a positive result, while a responding condition or test outcome is considered a negative result. Accordingly, a true positive (TP) is a non-responding subject identified as non-responding, a false positive (FP) is a responding subject identified as non-responding, a true negative (TN) is a responding subject identified as responding, and a false negative (FN) is a non-responding subject identified as responding. NPV, PPV, specificity, sensitivity and error rate are defined and calculated as follows:
PPV=TP/(TP+FP)
NPV=TN/(TN+FN)
Specificity=TN/(TN+FP)
Sensitivity=TP/(TP+FN)
Error rate=(FP+FN)/Total number of patients
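The formulas above can be computed from the four counts as in the following minimal sketch. The function name `diagnostic_metrics` and the counts used in the usage note are hypothetical; a ratio whose denominator is zero is returned as `None` rather than guessed.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute PPV, NPV, specificity, sensitivity and error rate.

    Convention of the text: "positive" means non-responding,
    "negative" means responding.
    """
    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as None.
        return numerator / denominator if denominator else None

    total = tp + fp + tn + fn
    return {
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
        "specificity": ratio(tn, tn + fp),
        "sensitivity": ratio(tp, tp + fn),
        "error_rate": ratio(fp + fn, total),
    }
```

With hypothetical counts TP=8, FP=2, TN=18 and FN=2 (30 patients), this gives PPV=8/10, NPV=18/20, specificity=18/20, sensitivity=8/10 and an error rate of 4/30.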
The invention also relates to a method for designing a CyTD or anti-inflammatory biological drug treatment for a subject suffering from an inflammatory disease, said method comprising:
(a) determining from a biological sample of said subject an expression profile comprising or consisting of:
(b) comparing the obtained expression profile with at least one reference expression profile,
(c) determining the CyTD or anti-inflammatory biological drug responding or non-responding phenotype from said comparison, and
(d) designing the dose of CyTD or anti-inflammatory biological drug treatment according to said identified CyTD or anti-inflammatory biological drug responding or non-responding phenotype.
In the above method for designing a CyTD or anti-inflammatory biological drug treatment for a subject suffering from an inflammatory disease, the dose that is designed and optionally administered to the subject depends on his/her responding or non-responding phenotype. In particular, a usual dose may be administered if the subject is diagnosed or prognosed as responding. In contrast, if the subject is diagnosed or prognosed as non-responding, it may be decided not to administer the CyTD or anti-inflammatory biological drug treatment or to increase the dose.
The invention is also drawn to a method of treatment of a subject suffering from an inflammatory disease with a CyTD or anti-inflammatory biological drug, comprising:
(a) determining from a biological sample of the said subject the presence of a CyTD or anti-inflammatory biological drug responding or non-responding phenotype using a method according to the invention, and
(b) adapting the CyTD or anti-inflammatory biological drug treatment according to the result of step (a).
Said adaptation of the CyTD or anti-inflammatory biological drug treatment may consist in:
The invention also refers to a new use of a CyTD or anti-inflammatory biological drug in the treatment of an inflammatory disease, comprising the steps of:
(a) determining from a biological sample of a subject suffering from an inflammatory disease the presence of a CyTD or anti-inflammatory biological drug responding or non-responding phenotype using a method according to the invention, and
(b) determining the dose of CyTD or anti-inflammatory biological drug to administer with respect to the result of step (a).
Optionally, the dose of CyTD or anti-inflammatory biological drug determined in step (b) is administered to the subject.
The invention thus relates to a CyTD or anti-inflammatory biological drug, for use in treating an inflammatory disease, wherein the CyTD or anti-inflammatory biological drug is administered to a subject suffering from said inflammatory disease who has been diagnosed and/or prognosed as responsive using a method according to the invention.
A CyTD or anti-inflammatory biological drug of the present invention is preferably a TNF alpha blocking agent such as defined below.
The invention also relates to the use of a CyTD or anti-inflammatory biological drug for preparing a drug for the treatment of an inflammatory disease in subjects suffering from said inflammatory disease who have been diagnosed and/or prognosed as responsive using a method according to the invention.
In all the present description, the definitions appear in bold characters.
An “inflammatory disease” refers to a disease involving uncontrolled inflammation processes leading to body damages, and includes any disease generally considered as inflammatory diseases by those skilled in the art.
Advantageously, said inflammatory disease is known to involve, at least in some cases, secretion of a pathogenic inflammatory cytokine (IL-1, IL-6, IL-15, IL-17, IL-18, IL-23 or TNF-α). The methods according to the invention then make it possible to diagnose the presence of such a pathogenic inflammatory cytokine secretion in a tested subject, to predict his/her capacity to respond to a CyTD treatment, and thus to adapt his/her treatment in view of his/her CyTD responding/non-responding phenotype. Even more advantageously, said inflammatory disease is known to involve, at least in some cases, a pathogenic TNF-α secretion. The methods according to the invention then make it possible to diagnose the presence of such a pathogenic TNF-α secretion in a tested subject, to predict his/her capacity to respond to a TBA treatment, and thus to adapt his/her treatment in view of his/her TBA responding/non-responding phenotype.
The methods according to the invention also make it possible, more generally, to diagnose a subtype of rheumatoid arthritis or a particular state of activation of the immune system in a tested subject, to predict his/her capacity to respond to an anti-inflammatory biological drug treatment, and thus to adapt his/her treatment in view of his/her anti-inflammatory biological drug responding/non-responding phenotype.
Such inflammatory diseases may be of autoimmune or non-autoimmune origin. Non-limiting examples of inflammatory diseases for which the methods and kits according to the invention are useful, in particular for determining a TBA responding or non-responding phenotype, include rheumatoid arthritis (RA), Crohn's disease, ankylosing spondylitis, psoriatic arthritis, plaque psoriasis, ulcerative colitis, vasculitis (notably Behcet's disease, Churg-Strauss vasculitis, polyarteritis nodosa, and giant cell arteritis); Wegener's granulomatosis; sarcoidosis; adult-onset Still's disease; polymyositis/dermatomyositis; and systemic lupus erythematosus (SLE). An advantageous group of diseases for which the methods of the invention are useful consists of those for the treatment of which TBA have been approved: rheumatoid arthritis (RA), Crohn's disease, ankylosing spondylitis, psoriatic arthritis, plaque psoriasis, and ulcerative colitis. The methods according to the invention are particularly useful for RA-suffering patients for determining a TBA responding or non-responding phenotype. They are also useful for RA-suffering patients for determining an anti-inflammatory biological drug (in particular a recombinant protein inhibiting costimulation of T cells by antigen presenting cells) responding or non-responding phenotype.
For rheumatoid arthritis (RA), since first line treatment is usually initiated with so called disease-modifying anti-rheumatic drugs (DMARDs), the invention also refers to a method of treatment of an RA-suffering subject, comprising the steps of:
(a) administering a therapeutic dose of a DMARD to the said subject suffering from RA,
(b) determining from a biological sample of the said RA-suffering subject the presence of a CyTD or anti-inflammatory biological drug responding or non-responding phenotype using a method according to the invention, and
(c) determining the dose of CyTD or anti-inflammatory biological drug to administer with respect to the result of step (b).
Thus the invention also refers to a combination of 1) a DMARD and 2) a CyTD or anti-inflammatory biological drug, for the treatment of RA, comprising the steps of:
(a) administering a therapeutic dose of a DMARD to a subject suffering from RA,
(b) determining from a biological sample of the said RA-suffering subject the presence of a CyTD or anti-inflammatory biological drug responding or non-responding phenotype using a method according to the invention, and
(c) determining the dose of CyTD or anti-inflammatory biological drug to administer with respect to the result of step (b).
Optionally, the dose of CyTD or anti-inflammatory biological drug determined in step (c) is administered to the subject.
In a preferred embodiment, the DMARD is methotrexate (MTX).
By “cytokine targeting drug” or “CyTD”, it is meant any molecule neutralizing cytokine signalling, notably by binding to and neutralizing the cytokine or its receptor. Such a binding and neutralizing molecule may notably be an antibody or a fragment thereof specific for said cytokine or cytokine receptor, a cytokine receptor antagonist, or any other molecule, such as a recombinant protein, binding to and neutralizing said cytokine or cytokine receptor. Said CyTD preferably targets an inflammatory cytokine such as IL-1, IL-6, IL-15, IL-17, IL-18, IL-23 or TNF-α or a receptor of such inflammatory cytokines. Molecules targeting IL-1 signalling include monoclonal antibodies to IL-1, such as Canakinumab (commercial name Ilaris®), a human anti-IL-1β monoclonal antibody; antagonists of IL-1 receptor such as anakinra (commercial name Kineret®); and fusion proteins between an IgG1 Fc portion and the ligand-binding domains of human IL-1RI and IL-1AcP, such as Rilonacept (commercial name Arcalyst™). Molecules targeting IL-6 signalling notably include Tocilizumab, an anti-IL-6R monoclonal antibody. Molecules targeting IL-15 signalling notably include HuMax-IL-15 (AMG 714), an anti-IL-15 monoclonal antibody. Molecules targeting IL-17 signalling notably include AIN457, an anti-IL-17A monoclonal antibody. Throughout the present description, a preferred embodiment of a CyTD is a “TNFα-blocking agent” or “TBA”.
By “TNFα-blocking agent” or “TBA”, it is herein meant a biological agent which is capable of neutralizing the effects of TNFα. Said agent is preferentially a protein such as a soluble TNFα receptor, e.g. Pegsunercept, or an antibody. In a further preferred embodiment, the said agent is a monoclonal antibody. In an even further preferred embodiment, the said agent is selected from the group consisting of Etanercept (Enbrel®), Infliximab (Remicade®), Adalimumab (Humira®), Certolizumab pegol (Cimzia®), and golimumab (Simponi®). In an even more preferred embodiment, the said agent is Infliximab.
By “anti-inflammatory biological drug”, it is herein meant a biological agent (preferably a recombinant protein, including recombinant antibodies) with anti-inflammatory properties. This includes CyTD directed to inflammatory cytokines such as IL-1, IL-6, IL-15, IL-17, IL-18, IL-23 or TNF-α (in particular TBAs) or to a receptor of such inflammatory cytokines. This also notably includes the following biological agents (preferably recombinant proteins, including recombinant antibodies):
In addition to TBAs, and notably infliximab, another preferred anti-inflammatory biological drug is abatacept.
In a particularly preferred embodiment of any method according to the present invention, the inflammatory disease is rheumatoid arthritis and the CyTD or anti-inflammatory biological drug is Infliximab (Remicade®), a particular TBA. In still another preferred embodiment of any method according to the present invention, the inflammatory disease is rheumatoid arthritis and the CyTD or anti-inflammatory biological drug is abatacept (Orencia®).
According to the present invention, a “CyTD or anti-inflammatory biological drug responding phenotype” is defined as a response state of a subject to the administration of a CyTD or anti-inflammatory biological drug respectively. A “response state” means that the said subject (referred to as a CyTD or anti-inflammatory biological drug responding subject or a responding subject or a responsive subject: for the purpose of this application, these terms are equivalent) responds to the treatment, i.e. that the treatment is efficacious in the said subject. Response is defined as an improvement in clinical symptoms. Such response is quantified according to ACR20, ACR50, ACR70 criteria (11) and/or EULAR criteria between 14 and 22 weeks, and notably at week 14, 16 or 22, or according to a change in DAS28>1.2. Even more preferred are the EULAR response criteria at 14 weeks. These criteria (31) have been established by organizations bringing together the professionals in the field (ACR: American College of Rheumatology; EULAR: European League Against Rheumatism). These criteria are thus well known to the person skilled in the art and need not be detailed here.
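Among the criteria listed above, the DAS28 criterion lends itself to a one-line check, sketched below. The sketch assumes that “change in DAS28>1.2” denotes a decrease of the score from baseline to follow-up; the function name `is_das28_responder` and the scores in the usage note are hypothetical. The ACR and EULAR criteria are not implemented here.

```python
def is_das28_responder(das28_baseline, das28_followup, threshold=1.2):
    """Return True if DAS28 decreased by more than `threshold`.

    Assumption: "change" is read as baseline minus follow-up, so a
    positive value is an improvement (lower disease activity).
    """
    improvement = das28_baseline - das28_followup
    return improvement > threshold
```

For example, a hypothetical subject going from a DAS28 of 5.8 at baseline to 4.1 at week 14 (improvement of 1.7) would be counted as responding under this criterion, whereas a follow-up score of 5.0 (improvement of 0.8) would not.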
In contrast, a “CyTD or anti-inflammatory biological drug non-responding phenotype” refers to the absence in said subject (referred to herein as a CyTD or anti-inflammatory biological drug non-responding subject or a non responding subject or a non-responsive subject: these terms should be construed in the context of this application as having the same meaning) of a state of response, meaning that said subject remains refractory to the treatment.
In a preferred embodiment of any of the above-described in vitro methods of diagnosis/prognosis according to the invention, the said subject is an RA-suffering subject. An “RA-suffering subject” is a subject fulfilling the American College of Rheumatology (ACR) criteria for RA (11). In one further embodiment, the said subject is not treated with a CyTD or anti-inflammatory biological drug; in another further embodiment, the said subject is treated with a CyTD or anti-inflammatory biological drug.
It will easily be understood that when the said subject is not yet treated with a CyTD or anti-inflammatory biological drug, the methods of the invention permit a prognosis (also referred to as a prediction) of the responsiveness/non-responsiveness of the said subject. Thus, in this embodiment, the method of the invention allows the person skilled in the art to prognose (i.e. to identify or predict) the subjects likely to respond to the CyTD or anti-inflammatory biological drug treatment. This is important because of the destructive nature of RA and the societal costs of inefficacious biological treatments. Moreover, since this embodiment of the invention allows for identification of non-responsive subjects before any treatment is initiated (i.e. prediction of non-response), the risk that a treated subject encounters severe adverse effects is greatly diminished. The methods are also useful for predicting which subjects will not respond to the treatment, i.e. who are refractory to the CyTD or anti-inflammatory biological drug, and should thus be administered another therapy. In particular, the methods of the invention allow for a prediction/prognosis of response or non-response at week 14, 16 or 22 from the beginning of the CyTD or anti-inflammatory biological drug treatment.
When the subject according to the invention is already treated with a CyTD or anti-inflammatory biological drug, the methods of the invention are useful for diagnosing whether a subject responds to the said CyTD or anti-inflammatory biological drug, and whether the said subject would thus benefit from a continuation of the said treatment. Moreover, they are useful for diagnosing subjects who are not responding to the treatment, i.e. who are refractory to the CyTD or anti-inflammatory biological drug, and should thus be swiftly shifted to another therapy. In view of the debilitating nature of RA, this achievement is crucial. In particular, the methods of the invention allow for a diagnosis at week 14 or 22 after the beginning of the CyTD or anti-inflammatory biological drug treatment.
In the present description, what is described for CyTD or anti-inflammatory biological drug also particularly applies to TBA, which is a preferred embodiment of a CyTD or anti-inflammatory biological drug in any method or kit according to the invention.
A “biological sample” may be any sample that may be taken from a subject, such as a serum sample, a plasma sample, a urine sample, a blood sample, in particular a peripheral blood sample, a lymph sample, or a biopsy. It also includes specific cellular subtypes or derivatives extracted from those such as PBMCs. Such a sample must allow for the determination of an expression profile comprising or consisting of:
Preferred biological samples for the determination of an expression profile include samples such as a blood sample, a plasma sample, a lymph sample, or a biopsy. Preferably, the biological sample is a blood sample. Indeed, such a blood sample may be obtained by a completely harmless blood collection from the patient and thus allows for a non-invasive diagnosis or prognosis of a CyTD or anti-inflammatory biological drug responding or non-responding phenotype.
Optionally, all methods according to the invention may further comprise a preliminary step of taking a biological sample from the patient.
By “gene” is meant the Official Gene Symbol provided by the HUGO Gene Nomenclature Committee (www.genenames.org) and used in the Entrez Gene database, the NCBI's repository for gene-specific information (http://www.ncbi.nlm.nih.gov/gene).
By “expression profile” is meant the expression levels of a group of genes comprising or consisting of:
In a most preferred embodiment, the expression profile for diagnosing or prognosing (i.e. predicting) if the subject is responding or not at week 14, 16 or week 22 after the beginning of the CyTD or anti-inflammatory biological drug treatment comprises or preferably consists of the gene MKNK1 or Equivalent Expression Profile thereof, provided that, in said Equivalent Expression Profile thereof, MKNK1 is not replaced by gene S100A8 nor gene MAPK14, and optionally one or more housekeeping gene(s). In another most preferred embodiment, the expression profile for diagnosing or prognosing responsiveness or non responsiveness at week 14, week 16 or week 22 after the beginning of the CyTD or anti-inflammatory biological drug treatment (in particular a TBA treatment) comprises or preferably consists of the genes MKNK1 and GNLY, or Equivalent Expression Profile thereof, provided that, in said Equivalent Expression Profile thereof, MKNK1 is not replaced by gene S100A8 nor gene MAPK14, and optionally one or more housekeeping gene(s).
In another most preferred embodiment, the expression profile for diagnosing or prognosing responsiveness or non responsiveness at week 14, week 16 or week 22 after the beginning of the CyTD or anti-inflammatory biological drug treatment (in particular a TBA treatment) comprises or preferably consists of the genes MKNK1, TBX21 and TGFBR3, or Equivalent Expression Profile thereof, provided that, in said Equivalent Expression Profile thereof, MKNK1 is not replaced by gene S100A8 nor gene MAPK14, and optionally one or more housekeeping gene(s).
In another most preferred embodiment, the expression profile for diagnosing or prognosing responsiveness or non responsiveness at week 14, week 16 or week 22 after the beginning of the CyTD or anti-inflammatory biological drug treatment (in particular a TBA treatment) comprises or preferably consists of the genes MKNK1, GNLY, and ADI1, or Equivalent Expression Profile thereof, provided that, in said Equivalent Expression Profile thereof, MKNK1 is not replaced by gene S100A8 nor gene MAPK14, and optionally one or more housekeeping gene(s).
In another most preferred embodiment, the expression profile for diagnosing or prognosing responsiveness or non responsiveness at week 14, week 16 or week 22 after the beginning of the CyTD or anti-inflammatory biological drug treatment (in particular a TBA treatment) comprises or preferably consists of the genes MKNK1, GNLY, ADI1, and IL1B, or Equivalent Expression Profile thereof, provided that, in said Equivalent Expression Profile thereof, MKNK1 is not replaced by gene S100A8 nor gene MAPK14, and optionally one or more housekeeping gene(s).
In another most preferred embodiment, the expression profile for diagnosing or prognosing responsiveness or non responsiveness at week 14, week 16 or week 22 after the beginning of the CyTD or anti-inflammatory biological drug treatment (in particular a TBA treatment) comprises or preferably consists of the genes MKNK1, GNLY, ADI1, IL1B, and IL1R1, or Equivalent Expression Profile thereof, provided that, in said Equivalent Expression Profile thereof, MKNK1 is not replaced by gene S100A8 nor gene MAPK14, and optionally one or more housekeeping gene(s).
In yet another most preferred embodiment, the expression profile for diagnosing or prognosing responsiveness or non responsiveness at week 14, week 16 or week 22 after the beginning of the CyTD or anti-inflammatory biological drug treatment (in particular a TBA treatment) comprises or preferably consists of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B and CFLAR, or Equivalent Expression Profile thereof, provided that, in said Equivalent Expression Profile thereof, MKNK1 is not replaced by gene S100A8 nor gene MAPK14, and optionally one or more housekeeping gene(s).
In still another most preferred embodiment, the expression profile for diagnosing or prognosing responsiveness or non responsiveness at week 14, week 16 or week 22 after the beginning of the CyTD or anti-inflammatory biological drug treatment (in particular a TBA treatment) comprises or preferably consists of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B, CFLAR, MAPK14 and GNLY or Equivalent Expression Profile thereof, provided that, in said Equivalent Expression Profile thereof, MKNK1 is not replaced by gene S100A8 nor gene MAPK14, and optionally one or more housekeeping gene(s).
In yet another most preferred embodiment, the expression profile for diagnosing or prognosing responsiveness or non responsiveness at week 14, week 16 or week 22 after the beginning of the CyTD or anti-inflammatory biological drug treatment (in particular a TBA treatment) comprises or preferably consists of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B, CFLAR, CD14 and TGFBR2 or Equivalent Expression Profile thereof, provided that, in said Equivalent Expression Profile thereof, MKNK1 is not replaced by gene S100A8 nor gene MAPK14, and optionally one or more housekeeping gene(s).
In another most preferred embodiment, the expression profile for diagnosing or prognosing responsiveness or non responsiveness at week 14, week 16 or week 22 after the beginning of the CyTD or anti-inflammatory biological drug treatment (in particular a TBA treatment) comprises or preferably consists of the genes MKNK1, IFNGR2, IL1B, MAPK14, GNLY, and CD14, or Equivalent Expression Profile thereof, provided that, in said Equivalent Expression Profile thereof, MKNK1 is not replaced by gene S100A8 nor gene MAPK14, and optionally one or more housekeeping gene(s).
In another most preferred embodiment, the expression profile for diagnosing or prognosing responsiveness or non responsiveness at week 14, week 16 or week 22 after the beginning of the CyTD or anti-inflammatory biological drug treatment (in particular a TBA treatment) comprises or preferably consists of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, IL1B, CFLAR, MAPK14, GNLY, CD14 and TGFBR2, or Equivalent Expression Profile thereof, provided that, in said Equivalent Expression Profile thereof, MKNK1 is not replaced by gene S100A8 nor gene MAPK14, and optionally one or more housekeeping gene(s).
The inventors have determined that MKNK1 is a key gene for determining whether a subject suffering from an inflammatory disease, in particular an RA-suffering subject, will or will not respond to a CyTD or anti-inflammatory biological drug treatment, in particular a TBA treatment and more particularly an Infliximab treatment. The methods according to the invention are thus mainly based on the determination of an expression profile comprising or consisting of gene MKNK1.
However, the addition of a small number of other genes to the tested expression profile improves the sensitivity, specificity, positive predictive value (PPV) and/or negative predictive value (NPV), and thus decreases the error rate of the diagnosis or prognosis. The determined expression profiles thus preferably comprise at least 1, preferably at least 2, at least 3, at least 4, or at least 5 other genes.
In particular, an expression profile comprising MKNK1 preferably comprises 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 genes useful for the diagnosis or prognosis, including MKNK1, and optionally further comprises one or more housekeeping gene(s) used for normalization of the data.
An expression profile comprising MKNK1 may notably comprise 2-20 genes useful for the diagnosis or prognosis, and in particular 5-15 genes, 6-12 genes, or 8-10 genes useful for the diagnosis or prognosis, including MKNK1, and may optionally further comprise one or more housekeeping gene(s) useful for normalization of the data.
Such an expression profile comprising MKNK1 may notably be chosen from expression profiles comprising the specific combinations of genes disclosed in the present specification, such as those mentioned above.
The determination of the presence of a CyTD or anti-inflammatory biological drug responding or non-responding phenotype is carried out thanks to the comparison of the obtained expression profile with at least one reference expression profile in step (b).
The term “Equivalent Expression Profile” herein refers to expression profiles comprising or consisting of:
wherein the addition, deletion or substitution of some of the genes (preferably at most 1 or 2 genes) does not change significantly the reliability of the test and is considered as an “acceptable expression profile”.
In a preferred embodiment, Equivalent Expression Profiles include expression profiles in which one of the genes of a selected genes combination is replaced by an equivalent gene. In the present description, a first gene (“gene A”) can be considered as equivalent to another second gene (“gene B”), when replacing “gene A” in the expression profile by “gene B” does not significantly impact the performance of the test. This is typically the case when “gene A” is correlated to “gene B”, meaning that the expression level of “gene A” is statistically correlated to the expression level of “gene B”, as determined by a measure such as Pearson's correlation coefficient. The correlation may be positive (meaning that when “gene A” is upregulated in a patient, then “gene B” is also upregulated in that same patient) or negative (meaning that when “gene A” is upregulated in a patient, then “gene B” is downregulated in that same patient). The fact that replacing “gene A” by a correlated “gene B” still permits reliable diagnosis or prognosis of responding or non-responding phenotype is demonstrated for genes with an average Pearson's correlation coefficient of at least 0.9 in Example 3.
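As an illustration of the correlation criterion described above, the following minimal Python sketch computes Pearson's correlation coefficient between the expression levels of two genes measured in the same patients and compares it to a threshold. The function names, the example expression values and the use of the absolute value of the coefficient (to cover both positive and negative correlation) are assumptions made for illustration only; the value 0.9 is the one mentioned above, and passing this check would only flag a candidate equivalent gene, not establish that the performance of the test is preserved.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant expression level")
    return cov / math.sqrt(var_x * var_y)


def is_candidate_equivalent(expr_a, expr_b, threshold=0.9):
    """True if |r| >= threshold (hypothetical rule covering positive and negative correlation)."""
    return abs(pearson_r(expr_a, expr_b)) >= threshold


# Hypothetical normalized expression values of "gene A" and "gene B" in five patients.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.1, 3.9, 6.2, 8.0, 9.8]
print(is_candidate_equivalent(gene_a, gene_b))
```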
In another embodiment, the addition or substitution of some of the genes of the sets described in the present invention by other genes belonging to the same metabolic pathway should also be considered as an equivalent expression profile.
For instance, genes that are equivalent to MKNK1 or to one of those of the herein disclosed genes combinations and that may be used in addition or as a substitution in an Equivalent Expression Profile include those disclosed in
Illustrative embodiments of Equivalent expression profiles are disclosed in Example 3. Further illustrative Equivalent expression profiles include expression profiles comprising or consisting of at least 1 gene from at least 6 of the following 10 groups:
As may be observed in
In all methods, kits or microarrays described herein, when the expression profile comprises or consists of the gene MKNK1; or of the genes MKNK1, TBX21 and TGFBR3; or of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B and CFLAR; or of all the 46 genes of following Tables 2, 3 and 4, it is further possible to exclude Equivalent Expression Profiles that comprise genes MAPK14 and/or S100A8.
In this case, the methods, kits or microarrays according to the invention are based on an expression profile comprising or consisting of the gene MKNK1; or of the genes MKNK1, TBX21 and TGFBR3; or of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B and CFLAR; or of all the 46 genes of following Tables 2, 3 and 4 or Equivalent Expression Profile thereof, provided that said Equivalent Expression Profile thereof does not comprise gene MAPK14 nor gene S100A8.
The term “Acceptable Expression Profile” herein refers to an expression profile which is capable of correctly classifying at least 60% of the analyzed samples, preferably 65%, more preferably 70%, and even more preferably 75%, 80% or 85%; and which has a sensitivity and a specificity of at least 60%, preferably 65%, more preferably 70%, and even more preferably 75% or 80%. Sensitivity measures the proportion of patients actually clinically responding to the CyTD or anti-inflammatory biological drug treatment who are classified as responding using the test according to the invention, amongst all patients treated with the CyTD or anti-inflammatory biological drug. Specificity measures the proportion of patients actually clinically not responding to the CyTD or anti-inflammatory biological drug treatment who are correctly identified using the test according to the invention, amongst all patients treated with the CyTD or anti-inflammatory biological drug.
By “Best Expression Profile” is meant an expression profile which is able to correctly classify at least 80% of the analyzed samples and has either a sensitivity or a specificity of at least 80%.
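The performance criteria defined above can be sketched as follows. This minimal Python example computes the proportion of correctly classified samples, the sensitivity and the specificity from known clinical outcomes and test classifications, then checks the 60% minimum values of an Acceptable Expression Profile. It assumes the usual reading in which sensitivity is computed among the actual responders and specificity among the actual non-responders; the function names and the ten example patients are hypothetical.

```python
def classification_metrics(actual, predicted):
    """Return accuracy, sensitivity and specificity.

    actual, predicted: equal-length lists of booleans (True = responder).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # responders classified as responding
    tn = sum(1 for a, p in pairs if not a and not p)  # non-responders classified as non-responding
    fp = sum(1 for a, p in pairs if not a and p)      # non-responders classified as responding
    fn = sum(1 for a, p in pairs if a and not p)      # responders classified as non-responding
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one responder and one non-responder")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def is_acceptable(metrics, minimum=0.60):
    """Check the minimum values of an Acceptable Expression Profile."""
    return all(metrics[name] >= minimum
               for name in ("accuracy", "sensitivity", "specificity"))


# Hypothetical cohort: 6 clinical responders followed by 4 non-responders.
actual = [True] * 6 + [False] * 4
predicted = [True] * 5 + [False] + [False] * 3 + [True]
metrics = classification_metrics(actual, predicted)
print(metrics, is_acceptable(metrics))
```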
Although the lists of the gene MKNK1; or of the genes MKNK1 and GNLY; or of the genes MKNK1, TBX21 and TGFBR3; or of the genes MKNK1, GNLY, and ADI1; or of the genes MKNK1, GNLY, ADI1, and IL1B; or of the genes MKNK1, GNLY, ADI1, IL1B, and IL1R1; or of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B and CFLAR; or of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B, CFLAR, MAPK14 and GNLY; or of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B, CFLAR, CD14 and TGFBR2; or of the genes MKNK1, IFNGR2, IL1B, MAPK14, GNLY, and CD14; or of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B, CFLAR, MAPK14, GNLY, CD14 and TGFBR2; or of the genes MKNK1, IFNGR2, IL1B, MAPK14, GNLY, and CD14; or of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, IL1B, CFLAR, MAPK14, GNLY, CD14 and TGFBR2; or of all the 46 genes of following Tables 2, 3 and 4 have been determined as the Best Expression Profiles to assess responsiveness/non responsiveness, an Equivalent Expression Profile such as defined above still permits the assessment of responsiveness with an acceptable reliability.
In particular embodiments, sublists of the gene MKNK1; or of the genes MKNK1 and GNLY; or of the genes MKNK1, TBX21 and TGFBR3; or of the genes MKNK1, GNLY, and ADI1; or of the genes MKNK1, GNLY, ADI1, and IL1B; or of the genes MKNK1, GNLY, ADI1, IL1B, and IL1R1; or of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B and CFLAR; or of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B, CFLAR, MAPK14 and GNLY; or of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B, CFLAR, CD14 and TGFBR2; or of the genes MKNK1, IFNGR2, IL1B, MAPK14, GNLY, and CD14; or of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B, CFLAR, MAPK14, GNLY, CD14 and TGFBR2; or of the genes MKNK1, IFNGR2, IL1B, MAPK14, GNLY, and CD14; or of the genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, IL1B, CFLAR, MAPK14, GNLY, CD14 and TGFBR2; or of all the 46 genes of following Tables 2, 3 and 4; still permit the assessment of responsiveness with a good reliability and should be considered as Acceptable Expression Profiles.
While the expression profile used for determining the CyTD (notably TBA) or anti-inflammatory biological drug responsive or non-responsive phenotype may comprise and not only consist of:
By “housekeeping genes”, it is meant genes that are constitutively expressed at a relatively constant level across many or all known conditions because they code for proteins that are constantly required by the cell; they are thus essential to a cell and always present under any conditions. It is assumed that their expression is unaffected by experimental conditions. The proteins they code are generally involved in the basic functions necessary for the sustenance or maintenance of the cell. Non-limiting examples of housekeeping genes that may be used in methods of the invention include:
When such housekeeping genes are added to the expression profile (it is not always necessary), they are used for normalization purpose. In this case, the number of housekeeping genes used for normalization in methods according to the invention is preferably comprised between one and five with a preference for three.
The determination of the presence of a responsive or non responsive phenotype is carried out thanks to the comparison of the obtained expression profile with at least one reference expression profile in step (b).
A “reference expression profile” is a predetermined expression profile, obtained from a biological sample from a subject with a known particular response state. In particular embodiments, the reference expression profile used for comparison with the test sample in step (b) may have been obtained from a biological sample from a CyTD or anti-inflammatory biological drug responsive subject (“CyTD or anti-inflammatory biological drug responsive reference expression profile” or “responsive reference expression profile”; as used herein these expressions are synonymous), and/or from a biological sample from a CyTD or anti-inflammatory biological drug non-responsive subject (“CyTD or anti-inflammatory biological drug non-responsive reference expression profile” or “non-responsive reference expression profile”; as used herein these expressions have the same meaning).
Preferably, at least one reference expression profile is a CyTD or anti-inflammatory biological drug responsive reference expression profile. Alternatively, at least one reference expression profile may be a CyTD or anti-inflammatory biological drug non-responsive reference expression profile. More preferably, the determination of the presence or absence of a CyTD or anti-inflammatory biological drug responsive phenotype is carried out by comparison with at least one responder and at least one non-responder reference expression profile. The diagnosis or prognosis may thus be performed using one responsive reference expression profile and one non-responsive reference expression profile. Advantageously, to get a stronger diagnosis or prognosis, said diagnosis or prognosis is carried out using several responsive reference expression profiles and several non-responsive reference expression profiles.
The comparison of a tested subject expression profile with said reference expression profiles, which permits prediction of the tested subject's clinical response based on his/her expression profile, can be done by those skilled in the art using statistical models or machine learning technologies. The PLS (Partial Least Square) regression is particularly relevant to give prediction in the case of small reference samples. The comparison may also be performed using Support Vector Machines (SVM), linear regression or derivatives thereof (such as the generalized linear model abbreviated as GLM, including logistic regression), Linear Discriminant Analysis (LDA), Random Forests, k-NN (Nearest Neighbour) or PAM (Predictive Analysis of Microarrays) statistical methods. More precisely, a group of reference samples, which is generally referred to as training data, is used to select an optimal statistical algorithm that best separates responders from non responders (like a decision rule). The best separation is usually the one that misclassifies as few samples as possible and that has the best chance to perform comparably well on a different dataset.
Typically, for a binary outcome such as responder/non-responder, this is done using a generalized linear model (abbreviated as GLM), including logistic regression. Logistic regression is based on the determination of a logistic regression function
f(z) = 1/(1 + e^(−z))
in which z is usually defined as z = β0 + β1x1 + . . . + βnxn, wherein x1 to xn are the expression values of the n genes in the signature, β0 is the intercept, and β1 to βn are the regression coefficients. The values of the intercept and of the regression coefficients are determined based on a group of reference samples (“training data”). f(z) then defines the probability that a test expression profile is responding or non-responding (when defining f(z) based on training data, the user decides whether the probability is a probability of response or of non-response). A test expression profile is then classified as responding or non-responding depending on whether this probability is below or above a particular threshold value, which is also determined based on training data. Sometimes, two threshold values are used, defining an undetermined area between them. Other types of generalized linear models than logistic regression may also be used.
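The classification rule described above may be sketched as follows, assuming the standard logistic function f(z) = 1/(1 + e^(−z)) and a probability defined as a probability of response. The intercept, the regression coefficients, the two threshold values (0.4 and 0.6) and the expression values are purely hypothetical; in practice they would be determined from training data.

```python
import math


def logistic(z):
    """Standard logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def response_probability(expression, intercept, coefficients):
    """f(z) with z = b0 + b1*x1 + ... + bn*xn."""
    if len(expression) != len(coefficients):
        raise ValueError("one coefficient is needed per gene")
    z = intercept + sum(b * x for b, x in zip(coefficients, expression))
    return logistic(z)


def classify(probability, low=0.4, high=0.6):
    """Two thresholds define an undetermined area between them."""
    if probability >= high:
        return "responding"
    if probability <= low:
        return "non-responding"
    return "undetermined"


# Hypothetical three-gene signature: intercept, coefficients and normalized expression values.
p = response_probability([1.2, -0.3, 0.8], intercept=-0.5, coefficients=[1.1, 0.7, 0.9])
print(round(p, 3), classify(p))
```

Setting `low` equal to `high` gives the single-threshold variant of the same rule.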
Alternative methods such as nearest neighbour (abbreviated as k-NN) are also commonly used and predict response or non-response for a new sample based on whether the sample is closer to the group of responders or to the group of non-responders. The notion of “closer” is based on a choice of distance (metric, such as but not limited to Euclidean distance) in the n-dimensional space defined by a signature consisting of n genes useful for diagnosis or prognosis (thus excluding potential housekeeping genes used for normalization purposes). The distances between a test expression profile and all reference responding or non-responding expression profiles are calculated and the sample is classified by analysis of the k closest reference samples (k being a positive integer of at least 1 and most commonly 3 or 5), a rule of classification being pre-established depending on the number of responding or non-responding reference expression profiles among the k closest reference expression profiles. For instance, when k is 1, a test expression profile is classified as responding if the closest reference expression profile is a responding expression profile, and as non-responding if the closest reference expression profile is a non-responding expression profile. When k is 2, a test expression profile is classified as responding if the two closest reference expression profiles are responding expression profiles, as non-responding if the two closest reference expression profiles are non-responding expression profiles, and undetermined if the two closest reference expression profiles include a responding and a non-responding reference expression profile. When k is 3, a test expression profile is classified as responding if at least two of the three closest reference expression profiles are responding expression profiles, and as non-responding if at least two of the three closest reference expression profiles are non-responding expression profiles.
More generally, when k is p, a test expression profile is classified as responding if more than half of the p closest reference expression profiles are responding expression profiles, and as non-responding if more than half of the p closest reference expression profiles are non-responding expression profiles. If the numbers of responding and non-responding reference expression profiles among the p closest reference expression profiles are equal, then the test expression profile is classified as undetermined.
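The k-NN rule described above can be sketched in a few lines of Python. This is a minimal illustration assuming Euclidean distance and a simple majority vote among the k closest reference expression profiles, with a tie giving an undetermined result; the function names and the two-gene reference profiles are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(test_profile, references, k=3):
    """references: list of (profile, label), label being 'responding' or 'non-responding'."""
    if not 1 <= k <= len(references):
        raise ValueError("k must be between 1 and the number of reference profiles")
    # Keep the k reference profiles closest to the test profile.
    nearest = sorted(references, key=lambda ref: euclidean(test_profile, ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    if votes["responding"] > votes["non-responding"]:
        return "responding"
    if votes["non-responding"] > votes["responding"]:
        return "non-responding"
    return "undetermined"  # equal numbers among the k closest profiles


# Hypothetical reference profiles for a two-gene signature.
references = [
    ((1.0, 1.0), "responding"),
    ((1.2, 0.9), "responding"),
    ((0.8, 1.1), "responding"),
    ((3.0, 3.0), "non-responding"),
    ((3.1, 2.9), "non-responding"),
    ((2.9, 3.2), "non-responding"),
]
print(knn_classify((1.1, 1.0), references, k=3))
```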
Other methodologies from the field of statistics, mathematics or engineering exist, for example but not limited to decision trees, Support Vector Machines (SVM), Neural Networks and Linear Discriminant Analyses (LDA). These approaches are well known to people skilled in the art.
In summary, an algorithm (which may be selected from linear regression or derivatives thereof such as generalized linear models (GLM, including logistic regression), nearest neighbour (k-NN), decision trees, support vector machines (SVM), neural networks, linear discriminant analyses (LDA), Random Forests, or Predictive Analysis of Microarrays (PAM)) is calibrated based on a group of reference samples (preferably including several responsive reference expression profiles and several non-responsive reference expression profiles) and then applied to the test sample. In simple terms, a patient will be classified as responder (or non-responder) based on how all the genes in the signature compare to all the genes from a reference profile that was developed from a group of responders (training data).
The notion of whether individual genes of the expression profile are increased or decreased in responders versus non-responders is of scientific interest. For each individual gene, the gene expression levels in the responder group can be compared to those in the non-responder group by the use of Student's t-test or equivalent methods. However, such binary comparisons are generally not used for diagnosis or prognosis when a signature comprises several distinct genes.
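For illustration, the per-gene comparison mentioned above can be sketched as a two-sample Student's t-test with pooled variance. The example below only computes the t statistic and the degrees of freedom; deriving a p-value additionally requires the t distribution, which is not part of the Python standard library. The function name and the expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two values")
    df = n_a + n_b - 2
    # Pooled estimate of the common variance of the two groups.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    if pooled_var == 0:
        raise ValueError("t statistic is undefined when both groups are constant")
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical normalized expression levels of one gene in each group.
responders = [5.1, 4.9, 5.3, 5.0]
non_responders = [4.0, 4.2, 3.9, 4.1]
t, df = student_t(responders, non_responders)
print(round(t, 2), df)
```

A positive t statistic here indicates a higher mean expression level in the first group.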
The expression profile may be determined by any technology known by a man skilled in the art. In particular, each gene expression level may be measured at the genomic and/or nucleic and/or proteic level. In a preferred embodiment, the expression profile is determined by measuring the amount of nucleic acid transcripts of each gene. In another embodiment, the expression profile is determined by measuring the amount of protein produced by each of the genes.
The amount of nucleic acid transcripts can be measured by any technology known by a man skilled in the art. In particular, the measure may be carried out directly on an extracted messenger RNA (mRNA) sample, or on retrotranscribed complementary DNA (cDNA) prepared from extracted mRNA by technologies well known in the art. From the mRNA or cDNA sample, the amount of nucleic acid transcripts may be measured using any technology known by a man skilled in the art, including nucleic microarrays, quantitative PCR, next generation sequencing and hybridization with a labelled probe.
In a preferred embodiment, the expression profile is determined using quantitative PCR. Quantitative, or real-time, PCR is a well known and easily available technology for those skilled in the art and does not need a precise description.
In a particular embodiment, which should not be considered as limiting the scope of the invention, the determination of the expression profile using quantitative PCR may be performed as follows. Briefly, the real-time PCR reactions are carried out using the TaqMan Universal PCR Master Mix (Applied Biosystems). 6 μL cDNA is added to a 9 μL PCR mixture containing 7.5 μL TaqMan Universal PCR Master Mix, 0.75 μL of a 20× mixture of probe and primers and 0.75 μL water. The reaction consists of one initiating step of 2 min at 50° C., followed by 10 min at 95° C., and 40 cycles of amplification including 15 sec at 95° C. and 1 min at 60° C. The reaction and data acquisition can be performed using the ABI 7900HT Fast Real-Time PCR System (Applied Biosystems). The number of template transcript molecules in a sample is determined by recording the amplification cycle in the exponential phase (cycle threshold or CQ or CT) at which the fluorescence signal can be detected above background fluorescence. Thus, the starting number of template transcript molecules is inversely related to CT. The level of expression of a gene is measured using the “ΔΔCT method”: briefly, the CT value of a gene is normalized by the value of one or a group of reference/housekeeping genes and/or by a reference sample such as a pooled sample or a commercially available reference such as the qPCR Human Universal Reference cDNA, random primed; Ozyme; ref. 639654.
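The normalization described above can be illustrated by the following minimal Python sketch. It assumes the commonly used form of the ΔΔCT calculation, in which ΔCT is the CT of the gene of interest minus the mean CT of the housekeeping genes, ΔΔCT is the difference between the ΔCT of the test sample and that of the reference sample, and the relative expression is 2^(−ΔΔCT), which presumes an amplification efficiency close to 100%. The function names and all CT values are hypothetical.

```python
from statistics import mean


def delta_ct(target_ct, housekeeping_cts):
    """CT of the gene of interest minus the mean CT of the housekeeping genes."""
    return target_ct - mean(housekeeping_cts)


def relative_expression(target_ct, housekeeping_cts, ref_target_ct, ref_housekeeping_cts):
    """Relative expression versus a reference sample, assuming ~100% PCR efficiency."""
    dd_ct = (delta_ct(target_ct, housekeeping_cts)
             - delta_ct(ref_target_ct, ref_housekeeping_cts))
    return 2.0 ** (-dd_ct)


# Hypothetical CT values: one gene of interest and three housekeeping genes,
# measured in a patient sample and in a reference (e.g. pooled) sample.
patient = relative_expression(
    target_ct=24.0, housekeeping_cts=[20.0, 20.0, 20.0],
    ref_target_ct=26.0, ref_housekeeping_cts=[20.0, 20.0, 20.0],
)
print(patient)
```

Because CT is inversely related to the starting number of template molecules, a lower CT in the patient sample than in the reference sample yields a relative expression above 1.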
In another preferred embodiment, the expression profile is determined by the use of a nucleic microarray.
According to the invention, a “nucleic microarray” consists of different nucleic acid probes that are attached to a substrate, which can be a microchip, a glass slide or a microsphere-sized bead. A microchip may be constituted of polymers, plastics, resins, polysaccharides, silica or silica-based materials, carbon, metals, inorganic glasses, or nitrocellulose. Probes can be nucleic acids such as cDNAs (“cDNA microarray”) or oligonucleotides (“oligonucleotide microarray”), and the oligonucleotides may be about 25 to about 60 base pairs or less in length.
To determine the expression profile of a target nucleic sample, said sample is labelled, contacted with the microarray in hybridization conditions, leading to the formation of complexes between target nucleic acids that are complementary to probe sequences attached to the microarray surface. The presence of labelled hybridized complexes is then detected. Many variants of the microarray hybridization technology are available to the man skilled in the art.
In a preferred embodiment, the nucleic microarray is an oligonucleotide microarray comprising or consisting of oligonucleotides specific for:
Preferably, the oligonucleotides are about 50 bases in length. It is acknowledged that the nucleic acid microarray or oligonucleotide microarray of the invention encompasses the microarrays specific for an Equivalent Expression Profile as defined above.
Suitable microarray oligonucleotides specific for any gene of Tables 2, 3 and 4 may be designed, based on the genomic sequence of each gene (see Tables 2, 3 and 4 Genbank accession numbers), using any method of microarray oligonucleotide design known in the art. In particular, any available software developed for the design of microarray oligonucleotides may be used, such as, for instance, the OligoArray software (available at http://berry.enqin.umich.edu/oliqoarray/), the GoArrays software (available at http://www.isima.fr/bioinfo/goarrays/), the Array Designer software (available at http://www.premierbiosoft.com/dnamicroarray/index.html), the Primer3 software (available at http://frodo.wi.mit.edu/primer3/primer3 code.html), or the Promide software (available at http://oligos.molgen.mpg.de/).
In another embodiment, the expression profile is determined by the use of a protein microarray.
In a particular embodiment of a method according to the invention, said method may further comprise determining at least one additional parameter useful for the diagnosis or prognosis. Such “parameters useful for the diagnosis or prognosis” are parameters that cannot be used alone for a diagnosis or prognosis but that have been described as displaying significantly different values between responsive subjects and subjects who are clearly refractory, and may thus also be used to refine and/or confirm the diagnosis or prognosis according to the above described method according to the invention. They may notably include relevant clinical parameters depending on the inflammatory disease. For rheumatoid arthritis (RA), such clinical parameters include an assessment of the subject's pain, duration of morning stiffness, the number of swollen joints, the number of painful joints, etc. Preferably, the parameters useful for diagnosis or prognosis are determined from a non invasive biological sample of the subject. In particular, for RA, they may be selected from standard biological parameters specific for RA. According to the invention, “standard biological parameters specific for RA” are biological parameters usually used by clinicians to monitor the efficacy of a treatment of RA. These standard biological parameters specific for RA or autoimmune diseases usually comprise serum or plasma concentrations of particular proteins which are well known to those skilled in the art. Said standard biological parameters specific for RA can be determined by tests which include the Antinuclear Antibody test (ANA test), C-Reactive Protein test (CRP test), Erythrocyte sedimentation rate (ESR test), Cyclic Citrullinated Peptide Antibody test (CCP test), and the Rheumatoid Factor test. These tests are well known to the person skilled in the art and will not be detailed here. They may be used on their own or in combination.
Such additional parameters may be used to confirm the diagnosis or prognosis obtained using the expression profile comprising or consisting of:
The invention further concerns a kit for the in vitro diagnosis or prognosis of a CyTD or anti-inflammatory biological drug responsive or non responsive phenotype, comprising at least one reagent for the determination of an expression profile comprising, or consisting of:
By “a reagent for the determination of an expression profile” is meant a reagent which specifically allows for the determination of said expression profile, i.e. a reagent specifically intended for the specific determination of the expression level of the genes comprised in the expression profile. This definition excludes generic reagents useful for the determination of the expression level of any gene, such as Taq polymerase or an amplification buffer, although such reagents may also be included in a kit according to the invention.
In a preferred embodiment of a kit according to the invention, said kit is dedicated to the in vitro diagnosis or prognosis of a CyTD or anti-inflammatory biological drug responsive or non responsive phenotype based on expression profiles comprising or consisting of:
By “dedicated”, it is meant that reagents for the determination of an expression profile in the kit of the invention essentially consist of reagents for determining the expression level of the above (i) expression profiles, optionally with one or more housekeeping gene(s), and thus comprise a minimum of reagents for determining the expression of other genes than those mentioned in above described (i) expression profiles and housekeeping genes. For instance, a dedicated kit of the invention preferably comprises no more than 50, 40, 30, 25, 20, preferably no more than 15, preferably no more than 10, preferably no more than 9, 8, 7, 6, 5, 4, 3, 2, or 1 reagent(s) for determining the expression level of a gene that does not belong to one of the above described (i) expression profiles and that is not a housekeeping gene.
Such a kit for the in vitro diagnosis or prognosis of a CyTD or anti-inflammatory biological drug responsive or non responsive phenotype may further comprise instructions for determination of the presence or absence of a responsive phenotype.
Such a kit for the in vitro diagnosis or prognosis of a responsive phenotype may also further comprise at least one reagent for determining at least one additional parameter useful for the diagnosis or prognosis, such as standard biological parameters. In particular, said reagent is useful for performing any of the following tests: the Antinuclear Antibody test (ANA test), C-Reactive Protein test (CRP test), Erythrocyte sedimentation rate (ESR test), Cyclic Citrullinated Peptide Antibody test (CCP test), and the Rheumatoid Factor test.
In any kit for the in vitro diagnosis or prognosis of a responsive phenotype according to the invention, the reagent(s) for the determination of an expression profile comprising, or consisting of:
The determination of the expression profile may thus be performed using quantitative PCR and/or a nucleic microarray, preferably an oligonucleotide microarray, and/or protein microarrays.
In addition, the instructions for the determination of the presence or absence of a CyTD (notably TBA) or anti-inflammatory biological drug phenotype preferably include at least one reference expression profile, or at least one reference sample for obtaining a reference expression profile. In a preferred embodiment, at least one reference expression profile is a responsive expression profile. Alternatively, at least one reference expression profile may be a non responsive expression profile. More preferably, the determination of the level of responsiveness is carried out by comparison with both responsive and non-responsive expression profiles as described above.
The invention is also directed to a nucleic acid microarray comprising or consisting of nucleic acids specific for:
Said nucleic acid microarray may comprise additional nucleic acids specific for additional genes and optionally one or more housekeeping gene(s), but preferably consists of a maximum of 500, 400, 300, 200 preferably 100, 90, 80, 70 more preferably 60, 50, 45, 40, 35, 30, 25, 20, 15, 10, or even less (for instance 9, 8, 7, 6, 5, 4, 3, 2 or 1) distinct nucleic acids.
In a preferred embodiment, said nucleic acid microarray comprises no more than 50, 40, 30, 25, 20, preferably no more than 15, preferably no more than 10, preferably no more than 9, 8, 7, 6, 5, 4, 3, 2, or 1 distinct nucleic acids specific for a gene that does not belong to one of the above described (i) expression profiles and that is not a housekeeping gene.
Advantageously, said microarray consists of nucleic acids specific for:
In a preferred embodiment, said nucleic acid microarray is an oligonucleotide microarray comprising or consisting of oligonucleotides specific for:
The present invention also relates to systems (and computer readable medium for causing computer systems) to perform a method of diagnosis or prognosis of a CyTD or anti-inflammatory biological drug responding or non-responding phenotype, based on above described expression profiles.
In particular, in a specific embodiment, the invention also relates to a system 1 (see
In another specific embodiment, the invention further relates to a computer readable medium having computer readable instructions recorded thereon to define software modules including a comparison module and a display module for implementing a method on a computer, said method comprising:
Embodiments of the invention relating to systems and computer-readable media have been described through functional modules, which are defined by computer executable instructions recorded on computer readable media and which cause a computer to perform method steps when executed. The modules have been segregated by function for the sake of clarity. However, it should be understood that the modules need not correspond to discrete blocks of code and the described functions can be carried out by the execution of various code portions stored on various media and executed at various times. Furthermore, it should be appreciated that the modules may perform other functions, thus the modules are not limited to having any particular functions or set of functions.
The computer readable medium can be any available tangible media that can be accessed by a computer. Computer readable medium includes volatile and nonvolatile, removable and non-removable tangible media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer readable medium includes, but is not limited to, RAM (random access memory), ROM (read only memory), EPROM (erasable programmable read only memory), EEPROM (electrically erasable programmable read only memory), flash memory or other memory technology, CD-ROM (compact disc read only memory), DVDs (digital versatile disks) or other optical storage media, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage media, other types of volatile and non-volatile memory, and any other tangible medium which can be used to store the desired information and which can be accessed by a computer, including any suitable combination of the foregoing.
Computer-readable data embodied on one or more computer-readable media may define instructions, for example, as part of one or more programs, that, as a result of being executed by a computer, instruct the computer to perform one or more of the functions described herein (e.g., in relation to system 1, or computer readable medium), and/or various embodiments, variations and combinations thereof. Such instructions may be written in any of a plurality of programming languages, for example, Java, J#, Visual Basic, C, C#, C++, Fortran, Pascal, Eiffel, Basic, COBOL, assembly language, and the like, or any of a variety of combinations thereof. The computer-readable media on which such instructions are embodied may reside on one or more of the components of either of system 1, or computer readable medium described herein, may be distributed across one or more of such components, and may be in transition there between.
The computer-readable media may be transportable such that the instructions stored thereon can be loaded onto any computer resource to implement the aspects of the present invention discussed herein. In addition, it should be appreciated that the instructions stored on the computer readable media, or the computer-readable medium, described above, are not limited to instructions embodied as part of an application program running on a host computer. Rather, the instructions may be embodied as any type of computer code (e.g., software or microcode) that can be employed to program a computer to implement aspects of the present invention. The computer executable instructions may be written in a suitable computer language or combination of several languages. Basic computational biology methods are known to those of ordinary skill in the art and are described in, for example, Setubal and Meidanis et al., Introduction to Computational Biology Methods (PWS Publishing Company, Boston, 1997, ref 38); Salzberg, Searles, Kasif, (Ed.), Computational Methods in Molecular Biology, (Elsevier, Amsterdam, 1998, ref 39); Rashidi and Buehler, Bioinformatics Basics: Application in Biological Science and Medicine (CRC Press, London, 2000, ref 40) and Ouellette and Baxevanis, Bioinformatics: A Practical Guide for Analysis of Genes and Proteins (Wiley & Sons, Inc., 2nd ed., 2001, ref 41).
The functional modules of certain embodiments of the invention include a determination module 2, a storage device 3, a comparison module 4 and a display module 5. The functional modules can be executed on one, or multiple, computers, or by using one, or multiple, computer networks. The determination module 2 has computer executable instructions to provide expression level information in computer readable form.
As used herein, “expression level information” refers to information about expression level of any nucleotide (RNA or DNA) and/or amino acid sequences, either full-length or partial. In a preferred embodiment, it refers to the level of expression of mRNA or cDNA, measured by various technologies. The information may be qualitative (presence or absence of a transcript) or quantitative. Preferably it is quantitative. Methods for determining expression level information, i.e. determination modules 2, include systems for protein and DNA/RNA analysis, and in particular those described above for determination of expression profiles at the nucleic or protein level.
The expression level information determined in the determination module 2 can be read by the storage device 3. As used herein the “storage device” 3 is intended to include any suitable computing or processing apparatus or other device configured or adapted for storing data or information. Examples of electronic apparatus suitable for use with the present invention include stand-alone computing apparatus, data telecommunications networks, including local area networks (LAN), wide area networks (WAN), Internet, Intranet, and Extranet, and local and distributed computer processing systems. Storage devices 3 also include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage media, magnetic tape, optical storage media such as CD-ROM, DVD, electronic storage media such as RAM, ROM, EPROM, EEPROM and the like, general hard disks and hybrids of these categories such as magnetic/optical storage media. The storage device 3 is adapted or configured for having recorded thereon expression level information. Such information may be provided in digital form that can be transmitted and read electronically, e.g., via the Internet, on diskette, via USB (universal serial bus) or via any other suitable mode of communication including wireless communication between devices.
As used herein, “stored” refers to a process for encoding information on the storage device 3. Those skilled in the art can readily adopt any of the presently known methods for recording information on known media to generate manufactures comprising the expression level information.
A variety of software programs and formats can be used to store the expression level information on the storage device. Any number of data processor structuring formats (e.g., text file, spreadsheets or database) can be employed to obtain or create a medium having recorded thereon the expression level information.
By providing expression level information in computer-readable form, one can use the expression level information in readable form in the comparison module 4 to compare a specific expression profile with the reference data within the storage device 3. The comparison may notably be done using the various algorithms described above. The comparison made in computer-readable form provides a computer readable comparison result which can be processed by a variety of means. Content based on the comparison result can be retrieved from the comparison module 4 and displayed by the display module 5 to indicate a responding or non-responding phenotype.
Preferably, the reference data are expression level profiles that are indicative of both responding and non-responding phenotypes.
The “comparison module” 4 can use a variety of available software programs and formats for the comparison operative to compare expression level information determined in the determination module 2 to reference data, either directly, or indirectly using any software providing statistical classification algorithms such as those already described above.
The comparison module 4, or any other module of the invention, may include an operating system (e.g., Windows, Linux, Mac OS or UNIX) on which runs a relational database management system, a World Wide Web application, and a World Wide Web server. The World Wide Web application includes the executable code necessary for generation of database language statements (e.g., Structured Query Language (SQL) statements). Generally, the executables will include embedded SQL statements. In addition, the World Wide Web application may include a configuration file which contains pointers and addresses to the various software entities that comprise the server as well as the various external and internal databases which must be accessed to service user requests. The configuration file also directs requests for server resources to the appropriate hardware--as may be necessary should the server be distributed over two or more separate computers. In one embodiment, the World Wide Web server supports a TCP/IP protocol. Local networks such as this are sometimes referred to as “Intranets.” An advantage of such Intranets is that they allow easy communication with public domain databases residing on the World Wide Web (e.g., the GenBank or Swiss-Prot World Wide Web site). Thus, in a particular preferred embodiment of the present invention, users can directly access data (via Hypertext links for example) residing on Internet databases using a HTML interface provided by Web browsers and Web servers.
The comparison module 4 provides a computer readable comparison result that can be processed in computer readable form by predefined criteria, or criteria defined by a user, to provide a content 6 based in part on the comparison result that may be stored and output as requested by a user using a display module 5. The display module 5 enables display of a content 6 based in part on the comparison result for the user, wherein the content is a signal indicative of a responding or non-responding phenotype. Such a signal can be, for example, a display of content indicative of a responding or non-responding phenotype on a computer monitor, a printed page or printed report of content indicating a responding or non-responding phenotype from a printer, or a light or sound indicative of a responding or non-responding phenotype.
The content 6 based on the comparison result varies depending on the algorithm used for comparison.
For instance, when linear regression or derivatives thereof is used, the content 6 may include a probability of being responding or non-responding, or both a probability of being responding or non-responding and one or more threshold values, or merely a signal indicative of a responding or non-responding phenotype. When nearest neighbor (k-NN) is used, the content 6 may include the number or proportion of responding and non-responding expression profiles among the k closest profiles, or merely a signal indicative of a responding or non-responding phenotype. Moreover, the content 6 may simply be a continuous or categorical score reported in a numerical, text or graphical way (for example using a color code such as red, orange or green).
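By way of illustration only, the following is a minimal sketch of how a comparison module might derive such a content from a nearest neighbor comparison. The function name `knn_content`, the label strings, the use of Euclidean distance and the majority-vote rule for the signal are assumptions made for this sketch and are not specified in the present description.

```python
from collections import Counter
from math import dist


def knn_content(profile, references, k=3):
    """Summarise the k reference profiles closest to `profile`.

    `references` is a list of (expression_vector, label) pairs, where label is
    "responding" or "non-responding" (hypothetical label strings).
    Euclidean distance is an assumption; any suitable metric could be used.
    """
    if k < 1 or not references:
        raise ValueError("k must be >= 1 and references must not be empty")
    # Keep the k reference profiles with the smallest distance to the tested profile.
    nearest = sorted(references, key=lambda ref: dist(profile, ref[0]))[:k]
    counts = Counter(label for _, label in nearest)
    responding = counts["responding"]
    non_responding = counts["non-responding"]
    proportion = responding / len(nearest)
    # Assumed rule: a strict majority of responding neighbours gives a responding signal.
    signal = "responding" if proportion > 0.5 else "non-responding"
    return {
        "responding": responding,
        "non_responding": non_responding,
        "proportion_responding": proportion,
        "signal": signal,
    }
```

The returned dictionary corresponds to the two kinds of content mentioned above: the number or proportion of each class among the k closest profiles, and a bare signal.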
The display module 5 can be any suitable device configured to receive from a computer and display computer readable information to a user. Non-limiting examples include, for example, general-purpose computers such as those based on Intel PENTIUM-type processor, Motorola PowerPC, Sun UltraSPARC, Hewlett-Packard PA-RISC processors, any of a variety of processors available from Advanced Micro Devices (AMD) of Sunnyvale, Calif., or from ARM Holdings, or any other type of processor, visual display devices such as flat panel displays, cathode ray tubes and the like, as well as computer printers of various types or integrated devices such as laptops or tablets, in particular iPads.
In one embodiment, a World Wide Web browser is used for providing a user interface for display of the content 6 based on the comparison result. It should be understood that other modules of the invention can be adapted to have a web browser interface. Through the Web browser, a user may construct requests for retrieving data from the comparison module. Thus, the user will typically point and click to user interface elements such as buttons, pull down menus, scroll bars and the like conventionally employed in graphical user interfaces. The requests so formulated with the user's Web browser are transmitted to a Web application which formats them to produce a query that can be employed to extract the pertinent information.
In one embodiment, the display module 5 displays the comparison result and whether the comparison result is indicative of a responding or non-responding phenotype.
In one embodiment, the content 6 based on the comparison result that is displayed is a signal (e.g. positive or negative signal) indicative of a responding or non-responding phenotype, thus only a positive or negative indication may be displayed.
The present invention therefore provides for systems 1 (and computer readable medium for causing computer systems) to perform methods for diagnosing or prognosing a responding or non-responding phenotype, based on expression profiles information.
System 1, and computer readable medium, are merely illustrative embodiments of the invention for performing methods of diagnosing or prognosing a responding or non-responding phenotype based on expression profiles, and are not intended to limit the scope of the invention. Variations of system 1, and computer readable medium, are possible and are intended to fall within the scope of the invention.
The modules of the system 1 or used in the computer readable medium, may assume numerous configurations. For example, function may be provided on a single machine or distributed over multiple machines.
Having generally described this invention, a further understanding of characteristics and advantages of the invention can be obtained by reference to certain specific examples and figures which are provided herein for purposes of illustration only and are not intended to be limiting unless otherwise specified.
Materials and Methods
In this example, the materials and methodologies used in the subsequent examples are described.
Data Identification and Data Extraction: Studies were selected on the basis that they had been performed on RA patients naive to biologics who had started therapy with Infliximab, and that a measurement of their response to treatment was available at 14 or 22 weeks. Large scale gene expression information had to be available at baseline (prior to treatment). Following the steps described in Ramasamy et al. (10), we identified six studies that matched our research criteria: Lequerré et al. (6), Sekiguchi et al. (7), Bienkowska et al. (9), Julià et al. (8), Tanino et al. (36) and van Baarsen et al. (37). The expression data, the phenotypes and the annotation data were all downloaded from GEO (GSE3592, GSE8350, GSE12051, GSE15258, GSE20690 and GSE19821 respectively).
All six studies identified “Gene expression signatures of response to anti-TNF therapy”. Interestingly, however, no two publications used the same approach and this can partly explain the lack of overlap between the reported signatures. To make the six studies more comparable, we contacted the authors to obtain additional individual information such as the DAS28 at baseline and week 14, week 16 or week 22 to use a single definition of response (EULAR criteria—with “moderate” and “good” responders considered as responders) or detail of treatment to ensure that only Infliximab-treated patients were analyzed. We therefore reclassified patients as responders based on the EULAR definition at week 14, week 16 and week 22 and performed a binary analysis of good and moderate responders versus non-responders. This binary grouping is particularly suited for the identification of non responders. The final dataset is summarized in Table 1 and was the most homogeneous data we could obtain.
Data Quality and Processing: Data from Bienkowska et al. was the only data for which we downloaded the raw .CEL files and processed them using our internal protocols (normalization was performed using GC-RMA in refiner array by GENEDATA Expressionist® (Genedata AG, Basel, Switzerland)). Six chips were flagged with quality issues due to increased distortion. We thus excluded them from our analysis.
Data from Lequerré et al., Julià et al. and Sekiguchi et al. were all downloaded as expression matrices, which correspond to expression values after normalization. The data by Lequerré et al. included technical replicates; we averaged the technical replicates and excluded the control samples from the analysis. Our internal quality control procedures, based on the expression profile of sex-specific genes, led us to exclude this dataset from further analysis.
To translate probe information into gene information we reblasted the available probe sequences to the latest version of the human genome (Hsap37). Probes were then selected when they mapped within a transcribed region of a gene. When multiple probes were available for a gene, we selected the probe that was closest to the 3′ end of the gene. An alternative approach that was also implemented was to select the probe with the highest average value. Therefore, for each gene, only one probe contributed to the analysis.
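The two probe selection rules described above can be sketched as follows. This is an illustrative sketch only: the record layout (`gene`, `probe_id`, `distance_to_3p`, `values`), the strategy names and the probe identifiers are hypothetical and do not reflect the actual internal implementation.

```python
from statistics import mean


def select_probes(probes, strategy="three_prime"):
    """Keep exactly one probe per gene.

    `probes` is a list of dicts with the hypothetical keys:
      gene            gene symbol the probe maps to
      probe_id        probe identifier
      distance_to_3p  distance (in bases) from the probe to the 3' end of the gene
      values          normalized expression values across samples
    """
    best = {}
    for probe in probes:
        if strategy == "three_prime":
            # A smaller distance to the 3' end is better, so negate it for ranking.
            rank = -probe["distance_to_3p"]
        elif strategy == "highest_mean":
            rank = mean(probe["values"])
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        current = best.get(probe["gene"])
        if current is None or rank > current[0]:
            best[probe["gene"]] = (rank, probe["probe_id"])
    return {gene: probe_id for gene, (_, probe_id) in best.items()}
```

Running the same probe table through both strategies yields the two gene-level datasets on which the separate meta-analyses can then be based.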
Statistical Analysis: Because the choice of parameters affects the results of a meta-analysis, we decided to perform three analyses: the first is based on the most 3′ probe for probe selection and the Z statistic for variable selection; the second is based on the most highly expressed probe and the Z statistic for variable selection; and the third is based on the most 3′ probe for probe selection and the MetaArray package.
The statistical analysis was performed in R using the MetaArray package (34) or a classical Z test. The MetaArray package implements the latent variable model described in (35) as well as the integrative correlation (12). The contribution of individual probes was estimated using the t-test as implemented in the multtest library in R. We set the threshold for significance at a p-value of 0.10 because only 2 genes were significant at 5%. For the Z test, a p-value of 5% was used as a threshold of significance.
Results
Table 2 provides the Z score value (indicative of the direction), the p-value, the gene symbol, the gene title and the Gene ID (Notation from Entrez Gene NCBI database, updated on December 2010) for the list of significant genes (p-value <5%) based on the 3′ approach to gene mapping and the Z statistic.
Table 3 provides the Z score value (indicative of the direction), the p-value, the Gene Symbol, the Gene Title and the Gene ID (Notation from Entrez Gene NCBI database, updated on December 2010) based on the most highly expressed probe and the Z statistic.
Table 4 provides the Gene Symbol, the Gene Title and the Gene ID (Notation from Entrez Gene NCBI database, updated on December 2010) for the list of borderline significant genes (p-value <10%) based on the 3′ approach to gene mapping and the MetaArray package in R.
For the 45 genes identified in Example 1, probes specific to Taqman assays were ordered from APPLIED BIOSYSTEMS based on APPLIED's inventoried probes or were designed internally using standard software. After our internal quality control steps, 35 out of the 45 genes of the present invention were tested on 40 RA samples. The 40 samples used for the qPCR represent a subset of the original samples used for microarray analysis in the study of Julià et al. The delta-delta CT (ddCT) method was used to measure gene expression levels after carefully selecting reference genes internally. The following statistical analyses have been performed: Identification of individually differentially expressed genes between the two groups of responders versus non responders (Table 5). The selection criterion was a significant t-test (p-value<0.05). The following 8 genes were found to be significant: MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B and CFLAR.
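For readers unfamiliar with relative quantification, the following sketch shows the standard delta-delta CT calculation. It assumes that the reference value is the mean CT of the selected reference genes and that a calibrator delta CT is supplied by the user; the present description does not state which reference genes or calibrator were used, so these inputs and the function names are hypothetical.

```python
from statistics import mean


def delta_delta_ct(ct_target, ct_references, calibrator_delta_ct=0.0):
    """Return the ddCT value of a target gene in one sample.

    ct_target            CT of the target gene in the sample
    ct_references        CT values of the reference gene(s) in the same sample
    calibrator_delta_ct  delta CT of the same target in a calibrator sample
                         (assumed input; 0.0 leaves the delta CT unchanged)
    """
    # Delta CT: normalise the target CT to the mean reference CT of the sample.
    delta_ct = ct_target - mean(ct_references)
    # Delta-delta CT: express the delta CT relative to the calibrator.
    return delta_ct - calibrator_delta_ct


def relative_expression(ddct):
    """Fold change under the usual assumption of ~100% amplification efficiency."""
    return 2.0 ** (-ddct)
```

For example, a target CT of 25.0 with reference CTs of 20.0 and 22.0 and a calibrator delta CT of 3.0 gives a ddCT of 1.0, i.e. a relative expression of 0.5.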
Additionally logistic regression and kNN classification (k=2, 3 and 5) were performed on the eight significant genes to increase the discriminatory performance compared to individual gene discrimination (Table 6). The three gene combination of MKNK1, TBX21 and TGFBR3 was found to be particularly discriminant. The best two gene combination identified from logistic regression using a forward variable selection is MKNK1 and TBX21.
Genes rarely operate individually, and thus a gene of a signature can easily be replaced by a gene whose expression correlates with it to achieve similar discriminatory performances. To that effect we identified the genes that most correlate with the eight genes from our claim. Correlation analysis is first based on two genomewide datasets to ensure that most genes of the genome are evaluated: the 44 microarray chips of Julià et al. as well as the entire set of microarray chips of Bienkowska et al. (86 chips including other anti-TNFs). The 20 most positively correlated to each of the eight genes are displayed on
Correlation analysis is then further applied to the qPCR data to illustrate the impact of replacing genes by correlated genes on the discriminatory performance of the signature Tables 9, 10 and
Table 9 below further lists genes whose expression levels, as measured by RT-qPCR in whole blood, correlate at more than 0.6 (Pearson's correlation coefficient superior to 0.6, p-val<0.0001) to 10 candidate genes. Those results were generated from the 90 samples described in Example 4.
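The selection of equivalent genes by correlation can be sketched as follows. This sketch only applies the 0.6 threshold on Pearson's correlation coefficient; it does not compute the associated p-value. The function names and the layout of the `expression` dictionary are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # constant profile: correlation is undefined, treated here as 0
    return cov / (sd_x * sd_y)


def correlated_genes(expression, candidate, threshold=0.6):
    """List genes whose expression correlates with `candidate` above `threshold`.

    `expression` maps a gene symbol to its expression levels across the same
    ordered set of samples (hypothetical layout).
    """
    hits = []
    for gene, values in expression.items():
        if gene == candidate:
            continue
        r = pearson(expression[candidate], values)
        if r > threshold:
            hits.append((gene, r))
    # Most strongly correlated genes first.
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

A gene returned by `correlated_genes` is a candidate replacement for the corresponding signature gene; whether the replacement preserves performance still has to be checked on the classification results.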
To illustrate the equivalence of the performance between a signature based on 8 genes and a signature based on 8 genes with one or the other gene replaced by an equivalent gene (such as one from the list in Table 9), we have replaced PRF1 by IL2RB in the signature, or CFLAR by IFITM2. The resulting performances are only slightly lower as shown in Table 10 below, thus showing that replacing an original gene in a claimed gene combination by an equivalent gene as disclosed in the application still permits a reliable prognosis.
The possibility to replace genes by equivalent ones based on their correlation coefficient is further illustrated in
In Example 2, the signatures of responsiveness to Infliximab treatment in RA-patients have been tested in samples of 40 RA patients.
Similarly, a signature of responsiveness to Infliximab treatment in RA patients, consisting of MAPK14 and GNLY and disclosed in PCT application N° PCT/EP2011/054569, had been tested in that PCT application in the same samples of 40 RA patients.
To further validate these signatures, they have been further tested in a group of 90 RA patients treated with Infliximab corresponding to the previously tested RA patients and 50 additional RA patients. In addition, two further signatures corresponding to the 8 genes signature (MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B and CFLAR) described above in Example 2, to which are added two further genes ((CD14 and TGFBR2) or (MAPK14 and GNLY)), have been further tested in the group of 90 patients.
Patients and Methods
Patients
Among the 90 tested patients, 40 are the same as those already tested in Example 2; 3 additional samples were included from the same source. Thus 42 samples used for the qPCR represent a subset of the original 43 samples used in the statistical analysis of the microarray data in the study of Julià et al. One outlier in microarray data (sample 44) that was excluded from microarray analysis was not an outlier in the qPCR analysis and hence contributes as an independent sample. Therefore 43 (36 responders, 7 non responders) RNA samples from Julià et al. were used in the qPCR analysis. An additional 47 samples from RA patients meeting our inclusion criteria were obtained from three additional sites: 24 (14 responders, 10 non responders) from Japan, 14 (10 responders, 4 non responders) from the UK and 9 (8 responders, 1 non responder) from France. The delta-delta CT (ddCT) method was used to measure gene expression levels using the same reference genes as in Example 2.
Patient characteristics for each source are described in Table 11.
Methods
Table 12 below provides the Gene Symbol, the Gene Title, the Gene ID (Notation from Entrez Gene NCBI database, updated on 8 May 2011), and the reference mRNA sequence(s) (Entrez Gene NCBI database, updated on 8 May 2011) for all human genes present in one or more of the tested signatures:
qPCR experiments have been performed as described in Example 2.
Cross-Validation Methods
To robustly estimate the performance of the 8 gene signature, we applied the signature of the 8 genes developed in example 2 to the 90 samples. In a first analysis 43 samples were used for the learning phase and the remaining 47 as an independent set. In a second approach we aimed at improving on the existing signature by adding additional genes to improve the discriminatory performance. For this second approach we used leave one out cross validation on the entire 90 samples.
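The leave one out cross validation used in the second approach can be sketched as follows. The classifier shown here is a simple k nearest neighbor vote chosen purely for illustration; the names `knn_predict` and `leave_one_out` and the label strings are hypothetical, and any other classifier could be passed in its place.

```python
from math import dist


def knn_predict(train, profile, k=3):
    """Predict a label by majority vote among the k nearest training samples."""
    nearest = sorted(train, key=lambda sample: dist(sample[0], profile))[:k]
    votes = sum(1 for _, label in nearest if label == "non-responder")
    return "non-responder" if votes > len(nearest) / 2 else "responder"


def leave_one_out(samples, predict=knn_predict):
    """Return the leave-one-out error rate.

    `samples` is a list of (expression_vector, label) pairs. Each sample is
    held out in turn, the classifier is fitted on the remaining samples, and
    the held-out sample is predicted.
    """
    errors = 0
    for i, (profile, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # every sample except the i-th
        if predict(train, profile) != label:
            errors += 1
    return errors / len(samples)
```

Because each sample is predicted by a model that never saw it, the resulting error rate is less optimistic than an error rate measured on the training samples themselves.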
Results and Discussion
The prognosis of a non-responding condition (to Infliximab) was analyzed. As a result, a non-responding condition or test outcome is considered a positive result, while a responding condition or test outcome is considered a negative result. True and false positive results, NPV, PPV, specificity, sensitivity, and error rate are defined and calculated as follows:
PPV=TP/(TP+FP)
NPV=TN/(TN+FN)
Specificity=TN/(TN+FP)
Sensitivity=TP/(TP+FN)
Error rate=(FP+FN)/Total number of patients
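The five definitions above translate directly into code. The sketch below assumes that every denominator is non-zero; the function name and the counts used in the example are hypothetical and are not taken from the tables of the present description.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute PPV, NPV, specificity, sensitivity and error rate.

    A positive result is a non-responding prognosis, as defined above:
      tp  non-responders prognosed as non-responders
      fp  responders prognosed as non-responders
      tn  responders prognosed as responders
      fn  non-responders prognosed as responders
    """
    total = tp + fp + tn + fn
    return {
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "specificity": tn / (tn + fp),
        "sensitivity": tp / (tp + fn),
        "error_rate": (fp + fn) / total,
    }


# Hypothetical counts, for illustration only.
example = diagnostic_metrics(tp=8, fp=2, tn=30, fn=4)
```

With these hypothetical counts the PPV is 0.8 and the specificity is 0.9375; the four false negatives are the non-responders who would have been treated despite not responding.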
The main clinical objective of a signature is not to put patients under inefficacious treatment, therefore a signature should demonstrate a low level of False Negative results.
Results obtained on at least a subgroup of the 90 RA patients are provided below for:
Samples for which there was missing data based on invalid qPCR results were excluded.
The values of PPV, NPV, sensitivity and specificity of all four tested signatures are summarized in Table 17 below:
Data summarized in Table 17 show that the first four tested signatures have high NPV value for prognosis of non-responding condition, meaning that, in each case, a high proportion of patients with a negative test (i.e. prognosed as responding to Infliximab) actually do respond to the Infliximab treatment, so that, based on these prognosis methods of non-responding condition, there will be very few patients prognosed as responding that will fail to respond to the treatment. Thus, based on these prognosis methods, Infliximab will be administered almost only to patients that will actually respond to the treatment. The first four tested signatures also have an acceptable or high specificity for prognosis of non-responding condition, meaning that a high proportion of patients that would actually respond to Infliximab treatment are correctly prognosed as responders using the signatures. Thus, based on the above prognosis methods of non-responding condition according to the invention, most patients that would benefit from Infliximab treatment will actually receive this treatment. The two signatures of greatest clinical interest are the initial eight gene signature and the combined 10 genes signature that includes MAPK14 and GNLY, for which the sensitivity is higher than for the other two.
The first four tested signatures are thus helpful prognosis tools for identifying RA patients that will or will not respond to an Infliximab treatment.
In addition, data obtained for the 8 genes signature (MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B and CFLAR) and the two 10 genes signatures (MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B, CFLAR, CD14 and TGFBR2/MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B, CFLAR, MAPK14 and GNLY), which both include the 8 genes signature, confirm that the addition of one or more genes to this 8 genes signature does not affect its ability to correctly prognose RA patients as responders or non responders to Infliximab treatment. In contrast, the 10 genes signature combining the 8 genes signature of Example 2 and the 2 genes signature of PCT application PCT/EP2011/054569 shows particularly high NPV and specificity values (the highest of all four signatures), while keeping quite high PPV (the highest of all four signatures) and sensitivity (second best of all four signatures) values. This 10 genes signature (MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B, CFLAR, MAPK14 and GNLY) appears to be a particularly good prognosis tool of response and non response to Infliximab treatment in RA patients. All the above methods may also be used for prognosis of non response to other TBA treatments, as well as other CyTD treatments.
In addition, Table 17 also shows that MKNK1 alone permits classification of patients with an error rate of only 0.23. In particular, a prognosis based on MKNK1 is highly specific and also has a high NPV value. Moreover, the addition of at least one other gene significantly improves all values (PPV, NPV, sensitivity, specificity, error rate), and a NPV value of at least 0.85 is obtained when at least one gene is added to MKNK1. These results confirm that MKNK1 is a key gene for the prediction of response or non-response to Infliximab and that gene signatures based on MKNK1 and preferably at least one more gene may result in reliable prediction.
In summary, several signatures (or genes combinations) useful for prognosis of non-responsiveness to CyTD treatment (in particular Infliximab treatment) in subjects suffering from an inflammatory disease (in particular rheumatoid arthritis, RA) have been identified, three of which comprise the 8 following genes: MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B and CFLAR. The best signature identified so far for prognosis of non-responsiveness to CyTD treatment (in particular Infliximab treatment) in subjects suffering from an inflammatory disease (in particular RA) comprises or consists of genes MKNK1, PRF1, TBX21, TGFBR3, IFNGR2, FYN, IL1B, CFLAR, MAPK14 and GNLY.
An example of a highly discriminant algorithm is provided; it can be applied at baseline (i.e. prior to treatment administration) to predict whether a patient will be a responder or a non responder to Infliximab. In this example the biological measures are based on gene expression profiles of 11 genes as measured by relative quantitative PCR on RNA extracted from peripheral blood. Those measures are then combined in a linear model to provide a score; a positive score means that the patient is predicted to be a responder. Such an algorithm can easily be implemented in existing software or coded specifically.
Patients and Methods
Patients
In this example 65 Caucasian RA patients were analysed. All 65 patients were eligible for first line anti-TNF therapy; as such they were naive to biologics and refractory to classical DMARDs. The 65 patients have already been described in Example 4 (the samples from France, Spain and UK).
Methods
The delta-delta CT (ddCT) method was used to measure gene expression levels, using the same reference genes and procedures as in examples 2 and 4. Table 12 above provides the Gene Symbol, the Gene Title, the Gene ID (notation from the Entrez Gene NCBI database, updated on 8 May 2011), and the reference mRNA sequence(s) (Entrez Gene NCBI database, updated on 8 May 2011) of 12 genes, 11 of which are targeted by the Applied Biosystems probes used in this example. Table 18 below provides the Applied Biosystems probe identifiers.
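As an illustration of the general delta-delta CT calculation, the sketch below normalises a target gene's CT value to a reference gene and then to a calibrator sample. It shows the textbook form of the method only. The function names, the use of a single reference gene and a single calibrator, the assumption of roughly 100% amplification efficiency, and the CT values are all hypothetical; the procedure actually applied is the one defined in examples 2 and 4.

```python
def delta_delta_ct(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Return ddCT = (CT_target - CT_ref) of the sample minus the same difference in the calibrator."""
    d_ct_sample = ct_target - ct_reference              # dCT of the patient sample
    d_ct_calibrator = ct_target_cal - ct_reference_cal  # dCT of the calibrator sample
    return d_ct_sample - d_ct_calibrator


def relative_expression(dd_ct):
    """Fold change versus the calibrator, assuming the amount of product doubles each PCR cycle."""
    return 2.0 ** (-dd_ct)


# Hypothetical CT values for illustration only (not data from this document).
dd = delta_delta_ct(ct_target=24.0, ct_reference=18.0,
                    ct_target_cal=26.0, ct_reference_cal=18.5)
print(f"ddCT = {dd:.2f}, relative expression = {relative_expression(dd):.2f}")
```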
Results and Discussion
Score = −2608.15 − 588.10*ddCT_MKNK1 + 218.42*ddCT_TBX21 + 211.81*ddCT_TGFBR3 − 99.48*ddCT_PRF1 + 325.63*ddCT_IFNGR2 − 247.78*ddCT_IL1B + 272.59*ddCT_CFLAR − 896.87*ddCT_TGFBR2 − 439.47*ddCT_CD14 + 69.04*ddCT_GNLY + 480.50*ddCT_MAPK14
Equation 1: This equation provides a means to estimate the probability that a patient is a responder or non-responder to Infliximab.
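By way of illustration, the linear model of Equation 1 could be coded as in the following minimal sketch. The intercept and coefficients are transcribed from Equation 1, with the first term taken to be that of MKNK1. The function names, the dictionary-based input, and the treatment of a score of exactly zero as a non-responder prediction are assumptions made for this sketch, and the input values are hypothetical.

```python
# Intercept and coefficients transcribed from Equation 1.
INTERCEPT = -2608.15
COEFFICIENTS = {
    "MKNK1": -588.10,
    "TBX21": 218.42,
    "TGFBR3": 211.81,
    "PRF1": -99.48,
    "IFNGR2": 325.63,
    "IL1B": -247.78,
    "CFLAR": 272.59,
    "TGFBR2": -896.87,
    "CD14": -439.47,
    "GNLY": 69.04,
    "MAPK14": 480.50,
}


def infliximab_score(dd_ct):
    """Compute the linear score from a mapping of gene symbol to ddCT value."""
    missing = sorted(set(COEFFICIENTS) - set(dd_ct))
    if missing:
        raise ValueError(f"missing ddCT values for: {', '.join(missing)}")
    return INTERCEPT + sum(coef * dd_ct[gene] for gene, coef in COEFFICIENTS.items())


def predict(dd_ct):
    """A positive score predicts a responder; a zero score is treated as non-responder (assumption)."""
    return "responder" if infliximab_score(dd_ct) > 0 else "non-responder"


# Hypothetical ddCT values for illustration only (not patient data).
example = {gene: 0.0 for gene in COEFFICIENTS}
example["TGFBR2"] = -4.0
print(round(infliximab_score(example), 2), predict(example))
```

Keeping the coefficients in a single mapping makes it straightforward to check that every one of the 11 genes has been measured before a score is reported.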
The coefficients in Equation 1 were estimated by training on the 65 samples. The score resulting from the linear model described in Equation 1 perfectly discriminates all responders from non-responders, as can be seen in
1. Lee D M, Weinblatt M E. Rheumatoid arthritis. Lancet. 2001 Sep. 15; 358(9285):903-11.
2. Choy E H, Panayi G S. Cytokine pathways and joint inflammation in rheumatoid arthritis. N Engl J Med. 2001 Mar. 22; 344(12):907-16.
3. Kooloos W M, de Jong D J, Huizinga T W, Guchelaar H J. Potential role of pharmacogenetics in anti-TNF treatment of rheumatoid arthritis and Crohn's disease. Drug Discovery Today. 2007; 12(3-4):125-31.
4. Isaacs J D. Antibody engineering to develop new antirheumatic therapies. Arthritis Res Ther. 2009; 11(3):225.
5. Hetland M L, Christensen I J, Tarp U, Dreyer L, Hansen A, Hansen I T, et al. Direct comparison of treatment responses, remission rates, and drug adherence in patients with rheumatoid arthritis treated with adalimumab, etanercept, or infliximab: results from eight years of surveillance of clinical practice in the nationwide Danish DANBIO registry. Arthritis Rheum. 2010 January; 62(1):22-32.
6. Lequerre T, Gauthier-Jauneau A C, Bansard C, Derambure C, Hiron M, Vittecoq O, et al. Gene profiling in white blood cells predicts infliximab responsiveness in rheumatoid arthritis. Arthritis Res Ther. 2006; 8(4):R105.
7. Sekiguchi N, Kawauchi S, Furuya T, Inaba N, Matsuda K, Ando S, et al. Messenger ribonucleic acid expression profile in peripheral blood cells from RA patients following treatment with an anti-TNF-alpha monoclonal antibody, infliximab. Rheumatology (Oxford). 2008 June; 47(6):780-8.
8. Julia A, Erra A, Palacio C, Tomas C, Sans X, Barcelo P, et al. An eight-gene blood expression profile predicts the response to infliximab in rheumatoid arthritis. PLoS One. 2009; 4(10):e7556.
9. Bienkowska J R, Dalgin G S, Batliwalla F, Allaire N, Roubenoff R, Gregersen P K, et al. Convergent Random Forest predictor: methodology for predicting drug response from genome-scale data applied to anti-TNF response. Genomics. 2009 December; 94(6):423-32.
10. Ramasamy A, Mondry A, Holmes C C, Altman D G. Key issues in conducting a meta-analysis of gene expression microarray datasets. PLoS Med. 2008 Sep. 30; 5(9):e184.
11. Arnett F C, Edworthy S M, Bloch D A, McShane D J, Fries J F, Cooper N S, et al. The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum. 1988 March; 31(3):315-24.
12. Parmigiani G, Garrett-Mayer E S, Anbazhagan R, Gabrielson E. A cross-study comparison of gene expression studies for the molecular classification of lung cancer. Clin Cancer Res. 2004 May 1; 10(9):2922-7.
13. Barton A, Thomson W, Ke X, Eyre S, Hinks A, Bowes J, et al. Rheumatoid arthritis susceptibility loci at chromosomes 10p15, 12q13 and 22q13. Nat Genet. 2008 October; 40(10):1156-9.
14. Goronzy J J, Weyand C M. Developments in the scientific understanding of rheumatoid arthritis. Arthritis Res Ther. 2009; 11(5):249.
15. Lorenz E, Muhlebach M S, Tessier P A, Alexis N E, Duncan Hite R, Seeds M C, et al. Different expression ratio of S100A8/A9 and S100A12 in acute and chronic lung diseases. Respir Med. 2008 April; 102(4):567-73.
16. Cheng P, Corzo C A, Luetteke N, Yu B, Nagaraj S, Bui M M, et al. Inhibition of dendritic cell differentiation and accumulation of myeloid-derived suppressor cells in cancer is regulated by S100A9 protein. J Exp Med. 2008 Sep. 29; 205(10):2235-49.
17. Lim S Y, Raftery M, Goyette J, Hsu K, Geczy C L. Oxidative modifications of S100 proteins: functional regulation by redox. J Leukoc Biol. 2009; 86(3):577-87.
18. Simard J C, Girard D, Tessier P A. Induction of neutrophil degranulation by S100A9 via a MAPK-dependent mechanism. J Leukoc Biol. 2010 Jan. 26.
19. Chen Y S, Yan W, Geczy C L, Brown M A, Thomas R. Serum levels of soluble receptor for advanced glycation end products and of S100 proteins are associated with inflammatory, autoantibody, and classical risk markers of joint and vascular damage in rheumatoid arthritis. Arthritis Res Ther. 2009; 11(2):R39.
20. Groh V, Bruhl A, El-Gabalawy H, Nelson J L, Spies T. Stimulation of T cell autoreactivity by anomalous expression of NKG2D and its MIC ligands in rheumatoid arthritis. Proc Natl Acad Sci USA. 2003 Aug. 5; 100(16):9452-7.
21. Paul R, Obermaier B, Van Ziffle J, Angele B, Pfister H W, Lowell C A, et al. Myeloid Src kinases regulate phagocytosis and oxidative burst in pneumococcal meningitis by activating NADPH oxidase. J Leukoc Biol. 2008 October; 84(4):1141-50.
22. Fumagalli L, Zhang H, Baruzzi A, Lowell C A, Berton G. The Src family kinases Hck and Fgr regulate neutrophil responses to N-formyl-methionyl-leucyl-phenylalanine. J Immunol. 2007 Mar. 15; 178(6):3874-85.
23. Mocsai A, Ligeti E, Lowell C A, Berton G. Adhesion-dependent degranulation of neutrophils requires the Src family kinases Fgr and Hck. J Immunol. 1999 Jan. 15; 162(2):1120-6.
24. Bosco M C, Curiel R E, Zea A H, Malabarba M G, Ortaldo J R, Espinoza-Delgado I. IL-2 signaling in human monocytes involves the phosphorylation and activation of p59hck. J Immunol. 2000 May 1; 164(9):4575-85.
25. Deng A, Chen S, Li Q, Lyu S C, Clayberger C, Krensky A M. Granulysin, a cytolytic molecule, is also a chemoattractant and proinflammatory activator. J Immunol. 2005 May 1; 174(9):5243-8.
26. Krensky A M, Clayberger C. Biology and clinical relevance of granulysin. Tissue Antigens. 2009 March; 73(3):193-8.
27. Martinon F, Tschopp J. Inflammatory caspases and inflammasomes: master switches of inflammation. Cell Death Differ. 2007 January; 14(1):10-22.
28. Kurokawa M, Kornbluth S. Caspases and kinases in a death grip. Cell. 2009 Sep. 4; 138(5):838-54.
29. Morel J, Audo R, Hahne M, Combe B. Tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) induces rheumatoid arthritis synovial fibroblast proliferation through mitogen-activated protein kinases and phosphatidylinositol 3-kinase/Akt. J Biol Chem. 2005 Apr. 22; 280(16):15709-18.
30. Korb A, Tohidast-Akrad M, Cetin E, Axmann R, Smolen J, Schett G. Differential tissue expression and activation of p38 MAPK alpha, beta, gamma, and delta isoforms in rheumatoid arthritis. Arthritis Rheum. 2006 September; 54(9):2745-
31. Fransen J, van Riel PLCM. The Disease Activity Score and the EULAR response criteria. Clin Exp Rheumatol. 2005; 23(5 Suppl 39):S93-9.
32. Lorenz H M et al. Arthritis Res. 2002; 4 Suppl 3:S17-24.
33. Atzeni F et al. Autoimmun Rev. 2007 September; 6(8):529-36.
34. Choi H et al. A latent variable approach for meta-analysis of gene expression data from multiple microarray experiments. BMC Bioinformatics. 2007 September; 8(364).
35. Choi H et al. Latent variable modelling for combining genomic data from multiple studies. Unpublished manuscript (2005).
36. Tanino M et al. Prediction of efficacy of anti-TNF biologic agent, infliximab, for rheumatoid arthritis patients using a comprehensive transcriptome analysis of white blood cells. Biochem Biophys Res Commun. 2009 September; 387(2):261-5.
37. van Baarsen L G et al. Regulation of IFN response gene activity during infliximab treatment in rheumatoid arthritis is associated with clinical response to treatment. Arthritis Res Ther. 2010; 12(1):R11.
38. Setubal and Meidanis et al., Introduction to Computational Biology Methods (PWS Publishing Company, Boston, 1997).
39. Salzberg, Searles, Kasif, (Ed.), Computational Methods in Molecular Biology (Elsevier, Amsterdam, 1998).
40. Rashidi and Buehler, Bioinformatics Basics: Application in Biological Science and Medicine (CRC Press, London, 2000).
41. Ouellette and Baxevanis, Bioinformatics: A Practical Guide for Analysis of Gene and Proteins (Wiley & Sons, Inc., 2nd ed., 2001).
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2012/051164 | 1/25/2012 | WO | 00 | 10/20/2014 |
Number | Date | Country | |
---|---|---|---|
61457191 | Jan 2011 | US | |
61457743 | May 2011 | US | |
61588390 | Jan 2012 | US |