The present invention pertains to a method for determining the activity of a biological pathway, said biological pathway being associated with a disease. More precisely, the method consists of evaluating the level of activity of said pathway in a patient suffering from said disease, based on gene expression profiling. The invention also concerns the application of this method to targeted therapies.
Gene expression profiling by microarray technology has become a widely used strategy for investigating the molecular mechanisms underlying many complex diseases. A wide range of methods for microarray data analysis has evolved, ranging from simple fold-change approaches to complex and computationally demanding techniques. However, the analysis is further complicated by the biological heterogeneity encountered in most diseases.
A common observation in the analysis of gene expression is that many genes show similar expression patterns (1) and may share biological functions under common regulatory control. Moreover, these co-expressed genes are frequently clustered according to their expression patterns in subsets of experimental conditions (2). Thus, gene co-expression, and not only differential expression, could be informative as well.
However, the method commonly used in the literature does not take into account the activation status of the biological signature, which can lead to misclassification (3-7).
Thus, the inventors have developed a method for determining a level of activity of a biological pathway of a patient suffering from a disease, providing a response indicative of the activity or inactivity of said biological pathway, which avoids misclassifications.
The method of the present invention makes it possible to identify truly active biological networks, which are associated only with high levels of correlation among the components of the biological signature. Indeed, taking this correlation aspect into account in the interpretation of biological networks should make it possible to capture the mechanisms actually activated at the cellular level.
A biological signature is defined by a set of genes or their products that share one or more biological processes. When genes are co-regulated or co-activated under various biological conditions, the corresponding expression profiles may display relative similarity, or co-expression.
The development, the interest and the illustration of a method of the invention are set out below in relation to rheumatoid arthritis (RA). Of course, the invention is not restricted to RA; it extends to any pathology with which at least one biological pathway can be associated. As a further example, said pathology may be systemic lupus erythematosus (SLE), multiple sclerosis (MS), Sjögren's syndrome, type I diabetes, dermatomyositis, etc.
The method of the invention for determining a level of activity of a biological pathway from a biological sample of a patient suffering from a pathology comprises the following steps:
measuring the level of expression of at least three genes from a group of individuals, control individuals or patients suffering from the pathology, for which the biological pathway is inactive, said at least three genes being associated with said biological pathway, for establishing a negative reference, and
measuring the level of expression of said at least three genes from a group of individuals, control individuals or patients suffering from the pathology, for which the biological pathway is active, for establishing a positive reference, and
measuring the level of expression of said at least three genes of said patient in the sample, and
comparing the level of expression of said at least three genes in the sample of said patient with the level of expression of said at least three genes of the negative reference and determining a value C− which corresponds to a correlation level between the level of expression of said at least three genes of said negative reference and the level of expression of said at least three genes of said patient, and
comparing the level of expression of said at least three genes in the sample of said patient with the level of expression of said at least three genes of the positive reference and determining a value C+ which corresponds to a correlation level between the level of expression of said at least three genes of said positive reference and the level of expression of said at least three genes of said patient, and
establishing a ratio C+/C− which gives a correlation score, wherein
In accordance with the present invention, the phrase “control individual” encompasses any individual in whom said pathology is not diagnosed; in particular, it encompasses healthy individuals, patients suffering from any pathology other than said pathology, and asymptomatic patients suffering from said pathology.
The present invention also concerns a method for establishing in vitro a prognosis of developing a pathology for a healthy individual, by determining a level of activity of a biological pathway, said method comprising the steps of:
providing a sample from the healthy individual,
measuring the level of expression of at least three genes from a group of individuals, control individuals or patients suffering from the pathology, for which the biological pathway is inactive, said at least three genes being associated with said biological pathway, for establishing a negative reference, and
measuring the level of expression of said at least three genes from a group of individuals, control individuals or patients suffering from the pathology, for which the biological pathway is active, for establishing a positive reference, and
measuring the level of expression of said at least three genes of said healthy individual in the sample, and
comparing the level of expression of said at least three genes in the sample of said healthy individual with the level of expression of said at least three genes of the negative reference and determining a value C− which corresponds to a correlation level between the level of expression of said at least three genes of said negative reference and the level of expression of said at least three genes of said healthy individual, and
comparing the level of expression of said at least three genes in the sample of said healthy individual with the level of expression of said at least three genes of the positive reference and determining a value C+ which corresponds to a correlation level between the level of expression of said at least three genes of said positive reference and the level of expression of said at least three genes of said healthy individual, and
establishing a ratio C+/C− which gives a correlation score, wherein
In another embodiment, the invention also relates to a method for establishing in vitro a diagnosis of a pathology for a patient, by determining a level of activity of a biological pathway, said method comprising the steps of:
providing a sample from the patient,
measuring the level of expression of at least three genes from a group of individuals, control individuals or patients suffering from the pathology, for which the biological pathway is inactive, said at least three genes being associated with said biological pathway, for establishing a negative reference, and
measuring the level of expression of said at least three genes from a group of individuals, control individuals or patients suffering from the pathology, for which the biological pathway is active, for establishing a positive reference, and
measuring the level of expression of said at least three genes of said patient in the sample, and
comparing the level of expression of said at least three genes in the sample of said patient with the level of expression of said at least three genes of the negative reference and determining a value C− which corresponds to a correlation level between the level of expression of said at least three genes of said negative reference and the level of expression of said at least three genes of said patient, and
comparing the level of expression of said at least three genes in the sample of said patient with the level of expression of said at least three genes of the positive reference and determining a value C+ which corresponds to a correlation level between the level of expression of said at least three genes of said positive reference and the level of expression of said at least three genes of said patient, and
establishing a ratio C+/C− which gives a correlation score, wherein
As mentioned above, the invention also pertains to the use of the above-described method for targeted/individualized therapies.
Hence, the present invention also concerns a method for assessing whether a patient having a pathology is in need of administration of a drug, said drug interacting directly or indirectly with a biological pathway, in which the biological activity of the biological pathway is determined and said biological activity is associated with said pathology, said method comprising the steps of:
providing a sample from the patient,
measuring the level of expression of at least three genes from a group of individuals, control individuals or patients, for which the biological pathway is inactive, said at least three genes being associated with said biological pathway, for establishing a negative reference, and
measuring the level of expression of said at least three genes from a group of individuals, control individuals or patients, for which the biological pathway is active, for establishing a positive reference, and
measuring the level of expression of said at least three genes of said patient in the sample, and
comparing the level of expression of said at least three genes in the sample of said patient with the level of expression of said at least three genes of the negative reference and determining a value C− which corresponds to a correlation level between the level of expression of said at least three genes of said negative reference and the level of expression of said at least three genes of said patient, and
comparing the level of expression of said at least three genes in the sample of said patient with the level of expression of said at least three genes of the positive reference and determining a value C+ which corresponds to a correlation level between the level of expression of said at least three genes of said positive reference and the level of expression of said at least three genes of said patient, and
establishing a ratio C+/C− which gives a correlation score, wherein
A further subject of the invention is a method for monitoring the treatment response of a patient to the administration of a drug, said drug interacting directly or indirectly with a biological pathway, in which the biological activity of the biological pathway is determined and said biological activity is associated with a pathology, said method comprising the steps of:
providing a sample from the patient,
measuring the level of expression of at least three genes from a group of individuals, control individuals or patients, for which the biological pathway is inactive, said at least three genes being associated with said biological pathway, for establishing a negative reference, and
measuring the level of expression of said at least three genes from a group of individuals, control individuals or patients, for which the biological pathway is active, for establishing a positive reference, and
measuring the level of expression of said at least three genes of said patient in the sample, and
comparing the level of expression of said at least three genes in the sample of said patient with the level of expression of said at least three genes of the negative reference and determining a value C− which corresponds to a correlation level between the level of expression of said at least three genes of said negative reference and the level of expression of said at least three genes of said patient, and
comparing the level of expression of said at least three genes in the sample of said patient with the level of expression of said at least three genes of the positive reference and determining a value C+ which corresponds to a correlation level between the level of expression of said at least three genes of said positive reference and the level of expression of said at least three genes of said patient, and
establishing a ratio C+/C− which gives a correlation score, wherein
Preferred embodiments of the methods in accordance with the invention are disclosed below, said embodiments being taken alone or in combination.
Hence, according to a preferred embodiment, the level of activity of said at least three genes is determined by analyzing the expression level of nucleic acids or proteins in said sample. Said nucleic acids comprise RNAs, or cDNAs obtained from said RNAs, including long and small RNAs such as mRNAs and miRNAs.
A biological sample from the patient may be a tissue sample or a fluid sample, such as a sample of blood, plasma, serum, urine, synovial fluid and cerebrospinal fluid.
In their study, the inventors selected the signature of interferon (IFN)-inducible genes as an example to study correlation levels between genes composing that signature.
Accordingly, preferred methods as described above involve a biological pathway of type I interferon (type I IFN).
Indeed, an increase in the expression of IFN-regulated genes has been reported in different diseases such as rheumatoid arthritis (RA), systemic lupus erythematosus (SLE) (4), systemic sclerosis (8) and multiple sclerosis (9), and in tissues from patients with Sjögren's syndrome (10), type I diabetes (11-13) and dermatomyositis (12). To characterize the IFN signature, however, common methods calculate an IFN “score” for each patient and control as the average expression of the genes composing the signature. As explained above, this approach does not take into account the co-regulation of these IFN-inducible genes. In fact, genes with similar functions are usually co-expressed under certain experimental conditions only.
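The limitation of an average-based score can be illustrated with a minimal sketch. The function name `ifn_score` and the toy values are hypothetical and are not taken from the study; the sketch only assumes that the score is the arithmetic mean of log-scale expression values over the signature genes.

```python
import numpy as np

def ifn_score(expression: np.ndarray) -> np.ndarray:
    """Average-based score: mean expression of the signature genes.

    expression has shape (n_individuals, n_genes); values are assumed
    to be log-scale expression levels (hypothetical toy data below).
    """
    return expression.mean(axis=1)

# Two toy individuals, three signature genes (illustrative values only).
toy_expression = np.array([
    [8.0, 9.0, 10.0],   # genes rise together along the signature
    [9.0, 9.0, 9.0],    # flat profile
])
scores = ifn_score(toy_expression)
print(scores)
```

Both toy individuals receive the same score although their expression patterns differ, which is the kind of information an average discards and a correlation-based measure may retain.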
The method in accordance with the invention offers an alternative with which the IFN signature could be characterized by the level of global correlation and not solely by the expression levels. In fact, analyses of our results based on the mean expression of the IFN-related genes showed disparities in the classification of HC and RA patients (9%,
The following experimental part illustrates, by way of example and not by way of limitation, the development of a method of the invention wherein said biological pathway concerns the expression of a human type I interferon (IFN) and the pathology is rheumatoid arthritis (RA), said method involving 35 genes associated with said pathway.
Unsupervised hierarchical clustering of 35 IFN-inducible genes that distinguish rheumatoid arthritis (RA) patients IFNhigh (dendrogram ①) from RA patients IFNlow (dendrogram ②). Each row represents a gene; each column shows the expression of the 35 IFN-inducible genes in one patient. Dark grey indicates genes that are expressed at higher levels and light grey indicates genes that are expressed at lower levels.
A correlation index was defined for each gene of the IFN signature as the median of its correlations with the remaining genes. Thus, the correlation profiles for the different groups, RA IFNlow (dotted line) and RA IFNhigh (continuous line), are represented using the 35 calculated correlation indexes. The median values of the correlation indexes obtained from the different groups are 0.33 and 0.63, respectively.
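The correlation index described above can be sketched as follows. This is a sketch under stated assumptions: the text does not say which correlation coefficient was used for the index, so Pearson correlation is assumed here, and the function name `correlation_indexes` and the toy matrix are hypothetical.

```python
import numpy as np

def correlation_indexes(expression: np.ndarray) -> np.ndarray:
    """Per-gene correlation index for one group of individuals.

    expression has shape (n_individuals, n_genes). For each gene, the
    index is the median of its correlations with all remaining genes.
    Pearson correlation is an assumption of this sketch.
    """
    corr = np.corrcoef(expression, rowvar=False)        # genes x genes
    n_genes = corr.shape[0]
    # Drop the diagonal (self-correlation) and keep n_genes - 1 values per gene.
    off_diagonal = corr[~np.eye(n_genes, dtype=bool)].reshape(n_genes, n_genes - 1)
    return np.median(off_diagonal, axis=1)

# Toy group: four individuals, three genes that vary together perfectly.
base = np.array([1.0, 2.0, 4.0, 7.0])
toy_group = base[:, None] * np.array([1.0, 2.0, 3.0]) + np.array([0.0, 1.0, 2.0])
indexes = correlation_indexes(toy_group)
group_median = float(np.median(indexes))
print(indexes, group_median)
```

Applied to each group separately, the median of the resulting indexes gives one summary value per group, comparable in kind to the 0.33 and 0.63 reported above.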
Each point represents a single individual with the decision variable calculated from the Classification Algorithm based on a Biological Signature (CABS). The shaded box indicates the normal range according to the rule of the CABS: If Dhigh
Patients and Controls
102 RA patients fulfilling the revised American College of Rheumatology 1987 criteria for RA were enrolled. Their clinical characteristics are shown in Table 1. As an IFN positive control group (IFNhigh), 10 systemic lupus erythematosus (SLE) patients fulfilling the American College of Rheumatology criteria for SLE were studied. In addition, 100 age- and sex-matched healthy control subjects (HC) without any familial history of RA, autoimmune disease or concomitant medication were also recruited. All subjects provided written informed consent and the study was approved by the local Ethical Committee for clinical research. Table 1 describes the demographic and clinical characteristics of the patients and healthy control subjects.
a Median (Q1-Q3)
b DAS28: Disease Activity Score
c SLEDAI: Systemic Lupus Erythematosus Disease Activity Index
Sample Collection, Processing and Microarray Hybridization
Peripheral blood samples were collected in PAXgene™ Blood RNA tubes (PreAnalytiX, Hilden, Germany) in order to stabilize mRNA (22). Blood samples were incubated at room temperature for 2 h and then stored at −20° C. until RNA extraction according to the manufacturer's instructions. Briefly, RNA was isolated using the PAXgene™ Blood RNA kit (PreAnalytiX). Following cell lysis, nucleic acids were pelleted and treated with a buffer containing proteinase K. After digestion with an RNase-free DNase (Qiagen, Valencia, Calif., USA), RNA was purified on PAXgene™ spin columns and eluted in 80 μl of elution buffer. RNA integrity was assessed using RNA 6000 Nano chips and the Agilent 2100 Bioanalyzer (Agilent Technologies, Waldbronn, Germany) according to the manufacturer's instructions. The RNA integrity number (RIN) was obtained from the entire electrophoretic trace of the RNA sample.
cDNA was synthesized from 50 ng of total RNA using the WT-Ovation™ System (NuGEN, San Carlos, Calif., USA) powered by Ribo-SPIA™ technology. Fragmented cDNA was end-labeled with a biotin-conjugated nucleotide analog (DLR-1a; Affymetrix, Santa Clara, Calif., USA) using terminal transferase (Roche Diagnostics, Mannheim, Germany). Fragmented and labeled cDNA was hybridized for 18 h at 50° C. in a hybridization solution containing 7% DMSO. Hybridization was performed using GeneChip® Human Genome U133 Plus 2.0 arrays (Affymetrix), containing 54,675 probe sets corresponding to 38,500 identified genes.
After washing, chips were stained with streptavidin-phycoerythrin according to the Affymetrix EukGE-WS2v4 protocol using the FS450 fluidics station. The microarrays were read with the GeneChip® Scanner 3000 (Affymetrix). Affymetrix GeneChip Operating Software version 1.4 (GCOS) was used to manage Affymetrix GeneChip array data and to automate the control of GeneChip fluidics stations and scanners.
Data Analysis
Data Processing
Expression data were generated using the Robust Multi-array Average (RMA) method (19) implemented in the affy package of the Bioconductor microarray analysis environment (http://www.bioconductor.org). The RMA method consists of three steps: background adjustment, quantile normalization (20), and probe-set summarization of the log-normalized data by a median polish procedure. Before subsequent patient stratification, non-informative genes showing very low expression levels and low variability across microarrays were excluded,
Biclustering and Functional Enrichment Analyses
The SAMBA algorithm (Statistical-Algorithmic Method for Bicluster Analysis) implemented in EXPANDER 4.0.3 (EXPression ANalyzer and DisplayER) was used for the biclustering (21). This algorithm uses probabilistic modeling of the data and graph-theoretic techniques to identify subsets of genes that behave similarly across a subset of patients (22).
The TANGO algorithm (Tool for Analysis of GO enrichment), implemented in EXPANDER 4.0.3, was used to identify the biological significance of these biclusters (21).
Classification Algorithm based on a Biological Signature (CABS)
A classification algorithm was developed to identify individuals with or without the type I IFN gene biological signature. Applied to the IFN-inducible genes, the CABS is divided into three steps.
Step 1 Prototype Construction:
Two groups of RA patients (IFNhigh; IFNlow) were identified from the hierarchical clustering representing the 35 IFN-inducible genes which characterized the IFN signature (
Step 2 Decision Variable Calculation:
For a given individual, the IFN gene expression profiles corresponding to a vector with a size of 35 genes were extracted. Pearson correlation of genes related to the IFN signature was evaluated with both prototypes and denoted CORhigh and CORlow. The decision variable calculation was given by the ratio between these two
Step 3 Decision Making:
The following rule was applied to classify the individuals. High IFN signature was assigned if Dhigh
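The three steps above can be sketched as follows, under stated assumptions. The construction of the prototypes is not fully given in the text, so each prototype is assumed here to be the per-gene mean profile of its reference group; the ratio is taken as CORhigh over CORlow, following the C+/C− ratio of the method steps. The decision rule itself is not reproduced because its threshold is not given in the text. All function names and toy values are hypothetical.

```python
import numpy as np

def build_prototype(group_expression: np.ndarray) -> np.ndarray:
    """Step 1 (assumed form): per-gene mean profile of a reference group.

    group_expression has shape (n_individuals, n_genes).
    """
    return group_expression.mean(axis=0)

def pearson(x: np.ndarray, y: np.ndarray) -> float:
    """Pearson correlation between two gene-expression vectors."""
    return float(np.corrcoef(x, y)[0, 1])

def decision_variable(profile, proto_high, proto_low):
    """Step 2: correlate one individual's profile with both prototypes.

    Returns (COR_high, COR_low, ratio). The ratio is undefined when
    COR_low is zero and hard to interpret when either value is negative;
    handling of those cases is left out of this sketch.
    """
    cor_high = pearson(profile, proto_high)
    cor_low = pearson(profile, proto_low)
    return cor_high, cor_low, cor_high / cor_low

# Toy reference groups with four genes (illustrative values only).
high_group = np.array([[1.0, 2.0, 3.0, 5.0], [2.0, 3.0, 4.0, 6.0]])
low_group = np.array([[2.0, 1.0, 4.0, 3.0], [2.0, 1.0, 4.0, 3.0]])
proto_high = build_prototype(high_group)
proto_low = build_prototype(low_group)

# Toy individual whose profile follows the "high" pattern.
profile = np.array([2.0, 4.0, 6.0, 10.0])
cor_high, cor_low, d = decision_variable(profile, proto_high, proto_low)
print(cor_high, cor_low, d)
```

Step 3 would then compare the decision variable with a cut-off; any such cut-off would have to come from the rule stated in the specification rather than from this sketch.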
Results
Analysis of Heterogeneity with the Biclustering Method
The study of biological data heterogeneity was conducted with a biclustering approach. This method, using the SAMBA algorithm, performs clustering on genes and conditions simultaneously in order to identify subsets of genes that show similar expression patterns across specific subsets of patients and vice versa. After data filtering, 121 biclusters were identified from 9,856 selected probe sets. To draw a clear picture of these co-expressed gene groups, the TANGO algorithm was used for GO functional enrichment analysis. The identified biclusters were represented by 15 functional biological processes (Table 2).
a Biological terms composed of 95% IFN mediated immunity genes.
To focus on the IFN signature, the “immune response” and “response to virus” ontology groups, which represent a broad composite family, were selected. Interestingly, within this subgroup, 95% of the 37 genes (35 genes) were known to be induced by IFN. The list of these 35 genes is presented in the right column of
Activation of IFN Pathway in a Sub-Group of RA Patients
To visualize the expression profiles of the 35 IFN-response genes among all RA patients and to investigate their interactions, a hierarchical clustering was performed with the Spotfire Decision Site 8.2.1. This clustering separated the samples into two main groups, one of patients with RA (n=26/102, 25.5%) with high expression (
Characterization of the IFN Signature Based on a Correlation Approach
The expression pattern of 35 IFN-response genes was defined as the “IFN signature”. To go further in the description of the IFN-induced genes, the correlation levels between the co-expressed genes were assessed in the two groups. Interestingly, the analysis revealed disparities between correlation levels. The group associated with high IFN expression level showed a better correlation (Rmedian=0.63) than the other one (Rmedian=0.33), with a significant difference (p=8.46E-13), suggesting a functional difference in the activated state of these genes (
Effect of TNF Inhibition on IFN Pathway Activation
The functional relationship between TNF inhibition and possible changes in IFN pathway activation was studied. CABS was used to assess the correlation levels in RA patients before and after anti-TNFα treatment. Out of the subgroup of 43 RA patients treated with anti-TNF, 22 RA patients (11 RA IFNhigh and 11 RA IFNlow; infliximab n=6, etanercept n=10 and adalimumab n=6) were evaluated at 6 months for treatment response using the DAS28 criteria. Although the values appeared quite heterogeneous, a statistically significant decrease (p=0.0186) of the correlation level was observed in patients associated with high IFN signature (
Comparison of Characterization Methods of IFN Signature.
A comparative analysis between the correlation-based approach (CABS) and the classical “IFN score” based on the average values of gene expression was performed (
The method of the present invention makes it possible to identify truly active biological networks, which are associated only with high levels of correlation among the components of the biological signature. This correlation aspect in the interpretation of biological networks makes it possible to capture the mechanisms actually activated at the cellular level.
Such a correlation-based approach can be advantageously applied to investigate the dynamics of cellular mechanisms, such as the response to treatment. As an example, in the context of RA, the inventors have applied this method to monitor patients treated with anti-TNF therapy.
Interestingly also, the method illustrating the present invention and using CABS makes it possible to pinpoint type I IFN signaling as a means to stratify RA patients, even starting from whole blood transcriptomics analysis of samples collected in PAXgene tubes. Similar analyses can be performed for the other identified biclusters, highlighting the advantage of whole blood transcriptomics. Using the example of the IFN signature, the use of correlations proves valuable for characterizing genes that share both an expression pattern and a biological function. The use of expression correlations is a better way to obtain a global picture of an activated signature in various disease conditions.
Number | Date | Country | Kind |
---|---|---|---|
10306090.1 | Oct 2010 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2011/066223 | 9/19/2011 | WO | 00 | 5/3/2013 |