The present invention provides in vitro methods for diagnosing or detecting an autoimmune disease in an individual, as well as arrays and kits for use in such methods.
Autoimmune diseases (AID) constitute a large group of chronic and severe disorders, characterized by an abnormal response from the immune system in which healthy cells are attacked. Patients diagnosed with an autoimmune disease are faced with a life sentence, with severe side effects and increased mortality. Systemic Lupus Erythematosus (SLE), Rheumatoid Arthritis (RA), Sjögren Syndrome (SS) and Systemic Vasculitis (SV) represent four systemic autoimmune diseases which, if left untreated, can lead to severe and sometimes permanent physiological disability and increased morbidity (1, 2). Diagnosis at an early stage plays a crucial role in enabling proper disease monitoring and therapeutic interventions to prevent or minimize organ- and tissue-related damage. However, clinical diagnosis remains a challenge because symptoms fluctuate over time and include a wide repertoire, such as fatigue, joint and muscle pain and inflammation, that is commonly shared among several autoimmune disorders. In addition, a patient can be affected by more than one autoimmune disease at the same time (such as concurrent Sjögren syndrome in SLE and RA patients), which confers an increased risk of misdiagnosis and/or underdiagnosis (3, 4).
Current tools for clinical diagnosis include the combined information generated from clinical, laboratory and imaging findings, where the presence of various autoantibodies (such as anti-nuclear antibodies (ANA), anti-cyclic citrullinated peptides (aCCP), Rheumatoid Factor (RF), anti-neutrophil cytoplasmic antibodies (ANCA), anti-double stranded DNA antibodies (anti-dsDNA), and anti-Ro/SSA and anti-La/SSB) constitutes an important part of the diagnostic routine for SLE, RA, SS and SV (5-8). However, a positive result for an autoantibody may not be exclusive to one disease, and the use of single markers has not reached the required high levels of specificity (9-12). Identification of new blood-based biomarkers for correct and early diagnosis is of high clinical relevance to enable early therapeutic interventions, thereby saving both lives and cost for society.
Considering that underlying disease biology is still unclear, panels of disease-specific markers can provide an improved option for reflecting underlying disease-specific molecular alterations. Previous studies have shown that high-performing proteomic technologies, such as recombinant antibody microarrays, offering a multiplexed approach are better able to reflect the complexity of multifactorial diseases, such as AID (13-18). Using this approach, candidate biomarker panels indicative for SLE, Systemic Sclerosis and SLE disease activity have been identified (11, 15).
However, there remains a need for improved methods of diagnosing or detecting autoimmune diseases, particularly SLE, RA, SS and SV.
The inventors have now shown for the first time that, by using biomarker panels, classification of autoimmune disease can be achieved with high accuracy. These results highlight the power of using a multiplexed approach for decoding multifactorial, complex diseases such as autoimmune disease, which will play a significant role for diagnostic purposes.
Accordingly, a first aspect of the invention provides a method for diagnosing or detecting an autoimmune disease in an individual, the method comprising or consisting of the steps of:
Thus, in one embodiment, the method comprises determining a biomarker signature of the test sample, which enables a diagnosis to be reached in respect of the individual from which the sample is obtained.
By “autoimmune disease” we include any condition comprising or consisting of an abnormal immune response in an individual, wherein the immune response is directed against the individual.
By “diagnosing or detecting an autoimmune disease” we include determination of an autoimmune disease-associated state in an individual.
By “autoimmune disease-associated state” we include autoimmune disease diagnosis per se, the risk of having or of developing an autoimmune disease, and determination of the stage or sub-group of a particular autoimmune disease.
The term “autoimmune disease state” may mean or include (i) the presence or absence of an autoimmune disease (e.g., discriminating an active autoimmune disease from a non-autoimmune disease, a non-active autoimmune disease from a non-autoimmune disease and/or a highly active autoimmune disease from a non-autoimmune disease), and (ii) the activity of autoimmune disease (e.g., discriminating an active autoimmune disease from a non-active autoimmune disease, and/or discriminating a highly-active autoimmune disease from a non-active autoimmune disease).
Thus, it will be appreciated by persons skilled in the art that the methods of the invention are suitable for differentiating individuals with an autoimmune disease from healthy individuals as well as, for example, determining the activity level of an autoimmune disease in an individual (e.g. determining whether an autoimmune disease is in an active or inactive state) or determining whether an autoimmune disease is in remission in an individual.
Thus, in one embodiment, the method is for diagnosing an active autoimmune disease (e.g., an SLE flare) in a subject.
By “biomarker” we include any naturally-occurring biological molecule, or component or fragment thereof, the measurement of which can provide information useful in the diagnosis of an autoimmune disease. Thus, in the context of Table 1 generally (i.e. Table 1(A), Table 1(B), Table 1(C), Table 1(D) and Table 1(E)), the biomarker may be the protein, or a polypeptide fragment or carbohydrate moiety thereof (or, in the case of sialyl Lewis x, a carbohydrate moiety per se). Alternatively, the biomarker may be a nucleic acid molecule, such as an mRNA, cDNA or circulating tumour DNA molecule, which encodes the protein or part thereof.
In one embodiment, the biomarker mRNA and/or amino acid sequences correspond to those available on the GenBank database (http://www.ncbi.nlm.nih.gov/genbank/) and natural variants thereof. In a further embodiment, the biomarker mRNA and/or amino acid sequences correspond to those available on the GenBank database in January 2019.
In one embodiment of the invention, step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in Table 1(A), for example 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or 31 of the biomarkers defined in Table 1(A).
For example, step (b) may comprise or consist of measuring at least 5 biomarkers. Step (b) may comprise or consist of measuring at least 10 biomarkers. Step (b) may comprise or consist of measuring at least 15 biomarkers. Step (b) may comprise or consist of measuring at least 20 biomarkers. Step (b) may comprise or consist of measuring 30 or fewer biomarkers. Step (b) may comprise or consist of measuring 25 or fewer biomarkers. Step (b) may comprise or consist of measuring 20-25 biomarkers. Step (b) may comprise or consist of measuring 25-31 biomarkers.
In an additional or alternative embodiment of any of the aspects of the invention described herein, in step (b) the presence and/or amount in the test sample of GSN (gelsolin) is measured in addition to the presence and/or amount of HADH2.
In an additional or alternative embodiment of any of the aspects of the invention described herein, in step (b) the presence and/or amount in the test sample of GSN (gelsolin) is measured instead of the presence and/or amount of HADH2. In an additional or alternative embodiment of each of the aspects of the invention described herein, in step (b) the presence and/or amount in the test sample of HADH2 is measured instead of GSN (gelsolin).
As detailed in Supplementary Table S6, the antibody sequence referred to herein as binding HADH2 may also bind GSN.
In an additional or alternative embodiment of any of the aspects of the invention described herein, measuring the presence and/or amount in the test sample of HADH2 and/or GSN in step (b) is replaced by measuring the presence and/or amount in the test sample of a protein bound by the antibody sequence of SEQ ID NO: 7. Preferably the protein bound by the antibody sequence of SEQ ID NO: 7 is HADH2 and/or GSN.
In an additional or alternative embodiment of any of the aspects of the invention described herein, measuring the presence and/or amount in the test sample of one or more core biomarkers in step (b) is replaced by measuring the presence and/or amount in the test sample of a protein bound by one or more of the antibody sequences described in Supplementary Table S6.
It will be appreciated that step (b) may additionally comprise measuring the presence and/or amount of one or more further biomarkers not listed in Table 1(A), wherein the further biomarkers may provide additional diagnostic information.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in Table 1(A)i, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “core biomarker”, for example 2 or 3 of the core biomarkers.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, or 22 of the biomarkers defined in Table 1(A)ii, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “preferred biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2, 3, 4, 5, or 6, of the biomarkers defined in Table 1(A)iii, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “optional biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of biomarkers defined in Table 1(A)i, Table 1(A)ii and/or Table 1(A)iii, i.e. step (b) comprises measuring the presence of core, preferred and/or optional biomarkers.
In one embodiment the one or more biomarker(s) selected from the group defined in Table 1(A) are biomarkers which are also present in Table 2(A). Table 2(A) corresponds to differentially expressed markers in autoimmune disease.
In one embodiment of the first aspect of the invention, the method further comprises measuring the presence and/or amount of one or more of the biomarkers defined in Table 2(A). It will be appreciated by persons skilled in the art that these markers may be different to those in Table 1(A). Thus, the method may comprise a further additional step of measuring markers present in Table 2(A) (differentially expressed markers) which are not present in Table 1(A).
It will be appreciated by persons skilled in the art that the biomarker signature of Table 1(A), directed to autoimmune diseases generally, may be used in combination with any one or more of the biomarker signatures of Table 1(B), Table 1(C), Table 1(D), and Table 1(E), relating to specific autoimmune diseases (SLE, RA, SS and SV, respectively).
Thus, in one embodiment, the method further comprises measuring the presence and/or amount of one or more, for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 or 32, of the biomarkers defined in Table 1(B).
In one embodiment, the method further comprises measuring the presence and/or amount of one or more, for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, or 29, of the biomarkers defined in Table 1(C).
In one embodiment, the method further comprises measuring the presence and/or amount of one or more, for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 or 33, of the biomarkers defined in Table 1(D).
In one embodiment, the method further comprises measuring the presence and/or amount of one or more, for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, or 26, of the biomarkers defined in Table 1(E).
In one embodiment of the invention, the autoimmune disease to be diagnosed is an inflammatory rheumatic disease, e.g. systemic lupus erythematosus (SLE), rheumatoid arthritis (RA), Sjögren's syndrome (SS) or systemic vasculitis (SV).
In one embodiment of the invention, the autoimmune disease to be diagnosed is selected from: systemic lupus erythematosus (SLE), rheumatoid arthritis (RA), Sjögren's syndrome (SS) or systemic vasculitis (SV).
In one embodiment of the invention, systemic vasculitis (SV) is antineutrophil cytoplasmic antibody (ANCA) associated vasculitis.
Also provided as part of the invention are specific methods for diagnosing or detecting specific autoimmune diseases. It will be appreciated by persons skilled in the art that the descriptions and options relating to the first aspect of the invention also apply for these subsequent aspects of the present invention, as they are closely related methods.
Therefore a second, related, aspect of the invention provides a method for diagnosing or detecting systemic lupus erythematosus in an individual comprising or consisting of the steps of:
In one embodiment of the invention, step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in Table 1(B), for example 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 or 32 of the biomarkers defined in Table 1(B).
For example, step (b) may comprise or consist of measuring at least 5 biomarkers. Step (b) may comprise or consist of measuring at least 10 biomarkers. Step (b) may comprise or consist of measuring at least 15 biomarkers. Step (b) may comprise or consist of measuring at least 20 biomarkers. Step (b) may comprise or consist of measuring 32 or fewer biomarkers. Step (b) may comprise or consist of measuring 25 or fewer biomarkers. Step (b) may comprise or consist of measuring 20-25 biomarkers. Step (b) may comprise or consist of measuring 25-32 biomarkers.
It will be appreciated that step (b) may additionally comprise measuring the presence and/or amount of one or more further biomarkers not listed in Table 1(B), wherein the further biomarkers may provide additional diagnostic information.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2 or 3, of the biomarkers defined in Table 1(B)i, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “core biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19 of the biomarkers defined in Table 1(B)ii, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “preferred biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2, 3, 4, 5, 6, 7, 8, 9 or 10, of the biomarkers defined in Table 1(B)iii, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “optional biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of biomarkers defined in Table 1(B)i, Table 1(B)ii and/or Table 1(B)iii, i.e. step (b) comprises measuring the presence of core, preferred and/or optional biomarkers.
In one embodiment the one or more biomarker(s) selected from the group defined in Table 1(B) are biomarkers which are also present in Table 2(B). Table 2(B) corresponds to differentially expressed markers in SLE.
In one embodiment the method further comprises measuring the presence and/or amount of one or more of the biomarkers defined in Table 2(B). It will be appreciated by persons skilled in the art that the markers to be measured may or may not also be present in Table 1(B).
A third aspect of the invention provides a method for diagnosing or detecting rheumatoid arthritis (RA) in an individual comprising or consisting of the steps of:
In one embodiment of the invention, step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in Table 1(C), for example 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28 or 29 of the biomarkers defined in Table 1(C).
For example, step (b) may comprise or consist of measuring at least 5 biomarkers. Step (b) may comprise or consist of measuring at least 10 biomarkers. Step (b) may comprise or consist of measuring at least 15 biomarkers. Step (b) may comprise or consist of measuring at least 20 biomarkers. Step (b) may comprise or consist of measuring 29 or fewer biomarkers. Step (b) may comprise or consist of measuring 25 or fewer biomarkers. Step (b) may comprise or consist of measuring 20-25 biomarkers. Step (b) may comprise or consist of measuring 25-29 biomarkers.
It will be appreciated that step (b) may additionally comprise measuring the presence and/or amount of one or more further biomarkers not listed in Table 1(C), wherein the further biomarkers may provide additional diagnostic information.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2 or 3, of the biomarkers defined in Table 1(C)i, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “core biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, of the biomarkers defined in Table 1(C)ii, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “preferred biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 of the biomarkers defined in Table 1(C)iii, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “optional biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of biomarkers defined in Table 1(C)i, Table 1(C)ii and/or Table 1(C)iii, i.e. step (b) comprises measuring the presence of core, preferred and/or optional biomarkers.
In one embodiment the one or more biomarker(s) selected from the group defined in Table 1(C) are biomarkers which are also present in Table 2(C). Table 2(C) corresponds to differentially expressed markers in RA.
In one embodiment of the first aspect of the invention, the method further comprises measuring the presence and/or amount of one or more of the biomarkers defined in Table 2(C). It will be appreciated by persons skilled in the art that these markers may be different to those in Table 1(C). Thus, the method may comprise a further additional step of measuring markers present in Table 2(C) (differentially expressed markers) which are not present in Table 1(C).
In one embodiment the method further comprises measuring the presence and/or amount of one or more of the biomarkers defined in Table 2(C). It will be appreciated by persons skilled in the art that the markers to be measured may or may not also be present in Table 1(C).
A fourth aspect of the invention provides a method for diagnosing or detecting Sjögren's syndrome (SS) in an individual comprising or consisting of the steps of:
In one embodiment of the invention, step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in Table 1(D), for example 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 or 33 of the biomarkers defined in Table 1(D).
For example, step (b) may comprise or consist of measuring at least 5 biomarkers. Step (b) may comprise or consist of measuring at least 10 biomarkers. Step (b) may comprise or consist of measuring at least 15 biomarkers. Step (b) may comprise or consist of measuring at least 20 biomarkers. Step (b) may comprise or consist of measuring 30 or fewer biomarkers. Step (b) may comprise or consist of measuring 25 or fewer biomarkers. Step (b) may comprise or consist of measuring 20-25 biomarkers. Step (b) may comprise or consist of measuring 25-30 biomarkers. Step (b) may comprise or consist of measuring 30-33 biomarkers.
It will be appreciated that step (b) may additionally comprise measuring the presence and/or amount of one or more further biomarkers not listed in Table 1(D), wherein the further biomarkers may provide additional diagnostic information.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2 or 3, of the biomarkers defined in Table 1(D)i, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “core biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2, 3, 4, 5, 6, 7, 8, or 9, of the biomarkers defined in Table 1(D)ii, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “preferred biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or 21, of the biomarkers defined in Table 1(D)iii, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “optional biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of biomarkers defined in Table 1(D)i, Table 1(D)ii and/or Table 1(D)iii, i.e. step (b) comprises measuring the presence of core, preferred and/or optional biomarkers.
In one embodiment the one or more biomarker(s) selected from the group defined in Table 1(D) are biomarkers which are also present in Table 2(D). Table 2(D) corresponds to differentially expressed markers in SS.
In one embodiment the method further comprises measuring the presence and/or amount of one or more of the biomarkers defined in Table 2(D). It will be appreciated by persons skilled in the art that the markers to be measured may or may not also be present in Table 1(D).
A fifth aspect of the invention provides a method for diagnosing or detecting systemic vasculitis (SV) in an individual comprising or consisting of the steps of:
In one embodiment of the invention, the systemic vasculitis (SV) is antineutrophil cytoplasmic antibody (ANCA) associated vasculitis.
In one embodiment of the invention, step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in Table 1(E), for example 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26 or 27 of the biomarkers defined in Table 1(E).
For example, step (b) may comprise or consist of measuring at least 5 biomarkers. Step (b) may comprise or consist of measuring at least 10 biomarkers. Step (b) may comprise or consist of measuring at least 15 biomarkers. Step (b) may comprise or consist of measuring at least 20 biomarkers. Step (b) may comprise or consist of measuring 27 or fewer biomarkers. Step (b) may comprise or consist of measuring 25 or fewer biomarkers. Step (b) may comprise or consist of measuring 20-25 biomarkers. Step (b) may comprise or consist of measuring 25-27 biomarkers.
It will be appreciated that step (b) may additionally comprise measuring the presence and/or amount of one or more further biomarkers not listed in Table 1(E), wherein the further biomarkers may provide additional diagnostic information.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2 or 3, of the biomarkers defined in Table 1(E)i, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “core biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13 of the biomarkers defined in Table 1(E)ii, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “preferred biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more, for example 2, 3, 4, 5, 6, 7, 8, 9, or 10, of the biomarkers defined in Table 1(E)iii, i.e. step (b) comprises or consists of measuring the presence and/or amount of one or more “optional biomarker”.
In one embodiment step (b) comprises or consists of measuring the presence and/or amount in the test sample of biomarkers defined in Table 1(E)i, Table 1(E)ii and/or Table 1(E)iii, i.e. step (b) comprises measuring the presence of core, preferred and/or optional biomarkers.
In one embodiment the one or more biomarker(s) selected from the group defined in Table 1(E) are biomarkers which are also present in Table 2(E). Table 2(E) corresponds to differentially expressed markers in SV.
In one embodiment the method further comprises measuring the presence and/or amount of one or more of the biomarkers defined in Table 2(E). It will be appreciated by persons skilled in the art that the markers to be measured may or may not also be present in Table 1(E).
It will be appreciated by persons skilled in the art that specified embodiments may be applied to any of the first to fifth aspects of the invention, as the first to the fifth aspects of the invention all relate to closely related methods.
For example, in any of the first to the fifth aspects of the invention, preferably the individual is a human, but may be any mammal such as a domesticated mammal (preferably of agricultural or commercial significance including a horse, pig, cow, sheep, dog and cat).
For the avoidance of doubt, test samples from more than one disease state may be provided in step (a), for example, ≥2, ≥3, ≥4, ≥5, ≥6 or ≥7 different disease states. Step (a) may provide at least two test samples, for example, ≥3, ≥4, ≥5, ≥6, ≥7, ≥8, ≥9, ≥10, ≥15, ≥20, ≥25, ≥50 or ≥100 test samples. Where multiple test samples are provided, they may be of the same type (e.g., all serum or urine samples) or of different types (e.g., serum and urine samples).
It will be appreciated by persons skilled in the art that, in addition to measuring the biomarkers in a sample from an individual to be tested, the methods of the invention may also comprise measuring those same biomarkers in one or more control samples.
Thus, in one embodiment, the method of any of the above aspects of the invention further comprises the steps of:
As discussed above, by “having an autoimmune disease” we include both diagnosis of an autoimmune disease and determination of an autoimmune disease-associated state.
Optionally the control samples of step (c) are provided from an individual not having an autoimmune disease (negative control). Optionally, the individual not afflicted with an autoimmune disease is a healthy individual (negative control).
Alternatively, or additionally, the control samples of step (c) are provided from an individual with an autoimmune disease (positive control).
Thus, the individual may be identified as having an autoimmune disease in the event that the presence and/or amount in the test sample of the one or more biomarkers measured in step (b) is different from the presence and/or amount in the control sample of the one or more biomarkers measured in step (d). Alternatively, the individual may be identified as having an autoimmune disease in the event that the presence and/or amount in the test sample of the one or more biomarkers measured in step (b) corresponds to the presence and/or amount in the control sample of the one or more biomarkers measured in step (d), i.e. where the control sample is a positive control.
For the avoidance of doubt, control samples from more than one disease state may be provided in step (c), for example, ≥2, ≥3, ≥4, ≥5, ≥6 or ≥7 different disease states. Step (c) may provide at least two control samples, for example, ≥3, ≥4, ≥5, ≥6, ≥7, ≥8, ≥9, ≥10, ≥15, ≥20, ≥25, ≥50 or ≥100 control samples. Where multiple control samples are provided, they may be of the same type (e.g., all serum or urine samples) or of different types (e.g., serum and urine samples). Preferably the test sample types and control sample types are matched/corresponding.
By “is different to the presence and/or amount in a control sample” we mean or include that the presence and/or amount of the one or more biomarker in the test sample differs from that of the one or more control sample (or from predefined reference values representing the same). Preferably the presence and/or amount in the test sample differs from the presence or amount in the one or more control sample (or mean of the control samples) by at least ±5%, for example, at least ±6%, ±7%, ±8%, ±9%, ±10%, ±11%, ±12%, ±13%, ±14%, ±15%, ±16%, ±17%, ±18%, ±19%, ±20%, ±21%, ±22%, ±23%, ±24%, ±25%, ±26%, ±27%, ±28%, ±29%, ±30%, ±31%, ±32%, ±33%, ±34%, ±35%, ±36%, ±37%, ±38%, ±39%, ±40%, ±41%, ±42%, ±43%, ±44%, ±45%, ±55%, ±60%, ±65%, ±66%, ±67%, ±68%, ±69%, ±70%, ±71%, ±72%, ±73%, ±74%, ±75%, ±76%, ±77%, ±78%, ±79%, ±80%, ±81%, ±82%, ±83%, ±84%, ±85%, ±86%, ±87%, ±88%, ±89%, ±90%, ±91%, ±92%, ±93%, ±94%, ±95%, ±96%, ±97%, ±98%, ±99%, ±100%, ±125%, ±150%, ±175%, ±200%, ±225%, ±250%, ±275%, ±300%, ±350%, ±400%, ±500% or at least ±1000% of the one or more control sample (e.g., the negative control sample).
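The percentage criterion above can be illustrated with a minimal Python sketch. It assumes that each biomarker is reduced to a single numeric signal value and that the comparison is made against the mean of the control samples; the function names, the example values and the default 5% threshold are illustrative assumptions only and are not taken from the specification's examples.

```python
from statistics import mean


def percent_difference(test_value, control_values):
    """Signed difference of the test value from the control mean, in percent."""
    control_mean = mean(control_values)
    if control_mean == 0:
        raise ValueError("control mean is zero; percentage difference is undefined")
    return 100.0 * (test_value - control_mean) / control_mean


def differs_from_control(test_value, control_values, threshold_pct=5.0):
    """True if the test value deviates from the control mean by at least
    threshold_pct in either direction (the +/- criterion)."""
    return abs(percent_difference(test_value, control_values)) >= threshold_pct


# Hypothetical signal intensities for one biomarker in negative control samples.
negative_controls = [100.0, 110.0, 90.0]
print(percent_difference(130.0, negative_controls))
print(differs_from_control(130.0, negative_controls))
print(differs_from_control(102.0, negative_controls))
```

A stricter embodiment would simply pass a larger `threshold_pct`, for example 20.0 or 50.0.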
Alternatively or additionally, the presence or amount in the test sample differs from the mean presence or amount in the control samples by more than 1 standard deviation from the mean presence or amount in the control samples, for example, ≥1.5, ≥2, ≥3, ≥4, ≥5, ≥6, ≥7, ≥8, ≥9, ≥10, ≥11, ≥12, ≥13, ≥14 or ≥15 standard deviations from the mean presence or amount in the control samples. Any suitable means may be used for determining standard deviation (e.g., direct, sum of squares, Welford's); however, in one embodiment, standard deviation is determined using the direct method (i.e., the square root of [the sum of the squares of the samples minus the mean, divided by the number of samples]).
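As a hedged illustration of the direct method for standard deviation, the following Python sketch computes the square root of the sum of squared deviations from the mean divided by the number of samples, and then expresses the distance of a test value from the control mean in units of that standard deviation. The function names and the numeric values are hypothetical and serve only to make the arithmetic concrete.

```python
from math import sqrt


def direct_std(values):
    """Standard deviation by the direct method: the square root of the sum of
    squared deviations from the mean, divided by the number of samples."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one control value is required")
    m = sum(values) / n
    return sqrt(sum((v - m) ** 2 for v in values) / n)


def deviations_from_control_mean(test_value, control_values):
    """Distance of the test value from the control mean, in standard deviations."""
    sd = direct_std(control_values)
    if sd == 0:
        raise ValueError("control values have zero spread")
    m = sum(control_values) / len(control_values)
    return abs(test_value - m) / sd


# Hypothetical control signals: mean 5.0, direct standard deviation 2.0.
controls = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(direct_std(controls))
print(deviations_from_control_mean(11.0, controls))
```

In this example the test value 11.0 lies 3 standard deviations from the control mean, so it would satisfy the ≥1.5, ≥2 and ≥3 thresholds but not ≥4.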
Alternatively or additionally, by “is different to the presence and/or amount in a control sample” we mean or include that the presence or amount in the test sample does not correlate with the amount in the control sample in a statistically significant manner. By “does not correlate with the amount in the control sample in a statistically significant manner” we mean or include that the presence or amount in the test sample correlates with that of the control sample with a p-value of >0.001, for example, >0.002, >0.003, >0.004, >0.005, >0.01, >0.02, >0.03, >0.04, >0.05, >0.06, >0.07, >0.08, >0.09 or >0.1. Any suitable means for determining p-value known to the skilled person can be used, including z-test, t-test, Student's t-test, f-test, Mann-Whitney U test, Wilcoxon signed-rank test and Pearson's chi-squared test.
Alternatively, as described above, the autoimmune disease-associated disease state may be identified in the event that the presence and/or amount in the test sample of the one or more biomarkers measured in step (b) corresponds to the presence and/or amount in the control sample of the one or more biomarkers measured in step (d).
Thus, the methods of the invention may comprise steps (c)+(d) for either or both a positive and a negative control.
By “corresponds to the presence and/or amount in a control sample” we include that the presence and/or amount is identical to that of a positive control sample; or closer to that of one or more positive control sample than to one or more negative control sample (or to predefined reference values representing the same). Preferably the presence and/or amount is within ±40% of that of the one or more control sample (or mean of the control samples), for example, within ±39%, ±38%, ±37%, ±36%, ±35%, ±34%, ±33%, ±32%, ±31%, ±30%, ±29%, ±28%, ±27%, ±26%, ±25%, ±24%, ±23%, ±22%, ±21%, ±20%, ±19%, ±18%, ±17%, ±16%, ±15%, ±14%, ±13%, ±12%, ±11%, ±10%, ±9%, ±8%, ±7%, ±6%, ±5%, ±4%, ±3%, ±2%, ±1%, ±0.05% or within 0% of the one or more control sample (e.g., the positive control sample).
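The "closer to the positive control than to the negative control" criterion and the "within ±40%" criterion can be illustrated as below. This is a sketch under stated assumptions: distances are taken to the mean of each control group, and all names and example values are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def closer_to_positive(test_value, positive_controls, negative_controls):
    """True if the test value lies nearer to the positive-control mean
    than to the negative-control mean."""
    distance_to_positive = abs(test_value - _mean(positive_controls))
    distance_to_negative = abs(test_value - _mean(negative_controls))
    return distance_to_positive < distance_to_negative


def within_percent(test_value, control_values, percent=40.0):
    """True if the test value is within ±percent of the control mean."""
    control_mean = _mean(control_values)
    return abs(test_value - control_mean) <= abs(control_mean) * percent / 100.0


positive = [9.0, 11.0]   # hypothetical positive-control intensities
negative = [2.0, 4.0]    # hypothetical negative-control intensities
print(closer_to_positive(8.0, positive, negative))
print(within_percent(8.0, positive))
```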
Alternatively or additionally, the difference in the presence or amount in the test sample is standard deviation from the mean presence or amount in the control samples, for example, ≤4.5, ≤4, ≤3.5, ≤3, ≤2.5, ≤2, ≤1.5, ≤1.4, ≤1.3, ≤1.2, ≤1.1, ≤1, ≤0.9, ≤0.8, ≤0.7, ≤0.6, ≤0.5, ≤0.4, ≤0.3, ≤0.2, ≤0.1 or 0 standard deviations from the mean presence or amount in the control samples, provided that the standard deviation ranges for differing and corresponding biomarker expressions do not overlap (e.g., abut, but do not overlap).
Alternatively or additionally, by “corresponds to the presence and/or amount in a control sample” we include that the presence or amount in the test sample correlates with the amount in the control sample in a statistically significant manner. By “correlates with the amount in the control sample in a statistically significant manner” we mean or include that the presence or amount in the test sample correlates with that of the control sample with a p-value of ≤0.05, for example, ≤0.04, ≤0.03, ≤0.02, ≤0.01, ≤0.005, ≤0.004, ≤0.003, ≤0.002, ≤0.001, ≤0.0005 or ≤0.0001.
Differential expression (up-regulation or down-regulation) of biomarkers, or lack thereof, can be determined by any suitable means known to a skilled person. Differential expression is determined to a p value of at least less than 0.05 (p<0.05), for example, at least <0.04, <0.03, <0.02, <0.01, <0.009, <0.005, <0.001, <0.0001, <0.00001 or at least <0.000001. For example, differential expression may be determined using a support vector machine (SVM).
In one embodiment, the SVM is, or is derived from, the SVM described below in Supplementary Table S4.
It will be appreciated by persons skilled in the art that differential expression may relate to a single biomarker or to multiple biomarkers considered in combination (i.e., as a biomarker signature). Thus, a p value may be associated with a single biomarker or with a group of biomarkers. Indeed, proteins having a differential expression p value of greater than 0.05 when considered individually may nevertheless still be useful as biomarkers in accordance with the invention when their expression levels are considered in combination with one or more other biomarkers.
As exemplified in the accompanying Example, the expression of certain proteins in a tissue, blood, serum or plasma test sample may be indicative of an autoimmune disease in an individual. For example, the relative expression of certain serum proteins in a single test sample may be indicative of the presence of an autoimmune disease in an individual.
In an alternative or additional embodiment, the presence and/or amount in the test sample of the one or more biomarkers measured in step (b) may be compared against predetermined reference values representative of the measurements in step (d) i.e., reference negative and/or positive control values.
As detailed above, the methods of the invention may also comprise measuring, in one or more negative or positive control samples, the presence and/or amount of the one or more biomarkers measured in the test sample in step (b).
For example, one or more negative control samples may be from an individual who was not, at the time the sample was obtained, afflicted with:
Thus, the negative control sample may be obtained from a healthy individual, for example one afflicted with none of (a), (b) or (c) above.
Likewise, one or more positive control samples may be from an individual who, at the time the sample was obtained, was afflicted with an autoimmune disease; and/or any other disease or condition.
In one embodiment of the methods of the invention, the control samples of step (c) are provided from an individual with systemic lupus erythematosus (SLE), rheumatoid arthritis (RA), Sjögren's syndrome (SS) or systemic vasculitis (SV).
In a preferred embodiment of the second aspect of the invention the control sample is provided from an individual with systemic lupus erythematosus. In a preferred embodiment of the third aspect of the invention the control sample is provided from an individual with rheumatoid arthritis. In a preferred embodiment of the fourth aspect of the invention the control sample is provided from an individual with Sjögren's syndrome. In a preferred embodiment of the fifth aspect of the invention the control sample is provided from an individual with systemic vasculitis.
In one embodiment of any of the first to fifth aspects of the invention, the control samples of step (c) are provided from an individual with systemic lupus erythematosus subtype 1 (SLE-1), systemic lupus erythematosus subtype 2 (SLE-2) or systemic lupus erythematosus subtype 3 (SLE-3). SLE-1 comprises skin and musculoskeletal involvement but lacks serositis, systemic vasculitis and kidney involvement. SLE-2 comprises skin and musculoskeletal involvement, serositis and systemic vasculitis but lacks kidney involvement. SLE-3 comprises skin and musculoskeletal involvement, serositis, systemic vasculitis and SLE glomerulonephritis. SLE-1, SLE-2 and SLE-3 represent mild/absent, moderate and severe SLE disease states, respectively (e.g., see Sturfelt G, Sjoholm A G. Complement components, complement activation, and acute phase response in systemic lupus erythematosus. Int Arch Allergy Appl Immunol 1984; 75:75-83 which is incorporated herein by reference).
In an alternative embodiment, the control samples of step (c) are provided from an individual with rheumatoid arthritis (RA), which may also include extra-articular manifestations, such as nodules, scleritis, Felty's syndrome, neuropathy, pericarditis, pleuritis or glomerulonephritis.
In one embodiment, the control samples of step (c) are provided from an individual with primary Sjögren's syndrome. Alternatively, the control samples of step (c) may be provided from an individual with secondary Sjögren's syndrome.
In one embodiment, the control samples of step (c) are provided from an individual with a systemic vasculitis condition, such as antineutrophil cytoplasmic antibody (ANCA) vasculitis. The condition may be selected from MPO systemic vasculitis and/or PR3 systemic vasculitis from patients in active or inactive disease state.
In one embodiment of any of the first to the fifth aspects of the invention, the method is repeated until an autoimmune disease is diagnosed and/or an autoimmune disease associated disease state is determined in the individual using the methods of the present invention and/or conventional clinical methods (i.e., until confirmation of the diagnosis is made).
Thus, steps (a) and (b) may be repeated using a sample from the same individual taken at a different time from the original sample tested (or the previous method repetition). Such repeated testing may enable disease progression to be assessed, for example to determine the efficacy of the selected treatment regime and (if appropriate) to select an alternative regime to be adopted.
Thus, in one embodiment, the method is repeated using a test sample taken between 1 day and 104 weeks from the previous test sample(s) used, for example, between 1 week to 100 weeks, 1 week to 90 weeks, 1 week to 80 weeks, 1 week to 70 weeks, 1 week to 60 weeks, 1 week to 50 weeks, 1 week to 40 weeks, 1 week to 30 weeks, 1 week to 20 weeks, 1 week to 10 weeks, 1 week to 9 weeks, 1 week to 8 weeks, 1 week to 7 weeks, 1 week to 6 weeks, 1 week to 5 weeks, 1 week to 4 weeks, 1 week to 3 weeks, or 1 week to 2 weeks.
Alternatively or additionally, the method may be repeated using a test sample taken every period from the group consisting of: 1 day, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 10 days, 2 weeks, 3 weeks, 4 weeks, 5 weeks, 6 weeks, 7 weeks, 8 weeks, 9 weeks, 10 weeks, 15 weeks, 20 weeks, 25 weeks, 30 weeks, 35 weeks, 40 weeks, 45 weeks, 50 weeks, 55 weeks, 60 weeks, 65 weeks, 70 weeks, 75 weeks, 80 weeks, 85 weeks, 90 weeks, 95 weeks, 100 weeks, 104 weeks, 105 weeks, 110 weeks, 115 weeks, 120 weeks, 125 weeks and 130 weeks.
Alternatively or additionally, the method may be repeated at least once, for example, 2 times, 3 times, 4 times, 5 times, 6 times, 7 times, 8 times, 9 times, 10 times, 11 times, 12 times, 13 times, 14 times, 15 times, 16 times, 17 times, 18 times, 19 times, 20 times, 21 times, 22 times, 23 times, 24 times or 25 times.
Alternatively or additionally, the method is repeated continuously.
In one preferred embodiment of the methods of the invention, step (a) comprises providing a serum sample from an individual to be tested and/or step (b) comprises measuring in the sample the expression of the protein or polypeptide of the one or more biomarker(s). Thus, a biomarker signature for the sample may be determined at the protein level.
In such an embodiment, step (b) and/or step (d) may be performed using one or more first binding agents capable of binding to a biomarker (i.e., protein) listed in Table 1(A), Table 1(B), Table 1(C), Table 1(D), or Table 1(E). It will be appreciated by persons skilled in the art that the first binding agent may comprise or consist of a single species with specificity for one of the protein biomarkers or a plurality of different species, each with specificity for a different protein biomarker.
In one embodiment, the one or more first binding agents are selected from those listed in Supplementary table S5 and/or Supplementary table S6.
Suitable binding agents (also referred to as binding molecules) can be selected from a library, based on their ability to bind a given target molecule, as discussed below.
In one preferred embodiment, at least one type of the binding agents, and more typically all of the types, may comprise or consist of an antibody or antigen-binding fragment of the same, or a variant thereof.
Methods for the production and use of antibodies are well known in the art, for example see Antibodies: A Laboratory Manual, 1988, Harlow & Lane, Cold Spring Harbor Press, ISBN-13: 978-0879693145, Using Antibodies: A Laboratory Manual, 1998, Harlow & Lane, Cold Spring Harbor Press, ISBN-13: 978-0879695446 and Making and Using Antibodies: A Practical Handbook, 2006, Howard & Kaser, CRC Press, ISBN-13: 978-0849335280 (the disclosures of which are incorporated herein by reference).
Thus, a fragment may contain one or more of the variable heavy (VH) or variable light (VL) domains. For example, the term antibody fragment includes Fab-like molecules (Better et al (1988) Science 240, 1041); Fv molecules (Skerra et al (1988) Science 240, 1038); single-chain Fv (scFv) molecules where the VH and VL partner domains are linked via a flexible oligopeptide (Bird et al (1988) Science 242, 423; Huston et al (1988) Proc. Natl. Acad. Sci. USA 85, 5879) and single domain antibodies (dAbs) comprising isolated V domains (Ward et al (1989) Nature 341, 544).
For example, the binding agent(s) may be scFv molecules, Fabs or the binding domains of immunoglobulin molecules.
The term “antibody variant” includes any synthetic antibodies, recombinant antibodies or antibody hybrids, such as but not limited to, a single-chain antibody molecule produced by phage-display of immunoglobulin light and/or heavy chain variable and/or constant regions, or other immunointeractive molecule capable of binding to an antigen in an immunoassay format that is known to those skilled in the art.
A general review of the techniques involved in the synthesis of antibody fragments which retain their specific binding sites is to be found in Winter & Milstein (1991) Nature 349, 293-299.
Molecular libraries such as antibody libraries (Clackson et al, 1991, Nature 352, 624-628; Marks et al, 1991, J Mol Biol 222(3): 581-97), peptide libraries (Smith, 1985, Science 228(4705): 1315-7), expressed cDNA libraries (Santi et al (2000) J Mol Biol 296(2): 497-508), libraries on other scaffolds than the antibody framework such as affibodies (Gunneriusson et al, 1999, Appl Environ Microbiol 65(9): 4134-40) or libraries based on aptamers (Kenan et al, 1999, Methods Mol Biol 118, 217-31) may be used as a source from which binding molecules that are specific for a given motif are selected for use in the methods of the invention.
Conveniently, the binding agent(s) may be immobilised on a surface (e.g., on a multiwell plate or array); see Example below.
In one embodiment of the methods of the invention, step (b), (d) and/or step (f) is performed using an assay comprising a second binding agent capable of binding to the one or more biomarkers, the second binding agent comprising a detectable moiety. For example, an immobilised (first) binding agent may initially be used to ‘trap’ the protein biomarker on to the surface of a microarray, and then a second binding agent may be used to detect the ‘trapped’ protein.
The second binding agent may be as described above in relation to the (first) binding agent, such as an antibody or antigen-binding fragment thereof.
It will be appreciated by the skilled person that the one or more biomarkers (e.g., proteins) in the test sample may be labelled with a detectable moiety, prior to performing step (b). Likewise, the one or more biomarkers in the control sample(s) may be labelled with a detectable moiety. The biomarker(s) may be labelled with a directly or indirectly detectable moiety.
Alternatively, or in addition, the first and/or second binding agents may be labelled with a detectable moiety.
By a “detectable moiety” we include the meaning that the moiety is one which may be detected and the relative amount and/or location of the moiety (for example, the location on an array) determined.
Suitable detectable moieties are well known in the art. For example, the detectable moiety may be selected from the group consisting of: a fluorescent moiety; a luminescent moiety; a chemiluminescent moiety; a radioactive moiety; an enzymatic moiety.
In one preferred embodiment, the detectable moiety is biotin.
In one embodiment, in step (b) and/or step (d) the biotinylated biomarkers are detected using streptavidin labelled with a detectable moiety selected from the group consisting of: a fluorescent moiety; a luminescent moiety; a chemiluminescent moiety; a radioactive moiety; an enzymatic moiety.
Thus, the detectable moiety may be a fluorescent and/or luminescent and/or chemiluminescent moiety which, when exposed to specific conditions, may be detected. For example, a fluorescent moiety may need to be exposed to radiation (i.e., light) at a specific wavelength and intensity to cause excitation of the fluorescent moiety, thereby enabling it to emit detectable fluorescence at a specific wavelength that may be detected.
Alternatively, the detectable moiety may be an enzyme which is capable of converting a (preferably undetectable) substrate into a detectable product that can be visualised and/or detected. Examples of suitable enzymes are discussed in more detail below in relation to, for example, ELISA assays.
In a further alternative, the detectable moiety may be a radioactive atom which is useful in imaging. Suitable radioactive atoms include 99mTc and 123I for scintigraphic studies. Other readily detectable moieties include, for example, spin labels for magnetic resonance imaging (MRI) such as 123I again, 131I, 111In, 19F, 13C, 15N, 17O, gadolinium, manganese or iron. Clearly, the agent to be detected (such as, for example, the one or more biomarkers in the test sample and/or control sample described herein and/or an antibody molecule for use in detecting a selected protein) must have sufficient of the appropriate atomic isotopes in order for the detectable moiety to be readily detectable.
Preferred assays for detecting serum or plasma proteins include enzyme linked immunosorbent assays (ELISA), radioimmunoassay (RIA), immunoradiometric assays (IRMA) and immunoenzymatic assays (IEMA), including sandwich assays using monoclonal and/or polyclonal antibodies. Exemplary sandwich assays are described by David et al in U.S. Pat. Nos. 4,376,110 and 4,486,530, hereby incorporated by reference. Antibody staining of cells on slides may be performed using methods employed in cytology laboratory diagnostic tests, as is well known to those skilled in the art.
Conveniently, the assay is an ELISA (Enzyme Linked Immunosorbent Assay) which typically involves the use of enzymes giving a coloured reaction product, usually in solid phase assays. Enzymes such as horseradish peroxidase and phosphatase have been widely employed. A way of amplifying the phosphatase reaction is to use NADP as a substrate to generate NAD which now acts as a coenzyme for a second enzyme system. Pyrophosphatase from Escherichia coli provides a good conjugate because the enzyme is not present in tissues, is stable and gives a good reaction colour. Chemi-luminescent systems based on enzymes such as luciferase can also be used.
ELISA methods are well known in the art, for example see The ELISA Guidebook (Methods in Molecular Biology), 2000, Crowther, Humana Press, ISBN-13: 978-0896037281 (the disclosures of which are incorporated by reference).
Alternatively, conjugation with the vitamin biotin is frequently used since this can readily be detected by its reaction with enzyme-linked avidin or streptavidin to which it binds with great specificity and affinity.
In one embodiment, the detectable moiety is a fluorescent moiety (for example an Alexa Fluor dye, e.g. Alexa647).
In one preferred embodiment, step (b) and/or step (d) may be performed using an array.
Arrays per se are well known in the art. Typically, they are formed of a linear or two-dimensional structure having spaced apart (i.e. discrete) regions (“spots”), each having a finite area, formed on the surface of a solid support. An array can also be a bead structure where each bead can be identified by a molecular code or colour code or identified in a continuous flow. Analysis can also be performed sequentially where the sample is passed over a series of spots each adsorbing the class of molecules from the solution. The solid support is typically glass or a polymer, the most commonly used polymers being cellulose, polyacrylamide, nylon, polystyrene, polyvinyl chloride or polypropylene. The solid supports may be in the form of tubes, beads, discs, silicon chips, microplates, polyvinylidene difluoride (PVDF) membrane, nitrocellulose membrane, nylon membrane, other porous membrane, non-porous membrane (e.g. plastic, polymer, perspex, silicon, amongst others), a plurality of polymeric pins, or a plurality of microtitre wells, or any other surface suitable for immobilising proteins, polynucleotides and other suitable molecules and/or conducting an immunoassay. The binding processes are well known in the art and generally consist of cross-linking, covalently binding or physically adsorbing a protein molecule, polynucleotide or the like to the solid support. By using well-known techniques, such as contact or non-contact printing, masking or photolithography, the location of each spot can be defined. For reviews see Jenkins, R. E., Pennington, S. R. (2001, Proteomics, 2, 13-29) and Lal et al (2002, Drug Discov Today 15; 7(18 Suppl): S143-9).
Typically, the array is a microarray. By “microarray” we include the meaning of an array of regions having a density of discrete regions of at least about 100/cm2, and preferably at least about 1000/cm2. The regions in a microarray have typical dimensions, e.g., diameters, in the range of between about 10-250 μm, and are separated from other regions in the array by about the same distance. The array may also be a macroarray or a nanoarray.
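As a rough arithmetic check of the dimensions given above, the spot density implied by a given spot diameter can be estimated as follows. The sketch assumes a regular square grid in which the gap between neighbouring spots equals the spot diameter ("separated by about the same distance"); the function name and the grid geometry are assumptions made for illustration.

```python
def spots_per_cm2(spot_diameter_um, gap_um=None):
    """Approximate number of spots per square centimetre on a square grid.

    If no gap is given, the gap between neighbouring spots is assumed to
    equal the spot diameter, so the centre-to-centre pitch is twice the diameter.
    """
    if gap_um is None:
        gap_um = spot_diameter_um
    pitch_cm = (spot_diameter_um + gap_um) * 1e-4  # 1 micrometre = 1e-4 cm
    return 1.0 / pitch_cm ** 2


for diameter in (250, 100, 10):
    print(diameter, "um spots:", round(spots_per_cm2(diameter)), "per cm2")
```

Under these assumptions, 250 μm spots give roughly 400 spots/cm2 and 10 μm spots roughly 250,000 spots/cm2, both above the lower limit of about 100/cm2 stated above.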
Once suitable binding molecules (discussed above) have been identified and isolated, the skilled person can manufacture an array using methods well known in the art of molecular biology.
Examples of array formats are described below in the Example and in, e.g., Steinhauer et al., 2002; Wingren and Borrebaeck, 2008; Wingren et al., 2005; Delfani et al., 2016 (the disclosures of which are incorporated herein by reference).
Thus, in an exemplary embodiment the method comprises:
In an alternative embodiment, step (b) and/or step (d) comprises measuring the expression of a nucleic acid molecule encoding the one or more biomarkers.
The nucleic acid molecule may be a gene expression intermediate or derivative thereof, such as a mRNA or cDNA.
Thus, measuring the expression of the one or more biomarker(s) in step (b) and/or step (d) may be performed using a method selected from the group consisting of Southern hybridisation, Northern hybridisation, polymerase chain reaction (PCR), reverse transcriptase PCR (RT-PCR), quantitative real-time PCR (qRT-PCR), nanoarray, microarray, macroarray, autoradiography and in situ hybridisation.
For example, measuring the expression of the one or more biomarker(s) in step (b) and/or step (d) may be performed using one or more binding moieties, each individually capable of binding selectively to a nucleic acid molecule encoding one of the biomarkers identified in Table 1(A), Table 1(B), Table 1(C), Table 1(D) or Table 1(E).
Conveniently, the one or more binding moieties each comprise or consist of a nucleic acid molecule, such as DNA, RNA, PNA, LNA, GNA, TNA or PMO.
Advantageously, the one or more binding moieties are 5 to 100 nucleotides in length, for example, 15 to 35 nucleotides in length.
It will be appreciated that the nucleic acid-based binding moieties may comprise a detectable moiety.
Thus, the detectable moiety may be selected from the group consisting of: a fluorescent moiety; a luminescent moiety; a chemiluminescent moiety; a radioactive moiety (for example, a radioactive atom); or an enzymatic moiety.
Alternatively or additionally, the detectable moiety may comprise or consist of a radioactive atom, for example selected from the group consisting of technetium-99m, iodine-123, iodine-125, iodine-131, indium-111, fluorine-19, carbon-13, nitrogen-15, oxygen-17, phosphorus-32, sulphur-35, deuterium, tritium, rhenium-186, rhenium-188 and yttrium-90.
Alternatively or additionally, the detectable moiety of the binding moiety may be a fluorescent moiety.
In a further embodiment, the nucleic acid molecule is a circulating tumour DNA molecule (ctDNA).
Methods suitable for detecting ctDNA are now well-established; for example, see Lewis et al., 2016, World J Gastroenterol. 22(32): 7175-7185, and references cited therein (the disclosures of which are incorporated herein by reference).
In one embodiment, expression of the one or more biomarker(s) in step (b) is determined using a DNA microarray.
In one embodiment, the methods may be performed using the methods for detecting and/or quantifying one or more biomarker(s) in a biological sample as described in PCT/EP2019/052105 filed on 29 Jan. 2019.
In one embodiment, the sample provided in step (a) (and/or in step (c)) may be selected from the group consisting of unfractionated blood, plasma, serum, tissue fluid, milk, bile, synovial fluid, and urine.
Conveniently, the sample provided in step (a) and/or (c) is unfractionated blood, plasma, or serum. In one embodiment, the sample provided in step (a) and/or (c) is serum.
By appropriate selection of some or all of the biomarkers in Table 1(A), 1(B), 1(C), 1(D) and/or 1(E), optionally in conjunction with one or more further biomarkers, the methods of the invention exhibit high predictive accuracy for diagnosis of an autoimmune disease, including SLE, SV, SS and RA.
Thus, the predictive accuracy of the method, as determined by an ROC AUC value, may be at least 0.50, for example at least 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, 0.96, 0.97, 0.98 or at least 0.99.
Thus, in one embodiment, the predictive accuracy of the method, as determined by an ROC AUC value, is at least 0.70.
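For orientation, an ROC AUC value can be computed from classifier decision values as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample. The following is a minimal sketch of that pairwise definition; the function name, the label encoding (1 for disease, 0 for control) and the example scores are assumptions, and this is not the software used in the Example.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: decision values, higher meaning "more likely positive".
    labels: 1 for positive (disease) samples, 0 for negative (control) samples.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both positive and negative samples are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical decision values for two disease and two control samples.
print(roc_auc([0.9, 0.4, 0.8, 0.3], [1, 1, 0, 0]))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why values such as 0.70 or higher are singled out above.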
In the methods of the invention, the ‘raw’ data obtained in step (b) (and/or in step (d)) undergoes one or more analysis steps before a diagnosis is reached. For example, the raw data may need to be standardised against one or more control values (i.e., normalised).
Typically, diagnosis is performed using a support vector machine (SVM), such as those available from http://cran.r-project.org/web/packages/e1071/index.html (e.g. e1071 1.5-24). However, any other suitable means may also be used.
Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other. Intuitively, an SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.
More formally, a support vector machine constructs a hyperplane or set of hyperplanes in a high or infinite dimensional space, which can be used for classification, regression or other tasks. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training data points of any class (so-called functional margin), since in general the larger the margin the lower the generalization error of the classifier. For more information on SVMs, see for example, Burges, 1998, Data Mining and Knowledge Discovery, 2:121-167.
In one embodiment of the invention, the SVM is ‘trained’ prior to performing the methods of the invention using biomarker profiles from individuals with known disease status (for example, individuals known to have an autoimmune disease or individuals known to be healthy). By running such training samples, the SVM is able to learn what biomarker profiles are associated with an autoimmune disease. Once the training process is complete, the SVM is then able to determine whether or not the biomarker sample tested is from an individual with an autoimmune disease.
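The training and prediction steps described above can be illustrated with a deliberately small linear SVM. This sketch is an assumption-laden stand-in: it trains by full-batch subgradient descent on the regularised hinge loss, it is not the e1071 implementation and not the SVM of Supplementary Table S4, and the two-biomarker profiles, hyperparameters and function names are hypothetical.

```python
def train_linear_svm(samples, labels, learning_rate=0.01,
                     regularisation=0.01, epochs=2000):
    """Fit weights and bias of a linear SVM by subgradient descent on the
    L2-regularised hinge loss. Labels must be +1 (disease) or -1 (healthy)."""
    n_samples = len(samples)
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
            if margin < 1.0:  # inside the margin or misclassified
                for j in range(n_features):
                    grad_w[j] -= y * x[j]
                grad_b -= y
        for j in range(n_features):
            weights[j] -= learning_rate * (
                regularisation * weights[j] + grad_w[j] / n_samples)
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(weights, bias, x):
    """Return +1 if the profile falls on the disease side of the hyperplane, else -1."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


# Hypothetical normalised intensities of two biomarkers per individual.
healthy = [[1.0, 1.0], [1.5, 0.5], [0.5, 1.5]]
disease = [[4.0, 4.0], [4.5, 3.5], [3.5, 4.5]]
weights, bias = train_linear_svm(healthy + disease, [-1] * 3 + [1] * 3)
print(predict(weights, bias, [0.8, 0.9]), predict(weights, bias, [4.2, 4.1]))
```

Once `weights` and `bias` are known, new samples can be classified without retraining, which mirrors the option of pre-programming the SVM with fixed parameters.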
However, this training procedure can be by-passed by pre-programming the SVM with the necessary training parameters. For example, diagnoses can be performed according to the known SVM parameters using the SVM algorithm detailed in Supplementary Table S4 below, based on the measurement of any or all of the biomarkers listed in Table 1(A), Table 1(B), Table 1(C), Table 1(D) and/or Table 1(E).
It will be appreciated by skilled persons that suitable SVM parameters can be determined for any combination of the biomarkers listed in Tables 1(A), 1(B), 1(C), 1(D) and/or 1(E) by training an SVM machine with the appropriate selection of data (i.e. biomarker measurements from individuals with known autoimmune disease status). Alternatively, the data of the Examples and figures may be used to determine a particular autoimmune disease-associated disease state according to any other suitable statistical method known in the art.
Preferably, the method of the invention has an accuracy of at least 60%, for example 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% accuracy.
Preferably, the method of the invention has a sensitivity of at least 60%, for example 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sensitivity.
Preferably, the method of the invention has a specificity of at least 60%, for example 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% specificity.
By “accuracy” we mean the proportion of correct outcomes of a method, by “sensitivity” we mean the proportion of all autoimmune disease positive samples that are correctly classified as positives, and by “specificity” we mean the proportion of all autoimmune disease negative samples that are correctly classified as negatives.
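These three definitions translate directly into counts of true and false positives and negatives. The sketch below is illustrative only; the function name and the example classifications are assumptions.

```python
def diagnostic_metrics(predicted, actual):
    """Accuracy, sensitivity and specificity from paired boolean outcomes.

    predicted / actual: sequences where a truthy value means
    "autoimmune disease positive" and a falsy value means "negative".
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    return {
        "accuracy": (tp + tn) / len(actual),   # correct outcomes / all outcomes
        "sensitivity": tp / (tp + fn),         # positives correctly classified
        "specificity": tn / (tn + fp),         # negatives correctly classified
    }


# Hypothetical results for eight samples (three true positives among them).
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
actual = [1, 1, 1, 0, 0, 0, 0, 0]
print(diagnostic_metrics(predicted, actual))
```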
Signal intensities may be quantified using any suitable means known to the skilled person, for example using Array-Pro (Media Cybernetics). Signal intensity data may be normalised (i.e., to adjust for technical variation). Normalisation may be performed using any suitable method known to the skilled person. Alternatively or additionally, data are normalised using the empirical Bayes algorithm ComBat (Johnson et al., 2007).
Further statistical analysis of the refined data may be performed using methods well-known in the art, such as PCA, q-value calculation by ANOVA, and/or fold change calculation in Qlucore Omics Explorer.
As described above, a first (‘training’) data set may be used to identify a combination of biomarkers, e.g. from Table 1(A), Table 1(B), Table 1(C), Table 1(D), or Table 1(E), to serve as a biomarker signature for the diagnosis of an autoimmune disease. Mathematical analysis of the training data set may be performed using known algorithms (such as a backward elimination, or BE, algorithm) to determine the most suitable biomarker signatures. The predictive accuracy of a given biomarker combination (signature) can then be verified against a new (‘verification’) data set. Such methodology is described in detail in the Example.
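A backward elimination procedure of the general kind referred to above can be sketched as a greedy loop that repeatedly drops the biomarker whose removal leaves the best-scoring signature. The sketch is generic and is not the algorithm of the Example: the `score` callable, the marker names and the toy scoring rule are all hypothetical, and in practice the score might be, for instance, a cross-validated ROC AUC on the training data set.

```python
def backward_elimination(biomarkers, score):
    """Greedy backward elimination over a list of biomarker names.

    score: callable taking a tuple of biomarker names and returning a
    number, where higher means a better-performing signature.
    Returns the best subset seen and its score.
    """
    current = list(biomarkers)
    best_subset = tuple(current)
    best_score = score(best_subset)
    while len(current) > 1:
        # Score every signature obtained by removing one remaining biomarker.
        candidates = [
            (score(tuple(b for b in current if b != drop)), drop)
            for drop in current
        ]
        top_score, dropped = max(candidates, key=lambda c: c[0])
        current.remove(dropped)
        if top_score >= best_score:
            best_subset, best_score = tuple(current), top_score
    return best_subset, best_score


# Toy score: reward two "informative" markers, penalise signature length.
informative = {"M1", "M3"}


def toy_score(subset):
    return len(informative.intersection(subset)) - 0.1 * len(subset)


print(backward_elimination(["M1", "M2", "M3", "M4"], toy_score))
```

The signature selected in this way would then be checked against an independent verification data set, as described above.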
It will be appreciated by persons skilled in the art that the individual(s) tested may be of any ethnicity or geographic origin. Alternatively, the individual(s) tested may be of a defined sub-population, e.g., based on ethnicity and/or geographic origin. For example, the individual(s) tested may be Caucasian and/or Chinese (e.g., Han ethnicity).
In one embodiment of any of the first to the fifth aspects of the invention, a diagnosis in a patient of an autoimmune disease is subsequently made using one or more diagnostic tests for an autoimmune disease.
Suitable conventional clinical methods are well known in the art. For example, diagnostic tests for an autoimmune disease may include autoantibody tests such as Anti Nuclear Antibody test (ANA), Anti-Double Stranded DNA (anti-dsDNA), Antineutrophil Cytoplasmic antibodies (ANCA), Cyclic Citrullinated Peptide Antibodies (CCP), Rheumatoid Factor (RF), Extractable Nuclear Antigen Antibodies (e.g. anti-SS-A (Ro) and anti-SS-B (La), anti-Sm, anti-RNP, anti-Jo-1, Scl-70), antihistone antibodies and AntiCentromere Antibodies (ACA) or complement analysis (C3, C4).
In one embodiment of any of the first to fifth aspects of the invention, the methods comprise, in the event that the individual is diagnosed with an autoimmune disease, the additional step (g) of administering to the individual a therapy for said autoimmune disease.
Optionally the autoimmune disease therapy is selected from the group consisting of: nonsteroidal anti-inflammatory drugs (NSAIDs) such as ibuprofen and naproxen; immune-suppressing drugs such as corticosteroids; synthetic DMARDs (such as methotrexate and cyclophosphamide); biologicals (such as TNF inhibitors and IL inhibitors); and combinations thereof.
A further aspect of the invention provides an array for diagnosing or detecting an autoimmune disease in an individual comprising one or more agents (such as any of the above-described binding agents) suitable for measuring the presence and/or amount of one or more biomarkers selected from the group defined in Table 1(A), Table 1(B), Table 1(C), Table 1(D), and/or Table 1(E).
Thus, the array is suitable for performing a method according to any one of the first to fifth aspects of the invention.
The array comprises one or more binding agents capable (individually or collectively) of binding to one or more of the biomarkers defined in Table 1(A), Table 1(B), Table 1(C), Table 1(D) and/or Table 1(E), either at the protein level or the nucleic acid level.
In one preferred embodiment, the array comprises one or more antibodies, or antigen-binding fragments thereof, capable (individually or collectively) of binding to one or more of the biomarkers defined in Table 1(A), Table 1(B), Table 1(C), Table 1(D) and/or Table 1(E), at the protein level. For example, the array may comprise scFv molecules capable (collectively) of binding to all of the biomarkers defined in Table 1(A) at the protein level.
It will be appreciated that the array may comprise one or more positive and/or negative control samples. For example, conveniently the array comprises bovine serum albumin as a positive control sample and/or phosphate-buffered saline as a negative control sample.
In one embodiment, the array comprises agents capable of binding to all of the biomarkers defined in any one of: Table 1(A), Table 1(B), Table 1(C), Table 1(D) or Table 1(E). In another embodiment, the array comprises agents capable of binding to one or more of the biomarkers defined in any one of: Table 1(A), Table 1(B), Table 1(C), Table 1(D) or Table 1(E), e.g. agents capable of binding to any of the particular combinations of the biomarkers defined in Table 1(A) as described in the first aspect.
Advantageously, the array comprises antibodies, or antigen-binding fragments thereof, capable of binding to all of the biomarkers at the protein level.
Advantageously, the array comprises agents capable of binding to all of the biomarkers at the mRNA and/or DNA level.
A further aspect of the invention provides the use of one or more biomarkers selected from the group defined in Table 1(A), Table 1(B), Table 1(C), Table 1(D) and/or Table 1(E) as biomarkers for diagnosing or detecting an autoimmune disease in an individual.
Optionally, the autoimmune disease is selected from systemic lupus erythematosus (SLE), rheumatoid arthritis (RA), Sjögren's syndrome (SS) or systemic vasculitis (SV).
In one embodiment all of the biomarkers defined in Table 1(A), Table 1(B), Table 1(C), Table 1(D) and/or Table 1(E) are used as biomarkers for diagnosing or detecting an autoimmune disease in an individual. Optionally, the autoimmune disease is selected from systemic lupus erythematosus (SLE), rheumatoid arthritis (RA), Sjögren's syndrome (SS) or systemic vasculitis (SV).
A further aspect of the invention provides a kit for diagnosing or detecting an autoimmune disease in an individual comprising:
A further aspect of the invention provides a kit for determining the presence of, or risk of having, an autoimmune disease in an individual comprising:
A further aspect of the invention provides a use of one or more binding moieties to a biomarker as described herein (e.g. in Table 1(A)) in the preparation of a kit for diagnosing or determining an autoimmune disease-associated disease state in an individual. Thus, multiple different binding moieties may be used, each targeted to a different biomarker, in the preparation of such a kit. In one embodiment, the binding moiety is an antibody or antigen-binding fragment thereof (e.g. scFv), as described herein.
A further aspect of the invention provides a method of treating an autoimmune disease in an individual comprising the steps of:
For example, the autoimmune disease may be selected from SLE, RA, SS or SV.
A further aspect of the invention provides a computer program for operating the methods of the invention, for example, for interpreting the expression data of step (c) (and subsequent expression measurement steps) and thereby diagnosing or determining an autoimmune disease-associated disease state. The computer program may be a programmed SVM. The computer program may be recorded on a suitable computer-readable carrier known to persons skilled in the art. Suitable computer-readable carriers may include compact discs (including CD-ROMs, DVDs, Blu-ray and the like), floppy discs, flash memory drives, ROM or hard disc drives. The computer program may be installed on a computer suitable for executing the computer program.
Preferred, non-limiting examples which embody certain aspects of the invention will now be described, with reference to the following figures:
Early and correct diagnosis of autoimmune diseases (AID) poses a clinical challenge due to the multifaceted nature of the symptoms, which may also change over time. The aim of this study was to perform protein expression profiling of four systemic AIDs, namely Systemic Lupus Erythematosus (SLE); Systemic Vasculitis (SV), e.g. ANCA-associated vasculitis; Rheumatoid Arthritis (RA); and Sjögren's Syndrome (SS), together with healthy controls, and to identify candidate biomarker signatures for differential classification.
A total of 316 serum samples, collected from patients diagnosed with SLE, RA, SS or SV and from healthy controls, were analysed using 394-plex recombinant antibody microarrays. Differential protein expression profiling was performed using the Wilcoxon rank sum test, and condensed biomarker panels were identified using advanced bioinformatics and state-of-the-art classification algorithms to pinpoint signatures reflecting disease.
In this study we were able to classify individual AIDs with high accuracy, as demonstrated by ROC Area Under the Curve (ROC AUC) values ranging from 0.803 to 0.955. In addition, the group of AIDs could be separated from healthy controls with a ROC AUC value of 0.938. Disease-specific candidate biomarker signatures as well as a general autoimmune signature were identified, including several deregulated analytes.
This study supports the rationale of using multiplexed affinity-based technologies to reflect the biological complexity of autoimmune diseases. A multiplexed approach for decoding multifactorial, complex diseases such as autoimmune diseases, will play a significant role for future diagnostic purposes, essential to prevent severe organ and tissue related damage.
This retrospective study included a total of 316 serum samples collected from healthy controls (n=77) and patients diagnosed with a systemic autoimmune disease (n=239). All samples were collected at the Department of Rheumatology, Skåne University Hospital (Malmö or Lund). Patients were diagnosed with either SLE (n=39), RA (n=45), SS (n=73) or SV (n=82) and were considered, according to the clinical criteria specific to each disease, to have active disease when the samples were collected. For SLE patients, disease activity was defined using the SLEDAI-2000 score (19) (mean score 7, range 1-19), and all RA patients demonstrated elevated CRP levels (mean 31 (13-55) mg/L). ANCA specificity in SV patients was defined according to MPO+/− or PR3+/− status, and all Sjögren samples were collected from patients who fulfilled the 2002 American-European Consensus Group Criteria (20) for primary SS. As controls, serum samples from healthy individuals with no previous history of autoimmune disease were used. Within the AID cohort the mean age was 59 years and the female to male ratio was 168:82, whereas in healthy controls the mean age was 60 years and the female to male ratio was 66:11 (Table 3). Ethical approval for the study was granted by the regional ethics review board in Lund, Sweden.
A total of 394 recombinant scFv antibodies were selected from large in-house designed phage display libraries (21, 22) (Supplementary Table S1). Of these, 379 scFv antibodies were directed against 161 (mainly immunoregulatory) antigens. The remaining 15 scFv antibodies, denoted CIMS antibodies, were directed against 15 short amino acid motifs (4-6 amino acids long). For some analytes, more than one scFv antibody clone (2-9), targeting different epitopes, was chosen to minimize the risk of impaired antibody activity resulting from epitope masking during the sample labelling process. All scFv antibodies were produced, according to standardized protocols, in 15 mL E. coli cultures and purified from the cell periplasmic space using the MagneHis™ Protein Purification system (Promega, Madison, Wis., USA) and a KingFisher96 robot (Thermo Fisher Scientific, Waltham, Mass., USA). Buffer exchange to PBS was performed using a Zeba™ 96-well desalt spin plate (Pierce), and the concentration and purity of the scFvs were determined using a NanoDrop at 280 nm (NanoDrop Technologies, Wilmington, USA) and 10% SDS-PAGE (Invitrogen, Carlsbad, CA, USA). 26×28 subarrays were produced using a noncontact printer (SciFlexarrayer S11, Scienion, Berlin, Germany). Briefly, single droplets (300 pL) of scFv antibody solution, PBS (blank) or biotinylated BSA (position marker) were printed on Blank Polymer Maxisorp slides (NUNC A/S, Roskilde, Denmark) and allowed to adsorb to the surface. Antibody microarrays were analysed as previously described (23). In brief, biotinylated samples were added to individual subarrays, and bound proteins were detected using Alexa-647 labelled streptavidin. Slides were scanned at 635 nm using the LS Reloaded™ laser scanner (Tecan) at a fixed laser scanning setting of 150% PMT gain.
Data pre-processing was performed as described. In brief, spot signal intensities were quantified using the Immunovia Quant™ software, v1.0 (Immunovia AB, Lund, Sweden). Signal intensities with local background subtraction were used for data analysis. Each data point represented the mean value of three technical replicate spots, unless any replicate CV exceeded 15%, in which case the worst performing replicate was eliminated and the average value of the two remaining replicates was used instead. The data were normalised using a two-step strategy. First, the data were normalised for day-to-day variation using the “subtract by group mean” approach as previously described (24, 25). In the second step, a modified semi-global normalisation was used to minimize any array-to-array variation. In this approach, the 15% of the antibodies displaying the lowest CV values over all samples were identified and used to calculate a scaling factor as previously described (26, 27). Quality control and visualization of potential outliers were performed using the Qlucore Omics Explorer 3.1 software (Qlucore AB, Lund, Sweden).
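By way of illustration only, the replicate-handling rule described above may be sketched as follows. The sketch assumes that the CV is computed across the three replicate spots and that the "worst performing" replicate is the one lying farthest from the median of the three; neither detail is specified in the text, and the function names are hypothetical.

```python
from statistics import mean, stdev

CV_LIMIT = 0.15  # the 15% threshold stated in the text


def cv(values):
    """Coefficient of variation: sample standard deviation over the mean."""
    m = mean(values)
    return stdev(values) / m if m else float("inf")


def spot_signal(replicates, cv_limit=CV_LIMIT):
    """Collapse three background-subtracted replicate spots into one value.

    Assumption: the replicate farthest from the median is treated as the
    worst performing one; the source does not define this criterion.
    """
    if len(replicates) != 3:
        raise ValueError("expected exactly three technical replicates")
    if cv(replicates) <= cv_limit:
        return mean(replicates)
    median = sorted(replicates)[1]
    worst = max(replicates, key=lambda v: abs(v - median))
    kept = list(replicates)
    kept.remove(worst)  # drop one replicate, average the remaining two
    return mean(kept)
```

For instance, under these assumptions `spot_signal([100, 104, 200])` discards the value 200 and averages the two remaining replicates.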
A schematic outline of the analysis process is demonstrated in
Significantly up- or down-regulated proteins were identified using the Wilcoxon rank sum test (q<0.05), with p-values adjusted using the Benjamini and Hochberg method (28). A Venn diagram including differentially expressed analytes was created at http://bioinformatics.psb.ugent.be/webtools/Venn/. For supervised classification analysis, a linear support vector machine (SVM) combined with a leave-one-out (LOO) cross validation (CV) procedure was used to evaluate the predictive performance of a model. In the LOO CV procedure, one sample was removed and the remaining samples were used to train the model. The left-out sample was then used to test the model, and the process was repeated until every sample had been used as a test sample. A decision value was thus generated for each excluded sample, corresponding to its distance to the hyperplane, and a receiver operating characteristic (ROC) curve was constructed. The area under the curve (AUC) was then calculated and used as a measure of the prediction performance of the classifier.
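The Benjamini and Hochberg adjustment and the leave-one-out procedure described above may be sketched as follows. This is a minimal, dependency-free illustration and not the scripts used in the study: the linear scorer `fit_centroid` is a simple stand-in for the linear SVM (it is not an SVM), and all function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_centroid(X, y):
    """Stand-in linear model (NOT an SVM): hyperplane midway between class means."""
    n_feat = len(X[0])
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    mu_p = [sum(x[j] for x in pos) / len(pos) for j in range(n_feat)]
    mu_n = [sum(x[j] for x in neg) / len(neg) for j in range(n_feat)]
    w = [p - n for p, n in zip(mu_p, mu_n)]
    b = -sum(wj * (p + n) / 2 for wj, p, n in zip(w, mu_p, mu_n))
    return w, b


def decision_centroid(model, x):
    """Signed (unnormalised) score of sample x relative to the hyperplane."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def leave_one_out_scores(X, y, fit, decision):
    """Hold out each sample once and score it with a model trained on the rest."""
    scores = []
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        scores.append(decision(model, X[i]))
    return scores
```

The list returned by `leave_one_out_scores` holds one decision value per sample, and passing it to `roc_auc` together with the true labels gives the single AUC figure used to summarise classifier performance. In practice `fit` and `decision` would wrap a linear SVM implementation.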
To define a condensed biomarker signature for the differential profiling analysis, a ranking procedure combined with two levels of K-fold cross validation loops was used. In short, the outer K-fold cross validation loop was used to test a condensed biomarker signature of a given length and the inner loop was used to define a ranking of the antibodies. The final condensed biomarker signature, of a given size, was then assembled using all ranking lists analyzed in the outer loop. For more details see supplemental information below.
The aim of this study was to perform differential protein expression profiling of autoimmune diseases and healthy controls and to identify condensed biomarker signatures for disease classification. To this end, a total of 316 serum samples, collected from healthy controls (n=77) and patients diagnosed with SLE (n=39), RA (n=45), SV (n=82) or SS (n=73), were analysed on 394-plex antibody microarrays. One sample collected from a patient with Sjögren's syndrome was removed from the analysis for technical reasons. One antibody, targeting Keratin-19, failed during the printing process and was removed from further analysis, though two clones targeting the same antigen remained. Altogether, a total of 315 samples and 393 antibodies were used for final data analysis, differential profiling, and signature development. Visualization of the data set in Qlucore™ revealed no differences in relation to array block, sample labelling day, assay day or scanning position, suggesting that any technical differences had been successfully removed during normalisation.
In a first step of the analysis, we investigated whether a signature reflecting AID (including SLE, RA, SS and SV) could be identified. Altogether, we were able to demonstrate that AID could be separated from healthy controls and that a biomarker signature indicative of AID could indeed be identified. This was done using SVM analysis combined with leave-one-out cross validation, including all antibodies (n=393), which demonstrated that AIDs could be separated from healthy controls with a ROC AUC value of 0.938 (
Next, we were interested in which analytes were deregulated among the AIDs. Using the Wilcoxon rank sum test, a total of 77 analytes, targeted by 114 antibodies, were found to be differentially expressed (q<0.05) between AIDs and healthy controls. Some of the most interesting upregulated hits included antibodies targeting Apolipoprotein A1, IL-6, IL-12, TNF-α, IL-16, Osteopontin, PRKCZ and DLG4, whereas antibodies targeting C3, IL-4, VEGF, KKCC1-1 and SPDLY-1 were found among the downregulated. A heatmap including the top 25 antibodies and their corresponding analytes revealed some separation of the two groups (
Considering that many autoimmune diseases display similar symptoms, making clinical diagnosis challenging, we turned our focus towards the AID group for protein expression profiling analysis (
Again, we were interested in whether the different groups could be separated using shorter biomarker signatures. Condensed biomarker signatures for SLE, RA, SS and SV, respectively, were identified using the same procedure as before (Supplementary data table S2, B-E). Using the disease-specific signatures, SLE was again found to be classified with the highest accuracy (AUC=0.964), followed by SV (AUC=0.939), SS (AUC=0.795) and RA (AUC=0.793), as presented by PCA-plots in
To further explore the serum proteomes of SLE, RA, SS and SV, differentially expressed analytes for each disease type were identified (Wilcoxon, q<0.05) (Supplemental table S3 B-E). In total, the highest number of differentially expressed analytes was found for SV (n=326 antibodies targeting 160 analytes), followed in decreasing order by SS (n=207 antibodies targeting 127 analytes), SLE (n=127 antibodies targeting 85 analytes) and RA (n=114 antibodies targeting 81 analytes). Considering the complexity of the underlying molecular alterations in AID, and that both common and disease-specific alterations would be of interest, we investigated the amount of overlap of analytes. Firstly, we investigated the overlap at the antibody level, i.e. relating to the specific clones, irrespective of which analytes they targeted. This revealed a major overlap (
Autoimmune diseases today pose a global health issue, affecting millions of people around the globe (29). Diffuse, general symptoms such as fatigue, inflammation and joint pain, which change in severity over time and are shared among several diseases, make clinical diagnosis challenging, and there is an urgent need for refined clinical tools for early and differential diagnosis. In this study, candidate biomarker signatures for the autoimmune diseases RA, SLE, SS and SV were pinpointed. Altogether, the results showed that leave-one-out cross validation analysis including all antibodies (n=393) could accurately classify individual AIDs at AUC values ranging from 0.803 to 0.955 (
Based on the classification analysis, SLE and SV were found to be the easiest to separate from the others (AUC values of 0.955 and 0.937, respectively), while RA and SS were more difficult to separate (AUC=0.858 and AUC=0.803, respectively) (
It is important to note the low number of samples used in this study, which is a limiting factor since an independent data set for validation was lacking. The use of supervised learning algorithms may pose a problem when applied to small data sets due to the risk of overfitting, which may lead to poor performance on new sample sets (38, 39). Considering this, the approach used for feature extraction and subsequent generation of condensed signatures in this study was carefully selected to reduce the risk of overtraining. Ultimately, a short signature with high predictive power will always be preferred from a logistical and cost perspective. However, there is always a trade-off between the length of the signature and its performance, which is why, in this first study, we compromised by including 40 antibodies in the final consensus lists. Also, the high number of antibodies most likely reflects that the diseases studied share similar pathogenetic pathways, and thus a higher number of antibodies may, from this perspective, be necessary for differential diagnosis. This may also be supported by the major overlap of analytes observed in the differential analysis (Wilcoxon) (
Based on the differential protein expression analysis, only a small number of disease-specific analytes were found (Supplemental table S3B-E,
In this study several analytes involved in the immunoregulatory response were found to be deregulated among the AIDs compared to the healthy controls (Supplemental table S3A). As expected, one of the upregulated analytes was TNF-α, which has already been demonstrated to be a promising therapeutic target for treatment with biological TNF inhibitors, especially in RA (41, 42). Other analytes included the pro-inflammatory cytokine IL-6, which is also highly interesting from a therapeutic perspective with regard to blockade strategies for the treatment of autoimmune diseases (43). The level of Osteopontin has previously been demonstrated to be elevated in SLE patients, which we could confirm in this study. Osteopontin has been suggested to be associated with SLE development and to be a potential marker for SLE activity and organ damage (44). Altogether, these data suggest that a more general autoimmune signature may be present, including several already known and novel markers that may play significant roles within autoimmunity. In addition, the finding of a candidate biomarker signature for classification of AIDs from healthy controls, which is also supported by other studies (14), further strengthens the potential of using our antibody microarray platform for biomarker discovery in autoimmune diseases. A tool able to function as a sensor for autoimmune diseases, resulting in the referral of patients to the appropriate specialist care, would be of high significance for early and correct diagnosis.
The four systemic autoimmune diseases (SLE, RA, SS and SV) analysed in this study were chosen because they share many clinical symptoms and because three of them, i.e. SLE, RA and SS, are among the most common AIDs. SV is less common but is associated with a very poor prognosis.
We here demonstrate that a general AID biomarker signature could be delineated and that individual AIDs (SLE, RA, SS and SV) could be classified with high accuracy using a multiplexed microarray. These results, together with previous studies (15, 16, 27, 34), suggest that a multiplexed approach is more suitable for decoding multifactorial diseases such as autoimmune diseases and will play a significant role for future diagnostic purposes, essential to prevent severe organ and tissue related damage.
A linear support vector machine (SVM) was used as the classification method when defining the condensed biomarker signature. See the scripts detailed in Supplementary table S4.
To rank a given signature, a 5-fold cross validation scheme, repeated 15 times, was used as follows: (i) For each training dataset an SVM model was trained. (ii) The corresponding validation dataset was used to estimate the importance of each individual protein in the signature. This was accomplished by removing a given protein (i.e. replacing its expression value by the mean value over all samples) and measuring the change in validation performance. Removing an important protein will result in a large decrease in validation performance. This procedure was repeated for each validation dataset in the repeated K-fold cross validation procedure. The average change in validation performance for each protein was then computed, giving a final ranking list of all proteins in the signature.
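The importance estimate described above may be sketched as follows for a single validation fold. This is an illustrative sketch only: it assumes that validation performance is measured as AUC, that the trained model is exposed as a function returning a decision value per sample, and all names are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_by_mean_replacement(X_val, y_val, feature_means, score_fn, perf_fn):
    """Rank proteins by the performance drop when each one is neutralised.

    X_val: validation samples (lists of floats).
    feature_means: mean expression of each protein over all samples.
    score_fn: decision function of an already trained model (assumed given).
    perf_fn: performance measure, e.g. roc_auc (an assumption of this sketch).
    """
    baseline = perf_fn([score_fn(x) for x in X_val], y_val)
    drops = []
    for j, mu in enumerate(feature_means):
        # "Remove" protein j by replacing its value with the overall mean.
        masked = [x[:j] + [mu] + x[j + 1:] for x in X_val]
        drops.append(baseline - perf_fn([score_fn(x) for x in masked], y_val))
    # Largest drop first: the most important protein leads the ranking.
    order = sorted(range(len(drops)), key=lambda j: drops[j], reverse=True)
    return order, drops


def average_drops(drops_per_fold):
    """Average the per-protein drops over all validation folds."""
    n = len(drops_per_fold)
    return [sum(col) / n for col in zip(*drops_per_fold)]
```

Calling `rank_by_mean_replacement` once per validation fold and passing the collected drops to `average_drops` yields the averaged change from which a final ranking list can be sorted.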
To obtain an unbiased estimate of the performance of a condensed signature, according to the computed ranking list, it is not possible to again use the dataset used to obtain the ranking of the proteins. An additional test set is needed. To this end an outer 5-fold cross validation loop, repeated two times, was introduced with the purpose of evaluating condensed signatures of different lengths. The average test AUC value was used as the estimate of the performance of a condensed signature with a given length.
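The two-level structure described above may be sketched as follows. The sketch only generates the sample index splits (outer 5-fold repeated twice, inner 5-fold repeated 15 times) and assumes simple shuffled, unstratified folds; whether the original scripts stratified by class is not stated. Function names and seeds are hypothetical.

```python
import random


def repeated_kfold(indices, k, repeats, seed=0):
    """Yield (train, test) index lists for `repeats` shuffled k-fold partitions."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for f in range(k):
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield train, folds[f]


def nested_splits(n_samples, outer_k=5, outer_repeats=2,
                  inner_k=5, inner_repeats=15):
    """Outer loop holds out a test fold; the inner loop re-splits only the
    outer training part, so the test fold never influences the ranking."""
    outer = repeated_kfold(range(n_samples), outer_k, outer_repeats, seed=1)
    for outer_train, outer_test in outer:
        inner = list(repeated_kfold(outer_train, inner_k, inner_repeats, seed=2))
        yield outer_train, outer_test, inner
```

With these defaults each outer split carries 75 inner train/validation splits for ranking, and the ten outer splits give ten ranking lists, each evaluated on a test fold that played no part in producing it.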
The different ranking lists generated by the outer loop are slightly different from each other in terms of the rank of a specific protein. The final condensed signature of a given length was assembled by log-rank averaging of all ten lists.
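One possible reading of the log-rank averaging step is sketched below: each protein's rank in every list is log-transformed, the logs are averaged, and proteins are sorted by that average. This interpretation, and the function names, are assumptions; the exact formula used in the scripts is not given in the text.

```python
import math


def log_rank_average(ranking_lists):
    """Merge ranked lists by the mean of log(rank) per item (rank 1 = best)."""
    items = ranking_lists[0]
    totals = {item: 0.0 for item in items}
    for ranking in ranking_lists:
        if set(ranking) != set(items):
            raise ValueError("all ranking lists must contain the same items")
        for rank, item in enumerate(ranking, start=1):
            totals[item] += math.log(rank)
    n = len(ranking_lists)
    # Smaller mean log-rank means the item was consistently ranked higher.
    return sorted(items, key=lambda item: totals[item] / n)


def condensed_signature(ranking_lists, size):
    """Take the top `size` items of the consensus ranking."""
    return log_rank_average(ranking_lists)[:size]
```

Averaging on the log scale gives differences among the top ranks more weight than differences further down the lists, which suits the selection of a short signature.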
Number | Date | Country | Kind |
---|---|---|---|
1904472.6 | Mar 2019 | GB | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2020/058767 | 3/27/2020 | WO | 00 |