The present invention relates to biomarkers for the diagnosis, characterisation and prognosis of systemic lupus erythematosus (SLE), as well as signatures and arrays thereof and methods for use of the same.
Systemic lupus erythematosus (SLE) is a chronic, multisystem, autoimmune connective tissue disease with a broad range of clinical manifestations. The disease aetiology is linked to multiple factors, including genetic, environmental, and hormonal factors, but the underlying mechanism is still largely unknown. Up to 30-50% of SLE patients may suffer from glomerulonephritis, a form of renal involvement considered one of the most severe manifestations of SLE. Renal involvement in SLE carries significant morbidity and mortality.
Clinical manifestations vary widely among patients, and the signs and symptoms evolve over time and overlap with those of other autoimmune diseases, which is why SLE is often misdiagnosed and/or overlooked. In fact, patients may spend up to four years and see three or more physicians before the disease is correctly diagnosed. On the other hand, SLE is also often over-diagnosed. The diagnosis of SLE in clinical practice is usually made according to the principles outlined by Fries and Holman: presence of typical manifestations from at least two organ systems in combination with an immunological abnormality consistent with SLE, in the absence of a better diagnostic alternative. However, in recent years it has been concluded that a biopsy-verified lupus glomerulonephritis in combination with an immunological abnormality should be accepted for SLE diagnosis. Hence, novel means for improved diagnosis of SLE are needed.
Further, SLE classification criteria have been defined by the American College of Rheumatology (ACR) and, more recently, by the Systemic Lupus International Collaborating Clinics (SLICC). According to ACR, SLE is classified when at least 4 of 11 clinical and/or immunological criteria, shared by many diseases, are fulfilled. In the case of SLICC, SLE is classified if (i) at least 4 of 17 clinical and immunological criteria are met, or (ii) biopsy-verified lupus nephritis is present together with antinuclear antibodies (ANA) or anti-dsDNA antibodies. In practice, this means that patients can display a very diverse set of symptoms and still all be classified as having the same disease.
Although major efforts have been made to decipher SLE-associated biomarkers, the output of validated and clinically useful biomarkers is still limited. In fact, there is not yet any single laboratory blood- or urine-based test at hand that can specifically and accurately confirm or rule out the diagnosis of SLE. This lack of adequate biomarkers for SLE has hampered proper clinical management of patients with SLE. Considering the complexity and heterogeneity of SLE, a multiplex biomarker panel, rather than a single biomarker, will be required to resolve this clinical need, placing high demands on the technologies used for biomarker discovery.
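By way of illustration only, the two counting rules summarised in the preceding paragraph may be sketched in Python as follows. The function and parameter names are hypothetical, and the sketch encodes only the thresholds as stated here (4 of 11 for ACR; 4 of 17, or biopsy-verified nephritis with ANA or anti-dsDNA antibodies, for SLICC), not the full published criteria.

```python
def meets_acr(criteria_met: int) -> bool:
    """ACR rule as summarised above: at least 4 of 11 criteria fulfilled."""
    if not 0 <= criteria_met <= 11:
        raise ValueError("criteria_met must be between 0 and 11")
    return criteria_met >= 4


def meets_slicc(criteria_met: int,
                biopsy_nephritis: bool,
                ana_positive: bool,
                anti_dsdna_positive: bool) -> bool:
    """SLICC rule as summarised above: route (i) or route (ii)."""
    if not 0 <= criteria_met <= 17:
        raise ValueError("criteria_met must be between 0 and 17")
    # Route (i): at least 4 of the 17 clinical and immunological criteria.
    route_i = criteria_met >= 4
    # Route (ii): biopsy-verified lupus nephritis with ANA or anti-dsDNA.
    route_ii = biopsy_nephritis and (ana_positive or anti_dsdna_positive)
    return route_i or route_ii
```

The sketch makes the heterogeneity noted above concrete: two patients may both satisfy `meets_acr` while sharing few or none of the individual criteria.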
The present invention provides an optimized recombinant antibody microarray platform. An optimized procedure for handling and analysing the microarray data was adopted. Further, the method allows SLE to be classified irrespective of the phenotype.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
Accordingly, the first aspect of the invention provides a method for determining a systemic lupus erythematosus-associated disease state in a subject comprising measuring the presence and/or amount in a test sample of one or more biomarkers selected from the group defined in Table A, wherein the presence and/or amount in the test sample of the one or more biomarker(s) selected from the group defined in Table A is indicative of a systemic lupus erythematosus-associated disease state.
Alternatively or additionally, the first aspect of the invention provides a method for determining a systemic lupus erythematosus-associated disease state in a subject comprising the steps of:
Thus, the invention provides biomarkers and biomarker signatures for determining a systemic lupus erythematosus-associated disease state in a subject.
By “systemic lupus erythematosus-associated disease state” we include the diagnosis, prognosis and/or characterisation of phenotypic subtype of SLE in the subject.
Thus, in one embodiment, the method is for diagnosing SLE in a subject.
Preferably, the individual is a human, but may be any mammal such as a domesticated mammal (preferably of agricultural or commercial significance including a horse, pig, cow, sheep, dog and cat).
For the avoidance of doubt, test samples from more than one disease state may be provided in step (a), for example, ≥2, ≥3, ≥4, ≥5, ≥6 or ≥7 different disease states. Step (a) may provide at least two test samples, for example, ≥3, ≥4, ≥5, ≥6, ≥7, ≥8, ≥9, ≥10, ≥15, ≥20, ≥25, ≥50 or ≥100 test samples. Where multiple test samples are provided, they may be of the same type (e.g., all serum or urine samples) or of different types (e.g., serum and urine samples).
In one embodiment, the method further comprises the steps of:
For the avoidance of doubt, control samples from more than one disease state may be provided in step (c), for example, ≥2, ≥3, ≥4, ≥5, ≥6 or ≥7 different disease states. Step (c) may provide at least two control samples, for example, ≥3, ≥4, ≥5, ≥6, ≥7, ≥8, ≥9, ≥10, ≥15, ≥20, ≥25, ≥50 or ≥100 control samples. Where multiple control samples are provided, they may be of the same type (e.g., all serum or urine samples) or of different types (e.g., serum and urine samples). Preferably the test sample types and control sample types are matched/corresponding.
The healthy individual may be free from SLE, autoimmune disease and/or renal disease. The healthy individual may be free from any form of disease.
The control sample of step (c) may be provided from an individual with:
The control sample of step (c) may be provided from an individual with systemic lupus erythematosus subtype 1 (SLE-1), systemic lupus erythematosus subtype 2 (SLE-2) or systemic lupus erythematosus subtype 3 (SLE-3).
Alternatively or additionally the test sample of step (a) and/or the control sample of step (c) or step (e) is/are individually provided from:
SLE1 comprises skin and musculoskeletal involvement but lacks serositis, systemic vasculitis and kidney involvement. SLE2 comprises skin and musculoskeletal involvement, serositis and systemic vasculitis but lacks kidney involvement. SLE3 comprises skin and musculoskeletal involvement, serositis, systemic vasculitis and SLE glomerulonephritis. SLE1, SLE2 and SLE3 represent mild/absent, moderate and severe SLE disease states, respectively (e.g., see Sturfelt G, Sjöholm AG. Complement components, complement activation, and acute phase response in systemic lupus erythematosus. Int Arch Allergy Appl Immunol 1984; 75:75-83 which is incorporated herein by reference).
By “is different to the presence and/or amount in a control sample” we mean or include the presence and/or amount of the one or more biomarker in the test sample differs from that of the one or more control sample (or to predefined reference values representing the same). Preferably the presence and/or amount in the test sample differs from the presence or amount in the one or more control sample (or mean of the control samples) by at least ±5%, for example, at least ±6%, ±7%, ±8%, ±9%, ±10%, ±11%, ±12%, ±13%, ±14%, ±15%, ±16%, ±17%, ±18%, ±19%, ±20%, ±21%, ±22%, ±23%, ±24%, ±25%, ±26%, ±27%, ±28%, ±29%, ±30%, ±31%, ±32%, ±33%, ±34%, ±35%, ±36%, ±37%, ±38%, ±39%, ±40%, ±41%, ±42%, ±43%, ±44%, ±45%, ±55%, ±60%, ±65%, ±66%, ±67%, ±68%, ±69%, ±70%, ±71%, ±72%, ±73%, ±74%, ±75%, ±76%, ±77%, ±78%, ±79%, ±80%, ±81%, ±82%, ±83%, ±84%, ±85%, ±86%, ±87%, ±88%, ±89%, ±90%, ±91%, ±92%, ±93%, ±94%, ±95%, ±96%, ±97%, ±98%, ±99%, ±100%, ±125%, ±150%, ±175%, ±200%, ±225%, ±250%, ±275%, ±300%, ±350%, ±400%, ±500% or at least ±1000% of the one or more control sample (e.g., the negative control sample).
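Purely as a non-limiting sketch, the percentage comparison described above may be expressed in Python as follows. The function names, the use of the control mean as the reference value and the default 5% threshold are assumptions chosen for illustration.

```python
from statistics import mean
from typing import Sequence


def percent_difference(test_value: float, control_values: Sequence[float]) -> float:
    """Signed difference of the test value from the control mean, in percent."""
    control_mean = mean(control_values)
    if control_mean == 0:
        raise ValueError("control mean is zero; percentage difference undefined")
    return 100.0 * (test_value - control_mean) / control_mean


def is_different(test_value: float,
                 control_values: Sequence[float],
                 threshold_pct: float = 5.0) -> bool:
    """True if the test value deviates by at least +/- threshold_pct percent."""
    return abs(percent_difference(test_value, control_values)) >= threshold_pct
```

For example, a test signal of 12.0 against control signals of 10.0 and 10.0 differs by +20% and would be treated as different at any threshold up to 20%.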
Alternatively or additionally, the presence or amount in the test sample differs from the mean presence or amount in the control samples by >1 standard deviation from the mean presence or amount in the control samples, for example, ≥1.5, ≥2, ≥3, ≥4, ≥5, ≥6, ≥7, ≥8, ≥9, ≥10, ≥11, ≥12, ≥13, ≥14 or ≥15 standard deviations from the mean presence or amount in the control samples. Any suitable means may be used for determining standard deviation (e.g., direct, sum of squares, Welford's), however, in one embodiment, standard deviation is determined using the direct method (i.e., the square root of [the sum of the squares of each sample value minus the mean, divided by the number of samples]).
Alternatively or additionally, by “is different to the presence and/or amount in a control sample” we mean or include that the presence or amount in the test sample does not correlate with the amount in the control sample in a statistically significant manner. By “does not correlate with the amount in the control sample in a statistically significant manner” we mean or include that the presence or amount in the test sample correlates with that of the control sample with a p-value of >0.001, for example, >0.002, >0.003, >0.004, >0.005, >0.01, >0.02, >0.03, >0.04, >0.05, >0.06, >0.07, >0.08, >0.09 or >0.1.
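By way of example only, the direct method and Welford's method for determining standard deviation, and the resulting distance of a test value from the control mean, may be sketched in Python as follows. Function names are hypothetical; both functions divide by the number of samples, consistent with the direct method as defined above.

```python
import math
from typing import Iterable, Sequence


def std_direct(values: Sequence[float]) -> float:
    """Direct method: sqrt(sum((x - mean)^2) / n)."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    m = sum(values) / n
    return math.sqrt(sum((v - m) ** 2 for v in values) / n)


def std_welford(values: Iterable[float]) -> float:
    """Welford's single-pass method, giving the same population value."""
    n = 0
    running_mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for v in values:
        n += 1
        delta = v - running_mean
        running_mean += delta / n
        m2 += delta * (v - running_mean)
    if n == 0:
        raise ValueError("at least one value is required")
    return math.sqrt(m2 / n)


def deviations_from_control(test_value: float, controls: Sequence[float]) -> float:
    """Number of control standard deviations between test value and control mean."""
    sd = std_direct(controls)
    if sd == 0:
        raise ValueError("control standard deviation is zero")
    control_mean = sum(controls) / len(controls)
    return abs(test_value - control_mean) / sd
```

With control values 2, 4, 4, 4, 5, 5, 7 and 9 (mean 5, standard deviation 2), a test value of 9 lies 2 standard deviations from the control mean.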
Any suitable means for determining p-value known to the skilled person can be used, including z-test, t-test, Student's t-test, f-test, Mann-Whitney U test, Wilcoxon signed-rank test and Pearson's chi-squared test.
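As one non-limiting illustration of the tests listed above, a two-sided Mann-Whitney U test may be sketched in plain Python as follows. The function name is hypothetical, and the p-value uses the normal approximation without tie or continuity correction, which is an assumption of this sketch and is only approximate for small sample numbers; an established statistics package would ordinarily be preferred.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U, two-sided p-value) using the normal approximation."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    # Pool the observations, remembering which group each came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1..j and each receives the average rank.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    u = min(u_x, n1 * n2 - u_x)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_value
```

The returned p-value can then be compared against whichever of the thresholds recited above is being applied.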
In an alternative or additional embodiment the method comprises the steps comprising or consisting of:
By “corresponds to the presence and/or amount in a control sample” we mean or include the presence and/or amount is identical to that of a positive control sample; or closer to that of one or more positive control sample than to one or more negative control sample (or to predefined reference values representing the same). Preferably the presence and/or amount is within ±40% of that of the one or more control sample (or mean of the control samples), for example, within ±39%, ±38%, ±37%, ±36%, ±35%, ±34%, ±33%, ±32%, ±31%, ±30%, ±29%, ±28%, ±27%, ±26%, ±25%, ±24%, ±23%, ±22%, ±21%, ±20%, ±19%, ±18%, ±17%, ±16%, ±15%, ±14%, ±13%, ±12%, ±11%, ±10%, ±9%, ±8%, ±7%, ±6%, ±5%, ±4%, ±3%, ±2%, ±1%, ±0.05% or within 0% of the one or more control sample (e.g., the positive control sample).
Alternatively or additionally, the difference in the presence or amount in the test sample is ≤5 standard deviations from the mean presence or amount in the control samples, for example, ≤4.5, ≤4, ≤3.5, ≤3, ≤2.5, ≤2, ≤1.5, ≤1.4, ≤1.3, ≤1.2, ≤1.1, ≤1, ≤0.9, ≤0.8, ≤0.7, ≤0.6, ≤0.5, ≤0.4, ≤0.3, ≤0.2, ≤0.1 or 0 standard deviations from the mean presence or amount in the control samples, provided that the standard deviation ranges for differing and corresponding biomarker expressions do not overlap (e.g., abut, but do not overlap).
Alternatively or additionally, by “corresponds to the presence and/or amount in a control sample” we mean or include that the presence or amount in the test sample correlates with the amount in the control sample in a statistically significant manner. By “correlates with the amount in the control sample in a statistically significant manner” we mean or include that the presence or amount in the test sample correlates with that of the control sample with a p-value of ≤0.05, for example, ≤0.04, ≤0.03, ≤0.02, ≤0.01, ≤0.005, ≤0.004, ≤0.003, ≤0.002, ≤0.001, ≤0.0005 or ≤0.0001.
Differential expression (up-regulation or down-regulation) of biomarkers, or lack thereof, can be determined by any suitable means known to a skilled person. Differential expression is determined to a p value of less than 0.05 (p<0.05), for example, <0.04, <0.03, <0.02, <0.01, <0.009, <0.005, <0.001, <0.0001, <0.00001 or <0.000001. Alternatively or additionally, differential expression is determined using a support vector machine (SVM). Alternatively or additionally, the SVM is an SVM as described below.
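As a minimal sketch only, the "closer to the positive control than to the negative control" comparison described above may be written in Python as follows. The function name and the use of the group means as the reference values are assumptions made for illustration.

```python
from statistics import mean
from typing import Sequence


def corresponds_to_positive(test_value: float,
                            positive_controls: Sequence[float],
                            negative_controls: Sequence[float]) -> bool:
    """True if the test value lies closer to the positive-control mean
    than to the negative-control mean."""
    distance_to_positive = abs(test_value - mean(positive_controls))
    distance_to_negative = abs(test_value - mean(negative_controls))
    return distance_to_positive < distance_to_negative
```

A test value equidistant from both means is treated here as not corresponding to the positive control; that tie-breaking choice is likewise an assumption.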
It will be appreciated by persons skilled in the art that differential expression may relate to a single biomarker or to multiple biomarkers considered in combination (i.e. as a biomarker signature). Thus, a p value may be associated with a single biomarker or with a group of biomarkers. Indeed, proteins having a differential expression p value of greater than 0.05 when considered individually may nevertheless still be useful as biomarkers in accordance with the invention when their expression levels are considered in combination with one or more other biomarkers.
As exemplified in the accompanying examples, the expression of certain biomarkers in a tissue, blood, serum or plasma test sample may be indicative of an SLE-associated disease state in an individual. For example, the relative expression of certain serum proteins in a single test sample may be indicative of the activity of SLE in an individual.
In an alternative or additional embodiment the presence and/or amount in the test sample of the one or more biomarkers measured in step (b) are compared against predetermined reference values representative of the measurements in steps (d) and/or (f).
In one embodiment, step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in Table A, for example, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62 or 63 of the biomarkers defined in Table A.
In one embodiment, step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in Table A(i), for example, two of the biomarkers defined in Table A(i).
In one embodiment, step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in Table A(ii), for example, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 or 19 of the biomarkers defined in Table A(ii).
In one embodiment, step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in Table A(iii), for example, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41 or 42 of the biomarkers defined in Table A(iii).
Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of MYOM2. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of ORP-3. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of APOA1.
Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of APOA4. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of ATP5B. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of CHX10. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of TBC1D9. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of UPF3B. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of LUM. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Digoxin. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Surface Ag X. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Motif (10). Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Motif (13). Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Motif (14). Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Motif (15). Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Motif (2).
Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Motif (4). Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Motif (5). Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Motif (6). Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Motif (7). Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Motif (8). Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Angiomotin. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of C1-INH. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of C1q. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of C3. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of C4. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of CD40. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of CD40 ligand. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Cystatin C. 
Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Factor B. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of GLP-1. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of GLP-1R. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IgM. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-11. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-12. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-13. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-16. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-18. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-1ra. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-2. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-3. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-4. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-5. 
Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-6. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-8. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of IL-9. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Integrin α-10. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of JAK3. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of LDL. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Lewis X. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Lewis Y. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of MCP-1. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of MCP-3. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of MCP-4. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Procathepsin W. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Properdine. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of RANTES. 
Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of Sialyl Lewis X. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of TGF-β1. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of TM peptide. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of TNF-α. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of TNF-β. Alternatively or additionally, step (b) comprises, consists of or excludes measuring the presence and/or amount in the test sample of VEGF.
In one embodiment, the biomarker mRNA and/or amino acid sequences correspond to those available on the GenBank database (http://www.ncbi.nlm.nih.gov/genbank/) and natural variants thereof. In a further embodiment, the biomarker mRNA and/or amino acid sequences correspond to those available on the GenBank database on 7 Jun. 2016.
Alternatively or additionally, the method excludes the use of biomarkers that are not listed in Table A and/or the present Examples section.
By ‘TM peptide’ we mean a peptide derived from a 10TM protein, to which the scFv antibody construct of SEQ ID NO:1 below has specificity (wherein the CDR sequences are bolded):
KGLE
FTISRDNSKNTLYLQMNSLRAEDTAV
WYQQLPGTAPKLLIY
GV
FGGGTKLT
Hence, this scFv may be used or any antibody, or antigen binding fragment thereof, that competes with this scFv for binding to the 10TM protein. For example, the antibody, or antigen binding fragment thereof, may comprise the same CDRs as present in SEQ ID NO:1.
It will be appreciated by persons skilled in the art that such an antibody may be produced with an affinity tag (e.g., at the C-terminus) for purification purposes. For example, an affinity tag of SEQ ID NO:2 below may be utilised:
By ‘Motif #’ (wherein ‘#’ represents a number) we include a protein comprising the selection motif shown in Table B. Alternatively or additionally we include a protein specifically bound by an antibody having the CDRs defined in Table B in respect of the motif in question. Alternatively or additionally the antibody has a framework region as defined in Olsson et al., 2012, ‘Epitope-specificity of recombinant antibodies reveals promiscuous peptide-binding properties.’ Protein Sci., 21(12):1897-910.
By “expression” we include the level or amount of a gene product such as mRNA or protein.
Generally, the systemic lupus erythematosus-associated disease state in a subject is determined with an ROC AUC of at least 0.55, for example with an ROC AUC of at least 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, 0.96, 0.97, 0.98 or with an ROC AUC of at least 0.99. Preferably, the systemic lupus erythematosus-associated disease state in an individual is determined with an ROC AUC of at least 0.85.
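For illustration only, an ROC AUC value of the kind referred to above may be computed from classifier decision scores as sketched below in Python. The function name and the label coding (1 for the disease state present, 0 for absent) are assumptions. The sketch relies on the known equivalence between the ROC AUC and the probability that a randomly chosen positive sample scores higher than a randomly chosen negative sample, with ties counted as one half.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    positives = [s for s, lab in zip(scores, labels) if lab == 1]
    negatives = [s for s, lab in zip(scores, labels) if lab == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be represented")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))
```

The pairwise loop scales with the product of the group sizes, which is acceptable for cohorts of the size typically analysed on a microarray but would be replaced by a rank-based computation for large data sets.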
Typically, the systemic lupus erythematosus-associated disease state in a subject is determined using a support vector machine (SVM), such as those available from http://cran.r-project.org/web/packages/e1071/index.html (e.g. e1071 1.5-24). However, any other suitable means may also be used.
Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other. Intuitively, an SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.
More formally, a support vector machine constructs a hyperplane or set of hyperplanes in a high or infinite dimensional space, which can be used for classification, regression or other tasks. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training datapoints of any class (so-called functional margin), since in general the larger the margin the lower the generalization error of the classifier. For more information on SVMs, see for example, Burges, 1998, Data Mining and Knowledge Discovery, 2:121-167.
In one embodiment of the invention, the SVM is ‘trained’ prior to performing the methods of the invention using proteome samples from subjects assigned to known patient groups (namely, those patients in which the systemic lupus erythematosus-associated disease state is present versus those patients in which it is absent). By running such training samples, the SVM is able to learn what biomarker profiles are associated with the systemic lupus erythematosus-associated disease state. Once the training process is complete, the SVM is then able to determine whether or not the proteome sample tested is from a subject with a systemic lupus erythematosus-associated disease state.
However, this training procedure can be by-passed by pre-programming the SVM with the necessary training parameters. For example, a systemic lupus erythematosus-associated disease state in a subject can be determined using SVM parameters based on the measurement of some or all the biomarkers listed in Table A.
It will be appreciated by skilled persons that suitable SVM parameters can be determined for any combination of the biomarkers listed in Table A by training an SVM with the appropriate selection of data (i.e. biomarker measurements in samples from known patient groups).
Alternatively, the data provided in the present figures and tables may be used to determine a particular SLE-associated disease state according to any other suitable statistical method known in the art, such as Principal Component Analysis (PCA), orthogonal projections to latent structures (OPLS) and other multivariate statistical analyses (e.g., backward stepwise logistic regression model). For a review of multivariate statistical analysis see, for example, Schervish, Mark J. (November 1987). “A Review of Multivariate Analysis”. Statistical Science 2 (4): 396-413 which is incorporated herein by reference.
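Purely by way of illustration, the training and subsequent use of a linear SVM on biomarker measurements may be sketched in plain Python as follows. This is not the e1071 implementation referred to above: it is a simplified linear SVM fitted by full-batch subgradient descent on the regularised hinge loss, and the function names, hyperparameters and the two-biomarker toy data (hypothetical centred signal intensities) are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def train_linear_svm(X: Sequence[Sequence[float]],
                     y: Sequence[int],
                     lam: float = 0.01,
                     lr: float = 0.1,
                     epochs: int = 200) -> Tuple[List[float], float]:
    """Minimise lam/2 * ||w||^2 + mean hinge loss; labels must be -1 or +1."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample inside the margin or misclassified
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 (disease state present) or -1 (absent)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical training data: rows are samples, columns are two biomarkers.
X_train = [[2.0, 2.5], [2.5, 2.0], [3.0, 3.0],        # known disease group
           [-2.0, -2.5], [-2.5, -2.0], [-3.0, -3.0]]  # known control group
y_train = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X_train, y_train)
print(predict(w, b, [2.2, 2.4]), predict(w, b, [-2.1, -2.6]))
```

The fitted values `w` and `b` play the role of the stored parameters mentioned above: once they are known, `predict` can be applied to a new sample without repeating the training step.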
Preferably, the method of the invention has an accuracy of at least 51%, for example 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% accuracy.
Preferably, the method of the invention has a sensitivity of at least 51%, for example 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sensitivity.
Preferably, the method of the invention has a specificity of at least 51%, for example 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% specificity.
By “accuracy” we mean the proportion of correct outcomes of a method, by “sensitivity” we mean the proportion of all positive samples that are correctly classified as positives, and by “specificity” we mean the proportion of all negative samples that are correctly classified as negatives.
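By way of example only, the three measures defined above may be computed from predicted and known classifications as sketched below in Python. The function name and the coding of classifications (1 for positive, 0 for negative) are assumptions of the sketch.

```python
from typing import Dict, Sequence


def classification_metrics(predicted: Sequence[int],
                           actual: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity from binary classifications."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    tn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 0)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both positive and negative samples are required")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # correct / all
        "sensitivity": tp / (tp + fn),                # true positive rate
        "specificity": tn / (tn + fp),                # true negative rate
    }
```

For six samples of which two of three positives and two of three negatives are correctly classified, each of the three measures equals two thirds.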
Alternatively or additionally, the method is for diagnosing systemic lupus erythematosus in an individual; wherein the presence and/or amount in the test sample of the one or more biomarker(s) selected from the group defined in Table A is indicative of whether the individual has systemic lupus erythematosus. For example, step (b) may comprise or consist of measuring the presence and/or amount in the test sample of all of the biomarkers defined in Table A(i), Table A(iii) and/or Table A(iv).
By “diagnosing” we mean determining whether a subject is suffering from SLE. Conventional methods of diagnosing SLE are well known in the art.
The American College of Rheumatology established eleven criteria in 1982 (see Tan et al., 1982, The 1982 revised criteria for the classification of systemic lupus erythematosus, Arthritis. Rheum., 25:1271-7), which were revised in 1997 as a classificatory instrument to operationalise the definition of SLE in clinical trials (see Hochberg, 1997, Updating the American College of Rheumatology revised criteria for the classification of systemic lupus erythematosus, Arthritis. Rheum., 40:1725). For the purpose of identifying patients for clinical studies, a person is taken to have SLE if any 4 out of 11 symptoms are present simultaneously or serially on two separate occasions.
Some people, especially those with antiphospholipid syndrome, may have SLE without four of the above criteria, and also SLE may present with features other than those listed in the criteria (see Asherson et al., 2003, Catastrophic antiphospholipid syndrome: international consensus statement on classification criteria and treatment guidelines, Lupus, 12(7):530-4; Sangle et al., 2005, Livedo reticularis and pregnancy morbidity in patients negative for antiphospholipid antibodies, Ann. Rheum. Dis., 64(1):147-8; and Hughes and Khamashta, 2003, Seronegative antiphospholipid syndrome, Ann. Rheum. Dis., 62(12):1127).
Recursive partitioning has been used to identify more parsimonious criteria (see Edworthy et al., 1988, Analysis of the 1982 ARA lupus criteria data set by recursive partitioning methodology: new insights into the relative merit of individual criteria, J. Rheumatol., 15(10):1493-8). This analysis presented two diagnostic classification trees:
Alternatively or additionally, the diagnosis of SLE is made according to the principles outlined by Fries and Holman, in: Smith L H Jr, ed. Major Problems in Internal Medicine. Vol VI., 1976, which is incorporated herein by reference.
Another alternative set of criteria has been suggested, namely the St. Thomas' Hospital “alternative” criteria of 1998 (see Hughes, 1998, Is it lupus? The St. Thomas' Hospital “alternative” criteria, Clin. Exp. Rheumatol., 16(3):250-2).
However, these criteria were not intended to be used to diagnose individuals. They are time-consuming, subjective, require a high degree of experience to use effectively and have a high frequency of excluding actual SLE sufferers (i.e., diagnosing SLE patients as non-SLE patients). The present invention addresses these problems, providing objective SLE diagnosis.
Alternatively or additionally the method is for characterising systemic lupus erythematosus in an individual; wherein the presence and/or amount in the test sample of the one or more biomarker(s) selected from the group defined in Table A is indicative of whether the individual has systemic lupus erythematosus, subtype 1, subtype 2 or subtype 3. For example, step (b) may comprise or consist of measuring the presence and/or amount in the test sample of all of the biomarkers defined in Table A(i), Table A(ii) and/or Table A(iii).
By “characterising” or “classifying” we include determining the phenotypic subtype of SLE in a subject. SLE1 comprises skin and musculoskeletal involvement but lacks serositis, systemic vasculitis and kidney involvement. SLE2 comprises skin and musculoskeletal involvement, serositis and systemic vasculitis but lacks kidney involvement. SLE3 comprises skin and musculoskeletal involvement, serositis, systemic vasculitis and SLE glomerulonephritis. SLE1, SLE2 and SLE3 represent mild/absent, moderate and severe SLE disease states, respectively.
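The phenotypic definitions above can be read as a simple decision rule over four clinical findings. The sketch below is a literal encoding of those definitions only; it is not the biomarker-based method of the invention, the function and parameter names are hypothetical, and combinations of findings that the definitions do not describe are returned as unclassified.

```python
from typing import Optional


def phenotypic_subtype(
    skin_musculoskeletal: bool,
    serositis: bool,
    systemic_vasculitis: bool,
    glomerulonephritis: bool,
) -> Optional[str]:
    """Map clinical involvement to SLE1, SLE2 or SLE3 as defined in the text."""
    if not skin_musculoskeletal:
        return None  # all three subtypes comprise skin and musculoskeletal involvement
    if serositis and systemic_vasculitis and glomerulonephritis:
        return "SLE3"  # severe: kidney involvement present
    if serositis and systemic_vasculitis and not glomerulonephritis:
        return "SLE2"  # moderate: no kidney involvement
    if not (serositis or systemic_vasculitis or glomerulonephritis):
        return "SLE1"  # mild/absent
    return None  # combination not covered by the definitions


subtype = phenotypic_subtype(True, True, True, False)
```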
Alternatively or additionally the method is for diagnosing systemic lupus erythematosus in an individual, wherein step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in
Alternatively or additionally the method is for diagnosing systemic lupus erythematosus in an individual, wherein step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in
Alternatively or additionally the method is for diagnosing systemic lupus erythematosus in an individual, wherein step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in
Alternatively or additionally the method is for diagnosing and/or characterising systemic lupus erythematosus type 1 in an individual (SLE1); wherein step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in
Alternatively or additionally the method is for diagnosing and/or characterising systemic lupus erythematosus type 2 in an individual (SLE2); wherein step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in
Alternatively or additionally the method is for diagnosing and/or characterising systemic lupus erythematosus type 3 in an individual (SLE3); wherein step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in
Alternatively or additionally the method is for diagnosing and/or characterising systemic lupus erythematosus type 1 (SLE1), systemic lupus erythematosus type 2 (SLE2) or systemic lupus erythematosus type 3 (SLE3); wherein step (b) comprises or consists of measuring the presence and/or amount in the test sample of one or more of the biomarkers defined in
SLE disease severity and progression are conventionally determined through a clinical assessment and scoring using the following (SLEDAI-2000) criteria (see Gladman et al., 2002; J. Rheumatol., 29(2):288-91):
The corresponding score/weight is applied if a descriptor is present at the time of visit or in the preceding 10 to 30 days. The score is then totalled. A skilled person will appreciate that the SLEDAI boundaries of passive (remissive) SLE and active (flaring) SLE may vary according to the patient group being assessed.
Thus, in one embodiment the lower range for passive (remissive) SLE may be any one of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20; the upper range for passive (remissive) SLE may be any one of 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44 or 45; the lower range for active or high active (flaring) SLE may be any one of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20; the upper range for mid severity SLE may be any one of 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84 or 85; the upper range for active or high active (flaring) SLE may be any one of 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104 or 105; with the provisos that the lower range of a particular severity level must be of a lower score than its higher range and the ranges of each severity level may not overlap.
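The two provisos above (each lower bound below its upper bound, and no overlap between severity levels) can be checked mechanically. The following is a minimal sketch, assuming that each range is an inclusive pair of integer SLEDAI scores; the function name and the example boundaries are hypothetical.

```python
from typing import Dict, Tuple


def validate_severity_ranges(ranges: Dict[str, Tuple[int, int]]) -> None:
    """Raise ValueError if any proviso is violated; return None otherwise."""
    # Proviso 1: the lower bound of each level must be below its upper bound.
    for name, (lower, upper) in ranges.items():
        if lower >= upper:
            raise ValueError(f"{name}: lower bound must be below upper bound")
    # Proviso 2: inclusive integer ranges of different levels may not overlap.
    ordered = sorted(ranges.items(), key=lambda item: item[1][0])
    for (name_a, (_, upper_a)), (name_b, (lower_b, _)) in zip(ordered, ordered[1:]):
        if lower_b <= upper_a:
            raise ValueError(f"{name_a} and {name_b} overlap")


# Hypothetical boundaries chosen from the lists above
validate_severity_ranges({"passive": (0, 4), "active": (5, 105)})
```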
However, in one embodiment a total SLEDAI score of 0-4 indicates passive (remissive) SLE and a score of 5 or greater indicates active (flaring).
Alternatively or additionally, an increase in SLEDAI score of >3 from the previous assessment indicates mild or moderate flare. An increase in SLEDAI score of >12 from the previous assessment indicates severe flare. A decrease in SLEDAI score of >3 from the previous assessment indicates mild or moderate remission. A decrease in SLEDAI score of >12 from the previous assessment indicates advanced remission. An increase or decrease in SLEDAI score of ≤3 indicates stable (neither flaring nor non-flaring) SLE.
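The thresholds of this embodiment may be sketched as two small functions, one for the total score and one for the change since the previous assessment. This is illustrative only; the function names are hypothetical, and a change of 3 or less in either direction is treated as stable, which is how the final sentence above is read here.

```python
def sledai_status(total_score: int) -> str:
    """Classify a total SLEDAI score: 0-4 passive, 5 or greater active."""
    if total_score < 0:
        raise ValueError("SLEDAI score cannot be negative")
    return "passive (remissive)" if total_score <= 4 else "active (flaring)"


def sledai_change(previous: int, current: int) -> str:
    """Classify the change in SLEDAI score between two assessments."""
    delta = current - previous
    if delta > 12:
        return "severe flare"
    if delta > 3:
        return "mild or moderate flare"
    if delta < -12:
        return "advanced remission"
    if delta < -3:
        return "mild or moderate remission"
    return "stable"  # change of 3 or less in either direction


status = sledai_status(6)
change = sledai_change(previous=2, current=6)
```

The more extreme thresholds are tested first so that a change of more than 12 is not reported as a mild or moderate flare or remission.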
In one embodiment, the control sample of step (c) is provided from a healthy individual or an individual with systemic lupus erythematosus.
In an alternative or additional embodiment step (b) comprises measuring the expression of the protein or polypeptide of the one or more biomarker(s).
Methods of detecting and/or measuring the concentration of protein and/or nucleic acid are well known to those skilled in the art, see for example Sambrook and Russell, 2001, Cold Spring Harbor Laboratory Press.
Preferred methods for detection and/or measurement of protein include Western blot, North-Western blot, enzyme-linked immunosorbent assays (ELISA), antibody microarray, tissue microarray (TMA), immunoprecipitation, in situ hybridisation and other immunohistochemistry techniques, radioimmunoassay (RIA), immunoradiometric assays (IRMA) and immunoenzymatic assays (IEMA), including sandwich assays using monoclonal and/or polyclonal antibodies. Exemplary sandwich assays are described by David et al., in U.S. Pat. Nos. 4,376,110 and 4,486,530, hereby incorporated by reference. Antibody staining of cells on slides may also be used, following methods well known in cytology laboratory diagnostic tests.
Typically, ELISA involves the use of enzymes which give a coloured reaction product, usually in solid phase assays. Enzymes such as horseradish peroxidase and phosphatase have been widely employed. A way of amplifying the phosphatase reaction is to use NADP as a substrate to generate NAD which now acts as a coenzyme for a second enzyme system. Pyrophosphatase from Escherichia coli provides a good conjugate because the enzyme is not present in tissues, is stable and gives a good reaction colour. Chemi-luminescent systems based on enzymes such as luciferase can also be used.
Conjugation with the vitamin biotin is frequently used since this can readily be detected by its reaction with enzyme-linked avidin or streptavidin to which it binds with great specificity and affinity.
Preferred methods for detection and/or measurement of nucleic acid (e.g. mRNA) include Southern blot, Northern blot, polymerase chain reaction (PCR), reverse transcriptase PCR (RT-PCR), quantitative real-time PCR (qRT-PCR), nanoarray, microarray, macroarray, autoradiography and in situ hybridisation.
In one embodiment, step (b), (d) and/or step (f) is performed using a first binding agent capable of binding to the one or more biomarker(s).
Binding agents (also referred to as binding molecules) can be selected from a library, based on their ability to bind a given motif, as discussed below.
In one embodiment, the first binding agent is an antibody or a fragment thereof.
Thus, a fragment may contain one or more of the variable heavy (VH) or variable light (VL) domains. For example, the term antibody fragment includes Fab-like molecules (Better et al (1988) Science 240, 1041); Fv molecules (Skerra et al (1988) Science 240, 1038); single-chain Fv (ScFv) molecules where the VH and VL partner domains are linked via a flexible oligopeptide (Bird et al (1988) Science 242, 423; Huston et al (1988) Proc. Natl. Acad. Sci. USA 85, 5879) and single domain antibodies (dAbs) comprising isolated V domains (Ward et al (1989) Nature 341, 544).
The term “antibody variant” includes any synthetic antibodies, recombinant antibodies or antibody hybrids, such as but not limited to, a single-chain antibody molecule produced by phage-display of immunoglobulin light and/or heavy chain variable and/or constant regions, or other immunointeractive molecule capable of binding to an antigen in an immunoassay format that is known to those skilled in the art.
A general review of the techniques involved in the synthesis of antibody fragments which retain their specific binding sites is to be found in Winter & Milstein (1991) Nature 349, 293-299.
Additionally or alternatively at least one type, more typically all of the types, of the binding molecules is an aptamer.
Molecular libraries such as antibody libraries (Clackson et al, 1991, Nature 352, 624-628; Marks et al, 1991, J Mol Biol 222(3): 581-97), peptide libraries (Smith, 1985, Science 228(4705): 1315-7), expressed cDNA libraries (Santi et al (2000) J Mol Biol 296(2): 497-508), libraries on other scaffolds than the antibody framework such as affibodies (Gunneriusson et al, 1999, Appl Environ Microbiol 65(9): 4134-40) or libraries based on aptamers (Kenan et al, 1999, Methods Mol Biol 118, 217-31) may be used as a source from which binding molecules that are specific for a given motif are selected for use in the methods of the invention.
The molecular libraries may be expressed in vivo in prokaryotic (Clackson et al, 1991, op. cit.; Marks et al, 1991, op. cit.) or eukaryotic cells (Kieke et al, 1999, Proc Natl Acad Sci USA, 96(10):5651-6) or may be expressed in vitro without involvement of cells (Hanes & Pluckthun, 1997, Proc Natl Acad Sci USA 94(10):4937-42; He & Taussig, 1997, Nucleic Acids Res 25(24):5132-4; Nemoto et al, 1997, FEBS Lett, 414(2):405-8).
In cases where protein-based libraries are used, the genes encoding the libraries of potential binding molecules are often packaged in viruses and the potential binding molecule is displayed at the surface of the virus (Clackson et al, 1991, op. cit.; Marks et al, 1991, op. cit; Smith, 1985, op. cit.).
The most commonly used such system today is filamentous bacteriophage displaying antibody fragments at their surfaces, the antibody fragments being expressed as a fusion to the minor coat protein of the bacteriophage (Clackson et al, 1991, op. cit.; Marks et al, 1991, op. cit). However, other systems for display using other viruses (EP 39578), bacteria (Gunneriusson et al, 1999, op. cit.; Daugherty et al, 1998, Protein Eng 11(9):825-32; Daugherty et al, 1999, Protein Eng 12(7):613-21), and yeast (Shusta et al, 1999, J Mol Biol 292(5):949-56) have also been used.
In addition, recently, display systems utilising linkage of the polypeptide product to its encoding mRNA in so called ribosome display systems (Hanes & Pluckthun, 1997, op. cit.; He & Taussig, 1997, op. cit.; Nemoto et al, 1997, op. cit.), or alternatively linkage of the polypeptide product to the encoding DNA (see U.S. Pat. No. 5,856,090 and WO 98/37186) have been presented.
When potential binding molecules are selected from libraries one or a few selector peptides having defined motifs are usually employed. Amino acid residues that provide structure, decreasing flexibility in the peptide or charged, polar or hydrophobic side chains allowing interaction with the binding molecule may be used in the design of motifs for selector peptides. For example:
Typically selection of binding molecules may involve the use of array technologies and systems to analyse binding to spots corresponding to types of binding molecules.
In one embodiment, the antibody or fragment thereof is a recombinant antibody or fragment thereof (such as an scFv).
By “ScFv molecules” we mean molecules wherein the VH and VL partner domains are linked via a flexible oligopeptide.
The advantages of using antibody fragments, rather than whole antibodies, are several-fold. The smaller size of the fragments may lead to improved pharmacological properties, such as better penetration of solid tissue. Effector functions of whole antibodies, such as complement binding, are removed. Fab, Fv, ScFv and dAb antibody fragments can all be expressed in and secreted from E. coli, thus allowing the facile production of large amounts of the said fragments.
Whole antibodies and F(ab′)2 fragments are “bivalent”. By “bivalent” we mean that the said antibodies and F(ab′)2 fragments have two antigen combining sites. In contrast, Fab, Fv, ScFv and dAb fragments are monovalent, having only one antigen combining site.
The antibodies may be monoclonal or polyclonal. Suitable monoclonal antibodies may be prepared by known techniques, for example those disclosed in “Monoclonal Antibodies: A manual of techniques”, H Zola (CRC Press, 1988) and in “Monoclonal Hybridoma Antibodies: Techniques and applications”, J G R Hurrell (CRC Press, 1982), both of which are incorporated herein by reference.
In one embodiment, the antibody or fragment thereof is selected from the group consisting of: scFv; Fab; a binding domain of an immunoglobulin molecule.
Alternatively or additionally, the antibody or antigen-binding fragment is capable of competing for binding to a biomarker specified in Table A with an antibody for that biomarker defined in Table E.
By “capable of competing” for binding to a biomarker specified in Table A with an antibody molecule as defined herein (or a variant, fusion or derivative of said antibody or antigen-binding fragment, or a fusion of a said variant or derivative thereof, which retains the binding specificity for the required biomarker) we mean or include that the tested antibody or antigen-binding fragment is capable of inhibiting or otherwise interfering, at least in part, with the binding of an antibody molecule as defined herein.
For example, the antibody or antigen-binding fragment may be capable of inhibiting the binding of an antibody molecule defined herein by at least 10%, for example at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or even by 100%.
Competitive binding may be determined by methods well known to those skilled in the art, such as ELISA (as described herein) and/or SPR (as described in the accompanying Examples).
Alternatively or additionally, the antibody or antigen-binding fragment is an antibody defined in Table E or an antigen-binding fragment thereof, or a variant thereof.
Alternatively or additionally, the antibody or antigen-binding fragment comprises a VH and VL domain specified in Table E, or a variant thereof.
By ‘variants’ of the antibody or antigen-binding fragment of the invention we include insertions, deletions and substitutions, either conservative or non-conservative. In particular we include variants of the sequence of the antibody or antigen-binding fragment where such variations do not substantially alter the activity of the antibody or antigen-binding fragment. In particular, we include variants of the antibody or antigen-binding fragment where such changes do not substantially alter the binding specificity for the respective biomarker specified in Table E.
The polypeptide variant may have an amino acid sequence which has at least 70% identity with one or more of the amino acid sequences of the antibody or antigen-binding fragment of the invention as defined herein, for example at least 75%, at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identity.
The percent sequence identity between two polypeptides may be determined using suitable computer programs, for example the GAP program of the University of Wisconsin Genetic Computing Group and it will be appreciated that percent identity is calculated in relation to polypeptides whose sequences have been aligned optimally.
The alignment may alternatively be carried out using the Clustal W program (as described in Thompson et al., 1994, Nucl. Acid Res. 22:4673-4680, which is incorporated herein by reference).
The parameters used may be as follows:
Alternatively, the BESTFIT program may be used to determine local sequence alignments.
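As an illustration of how percent identity is calculated once two polypeptide sequences have been aligned optimally, the following sketch counts identical residues over the aligned columns. It assumes that the alignment has already been produced by a program such as those named above and that gaps are written as “-”; the function name and the example sequences are hypothetical. Dividing by the number of aligned columns is only one of several conventions, so the values may differ from those reported by GAP, Clustal W or BESTFIT.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column is a match only if both residues are identical and not a gap.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical 10-residue fragments differing at a single position
identity = percent_identity("EVQLVESGGG", "EVQLVQSGGG")
```

In this hypothetical example nine of ten aligned positions are identical, so the variant would meet the “at least 90%” level but not the “at least 95%” level.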
The antibodies may share CDRs (e.g., 1, 2, 3, 4, 5 or 6 CDRs) with one or more of the antibodies defined in Table E.
CDRs can be defined using any suitable method known in the art. Commonly used methods include Paratome (Kunik, Ashkenazi and Ofran, 2012, ‘Paratome: an online tool for systematic identification of antigen-binding regions in antibodies based on sequence or structure’ Nucl. Acids Res., 40:W521-W524; http://www.ofranlab.org/paratome/), Kabat (Wu and Kabat, 1970, ‘An analysis of the sequences of the variable regions of Bence Jones proteins and myeloma light chains and their implications for antibody complementarity’ J. Exp. Med., 132:211-250), Chothia (Chothia and Lesk, 1987, ‘Canonical structures for the hypervariable regions of immunoglobulins’ J. Mol. Biol., 196:901-917; Chothia et al., 1989, ‘Conformations of immunoglobulin hypervariable regions’ Nature, 342:877-883) and IMGT (Lefranc et al., 2003, ‘IMGT unique numbering for immunoglobulin and T cell receptor variable domains and Ig superfamily V-like domains’ Dev. Comp. Immunol., 27:55-77; Lefranc et al., 2005, ‘IMGT unique numbering for immunoglobulin and T cell receptor constant domains and Ig superfamily C-like domains’ Dev. Comp. Immunol., 29:185-203; http://www.imgt.org). For example, the method used may be the IMGT method.
Alternatively or additionally, the first binding agent is immobilised on a surface (e.g., on a multiwell plate or array).
In one embodiment, the one or more biomarker(s) in the test sample is labelled with a detectable moiety.
In one embodiment, the one or more biomarker(s) in the control sample is labelled with a detectable moiety (which may be the same or different from the detectable moiety used to label the test sample).
By a “detectable moiety” we include the meaning that the moiety is one which may be detected and the relative amount and/or location of the moiety (for example, the location on an array) determined.
Detectable moieties are well known in the art.
A detectable moiety may be a fluorescent and/or luminescent and/or chemiluminescent moiety which, when exposed to specific conditions, may be detected. For example, a fluorescent moiety may need to be exposed to radiation (i.e. light) at a specific wavelength and intensity to cause excitation of the fluorescent moiety, thereby enabling it to emit detectable fluorescence at a specific wavelength that may be detected.
Alternatively, the detectable moiety may be an enzyme which is capable of converting a (preferably undetectable) substrate into a detectable product that can be visualised and/or detected. Examples of suitable enzymes are discussed in more detail below in relation to, for example, ELISA assays.
Alternatively, the detectable moiety may be a radioactive atom which is useful in imaging. Suitable radioactive atoms include 99mTc and 123I for scintigraphic studies.
Other readily detectable moieties include, for example, spin labels for magnetic resonance imaging (MRI) such as 123I again, 131I, 111In, 19F, 13C, 15N, 17O, gadolinium, manganese or iron. Clearly, the agent to be detected (such as, for example, the one or more proteins in the test sample and/or control sample described herein and/or an antibody molecule for use in detecting a selected protein) must have sufficient of the appropriate atomic isotopes in order for the detectable moiety to be readily detectable.
The radio- or other labels may be incorporated into the agents of the invention (i.e. the proteins present in the samples of the methods of the invention and/or the binding agents of the invention) in known ways. For example, if the binding moiety is a polypeptide it may be biosynthesised or may be synthesised by chemical amino acid synthesis using suitable amino acid precursors involving, for example, fluorine-19 in place of hydrogen. Labels such as 99mTc, 123I, 186Rh, 188Rh and 111In can, for example, be attached via cysteine residues in the binding moiety. Yttrium-90 can be attached via a lysine residue. The IODOGEN method (Fraker et al (1978) Biochem. Biophys. Res. Comm. 80, 49-57) can be used to incorporate 123I. Reference (“Monoclonal Antibodies in Immunoscintigraphy”, J-F Chatal, CRC Press, 1989) describes other methods in detail. Methods for conjugating other detectable moieties (such as enzymatic, fluorescent, luminescent, chemiluminescent or radioactive moieties) to proteins are well known in the art.
Preferably, the detectable moiety is selected from the group consisting of: a fluorescent moiety, a luminescent moiety, a chemiluminescent moiety, a radioactive moiety, and an enzymatic moiety.
In an alternative or additional embodiment step (b), (d) and/or (f) comprises measuring the expression of a nucleic acid molecule encoding the one or more biomarkers. The nucleic acid molecule may be a cDNA molecule or an mRNA molecule. Preferably the nucleic acid molecule is an mRNA molecule. Also preferably the nucleic acid molecule is a cDNA molecule.
Hence, measuring the expression of the one or more biomarker(s) in step (b) may be performed using a method selected from the group consisting of Southern hybridisation, Northern hybridisation, polymerase chain reaction (PCR), reverse transcriptase PCR (RT-PCR), quantitative real-time PCR (qRT-PCR), nanoarray, microarray, macroarray, autoradiography and in situ hybridisation. Preferably measuring the expression of the one or more biomarker(s) in step (b) is determined using a DNA microarray. Hence, the method may comprise or consist of measuring the expression of the one or more biomarker(s) in step (b) using one or more binding moiety, each capable of binding selectively to a nucleic acid molecule encoding one of the biomarkers identified in Table A.
In an alternative or additional embodiment the one or more binding moieties each comprise or consist of a nucleic acid molecule such as DNA, RNA, PNA, LNA, GNA, TNA or PMO (preferably DNA). Preferably the one or more binding moieties are 5 to 100 nucleotides in length. More preferably, the one or more nucleic acid molecules are 15 to 35 nucleotides in length. The binding moiety may comprise a detectable moiety.
Suitable binding agents (also referred to as binding molecules) may be selected or screened from a library based on their ability to bind a given nucleic acid, protein or amino acid motif.
In an alternative or additional embodiment measuring the expression of the one or more biomarker(s) in step (b), (d) and/or (f) is performed using one or more binding moieties, each individually capable of binding selectively to a nucleic acid molecule encoding one of the biomarkers identified in Table A.
In an alternative or additional embodiment, the nucleic acid binding moiety comprises a detectable moiety as defined above.
In one embodiment, step (b) is performed using an array.
In one embodiment, step (d) is performed using an array.
For example, the array may be a bead-based array or a surface-based array.
In one embodiment, the array is selected from the group consisting of macroarrays, microarrays and nanoarrays.
Arrays per se are well known in the art. Typically they are formed of a linear or two-dimensional structure having spaced apart (i.e. discrete) regions (“spots”), each having a finite area, formed on the surface of a solid support. An array can also be a bead structure where each bead can be identified by a molecular code or colour code or identified in a continuous flow. Analysis can also be performed sequentially where the sample is passed over a series of spots each adsorbing the class of molecules from the solution. The solid support is typically glass or a polymer, the most commonly used polymers being cellulose, polyacrylamide, nylon, polystyrene, polyvinyl chloride or polypropylene. The solid supports may be in the form of tubes, beads, discs, silicon chips, microplates, polyvinylidene difluoride (PVDF) membrane, nitrocellulose membrane, nylon membrane, other porous membrane, non-porous membrane (e.g. plastic, polymer, perspex, silicon, amongst others), a plurality of polymeric pins, or a plurality of microtitre wells, or any other surface suitable for immobilising proteins, polynucleotides and other suitable molecules and/or conducting an immunoassay.
The binding processes are well known in the art and generally consist of cross-linking, covalently binding or physically adsorbing a protein molecule, polynucleotide or the like to the solid support. By using well-known techniques, such as contact or non-contact printing, masking or photolithography, the location of each spot can be defined. For reviews see Jenkins, R. E., Pennington, S. R. (2001, Proteomics, 2, 13-29) and Lal et al (2002, Drug Discov Today 15; 7(18 Suppl):S143-9).
Typically, the array is a microarray. By “microarray” we include the meaning of an array of regions having a density of discrete regions of at least about 100/cm², and preferably at least about 1000/cm². The regions in a microarray have typical dimensions, e.g., diameters, in the range of about 10-250 μm, and are separated from other regions in the array by about the same distance. The array may also be a macroarray or a nanoarray.
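The relationship between spot size, spacing and density can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that the spots lie on a regular square grid and that “about the same distance” means an edge-to-edge gap equal to the spot diameter; neither assumption is stated in the definition above, and the function name is hypothetical.

```python
def spot_density_per_cm2(diameter_um: float, spacing_um: float) -> float:
    """Spots per square centimetre for a regular square grid.

    diameter_um: spot diameter in micrometres.
    spacing_um: edge-to-edge gap between neighbouring spots in micrometres.
    """
    if diameter_um <= 0 or spacing_um < 0:
        raise ValueError("diameter must be positive and spacing non-negative")
    pitch_cm = (diameter_um + spacing_um) * 1e-4  # centre-to-centre, 1 um = 1e-4 cm
    return 1.0 / (pitch_cm * pitch_cm)


# Ends of the typical 10-250 um range, with spacing equal to diameter
density_large_spots = spot_density_per_cm2(250, 250)
density_small_spots = spot_density_per_cm2(10, 10)
```

Under these assumptions, spots of 250 μm give roughly 400 spots/cm² and spots of 10 μm give roughly 250,000 spots/cm², both above the lower limit of about 100/cm².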
Once suitable binding molecules (discussed above) have been identified and isolated, the skilled person can manufacture an array using methods well known in the art of molecular biology.
In one embodiment, step (b) is performed using an assay comprising a second binding agent capable of binding to the one or more proteins, the second binding agent having a detectable moiety.
In one embodiment, step (d) is performed using an assay comprising a second binding agent capable of binding to the one or more proteins, the second binding agent having a detectable moiety.
In one embodiment, the second binding agent is an antibody or a fragment thereof (for example, as described above in relation to the first binding agent).
Typically, the assay is an ELISA (Enzyme Linked Immunosorbent Assay), which typically involves the use of enzymes which give a coloured reaction product, usually in solid phase assays. Enzymes such as horseradish peroxidase and phosphatase have been widely employed. A way of amplifying the phosphatase reaction is to use NADP as a substrate to generate NAD which now acts as a coenzyme for a second enzyme system. Pyrophosphatase from Escherichia coli provides a good conjugate because the enzyme is not present in tissues, is stable and gives a good reaction colour. Chemi-luminescent systems based on enzymes such as luciferase can also be used.
Conjugation with the vitamin biotin is also employed since this can readily be detected by its reaction with enzyme-linked avidin or streptavidin to which it binds with great specificity and affinity.
It will be appreciated by persons skilled in the art that there is a degree of fluidity in the biomarker composition of the signatures of the invention. Thus, different combinations of the biomarkers may be equally useful in the diagnosis, prognosis and/or characterisation of SLE. In this way, each biomarker (either alone or in combination with one or more other biomarkers) makes a contribution to the signature.
In an alternative or additional embodiment the sample provided in step (a), (c) and/or (e) is selected from the group consisting of unfractionated blood, plasma, serum, tissue fluid, breast tissue, milk, bile and urine. In an alternative or additional embodiment the sample provided in step (a), (c) and/or (e) is selected from the group consisting of unfractionated blood, plasma and serum. In an alternative or additional embodiment the sample provided in step (a), (c) and/or (e) is serum. In an alternative or additional embodiment the sample provided in step (a), (c) and/or (e) is urine. In an alternative or additional embodiment a serum sample and a urine sample are provided in step (a), (c) and/or (e).
In an alternative or additional embodiment the method comprises recording the diagnosis, prognosis or characterisation on a physical or electronic data carrier (i.e., physical or electronic file).
In an alternative or additional embodiment the method comprises the step of:
In an alternative or additional embodiment in the event that the individual is diagnosed with SLE, the method comprises the step of:
As noted above, in the event that the individual is not diagnosed with SLE, they may be subjected to further monitoring for SLE (for example, using the method of the present invention).
In an alternative or additional embodiment in the event that the individual is characterised or prognosed as having a flare in SLE (mild, moderate or severe), the method comprises the step of:
As noted above, in the event that the individual is not diagnosed with SLE flare, they may be subjected to further monitoring for SLE flare (for example, using the method of the present invention).
The monitoring may be repeated at least every 5 days, for example, at least every 10 days, at least every 15 days, at least every 20 days, at least every 25 days, at least every 30 days, at least every 2 months, at least every 3 months, at least every 4 months, at least every 5 months, at least every 6 months, at least every 7 months, at least every 8 months, at least every 9 months, at least every 10 months, at least every 11 months, at least every 12 months, at least every 18 months or at least every 24 months.
Monitoring may also continue in a repeated fashion regardless of whether or not the individual is found to have SLE or SLE flare.
In an alternative or additional embodiment, a more aggressive treatment may be provided for more aggressive SLE types (e.g., SLE3) or during an SLE flare. Suitable therapeutic approaches can be determined by the skilled person according to the prevailing guidance at the time, for example, the American College of Rheumatology Guidelines for Screening, Treatment, and Management of Lupus Nephritis (Hahn et al., 2012, Arthritis Care & Research, 64(6):797-808) which is incorporated herein by reference.
In an alternative or additional embodiment the SLE therapy is selected from the group consisting of systemic inflammation directed treatment (Antimalarials (Hydroxychloroquine), Corticosteroids, Pulse (or mini-pulse) cyclophosphamide (CTX) (with or without corticosteroid co-administration), Mycophenolate mofetil (MMF), Azathioprine (AZA), Methotrexate (MTX)), immune cell targeted therapies (Anti-CD20 antibodies (rituximab, ofatumumab, ocrelizumab and veltuzumab), anti-CD22 (Epratuzumab), abetimus (LJP-394), belimumab, atacicept), co-stimulatory signalling pathway targeting (anti-ICOS (inducible costimulator) antibody, anti-ICOS-L (inducible costimulator ligand) antibody, anti-B7RP1 antibody (AMG557)), anti-cytokine therapy (anti-TNF therapy, anti-IL-10, anti-IL-1, anti-IL-18, anti-IL-6, anti-IL-15, memantine, anti-interferon-alpha (IFN-α), plasmapheresis (or plasma exchange), intravenous immunoglobulin (IVIG), DNA vaccination, statins, antioxidants (N-acetylcysteine (NAC), Cysteamine (CYST)), anti-IgE antibodies and anti-FcϵRIa antibodies, Syk (spleen tyrosine kinase) inhibition, and Jak (Janus kinase) inhibition), kidney excision and kidney transplant.
Accordingly, the present invention comprises an anti-SLE agent for use in treating SLE wherein the dosage regime is determined based on the results of the method of the first aspect of the invention.
The present invention comprises the use of an anti-SLE agent in treating SLE wherein the dosage regime is determined based on the results of the method of the first aspect of the invention.
The present invention comprises the use of an anti-SLE agent in the manufacture of a medicament for treating SLE wherein the dosage regime is determined based on the results of the method of the first aspect of the invention.
The present invention comprises a method of treating SLE comprising providing a sufficient amount of an anti-SLE agent wherein the type and amount of anti-SLE agent sufficient to treat the SLE is determined based on the results of the method of the first aspect of the invention.
A second aspect of the invention provides an array for determining a systemic lupus erythematosus-associated disease state in an individual comprising one or more binding agent as defined above in relation to the first aspect of the invention.
In one embodiment, the array is for use in a method according to the first aspect of the invention.
In another embodiment the array is for determining a disease state defined in the first aspect of the invention comprising or consisting of measuring the presence and/or amount of a corresponding biomarker or group of biomarkers defined in the first aspect of the invention.
In a further embodiment the array is an array defined in the first aspect of the invention.
In one embodiment, the one or more binding agent is capable of binding to all of the proteins defined in Table A.
A third aspect of the invention provides the use of one or more biomarkers selected from the group defined in Table A as a biomarker for determining a systemic lupus erythematosus-associated disease state in an individual. In one embodiment, all of the biomarkers defined in Table A are used as a biomarker for determining a systemic lupus erythematosus-associated disease state in an individual.
A fourth aspect of the invention provides the use of one or more biomarkers selected from the group defined in Table A in the manufacture of a medicament (e.g. a diagnostic agent) for determining a Systemic Lupus Erythematosus-associated disease state in an individual.
A fifth aspect of the invention provides one or more biomarkers selected from the group defined in Table A for determining a Systemic Lupus Erythematosus-associated disease state in an individual.
A sixth aspect of the invention provides use of one or more binding agent as defined in the first aspect of the invention for determining a Systemic Lupus Erythematosus-associated disease state in an individual. Alternatively or additionally all of the biomarkers defined in Table A are used for determining a Systemic Lupus Erythematosus-associated disease state in an individual. In one embodiment, the binding agent(s) is/are antibodies or antigen-binding fragments thereof.
A seventh aspect of the invention provides use of one or more binding agent as defined in the first aspect of the invention for the manufacture of a medicament (e.g. a diagnostic agent) for determining a Systemic Lupus Erythematosus-associated disease state in an individual. In one embodiment, the binding agent(s) is/are antibodies or antigen-binding fragments thereof.
An eighth aspect of the invention provides one or more binding agent as defined in the first aspect of the invention for determining a Systemic Lupus Erythematosus-associated disease state in an individual. In one embodiment, the binding agent(s) is/are antibodies or antigen-binding fragments thereof.
A ninth aspect of the invention provides a kit for determining a systemic lupus erythematosus-associated disease state in an individual comprising:
A tenth aspect of the invention provides a method of treating systemic lupus erythematosus in an individual comprising the steps of:
By “Systemic lupus erythematosus therapy” we include treatment of the symptoms of systemic lupus erythematosus (SLE), most notably fatigue, joint pain/swelling and/or skin rashes.
Other symptoms of SLE can include:
Typically, treatment for SLE may include one or more of the following (see also above):
An eleventh aspect of the invention provides a computer program for operating the methods of the invention, for example, for interpreting the expression data of step (c) (and subsequent expression measurement steps) and thereby diagnosing or determining a systemic lupus erythematosus-associated disease state. The computer program may be a programmed SVM. The computer program may be recorded on a suitable computer-readable carrier known to persons skilled in the art. Suitable computer-readable carriers may include compact discs (including CD-ROMs, DVDs, Blu-ray discs and the like), floppy discs, flash memory drives, ROM or hard disc drives. The computer program may be installed on a computer suitable for executing the computer program.
Preferred, non-limiting examples which embody certain aspects of the invention will now be described with reference to the following tables and above-described figures:
Objective. To define a multiplex serum biomarker signature associated with systemic lupus erythematosus (SLE).
Methods. Affinity proteomics, represented by 195-plex recombinant antibody microarrays, targeting mainly immunoregulatory proteins, was used to perform protein expression profiling of crude, biotinylated serum samples. State-of-the-art bioinformatics was used to define condensed multiplex signatures associated with SLE, and the classification power was evaluated in terms of receiver operating characteristic curves.
Results. The results showed that a condensed (25-plex), pre-validated serum biomarker signature classifying SLE vs. healthy controls with high specificity and sensitivity could be pin-pointed. The panel was composed of novel as well as already known candidate markers. Further, the data indicated that SLE vs. healthy controls could be classified irrespective of the phenotype, reflecting the severity of the disease. The biological relevance of the biomarkers was supported by data mining and pathway analysis.
Conclusion. Our study showed that the immune system could be exploited as a specific and sensitive sensor for SLE. SLE-associated serum biomarker panels have been identified, enhancing our fundamental knowledge of SLE, and in the long run allowing serum-based diagnosis of SLE.
Systemic lupus erythematosus (SLE) is a chronic and multisystem autoimmune connective-tissue disease (1, 2), with disease spectra ranging from subtle symptoms to life-threatening multi-organ failure (3, 4). Some hallmark characteristics of SLE include production of autoantibodies, deposition of immune complexes in tissues, and excessive complement activation (4, 5). Despite major efforts, the complex etiology and pathogenesis, heterogeneous presentation and unpredictable course still pose major challenges in the monitoring and diagnosis of the disease (5-7).
In more detail, the clinical manifestations vary widely among patients, and the signs and symptoms evolve over time and overlap with those of other autoimmune diseases, which is why SLE is often misdiagnosed and/or overlooked (6, 8, 9). In fact, patients may spend up to four years and see three or more physicians before the disease is correctly diagnosed (8). On the other hand, SLE is also often over-diagnosed (10). The diagnosis of SLE in clinical practice is usually made according to the principles outlined by Fries and Holman (11): presence of typical manifestations from at least two organ systems in combination with immunological abnormality consistent with SLE in the absence of a better diagnostic alternative. However, in recent years it has been concluded that a biopsy-verified lupus glomerulonephritis in combination with immunological abnormality should be accepted for SLE diagnosis. Hence, novel means for improved diagnosis of SLE are needed.
Further, SLE classification criteria have been defined by the American College for Rheumatology (ACR) (12, 13) and more recently from systemic lupus International Collaborating Clinics (SLICC) (14). According to ACR, SLE is classified when at least 4 of 11 clinical and/or immunological criteria, shared by many diseases, are fulfilled. In the case of SLICC, SLE is classified if i) at least 4 of 17 clinical and immunological criteria, or ii) biopsy verified lupus nephritis in the presence of antinuclear antibodies (ANA) or anti-dsDNA antibodies are met. In practice, this means that patients can display a very diverse set of symptoms, but all still be classified as similar.
Although major efforts have been made to decipher SLE-associated biomarkers, the output of validated and clinically useful biomarkers is still limited (6, 15-19). In fact, there is no single laboratory blood- or urine-based test yet at hand that specifically and accurately can confirm or rule out the diagnosis of SLE (6, 15, 18, 19). This lack of adequate biomarkers for SLE has hampered proper clinical management of patients with SLE (15). Considering the complexity and heterogeneity of SLE, a multiplex biomarker panel, rather than a single biomarker may be required to resolve this clinical need (20), placing high demands on the technologies used for biomarker discovery.
In this regard, omics-based technologies hold great promise as one route for biomarker discovery in SLE (17). We have recently used affinity proteomics, represented by recombinant antibody microarrays (21, 22), for serum biomarker discovery in SLE (23) (Carlsson et al, unpublished observations). Targeting mainly immunoregulatory proteins in crude, non-fractionated serum samples, the results showed that candidate serum biomarker panels associated with SLE could be deciphered.
In this study, we have extended our previous efforts, and performed differential serum protein expression profiling of a large cohort of SLE patients vs. healthy controls. To this end, a re-optimized recombinant antibody microarray platform, displaying superior performance (23) (Delfani et al, unpublished data), and targeting a larger set of immunoregulatory analytes, was applied. In addition, an optimized procedure for handling and analysing the microarray data was also adopted (Delfani et al, unpublished data). The results showed that a condensed (25-plex), pre-validated serum biomarker signature classifying SLE vs. healthy controls with high specificity and sensitivity could be identified. Further, the data also outlined that SLE could be classified irrespective of the phenotype, reflecting the severity of the disease.
In total, 197 serum samples were collected at the Department of Rheumatology, Skane University Hospital (Lund, Sweden), from SLE patients (n=86) and normal controls (n=50) (Table I). The SLE patients had a clinical SLE diagnosis and displayed four or more American College of Rheumatology classification criteria (13, 24). The SLE samples were collected over time during follow-up, and the patients presented with either flare or remission, i.e. for some patients up to four samples were collected at different time-points. The SLE patients (samples) were grouped according to disease severity as previously described (25): 1) skin and musculoskeletal involvement (SLE1, n=30); 2) serositis, systemic vasculitis but not kidney involvement (SLE2, n=30); 3) presence of SLE glomerulonephritis (SLE3, n=87). The clinical disease activity was defined as SLE disease activity index 2000 (SLEDAI-2K) score (26). All samples were aliquoted and stored at −80° C. until analysis. This retrospective study was approved by the regional ethics review board in Lund, Sweden.
The serum samples were labelled with EZ-link Sulfo-NHS-LC-Biotin (Pierce, Rockford, Ill., USA) using a previously optimized labelling protocol for serum proteomes (21, 22, 27). Briefly, the samples were diluted 1:45 in PBS (about 2 mg protein/ml), and biotinylated at a molar ratio of biotin:protein of 15:1. Unreacted biotin was removed by extensive dialysis against PBS (pH 7.4) for 72 h at 4° C. The samples were aliquoted and stored at −20° C. until further use.
In total, 195 human recombinant single-chain fragment variable (scFv) antibodies, including 180 antibodies targeting 73 mainly immunoregulatory analytes, anticipated to reflect the events taking place in SLE, and 15 scFv antibodies targeting 15 short amino acid motifs (4 to 6 amino acids long) (28) were selected from a large phage display library (Table II) (29) (Persson et al, unpublished data). The specificity, affinity, and on-chip functionality of the scFv antibodies have been previously validated (see Supplementary Appendix 1 for details).
All scFv antibodies were produced in 100 ml E. coli and purified from expression supernatants using affinity chromatography on Ni2+-NTA agarose (Qiagen, Hilden, Germany) (see Supplementary Appendix 1 for details).
The scFv microarrays were produced and handled using a previously optimized and validated set-up (23) (Delfani et al, unpublished data) (see Supplementary Appendix 1 for details). Briefly, 14 identical 25×28 subarrays were printed on each black polymer MaxiSorp microarray slide (NUNC A/S, Roskilde, Denmark) using a non-contact printer (SciFlexarrayer S11, Scienion, Berlin, Germany). Biotinylated samples were added and any bound analytes were visualized using Alexa 647-labelled streptavidin (SA647) (Invitrogen). Finally, the slides were scanned with a confocal microarray scanner (ScanArray Express, PerkinElmer Life & Analytical Sciences).
The ScanArray Express software v4.0 (PerkinElmer Life & Analytical Sciences) was used to quantify spot signal intensities, using the fixed circle method. Signal intensities with local background subtraction were used for data analysis. Each data point represents the mean value of all three replicate spots unless any replicate CV exceeded 15%, in which case the worst performing replicate was eliminated and the average value of the two remaining replicates was used instead. Log10 values of signal intensities were used for subsequent analysis. The microarray data was normalized in a two-step procedure using a semi-global normalization method (23, 30, 31) and the “subtract by group mean” approach (see Supplementary Appendix 1 for details).
Where applicable, the sample cohort was randomly divided into a training set (⅔ of the samples) and a test set (⅓ of the samples), making sure that the distribution of SLE vs. controls and/or samples with active vs. inactive disease was similar between the two sets. It should be noted that for those SLE patients where more than one sample was at hand, the sample was randomly selected for each comparison, and only one sample per patient was included in each subset comparison in order to avoid bias (i.e. over-representation of certain patients).
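The selection and splitting steps described above can be sketched as follows. This is a minimal illustration only, not the code used in the study: the tuple layout `(sample_id, patient_id, group)`, the function names and the use of `round` to place the ⅔ cut are all assumptions made for the example.

```python
import random
from collections import defaultdict

def one_sample_per_patient(samples, rng):
    """Pick one random sample per patient.

    samples: list of (sample_id, patient_id, group) tuples (hypothetical layout).
    """
    by_patient = defaultdict(list)
    for sample in samples:
        by_patient[sample[1]].append(sample)
    return [rng.choice(candidates) for candidates in by_patient.values()]

def stratified_split(samples, rng, train_fraction=2 / 3):
    """Split into training and test sets while keeping group ratios similar."""
    by_group = defaultdict(list)
    for sample in samples:
        by_group[sample[2]].append(sample)
    train, test = [], []
    for members in by_group.values():
        members = list(members)
        rng.shuffle(members)
        cut = round(train_fraction * len(members))  # assumed rounding rule
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test

# Toy cohort: 6 SLE patients with two samples each, 3 controls with one each.
cohort = [(f"s{p}_{k}", f"sle{p}", "SLE") for p in range(6) for k in range(2)]
cohort += [(f"n{p}", f"ctrl{p}", "N") for p in range(3)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
selected = one_sample_per_patient(cohort, rng)
train_set, test_set = stratified_split(selected, rng)
```

Re-running the two functions with a different seed yields a new training/test pair, which is how additional comparisons could be generated.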
The support vector machine (SVM) is a supervised learning method in R (32-34) that we used to classify the samples (see Supplementary Appendix 1 for details). For classification of SLE1 vs. N and SLE2 vs. N, the SVM was trained using a leave-one-out cross-validation procedure (30), and the prediction performance of the classifier was evaluated by constructing a receiver operating characteristics (ROC) curve and calculating the area under the curve (AUC).
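The leave-one-out procedure can be illustrated with a short sketch. Note that a simple nearest-centroid scorer is swapped in here for the SVM so that the example needs no external library; it is a stand-in, not the classifier used in the study, and all function names are hypothetical.

```python
def centroid(rows):
    """Column-wise mean of a list of equally long feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_centroid_scorer(X, y):
    """Stand-in for SVM training: returns a function giving a decision value."""
    pos = centroid([x for x, label in zip(X, y) if label == 1])
    neg = centroid([x for x, label in zip(X, y) if label == 0])

    def decision(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, neg))
        return d_neg - d_pos  # positive when x is closer to the class-1 centroid

    return decision

def leave_one_out_decisions(X, y, fit=fit_centroid_scorer):
    """Train on all samples but one, score the held-out sample, repeat."""
    values = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        values.append(fit(X_train, y_train)(X[i]))
    return values

# Toy data: label 0 = control, label 1 = SLE (two hypothetical analytes).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 1.1], [1.2, 0.9], [0.9, 1.0]]
y = [0, 0, 0, 1, 1, 1]
decisions = leave_one_out_decisions(X, y)
```

The held-out decision values collected this way are what a ROC curve would be constructed from.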
In the case of SLE vs. N and SLE3 vs. N, the samples were divided into a training set and a test set, and a backward elimination algorithm (35) combined with a leave-one-out cross-validation procedure was applied on the training set to determine a condensed panel of antibodies displaying the highest combined discriminatory power. A single SVM model was then calibrated on the training set using the condensed antibody panel, whereafter the model (classifier) was frozen and evaluated on the test set.
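A generic greedy backward elimination can be sketched as below. The exact algorithm of reference (35) is not reproduced in this text, so this is an assumed variant: at each step the feature whose removal gives the lowest error is dropped, and the smallest subset with the lowest error is reported. The error function is supplied by the caller (in the setting above it would be a leave-one-out cross-validation error on the training set); the toy error function and the names X1 and X2 are purely illustrative.

```python
def backward_elimination(features, error_of):
    """Greedy backward elimination.

    features: list of feature (antibody) names.
    error_of: callable mapping a list of features to an error value.
    Returns (best_subset, history); history holds (subset, error) pairs
    from the full panel down to a single feature.
    """
    current = list(features)
    history = [(list(current), error_of(current))]
    while len(current) > 1:
        candidates = []
        for feature in current:
            reduced = [f for f in current if f != feature]
            candidates.append((error_of(reduced), reduced))
        # Keep the reduced panel with the lowest error (first one on ties).
        error, current = min(candidates, key=lambda c: c[0])
        history.append((list(current), error))
    # Lowest error wins; on ties the smaller panel is preferred.
    best_subset, _ = min(history, key=lambda h: (h[1], len(h[0])))
    return best_subset, history

# Toy error: missing an informative marker is costly, extra markers cost a little.
INFORMATIVE = {"C3", "CD40"}

def toy_error(subset):
    return (len(INFORMATIVE) - len(INFORMATIVE & set(subset))) + 0.01 * len(subset)

best, trace = backward_elimination(["C3", "CD40", "X1", "X2"], toy_error)
```

In this toy run the two uninformative features are removed first, and the error rises sharply once an informative marker is dropped.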
Significantly differentially expressed analytes (p<0.05) were identified based on Wilcoxon rank sum tests. Heat maps and visualization of the samples by principal component analysis (PCA) were carried out using Qlucore Omics Explorer 2.2 (Qlucore AB, Lund, Sweden). Data-mining and pathway analysis was conducted using Metacore (Thomson Reuters, New York, N.Y., USA).
In this study, we have applied recombinant scFv antibody microarrays for deciphering serum biomarker signatures reflecting SLE. In total, 197 crude, biotinylated serum samples (SLE n=147, healthy controls n=50) representing 136 patients (86 SLE and 50 controls) were profiled using 195-antibody microarrays, targeting mainly immunoregulatory analytes (Table I). The scanned microarray images were converted into protein expression profiles, or protein maps, and disease-associated serum biomarker panels were delineated.
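For orientation, a Wilcoxon rank sum test for one analyte can be sketched as follows. This is a minimal version under stated assumptions: it uses the normal approximation with tie correction and no continuity correction, so small-sample p-values will differ somewhat from an exact test, and it is not the implementation used in the study. It also assumes that not all values are identical.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum (Mann-Whitney) test, normal approximation.

    Returns (U statistic for x, approximate two-sided p-value).
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    tie_term = 0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    w = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u = w - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p

# Toy log10 intensities for one analyte in two groups (made-up values).
u_sep, p_sep = rank_sum_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
u_mix, p_mix = rank_sum_test([1, 3, 5, 7], [2, 4, 6, 8])
```

An analyte would be flagged as differentially expressed when the p-value falls below the 0.05 threshold used above.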
Serum Biomarker Panel Discriminating SLE vs. Healthy Controls
First, we determined whether a multiplex serum biomarker signature discriminating SLE vs. healthy controls could be deciphered. To this end, the data set was randomly divided into a training set (⅔ of all samples) and a test set (⅓ of all samples). A stepwise backward elimination procedure was then applied to the training set in order to identify the smallest set of antibodies, i.e. biomarkers, required for differentiating SLE vs. healthy controls. The results showed that a combination of 16 antibodies, evaluated in terms of the smallest error, provided the best classification (
In order to evaluate the classification power of this 25-biomarker signature, the panel was first used to train a single SVM model, denoted frozen SVM, on the training set.
Next, the frozen SVM model was applied to the independent test set. The results showed that a ROC AUC value of 0.94 was obtained (
To test the robustness of the data set with respect to the classification, we randomly divided the entire data set in 9 additional pairs of training and test sets, and re-ran the above process. The results showed that the 10 comparisons resulted in a median AUC value of 0.86 (range 0.79 to 0.95) (
To explore the biological relevance of the observed serum biomarkers, we attempted to perform a focused data-mining and pathway analysis using Metacore™. The analyze single experiment workflow tool was used for conducting enrichment analysis of the data set by mapping it onto selected MetaCore's ontologies, including disease by biomarkers (
When searching for disease by the identified biomarkers, the data analysis showed that SLE was the top hit, followed by 2 other autoimmune conditions, rheumatoid arthritis and connective tissue disease, (
Serum Biomarker Panels Discriminating Phenotypic Subsets of SLE vs. Healthy Controls
To investigate whether disease severity was a confounding factor for discriminating SLE vs. healthy controls, the SLE samples were grouped according to phenotype (SLE1, SLE2, and SLE3), and the data analysis was re-run. The disease severity is reflected by the phenotype, with SLE1 displaying the least severe symptoms and SLE3 the most severe symptoms. The classification was performed adopting a leave-one-out cross-validation, the most stringent approach that can be employed when the sample cohorts are too small to justify splitting the samples into training and test sets.
The results showed that all three phenotypes could be discriminated from healthy controls, with ROC AUC values of 0.92 (SLE1) (
In
Refined Serum Biomarker Panel Discriminating SLE3 vs. Healthy Controls
Finally, we refined the serum biomarker panel discriminating SLE3 vs. healthy controls (the only phenotype with a sufficient number of samples allowing the data set to be split into training and test sets). Hence, the smallest set of antibodies, i.e. biomarkers, required for differentiating SLE3 vs. healthy controls was determined as described above (backward elimination algorithm), and the procedure was iterated 10 times. The smallest number of biomarkers required for the best classification was found to be 9, and to allow some flexibility in the signature, the top 25 antibodies were selected to represent the condensed biomarker panel (data not shown). Applying the frozen SVMs on the test set resulted in a median ROC AUC value of 0.94 (range 0.84 to 0.97) (
The frequency at which each biomarker occurred in these ten 25-plex signatures is shown in
Once clinical symptoms have developed, prompt diagnosis and adequate management of SLE remain great challenges (8). In fact, laboratory tests and biomarker panels that enable early and accurate diagnosis of SLE are still not at hand, for review see (8, 9, 15, 16, 18, 19, 36). In this context, autoantibodies, such as ANA and anti-dsDNA, have frequently been exploited, but the use of these immunological markers for diagnosis is associated with considerable drawbacks (4). Additional biomarkers that have been suggested in the quest to improve the specificity and sensitivity of the diagnosis include e.g. abnormal levels of erythrocyte-bound complement activation product C4d and complement receptor 1 (37), platelet-bound C4d (38), and lymphocyte-bound C4d (39). Hence, additional high-performing serum, plasma, and/or urine biomarker panels would be essential.
Spurred by two recent discovery studies (23) (Carlsson et al, unpublished observations), we have in this study extended our efforts in harnessing the diagnostic power of the immune system. More specifically, we further explored the fact that immunoregulation is a central phenomenon of SLE. To this end, we designed our 195 antibody microarrays to target predominantly key regulatory serum proteins, including 73 unique proteins and 15 peptide motifs. Despite only targeting a focused window of the entire serum proteome, the results showed that we could extract condensed (≤25-plex) serum biomarker panels differentiating SLE vs. healthy controls irrespective of the disease phenotype (reflecting the disease severity). The classification was accomplished displaying a high discriminatory power, illustrated by a (median) ROC AUC of 0.86 to 0.94. In this context, it should be noted that the bioinformatic analyses were performed using two of the most stringent procedures at hand (training and test sets, combined with backward elimination and frozen SVM versus leave-one-out cross-validation).
In the case of SLE vs. healthy controls, the SLE-associated biomarker panel was identified through backward elimination (35), defining the condensed signature displaying the best classification. Such panels are designed to contain biomarkers that provide information that is as orthogonal as possible, so an individual marker, when viewed alone, might not be significantly (p<0.05) differentially expressed. Notably, the six proteins of the core signature (C3, CD40, Cystatin C, MCP-1, Sialyl lewis x, and TGF-β), which were identified in all ten iterative comparisons irrespective of how the training and test sets were defined, were also found to be differentially expressed. In addition, five of these proteins were also targeted in our recent discovery studies, and four of these were then found to be differentially expressed (C3, CD40, Sialyl lewis x, and TGF-β) (23) (Carlsson et al, unpublished observations). While Sialyl lewis x appeared to be a novel marker, the other five proteins have previously been found to be associated with SLE. C3 and interferon-regulated cytokines, such as MCP-1, have been indicated as potential markers for disease activity (16, 40). TGF-β plays a large role in the control of autoimmunity, and it has been suggested that it might be involved in the pathogenesis of renal damage (41). CD40 has been identified as a susceptibility locus, and altered levels might have implications for the regulation of the aberrant immune response in the disease (42). In addition, Cystatin C serum levels have been found to be dependent on renal function (43).
In addition to the 6 core markers, the overall list of variables was composed of novel markers as well as additional markers already reported to be associated with SLE (8, 9, 15, 16, 18, 19, 36). For example, several complement proteins (C4, C1 esterase inhibitor, factor b, C1q, and properdin) were found to be deregulated, and complement proteins (e.g. C4 and C1q) have also been frequently implicated in the pathogenesis of SLE (5, 44, 45). Several cytokines (e.g. IL-2, IL-4, IL-6, IL-12, IL-16, and TNF-α) were also found to be deregulated as previously indicated, and could play a key role in the immune dysregulation in SLE (46). It should, however, be noted that these a priori known candidate biomarkers have mainly been reported as individual markers, and not in the context of a high-performing multiplex serum biomarker signature for SLE.
The biological relevance of the SLE-associated condensed serum biomarker panel was also highlighted by the data mining and pathway analysis, further supporting our approach of using the immune system as a sensor for SLE. For example, when searching for disease by biomarkers, the software tool proposed SLE as the top indication. Further, the pathway analysis also indicated apoptosis, or programmed cell death, as a top process. Abnormal immunoregulation, as reflected by defective clearance of immune complexes and apoptotic cells (materials), has also been identified as a feature in SLE (5). The reason(s) for this defect is not clear, but might be due to quantitative or qualitative defects of early complement proteins, such as C2, C4, or C1q.
Finally, we also investigated whether disease severity, as reflected by the three phenotypes of SLE (44), was a confounding factor for the classification. In our recent discovery studies, the data indicated that the classification was challenging for the phenotype displaying the least symptoms (SLE1), but improved with increasing symptoms, i.e. SLE1<SLE2<SLE3 (23). In this study, the biomarker signatures were improved and refined, which could be explained by three key factors, namely, i) we analysed a significantly larger sample cohort, ii) we targeted a larger set of immunoregulatory analytes, and iii) we used a re-optimized microarray platform with significantly improved performance (23) (Carlsson et al, unpublished observations) (Delfani et al, unpublished observations). In this study, the data thus showed that the classification of SLE vs. healthy controls was high (ROC AUC of 0.90 to 0.94) irrespective of disease severity (phenotype). In other words, the disease severity was not a confounding factor for classification. Again, the biological relevance of several of the observed biomarkers, such as C3, C4, CD40, MCP-1, IL-6, IL-12, and cystatin C, was supported by the literature. As above, these markers have been reported mainly as individual markers and not in the context of a multiplex high-performing serum biomarker signature (8, 9, 15, 16, 18, 19, 36).
Taken together, among other things, we have defined a condensed 25-plex serum biomarker signature reflecting SLE using affinity proteomics, thereby enabling serum-based diagnosis of SLE.
In total, 195 human recombinant scFv antibodies, including 180 antibodies targeting 73 mainly immunoregulatory analytes, anticipated to reflect the events taking place in SLE, and 15 scFv antibodies targeting 15 short amino acid motifs (4 to 6 amino acids long) (8) were selected from a large phage display library (Table II) (9) (Persson et al, unpublished data). The specificity, affinity (normally in the nM range), and on-chip functionality of these phage display derived scFv antibodies were ensured by using i) stringent phage-display selection and screening protocols (9), ii) multiple clones (1-9) per target, and iii) a molecular design adapted for microarray applications (10). In addition, the specificity of several of the antibodies has previously also been validated using well-characterized, standardized serum samples (with known levels of the targeted analytes), and orthogonal methods, such as mass spectrometry (affinity pull-down experiments), ELISA, MesoScaleDiscovery (MSD) assay, and cytometric bead assay, as well as using spiking and blocking (Table II) (5, 6, 11-17). Notably, the reactivity of some antibodies might be lost since the label (biotin) used to label the sample to enable detection could block the affinity binding to the antibodies (epitope masking). However, we addressed this potential problem by frequently including more than one antibody clone against the same protein, but directed against different epitopes (10).
All scFv antibodies were produced in 100 ml E. coli and purified from expression supernatants using affinity chromatography on Ni2+-NTA agarose (Qiagen, Hilden, Germany). ScFvs were eluted using 250 mM imidazole, extensively dialyzed against PBS (pH 7.4), and stored at 4° C. until use. The protein concentration was determined by measuring the absorbance at 280 nm (average 340 μg/ml, range 30-1500 μg/ml). The degree of purity and integrity of the scFv antibodies was evaluated by 10% SDS-PAGE (Invitrogen, Carlsbad, Calif., USA).
The scFv microarrays were produced using a previously optimized and validated set-up (14) (Delfani et al, unpublished data). Briefly, the antibodies were printed on black polymer MaxiSorp microarray slides (NUNC A/S, Roskilde, Denmark), by spotting one drop (˜330 pL) at each position, using a non-contact printer (SciFlexarrayer S11, Scienion, Berlin, Germany). Each microarray, composed of 195 scFv antibodies, one negative control (PBS) and one positive control (biotinylated BSA, b-BSA), was split into 14 sub-arrays of 25×28 spots. Furthermore, each sub-array was divided into three segments where a row of b-BSA consisting of 25 replicate spots was printed at the beginning and the end of each segment. Each scFv antibody was dispensed in three replicates, one in each segment, to assure adequate reproducibility.
For handling the arrays, we used a recently optimized protocol (Delfani et al, unpublished data). Briefly, the printed microarrays were allowed to dry for 2 h at RT and were then mounted in multi-well incubation chambers (NEXTERION® IC-16) (Schott, Jena, Germany). Next, the slides were blocked with 1% (v/v) Tween-20 (Merck Millipore) and 1% (w/v) fat-free milk powder (Semper, Sundbyberg, Sweden) in PBS (MT-PBS solution) for 2 h at RT. Subsequently, the slides were washed four times with 150 μl 0.05% (v/v) Tween-20 in PBS (T-PBS solution), and then incubated with 100 μl biotinylated serum sample, diluted 1:10 in MT-PBS solution (corresponding to a total serum dilution of 1:450), for 2 h at RT under gentle agitation using an orbital shaker. After another washing, the slides were incubated with 100 μl 1 μg/ml Alexa 647-labelled streptavidin (SA647) (Invitrogen) in MT-PBS for 1 h at RT under agitation. Finally, the slides were washed in T-PBS, dried under a stream of nitrogen gas, and immediately scanned with a confocal microarray scanner (ScanArray Express, PerkinElmer Life & Analytical Sciences) at 10 μm resolution, using fixed scanner settings of 60% PMT gain and 90% laser power.
The ScanArray Express software v4.0 (PerkinElmer Life & Analytical Sciences) was used to quantify spot signal intensities, using the fixed circle method. Signal intensities with local background subtraction were used for data analysis. Each data point represents the mean value of all three replicate spots unless any replicate CV exceeded 15%, in which case the worst performing replicate was eliminated and the average value of the two remaining replicates was used instead. Log10 values of signal intensities were used for subsequent analysis.
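By way of illustration only, the replicate handling described above can be sketched as follows. The sketch assumes that the CV is computed across the three replicate spots of one antibody, and that the "worst performing" replicate is the one furthest from the replicate median; neither detail is specified above. The function names `spot_value` and `log10_signal` are hypothetical.

```python
import math
import statistics

CV_LIMIT = 0.15  # 15% replicate CV threshold stated in the text


def spot_value(replicates, cv_limit=CV_LIMIT):
    """Collapse replicate spot intensities into one data point.

    Assumption: the CV is taken across the replicates, and the replicate
    furthest from the median is treated as the worst performing one.
    """
    mean = statistics.fmean(replicates)
    cv = statistics.stdev(replicates) / mean if mean else float("inf")
    if cv <= cv_limit:
        return mean
    median = statistics.median(replicates)
    # Index of the replicate that deviates most from the median.
    worst = max(range(len(replicates)), key=lambda i: abs(replicates[i] - median))
    kept = [v for i, v in enumerate(replicates) if i != worst]
    return statistics.fmean(kept)


def log10_signal(replicates):
    """Log10 of the collapsed, background-subtracted signal (must be > 0)."""
    return math.log10(spot_value(replicates))
```

For example, replicates of 100, 104 and 300 exceed the CV limit, so the value 300 is discarded and the remaining two values are averaged.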
For evaluation of normalization strategies and initial analysis of variance, the data was visualized using principal component analysis (PCA) and hierarchical clustering in Qlucore Omics Explorer (Qlucore AB, Lund, Sweden). Subsequently, the data normalization procedure was carried out in two steps. First, the microarray data was normalized for array-to-array variations using a semi-global normalization method, where 20% of the analytes displaying the lowest CV-values over all samples were identified and used to calculate a scaling factor, as previously described (14, 18, 19).
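A minimal sketch of one way to carry out such a semi-global normalization is given below. It is an assumption, not taken from the text above, that the scaling factor for a sample is the sum of its reference-analyte intensities divided by the mean of that sum over all samples, and that each intensity in the sample is divided by this factor on the linear scale. The function name `semi_global_normalize` is hypothetical.

```python
import statistics


def semi_global_normalize(data, fraction=0.20):
    """data: list of samples, each a list of analyte intensities (same order).

    Returns (normalized data, per-sample scaling factors, reference indices).
    """
    n_analytes = len(data[0])

    def cv(j):
        column = [row[j] for row in data]
        return statistics.stdev(column) / statistics.fmean(column)

    # Reference set: the fraction of analytes with the lowest CV over all samples.
    ranked = sorted(range(n_analytes), key=cv)
    n_ref = max(1, round(fraction * n_analytes))
    ref = ranked[:n_ref]

    # Assumed scaling factor: reference sum of a sample over the mean reference sum.
    sums = [sum(row[j] for j in ref) for row in data]
    mean_sum = statistics.fmean(sums)
    factors = [s / mean_sum for s in sums]

    normalized = [[v / f for v in row] for row, f in zip(data, factors)]
    return normalized, factors, ref
```

With 195 antibodies on the array, a fraction of 20% corresponds to 39 reference analytes under this sketch.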
Second, the data was normalized for day-to-day variation using the “subtract by group mean” approach. In this approach, the mean value (
The support vector machine (SVM) is a supervised learning method, implemented in R (20-22), that was used to classify the samples. The supervised classification was conducted using a linear kernel, and the cost of constraints was set to 1, which is the default value in the R function SVM; no attempt was made to tune it. Parameter tuning was omitted in order to avoid overfitting. No filtration of the data was done before training the SVM, i.e. all antibodies used on the microarray were included in the analysis. Further, a receiver operating characteristic (ROC) curve was constructed using the SVM decision values, and the area under the curve (AUC) was calculated.
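As an illustration of how a ROC curve and its AUC can be derived from classifier decision values, a short sketch follows. It is not the R code used in the study; the label coding (1 for SLE, 0 for N) and the function names `roc_points` and `roc_auc` are assumptions made for the example.

```python
def roc_points(labels, decision_values):
    """Return ROC points (FPR, TPR) obtained by sweeping a threshold
    over the decision values; labels are 1 (positive) or 0 (negative)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one sample of each class")
    ordered = sorted(zip(decision_values, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ordered):
        j = i
        # Samples with tied decision values enter the curve together.
        while j < len(ordered) and ordered[j][0] == ordered[i][0]:
            if ordered[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def roc_auc(labels, decision_values):
    """Area under the ROC curve by the trapezoidal rule."""
    points = roc_points(labels, decision_values)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

An AUC of 1.0 corresponds to decision values that separate the two groups perfectly, and 0.5 to a classifier with no discriminatory power.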
Depending on the size of the sample cohorts, two different strategies were applied. For classification of SLE vs. N and SLE3 vs. N, the samples were first randomly divided into a training set (⅔ of the data) and a test set (⅓ of the data) while maintaining the same ratios of samples from each group. It should be noted that for those SLE patients where more than one sample was at hand, one sample was randomly selected for each comparison, so that only one sample per patient was included in each subset comparison in order to avoid bias. A backward elimination algorithm (23) combined with a leave-one-out cross-validation procedure was then applied to the training set to create a condensed panel of antibodies displaying the highest combined discriminatory power. The condensed panel of antibodies was then employed to train a single SVM model on the training set. The trained SVM model was then frozen and applied to the test set, and a ROC AUC was calculated and used to evaluate the performance of the SVM classifier. In order to demonstrate the robustness of the data set, 9 additional training and test sets were generated and the above data analysis process was repeated. Finally, the frequency at which each antibody was included in the 10 different defined antibody panels was assessed.
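The split, elimination and frequency-counting steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the elimination criterion of reference (23) is not reproduced, and a caller-supplied `score(panel)` function (higher is better, for example a leave-one-out performance estimate on the training set) stands in for it. All function names are hypothetical.

```python
import random
from collections import Counter


def stratified_split(labels, train_fraction=2 / 3, seed=0):
    """Randomly split sample indices into training and test sets while
    keeping the group ratios the same in both sets."""
    rng = random.Random(seed)
    by_group = {}
    for idx, label in enumerate(labels):
        by_group.setdefault(label, []).append(idx)
    train, test = [], []
    for label in sorted(by_group):
        members = by_group[label][:]
        rng.shuffle(members)
        cut = round(train_fraction * len(members))
        train.extend(members[:cut])
        test.extend(members[cut:])
    return sorted(train), sorted(test)


def backward_elimination(features, score):
    """Drop one feature at a time, always removing the feature whose
    removal gives the best score, and return the best panel seen."""
    panel = list(features)
    best_panel, best_score = list(panel), score(panel)
    while len(panel) > 1:
        candidates = [[f for f in panel if f != drop] for drop in panel]
        panel = max(candidates, key=score)
        current = score(panel)
        if current >= best_score:  # on ties, prefer the smaller panel
            best_panel, best_score = list(panel), current
    return best_panel, best_score


def panel_frequency(panels):
    """Count how often each antibody appears across repeated panels."""
    return Counter(f for panel in panels for f in panel)
```

Repeating the split with ten different seeds and passing the ten resulting panels to `panel_frequency` gives the per-antibody inclusion frequency.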
When classifying SLE1 vs. N and SLE2 vs. N the number of samples was not large enough to divide the sample set into a training set and a test set. Therefore, the SVM was trained using the leave-one-out cross-validation procedure as previously described (18). By iterating all samples, a ROC curve was constructed using the decision values and the corresponding AUC value was determined, and used for evaluating the prediction performance of the classifier.
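The leave-one-out procedure can be sketched as below. To keep the example self-contained, a simple centroid-based linear scorer is used as a stand-in for the linear SVM; it is not the classifier used in the study, and any function with the same `fit(X, y)` interface could be substituted. Labels are assumed to be coded 1 and 0, and all names are hypothetical.

```python
def centroid(rows):
    """Column-wise mean of a list of equally long feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_centroid_scorer(X, y):
    """Stand-in for a linear SVM: returns a decision function that projects
    a sample onto the line joining the two class centroids."""
    pos = centroid([x for x, label in zip(X, y) if label == 1])
    neg = centroid([x for x, label in zip(X, y) if label == 0])
    w = [p - n for p, n in zip(pos, neg)]
    mid = [(p + n) / 2 for p, n in zip(pos, neg)]

    def decision(x):
        return sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, mid))

    return decision


def loo_decision_values(X, y, fit=fit_centroid_scorer):
    """Leave each sample out in turn, train on the rest, and record the
    decision value of the held-out sample."""
    values = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        decision = fit(X_train, y_train)
        values.append(decision(X[i]))
    return values
```

The pooled decision values, one per sample, are what the ROC curve and AUC are then computed from.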
Significantly differentially expressed analytes (p<0.05) were identified based on Wilcoxon rank sum tests. Heat maps and visualization of the samples by principal component analysis (PCA) were carried out using Qlucore Omics Explorer. Data-mining and pathway analysis were conducted using Metacore (Thomson Reuters, New York, N.Y., USA).
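For illustration, a two-sided Wilcoxon rank sum test for a single analyte can be sketched as follows. The sketch uses the normal approximation with midranks for ties and no tie or continuity correction, which is an assumption; statistical packages may instead report exact p-values for small groups. The function name `rank_sum_test` is hypothetical.

```python
import math
from statistics import NormalDist


def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (z, p), where z is negative when group_a tends to rank lower.
    """
    pooled = sorted((v, g) for g, vals in enumerate((group_a, group_b)) for v in vals)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        i = j
    n_a, n_b = len(group_a), len(group_b)
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)  # rank sum of group_a
    expected = n_a * (n_a + n_b + 1) / 2
    sd = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (w - expected) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p
```

An analyte would be flagged as differentially expressed under the criterion above when the returned p is below 0.05.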
Submitted with this application is a Sequence Listing in the form of an ASCII text (.txt) file, which is hereby incorporated by reference into the specification of the application. The ASCII text file (467 KB) was created on Jun. 23, 2022 and has the file name 20220623_Sequence_Listing147432_001132.txt.
Number | Date | Country | Kind |
---|---|---|---|
1609950.9 | Jun 2016 | GB | national |
This application is a continuation of U.S. patent application Ser. No. 16/308,258, filed Dec. 7, 2018, which is a national stage application under 35 U.S.C. § 371 of PCT Application No. PCT/EP2017/063852, filed Jun. 7, 2017, which claims the benefit of Great Britain Patent Application No. 1609950.9, filed Jun. 7, 2016.
 | Number | Date | Country |
---|---|---|---|
Parent | 16308258 | Dec 2018 | US |
Child | 17848361 | | US |