Inflammatory skin diseases are heterogeneous in nature, and have variable causation, course and responsiveness to therapy. They have unique clinical features but may have both selective and overlapping responses to targeted therapies. There is a need for understanding molecular pathways involved in the pathogenesis of these conditions to allow identification and optimization of therapies.
Methods of the current invention can classify whether skin of a patient is indicative of an inflammatory disease state, such as lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state, with high accuracy, sensitivity, specificity, positive predictive value and/or negative predictive value. As shown in Example 2, by analyzing certain cellular and pathway gene signatures, methods of the current invention can distinguish patients with inflammatory disease states from healthy controls, and/or can distinguish between patients having different inflammatory diseases. Methods described herein provide better understanding of the pathogenic mechanisms, and provide direction for new and/or optimized therapeutic avenues for these disease conditions. In certain aspects, the methods described allow for dimensionality reduction. Dimensionality reduction, such as scaling from a larger set of cellular and pathway gene signatures to a smaller set of cellular and pathway gene signatures for disease state classification, may reduce diagnostic costs and provide timely diagnosis, thereby informing early, effective intervention.
The present invention includes a method for assessing skin of a patient. The method can include any one of, any combination of, or all of steps (a), (b) and (c). Step (a) can include assaying a biological sample obtained or derived from the patient to produce a data set comprising and/or derived from gene expression measurements of the biological sample from each of a plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci. The plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci can comprise at least one gene selected from the genes listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, and Table 4D. As used herein, genes listed in Table 1, Table 2, Tables 4A-1 to 4A-20, Tables 4B-1 to 4B-28, Table 4C, and Table 4D may be understood to include all the genes listed in these tables. As a non-limiting example, “genes listed in Table X and Table Y” includes x+y genes, where Table X contains x genes and Table Y contains y genes, provided that no gene appears in both tables. In the event of overlap, duplicate copies can be excluded from analysis. Step (b) can include analyzing the data set to classify the skin of the patient as indicative of a disease state. 
Step (c) can include electronically outputting a report indicative of the classification of the skin of the patient as indicative of the disease state. The report of step (c) can be indicative of the classification obtained in step (b). The skin of the patient can contain one or more lesions, or can contain no lesion. In certain embodiments, the skin of the patient contains one or more lesions. In certain embodiments, the skin of the patient does not contain a lesion.
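The gene-counting convention described above (x+y genes when the tables do not overlap, with duplicate copies excluded when they do) can be illustrated with a minimal sketch. The function name `combined_gene_set` and the placeholder gene symbols are hypothetical and are not taken from the tables of this disclosure; the sketch only assumes that each table can be represented as a list of gene symbols.

```python
def combined_gene_set(tables):
    """Return the de-duplicated union of genes across the given tables.

    `tables` maps a table name to a list of gene symbols. A gene that
    appears in more than one table is counted once.
    """
    combined = set()
    for genes in tables.values():
        combined.update(genes)
    return combined


# Hypothetical placeholder tables; the symbols are illustrative only.
tables = {
    "Table X": ["GENE_A", "GENE_B", "GENE_C"],  # x = 3 genes
    "Table Y": ["GENE_C", "GENE_D"],            # y = 2 genes, one shared
}

genes = combined_gene_set(tables)
print(len(genes))  # 4 rather than x + y = 5, because GENE_C is shared
```

With no shared genes the result has exactly x+y members, which matches the non-limiting example given above.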
In certain embodiments, the disease state is an inflammatory skin disease state. In certain embodiments, the disease state is a rheumatic skin disease state. In certain embodiments, the disease state is a lupus (e.g., systemic lupus erythematosus (SLE)), psoriasis (PSO), atopic dermatitis (AD), and/or systemic sclerosis (scleroderma) (SSc) disease state.
In certain embodiments, the lupus is SLE, discoid lupus erythematosus (DLE), cutaneous lupus erythematosus (CLE), acute cutaneous lupus erythematosus (ACLE), subacute cutaneous lupus erythematosus (SCLE), chronic cutaneous lupus erythematosus (CCLE), or any combination thereof. In certain embodiments, the lupus is SLE. In certain embodiments, the lupus is CLE. In certain embodiments, the lupus is DLE. In certain embodiments, the lupus is ACLE. In certain embodiments, the lupus is SCLE. In certain embodiments, the lupus is CCLE. In certain embodiments, the disease state is lupus disease state. In certain embodiments, the disease state is SLE disease state. In certain embodiments, the disease state is cutaneous lupus erythematosus (CLE) disease state. In certain embodiments, the disease state is DLE disease state. In certain embodiments, the disease state is ACLE disease state. In certain embodiments, the disease state is SCLE disease state. In certain embodiments, the disease state is CCLE disease state. In certain embodiments, the SLE disease state is discoid lupus erythematosus (DLE) disease state, acute cutaneous lupus erythematosus (ACLE) disease state, subacute cutaneous lupus erythematosus (SCLE) disease state, chronic cutaneous lupus erythematosus (CCLE) disease state, or any combination thereof. In certain embodiments, the SLE disease state is DLE disease state. In certain embodiments, the SLE disease state is SCLE disease state. In certain embodiments, the disease state is lupus, PSO, AD, and/or SSc, disease state, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc, disease state. In certain embodiments, the disease state is lupus, PSO, AD, or SSc, disease state, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus, PSO, AD, or SSc, disease state. 
In certain embodiments, the disease state is lupus disease state, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of lupus disease state. In certain embodiments, the disease state is lupus disease state, and in step (b) the data set is analyzed to classify whether the skin of the patient is indicative of a group 1 lupus disease state, group 2 lupus disease state, group 3 lupus disease state, or not having the lupus disease state. Group 1, 2, and 3 lupus disease state can be characterized by gene enrichment analysis corresponding to group 1, 2 and 3 lupus disease, respectively, as described in Example 2, and
In certain embodiments, the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, 365, 370, 375, 380, 385, 390, 395, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, or all, or any range or value there between, genes selected from genes listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, and Table 4D.
In certain embodiments, the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, 365, 370, 375, 380, 385, 390, 395, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, all, or any range or value there between, genes selected from the genes listed in Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28.
In certain embodiments, the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, or all genes selected from the genes listed in each of one or more tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. 
In certain embodiments, the one or more Tables includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48, or any range there between Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28.
The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state with a sensitivity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state with a specificity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. 
The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state with a positive predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state with a negative predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state with a receiver operating characteristic (ROC) curve having an Area-Under-Curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than about 0.99.
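As a non-limiting illustration of how the performance measures recited above relate to one another, the following minimal sketch computes accuracy, sensitivity, specificity, positive predictive value, and negative predictive value from true and predicted labels, treating one disease state as the positive class. The function name `classification_metrics` and the example labels are hypothetical and do not represent data from the Examples.

```python
def classification_metrics(y_true, y_pred, positive):
    """Compute one-vs-rest performance measures for the `positive` class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # true positive
            else:
                fp += 1  # false positive
        else:
            if truth == positive:
                fn += 1  # false negative
            else:
                tn += 1  # true negative

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical labels for ten skin samples (illustrative only).
y_true = ["lupus"] * 4 + ["control"] * 6
y_pred = ["lupus", "lupus", "lupus", "control",
          "lupus", "control", "control", "control", "control", "control"]

metrics = classification_metrics(y_true, y_pred, positive="lupus")
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this hypothetical example there are 3 true positives, 1 false negative, 1 false positive, and 5 true negatives, so the accuracy is 8/10 and the sensitivity is 3/4.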
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with an accuracy of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with an accuracy of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%. 
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with an accuracy of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with an accuracy of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a sensitivity of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a sensitivity of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%. 
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a sensitivity of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a sensitivity of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a specificity of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a specificity of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%. 
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a specificity of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a specificity of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a positive predictive value of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a positive predictive value of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%. 
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a positive predictive value of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a positive predictive value of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a negative predictive value of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a negative predictive value of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%. 
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a negative predictive value of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a negative predictive value of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of about 0.7 to about 1. In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of about 0.7 to about 0.75, about 0.7 to about 0.8, about 0.7 to about 0.85, about 0.7 to about 0.9, about 0.7 to about 0.925, about 0.7 to about 0.95, about 0.7 to about 0.96, about 0.7 to about 0.97, about 0.7 to about 0.98, about 0.7 to about 0.99, about 0.7 to about 1, about 0.75 to about 0.8, about 0.75 to about 0.85, about 0.75 to about 0.9, about 0.75 to about 0.925, about 0.75 to about 0.95, about 0.75 to about 0.96, about 0.75 to about 0.97, about 0.75 to about 0.98, about 0.75 to about 0.99, about 0.75 to about 1, about 0.8 to about 0.85, about 0.8 to about 0.9, about 0.8 to about 0.925, about 0.8 to about 0.95, about 0.8 to about 0.96, about 0.8 to about 0.97, about 0.8 to about 0.98, about 0.8 to about 0.99, about 0.8 to about 1, about 0.85 to about 0.9, about 0.85 to about 0.925, about 0.85 to about 0.95, about 0.85 to about 0.96, about 0.85 to about 0.97, about 0.85 to about 0.98, about 0.85 to about 0.99, about 0.85 to about 1, about 0.9 to about 0.925, about 0.9 to about 0.95, about 0.9 to about 0.96, about 0.9 to about 0.97, about 0.9 to about 0.98, about 0.9 to about 0.99, about 0.9 to about 1, about 0.925 to about 0.95, about 0.925 to about 0.96, about 0.925 to about 0.97, about 0.925 to about 0.98, about 0.925 to about 0.99, about 0.925 to about 1, about 0.95 to about 0.96, about 0.95 to about 0.97, about 0.95 to about 0.98, about 0.95 to about 0.99, about 0.95 to about 1, about 0.96 to about 0.97, about 0.96 to about 0.98, about 0.96 to about 0.99, about 0.96 to about 1, about 0.97 to about 0.98, about 0.97 to about 0.99, about 0.97 to about 1, about 0.98 to about 0.99, about 0.98 to 
about 1, or about 0.99 to about 1. In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of about 0.7, about 0.75, about 0.8, about 0.85, about 0.9, about 0.925, about 0.95, about 0.96, about 0.97, about 0.98, about 0.99, or about 1. In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of at least about 0.7, about 0.75, about 0.8, about 0.85, about 0.9, about 0.925, about 0.95, about 0.96, about 0.97, about 0.98, or about 0.99. In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of at most about 0.75, about 0.8, about 0.85, about 0.9, about 0.925, about 0.95, about 0.96, about 0.97, about 0.98, about 0.99, or about 1.
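For illustration only, the AUC of a ROC curve can be computed as the probability that a randomly chosen positive sample receives a higher classifier score than a randomly chosen negative sample, with ties counted as one half. The sketch below uses this rank-based formulation; the function name `roc_auc` and the example scores are hypothetical and are not output of the trained machine learning model described herein.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties contribute one half."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC requires at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores: 1 = disease state, 0 = healthy control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(round(auc, 3))  # 8 of the 9 positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to scores that do not separate the classes, and an AUC of 1 corresponds to perfect separation.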
In certain embodiments, the patient has lupus, PSO, AD, and/or SSc. In certain embodiments, the patient is suspected of having lupus, PSO, AD, and/or SSc. In certain embodiments, the patient is at elevated risk of having lupus, PSO, AD, and/or SSc. In certain embodiments, the patient is asymptomatic for lupus, PSO, AD, and/or SSc. In certain embodiments, the patient has DLE, and/or SCLE. In certain embodiments, the patient is suspected of having DLE, and/or SCLE. In certain embodiments, the patient is at elevated risk of having DLE, and/or SCLE. In certain embodiments, the patient is asymptomatic for DLE, and/or SCLE. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for lupus. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for PSO. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for AD. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for SSc.
In certain embodiments, the method can further comprise administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state. The treatment can be configured to treat, reduce severity, and/or reduce risk of having the lupus, PSO, AD, or SSc. In some embodiments, the treatment is configured to treat the lupus, PSO, AD, or SSc. In some embodiments, the treatment is configured to reduce a severity of the lupus, PSO, AD, or SSc. In some embodiments, the treatment is configured to reduce a risk of having the lupus, PSO, AD, or SSc. The treatment can be one or more treatments of lupus, PSO, AD, and/or SSc. In certain embodiments, the method can comprise administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the DLE or SCLE disease state. The treatment can be configured to treat, reduce severity, and/or reduce risk of having the DLE or SCLE. In some embodiments, the treatment is configured to treat the DLE or SCLE. In some embodiments, the treatment is configured to reduce a severity of the DLE or SCLE. In some embodiments, the treatment is configured to reduce a risk of having the DLE or SCLE. The treatment can be one or more treatments of DLE or SCLE. In some embodiments, the treatment comprises a pharmaceutical composition. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having lupus. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having PSO. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having AD. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having SSc. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having DLE. 
In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having SCLE. The treatment can be a treatment mentioned herein. A treatment used in the context of the present methods may be any known to those of skill in the art for treating, e.g., reducing the severity of or reducing the risk of, the disease state in the patient. In some embodiments, the treatment comprises an immunosuppressive treatment. In some embodiments, the treatment comprises a pharmaceutical composition comprising one or more agents that target and/or inhibit: TNF (e.g., etanercept, infliximab, adalimumab, certolizumab); IL-12/23 (IL23 complex) (e.g., ustekinumab, guselkumab, risankizumab); an interferon or interferon receptor (e.g., anifrolumab, which binds to IFNAR); proteasome (e.g., bortezomib, carfilzomib, ixazomib); CD38 (e.g., daratumumab, isatuximab); SLAMF7 (e.g., elotuzumab); IMPDH (e.g., mycophenolate mofetil); BlyS (e.g., belimumab); CD19 (e.g., inebilizumab); CD20 (e.g., rituximab, obinutuzumab); CD20/CD3 (e.g., glofitamab); NPL4 (e.g., disulfiram); neutrophil elastase (e.g., alvelestat); a growth factor receptor, e.g., FGFR, PDGFR, VEGFR (e.g., nintedanib, pirfenidone); BDCA2 (e.g., BIIB059); ILT7 (e.g., daxdilimab). In some embodiments, the pharmaceutical composition comprises an agent that targets plasma cells (e.g., bortezomib, carfilzomib, ixazomib, daratumumab, isatuximab, elotuzumab, mycophenolate mofetil), B cells (e.g., belimumab, inebilizumab, rituximab, glofitamab, obinutuzumab), neutrophils (e.g., disulfiram, alvelestat), TGFB fibroblasts (e.g., nintedanib, pirfenidone), and/or dendritic cells (e.g., BIIB059, daxdilimab). In some embodiments, a treatment for DLE comprises an agent that targets plasma cells and/or B cells. In some embodiments, a treatment for psoriasis comprises an agent that targets neutrophils. 
In some embodiments, a treatment for systemic sclerosis comprises an agent that targets TGFB fibroblasts and/or dendritic cells. In some embodiments, a treatment for atopic dermatitis comprises an agent that targets IL23. In some embodiments, the treatment can be one or more treatments shown in
In certain embodiments, step (b) comprises using a trained machine learning classifier to analyze the data set to classify the skin of the patient as indicative of having the disease state. In certain embodiments, the trained machine learning classifier can be trained to infer whether the skin of the patient is indicative of the disease state based on the gene expression measurements from the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci. The trained machine learning classifier can generate an inference indicating whether the skin of the patient is indicative of the disease state, based on the data set. The trained machine learning classifier can generate the inference based at least on comparing the data set to a reference data set. The trained machine learning classifier can be trained using a reference data set, wherein a first portion of the reference data set can be used as a training data set, and a second portion of the reference data set can be used as a validation data set. The reference data set can comprise and/or can be derived from gene expression measurements of reference biological samples from the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci. The plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci of the data set and the reference data set can at least partially overlap (e.g., are the same). The reference biological samples can be obtained from a plurality of reference subjects. 
In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having the disease state, and a second plurality of biological samples obtained or derived from reference subjects not having the disease state, wherein the skin of the reference subjects having the disease state contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having the disease state, and a second plurality of biological samples obtained or derived from reference subjects not having the disease state, wherein the skin of the reference subjects having the disease state do not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects not having lupus state, wherein the skin of the reference subjects having the lupus disease state contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects not having lupus disease state, wherein the skin of the reference subjects having the lupus disease state do not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having PSO disease state, and a second plurality of biological samples obtained or derived from reference subjects not having PSO disease state, wherein the skin of the reference subjects having the PSO disease state contain one or more lesions. 
In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having PSO disease state, and a second plurality of biological samples obtained or derived from reference subjects not having PSO disease state, wherein the skin of the reference subjects having the PSO disease state do not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having AD disease state, and a second plurality of biological samples obtained or derived from reference subjects not having AD disease state, wherein the skin of the reference subjects having the AD disease state contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having AD disease state, and a second plurality of biological samples obtained or derived from reference subjects not having AD disease state, wherein the skin of the reference subjects having the AD disease state do not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having SSc disease state, and a second plurality of biological samples obtained or derived from reference subjects not having SSc disease state, wherein the skin of the reference subjects having the SSc disease state contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having SSc disease state, and a second plurality of biological samples obtained or derived from reference subjects not having SSc disease state, wherein the skin of the reference subjects having the SSc disease state do not contain a lesion. 
In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects having PSO disease state, wherein the skin of the reference subjects contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects having PSO disease state, wherein the skin of the reference subjects do not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects having AD disease state, wherein the skin of the reference subjects contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects having AD disease state, wherein the skin of the reference subjects do not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects having SSc disease state, wherein the skin of the reference subjects contain one or more lesions. 
In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects having SSc disease state, wherein the skin of the reference subjects do not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having AD disease state, and a second plurality of biological samples obtained or derived from reference subjects having PSO disease state, wherein the skin of the reference subjects do not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having AD disease state, and a second plurality of biological samples obtained or derived from reference subjects having PSO disease state, wherein the skin of the reference subjects contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having DLE disease state, and a second plurality of biological samples obtained or derived from reference subjects having SCLE disease state, wherein the skin of the reference subjects contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having DLE disease state, and a second plurality of biological samples obtained or derived from reference subjects having SCLE disease state, wherein the skin of the reference subjects do not contain a lesion. The reference biological samples can comprise a skin biopsy sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof. 
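The use of a first portion of the reference data set for training and a second portion for validation can be sketched as follows. This is a minimal, standard-library-only illustration that assumes each reference sample has already been reduced to a short vector of signature scores; the 70/30 split, the plain gradient-descent logistic regression, the helper names (`split_reference`, `train_logistic`, `predict_proba`), and the toy values are all assumptions made for illustration and are not the classifier or data of the examples described herein.

```python
import math
import random


def split_reference(samples, labels, train_fraction=0.7, seed=0):
    """Shuffle the reference data set and split it into training and validation portions."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_fraction * len(idx)))

    def pick(ids):
        return [samples[i] for i in ids], [labels[i] for i in ids]

    return pick(idx[:cut]), pick(idx[cut:])


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression weights by batch gradient ascent on the log-likelihood."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = yi - sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad_w)]
        b += lr * grad_b / len(X)
    return w, b


def predict_proba(model, x):
    """Confidence, between 0 and 1, that a sample is indicative of the disease state."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical two-feature signature scores: 1 = disease state, 0 = no disease state.
reference = [[0.6, 0.4], [0.5, 0.7], [0.8, 0.3], [0.4, 0.5], [0.7, 0.6],
             [-0.5, -0.3], [-0.6, -0.4], [-0.2, -0.7], [-0.4, -0.6], [-0.3, -0.5]]
states = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

(X_train, y_train), (X_val, y_val) = split_reference(reference, states)
model = train_logistic(X_train, y_train)
val_calls = [predict_proba(model, x) >= 0.5 for x in X_val]
val_accuracy = sum(int(c) == y for c, y in zip(val_calls, y_val)) / len(y_val)
print("validation accuracy:", val_accuracy)
```

Holding out the validation portion before training keeps the reported performance from being inflated by samples the model has already seen; in practice a library implementation and cross-validation would likely be preferred over this hand-written loop.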
In certain embodiments, the trained machine learning classifier is trained to infer the classification of the skin of the patient based on a set of N features, the machine learning classifier trained by at least determining, from a training data set, the N features that are usable to determine a binary classification indicative of whether a training data set patient has i) skin indicative of at least one of one or more inflammatory skin disease states selected from lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state, or healthy state, or ii) skin indicative of a first inflammatory skin disease state of the one or more inflammatory skin disease states or a second inflammatory skin disease state of the one or more inflammatory skin disease states. The N features can be determined according to a method described herein. The patient can be a human. The reference subjects can be humans.
In certain embodiments, the trained machine learning classifier is a linear regression, a logistic regression, a Ridge regression, a Lasso regression, an elastic net (EN) regression, a support vector machine (SVM), a gradient boosted machine (GBM), a k nearest neighbors (kNN), a generalized linear model (GLM), a naïve Bayes (NB) classifier, a neural network, a Random Forest (RF), a deep learning algorithm, a linear discriminant analysis (LDA), a decision tree learning (DTREE), an adaptive boosting (ADB), Classification and Regression Tree (CART), Hierarchical clustering, or any combination thereof.
The biological sample can comprise a skin biopsy sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof. In certain embodiments, the biological sample comprises a skin biopsy sample, or any derivative thereof. In certain embodiments, the biological sample comprises a blood sample, or any derivative thereof. In certain embodiments, the biological sample comprises isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof.
In certain embodiments, the method further comprises determining a likelihood of the classification of the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state. In certain embodiments, the method further comprises monitoring the skin of the patient, wherein the monitoring comprises assessing the skin of the patient at a plurality of different time points. A difference in the assessment of the skin of the patient among the plurality of time points can be indicative of one or more clinical indications selected from the group consisting of: (i) a diagnosis of the skin of the patient, (ii) a prognosis of the skin of the patient, and (iii) an efficacy or non-efficacy of a course of treatment for treating the skin of the patient. The inference of the machine learning classifier can include a confidence value between 0 and 1. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the patient has the disease state. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has lupus disease state. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, that the subject has PSO disease state. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has AD disease state. 
In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has SSc disease state. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has DLE disease state. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has SCLE disease state.
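The monitoring embodiment above, in which the skin is assessed at several time points and a difference between assessments is interpreted, can be sketched as follows. This is a minimal illustration only: the function name `summarize_monitoring`, the 0.5 decision threshold, the time-point labels, and the confidence values are assumptions and do not come from the examples described herein.

```python
def summarize_monitoring(assessments, threshold=0.5):
    """Compare classifier confidence values across chronologically ordered time points.

    assessments: list of (time_label, confidence) pairs, confidence between 0 and 1.
    threshold: assumed cut-off above which the skin is called indicative of the disease state.
    """
    if len(assessments) < 2:
        raise ValueError("monitoring needs at least two time points")
    for label, confidence in assessments:
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence at {label!r} must lie between 0 and 1")
    calls = [(label, confidence >= threshold) for label, confidence in assessments]
    return {
        # Negative values mean the disease-state confidence fell over the monitoring period.
        "confidence_change": assessments[-1][1] - assessments[0][1],
        "calls": calls,
        "call_changed": calls[0][1] != calls[-1][1],
    }


# Hypothetical confidence values for one patient during a course of treatment.
timeline = [("baseline", 0.9), ("week 12", 0.6), ("week 24", 0.3)]
summary = summarize_monitoring(timeline)
```

A falling confidence together with a changed call, as in this toy timeline, is the kind of difference that could be read as suggesting efficacy of a course of treatment, subject to clinical judgment.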
The data set can be generated from the biological sample from the patient. For example, nucleic acid molecules of the patient in the biological sample can be assessed to obtain the data set. In certain embodiments, the gene expression measurements of the biological sample from the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci can be performed using any suitable method known to those of skill in the art including but not limited to DNA sequencing, RNA sequencing, microarray data, RNA-Seq, qPCR, northern blotting, fluorescent in situ hybridization, serial analysis of gene expression, tiling arrays or any combination thereof, to obtain the data set. In certain embodiments, the data set can be derived from the gene expression measurement data of the biological sample, wherein the gene expression measurement data is analyzed using a suitable data analysis tool including but not limited to a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, gene set variation analysis (GSVA), Z-score, gene set enrichment analysis (GSEA), enrichment algorithm, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log2 expression analysis, or any combination thereof, to obtain the data set. In certain embodiments, the gene expression measurement data of the biological sample can be analyzed using GSVA, to obtain the data set. The reference data set can be generated from the reference biological samples. 
In certain embodiments, the gene expression measurements of the reference biological samples from the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci (e.g., of the reference data set) can be performed using any suitable method known to those of skill in the art including but not limited to DNA sequencing, RNA sequencing, microarray data, RNA-Seq, qPCR, northern blotting, fluorescent in situ hybridization, serial analysis of gene expression, tiling arrays or any combination thereof. In certain embodiments, the reference data set can be derived from the gene expression measurement data of the reference biological samples, wherein the gene expression measurement data is analyzed using a suitable data analysis tool including but not limited to a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, gene set variation analysis (GSVA), Z-score, gene set enrichment analysis (GSEA), enrichment algorithm, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log2 expression analysis, or any combination thereof, to obtain the reference data set. In certain embodiments, the gene expression measurement data of the reference biological samples can be analyzed using GSVA, to obtain the reference data set.
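Of the analysis options listed above, the Z-score approach is the simplest to illustrate. The sketch below standardizes each gene across samples and averages the Z-scores over the genes of one signature to give a per-sample score. It is not an implementation of GSVA, which uses a different, rank-based enrichment statistic; the function name `zscore_signature`, the gene identifiers, and the log2 expression values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_signature(expression, gene_set):
    """Per-sample signature score: mean of per-gene Z-scores over the genes in gene_set.

    expression: dict mapping gene name -> list of log2 expression values, one per sample.
    gene_set: iterable of gene names forming one cellular or pathway signature.
    """
    genes = [g for g in gene_set if g in expression]
    if not genes:
        raise ValueError("none of the signature genes were measured")
    n_samples = len(expression[genes[0]])
    z_by_gene = []
    for g in genes:
        values = expression[g]
        mu, sd = mean(values), pstdev(values)
        # A gene with no variation across samples carries no information; score it as 0.
        z_by_gene.append([(v - mu) / sd if sd > 0 else 0.0 for v in values])
    return [mean(z[i] for z in z_by_gene) for i in range(n_samples)]


# Hypothetical log2 expression for three samples and three genes.
expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [5.0, 5.0, 5.0],
}
signature_scores = zscore_signature(expression, ["GENE_A", "GENE_B"])
```

Scores of this kind, one per signature per sample, are the sort of derived values that could form the data set and the reference data set supplied to the classifier.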
In certain embodiments, the skin of the patient comprises one or more lesions, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-8, Table 4B-25, Table 4B-14, Table 4A-16, Table 4B-22, Table 4B-10, Table 4A-11, Table 4B-16, Table 4B-26, Table 4A-1, Table 4A-19, Table 4A-15, Table 4B-28, Table 4B-15, and Table 4B-23, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-8, Table 4B-25, Table 4B-14, Table 4A-16, Table 4B-22, Table 4B-10, Table 4A-11, Table 4B-16, Table 4B-26, Table 4A-1, Table 4A-19, Table 4A-15, Table 4B-28, Table 4B-15, and Table 4B-23, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the lupus disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the lupus disease state, and/or the treatment is configured to treat, to reduce severity of, and/or reduce risk of having the lupus. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
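The requirement that the measured loci include, independently, at least a given number of genes from each of several selected tables can be checked mechanically. The following is a minimal sketch of such a check; the function name `panel_coverage`, the minimum of two genes per table, and the gene identifiers are hypothetical placeholders, and the table contents shown are not the actual contents of the Tables referenced herein.

```python
def panel_coverage(panel, tables, min_genes_per_table=2):
    """Return the names of the tables from which the panel draws enough genes.

    panel: iterable of gene names measured in the assay (duplicates are ignored).
    tables: dict mapping table name -> list of gene names in that table.
    min_genes_per_table: assumed minimum number of genes required from each table.
    """
    unique_panel = set(panel)  # duplicate copies are excluded from the count
    return [
        name
        for name, genes in tables.items()
        if len(unique_panel & set(genes)) >= min_genes_per_table
    ]


# Placeholder gene lists; a gene may appear in more than one table.
tables = {
    "Table 4B-8": ["G1", "G2", "G3"],
    "Table 4B-10": ["G3", "G4", "G5"],
    "Table 4A-16": ["G6", "G7"],
}
panel = ["G1", "G2", "G3", "G4", "G6", "G1"]

covered = panel_coverage(panel, tables)
satisfies_two_tables = len(covered) >= 2
```

Here the panel draws three genes from the first placeholder table and two from the second, but only one from the third, so two tables meet the per-table minimum.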
In certain embodiments, the skin of the patient does not comprise a lesion, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-26, Table 4A-8, Table 4A-14, Table 4A-16, Table 4B-11, Table 4A-1, Table 4B-6, Table 4A-10, Table 4B-10, Table 4B-16, Table 4B-2, Table 4B-19, Table 4B-13, Table 4B-1, and Table 4B-25, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-26, Table 4A-8, Table 4A-14, Table 4A-16, Table 4B-11, Table 4A-1, Table 4B-6, Table 4A-10, Table 4B-10, Table 4B-16, Table 4B-2, Table 4B-19, Table 4B-13, Table 4B-1, and Table 4B-25, or any combination thereof, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the lupus disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the lupus disease state, and/or the treatment is configured to treat, to reduce severity of, and/or reduce risk of having the lupus.
In certain embodiments, the skin of the patient comprises one or more lesions, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the PSO disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-3, Table 4B-25, Table 4B-10, Table 4B-16, Table 4B-8, Table 4B-14, Table 4B-2, Table 4A-7, Table 4B-28, Table 4B-23, Table 4B-20, Table 4B-26, Table 4A-13, Table 4B-18, and Table 4A-16, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the PSO disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-3, Table 4B-25, Table 4B-10, Table 4B-16, Table 4B-8, Table 4B-14, Table 4B-2, Table 4A-7, Table 4B-28, Table 4B-23, Table 4B-20, Table 4B-26, Table 4A-13, Table 4B-18, and Table 4A-16, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the PSO disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the PSO disease state, and/or the treatment is configured to treat, to reduce severity of, and/or reduce risk of having the PSO. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the PSO disease state. In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-1, Table 4B-3, Table 4B-12, Table 4A-14, Table 4A-20, Table 4B-17, Table 4B-20, Table 4B-27, Table 4A-9, Table 4A-15, Table 4A-18, Table 4A-13, Table 4B-26, Table 4B-2, and Table 4A-5, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the PSO disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-1, Table 4B-3, Table 4B-12, Table 4A-14, Table 4A-20, Table 4B-17, Table 4B-20, Table 4B-27, Table 4A-9, Table 4A-15, Table 4A-18, Table 4A-13, Table 4B-26, Table 4B-2, and Table 4A-5, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the PSO disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the PSO disease state, and/or the treatment is configured to treat, to reduce severity of, and/or to reduce risk of having PSO.
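As a non-limiting illustration of the selection described above, the following minimal sketch assembles a gene panel by taking at least a chosen number of genes from each selected table and excluding duplicate copies. The function name `assemble_panel`, the table labels, and the gene symbols are hypothetical placeholders introduced only for this example; they do not reproduce the contents of any Table of this disclosure, and taking the first genes of each table is an assumed selection rule.

```python
from typing import Dict, List


def assemble_panel(tables: Dict[str, List[str]],
                   selected: List[str],
                   min_genes: int) -> List[str]:
    """Return a de-duplicated gene panel drawn from the selected tables.

    Takes the first `min_genes` genes of each selected table (an assumed
    rule for this sketch) and keeps only the first copy of any gene that
    appears in more than one table.
    """
    panel: List[str] = []
    seen = set()
    for name in selected:
        genes = tables[name]
        if len(genes) < min_genes:
            raise ValueError(f"{name} lists fewer than {min_genes} genes")
        for gene in genes[:min_genes]:
            if gene not in seen:  # exclude duplicate copies
                seen.add(gene)
                panel.append(gene)
    return panel


# Hypothetical tables and gene symbols, for illustration only.
example_tables = {
    "Table X": ["GENE1", "GENE2", "GENE3"],
    "Table Y": ["GENE2", "GENE4", "GENE5"],
}
example_panel = assemble_panel(example_tables, ["Table X", "Table Y"], 2)
```

In this sketch a gene shared by two tables is counted once, consistent with excluding duplicate copies from analysis in the event of overlap.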
In certain embodiments, the skin of the patient comprises one or more lesions, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the AD disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-10, Table 4B-25, Table 4B-8, Table 4B-22, Table 4B-28, Table 4B-16, Table 4A-16, Table 4B-14, Table 4B-13, Table 4B-23, Table 4B-7, Table 4B-15, Table 4A-12, Table 4B-3, and Table 4B-2, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the AD disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-10, Table 4B-25, Table 4B-8, Table 4B-22, Table 4B-28, Table 4B-16, Table 4A-16, Table 4B-14, Table 4B-13, Table 4B-23, Table 4B-7, Table 4B-15, Table 4A-12, Table 4B-3, and Table 4B-2, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the AD disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the AD disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the AD disease state, and/or the treatment is configured to treat, to reduce severity of, and/or to reduce risk of having AD. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the AD disease state. In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-17, Table 4B-28, Table 4A-6, Table 4A-7, Table 4B-2, Table 4B-20, Table 4A-9, Table 4B-18, Table 4A-12, Table 4A-16, Table 4A-13, Table 4B-23, Table 4B-9, Table 4A-3, and Table 4A-10, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the AD disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-17, Table 4B-28, Table 4A-6, Table 4A-7, Table 4B-2, Table 4B-20, Table 4A-9, Table 4B-18, Table 4A-12, Table 4A-16, Table 4A-13, Table 4B-23, Table 4B-9, Table 4A-3, and Table 4A-10, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the AD disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the AD disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the AD disease state, and/or the treatment is configured to treat, to reduce severity of, and/or to reduce risk of having AD.
In certain embodiments, the skin of the patient comprises one or more lesions, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the SSc disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4A-16, Table 4B-8, Table 4B-25, Table 4B-21, Table 4B-26, Table 4B-10, Table 4B-28, Table 4B-2, Table 4B-27, Table 4B-14, Table 4A-18, Table 4A-6, Table 4A-15, Table 4B-12, and Table 4B-23, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the SSc disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4A-16, Table 4B-8, Table 4B-25, Table 4B-21, Table 4B-26, Table 4B-10, Table 4B-28, Table 4B-2, Table 4B-27, Table 4B-14, Table 4A-18, Table 4A-6, Table 4A-15, Table 4B-12, and Table 4B-23, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the SSc disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the SSc disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the SSc disease state, and/or the treatment is configured to treat, to reduce severity of, and/or to reduce risk of having SSc. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the SSc disease state.
In certain embodiments, the skin of the patient comprises one or more lesions, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, and Table 4B-10, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, Table 4B-23, Table 4B-20 and Table 4B-10, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, and Table 4B-10, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4B-23, Table 4B-13, Table 4A-17, and Table 4B-20, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13, or any range there between, tables selected from Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4B-13, and Table 4A-17, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 13 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17, or any range there between, tables selected from Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-23, Table 4B-13, Table 4A-17, Table 4B-20, and Table 4B-10, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 17 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the lupus or AD disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the lupus or AD disease state, and/or the treatment is configured to treat, to reduce severity of, and/or to reduce risk of having lupus or AD.
In certain embodiments, the skin of the patient does not comprise a lesion, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, and Table 4B-15, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, Table 4B-23, Table 4B-12 and Table 4B-15, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, and Table 4B-15, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-23, Table 4A-10, Table 4B-12, Table 4B-13, and Table 4B-15, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13, or any range there between, tables selected from Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4A-10, Table 4B-13, and Table 4B-15, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 13 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17, or any range there between, tables selected from Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, Table 4B-23, Table 4B-12, and Table 4B-15, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 17 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the lupus or AD disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the lupus or AD disease state, and/or the treatment is configured to treat, to reduce severity of, and/or to reduce risk of having lupus or AD.
In certain embodiments, the skin of the patient comprises one or more lesions, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, and Table 4B-7, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, Table 4B-23 and Table 4B-7, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, and Table 4B-7, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, and Table 4B-23, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16, or any range there between, tables selected from Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, Table 4B-7, and Table 4B-23, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 16 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, or any range there between, tables selected from Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, and Table 4B-5, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 14 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the lupus or PSO disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the lupus or PSO disease state, and/or the treatment is configured to treat, to reduce severity of, and/or to reduce risk of having the lupus or PSO.
In certain embodiments, i) the skin of the patient does not comprise a lesion, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, and Table 4A-10, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, Table 4A-5, Table 4B-17, Table 4B-12, Table 4A-3, and Table 4A-10, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, and Table 4A-10, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-22, Table 4A-5, Table 4B-17, Table 4B-12, and Table 4A-3, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or 11, or any range there between, tables selected from Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, and Table 4B-22, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 11 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19, or any range there between, tables selected from Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, Table 4A-5, Table 4B-17, Table 4B-12, Table 4A-3, and Table 4A-10, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 19 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the lupus or PSO disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the lupus or PSO disease state, and/or the treatment is configured to treat, to reduce severity of, and/or to reduce risk of having the lupus or PSO.
In certain embodiments, the skin of the patient comprises one or more lesions, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, and Table 4A-8, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, Table 4A-1 and Table 4A-8, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, and Table 4A-8, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-1, Table 4B-13, and Table 4A-8, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, or any range there between, tables selected from Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4B-13, and Table 4A-8, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. In certain particular embodiments, all the 14 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16, or any range there between, tables selected from Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4A-1, Table 4B-13, and Table 4A-8, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. In certain particular embodiments, all the 16 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the lupus or SSc disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the lupus or SSc disease state, and/or the treatment is configured to treat, to reduce severity of, and/or to reduce risk of having the lupus or SSc.
In certain embodiments, the skin of the patient does not comprise a lesion, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state.
In certain embodiments, i) the skin of the patient does not comprise a lesion, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state. In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-1, Table 4B-14, Table 4B-3, Table 4B-7, Table 4B-17, Table 4A-9, Table 4B-12, Table 4A-4, Table 4B-10, Table 4A-14, Table 4B-20, Table 4B-22, Table 4B-16, Table 4B-13, and Table 4A-11 and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-1, Table 4B-14, Table 4B-3, Table 4B-7, Table 4B-17, Table 4A-9, Table 4B-12, Table 4A-4, Table 4B-10, Table 4A-14, Table 4B-20, Table 4B-22, Table 4B-16, Table 4B-13, and Table 4A-11, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the AD or PSO disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the AD or PSO disease state, and/or the treatment is configured to treat, to reduce severity of, and/or to reduce risk of having the AD or PSO.
In certain embodiments, the skin of the patient comprises one or more lesions, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state.
In certain embodiments, the skin of the patient comprises one or more lesions, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4A-16, Table 4B-26, Table 4B-25, Table 4B-2, Table 4B-22, Table 4B-14, Table 4A-13, Table 4A-15, Table 4B-4, Table 4B-9, Table 4A-10, Table 4A-12, Table 4B-6, Table 4B-1, and Table 4A-5, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the discoid lupus erythematosus (DLE) or subacute cutaneous lupus erythematosus (SCLE) disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4A-16, Table 4B-26, Table 4B-25, Table 4B-2, Table 4B-22, Table 4B-14, Table 4A-13, Table 4A-15, Table 4B-4, Table 4B-9, Table 4A-10, Table 4A-12, Table 4B-6, Table 4B-1, and Table 4A-5, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the discoid lupus erythematosus (DLE) or subacute cutaneous lupus erythematosus (SCLE) disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, iv) in step (c) the report is indicative of the classification of the skin of the patient as indicative of the DLE or SCLE disease state. In certain embodiments, the treatment administered to the patient is based at least in part on the classification of the skin of the patient as indicative of the DLE or SCLE disease state, and/or the treatment is configured to treat, to reduce severity of, and/or to reduce risk of having the DLE or SCLE, respectively.
In certain embodiments, the skin of the patient does not comprise a lesion, and in step (b) the data set is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state.
In an aspect, the present disclosure provides a method for assessing skin of a patient. The method can include any one of, any combination of, or all of steps (a′), (b′) and (c′). Step (a′) can include performing enrichment assessment of a data set comprising gene expression measurements of a biological sample from the patient to obtain an enrichment score of the patient, wherein the enrichment assessment comprises assessing enrichment of expression of at least 2 genes selected from the genes listed in Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28, in the biological sample. Step (b′) can include analyzing the enrichment score of the patient, e.g., obtained in step (a′), to classify the skin of the patient as indicative of a disease state of the patient. Step (c′) can include electronically outputting a report classifying the skin of the patient as indicative of the disease state of the patient. The report of step (c′) can be indicative of the classification obtained in step (b′).
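As a non-limiting illustration, steps (a′), (b′) and (c′) can be sketched in Python as follows. The scoring rule used here (a mean of per-gene z-scores relative to a reference cohort) is an assumption chosen for brevity and is not stated in this section to be the enrichment method of the disclosure. The gene names, the threshold value, and the function names enrichment_score, classify and make_report are likewise hypothetical.

```python
from statistics import mean, stdev


def enrichment_score(patient, reference, signature):
    """Step (a'), sketched: mean z-score of the patient over signature genes.

    patient:   dict mapping gene -> expression value for one sample
    reference: dict mapping gene -> list of values in a reference cohort
    signature: iterable of gene names (hypothetical stand-ins for a Table)
    """
    z_scores = []
    for gene in signature:
        if gene not in patient or len(reference.get(gene, [])) < 2:
            continue  # gene not measured, or too few reference values
        spread = stdev(reference[gene])
        if spread == 0:
            continue  # no variation in the reference cohort: uninformative
        z_scores.append((patient[gene] - mean(reference[gene])) / spread)
    if len(z_scores) < 2:
        raise ValueError("at least 2 usable signature genes are required")
    return mean(z_scores)


def classify(score, threshold=1.0):
    """Step (b'), sketched: compare the score with an assumed cutoff."""
    if score >= threshold:
        return "indicative of the disease state"
    return "not indicative of the disease state"


def make_report(patient_id, score, label):
    """Step (c'), sketched: a one-line electronic report."""
    return f"patient={patient_id} enrichment_score={score:.2f} skin={label}"


# Hypothetical data: two genes, three reference samples each.
reference = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [2.0, 4.0, 6.0]}
patient = {"GENE_A": 4.0, "GENE_B": 8.0}

score = enrichment_score(patient, reference, ["GENE_A", "GENE_B"])
print(make_report("P001", score, classify(score)))
```

In practice, the cutoff in classify would be replaced by whatever trained classifier or decision rule is used in step (b′); the fixed threshold only marks where that decision is made.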
In certain embodiments, the disease state is an inflammatory skin disease state. In certain embodiments, the disease state is a rheumatic skin disease state. In certain embodiments, the disease state is a lupus (e.g., systemic lupus erythematosus (SLE)), psoriasis (PSO), atopic dermatitis (AD), and/or systemic sclerosis (scleroderma) (SSc) disease state. The skin of the patient can contain one or more lesions, or can contain no lesion. In certain embodiments, the skin of the patient contains one or more lesions. In certain embodiments, the skin of the patient does not contain a lesion.
In certain embodiments, the lupus is SLE, CLE, DLE, ACLE, SCLE, or CCLE, or any combination thereof. In certain embodiments, the lupus is SLE. In certain embodiments, the lupus is CLE. In certain embodiments, the lupus is DLE. In certain embodiments, the lupus is ACLE. In certain embodiments, the lupus is SCLE. In certain embodiments, the lupus is CCLE. In certain embodiments, the disease state is lupus disease state. In certain embodiments, the disease state is SLE disease state. In certain embodiments, the disease state is CLE disease state. In certain embodiments, the disease state is DLE disease state. In certain embodiments, the disease state is ACLE disease state. In certain embodiments, the disease state is SCLE disease state. In certain embodiments, the disease state is CCLE disease state.
In certain embodiments, the SLE disease state is discoid lupus erythematosus (DLE) disease state, acute cutaneous lupus erythematosus (ACLE) disease state, subacute cutaneous lupus erythematosus (SCLE) disease state, or chronic cutaneous lupus erythematosus (CCLE) disease state, or any combination thereof. In certain embodiments, the SLE disease state is DLE disease state. In certain embodiments, the SLE disease state is SCLE disease state.
In certain embodiments, the disease state is lupus, PSO, AD, and/or SSc, disease state, and in step (b′) the enrichment score of the patient, e.g. obtained in step (a′) is analyzed to classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc, disease state. In certain embodiments, the disease state is lupus, PSO, AD, or SSc, disease state, and in step (b′) the enrichment score of the patient, e.g. obtained in step (a′) is analyzed to classify the skin of the patient as indicative of the lupus, PSO, AD, or SSc, disease state. In certain embodiments, the disease state is lupus disease state, and in step (b′) the enrichment score of the patient, e.g. obtained in step (a′), is analyzed to classify the skin of the patient as indicative of the lupus disease state. In certain embodiments, the disease state is lupus disease state, and in step (b′) the enrichment score of the patient, e.g. obtained in step (a′), is analyzed to classify whether the skin of the patient is indicative of a group 1 lupus disease state, group 2 lupus disease state, group 3 lupus disease state, or not having the lupus disease state. Group 1, 2, and 3 lupus disease state can be characterized by gene enrichment analysis corresponding to group 1, 2 and 3 lupus disease, respectively, as described in Example 2, and
In certain embodiments, the at least 2 genes in step (a′) comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, 365, 370, 375, 380, 385, 390, 395, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, or all (e.g., genes listed in the Tables), or any value or range there between, genes.
In certain embodiments, the at least 2 genes in step (a′) comprise independently at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, or all, or any value or range there between, genes selected from the genes listed in each of one or more Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. 
In certain embodiments, the one or more Tables can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48, or any range there between Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28.
The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient, with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient, with a sensitivity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient, with a specificity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient, with a positive predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient, with a negative predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient, with a receiver operating characteristic (ROC) curve having an Area-Under-Curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than about 0.99.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with an accuracy of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with an accuracy of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%. 
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with an accuracy of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with an accuracy of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a sensitivity of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a sensitivity of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a sensitivity of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a sensitivity of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a specificity of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a specificity of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%. 
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a specificity of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a specificity of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a positive predictive value of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a positive predictive value of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a positive predictive value of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a positive predictive value of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a negative predictive value of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a negative predictive value of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a negative predictive value of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a negative predictive value of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of about 0.7 to about 1. In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of about 0.7 to about 0.75, about 0.7 to about 0.8, about 0.7 to about 0.85, about 0.7 to about 0.9, about 0.7 to about 0.925, about 0.7 to about 0.95, about 0.7 to about 0.96, about 0.7 to about 0.97, about 0.7 to about 0.98, about 0.7 to about 0.99, about 0.7 to about 1, about 0.75 to about 0.8, about 0.75 to about 0.85, about 0.75 to about 0.9, about 0.75 to about 0.925, about 0.75 to about 0.95, about 0.75 to about 0.96, about 0.75 to about 0.97, about 0.75 to about 0.98, about 0.75 to about 0.99, about 0.75 to about 1, about 0.8 to about 0.85, about 0.8 to about 0.9, about 0.8 to about 0.925, about 0.8 to about 0.95, about 0.8 to about 0.96, about 0.8 to about 0.97, about 0.8 to about 0.98, about 0.8 to about 0.99, about 0.8 to about 1, about 0.85 to about 0.9, about 0.85 to about 0.925, about 0.85 to about 0.95, about 0.85 to about 0.96, about 0.85 to about 0.97, about 0.85 to about 0.98, about 0.85 to about 0.99, about 0.85 to about 1, about 0.9 to about 0.925, about 0.9 to about 0.95, about 0.9 to about 0.96, about 0.9 to about 0.97, about 0.9 to about 0.98, about 0.9 to about 0.99, about 0.9 to about 1, about 0.925 to about 0.95, about 0.925 to about 0.96, about 0.925 to about 0.97, about 0.925 to about 0.98, about 0.925 to about 0.99, about 0.925 to about 1, about 0.95 to about 0.96, about 0.95 to about 0.97, about 0.95 to about 0.98, about 0.95 to about 0.99, about 0.95 to about 1, about 0.96 to about 0.97, about 0.96 to about 0.98, about 0.96 to about 0.99, about 0.96 to about 1, about 0.97 to about 0.98, about 0.97 to about 0.99, about 0.97 to about 1, about 0.98 to about 0.99, about 0.98 to about 1, or about 0.99 to about 1. In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of about 0.7, about 0.75, about 0.8, about 0.85, about 0.9, about 0.925, about 0.95, about 0.96, about 0.97, about 0.98, about 0.99, or about 1. In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of at least about 0.7, about 0.75, about 0.8, about 0.85, about 0.9, about 0.925, about 0.95, about 0.96, about 0.97, about 0.98, or about 0.99. In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of at most about 0.75, about 0.8, about 0.85, about 0.9, about 0.925, about 0.95, about 0.96, about 0.97, about 0.98, about 0.99, or about 1.
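As a non-limiting illustration, the performance measures recited above (accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and AUC) can be computed from classifier outputs as in the following minimal Python sketch. The function names, the binary label encoding (1 for a disease state, 0 for healthy control), and the use of pairwise score comparison to obtain the AUC are assumptions made for illustration and are not taken from the present disclosure.

```python
from typing import Dict, Sequence


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute confusion-matrix metrics for binary labels (1 = disease, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; tied scores count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise formulation is used here only because it needs no external library; it is quadratic in the number of samples, so a rank-based routine may be preferable for large cohorts.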
In certain embodiments, the patient has lupus, PSO, AD, and/or SSc. In certain embodiments, the patient is suspected of having lupus, PSO, AD, and/or SSc. In certain embodiments, the patient is at elevated risk of having lupus, PSO, AD, and/or SSc. In certain embodiments, the patient is asymptomatic for lupus, PSO, AD, and/or SSc. In certain embodiments, the patient has DLE, and/or SCLE. In certain embodiments, the patient is suspected of having DLE, and/or SCLE. In certain embodiments, the patient is at elevated risk of having DLE, and/or SCLE. In certain embodiments, the patient is asymptomatic for DLE, and/or SCLE. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for lupus. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for PSO. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for AD. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for SSc. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for DLE. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for SCLE. In certain embodiments, the method can further comprise administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state. The treatment can be configured to treat, reduce severity, and/or reduce risk of having the lupus, PSO, AD, or SSc. In some embodiments, the treatment is configured to treat the lupus, PSO, AD, or SSc. In some embodiments, the treatment is configured to reduce a severity of the lupus, PSO, AD, or SSc. 
In some embodiments, the treatment is configured to reduce a risk of having the lupus, PSO, AD, or SSc. The treatment can be one or more treatments of lupus, PSO, AD, and/or SSc. In certain embodiments, the method can comprise administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the DLE or SCLE disease state. The treatment can be configured to treat, reduce severity, and/or reduce risk of having the DLE or SCLE. In some embodiments, the treatment is configured to treat the DLE or SCLE. In some embodiments, the treatment is configured to reduce a severity of the DLE or SCLE. In some embodiments, the treatment is configured to reduce a risk of having the DLE or SCLE. The treatment can be one or more treatments of DLE or SCLE. In some embodiments, the treatment comprises a pharmaceutical composition. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having lupus. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having PSO. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having AD. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having SSc. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having DLE. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having SCLE.
A treatment used in the context of the present methods may be any known to those of skill in the art for treating, e.g., reducing the severity of or reducing the risk of, the disease state in the patient. In some embodiments, the treatment comprises an immunosuppressive treatment. In some embodiments, the treatment comprises a pharmaceutical composition comprising one or more agents that target and/or inhibit: TNF (e.g., etanercept, infliximab, adalimumab, certolizumab); IL-12/23 (IL23 complex) (e.g., ustekinumab, guselkumab, risankizumab); an interferon or interferon receptor (e.g., anifrolumab, which binds to IFNAR); proteasome (e.g., bortezomib, carfilzomib, ixazomib); CD38 (e.g., daratumumab, isatuximab); SLAMF7 (e.g., elotuzumab); IMPDH (e.g., mycophenolate mofetil); BLyS (e.g., belimumab); CD19 (e.g., inebilizumab); CD20 (e.g., rituximab, obinutuzumab); CD20/CD3 (e.g., glofitamab); NPL4 (e.g., disulfiram); neutrophil elastase (e.g., alvelestat); a growth factor receptor, e.g., FGFR, PDGFR, VEGFR (e.g., nintedanib, pirfenidone); BDCA2 (e.g., BIIB059); and/or ILT7 (e.g., daxdilimab). In some embodiments, the pharmaceutical composition comprises an agent that targets plasma cells (e.g., bortezomib, carfilzomib, ixazomib, daratumumab, isatuximab, elotuzumab, mycophenolate mofetil), B cells (e.g., belimumab, inebilizumab, rituximab, glofitamab, obinutuzumab), neutrophils (e.g., disulfiram, alvelestat), TGFB fibroblasts (e.g., nintedanib, pirfenidone), and/or dendritic cells (e.g., BIIB059, daxdilimab). In some embodiments, a treatment for DLE comprises an agent that targets plasma cells and/or B cells. In some embodiments, a treatment for psoriasis comprises an agent that targets neutrophils. In some embodiments, a treatment for systemic sclerosis comprises an agent that targets TGFB fibroblasts and/or dendritic cells. In some embodiments, a treatment for atopic dermatitis comprises an agent that targets IL23.
In some embodiments, the treatment can be one or more treatments shown in
The biological sample can comprise a skin biopsy sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof. In certain embodiments, the biological sample comprises a skin biopsy sample, or any derivative thereof. In certain embodiments, the biological sample comprises a blood sample, or any derivative thereof. In certain embodiments, the biological sample comprises isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof.
In certain embodiments, the method further comprises determining a likelihood of the classification of the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient. In certain embodiments, the method further comprises monitoring the skin of the patient, wherein the monitoring comprises assessing the skin of the patient at a plurality of different time points. A difference in the assessment of the skin of the patient among the plurality of time points can be indicative of one or more clinical indications selected from the group consisting of: (i) a diagnosis of the skin of the patient, (ii) a prognosis of the skin of the patient, and (iii) an efficacy or non-efficacy of a course of treatment for treating the skin of the patient.
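As a non-limiting illustration of the monitoring described above, the following minimal Python sketch computes the difference in an assessment between consecutive time points. It assumes, purely for illustration, that each assessment is summarized as a single likelihood value between 0 and 1 for a given disease state; the function name, the time point labels, and this single-value summary are hypothetical and are not taken from the present disclosure.

```python
from typing import List, Tuple


def assessment_changes(assessments: List[Tuple[str, float]]) -> List[Tuple[str, str, float]]:
    """Return (earlier label, later label, change in likelihood) for each pair of
    consecutive time points. `assessments` must be in chronological order."""
    changes = []
    for (t0, p0), (t1, p1) in zip(assessments, assessments[1:]):
        # A negative change means the later assessment is less indicative
        # of the disease state than the earlier one.
        changes.append((t0, t1, p1 - p0))
    return changes
```

How a given change maps onto a diagnosis, a prognosis, or the efficacy or non-efficacy of a course of treatment is a clinical determination that this sketch does not attempt to encode.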
In certain embodiments, the enrichment assessment of the data set in step (a′) is performed using gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), an enrichment algorithm, Z-score, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log2 expression analysis, or any combination thereof. In certain particular embodiments, the enrichment assessment of the data set in step (a′) is performed using GSVA.
In certain embodiments, the enrichment score of the patient comprises one or more Table specific enrichment scores of the patient, wherein the one or more Table specific enrichment scores are generated using one or more of the Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28, wherein for a respective selected Table, at least one Table specific enrichment score of the patient is generated for enrichment of expression of at least 2 genes listed in the respective Table, in the biological sample. The one or more Table specific enrichment scores comprises the at least one Table specific enrichment score from each of the selected Table. The at least 2 genes of the data set can comprise the at least 2 genes from each of the selected table (e.g., for a respective selected table for enrichment of expression of which the at least one Table specific enrichment score from the respective selected table is generated). 
In certain embodiments, the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48, or 1 to 48, or any range there between Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. In certain embodiments, all 48 Tables, e.g., Tables 4A-1 to 4A-20 and Tables 4B-1 to 4B-28, are selected. In certain embodiments, independently for each of the selected Tables of the one or more Tables, the at least one Table specific enrichment score from the Table is generated for enrichment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, or 300, or all, or any range or value there between, genes selected from the genes listed in the Table, in the biological sample.
In certain embodiments, one Table specific enrichment score is generated for each of the selected Tables, and the one or more Table specific enrichment scores of the patient comprise the one Table specific enrichment score of the patient from each of the selected Tables. In certain embodiments, the Table specific enrichment scores are GSVA scores obtained using GSVA. In certain embodiments, the GSVA scores can be Z-score GSVA scores.
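As a non-limiting illustration of generating one enrichment score per selected Table, the following minimal Python sketch computes a simple Z-score based score for each gene set in each sample. It is not an implementation of GSVA; it standardizes each gene across the samples supplied and averages the standardized values of the genes in each set. The function names, the nested-dictionary data layout, and the `min_genes` parameter (mirroring the "at least 2 genes" requirement) are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, List


def gene_z_scores(expression: Dict[str, Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """expression[sample][gene] holds an expression value (e.g., log2 scale).
    Returns z[sample][gene], standardized across samples for each gene that is
    measured in every sample. Genes with zero variance receive a z-score of 0."""
    samples = list(expression)
    if not samples:
        return {}
    genes = set.intersection(*(set(expression[s]) for s in samples))
    z: Dict[str, Dict[str, float]] = {s: {} for s in samples}
    for gene in genes:
        values = [expression[s][gene] for s in samples]
        mu, sd = mean(values), pstdev(values)
        for s in samples:
            z[s][gene] = (expression[s][gene] - mu) / sd if sd > 0 else 0.0
    return z


def table_scores(
    z: Dict[str, Dict[str, float]],
    gene_sets: Dict[str, List[str]],
    min_genes: int = 2,
) -> Dict[str, Dict[str, float]]:
    """Return scores[sample][set name]: the mean z-score of the set's genes
    present in the data. Sets with fewer than `min_genes` measured genes are
    skipped for that sample."""
    scores: Dict[str, Dict[str, float]] = {}
    for sample, gene_z in z.items():
        scores[sample] = {}
        for name, genes in gene_sets.items():
            present = [g for g in genes if g in gene_z]
            if len(present) >= min_genes:
                scores[sample][name] = mean(gene_z[g] for g in present)
    return scores
```

In this sketch each key of `gene_sets` would stand for one selected Table and each value for the genes drawn from that Table, so that the result holds one score per sample per Table.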
In certain embodiments, the enrichment assessment of the data set in step (a′) is performed using GSVA, wherein the enrichment score obtained in step (a′) comprises one or more GSVA scores of the patient, wherein the one or more GSVA scores are generated using one or more of the Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28, wherein for a respective selected Table, at least one GSVA score of the patient is generated for enrichment of expression of at least 2 genes listed in the respective Table, in the biological sample. The one or more GSVA scores of the patient comprises the at least one GSVA score from each of the selected Table. The one or more GSVA scores are generated by the enrichment assessment of the data set in step (a′), and the at least 2 genes of step (a′) comprises the at least 2 genes from each of the selected table (e.g., for a respective selected table for enrichment of expression of which the at least one GSVA score from the respective selected table is generated). 
In certain embodiments, the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48, or 1 to 48, or any range there between, Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. In certain embodiments, all the 48 Tables, e.g., Tables 4A-1 to 4A-20 and Tables 4B-1 to 4B-28, are selected. In certain embodiments, independently for each of the selected Tables of the one or more Tables, the at least one GSVA score from the Table is generated, for enrichment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295 or 300 or all or any range or value there between, genes selected from the genes listed in the Table, in the biological sample. In certain embodiments, for each of the selected Tables, one GSVA score is generated, and the one or more GSVA scores of the patient comprise the one GSVA score of the patient from each of the selected Tables.
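As a rough illustration of generating one score per selected Table, the following Python sketch computes a simplified enrichment-style score: each gene is z-scored across samples, and the z-scores of the genes in a Table are averaged per sample. This is a stand-in for illustration only and is not the GSVA algorithm itself; the function name `table_scores`, the gene symbols, and the gene lists are hypothetical and are not taken from Tables 4A-1 to 4B-28.

```python
import numpy as np

def table_scores(expr, genes, tables, min_genes=2):
    """Return {table_name: per-sample score array}.

    expr   : 2-D array, rows = genes, columns = samples (e.g., log2 expression)
    genes  : list of gene symbols, one per row of expr
    tables : dict mapping a Table name to its list of gene symbols
    """
    expr = np.asarray(expr, dtype=float)
    # z-score each gene across samples; constant genes get a z-score of 0
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, keepdims=True)
    z = (expr - mean) / np.where(std == 0, 1.0, std)
    row = {g: i for i, g in enumerate(genes)}
    scores = {}
    for name, members in tables.items():
        # duplicate genes are dropped; genes that were not measured are ignored
        idx = sorted({row[g] for g in members if g in row})
        if len(idx) < min_genes:
            continue  # a score needs at least 2 genes from the Table
        scores[name] = z[idx].mean(axis=0)
    return scores

# Hypothetical toy data: 4 genes x 3 samples and two made-up gene lists
genes = ["G1", "G2", "G3", "G4"]
expr = [[1.0, 2.0, 3.0],
        [2.0, 4.0, 6.0],
        [5.0, 5.0, 5.0],
        [3.0, 2.0, 1.0]]
tables = {"Table X": ["G1", "G2"], "Table Y": ["G4"]}
scores = table_scores(expr, genes, tables)
```

In this toy run only "Table X" yields a score, because "Table Y" contributes a single gene and so falls below the two-gene minimum.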
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables (e.g., of step (a′)) comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-8, Table 4B-25, Table 4B-14, Table 4A-16, Table 4B-22, Table 4B-10, Table 4A-11, Table 4B-16, Table 4B-26, Table 4A-1, Table 4A-19, Table 4A-15, Table 4B-28, Table 4B-15, and Table 4B-23, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-8, Table 4B-25, Table 4B-14, Table 4A-16, Table 4B-22, Table 4B-10, Table 4A-11, Table 4B-16, Table 4B-26, Table 4A-1, Table 4A-19, Table 4A-15, Table 4B-28, Table 4B-15, and Table 4B-23, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-26, Table 4A-8, Table 4A-14, Table 4A-16, Table 4B-11, Table 4A-1, Table 4B-6, Table 4A-10, Table 4B-10, Table 4B-16, Table 4B-2, Table 4B-19, Table 4B-13, Table 4B-1, and Table 4B-25, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-26, Table 4A-8, Table 4A-14, Table 4A-16, Table 4B-11, Table 4A-1, Table 4B-6, Table 4A-10, Table 4B-10, Table 4B-16, Table 4B-2, Table 4B-19, Table 4B-13, Table 4B-1, and Table 4B-25, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
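The lesional and non-lesional lupus Table groups described above can be held in a simple lookup keyed by lesion status. The sketch below is illustrative only: the dictionary layout and the `select_tables` helper are hypothetical, and the Table names are copied from the two 15-Table groups listed above.

```python
# Table groups for lupus classification, keyed by lesion status
LUPUS_PANELS = {
    "lesional": [
        "4B-8", "4B-25", "4B-14", "4A-16", "4B-22", "4B-10", "4A-11", "4B-16",
        "4B-26", "4A-1", "4A-19", "4A-15", "4B-28", "4B-15", "4B-23",
    ],
    "non_lesional": [
        "4B-26", "4A-8", "4A-14", "4A-16", "4B-11", "4A-1", "4B-6", "4A-10",
        "4B-10", "4B-16", "4B-2", "4B-19", "4B-13", "4B-1", "4B-25",
    ],
}

def select_tables(has_lesion):
    """Return the lupus Table group that matches the lesion status of the skin."""
    return LUPUS_PANELS["lesional" if has_lesion else "non_lesional"]

lesional = select_tables(True)
non_lesional = select_tables(False)
# Tables that appear in both groups, sorted for a stable order
shared = sorted(set(lesional) & set(non_lesional))
```

Any subset of 1 to 15 Tables could be drawn from the returned group; the sketch returns the full group for simplicity.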
In certain embodiments, the skin of the patient contains one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient. In certain embodiments, the skin of the patient contains one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-10, Table 4B-25, Table 4B-8, Table 4B-22, Table 4B-28, Table 4B-16, Table 4A-16, Table 4B-14, Table 4B-13, Table 4B-23, Table 4B-7, Table 4B-15, Table 4A-12, Table 4B-3, and Table 4B-2, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient. In certain embodiments, the skin of the patient contains one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-10, Table 4B-25, Table 4B-8, Table 4B-22, Table 4B-28, Table 4B-16, Table 4A-16, Table 4B-14, Table 4B-13, Table 4B-23, Table 4B-7, Table 4B-15, Table 4A-12, Table 4B-3, and Table 4B-2, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the AD disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-17, Table 4B-28, Table 4A-6, Table 4A-7, Table 4B-2, Table 4B-20, Table 4A-9, Table 4B-18, Table 4A-12, Table 4A-16, Table 4A-13, Table 4B-23, Table 4B-9, Table 4A-3, and Table 4A-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-17, Table 4B-28, Table 4A-6, Table 4A-7, Table 4B-2, Table 4B-20, Table 4A-9, Table 4B-18, Table 4A-12, Table 4A-16, Table 4A-13, Table 4B-23, Table 4B-9, Table 4A-3, and Table 4A-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the AD disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-3, Table 4B-25, Table 4B-10, Table 4B-16, Table 4B-8, Table 4B-14, Table 4B-2, Table 4A-7, Table 4B-28, Table 4B-23, Table 4B-20, Table 4B-26, Table 4A-13, Table 4B-18, and Table 4A-16, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-3, Table 4B-25, Table 4B-10, Table 4B-16, Table 4B-8, Table 4B-14, Table 4B-2, Table 4A-7, Table 4B-28, Table 4B-23, Table 4B-20, Table 4B-26, Table 4A-13, Table 4B-18, and Table 4A-16, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the PSO disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4B-3, Table 4B-12, Table 4A-14, Table 4A-20, Table 4B-17, Table 4B-20, Table 4B-27, Table 4A-9, Table 4A-15, Table 4A-18, Table 4A-13, Table 4B-26, Table 4B-2, and Table 4A-5, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4B-3, Table 4B-12, Table 4A-14, Table 4A-20, Table 4B-17, Table 4B-20, Table 4B-27, Table 4A-9, Table 4A-15, Table 4A-18, Table 4A-13, Table 4B-26, Table 4B-2, and Table 4A-5, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the PSO disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the SSc disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-16, Table 4B-8, Table 4B-25, Table 4B-21, Table 4B-26, Table 4B-10, Table 4B-28, Table 4B-2, Table 4B-27, Table 4B-14, Table 4A-18, Table 4A-6, Table 4A-15, Table 4B-12, and Table 4B-23, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the SSc disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-16, Table 4B-8, Table 4B-25, Table 4B-21, Table 4B-26, Table 4B-10, Table 4B-28, Table 4B-2, Table 4B-27, Table 4B-14, Table 4A-18, Table 4A-6, Table 4A-15, Table 4B-12, and Table 4B-23, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the SSc disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the SSc disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the SSc disease state of the patient.
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, and Table 4B-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17, or 1 to 17, or any range there between, Tables selected from the group consisting of Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, Table 4B-23, Table 4B-20, and Table 4B-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain particular embodiments, all the 17 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4B-23, Table 4B-13, Table 4A-17, and Table 4B-20, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13, or 1 to 13, or any range there between, Tables selected from the group consisting of Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4B-13, and Table 4A-17, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain particular embodiments, all the 13 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17, or 1 to 17, or any range there between, Tables selected from the group consisting of Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, Table 4B-23, Table 4B-20, and Table 4B-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus or AD disease state of the patient.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, and Table 4B-15, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-23, Table 4A-10, Table 4B-12, Table 4B-13, and Table 4B-15, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13, or 1 to 13, or any range there between, Tables selected from the group consisting of Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4A-10, Table 4B-13, and Table 4B-15, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain particular embodiments, all the 13 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17, or 1 to 17, or any range there between, Tables selected from the group consisting of Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, Table 4B-23, Table 4B-12, and Table 4B-15, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain particular embodiments, all the 17 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17, or 1 to 17, or any range there between, Tables selected from the group consisting of Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, Table 4B-23, Table 4B-12, and Table 4B-15, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus or AD disease state of the patient.
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, and Table 4B-7, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, and Table 4B-23, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, or 1 to 14, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, and Table 4B-5, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain particular embodiments, all the 14 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16, or 1 to 16, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, Table 4B-23, and Table 4B-7, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain particular embodiments, all the 16 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16, or 1 to 16, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, Table 4B-23, and Table 4B-7, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus or PSO disease state of the patient.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, and Table 4A-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4A-5, Table 4B-17, Table 4B-12, Table 4A-3, and Table 4B-22, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or 11, or 1 to 11, or any range there between, Tables selected from the group consisting of Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, and Table 4B-22, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain particular embodiments, all the 11 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19, or 1 to 19, or any range there between, Tables selected from the group consisting of Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, Table 4A-5, Table 4B-17, Table 4B-12, Table 4A-3, and Table 4A-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain particular embodiments, all the 19 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19, or 1 to 19, or any range there between, Tables selected from the group consisting of Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, Table 4A-5, Table 4B-17, Table 4B-12, Table 4A-3, and Table 4A-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus or PSO disease state of the patient.
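For the non-lesional lupus or PSO classification, the 11-Table and 19-Table groups appear to correspond to the intersection and the union, respectively, of the two 15-Table groups described above. The following Python sketch checks that relationship; the variable names are hypothetical, and the Table names are copied from the two 15-Table groups.

```python
# The two 15-Table groups for non-lesional lupus or PSO classification
group_a = {
    "4A-14", "4B-1", "4A-16", "4A-15", "4B-16", "4A-12", "4A-8", "4A-1",
    "4B-25", "4B-26", "4B-24", "4B-22", "4A-7", "4B-10", "4A-10",
}
group_b = {
    "4A-14", "4B-1", "4A-16", "4A-15", "4B-16", "4A-12", "4A-8", "4A-1",
    "4B-25", "4B-26", "4A-5", "4B-17", "4B-12", "4A-3", "4B-22",
}

common = group_a & group_b    # Tables present in both groups
combined = group_a | group_b  # Tables present in either group

print(len(common), len(combined))
```

The same check can be applied to the other paired classifications, where the smaller and larger Table groups follow the same pattern.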
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, and Table 4A-8 and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-1, Table 4B-13, and Table 4A-8 and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, or 1 to 14, or any range there between, Tables selected from the group consisting of Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4B-13, and Table 4A-8 and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient. In certain particular embodiments, all the 14 Tables (e.g., listed in the previous sentence) are selected. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16, or 1 to 16, or any range there between, Tables selected from the group consisting of Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, Table 4A-1, and Table 4A-8 and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient. In certain particular embodiments, all the 16 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16, or 1 to 16, or any range there between, Tables selected from the group consisting of Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, Table 4A-1, and Table 4A-8, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus or SSc disease state of the patient.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4B-14, Table 4B-3, Table 4B-7, Table 4B-17, Table 4A-9, Table 4B-12, Table 4A-4, Table 4B-10, Table 4A-14, Table 4B-20, Table 4B-22, Table 4B-16, Table 4B-13, and Table 4A-11, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4B-14, Table 4B-3, Table 4B-7, Table 4B-17, Table 4A-9, Table 4B-12, Table 4A-4, Table 4B-10, Table 4A-14, Table 4B-20, Table 4B-22, Table 4B-16, Table 4B-13, and Table 4A-11, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the AD or PSO disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state of the patient.
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-16, Table 4B-26, Table 4B-25, Table 4B-2, Table 4B-22, Table 4B-14, Table 4A-13, Table 4A-15, Table 4B-4, Table 4B-9, Table 4A-10, Table 4A-12, Table 4B-6, Table 4B-1, and Table 4A-5, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-16, Table 4B-26, Table 4B-25, Table 4B-2, Table 4B-22, Table 4B-14, Table 4A-13, Table 4A-15, Table 4B-4, Table 4B-9, Table 4A-10, Table 4A-12, Table 4B-6, Table 4B-1, and Table 4A-5, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the DLE or SCLE disease state of the patient.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient.
In certain embodiments, the step (b′) comprises using a trained machine learning model to analyze the enrichment score of the patient to classify the skin of the patient as indicative of the disease state. The trained machine learning model can generate an inference indicating whether the skin of the patient is indicative of the disease state, based on the enrichment score of the patient. In certain embodiments, the analyzing in step (b′) comprises providing the one or more GSVA scores of the patient as an input to the trained machine-learning model, wherein the trained machine-learning model is trained to generate the inference, based at least on the one or more GSVA scores. In certain embodiments, the method further comprises receiving, as an output of the machine-learning model, the inference indicating whether the skin of the patient is indicative of the disease state. The trained machine learning model can generate the inference based at least on comparing the data set to a reference data set. In certain embodiments, step (b′) comprises comparing the data set to a reference data set. The trained machine learning model can be trained using a reference data set, wherein a first portion of the reference data set can be used as a training data set, and a second portion of the reference data set can be used as a validation data set.
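The division of a reference data set into a training portion and a validation portion can be illustrated with a minimal sketch. The sketch below is an assumption-laden illustration, not the disclosed procedure: the function name `stratified_split`, the 30% validation fraction, the fixed seed, and the choice to stratify by label are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, validation_fraction=0.3, seed=0):
    """Split sample indices into training and validation portions.

    Each label (e.g., disease state vs. no disease state) is split
    separately so that both portions contain every label.
    The fraction and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train_idx, val_idx = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_val = max(1, round(validation_fraction * len(indices)))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Hypothetical reference labels: 6 subjects with the disease state, 4 without.
reference_labels = ["disease"] * 6 + ["control"] * 4
train_idx, val_idx = stratified_split(reference_labels)
```

The training indices would select the reference GSVA scores used to fit the model, and the validation indices would select the held-out scores used to check it.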
The reference data set can comprise and/or be derived from gene expression measurements of reference biological samples of at least 2 genes selected from the group of genes listed in Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. In certain embodiments, the reference data set comprises a plurality of reference enrichment scores derived from the gene expression measurements of the at least 2 genes, of the plurality of the reference biological samples. The reference enrichment scores can be derived based at least on enrichment assessment of the at least 2 genes, in the plurality of the reference biological samples. The at least 2 genes of the reference data set and the at least 2 genes of the data set can at least partially overlap (e.g., be the same). In certain embodiments, the enrichment assessment of the reference data set is performed using gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), enrichment algorithm, Z-score, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log2 expression analysis, or any combination thereof. In certain particular embodiments, the enrichment assessment of the reference data set is performed using GSVA.
In certain embodiments, a respective enrichment score comprises one or more GSVA scores, wherein the one or more GSVA scores of the respective enrichment score are generated using one or more of the Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28, wherein for a respective selected Table, at least one GSVA score of the respective enrichment score is generated based on enrichment of expression of at least 2 genes listed in the respective Table, in the respective reference biological sample (e.g., from which the respective enrichment score is obtained), wherein the one or more GSVA scores comprises the at least one GSVA score from each of the selected Tables. In certain embodiments, the selected Tables of the data set (e.g., from which the one or more GSVA scores of the data set is generated), and the selected Tables of the reference data set (e.g., from which the one or more GSVA scores of the reference data set is generated) can at least partially overlap (e.g., be the same). In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having the disease state, and a second plurality of biological samples obtained or derived from reference subjects not having the disease state, wherein the skin of the reference subjects having the disease state contain one or more lesions.
In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having the disease state, and a second plurality of biological samples obtained or derived from reference subjects not having the disease state, wherein the skin of the reference subjects having the disease state do not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects not having lupus disease state, wherein the skin of the reference subjects having the lupus disease state contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects not having lupus disease state, wherein the skin of the reference subjects having lupus disease state do not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having PSO disease state, and a second plurality of biological samples obtained or derived from reference subjects not having PSO disease state, wherein the skin of the reference subjects having PSO disease state contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having PSO disease state, and a second plurality of biological samples obtained or derived from reference subjects not having PSO disease state, wherein the skin of the reference subjects having PSO disease state do not contain a lesion. 
In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having AD disease state, and a second plurality of biological samples obtained or derived from reference subjects not having AD disease state, wherein the skin of the reference subjects having AD disease state contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having AD disease state, and a second plurality of biological samples obtained or derived from reference subjects not having AD disease state, wherein the skin of the reference subjects having AD disease state do not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having SSc disease state, and a second plurality of biological samples obtained or derived from reference subjects not having SSc disease state, wherein the skin of the reference subjects having SSc disease state contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having SSc disease state, and a second plurality of biological samples obtained or derived from reference subjects not having SSc disease state, wherein the skin of the reference subjects having SSc disease state do not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects having PSO disease state, wherein the skin of the reference subjects contains one or more lesions. 
In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects having PSO disease state, wherein the skin of the reference subjects does not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects having AD disease state, wherein the skin of the reference subjects contains one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects having AD disease state, wherein the skin of the reference subjects does not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects having SSc disease state, wherein the skin of the reference subjects contains one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of biological samples obtained or derived from reference subjects having SSc disease state, wherein the skin of the reference subjects do not contain a lesion. 
In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having AD disease state, and a second plurality of biological samples obtained or derived from reference subjects having PSO disease state, wherein the skin of the reference subjects does not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having AD disease state, and a second plurality of biological samples obtained or derived from reference subjects having PSO disease state, wherein the skin of the reference subjects contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having DLE disease state, and a second plurality of biological samples obtained or derived from reference subjects having SCLE disease state, wherein the skin of the reference subjects contain one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from reference subjects having DLE disease state, and a second plurality of biological samples obtained or derived from reference subjects having SCLE disease state, wherein the skin of the reference subjects do not contain a lesion. The reference biological samples can comprise skin biopsy sample, blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof. 
In certain embodiments, the trained machine learning model is trained to infer the classification of the skin of the patient based on a set of N features, the machine learning model trained by at least determining, from a training dataset, the N features that are usable to determine a binary classification indicative of whether a training dataset patient has i) skin indicative of at least one of one or more inflammatory skin disease states selected from lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state, or healthy state, or ii) skin indicative of a first inflammatory skin disease state of the one or more inflammatory skin disease states or a second inflammatory skin disease state of the one or more inflammatory skin disease states. The N features can be determined according to a method described herein, using the method of steps (a″), (b″), (c″), (d″), (e″), and/or (f″). The patient can be a human. The reference subjects can be humans. In certain embodiments, the trained machine-learning model is trained to generate the inference of whether the skin of the patient is indicative of the lupus disease state, based at least on the one or more GSVA scores of the patient, wherein the method can classify whether the skin of the patient is indicative of the lupus disease state. In certain embodiments, the trained machine-learning model is trained to generate the inference of whether the skin of the patient is indicative of the AD disease state, based at least on the one or more GSVA scores of the patient, wherein the method can classify whether the skin of the patient is indicative of the AD disease state. In certain embodiments, the trained machine-learning model is trained to generate the inference of whether the skin of the patient is indicative of the PSO disease state, based at least on the one or more GSVA scores of the patient, wherein the method can classify whether the skin of the patient is indicative of the PSO disease state.
In certain embodiments, the trained machine-learning model is trained to generate an inference of whether the skin of the patient is indicative of the SSc disease state, based at least on the one or more GSVA scores of the patient, wherein the method can classify whether the skin of the patient is indicative of the SSc disease state. In certain embodiments, the trained machine-learning model is trained to generate the inference of whether the skin of the patient is indicative of the lupus or AD disease state, based at least on the one or more GSVA scores of the patient, wherein the method can classify whether the skin of the patient is indicative of the lupus or AD disease state. In certain embodiments, the trained machine-learning model is trained to generate the inference of whether the skin of the patient is indicative of the lupus or PSO disease state, based at least on the one or more GSVA scores of the patient, wherein the method can classify whether the skin of the patient is indicative of the lupus or PSO disease state. In certain embodiments, the trained machine-learning model is trained to generate the inference of whether the skin of the patient is indicative of the lupus or SSc disease state, based at least on the one or more GSVA scores of the patient, wherein the method can classify whether the skin of the patient is indicative of the lupus or SSc disease state. In certain embodiments, the trained machine-learning model is trained to generate the inference of whether the skin of the patient is indicative of the DLE or SCLE disease state, based at least on the one or more GSVA scores of the patient, wherein the method can classify whether the skin of the patient is indicative of the DLE or SCLE disease state. 
In certain embodiments, the trained machine-learning model is trained to generate the inference of whether the skin of the patient is indicative of the AD or PSO disease state, based at least on the one or more GSVA scores of the patient, wherein the method can classify whether the skin of the patient is indicative of the AD or PSO disease state. The trained machine learning model can be trained using linear regression, logistic regression, Ridge regression, Lasso regression, an elastic net (EN) regression, support vector machine (SVM), gradient boosted machine (GBM), k nearest neighbors (kNN), generalized linear model (GLM), naïve Bayes (NB) model, neural network, Random Forest (RF), deep learning algorithm, linear discriminant analysis (LDA), decision tree learning (DTREE), adaptive boosting (ADB), Classification and Regression Tree (CART), Hierarchical clustering, or any combination thereof. In certain embodiments, the trained machine learning model can be trained according to a method described herein, e.g. using the method of steps (a″), (b″), (c″), (d″), (e″), and/or (f″).
The inference of the machine learning model can include a confidence value between 0 and 1. In certain embodiments, the confidence value of the inference of the machine learning model is between 0 and 1, that the patient has the disease state. In certain embodiments, the confidence value of the inference of the machine learning model is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has lupus disease state. In certain embodiments, the confidence value of the inference of the machine learning model is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has PSO disease state. In certain embodiments, the confidence value of the inference of the machine learning model is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has AD disease state. In certain embodiments, the confidence value of the inference of the machine learning model is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has SSc disease state. In certain embodiments, the confidence value of the inference of the machine learning model is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has DLE disease state. In certain embodiments, the confidence value of the inference of the machine learning model is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has SCLE disease state.
In certain embodiments, the analyzing in step (b′) comprises generating a disease risk score of the patient based at least on the one or more GSVA scores of the patient, wherein the skin of the patient is classified as indicative of the disease state based on the disease risk score. The skin of the patient can be classified as indicative of the disease state based on comparing the risk score of the patient to a reference value. In certain embodiments, the skin of the patient is classified as indicative of the disease state based on comparing the risk score of the patient to a reference value, wherein a risk score on one side (e.g., higher or lower) of the reference value indicates that the skin of the patient is indicative of the disease state, and a risk score on the other side (e.g., lower or higher, respectively) of the reference value indicates that the skin of the patient is not indicative of the disease state.
In certain embodiments, generating the disease risk score of the patient comprises developing one or more weighted GSVA scores of the patient from the one or more GSVA scores, and summing the one or more weighted GSVA scores to obtain the disease risk score of the patient. For a respective GSVA score of the one or more GSVA scores, the weighted GSVA score is obtained by multiplying the respective GSVA score with its respective weight factor, wherein the respective weight factor is determined based on the contribution of the set of genes from which the respective GSVA score is generated to the classification of the skin of the patient. The set of genes from which the respective GSVA score is generated is the set of genes whose enrichment of expression in the biological sample is the basis on which the respective GSVA score is generated. In certain particular embodiments, the one or more GSVA scores of the patient are binarized, and the binarized GSVA scores are multiplied with the respective weight factors to obtain the weighted GSVA scores. In certain embodiments, binarizing the one or more GSVA scores includes replacing all GSVA scores (e.g., of the one or more GSVA scores) above a threshold value with a first value, and replacing all GSVA scores (e.g., of the one or more GSVA scores) equal to or below the threshold value with a second value. In certain particular embodiments, the threshold value is 0, the first value is 1, and the second value is 0. The one or more GSVA scores can be generated using a method as described above.
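The binarizing, weighting, summing, and reference-value comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: the gene set names, GSVA scores, weight factors, and reference value are hypothetical, and the assumption that a higher risk score indicates the disease state is only one of the two orientations the text permits.

```python
from typing import Dict


def binarize(scores: Dict[str, float], threshold: float = 0.0,
             first_value: float = 1.0, second_value: float = 0.0) -> Dict[str, float]:
    """Replace scores above the threshold with first_value, all others with second_value."""
    return {name: (first_value if score > threshold else second_value)
            for name, score in scores.items()}


def disease_risk_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum of binarized GSVA scores, each multiplied by its weight factor."""
    binarized = binarize(scores)
    return sum(weights[name] * value for name, value in binarized.items())


def is_indicative(risk_score: float, reference_value: float,
                  higher_is_disease: bool = True) -> bool:
    """Compare a risk score to a reference value (orientation is an assumption)."""
    if higher_is_disease:
        return risk_score > reference_value
    return risk_score < reference_value


# Hypothetical GSVA scores and weight factors for three gene sets.
gsva_scores = {"gene_set_a": 0.4, "gene_set_b": -0.2, "gene_set_c": 0.0}
weight_factors = {"gene_set_a": 1.5, "gene_set_b": -0.7, "gene_set_c": 0.3}

risk = disease_risk_score(gsva_scores, weight_factors)
indicative = is_indicative(risk, reference_value=1.0)
```

With a threshold of 0, only `gene_set_a` is binarized to 1, so only its weight factor contributes to the sum.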
In certain embodiments, the weight factors are calculated based on training a machine learning model, wherein the trained machine learning model can classify whether the skin of the patient is indicative of the disease state based on the one or more GSVA scores of the patient. The gene sets from which the one or more GSVA scores of the patient are generated can be features of the machine learning model. The feature coefficients of the features can be the weight factors. The weight factor for a respective GSVA score is the feature coefficient of the gene set (e.g., a feature) from which the GSVA score is generated. The feature coefficient can be the average feature coefficient over the iterations run.
In certain embodiments, the machine learning model is trained with a reference data set. In some embodiments, the reference data set contains a plurality of individual reference data sets. A respective individual reference data set of the plurality of individual reference data sets can contain i) one or more GSVA scores of a respective reference subject, and ii) data regarding whether the respective reference subject has the disease state. The one or more GSVA scores of the respective reference subject can be generated using a method as described above. The plurality of individual reference data sets can be obtained from a plurality of reference biological samples. In certain embodiments, a first portion of the reference biological samples can be obtained and/or derived from reference subjects having the disease state, and a second portion of the reference biological samples can be obtained and/or derived from reference subjects not having the disease state. Oversampling or undersampling correction of the dataset is performed if necessary. In certain embodiments, a first portion of the first portion of the reference biological samples can be obtained and/or derived from reference subjects having lupus disease state; a second portion of the first portion of the reference biological samples can be obtained and/or derived from reference subjects having PSO disease state; a third portion of the first portion of the reference biological samples can be obtained and/or derived from reference subjects having AD disease state; and a fourth portion of the first portion of the reference biological samples can be obtained and/or derived from reference subjects having SSc disease state. In a non-limiting example, the disease risk score is generated using the following method. 
For each reference subject of a reference data set, one GSVA score for each of the 48 Tables (e.g., Tables 4A-1 to 4A-20, and 4B-1 to 4B-28) is generated (e.g., for enrichment of the genes listed in the Tables). A first portion of the reference subjects of the reference data set have lupus disease state, a second portion of the reference subjects of the reference data set have PSO disease state, a third portion of the reference subjects of the reference data set have AD disease state, a fourth portion of the reference subjects of the reference data set have SSc disease state, and a fifth portion of the reference subjects of the reference data set are healthy controls. Oversampling or undersampling correction of the dataset is performed if necessary. The GSVA scores in each sample were binarized, where GSVA scores >0 were replaced with 1, and GSVA scores <0 were replaced with 0. Logistic regression with ridge penalty was performed with the 48 binarized GSVA scores of the samples (e.g., reference subjects). The gene sets listed in the 48 Tables are the features of the machine learning model. Feature coefficients for the features were calculated for each iteration, and final coefficients were obtained by taking the average over all iterations run. The final coefficient of a feature can be the weight factor of the feature (e.g., gene set). To obtain the weighted GSVA score of a binarized GSVA score, the binarized GSVA score is multiplied with the final coefficient of the gene set from which the binarized GSVA score is generated. To calculate a risk score of a reference subject, the weighted GSVA scores of the reference subject are obtained from the binarized GSVA scores of the reference subject, and the weighted GSVA scores are summed to obtain the risk score of the reference subject.
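The non-limiting example above can be sketched in a few lines. This is a minimal pure-Python illustration, not the disclosed implementation: it uses three toy features instead of the 48 gene sets, it assumes that each "iteration" is a fit on a random subsample of the reference subjects, and the function names, penalty strength, learning rate, and toy data are all hypothetical.

```python
import math
import random


def fit_ridge_logistic(X, y, l2=1.0, lr=0.1, epochs=500):
    """Logistic regression with an L2 (ridge) penalty, fit by batch gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        # The ridge term shrinks the coefficients; the intercept is not penalized.
        w = [wj - lr * (gj / n + l2 * wj / n) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


def averaged_coefficients(X, y, iterations=20, fraction=0.8, seed=0):
    """Average the feature coefficients over repeated fits on random subsamples."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    totals = [0.0] * p
    for _ in range(iterations):
        idx = rng.sample(range(n), max(2, int(fraction * n)))
        w = fit_ridge_logistic([X[i] for i in idx], [y[i] for i in idx])
        totals = [t + wj for t, wj in zip(totals, w)]
    return [t / iterations for t in totals]


def risk_score(binarized_scores, weights):
    """Sum of binarized GSVA scores multiplied by their final coefficients."""
    return sum(wj * xj for wj, xj in zip(weights, binarized_scores))


# Toy binarized GSVA scores (rows: reference subjects, columns: 3 gene sets).
X = [[1, 0, 1], [1, 0, 0], [1, 0, 1], [1, 0, 0],   # disease state
     [0, 1, 1], [0, 1, 0], [0, 0, 1], [0, 1, 0]]   # healthy controls
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights = averaged_coefficients(X, y)
disease_like = risk_score([1, 0, 0], weights)
control_like = risk_score([0, 1, 0], weights)
```

In this toy data the first gene set is enriched only in subjects with the disease state and the second only in controls, so the averaged coefficient is positive for the first and negative for the second, and the disease-like profile receives the higher risk score.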
In certain embodiments, the analyzing in step (b′) comprises classifying skin of the patient based on the disease risk score, and if the skin of the patient is indicative of the disease state based on the disease risk score, further analyzing the data set to classify whether the skin of the patient is indicative of i) lupus or ii) AD, PSO and/or SSc disease state.
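The two-stage analysis described above can be sketched as follows. This is an illustrative assumption, not the disclosed method: the function name, the 0.5 cutoff, the assumption that a higher risk score indicates the disease state, and the stand-in second-stage model are all hypothetical.

```python
from typing import Callable, Dict


def two_stage_classification(risk_score: float,
                             reference_value: float,
                             gsva_scores: Dict[str, float],
                             lupus_model: Callable[[Dict[str, float]], float],
                             cutoff: float = 0.5) -> str:
    """Stage 1: disease state vs. not, from the disease risk score.
    Stage 2 (only if stage 1 is positive): lupus vs. AD, PSO and/or SSc.
    """
    # Assumption: a risk score above the reference value indicates the disease state.
    if risk_score <= reference_value:
        return "not indicative of the disease state"
    # The second-stage model returns a confidence value between 0 and 1 for lupus.
    confidence_lupus = lupus_model(gsva_scores)
    return "lupus" if confidence_lupus >= cutoff else "AD, PSO and/or SSc"


# Hypothetical second-stage model: a fixed confidence, for illustration only.
def example_lupus_model(scores: Dict[str, float]) -> float:
    return 0.8 if scores.get("gene_set_a", 0.0) > 0 else 0.2


result = two_stage_classification(
    risk_score=1.5, reference_value=1.0,
    gsva_scores={"gene_set_a": 0.4}, lupus_model=example_lupus_model)
```

The second-stage model is called only when the first stage classifies the skin as indicative of the disease state, mirroring the conditional "further analyzing" step.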
In some aspects, the present disclosure provides a method for assessing skin of a patient. The method can include analyzing a data set to classify the skin of the patient as indicative of a disease state of the patient. The data set can comprise and/or can be derived from gene expression measurements of at least 2 genes selected from the genes listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C and Table 4D, in a biological sample from the patient. In certain embodiments, the at least 2 genes are selected from the genes listed in Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28.
In certain embodiments, the disease state is an inflammatory skin disease state. In certain embodiments, the disease state is a rheumatic skin disease state. In certain embodiments, the disease state is lupus (e.g., systemic lupus erythematosus (SLE)), psoriasis (PSO), atopic dermatitis (AD), and/or systemic sclerosis (scleroderma) (SSc) disease state. The skin of the patient can contain one or more lesions, or can contain no lesion. In certain embodiments, the skin of the patient contains one or more lesions. In certain embodiments, the skin of the patient does not contain a lesion.
In certain embodiments, the lupus is SLE, CLE, DLE, ACLE, SCLE, or CCLE, or any combination thereof. In certain embodiments, the lupus is SLE. In certain embodiments, the lupus is CLE. In certain embodiments, the lupus is DLE. In certain embodiments, the lupus is ACLE. In certain embodiments, the lupus is SCLE. In certain embodiments, the lupus is CCLE. In certain embodiments, the disease state is lupus disease state. In certain embodiments, the disease state is SLE disease state. In certain embodiments, the disease state is CLE disease state. In certain embodiments, the disease state is DLE disease state. In certain embodiments, the disease state is ACLE disease state. In certain embodiments, the disease state is SCLE disease state. In certain embodiments, the disease state is CCLE disease state.
In certain embodiments, the SLE disease state is DLE disease state, ACLE disease state, SCLE disease state, or CCLE disease state, or any combination thereof. In certain embodiments, the SLE disease state is DLE disease state. In certain embodiments, the SLE disease state is SCLE disease state.
In certain embodiments, the disease state is lupus, PSO, AD, and/or SSc, disease state, and the data set is analyzed to classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc, disease state. In certain embodiments, the disease state is lupus, PSO, AD, or SSc, disease state, and the data set is analyzed to classify the skin of the patient as indicative of the lupus, PSO, AD, or SSc, disease state. In certain embodiments, the disease state is lupus disease state, and the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. In certain embodiments, the disease state is lupus disease state, and the data set is analyzed to classify whether the skin of the patient is indicative of a group 1 lupus disease state, group 2 lupus disease state, group 3 lupus disease state, or not having the lupus disease state. Group 1, 2, and 3 lupus disease state can be characterized by gene enrichment analysis corresponding to group 1, 2 and 3 lupus disease, respectively, as described in Example 2, and
In certain embodiments, the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, 365, 370, 375, 380, 385, 390, 395, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, or all (e.g., all genes listed in the Tables), or any value or range there between, genes.
In certain embodiments, the at least 2 genes comprise independently at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, or all, or any value or range there between, genes selected from the genes listed in each of one or more Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28.
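As a non-limiting illustration, assembling a gene panel that draws independently at least a chosen number of genes from each selected Table, with duplicate genes excluded, might be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the choice to take the first listed genes of each Table, and the placeholder gene identifiers are hypothetical and do not reflect the actual contents of any Table.

```python
from typing import Dict, List


def build_panel(tables: Dict[str, List[str]], min_per_table: Dict[str, int]) -> List[str]:
    """Take the first n genes from each selected Table (n may differ per Table)
    and merge them into one panel, keeping a single copy of any gene that
    appears in more than one Table."""
    panel: List[str] = []
    seen = set()
    for table_name, n in min_per_table.items():
        genes = tables[table_name]
        if n > len(genes):
            raise ValueError(f"{table_name} lists only {len(genes)} genes, {n} requested")
        for gene in genes[:n]:
            if gene not in seen:
                seen.add(gene)
                panel.append(gene)
    return panel


# Placeholder Table contents (assumption, for illustration only).
toy_tables = {
    "Table 4A-1": ["G1", "G2", "G3"],
    "Table 4B-8": ["G2", "G4"],
}
toy_panel = build_panel(toy_tables, {"Table 4A-1": 2, "Table 4B-8": 2})
```

Here the gene shared by both placeholder Tables is counted toward each Table's minimum but appears only once in the resulting panel.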
In certain embodiments, the one or more Tables can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48, or any range between 1 and 48, Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. In certain embodiments, all 48 Tables (e.g., Tables 4A-1 to 4A-20 and Tables 4B-1 to 4B-28) are selected. In certain embodiments, the skin of the patient comprises one or more lesions and the Tables selected comprise Table 4B-8, Table 4B-25, Table 4B-14, Table 4A-16, Table 4B-10, Table 4B-28, and Table 4B-23,
In certain embodiments, the patient has lupus, PSO, AD, and/or SSc. In certain embodiments, the patient is suspected of having lupus, PSO, AD, and/or SSc. In certain embodiments, the patient is at elevated risk of having lupus, PSO, AD, and/or SSc. In certain embodiments, the patient is asymptomatic for lupus, PSO, AD, and/or SSc. In certain embodiments, the patient has DLE, and/or SCLE. In certain embodiments, the patient is suspected of having DLE, and/or SCLE. In certain embodiments, the patient is at elevated risk of having DLE, and/or SCLE. In certain embodiments, the patient is asymptomatic for DLE, and/or SCLE. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for lupus. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for PSO. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for AD. In certain embodiments, the patient has, is suspected of having, is at elevated risk of having and/or is asymptomatic for SSc. In certain embodiments, the method can further comprise administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state. The treatment can be configured to treat, reduce severity, and/or reduce risk of having the lupus, PSO, AD, or SSc. In some embodiments, the treatment is configured to treat the lupus, PSO, AD, or SSc. In some embodiments, the treatment is configured to reduce a severity of the lupus, PSO, AD, or SSc. In some embodiments, the treatment is configured to reduce a risk of having the lupus, PSO, AD, or SSc. The treatment can be one or more treatments of lupus, PSO, AD, and/or SSc. 
In certain embodiments, the method can comprise administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the DLE or SCLE disease state. The treatment can be configured to treat, reduce severity, and/or reduce risk of having the DLE or SCLE. In some embodiments, the treatment is configured to treat the DLE or SCLE. In some embodiments, the treatment is configured to reduce a severity of the DLE or SCLE. In some embodiments, the treatment is configured to reduce a risk of having the DLE or SCLE. The treatment can be one or more treatments of DLE or SCLE. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having lupus. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having PSO. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having AD. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having SSc. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having DLE. In certain embodiments, the treatment is configured to treat, reduce severity, and/or reduce risk of having SCLE. In some embodiments, the treatment comprises a pharmaceutical composition.
A treatment used in the context of the present methods may be any known to those of skill in the art for treating, e.g., reducing the severity of or reducing the risk of, the disease state in the patient. In some embodiments, the treatment comprises an immunosuppressive treatment. In some embodiments, the treatment comprises a pharmaceutical composition comprising one or more agents that target and/or inhibit: TNF (e.g., etanercept, infliximab, adalimumab, certolizumab); IL-12/23 (IL23 complex) (e.g., ustekinumab, guselkumab, risankizumab); an interferon or interferon receptor (e.g., anifrolumab, which binds to IFNAR); proteasome (e.g., bortezomib, carfilzomib, ixazomib); CD38 (e.g., daratumumab, isatuximab); SLAMF7 (e.g., elotuzumab); IMPDH (e.g., mycophenolate mofetil); BLyS (e.g., belimumab); CD19 (e.g., inebilizumab); CD20 (e.g., rituximab, obinutuzumab); CD20/CD3 (e.g., glofitamab); NPL4 (e.g., disulfiram); neutrophil elastase (e.g., alvelestat); a growth factor receptor, e.g., FGFR, PDGFR, VEGFR (e.g., nintedanib, pirfenidone); BDCA2 (e.g., BIIB059); or ILT7 (e.g., daxdilimab). In some embodiments, the pharmaceutical composition comprises an agent that targets plasma cells (e.g., bortezomib, carfilzomib, ixazomib, daratumumab, isatuximab, elotuzumab, mycophenolate mofetil), B cells (e.g., belimumab, inebilizumab, rituximab, glofitamab, obinutuzumab), neutrophils (e.g., disulfiram, alvelestat), TGFB fibroblasts (e.g., nintedanib, pirfenidone), and/or dendritic cells (e.g., BIIB059, daxdilimab). In some embodiments, a treatment for DLE comprises an agent that targets plasma cells and/or B cells. In some embodiments, a treatment for psoriasis comprises an agent that targets neutrophils. In some embodiments, a treatment for systemic sclerosis comprises an agent that targets TGFB fibroblasts and/or dendritic cells. In some embodiments, a treatment for atopic dermatitis comprises an agent that targets IL23.
In some embodiments, the treatment can be one or more treatments shown in
The biological sample can comprise a skin biopsy sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof. In certain embodiments, the biological sample comprises a skin biopsy sample, or any derivative thereof. In certain embodiments, the biological sample comprises a blood sample, or any derivative thereof. In certain embodiments, the biological sample comprises isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof.
In certain embodiments, the method further comprises determining a likelihood of the classification of the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient. In certain embodiments, the method further comprises monitoring the skin of the patient, wherein the monitoring comprises assessing the skin of the patient at a plurality of different time points. A difference in the assessment of the skin of the patient among the plurality of time points can be indicative of one or more clinical indications selected from the group consisting of: (i) a diagnosis of the skin of the patient, (ii) a prognosis of the skin of the patient, and (iii) an efficacy or non-efficacy of a course of treatment for treating the skin of the patient.
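As a non-limiting illustration, comparing assessments of the skin across a plurality of time points can be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the time-point labels, and the use of a single numeric score per assessment (for example, a likelihood of the classification) are hypothetical choices, and the score values are invented for illustration.

```python
from typing import List, Sequence, Tuple


def assessment_changes(
    assessments: Sequence[Tuple[str, float]],
) -> List[Tuple[str, str, float]]:
    """Given (time point, score) pairs in chronological order, return
    (earlier time point, later time point, change in score) for each
    consecutive pair of assessments."""
    return [
        (t0, t1, s1 - s0)
        for (t0, s0), (t1, s1) in zip(assessments, assessments[1:])
    ]


# Invented scores at three hypothetical visits (assumption, for illustration only).
toy_changes = assessment_changes([("week 0", 0.75), ("week 12", 0.5), ("week 24", 0.25)])
```

How a given change maps onto a diagnosis, a prognosis, or the efficacy or non-efficacy of a course of treatment is outside the scope of this sketch.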
In certain embodiments, the skin of the patient comprises one or more lesions, and the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-8, Table 4B-25, Table 4B-14, Table 4A-16, Table 4B-22, Table 4B-10, Table 4A-11, Table 4B-16, Table 4B-26, Table 4A-1, Table 4A-19, Table 4A-15, Table 4B-28, Table 4B-15, and Table 4B-23, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-8, Table 4B-25, Table 4B-14, Table 4A-16, Table 4B-22, Table 4B-10, Table 4A-11, Table 4B-16, Table 4B-26, Table 4A-1, Table 4A-19, Table 4A-15, Table 4B-28, Table 4B-15, and Table 4B-23, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-26, Table 4A-8, Table 4A-14, Table 4A-16, Table 4B-11, Table 4A-1, Table 4B-6, Table 4A-10, Table 4B-10, Table 4B-16, Table 4B-2, Table 4B-19, Table 4B-13, Table 4B-1, and Table 4B-25, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between tables selected from Table 4B-26, Table 4A-8, Table 4A-14, Table 4A-16, Table 4B-11, Table 4A-1, Table 4B-6, Table 4A-10, Table 4B-10, Table 4B-16, Table 4B-2, Table 4B-19, Table 4B-13, Table 4B-1, and Table 4B-25, or any combination thereof, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the data set is analyzed to classify the skin of the patient as indicative of the PSO disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-3, Table 4B-25, Table 4B-10, Table 4B-16, Table 4B-8, Table 4B-14, Table 4B-2, Table 4A-7, Table 4B-28, Table 4B-23, Table 4B-20, Table 4B-26, Table 4A-13, Table 4B-18, and Table 4A-16, and iii) the data set is analyzed to classify the skin of the patient as indicative of the PSO disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-3, Table 4B-25, Table 4B-10, Table 4B-16, Table 4B-8, Table 4B-14, Table 4B-2, Table 4A-7, Table 4B-28, Table 4B-23, Table 4B-20, Table 4B-26, Table 4A-13, Table 4B-18, and Table 4A-16, and iii) the data set is analyzed to classify the skin of the patient as indicative of the PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and the data set is analyzed to classify the skin of the patient as indicative of the PSO disease state. In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-1, Table 4B-3, Table 4B-12, Table 4A-14, Table 4A-20, Table 4B-17, Table 4B-20, Table 4B-27, Table 4A-9, Table 4A-15, Table 4A-18, Table 4A-13, Table 4B-26, Table 4B-2, and Table 4A-5, and iii) the data set is analyzed to classify the skin of the patient as indicative of the PSO disease state.
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-1, Table 4B-3, Table 4B-12, Table 4A-14, Table 4A-20, Table 4B-17, Table 4B-20, Table 4B-27, Table 4A-9, Table 4A-15, Table 4A-18, Table 4A-13, Table 4B-26, Table 4B-2, and Table 4A-5, and iii) the data set is analyzed to classify the skin of the patient as indicative of the PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the data set is analyzed to classify the skin of the patient as indicative of the AD disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-10, Table 4B-25, Table 4B-8, Table 4B-22, Table 4B-28, Table 4B-16, Table 4A-16, Table 4B-14, Table 4B-13, Table 4B-23, Table 4B-7, Table 4B-15, Table 4A-12, Table 4B-3, and Table 4B-2, and iii) the data set is analyzed to classify the skin of the patient as indicative of the AD disease state.
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-10, Table 4B-25, Table 4B-8, Table 4B-22, Table 4B-28, Table 4B-16, Table 4A-16, Table 4B-14, Table 4B-13, Table 4B-23, Table 4B-7, Table 4B-15, Table 4A-12, Table 4B-3, and Table 4B-2, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the AD disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and the data set is analyzed to classify the skin of the patient as indicative of the AD disease state. In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-17, Table 4B-28, Table 4A-6, Table 4A-7, Table 4B-2, Table 4B-20, Table 4A-9, Table 4B-18, Table 4A-12, Table 4A-16, Table 4A-13, Table 4B-23, Table 4B-9, Table 4A-3, and Table 4A-10, and iii) the data set is analyzed to classify the skin of the patient as indicative of the AD disease state.
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-17, Table 4B-28, Table 4A-6, Table 4A-7, Table 4B-2, Table 4B-20, Table 4A-9, Table 4B-18, Table 4A-12, Table 4A-16, Table 4A-13, Table 4B-23, Table 4B-9, Table 4A-3, and Table 4A-10, and iii) the data set is analyzed to classify the skin of the patient as indicative of the AD disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the data set is analyzed to classify the skin of the patient as indicative of the SSc disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4A-16, Table 4B-8, Table 4B-25, Table 4B-21, Table 4B-26, Table 4B-10, Table 4B-28, Table 4B-2, Table 4B-27, Table 4B-14, Table 4A-18, Table 4A-6, Table 4A-15, Table 4B-12, and Table 4B-23, and iii) the data set is analyzed to classify the skin of the patient as indicative of the SSc disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4A-16, Table 4B-8, Table 4B-25, Table 4B-21, Table 4B-26, Table 4B-10, Table 4B-28, Table 4B-2, Table 4B-27, Table 4B-14, Table 4A-18, Table 4A-6, Table 4A-15, Table 4B-12, and Table 4B-23, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the SSc disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and the data set is analyzed to classify the skin of the patient as indicative of the SSc disease state.
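As a non-limiting illustration, the 15-Table selections recited above for lesional skin in the lupus, PSO, AD, and SSc embodiments can be held in a simple lookup, and the Tables common to all four selections can be computed by set intersection. This is a minimal sketch: the Table identifiers are copied from the lists above, while the variable and function names are hypothetical.

```python
from typing import Dict, List, Set

# Table selections for lesional skin, as listed in the embodiments above.
LESIONAL_TABLES: Dict[str, List[str]] = {
    "lupus": ["4B-8", "4B-25", "4B-14", "4A-16", "4B-22", "4B-10", "4A-11", "4B-16",
              "4B-26", "4A-1", "4A-19", "4A-15", "4B-28", "4B-15", "4B-23"],
    "PSO": ["4B-3", "4B-25", "4B-10", "4B-16", "4B-8", "4B-14", "4B-2", "4A-7",
            "4B-28", "4B-23", "4B-20", "4B-26", "4A-13", "4B-18", "4A-16"],
    "AD": ["4B-10", "4B-25", "4B-8", "4B-22", "4B-28", "4B-16", "4A-16", "4B-14",
           "4B-13", "4B-23", "4B-7", "4B-15", "4A-12", "4B-3", "4B-2"],
    "SSc": ["4A-16", "4B-8", "4B-25", "4B-21", "4B-26", "4B-10", "4B-28", "4B-2",
            "4B-27", "4B-14", "4A-18", "4A-6", "4A-15", "4B-12", "4B-23"],
}


def shared_tables(table_lists: Dict[str, List[str]]) -> Set[str]:
    """Return the Tables that appear in every disease-specific selection."""
    return set.intersection(*(set(tables) for tables in table_lists.values()))


common_lesional = shared_tables(LESIONAL_TABLES)
```

Under these lists, the intersection contains seven Tables (4B-8, 4B-25, 4B-14, 4A-16, 4B-10, 4B-28, and 4B-23), which are the same seven Tables recited earlier for skin comprising one or more lesions.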
In certain embodiments, the skin of the patient comprises one or more lesions, and the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, and Table 4B-10, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state.
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, Table 4B-23, Table 4B-20, and Table 4B-10, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state.
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, and Table 4B-10, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4B-23, Table 4B-13, Table 4A-17, and Table 4B-20, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13, or any range there between, tables selected from Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4B-13, and Table 4A-17, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 13 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17, or any range there between, tables selected from Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-23, Table 4B-13, Table 4A-17, Table 4B-20, and Table 4B-10, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 17 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient does not comprise a lesion, and the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, and Table 4B-15, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, Table 4B-23, Table 4B-12 and Table 4B-15, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, and Table 4B-15, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-23, Table 4A-10, Table 4B-12, Table 4B-13, and Table 4B-15, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13, or any range there between, tables selected from Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4A-10, Table 4B-13, and Table 4B-15, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 13 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17, or any range there between, tables selected from Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, Table 4B-23, Table 4B-12, and Table 4B-15, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state. In certain particular embodiments, all the 17 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, and Table 4B-7, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, Table 4B-23, and Table 4B-7, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, and Table 4B-7, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, and Table 4B-23, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16, or any range there between, tables selected from Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, Table 4B-7, and Table 4B-23, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 16 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, or any range there between, tables selected from Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, and Table 4B-5, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 14 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient does not comprise a lesion, and the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, and Table 4A-10, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, Table 4A-5, Table 4B-17, Table 4B-12, Table 4A-3, and Table 4A-10, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, and Table 4A-10, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-22, Table 4A-5, Table 4B-17, Table 4B-12, and Table 4A-3, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or 11, or any range there between, tables selected from Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, and Table 4B-22, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 11 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19, or any range there between, tables selected from Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, Table 4A-5, Table 4B-17, Table 4B-12, Table 4A-3, and Table 4A-10, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state. In certain particular embodiments, all the 19 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, and Table 4A-8, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, Table 4A-1 and Table 4A-8, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, and Table 4A-8, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-1, Table 4B-13, and Table 4A-8, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, or any range there between, tables selected from Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4B-13, and Table 4A-8, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. In certain particular embodiments, all the 14 Tables (e.g., listed in the previous sentence) are selected. 
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16, or any range there between, tables selected from Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4A-1, Table 4B-13, and Table 4A-8, and iii) the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state. In certain particular embodiments, all the 16 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient does not comprise a lesion, and the data set is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state.
In certain embodiments, i) the skin of the patient does not comprise a lesion, and the data set is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state. In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4B-1, Table 4B-14, Table 4B-3, Table 4B-7, Table 4B-17, Table 4A-9, Table 4B-12, Table 4A-4, Table 4B-10, Table 4A-14, Table 4B-20, Table 4B-22, Table 4B-16, Table 4B-13, and Table 4A-11 and iii) the data set is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state. 
In certain embodiments, i) the skin of the patient does not comprise a lesion, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4B-1, Table 4B-14, Table 4B-3, Table 4B-7, Table 4B-17, Table 4A-9, Table 4B-12, Table 4A-4, Table 4B-10, Table 4A-14, Table 4B-20, Table 4B-22, Table 4B-16, Table 4B-13, and Table 4A-11, and iii) in step (b) the data set is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the data set is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state.
In certain embodiments, the skin of the patient comprises one or more lesions, and the data set is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state. In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, or all or any range or value there between, genes selected from the genes listed in Table 4A-16, Table 4B-26, Table 4B-25, Table 4B-2, Table 4B-22, Table 4B-14, Table 4A-13, Table 4A-15, Table 4B-4, Table 4B-9, Table 4A-10, Table 4A-12, Table 4B-6, Table 4B-1, and Table 4A-5, and iii) the data set is analyzed to classify the skin of the patient as indicative of the discoid lupus erythematosus (DLE) or Subacute cutaneous lupus erythematosus (SCLE) disease state.
In certain embodiments, i) the skin of the patient comprises one or more lesions, ii) the at least 2 genes comprise independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, or all or any range or value there between, genes selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or any range there between, tables selected from Table 4A-16, Table 4B-26, Table 4B-25, Table 4B-2, Table 4B-22, Table 4B-14, Table 4A-13, Table 4A-15, Table 4B-4, Table 4B-9, Table 4A-10, Table 4A-12, Table 4B-6, Table 4B-1, Table 4A-5, or any combination thereof, and iii) the data set is analyzed to classify the skin of the patient as indicative of the discoid lupus erythematosus (DLE) or Subacute cutaneous lupus erythematosus (SCLE) disease state. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient does not comprise a lesion, and the data set is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state. In certain embodiments, the skin of the patient does not comprise a lesion, and the data set is analyzed to classify the skin of the patient as indicative of the PSO or SSc disease state. In certain embodiments, the skin of the patient does not comprise a lesion, and the data set is analyzed to classify the skin of the patient as indicative of the AD or SSc disease state. In certain embodiments, the skin of the patient comprises one or more lesions, and the data set is analyzed to classify the skin of the patient as indicative of the PSO or SSc disease state. In certain embodiments, the skin of the patient comprises one or more lesions, and the data set is analyzed to classify the skin of the patient as indicative of the AD or SSc disease state.
The data set can be generated from the biological sample from the patient. For example, nucleic acid molecules of the patient in the biological sample can be assessed to obtain the data set. In certain embodiments, the gene expression measurements of the at least 2 genes in the biological sample can be performed using any suitable method known to those of skill in the art, including but not limited to DNA sequencing, RNA sequencing (RNA-Seq), microarray analysis, qPCR, northern blotting, fluorescent in situ hybridization, serial analysis of gene expression, tiling arrays, or any combination thereof, to obtain the data set. In certain embodiments, the data set can be derived from the gene expression measurement data of the biological sample, wherein the gene expression measurement data is analyzed using a suitable data analysis tool including but not limited to a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), an enrichment algorithm, Z score, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log2 expression analysis, or any combination thereof, to obtain the data set. In certain embodiments, the gene expression measurement data of the biological sample can be analyzed using GSVA to obtain the data set.
In certain embodiments, the data set comprises an enrichment score of the patient, wherein the enrichment score is derived from the gene expression measurement data of the at least 2 genes in the biological sample. The enrichment score is derived from the gene expression measurement data using the suitable data analysis tool. The enrichment score can be obtained by assessing enrichment of expression of the at least 2 genes in the biological sample. In certain embodiments, the enrichment score comprises one or more Table specific enrichment scores of the patient, wherein the one or more Table specific enrichment scores are generated using one or more of the Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28, wherein for a respective selected Table, at least one Table specific enrichment score of the patient is generated for enrichment of expression of at least 2 genes listed in the respective Table, in the biological sample. The one or more Table specific enrichment scores of the patient comprises the at least one Table specific enrichment score from each of the selected Table. The at least 2 genes of the data set can comprise the at least 2 genes from each of the selected table. 
In certain embodiments, the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48, or 1 to 48, or any range there between, Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. In certain embodiments, all the 48 Tables, e.g., Tables 4A-1 to 4A-20 and Tables 4B-1 to 4B-28, are selected. In certain embodiments, independently for each selected Table of the one or more Tables, the at least one Table specific enrichment score from the Table is generated for enrichment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, or 300, or all or any range or value there between, genes selected from the genes listed in the Table, in the biological sample.
In certain embodiments, for each selected Table, one Table specific enrichment score is generated, and the one or more Table specific enrichment scores of the patient comprise the one Table specific enrichment score from each selected Table. In certain embodiments, the Table specific enrichment scores are GSVA scores, and are obtained using GSVA. In certain embodiments, the GSVA scores can be Z-score GSVA scores.
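As a non-limiting illustration of how one score per selected Table may be derived, the following minimal Python sketch standardizes each gene across samples and averages the standardized values of the genes belonging to each Table. This mean Z-score is a simplified stand-in and is not the GSVA algorithm itself. The function names, the `min_genes` parameter, and the data layout are hypothetical and are used only for illustration.

```python
from statistics import mean, pstdev


def zscore_genes(expr):
    """Standardize each gene across samples.

    expr maps a gene identifier to a list of expression values,
    one value per sample (hypothetical layout, assumed here).
    """
    standardized = {}
    for gene, values in expr.items():
        mu = mean(values)
        sd = pstdev(values)
        # A gene with no variation carries no signal; score it as 0.
        standardized[gene] = [0.0 if sd == 0 else (v - mu) / sd for v in values]
    return standardized


def table_scores(expr, tables, min_genes=2):
    """Return one score per sample for each Table (gene list).

    The score is the mean Z-score of the Table's measured genes, a
    simplified stand-in for an enrichment score (not GSVA). Tables with
    fewer than min_genes measured genes are skipped.
    """
    standardized = zscore_genes(expr)
    n_samples = len(next(iter(expr.values())))
    scores = {}
    for name, genes in tables.items():
        # Drop duplicate entries and genes that were not measured.
        present = sorted(set(genes) & set(standardized))
        if len(present) < min_genes:
            continue
        scores[name] = [
            mean(standardized[g][i] for g in present) for i in range(n_samples)
        ]
    return scores
```

In this sketch a Table contributes a score only when at least two of its genes were measured, mirroring the "at least 2 genes" condition above.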
In certain embodiments, the data set is derived using GSVA, wherein the data set comprises one or more GSVA scores of the patient, wherein the one or more GSVA scores are generated using one or more of the Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, or any combination thereof, wherein for a respective selected Table, at least one GSVA score of the patient is generated for enrichment of expression of at least 2 genes listed in the respective Table, in the biological sample. The one or more GSVA scores of the patient comprises the at least one GSVA score from each of the selected Table. The at least 2 genes of the data set can comprise the at least 2 genes from each of the selected table. 
In certain embodiments, the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48, or 1 to 48, or any range there between, Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. In certain embodiments, all the 48 Tables, e.g., Tables 4A-1 to 4A-20 and Tables 4B-1 to 4B-28, are selected. In certain embodiments, independently for each selected Table of the one or more Tables, the at least one GSVA score from the Table is generated for enrichment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, or 300, or all or any range or value there between, genes selected from the genes listed in the respective Table, in the biological sample. In certain embodiments, for each selected Table, one GSVA score is generated, and the one or more GSVA scores of the patient comprise the one GSVA score from each selected Table.
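The per-Table gene selection described above can be sketched, under stated assumptions, as a small helper that keeps only the measured genes of each selected Table, checks that each selected Table contributes at least a minimum number of genes, and counts overlapping genes once when the Tables are combined. The function names, the `min_genes` default, and the input layout are hypothetical and are not taken from any particular software package.

```python
def genes_for_selected_tables(tables, selected, measured, min_genes=2):
    """Return {table name: usable genes} for each selected Table.

    tables   : dict mapping a Table name to its list of genes
    selected : iterable of Table names chosen for the analysis
    measured : set of genes with expression measurements in the sample
    Raises ValueError if a selected Table has fewer than min_genes
    measured genes (the threshold is an illustrative assumption).
    """
    usable = {}
    for name in selected:
        # dict.fromkeys removes duplicate entries while keeping order.
        genes = [g for g in dict.fromkeys(tables[name]) if g in measured]
        if len(genes) < min_genes:
            raise ValueError(
                f"{name}: only {len(genes)} measured gene(s), need {min_genes}"
            )
        usable[name] = genes
    return usable


def combined_gene_set(usable):
    """Union of genes across the selected Tables, each gene counted once."""
    return sorted({g for genes in usable.values() for g in genes})
```

The threshold is applied to each selected Table independently, so one Table with many measured genes does not compensate for another Table with too few.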
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-8, Table 4B-25, Table 4B-14, Table 4A-16, Table 4B-22, Table 4B-10, Table 4A-11, Table 4B-16, Table 4B-26, Table 4A-1, Table 4A-19, Table 4A-15, Table 4B-28, Table 4B-15, and Table 4B-23, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-8, Table 4B-25, Table 4B-14, Table 4A-16, Table 4B-22, Table 4B-10, Table 4A-11, Table 4B-16, Table 4B-26, Table 4A-1, Table 4A-19, Table 4A-15, Table 4B-28, Table 4B-15, and Table 4B-23, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-26, Table 4A-8, Table 4A-14, Table 4A-16, Table 4B-11, Table 4A-1, Table 4B-6, Table 4A-10, Table 4B-10, Table 4B-16, Table 4B-2, Table 4B-19, Table 4B-13, Table 4B-1, and Table 4B-25, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-26, Table 4A-8, Table 4A-14, Table 4A-16, Table 4B-11, Table 4A-1, Table 4B-6, Table 4A-10, Table 4B-10, Table 4B-16, Table 4B-2, Table 4B-19, Table 4B-13, Table 4B-1, and Table 4B-25, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient contains one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient. In certain embodiments, the skin of the patient contains one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-10, Table 4B-25, Table 4B-8, Table 4B-22, Table 4B-28, Table 4B-16, Table 4A-16, Table 4B-14, Table 4B-13, Table 4B-23, Table 4B-7, Table 4B-15, Table 4A-12, Table 4B-3, and Table 4B-2, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient. In certain embodiments, the skin of the patient contains one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-10, Table 4B-25, Table 4B-8, Table 4B-22, Table 4B-28, Table 4B-16, Table 4A-16, Table 4B-14, Table 4B-13, Table 4B-23, Table 4B-7, Table 4B-15, Table 4A-12, Table 4B-3, and Table 4B-2, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the AD disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-17, Table 4B-28, Table 4A-6, Table 4A-7, Table 4B-2, Table 4B-20, Table 4A-9, Table 4B-18, Table 4A-12, Table 4A-16, Table 4A-13, Table 4B-23, Table 4B-9, Table 4A-3, and Table 4A-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-17, Table 4B-28, Table 4A-6, Table 4A-7, Table 4B-2, Table 4B-20, Table 4A-9, Table 4B-18, Table 4A-12, Table 4A-16, Table 4A-13, Table 4B-23, Table 4B-9, Table 4A-3, and Table 4A-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the AD disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-3, Table 4B-25, Table 4B-10, Table 4B-16, Table 4B-8, Table 4B-14, Table 4B-2, Table 4A-7, Table 4B-28, Table 4B-23, Table 4B-20, Table 4B-26, Table 4A-13, Table 4B-18, and Table 4A-16, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-3, Table 4B-25, Table 4B-10, Table 4B-16, Table 4B-8, Table 4B-14, Table 4B-2, Table 4A-7, Table 4B-28, Table 4B-23, Table 4B-20, Table 4B-26, Table 4A-13, Table 4B-18, and Table 4A-16, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the PSO disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4B-3, Table 4B-12, Table 4A-14, Table 4A-20, Table 4B-17, Table 4B-20, Table 4B-27, Table 4A-9, Table 4A-15, Table 4A-18, Table 4A-13, Table 4B-26, Table 4B-2, and Table 4A-5, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4B-3, Table 4B-12, Table 4A-14, Table 4A-20, Table 4B-17, Table 4B-20, Table 4B-27, Table 4A-9, Table 4A-15, Table 4A-18, Table 4A-13, Table 4B-26, Table 4B-2, and Table 4A-5, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the PSO disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the SSc disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-16, Table 4B-8, Table 4B-25, Table 4B-21, Table 4B-26, Table 4B-10, Table 4B-28, Table 4B-2, Table 4B-27, Table 4B-14, Table 4A-18, Table 4A-6, Table 4A-15, Table 4B-12, and Table 4B-23, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the SSc disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-16, Table 4B-8, Table 4B-25, Table 4B-21, Table 4B-26, Table 4B-10, Table 4B-28, Table 4B-2, Table 4B-27, Table 4B-14, Table 4A-18, Table 4A-6, Table 4A-15, Table 4B-12, and Table 4B-23, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the SSc disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the SSc disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected. In certain particular embodiments, for the embodiments described in this paragraph, the Tables selected include at least Tables 4B-8 and 4B-10.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the SSc disease state of the patient.
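As a non-limiting illustration of how a vector of Table specific scores may be analyzed to classify a sample, the following minimal Python sketch uses a nearest-centroid rule. The nearest-centroid classifier is an assumption chosen for brevity and is not stated here to be the classifier of the invention. The function names, labels, and score values are hypothetical.

```python
from math import dist


def fit_centroids(samples):
    """Compute the mean score vector (centroid) for each label.

    samples is a list of (label, score_vector) pairs, where each score
    vector holds one score per selected Table in a fixed order.
    """
    grouped = {}
    for label, vector in samples:
        grouped.setdefault(label, []).append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(centroids, vector):
    """Return the label whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Hypothetical reference samples scored on two selected Tables.
reference = [
    ("disease", [1.0, 0.8]),
    ("disease", [0.9, 1.1]),
    ("control", [-1.0, -0.9]),
    ("control", [-0.8, -1.1]),
]
centroids = fit_centroids(reference)
predicted = classify(centroids, [0.7, 0.9])
```

Any classifier that accepts a fixed-length score vector could be substituted for the nearest-centroid rule in this sketch.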
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, and Table 4B-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17, or 1 to 17, or any range there between, Tables selected from the group consisting of Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, Table 4B-23, Table 4B-20, and Table 4B-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. 
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4B-23, Table 4B-13, Table 4A-17, and Table 4B-20, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13, or 1 to 13, or any range there between, Tables selected from the group consisting of Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4B-13, and Table 4A-17, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17, or 1 to 17, or any range there between, Tables selected from the group consisting of Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, Table 4B-23, Table 4B-20, and Table 4B-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus or AD disease state of the patient.
In certain particular embodiments, all the 17 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, and Table 4B-15, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-23, Table 4A-10, Table 4B-12, Table 4B-13, and Table 4B-15, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13, or 1 to 13, or any range there between, Tables selected from the group consisting of Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4A-10, Table 4B-13, and Table 4B-15, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17, or 1 to 17, or any range there between, Tables selected from the group consisting of Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, Table 4B-23, Table 4B-12, and Table 4B-15, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient. 
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17, or 1 to 17, or any range there between, Tables selected from the group consisting of Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, Table 4B-23, Table 4B-12, and Table 4B-15, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus or AD disease state of the patient. In certain particular embodiments, all the 17 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, and Table 4B-7, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, and Table 4B-23, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. 
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, or 1 to 14, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, and Table 4B-5, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16, or 1 to 16, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, Table 4B-23, and Table 4B-7, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16, or 1 to 16, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, Table 4B-23, and Table 4B-7, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus or PSO disease state of the patient.
In certain particular embodiments, all the 16 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, and Table 4A-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4A-5, Table 4B-17, Table 4B-12, Table 4A-3, and Table 4B-22, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or 11, or 1 to 11, or any range there between, Tables selected from the group consisting of Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, and Table 4B-22, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. 
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19, or 1 to 19, or any range there between, Tables selected from the group consisting of Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, and Table 4A-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19 Tables selected from the group consisting of Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, Table 4A-5, Table 4B-17, Table 4B-12, Table 4A-3, and Table 4A-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus or PSO disease state of the patient. In certain particular embodiments, all the 19 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, and Table 4A-8, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-1, Table 4B-13, and Table 4A-8, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient.
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, or 1 to 14, or any range there between, Tables selected from the group consisting of Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4B-13, and Table 4A-8, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16, or 1 to 16, or any range there between, Tables selected from the group consisting of Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, Table 4A-1, and Table 4A-8, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16, or 1 to 16, or any range there between, Tables selected from the group consisting of Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, Table 4A-1, and Table 4A-8, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the lupus or SSc disease state of the patient.
In certain particular embodiments, all the 16 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient.
In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4B-14, Table 4B-3, Table 4B-7, Table 4B-17, Table 4A-9, Table 4B-12, Table 4A-4, Table 4B-10, Table 4A-14, Table 4B-20, Table 4B-22, Table 4B-16, Table 4B-13, and Table 4A-11, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state of the patient. In certain embodiments, the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4B-1, Table 4B-14, Table 4B-3, Table 4B-7, Table 4B-17, Table 4A-9, Table 4B-12, Table 4A-4, Table 4B-10, Table 4A-14, Table 4B-20, Table 4B-22, Table 4B-16, Table 4B-13, and Table 4A-11, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the AD or PSO disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD or PSO disease state of the patient.
In certain embodiments, the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-16, Table 4B-26, Table 4B-25, Table 4B-2, Table 4B-22, Table 4B-14, Table 4A-13, Table 4A-15, Table 4B-4, Table 4B-9, Table 4A-10, Table 4A-12, Table 4B-6, Table 4B-1, and Table 4A-5, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient. In certain embodiments, the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or 1 to 15, or any range there between, Tables selected from the group consisting of Table 4A-16, Table 4B-26, Table 4B-25, Table 4B-2, Table 4B-22, Table 4B-14, Table 4A-13, Table 4A-15, Table 4B-4, Table 4B-9, Table 4A-10, Table 4A-12, Table 4B-6, Table 4B-1, and Table 4A-5, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient, and the treatment is administered based at least in part on the classification of the skin of the patient as indicative of the DLE or SCLE disease state of the patient. In certain particular embodiments, all the 15 Tables (e.g., listed in the previous sentence) are selected.
In certain embodiments, analyzing the data set includes providing the data set as an input to a trained machine learning model, wherein the trained machine learning model generates an inference indicating whether the skin of the patient is indicative of the disease state, based on the data set. In certain embodiments, the method further includes receiving, as an output of the trained machine learning model, the inference indicating whether the skin of the patient is indicative of the disease state; and/or electronically outputting a report indicating whether the skin of the patient is indicative of the disease state. In certain embodiments, the trained machine learning model is trained to generate the inference of whether the skin of the patient is indicative of the disease state, based at least on the one or more GSVA scores of the patient. The one or more gene sets from which the one or more GSVA scores are generated can be the features of the trained machine learning model. For a respective GSVA score, the gene set from which the respective GSVA score is generated is the set of genes whose enrichment of expression in the biological sample is the basis on which the respective GSVA score is generated. In certain embodiments, the one or more GSVA scores of the patient is provided as an input to the trained machine learning model. In certain embodiments, the trained machine learning model generates the inference indicating whether the skin of the patient is indicative of the lupus disease state based on the one or more GSVA scores of the patient, and the method can classify whether the skin of the patient is indicative of the lupus disease state. In certain embodiments, the trained machine learning model generates the inference indicating whether the skin of the patient is indicative of the PSO disease state based on the one or more GSVA scores of the patient, and the method can classify whether the skin of the patient is indicative of the PSO disease state.
In certain embodiments, the trained machine learning model generates the inference indicating whether the skin of the patient is indicative of the AD disease state based on the one or more GSVA scores of the patient, and the method can classify whether the skin of the patient is indicative of the AD disease state. In certain embodiments, the trained machine learning model generates the inference indicating whether the skin of the patient is indicative of the SSc disease state based on the one or more GSVA scores of the patient, and the method can classify whether the skin of the patient is indicative of the SSc disease state. In certain embodiments, the trained machine learning model generates the inference indicating whether the skin of the patient is indicative of the lupus or PSO disease state based on the one or more GSVA scores of the patient, and the method can classify whether the skin of the patient is indicative of the lupus or PSO disease state. In certain embodiments, the trained machine learning model generates the inference indicating whether the skin of the patient is indicative of the lupus or AD disease state based on the one or more GSVA scores of the patient, and the method can classify whether the skin of the patient is indicative of the lupus or AD disease state. In certain embodiments, the trained machine learning model generates the inference indicating whether the skin of the patient is indicative of the lupus or SSc disease state based on the one or more GSVA scores of the patient, and the method can classify whether the skin of the patient is indicative of the lupus or SSc disease state. In certain embodiments, the trained machine learning model generates the inference indicating whether the skin of the patient is indicative of the PSO or AD disease state based on the one or more GSVA scores of the patient, and the method can classify whether the skin of the patient is indicative of the PSO or AD disease state. 
In certain embodiments, the trained machine learning model generates the inference indicating whether the skin of the patient is indicative of the DLE or SCLE disease state based on the one or more GSVA scores of the patient, and the method can classify whether the skin of the patient is indicative of the DLE or SCLE disease state. The trained machine learning model can be trained using linear regression, logistic regression, Ridge regression, Lasso regression, elastic net (EN) regression, support vector machine (SVM), gradient boosted machine (GBM), k nearest neighbors (kNN), generalized linear model (GLM), naïve Bayes (NB) classifier, neural network, Random Forest (RF), deep learning algorithm, linear discriminant analysis (LDA), decision tree learning (DTREE), adaptive boosting (ADB), Classification and Regression Tree (CART), Hierarchical clustering, or any combination thereof.
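As a non-limiting illustration of one of the listed techniques, the following minimal sketch fits a logistic regression model by gradient descent, using GSVA scores as features, and then generates an inference for a patient. The function names, the learning rate, the number of epochs, the two-feature layout, and all numeric scores are hypothetical values chosen for illustration only; they are not taken from the Examples or Tables.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def infer(weights, bias, gsva_scores):
    """Return a value in (0, 1): the model's confidence for class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, gsva_scores)) + bias)

# Hypothetical reference GSVA scores for two gene-set features.
# Label 1 = first disease state, label 0 = second disease state.
reference_scores = [
    [0.6, -0.4], [0.5, -0.2], [0.7, -0.5], [0.4, -0.3],
    [-0.5, 0.3], [-0.6, 0.5], [-0.3, 0.4], [-0.4, 0.2],
]
reference_labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_logistic(reference_scores, reference_labels)
patient_confidence = infer(weights, bias, [0.55, -0.35])
print(round(patient_confidence, 3))
```

Logistic regression is used here only because it is compact; any of the other listed algorithms could take the same GSVA-score feature vectors as input.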
The inference of the machine learning classifier can include a confidence value between 0 and 1. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the patient has the disease state. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has lupus disease state. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has PSO disease state. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has AD disease state. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has SSc disease state. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has DLE disease state. In certain embodiments, the confidence value of the inference of the machine learning classifier is between 0 and 1, such as, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1, or any value or ranges there between, that the subject has SCLE disease state.
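As a non-limiting illustration, a confidence value between 0 and 1 can be converted into a classification and a short report entry as sketched below. The function name, the report field names, and the default cut-off of 0.5 are assumptions made for this example; the cut-off used in practice would depend on the desired sensitivity and specificity.

```python
def classify_from_confidence(confidence, disease_state, threshold=0.5):
    """Map a confidence value in [0, 1] to a report entry.

    `threshold` is a hypothetical cut-off chosen for illustration.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return {
        "disease_state": disease_state,
        "confidence": confidence,
        "indicative": confidence >= threshold,
    }

# Example: a confidence of 0.8 that the skin is indicative of lupus.
report = classify_from_confidence(0.8, "lupus")
print(report)
```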
In certain embodiments, the machine learning model can be trained according to a method described herein, e.g., using the method of steps (a″), (b″), (c″), (d″), (e″), and/or (f″). The trained machine learning model can generate the inference based at least on comparing the data set to a reference data set. The trained machine learning model can be trained using the reference data set, wherein a first portion of the reference data set can be used as a training data set, and a second portion of the reference data set can be used as a validation data set. The reference data set can comprise and/or be derived from gene expression measurements of reference biological samples of at least 2 genes selected from the genes listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, and Table 4D.
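As a non-limiting illustration, the division of a reference data set into a first portion used for training and a second portion used for validation can be sketched as follows. The function name, the 25% validation fraction, the fixed random seed, and the sample identifiers are assumptions made for this example.

```python
import random

def split_reference(samples, validation_fraction=0.25, seed=0):
    """Shuffle reference samples and split them into two portions.

    Returns (training_portion, validation_portion). The fraction and
    seed are hypothetical defaults chosen for illustration.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    if len(samples) < 2:
        raise ValueError("need at least 2 reference samples to split")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one sample in each portion.
    n_val = min(len(shuffled) - 1, max(1, round(len(shuffled) * validation_fraction)))
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical reference sample identifiers.
reference_ids = ["ref_%02d" % i for i in range(8)]
training_ids, validation_ids = split_reference(reference_ids)
print(len(training_ids), len(validation_ids))
```

Fixing the seed makes the split reproducible, so the same samples are held out each time the model is retrained.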
In certain embodiments, the at least 2 genes of the reference data set are selected from the genes listed in Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. In certain embodiments, the reference data set can be derived from the gene expression measurement data of the reference biological samples, wherein the gene expression measurement data is analyzed using a suitable data analysis tool including but not limited to a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), enrichment algorithm, Z score, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log 2 expression analysis, or any combination thereof, to obtain the reference data set. In certain embodiments, the gene expression measurement data of the reference biological samples can be analyzed using GSVA, to obtain the reference data set.
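As a non-limiting illustration of deriving a per-sample gene-set score from gene expression measurements, the sketch below uses the simplest of the listed options, a Z score averaged over the genes of a gene set. It is not an implementation of the GSVA algorithm, which uses a rank-based enrichment statistic; it is a deliberately simplified stand-in. The gene names, sample identifiers, and expression values are hypothetical.

```python
from statistics import mean, pstdev

def gene_set_z_scores(expression, gene_set):
    """Mean per-gene Z score for each sample (a simplified stand-in, not GSVA).

    expression: {sample_id: {gene: expression_value}}
    gene_set:   iterable of gene names, e.g. the genes listed in one Table
    """
    samples = list(expression)
    # Use only the genes of the set that were measured in every sample.
    genes = [g for g in gene_set if all(g in expression[s] for s in samples)]
    if len(genes) < 2:
        raise ValueError("need at least 2 measured genes from the gene set")
    per_gene_z = {}
    for g in genes:
        values = [expression[s][g] for s in samples]
        mu, sd = mean(values), pstdev(values)
        # A gene with no variation across samples contributes 0.
        per_gene_z[g] = {
            s: (expression[s][g] - mu) / sd if sd > 0 else 0.0 for s in samples
        }
    return {s: mean(per_gene_z[g][s] for g in genes) for s in samples}

# Hypothetical log2 expression values for three reference samples.
expression = {
    "s1": {"GENE_A": 8.0, "GENE_B": 7.5, "GENE_C": 2.0},
    "s2": {"GENE_A": 5.0, "GENE_B": 5.5, "GENE_C": 2.1},
    "s3": {"GENE_A": 2.0, "GENE_B": 3.5, "GENE_C": 1.9},
}
scores = gene_set_z_scores(expression, ["GENE_A", "GENE_B"])
print(scores)
```

One score per sample per selected gene set yields the kind of feature vector that a classifier can take as input.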
In certain embodiments, the reference data set is obtained using GSVA, wherein the reference data set comprises one or more GSVA scores from the reference biological samples, wherein for a respective reference biological sample the one or more GSVA scores are generated using one or more of the Tables selected from Tables 4A-1 to 4A-20, and Tables 4B-1 to 4B-28, wherein for a respective selected Table, at least one GSVA score of the respective reference biological sample is generated for enrichment of expression of at least 2 genes listed in the respective Table, in the respective reference biological sample. The one or more GSVA scores can comprise the at least one GSVA score from each of the selected Tables. The at least 2 genes of the reference data set can comprise the at least 2 genes from each of the selected Tables. In certain embodiments, the at least 2 genes of the data set, and the at least 2 genes of the reference data set can at least partially overlap (e.g., be the same). In certain embodiments, the selected Tables of the data set (e.g., from which the one or more GSVA scores of the data set is generated), and the selected Tables of the reference data set (e.g., from which the one or more GSVA scores of the reference data set is generated) can at least partially overlap (e.g., be the same). The reference biological samples can be obtained or derived from a plurality of reference subjects. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of reference biological samples obtained or derived from reference subjects not having lupus disease state, wherein the skin of the reference subjects having lupus disease state contains one or more lesions.
In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of reference biological samples obtained or derived from reference subjects not having lupus disease state, wherein the skin of the reference subjects having lupus disease state does not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having PSO disease state, and a second plurality of reference biological samples obtained or derived from reference subjects not having PSO disease state, wherein the skin of the reference subjects having PSO disease state contains one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having PSO disease state, and a second plurality of reference biological samples obtained or derived from reference subjects not having PSO disease state, wherein the skin of the reference subjects having PSO disease state does not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having AD disease state, and a second plurality of reference biological samples obtained or derived from reference subjects not having AD disease state, wherein the skin of the reference subjects having AD disease state contains one or more lesions.
In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having AD disease state, and a second plurality of reference biological samples obtained or derived from reference subjects not having AD disease state, wherein the skin of the reference subjects having AD disease state does not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having SSc disease state, and a second plurality of reference biological samples obtained or derived from reference subjects not having SSc disease state, wherein the skin of the reference subjects having SSc disease state contains one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having SSc disease state, and a second plurality of reference biological samples obtained or derived from reference subjects not having SSc disease state, wherein the skin of the reference subjects having SSc disease state does not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of reference biological samples obtained or derived from reference subjects having PSO disease state, wherein the skin of the reference subjects contains one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of reference biological samples obtained or derived from reference subjects having PSO disease state, wherein the skin of the reference subjects does not contain a lesion.
In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of reference biological samples obtained or derived from reference subjects having AD disease state, wherein the skin of the reference subjects contains one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of reference biological samples obtained or derived from reference subjects having AD disease state, wherein the skin of the reference subjects does not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of reference biological samples obtained or derived from reference subjects having SSc disease state, wherein the skin of the reference subjects contains one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having lupus disease state, and a second plurality of reference biological samples obtained or derived from reference subjects having SSc disease state, wherein the skin of the reference subjects does not contain a lesion. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having AD disease state, and a second plurality of reference biological samples obtained or derived from reference subjects having PSO disease state, wherein the skin of the reference subjects does not contain a lesion. 
In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having AD disease state, and a second plurality of reference biological samples obtained or derived from reference subjects having PSO disease state, wherein the skin of the reference subjects contains one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having DLE disease state, and a second plurality of reference biological samples obtained or derived from reference subjects having SCLE disease state, wherein the skin of the reference subjects contains one or more lesions. In certain embodiments, the reference biological samples comprise a first plurality of reference biological samples obtained or derived from reference subjects having DLE disease state, and a second plurality of reference biological samples obtained or derived from reference subjects having SCLE disease state, wherein the skin of the reference subjects does not contain a lesion. The reference biological samples can comprise a skin biopsy sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof. The patient can be a human. The reference subjects can be humans.
The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state with a sensitivity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state with a specificity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. 
The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state with a positive predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state with a negative predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The method can classify the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state with a receiver operating characteristic (ROC) curve having an Area-Under-Curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than about 0.99.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with an accuracy of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with an accuracy of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%. 
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with an accuracy of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with an accuracy of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a sensitivity of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a sensitivity of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%. 
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a sensitivity of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a sensitivity of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a specificity of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a specificity of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%. 
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a specificity of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a specificity of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a positive predictive value of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a positive predictive value of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%. 
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a positive predictive value of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a positive predictive value of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%.
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a negative predictive value of about 70% to about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a negative predictive value of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to about 100%. 
In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a negative predictive value of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. In certain embodiments, the method classifies the skin of the patient as indicative of the disease state of the patient with a negative predictive value of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%. In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of about 0.7 to about 1.
In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of about 0.7 to about 0.75, about 0.7 to about 0.8, about 0.7 to about 0.85, about 0.7 to about 0.9, about 0.7 to about 0.925, about 0.7 to about 0.95, about 0.7 to about 0.96, about 0.7 to about 0.97, about 0.7 to about 0.98, about 0.7 to about 0.99, about 0.7 to about 1, about 0.75 to about 0.8, about 0.75 to about 0.85, about 0.75 to about 0.9, about 0.75 to about 0.925, about 0.75 to about 0.95, about 0.75 to about 0.96, about 0.75 to about 0.97, about 0.75 to about 0.98, about 0.75 to about 0.99, about 0.75 to about 1, about 0.8 to about 0.85, about 0.8 to about 0.9, about 0.8 to about 0.925, about 0.8 to about 0.95, about 0.8 to about 0.96, about 0.8 to about 0.97, about 0.8 to about 0.98, about 0.8 to about 0.99, about 0.8 to about 1, about 0.85 to about 0.9, about 0.85 to about 0.925, about 0.85 to about 0.95, about 0.85 to about 0.96, about 0.85 to about 0.97, about 0.85 to about 0.98, about 0.85 to about 0.99, about 0.85 to about 1, about 0.9 to about 0.925, about 0.9 to about 0.95, about 0.9 to about 0.96, about 0.9 to about 0.97, about 0.9 to about 0.98, about 0.9 to about 0.99, about 0.9 to about 1, about 0.925 to about 0.95, about 0.925 to about 0.96, about 0.925 to about 0.97, about 0.925 to about 0.98, about 0.925 to about 0.99, about 0.925 to about 1, about 0.95 to about 0.96, about 0.95 to about 0.97, about 0.95 to about 0.98, about 0.95 to about 0.99, about 0.95 to about 1, about 0.96 to about 0.97, about 0.96 to about 0.98, about 0.96 to about 0.99, about 0.96 to about 1, about 0.97 to about 0.98, about 0.97 to about 0.99, about 0.97 to about 1, about 0.98 to about 0.99, about 0.98 to about 1, or about 0.99 to about 1. 
In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of about 0.7, about 0.75, about 0.8, about 0.85, about 0.9, about 0.925, about 0.95, about 0.96, about 0.97, about 0.98, about 0.99, or about 1. In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of at least about 0.7, about 0.75, about 0.8, about 0.85, about 0.9, about 0.925, about 0.95, about 0.96, about 0.97, about 0.98, or about 0.99. In certain embodiments, the trained machine learning model classifies the skin of the patient as indicative of the disease state of the patient with a ROC having an AUC of at most about 0.75, about 0.8, about 0.85, about 0.9, about 0.925, about 0.95, about 0.96, about 0.97, about 0.98, about 0.99, or about 1.
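As an illustration of how the performance measures recited above are conventionally computed, the following is a minimal sketch in Python using only the standard library. The function names (`classification_metrics`, `roc_auc`) and the example labels are hypothetical and are not taken from the disclosure. It assumes binary labels in which 1 denotes skin indicative of the disease state and 0 denotes skin not indicative of the disease state, and it computes the AUC as the fraction of (diseased, non-diseased) pairs in which the diseased sample receives the higher score, counting ties as one half.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV and NPV from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute an AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical example: 8 samples, 4 with and 4 without the disease state.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
model_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
metrics = classification_metrics(labels, predicted)
auc = roc_auc(labels, model_scores)
```

The pairwise AUC computation is quadratic in the number of samples and is chosen here for clarity only; a rank-based computation would give the same value on larger data sets.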
In certain embodiments, the analyzing the data set comprises generating a disease risk score of the patient based at least on the one or more GSVA scores of the patient, wherein the skin of the patient is classified as indicative of the disease state based on the disease risk score. The skin of the patient can be classified as indicative of the disease state based on comparing the risk score of the patient to a reference value. In certain embodiments, the skin of the patient is classified as indicative of the disease state based on comparing the risk score of the patient to a reference value, wherein a risk score on one side (e.g., higher or lower) of the reference value indicates that the skin of the patient is indicative of the disease state, and a risk score on the other side (e.g., lower or higher, respectively) of the reference value indicates that the skin of the patient is not indicative of the disease state. The disease risk score can be generated by a method as described herein. In certain embodiments, the analyzing the data set comprises classifying the skin of the patient based on the disease risk score, and if the skin of the patient is indicative of the disease state based on the disease risk score, further analyzing the data set to classify whether the skin of the patient is indicative of i) lupus or ii) AD, PSO and/or SSc disease state.
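The comparison of a risk score to a reference value can be sketched as follows. This is a non-limiting illustration only: the weighted-sum form of the risk score, the function names, the signature names (`sig_a`, `sig_b`) and all numeric values are assumptions introduced for the example and do not represent the scoring method of the disclosure.

```python
def risk_score_from_gsva(gsva_scores, weights, intercept=0.0):
    """Hypothetical risk score: a weighted sum of per-signature GSVA scores.

    gsva_scores and weights are dicts keyed by signature name. The weights
    are placeholders; the disclosure does not specify them here.
    """
    return intercept + sum(weights[name] * gsva_scores[name] for name in weights)


def classify_by_risk_score(risk_score, reference_value, disease_side="higher"):
    """Return True if the score falls on the disease side of the reference.

    disease_side="higher": scores above the reference indicate disease.
    disease_side="lower":  scores below the reference indicate disease.
    """
    if disease_side not in ("higher", "lower"):
        raise ValueError("disease_side must be 'higher' or 'lower'")
    if disease_side == "higher":
        return risk_score > reference_value
    return risk_score < reference_value


# Hypothetical patient with two signature scores and placeholder weights.
patient_gsva = {"sig_a": 0.5, "sig_b": -0.2}
example_weights = {"sig_a": 2.0, "sig_b": 1.0}
score = risk_score_from_gsva(patient_gsva, example_weights)
indicative = classify_by_risk_score(score, reference_value=0.5)
```

The `disease_side` parameter mirrors the two alternatives recited above, in which either the higher or the lower side of the reference value indicates the disease state.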
In an aspect, the present disclosure provides a method for developing a trained machine learning model capable of assessing skin of a patient. The method can include any one of, any combination of, or all of steps (a″), (b″) and (c″). Step (a″) can include performing enrichment assessment of a data set comprising gene expression measurements of a plurality of patients, wherein enrichment of expression of at least 2 genes selected from the genes listed in Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28, is assessed, to obtain an enrichment measurement data set comprising a plurality of enrichment scores. An enrichment score can be generated for each of the plurality of patients. Enrichment scores of different patients can be the same or different. For a respective patient of the plurality of patients, the respective enrichment score of the respective patient can be generated from assessing enrichment of expression of the at least 2 genes in a biological sample from the respective patient. 
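To make the notion of a per-patient enrichment score concrete, the following sketch computes a simplified score for one gene set. It is not GSVA: as a deliberately simpler stand-in, it z-scores each gene across patients and averages the z-scores of the genes in the set for each patient. The function name, the data layout (a dict mapping patient to a dict of gene expression values) and the gene names `G1`, `G2`, `G3` are hypothetical placeholders rather than genes from the tables.

```python
from statistics import mean, pstdev


def enrichment_scores(expression, gene_set):
    """Simplified enrichment score per patient (mean z-score; NOT GSVA).

    expression: dict mapping patient id -> dict of gene -> expression value.
    gene_set:   iterable of gene names; at least 2 must be measured in
                every patient, mirroring the "at least 2 genes" requirement.
    """
    patients = list(expression)
    genes = [g for g in gene_set if all(g in expression[p] for p in patients)]
    if len(genes) < 2:
        raise ValueError("at least 2 genes of the set must be measured")

    z_by_patient = {p: [] for p in patients}
    for g in genes:
        values = [expression[p][g] for p in patients]
        mu, sd = mean(values), pstdev(values)
        for p in patients:
            # A gene with no variation across patients contributes 0.
            z = (expression[p][g] - mu) / sd if sd > 0 else 0.0
            z_by_patient[p].append(z)

    return {p: mean(zs) for p, zs in z_by_patient.items()}


# Hypothetical expression values for two patients and three genes.
example_expression = {
    "patient_1": {"G1": 1.0, "G2": 2.0, "G3": 5.0},
    "patient_2": {"G1": 3.0, "G2": 4.0, "G3": 5.0},
}
scores = enrichment_scores(example_expression, ["G1", "G2"])
```

In practice, an enrichment method such as GSVA would replace the mean z-score; the sketch only shows the shape of the output, namely one score per patient per gene set.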
Step (b″) can include obtaining a combined data set from the plurality of patients, wherein the combined data set comprises a plurality of individual combined data sets, wherein a respective individual combined data set of the plurality of individual combined data sets comprises i) enrichment score determined in step (a″) of a respective patient; and ii) data regarding whether skin of the respective patient is indicative of a disease state of the patient. Step (c″) can include training a first machine learning model based at least on the combined data set obtained in (b″), wherein the first machine learning model is trained to infer whether skin of a patient is indicative of the disease state of the patient, based on the enrichment score of the patient.
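Steps (b″) and (c″) can be sketched with a deliberately small model. The example below represents the combined data set as a list of (enrichment score, disease label) pairs, one per patient, and trains a single-threshold classifier (a decision stump) on it. The stump is chosen purely for brevity and is an assumption of this example; the disclosure does not limit the first machine learning model to this form, and the class name, function name and data values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    """One-feature threshold classifier on an enrichment score."""
    threshold: float
    disease_if_above: bool

    def predict(self, score):
        # True means "skin indicative of the disease state".
        return (score > self.threshold) == self.disease_if_above


def train_stump(combined):
    """Fit a stump to a combined data set of (score, has_disease) pairs.

    Candidate thresholds are midpoints between consecutive distinct scores;
    the threshold and direction with the most correct training labels win.
    """
    distinct = sorted({score for score, _ in combined})
    if len(distinct) < 2:
        raise ValueError("need at least two distinct enrichment scores")
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

    best_correct, best_model = -1, None
    for thr in candidates:
        for above in (True, False):
            model = Stump(thr, above)
            correct = sum(model.predict(s) == label for s, label in combined)
            if correct > best_correct:
                best_correct, best_model = correct, model
    return best_model


# Hypothetical combined data set: (enrichment score, has disease state).
combined_data = [(-1.2, False), (-0.4, False), (0.3, True), (1.1, True)]
first_model = train_stump(combined_data)
prediction = first_model.predict(0.5)
```

A model with several enrichment scores per patient would take a vector of scores instead of a single value, but the structure of the combined data set, a score record paired with a disease label for each patient, stays the same.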
In certain embodiments, the method further includes steps (d″), (e″) and/or (f″). Step (d″) can include determining feature importance of one or more predictors of the first machine learning model. Step (e″) can include selecting N predictors of the first machine learning model based at least in part on the feature importance, wherein N is an integer. Step (f″) can include training a second machine learning model, wherein the second machine learning model is trained to infer whether the skin of a patient is indicative of the disease state of the patient, based at least on measurement data of the N predictors of the patient. In certain embodiments, the N predictors have the top N feature importance values. The feature importance of the predictors can be determined using a suitable method. In certain embodiments, the feature importance of the predictors is determined using the Gini index, the SHAP (SHapley Additive exPlanations) method, or both.
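The selection of the top N predictors in steps (d″) and (e″) can be sketched as follows, assuming the feature importance values have already been obtained from the first model. The function names, predictor names and importance values are hypothetical. A helper for the Gini impurity of a set of class labels is included because impurity decrease is the quantity on which Gini-based importance is built; how a particular model aggregates it is not shown here.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        raise ValueError("labels must not be empty")
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def select_top_n(importances, n):
    """Return the names of the N predictors with the highest importance.

    importances: dict mapping predictor name -> importance value.
    Ties are broken alphabetically so the result is deterministic.
    """
    if not 1 <= n <= len(importances):
        raise ValueError("n must be between 1 and the number of predictors")
    ranked = sorted(importances.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]


# Hypothetical importance values for three predictors of the first model.
example_importances = {"sig_a": 0.1, "sig_b": 0.5, "sig_c": 0.3}
top_predictors = select_top_n(example_importances, 2)
```

The second machine learning model of step (f″) would then be trained on measurement data restricted to the predictors returned by `select_top_n`.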
In certain embodiments, the disease state is an inflammatory skin disease state. In certain embodiments, the disease state is a rheumatic skin disease state. In certain embodiments, the disease state is selected from lupus disease state, PSO disease state, AD disease state, or SSc disease state. In certain embodiments, the disease state is lupus disease state. In certain embodiments, the lupus is SLE, DLE, CLE, ACLE, SCLE, CCLE, or any combination thereof. In certain embodiments, the lupus is SLE. In certain embodiments, the lupus is CLE. In certain embodiments, the lupus is DLE. In certain embodiments, the lupus is ACLE. In certain embodiments, the lupus is SCLE. In certain embodiments, the lupus is CCLE. In certain embodiments, the disease state is PSO disease state. In certain embodiments, the disease state is AD disease state. In certain embodiments, the disease state is SSc disease state. In certain embodiments, the disease state is DLE disease state. In certain embodiments, the disease state is SCLE disease state. In certain embodiments, the disease state is DLE or SCLE disease state. In certain embodiments, the disease state is lupus or PSO disease state. In certain embodiments, the disease state is lupus or AD disease state. In certain embodiments, the disease state is lupus or SSc disease state. In certain embodiments, the disease state is PSO or AD disease state. In certain embodiments, the plurality of patients comprises a first plurality of patients having the disease state, and a second plurality of patients not having the disease state. In certain embodiments, the plurality of patients comprises a first plurality of patients having a first disease state selected from lupus, PSO, AD or SSc disease state, and a second plurality of patients having a second disease state selected from lupus, PSO, AD or SSc disease state, where the first and second disease states are different. 
In certain embodiments, the plurality of patients comprises a first plurality of patients having a DLE disease state, and a second plurality of patients having an SCLE disease state. The skin of the patients having the disease state, first disease state, and second disease state can contain one or more lesions, or can contain no lesion. In certain embodiments, the skin of the patients having the disease state, first disease state, and second disease state contains one or more lesions. In certain embodiments, the skin of the patients having the disease state, first disease state, and second disease state does not contain a lesion. The patients can be humans.
In certain embodiments, the respective individual combined data set of the plurality of individual combined data sets comprises i) the enrichment score determined in step (a″) of the respective patient; and ii) data regarding whether the skin of the respective patient is indicative of lupus, PSO, AD, or SSc disease state. In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of the lupus disease state, PSO disease state, AD disease state, or SSc disease state based at least on the enrichment score of the patient determined in step (a″). In certain embodiments, the second machine learning model is trained to infer whether skin of the patient is indicative of the lupus, PSO, AD, or SSc disease state of the patient, based at least on the measurement data of the N predictors of the patient. In certain embodiments, the respective individual combined data set of the plurality of individual combined data sets comprises i) the enrichment score determined in step (a″) of the respective patient; and ii) data regarding whether the skin of the respective patient is indicative of lupus disease state. In certain embodiments, the first machine learning model is trained to infer whether the skin of a patient is indicative of the lupus disease state, based at least on the enrichment score of the patient determined in step (a″). In certain embodiments, the second machine learning model is trained to infer whether the skin of the patient is indicative of the lupus disease state of the patient, based at least on the measurement data of the N predictors of the patient. In certain embodiments, the respective individual combined data set of the plurality of individual combined data sets comprises i) the enrichment score determined in step (a″) of the respective patient; and ii) data regarding whether the skin of the respective patient is indicative of SSc disease state. 
In certain embodiments, the first machine learning model is trained to infer whether the skin of a patient is indicative of the SSc disease state, based at least on the enrichment score of the patient determined in step (a″). In certain embodiments, the second machine learning model is trained to infer whether skin of the patient is indicative of the SSc disease state of the patient, based at least on the measurement data of the N predictors of the patient. In certain embodiments, the respective individual combined data set of the plurality of individual combined data sets comprises i) the enrichment score determined in step (a″) of the respective patient; and ii) data regarding whether the skin of the respective patient is indicative of AD disease state. In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of the AD disease state, based at least on the enrichment score of the patient determined in step (a″). In certain embodiments, the second machine learning model is trained to infer whether skin of the patient is indicative of the AD disease state of the patient, based at least on the measurement data of the N predictors of the patient. In certain embodiments, the respective individual combined data set of the plurality of individual combined data sets comprises i) the enrichment score determined in step (a″) of the respective patient; and ii) data regarding whether the skin of the respective patient is indicative of PSO disease state. In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of the PSO disease state, based at least on the enrichment score of the patient determined in step (a″). In certain embodiments, the second machine learning model is trained to infer whether skin of the patient is indicative of the PSO disease state of the patient, based at least on the measurement data of the N predictors of the patient.
In certain embodiments, the respective individual combined data set of the plurality of individual combined data sets comprises i) the enrichment score determined in step (a″) of the respective patient; and ii) data regarding whether the skin of the respective patient is indicative of lupus or PSO disease state. In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of the lupus or PSO disease state, based at least on the enrichment score of the patient determined in step (a″). In certain embodiments, the second machine learning model is trained to infer whether skin of the patient is indicative of the lupus or PSO disease state of the patient, based at least on the measurement data of the N predictors of the patient. In certain embodiments, the respective individual combined data set of the plurality of individual combined data sets comprises i) the enrichment score determined in step (a″) of the respective patient; and ii) data regarding whether the skin of the respective patient is indicative of lupus or AD disease state. In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of the lupus or AD disease state, based at least on the enrichment score of the patient determined in step (a″). In certain embodiments, the second machine learning model is trained to infer whether skin of the patient is indicative of the lupus or AD disease state of the patient, based at least on the measurement data of the N predictors of the patient. In certain embodiments, the respective individual combined data set of the plurality of individual combined data sets comprises i) the enrichment score determined in step (a″) of the respective patient; and ii) data regarding whether the skin of the respective patient is indicative of lupus or SSc disease state. 
In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of the lupus or SSc disease state, based at least on the enrichment score of the patient determined in step (a″). In certain embodiments, the second machine learning model is trained to infer whether skin of the patient is indicative of the lupus or SSc disease state of the patient, based at least on the measurement data of the N predictors of the patient. In certain embodiments, the respective individual combined data set of the plurality of individual combined data sets comprises i) the enrichment score determined in step (a″) of the respective patient; and ii) data regarding whether the skin of the respective patient is indicative of AD or PSO disease state. In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of the AD or PSO disease state, based at least on the enrichment score of the patient determined in step (a″). In certain embodiments, the second machine learning model is trained to infer whether skin of the patient is indicative of the AD or PSO disease state of the patient, based at least on the measurement data of the N predictors of the patient. In certain embodiments, the respective individual combined data set of the plurality of individual combined data sets comprises i) the enrichment score determined in step (a″) of the respective patient; and ii) data regarding whether the skin of the respective patient is indicative of DLE or SCLE disease state. In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of the DLE or SCLE disease state, based at least on the enrichment score of the patient determined in step (a″). 
In certain embodiments, the second machine learning model is trained to infer whether skin of the patient is indicative of the DLE or SCLE disease state of the patient, based at least on the measurement data of the N predictors of the patient.
In certain embodiments, step (a″) further includes normalizing the data set. In certain embodiments, the data set is normalized prior to the enrichment assessment. The data set can be normalized using a suitable normalizing method. In certain embodiments, the data set is normalized using Z-score normalization method.
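As a non-limiting illustration, the Z-score normalization mentioned above can be sketched as follows. The sketch assumes that each gene is standardized across samples using the population standard deviation; the gene names and expression values are hypothetical and are not taken from any Table of this disclosure.

```python
from statistics import mean, pstdev


def zscore_normalize(expression):
    """Z-score each gene across samples: (value - mean) / standard deviation."""
    normalized = {}
    for gene, values in expression.items():
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation (an assumption)
        if sigma == 0:
            # A gene with no variation across samples is mapped to zeros.
            normalized[gene] = [0.0 for _ in values]
        else:
            normalized[gene] = [(v - mu) / sigma for v in values]
    return normalized


# Hypothetical log2 expression values for three genes across four skin samples.
expression = {
    "GENE_A": [5.0, 7.0, 9.0, 11.0],
    "GENE_B": [2.0, 2.0, 2.0, 2.0],
    "GENE_C": [1.0, 3.0, 1.0, 3.0],
}
normalized = zscore_normalize(expression)
```

After normalization each gene has a mean of zero across samples, so that genes measured on different scales contribute comparably to a subsequent enrichment assessment.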
The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the lupus, PSO, AD, or SSc disease state of the patient, with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the lupus, PSO, AD, or SSc disease state of the patient, with a sensitivity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the lupus, PSO, AD, or SSc disease state of the patient, with a specificity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. 
The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the lupus, PSO, AD, or SSc disease state of the patient, with a positive predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the lupus, PSO, AD, or SSc disease state of the patient, with a negative predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the lupus, PSO, AD, or SSc disease state of the patient, with a Receiver operating characteristic curve (ROC) having an Area-Under-Curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than about 0.99. 
The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient, with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient, with a sensitivity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient, with a specificity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. 
The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient, with a positive predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient, with a negative predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient, with a Receiver operating characteristic curve (ROC) having an Area-Under-Curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than about 0.99.
The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with an accuracy of about 70% to about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with an accuracy of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% to 
about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with an accuracy of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with an accuracy of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with an accuracy of at most about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%.
The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a sensitivity of about 70% to about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a sensitivity of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% 
to about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a sensitivity of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a sensitivity of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a sensitivity of at most about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%.
The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a specificity of about 70% to about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a specificity of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% to about 100%, or about 99% 
to about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a specificity of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a specificity of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a specificity of at most about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%.
The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a positive predictive value of about 70% to about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a positive predictive value of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% 
to about 100%, or about 99% to about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a positive predictive value of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a positive predictive value of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a positive predictive value of at most about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%.
The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a negative predictive value of about 70% to about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a negative predictive value of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 92.5%, about 70% to about 95%, about 70% to about 96%, about 70% to about 97%, about 70% to about 98%, about 70% to about 99%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 92.5%, about 75% to about 95%, about 75% to about 96%, about 75% to about 97%, about 75% to about 98%, about 75% to about 99%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 92.5%, about 80% to about 95%, about 80% to about 96%, about 80% to about 97%, about 80% to about 98%, about 80% to about 99%, about 80% to about 100%, about 85% to about 90%, about 85% to about 92.5%, about 85% to about 95%, about 85% to about 96%, about 85% to about 97%, about 85% to about 98%, about 85% to about 99%, about 85% to about 100%, about 90% to about 92.5%, about 90% to about 95%, about 90% to about 96%, about 90% to about 97%, about 90% to about 98%, about 90% to about 99%, about 90% to about 100%, about 92.5% to about 95%, about 92.5% to about 96%, about 92.5% to about 97%, about 92.5% to about 98%, about 92.5% to about 99%, about 92.5% to about 100%, about 95% to about 96%, about 95% to about 97%, about 95% to about 98%, about 95% to about 99%, about 95% to about 100%, about 96% to about 97%, about 96% to about 98%, about 96% to about 99%, about 96% to about 100%, about 97% to about 98%, about 97% to about 99%, about 97% to about 100%, about 98% to about 99%, about 98% 
to about 100%, or about 99% to about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a negative predictive value of about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a negative predictive value of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, or about 99%. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state, with a negative predictive value of at most about 75%, about 80%, about 85%, about 90%, about 92.5%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%.
The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state of the patient, with a ROC curve having an AUC of about 0.7 to about 1. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state of the patient, with a ROC curve having an AUC of about 0.7 to about 0.75, about 0.7 to about 0.8, about 0.7 to about 0.85, about 0.7 to about 0.9, about 0.7 to about 0.925, about 0.7 to about 0.95, about 0.7 to about 0.96, about 0.7 to about 0.97, about 0.7 to about 0.98, about 0.7 to about 0.99, about 0.7 to about 1, about 0.75 to about 0.8, about 0.75 to about 0.85, about 0.75 to about 0.9, about 0.75 to about 0.925, about 0.75 to about 0.95, about 0.75 to about 0.96, about 0.75 to about 0.97, about 0.75 to about 0.98, about 0.75 to about 0.99, about 0.75 to about 1, about 0.8 to about 0.85, about 0.8 to about 0.9, about 0.8 to about 0.925, about 0.8 to about 0.95, about 0.8 to about 0.96, about 0.8 to about 0.97, about 0.8 to about 0.98, about 0.8 to about 0.99, about 0.8 to about 1, about 0.85 to about 0.9, about 0.85 to about 0.925, about 0.85 to about 0.95, about 0.85 to about 0.96, about 0.85 to about 0.97, about 0.85 to about 0.98, about 0.85 to about 0.99, about 0.85 to about 1, about 0.9 to about 0.925, about 0.9 to about 0.95, about 0.9 to about 0.96, about 0.9 to about 0.97, about 0.9 to about 0.98, about 0.9 to about 0.99, about 0.9 to about 1, about 0.925 to about 0.95, about 0.925 to about 0.96, about 0.925 to about 0.97, about 0.925 to about 0.98, about 0.925 to about 0.99, about 0.925 to about 1, about 0.95 to about 0.96, about 0.95 to about 0.97, about 0.95 to about 0.98, about 0.95 to about 0.99, about 0.95 to about 1, about 0.96 to about 0.97, about 0.96 to about 0.98, about 0.96 to about 0.99, about 0.96 to about 1, about 0.97 to about 0.98, about 0.97 to 
about 0.99, about 0.97 to about 1, about 0.98 to about 0.99, about 0.98 to about 1, or about 0.99 to about 1. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state of the patient, with a ROC curve having an AUC of about 0.7, about 0.75, about 0.8, about 0.85, about 0.9, about 0.925, about 0.95, about 0.96, about 0.97, about 0.98, about 0.99, or about 1. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state of the patient, with a ROC curve having an AUC of at least about 0.7, about 0.75, about 0.8, about 0.85, about 0.9, about 0.925, about 0.95, about 0.96, about 0.97, about 0.98, or about 0.99. The first machine learning model, and/or the second machine learning model can independently classify the skin of the patient as indicative of the disease state of the patient, with a ROC curve having an AUC of at most about 0.75, about 0.8, about 0.85, about 0.9, about 0.925, about 0.95, about 0.96, about 0.97, about 0.98, about 0.99, or about 1.
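By way of a non-limiting sketch, the performance measures recited above (accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and AUC of the ROC curve) can be computed for a binary classification as shown below. The labels, scores, and the 0.5 decision threshold are hypothetical; the AUC is computed here by the pairwise-ranking formulation, which is one of several equivalent ways to obtain it.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(y_true, y_pred):
    """Compute standard metrics from true and predicted binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Hypothetical labels (1 = disease state, 0 = control) and model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

In this hypothetical example one diseased sample and one control sample are misclassified at the chosen threshold, so each threshold-dependent measure equals 0.75, while the threshold-independent AUC is 15/16.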
The first machine learning model and/or the second machine learning model can independently be trained using linear regression, logistic regression, Ridge regression, Lasso regression, elastic net (EN) regression, support vector machine (SVM), gradient boosted machine (GBM), k nearest neighbors (kNN), generalized linear model (GLM), naïve Bayes (NB) classifier, a neural network, Random Forest (RF), deep learning algorithm, linear discriminant analysis (LDA), decision tree learning (DTREE), adaptive boosting (ADB), Classification and Regression Tree (CART), hierarchical clustering, or any combination thereof. In certain embodiments, collinear features are removed during training of a machine learning model.
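As a non-limiting sketch of removing collinear features, the following example greedily keeps a feature only if its absolute Pearson correlation with every previously kept feature is below a cutoff. The 0.9 cutoff, the greedy ordering, and the feature names are assumptions made for illustration; the disclosure does not specify a particular procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant feature is treated as uncorrelated
    return cov / (norm_x * norm_y)


def drop_collinear(features, threshold=0.9):
    """Return the names of features kept after greedy collinearity filtering."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-patient enrichment scores for three candidate features.
features = {
    "ifn_score": [0.1, 0.4, 0.5, 0.9],
    "ifn_score_dup": [0.2, 0.8, 1.0, 1.8],  # exactly 2x ifn_score
    "keratinocyte_score": [0.7, 0.1, 0.6, 0.2],
}
kept = drop_collinear(features)
```

Here the second feature is a scaled copy of the first and is therefore dropped, while the third feature is only moderately correlated with the first and is retained.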
In certain embodiments, in step (a″) the enrichment assessment of the data set is performed using gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), an enrichment algorithm, Z-score, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log2 expression analysis, or any combination thereof.
In certain embodiments, an enrichment score of a patient of the plurality of patients comprises one or more Table specific enrichment scores of the patient, wherein the one or more Table specific enrichment scores are generated using one or more of the Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, or any combination thereof, wherein for a respective selected Table, at least one Table specific enrichment score of the patient is generated for enrichment of expression of at least 2 genes listed in the respective Table, in the biological sample from the patient. The one or more Table specific enrichment scores of the patient comprise the at least one Table specific enrichment score from each of the selected Tables. The at least 2 genes of the data set can comprise the at least 2 genes from each of the selected Tables.
In certain embodiments, the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48, or 1 to 48, or any range there between Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. In certain embodiments, the all the 48 Tables, e.g., Tables 4A-1 to 4A-20 and Tables 4B-1 to 4B-28, are selected. In certain embodiments, independently for each of the selected Table of the one or more Tables, the at least one Table specific enrichment score from the Table is generated, for enrichment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295 or 300 or all or any range or value there between, genes selected from the genes listed in the respective Table, in the biological sample from the patient. 
In certain embodiments, for each of the selected Tables one Table specific enrichment score is generated, and the one or more Table specific enrichment scores of the patient comprise the one Table specific enrichment score from each of the selected Tables. In certain embodiments, the first machine learning model can be trained to infer whether skin of a patient is indicative of the disease state of the patient, based on the one or more Table specific enrichment scores of the patient. In certain embodiments, the one or more predictors of the first machine learning model can be one or more gene sets from which the one or more Table specific enrichment scores are generated, wherein for a respective Table specific enrichment score, the gene set from which the respective Table specific enrichment score is generated is the set of genes based on the enrichment of expression of which (e.g., in a biological sample) the respective Table specific enrichment score is generated. The measurement data of the N predictors can be Table specific enrichment scores corresponding to the N predictors. In certain embodiments, the Table specific enrichment scores are GSVA scores, and are obtained using GSVA.
In certain embodiments, in step (a″) the enrichment assessment of the data set is performed using GSVA. In certain embodiments, an enrichment score of a patient of the plurality of patients comprises one or more GSVA scores of the patient. The one or more GSVA scores of a patient, can be generated from gene expression data of the patient using one or more Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28, wherein for each of the one or more selected Tables at least one GSVA score of the patient is generated based on enrichment of expression of at least 2 genes listed in the Table, in a biological sample from the patient. The one or more GSVA scores comprises the at least one GSVA score(s) from each of the selected table. In certain embodiments, the one or more Tables, e.g. 
of step (a″), comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48, Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. In certain embodiments, the all the 48 Tables, e.g., Tables 4A-1 to 4A-20 and Tables 4B-1 to 4B-28, are selected. In certain embodiments, independently for each respective Table of the one or more Tables, e.g. of step (a″), the at least one GSVA score is generated, for enrichment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, or 300, or all, or any range or value there between, genes listed in the respective Table. In certain embodiments, the first machine learning model can be trained to infer whether skin of a patient is indicative of the disease state of the patient, based on the one or more GSVA scores of the patient. 
In certain embodiments, the one or more predictors of the first machine learning model can be one or more gene sets from which the one or more GSVA scores are generated, wherein for a respective GSVA score of the one or more GSVA scores, the gene set from which the respective GSVA score is generated is the set of genes based on the enrichment of expression of which (e.g., in a biological sample) the respective GSVA score is generated. The measurement data of the N predictors can be GSVA scores corresponding to the N predictors.
In certain embodiments, N is 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40, or any range there between. In certain embodiments, the N predictors have the top 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 feature importance values of the first machine learning model. In certain embodiments, N is about 3 to about 40. In certain embodiments, N is about 3 to about 10, about 3 to about 13, about 3 to about 14, about 3 to about 15, about 3 to about 16, about 3 to about 17, about 3 to about 18, about 3 to about 20, about 3 to about 30, about 3 to about 35, about 3 to about 40, about 10 to about 13, about 10 to about 14, about 10 to about 15, about 10 to about 16, about 10 to about 17, about 10 to about 18, about 10 to about 20, about 10 to about 30, about 10 to about 35, about 10 to about 40, about 13 to about 14, about 13 to about 15, about 13 to about 16, about 13 to about 17, about 13 to about 18, about 13 to about 20, about 13 to about 30, about 13 to about 35, about 13 to about 40, about 14 to about 15, about 14 to about 16, about 14 to about 17, about 14 to about 18, about 14 to about 20, about 14 to about 30, about 14 to about 35, about 14 to about 40, about 15 to about 16, about 15 to about 17, about 15 to about 18, about 15 to about 20, about 15 to about 30, about 15 to about 35, about 15 to about 40, about 16 to about 17, about 16 to about 18, about 16 to about 20, about 16 to about 30, about 16 to about 35, about 16 to about 40, about 17 to about 18, about 17 to about 20, about 17 to about 30, about 17 to about 35, about 17 to about 40, about 18 to about 20, about 18 to about 30, about 18 to about 35, about 18 to about 40, about 20 to about 30, about 20 to about 35, about 20 to about 40, about 30 to about 35, about 30 to about 40, or about 35 to about 40. 
In certain embodiments, N is about 3, about 10, about 13, about 14, about 15, about 16, about 17, about 18, about 20, about 30, about 35, or about 40. In certain embodiments, N is at most about 10, about 13, about 14, about 15, about 16, about 17, about 18, about 20, about 30, about 35, or about 40.
In an aspect, the present disclosure provides a method for developing a trained machine learning model capable of characterizing a disease state. The method can include any one of, any combination of, or all of steps (a′″), (b′″), and (c′″). Step (a′″) can include performing enrichment assessment of a data set comprising gene expression measurements of a plurality of patients, to obtain an enrichment measurement data set comprising a plurality of enrichment scores. An enrichment score can be generated for each of the plurality of patients. Enrichment scores of different patients can be the same or different. For a respective patient of the plurality of patients, the respective enrichment score can be generated from gene expression measurements of a biological sample from the respective patient. Step (b′″) can include obtaining a combined data set from the plurality of patients, wherein the combined data set comprises a plurality of individual combined data sets, wherein a respective individual combined data set of the plurality of individual combined data sets comprises i) the enrichment score, determined in step (a′″), of a respective patient; and ii) data regarding whether the respective patient has the disease state. Step (c′″) can include training a first machine learning model based on the combined data set obtained in (b′″), wherein the first machine learning model is trained to infer whether a patient has the disease state based on the enrichment score of the patient. In certain embodiments, the method further includes steps (d′″), (e′″), and/or (f′″). Step (d′″) can include determining feature importance of one or more predictors of the first machine learning model. Step (e′″) can include selecting N predictors of the first machine learning model based at least in part on the feature importance, wherein N is an integer. 
Step (f′″) can include training a second machine learning model, wherein the second machine learning model is trained to infer whether the patient has the disease state, based on the N predictors. In certain embodiments, the N predictors have top N feature importance values. The feature importance of the predictors can be determined using a suitable method. In certain embodiments, the feature importance of the predictors is determined using Gini index or SHAP method or both.
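As a non-limiting illustration, the training, feature importance, selection, and retraining steps described above can be sketched with scikit-learn. The sketch assumes a Random Forest for both models and uses its impurity-based (Gini) importance; the data are randomly generated stand-ins for enrichment scores and disease labels, and the values of `n_patients`, `n_gene_sets`, and `N` are arbitrary assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_patients, n_gene_sets, N = 120, 12, 3

# Hypothetical combined data set: one row of enrichment scores per patient,
# plus a label (1 = has the disease state, 0 = does not).
X = rng.normal(size=(n_patients, n_gene_sets))
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(int)

# Train the first model on all predictors.
first_model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Feature importance: mean decrease in Gini impurity, normalized to sum to 1.
importance = first_model.feature_importances_

# Select the N predictors with the highest importance values.
top_n = np.argsort(importance)[::-1][:N]

# Train the second model on the N selected predictors only.
second_model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X[:, top_n], y)
```

A new patient would then be classified with `second_model.predict` applied to that patient's enrichment scores for the same N gene sets, in the same column order as `top_n`.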
In certain embodiments, step (a′″) further includes normalizing the data set. The data set can be normalized prior to the enrichment assessment. The data set can be normalized using a suitable normalizing method. In certain embodiments, the data set is normalized using Z-score normalization method.
The first machine learning model, and/or the second machine learning model can independently classify whether the patient has the disease state, with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The first machine learning model, and/or the second machine learning model can independently classify whether the patient has the disease state, with a sensitivity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The first machine learning model, and/or the second machine learning model can independently classify whether the patient has the disease state, with a specificity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. 
The first machine learning model, and/or the second machine learning model can independently classify whether the patient has the disease state, with a positive predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The first machine learning model, and/or the second machine learning model can independently classify whether the patient has the disease state, with a negative predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The first machine learning model, and/or the second machine learning model can independently classify whether the patient has the disease state, with a Receiver operating characteristic curve (ROC) having an Area-Under-Curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than about 0.99.
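As a non-limiting illustration, the performance measures recited above can be computed from true labels, predicted labels, and predicted scores as follows. The function names and the example labels and scores are hypothetical; the AUC is computed here with the rank-based (Mann-Whitney) formulation, which equals the area under the ROC curve.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV and NPV for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }

def roc_auc(y_true, scores):
    """Probability that a diseased sample scores higher than a non-diseased one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example: 4 diseased and 4 non-diseased samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.6]
metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, y_score)
```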
The first machine learning model and/or second machine learning model can independently be trained using a linear regression, a logistic regression, a Ridge regression, a Lasso regression, an elastic net (EN) regression, a support vector machine (SVM), a gradient boosted machine (GBM), a k nearest neighbors (kNN), a generalized linear model (GLM), a naïve Bayes (NB) classifier, a neural network, a Random Forest (RF), a deep learning algorithm, a linear discriminant analysis (LDA), a decision tree learning (DTREE), an adaptive boosting (ADB), a Classification and Regression Tree (CART), a hierarchical clustering, or any combination thereof. In certain embodiments, collinear features are removed during training of the first machine learning model and/or second machine learning model.
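As a non-limiting illustration, one way to remove collinear features is to drop any feature whose absolute Pearson correlation with an already retained feature meets or exceeds a cutoff. The greedy procedure, the function name `drop_collinear`, the cutoff of 0.9, and the toy matrix are assumptions made for this sketch; the disclosure does not specify a particular procedure or threshold.

```python
import numpy as np

def drop_collinear(X, threshold=0.9):
    """Return the column indices kept after greedy removal of collinear features.

    X: array of shape (n_samples, n_features), e.g., enrichment scores.
    A column is kept only if its absolute correlation with every column
    already kept is below the threshold.
    """
    X = np.asarray(X, dtype=float)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

# Hypothetical toy matrix: column 1 is roughly twice column 0.
X = [[1.0, 2.1, 5.0],
     [2.0, 4.0, 3.0],
     [3.0, 6.2, 8.0],
     [4.0, 7.9, 1.0]]
kept = drop_collinear(X)
```

Because the procedure is greedy, the retained set depends on column order; ordering the columns by a prior importance estimate is one possible design choice.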
In certain embodiments, in step (a′″) the enrichment assessment of the data set is performed using gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), an enrichment algorithm, Z-score, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log2 expression analysis, or any combination thereof.
In certain embodiments, in step (a′″) the enrichment assessment of the data set is performed using GSVA. In certain embodiments, an enrichment score of a patient of the plurality of patients comprises one or more GSVA scores of the patient.
One aspect of the present disclosure is directed to a method for determining a gene set capable of assessing skin of a patient. The method can include any one of, any combination of, or all of steps (a″″), (b″″) and (c″″). In step (a″″) a first machine learning model can be trained with a reference data set, wherein the reference data set comprises a plurality of individual reference data sets, wherein a respective individual reference data set of the plurality of individual reference data sets comprises i) an enrichment score of a respective reference patient, and ii) data regarding whether skin of the respective reference patient is indicative of a disease state, wherein the first machine learning model is trained to infer whether skin of a patient is indicative of the disease state of the patient, based on an enrichment score of the patient. Step (b″″) can include determining feature contribution of one or more of the features of the first machine learning model. Step (c″″) can include selecting N features of the first machine learning model based at least in part on the feature contribution. N can be an integer. 
The enrichment score of the respective reference patient can comprise one or more table-specific enrichment scores, wherein at least one table-specific enrichment score is generated from each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48, or any range there between Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C and Table 4D, and wherein for a respective Table the at least one table-specific enrichment score from the respective Table is generated based on enrichment assessment of expression of at least 1 gene selected from the genes listed in the respective Table, in a biological sample from the respective reference patient. The one or more table-specific enrichment scores can comprise the at least one table-specific enrichment score from each of the selected Table. Set of features of the first machine learning model can be selected from the one or more gene sets from which the one or more Table specific enrichment scores are generated, wherein a gene set from which a respective Table specific enrichment score is generated are the genes based on enrichment of expression which in the biological sample, the respective Table specific enrichment score is generated. 
The set of features of the first machine learning model can comprise the one or more features of step (b″″). The gene sets forming the N features form the gene set capable of assessing skin of a patient. The gene set (e.g., capable of assessing skin of a patient) and/or a machine learning model developed using the gene set can be used for diagnosis and/or treatment of the disease state of a patient.
In certain embodiments, the disease state is an inflammatory skin disease state. In certain embodiments, the disease state is a rheumatic skin disease state. In certain embodiments the disease state is lupus (e.g., systemic lupus erythematosus (SLE)), psoriasis (PSO), atopic dermatitis (AD), and/or systemic sclerosis (scleroderma) (SSc) disease state. In certain embodiments the disease state is lupus (e.g., systemic lupus erythematosus (SLE)), psoriasis (PSO), atopic dermatitis (AD), or systemic sclerosis (scleroderma) (SSc) disease state. In certain embodiments, the disease state is lupus disease state. In certain embodiments, the lupus is SLE, DLE, CLE, ACLE, SCLE, CCLE, or any combination thereof. In certain embodiments, the lupus is SLE. In certain embodiments, the lupus is CLE. In certain embodiments, the lupus is DLE. In certain embodiments, the lupus is ACLE. In certain embodiments, the lupus is SCLE. In certain embodiments, the lupus is CCLE. In certain embodiments, the disease state is PSO disease state. In certain embodiments, the disease state is AD disease state. In certain embodiments, the disease state is SSc disease state. In certain embodiments, the disease state is lupus or PSO disease state. In certain embodiments, the disease state is lupus or AD disease state. In certain embodiments, the disease state is lupus or SSc disease state. In certain embodiments, the disease state is DLE disease state. In certain embodiments, the disease state is SCLE disease state. In certain embodiments, the disease state is DLE or SCLE disease state. In certain embodiments, the plurality of reference patients comprises a first plurality of patients having the disease state, and a second plurality of patients not having the disease state. 
In certain embodiments, the plurality of reference patients comprises a first plurality of patients having a first disease state selected from lupus, PSO, AD or SSc disease state, and a second plurality of patients having a second disease state selected from lupus, PSO, AD or SSc disease state, where the first and second disease states are different. In certain embodiments, the plurality of reference patients comprises a first plurality of patients having a DLE disease state, and a second plurality of patients having an SCLE disease state. The reference patients can be humans. The skin of the reference patients having the disease state, first disease state, and second disease state can contain one or more lesions, or may not contain a lesion. In certain embodiments, the skin of the reference patients having the disease state, first disease state, and second disease state contains one or more lesions. In certain embodiments, the skin of the reference patients having the disease state, first disease state, and second disease state does not contain a lesion.
In certain embodiments, the feature contribution of the one or more features of the first machine learning model can be determined using a SHapley Additive exPlanations (SHAP) method. The feature contribution of the one or more features can be determined based on SHAP values. In certain embodiments, the feature contribution of the one or more features is determined based on SHAP values. The SHAP values can be feature contribution per sample per feature. In certain embodiments, the feature contribution of the one or more features is determined based on SHAP values for the set of features. The N features can be selected based on the SHAP values. In certain embodiments, the N features selected are the N positively contributing features to the model. In certain embodiments, the N features selected are the top N positively contributing features to the model. In certain embodiments, feature importance values of the features can be calculated from the feature contribution values, and the N features can be selected based on the feature importance values. In certain embodiments, the feature importance value for a respective feature can be the mean absolute SHAP value of the respective feature across the samples (e.g., reference patients). In certain embodiments, the N features selected have the top N feature importance values. In certain embodiments, N is an integer from 2 to 40. In certain embodiments, N is an integer from 10 to 20. In certain embodiments, N is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40, or any range there between. In certain embodiments, N is an integer from 2 to 15. 
In certain embodiments, N is an integer from 2 to 5, 2 to 6, 2 to 7, 2 to 8, 2 to 9, 2 to 10, 2 to 11, 2 to 12, 2 to 13, 2 to 14, 2 to 15, 5 to 6, 5 to 7, 5 to 8, 5 to 9, 5 to 10, 5 to 11, 5 to 12, 5 to 13, 5 to 14, 5 to 15, 6 to 7, 6 to 8, 6 to 9, 6 to 10, 6 to 11, 6 to 12, 6 to 13, 6 to 14, 6 to 15, 7 to 8, 7 to 9, 7 to 10, 7 to 11, 7 to 12, 7 to 13, 7 to 14, 7 to 15, 8 to 9, 8 to 10, 8 to 11, 8 to 12, 8 to 13, 8 to 14, 8 to 15, 9 to 10, 9 to 11, 9 to 12, 9 to 13, 9 to 14, 9 to 15, 10 to 11, 10 to 12, 10 to 13, 10 to 14, 10 to 15, 11 to 12, 11 to 13, 11 to 14, 11 to 15, 12 to 13, 12 to 14, 12 to 15, 13 to 14, 13 to 15, or 14 to 15. In certain embodiments, N is an integer from 2, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15. In certain embodiments, N is an integer from at least 2, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14. In certain embodiments, N is an integer from at most 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15. In certain embodiments, N is an integer from 5 to 40. In certain embodiments, N is an integer from 5 to 10, 5 to 11, 5 to 12, 5 to 13, 5 to 14, 5 to 15, 5 to 20, 5 to 25, 5 to 30, 5 to 35, 5 to 40, 10 to 11, 10 to 12, 10 to 13, 10 to 14, 10 to 15, 10 to 20, 10 to 25, 10 to 30, 10 to 35, 10 to 40, 11 to 12, 11 to 13, 11 to 14, 11 to 15, 11 to 20, 11 to 25, 11 to 30, 11 to 35, 11 to 40, 12 to 13, 12 to 14, 12 to 15, 12 to 20, 12 to 25, 12 to 30, 12 to 35, 12 to 40, 13 to 14, 13 to 15, 13 to 20, 13 to 25, 13 to 30, 13 to 35, 13 to 40, 14 to 15, 14 to 20, 14 to 25, 14 to 30, 14 to 35, 14 to 40, 15 to 20, 15 to 25, 15 to 30, 15 to 35, 15 to 40, 20 to 25, 20 to 30, 20 to 35, 20 to 40, 25 to 30, 25 to 35, 25 to 40, 30 to 35, 30 to 40, or 35 to 40. In certain embodiments, N is an integer from 5, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, or 40. In certain embodiments, N is an integer from at least 5, 10, 11, 12, 13, 14, 15, 20, 25, 30, or 35. In certain embodiments, N is an integer from at most 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, or 40.
SHAP (SHapley Additive exPlanations) is an approach used to calculate SHAP values (e.g., feature contribution per sample per feature) of various tree-based machine learning classifiers. SHAP values can allow for estimation of the magnitude by which a feature of the data contributes to the final model prediction, and allow for determination of the features that have a high impact on the final model decision (the classification). SHAP can be applied to a trained machine learning model.
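As a non-limiting illustration, once a matrix of SHAP values (one row per sample, one column per feature) has been obtained for a trained model, feature importance can be computed as the mean absolute SHAP value per feature and the top N features selected. The SHAP matrix, the feature names, and the function name below are hypothetical placeholders; computing the SHAP values themselves (e.g., with a SHAP implementation for tree models) is outside this sketch.

```python
import numpy as np

def top_n_by_mean_abs_shap(shap_values, feature_names, n):
    """Rank features by mean |SHAP| across samples and return the top n.

    shap_values: array of shape (n_samples, n_features), one contribution
    per sample per feature.
    Returns a list of (feature_name, importance) tuples, highest first.
    """
    shap_values = np.asarray(shap_values, dtype=float)
    importance = np.abs(shap_values).mean(axis=0)
    order = np.argsort(importance)[::-1][:n]
    return [(feature_names[i], float(importance[i])) for i in order]

# Hypothetical SHAP values for 3 reference patients and 3 gene-set features.
shap_values = [[0.30, -0.05, 0.10],
               [-0.20, 0.02, 0.15],
               [0.40, -0.01, -0.05]]
names = ["gene_set_A", "gene_set_B", "gene_set_C"]
selected = top_n_by_mean_abs_shap(shap_values, names, 2)
```

Taking the absolute value before averaging ranks features by magnitude of contribution regardless of direction; selecting only positively contributing features, as in other embodiments, would instead use the signed values.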
The biological sample can comprise a skin biopsy sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof. In certain embodiments, the biological sample comprises a skin biopsy sample or any derivative thereof. In certain embodiments, the biological sample comprises a blood sample, or any derivative thereof. In certain embodiments, the biological sample comprises PBMCs or any derivative thereof.
The plurality of individual reference data sets can be obtained from a plurality of reference patients. In certain embodiments, different individual reference data sets are obtained from different reference patients. In certain embodiments, oversampling or undersampling correction is made during training of the machine learning model.
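As a non-limiting illustration, an oversampling correction can be made by randomly duplicating samples of the under-represented class until all classes have equal counts. Random oversampling with replacement is an assumption of this sketch (other corrections, including undersampling, are equally possible), and the function name, sample identifiers, and labels are hypothetical.

```python
import random

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw extra samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels

# Hypothetical training set: 4 patients with the disease state, 2 controls.
samples = ["p1", "p2", "p3", "p4", "c1", "c2"]
labels = [1, 1, 1, 1, 0, 0]
balanced_samples, balanced_labels = random_oversample(samples, labels)
```

Such a correction is typically applied to the training portion only, so that duplicated samples do not also appear in the data used to evaluate the model.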
In certain embodiments, the one or more features for which feature contributions are determined in step (b″″) can be selected based on feature importance of the set of features of the first machine learning model. The feature importance and feature contribution of the features of the first machine learning model can be determined simultaneously, or separately. In certain embodiments, feature importance of the set of features of the first machine learning model is determined, and based on feature importance the one or more features can be selected. In certain embodiments, the one or more features include all features of the set of features of the machine learning model. In certain embodiments, the one or more features exclude at least one feature from the set of features of the machine learning model.
In certain embodiments, the enrichment score of the respective reference patient can comprise at least one table-specific enrichment score from each of Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. In certain embodiments, the enrichment score of the respective reference patient comprises one table-specific enrichment score from each of the selected Tables. In certain embodiments, an enrichment score of each of the reference patient of the plurality of reference patients comprises independently at least one table-specific enrichment score from each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48, or any range there between Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C and 
Table 4D. In certain embodiments, the enrichment score of each of the reference patient of the plurality of reference patients can comprise independently at least one table-specific enrichment score from each of Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28. In certain embodiments, enrichment score of each of the reference patient of the plurality of reference patients comprises independently one table-specific enrichment score from each of the selected Tables. In certain embodiments, independently for each of the selected Tables, the at least one table-specific enrichment score for a respective selected Table is generated based on enrichment assessment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 or all genes listed in the respective selected Table, in a biological sample.
In certain embodiments, the enrichment assessment is performed using gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), an enrichment algorithm, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log2 expression analysis, or any combination thereof. In certain embodiments, the enrichment assessment is performed using GSVA. In certain embodiments, the enrichment assessment is performed using GSVA, and the Table-specific enrichment score is a GSVA score. The one or more Table-specific enrichment scores can include one or more GSVA scores, wherein one GSVA score can be generated from each of the selected Tables.
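As a rough illustration of how one score per selected Table can be produced for each sample, the following sketch computes a simplified enrichment score as the mean per-gene z-score of the Table's genes. This is a stand-in chosen for brevity and is not GSVA, which uses a non-parametric, rank-based statistic. The gene names, the Table names ("Table X", "Table Y") and the function names are hypothetical.

```python
from statistics import mean, pstdev

def zscore_rows(expr):
    """expr maps gene -> list of expression values (one per sample).
    Returns the same structure with each gene z-scored across samples."""
    out = {}
    for gene, values in expr.items():
        mu = mean(values)
        sd = pstdev(values)
        # A gene with no variation carries no information; score it as 0.
        out[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return out

def table_scores(expr, tables):
    """Return {table name: [one score per sample]}.
    Simplified score (assumption, not GSVA): mean z-score of the
    Table's genes that are present in the expression data."""
    z = zscore_rows(expr)
    n_samples = len(next(iter(expr.values())))
    scores = {}
    for name, genes in tables.items():
        present = [g for g in genes if g in z]
        if not present:
            continue  # no measured gene from this Table
        scores[name] = [mean(z[g][i] for g in present) for i in range(n_samples)]
    return scores

# Hypothetical data: three samples, three measured genes.
expr = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [5.0, 5.0, 5.0],
}
tables = {
    "Table X": ["GENE_A", "GENE_B"],
    "Table Y": ["GENE_C", "GENE_NOT_MEASURED"],
}
scores = table_scores(expr, tables)
```

Each sample ends up with one value per Table, and those values are the kind of features a downstream classifier would consume.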
In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of lupus disease state, and a first portion of the plurality of reference patients have lupus, and a second portion of the plurality of reference patients are healthy control. In certain embodiments, the first portion of the plurality of reference patients have one or more skin lesions. In certain embodiments, the first portion of the plurality of reference patients do not have a skin lesion.
In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of AD disease state, and a first portion of the plurality of reference patients have AD, and a second portion of the plurality of reference patients are healthy control. In certain embodiments, the first portion of the plurality of reference patients have one or more skin lesions. In certain embodiments, the first portion of the plurality of reference patients do not have a skin lesion.
In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of PSO disease state, and a first portion of the plurality of reference patients have PSO, and a second portion of the plurality of reference patients are healthy control. In certain embodiments, the first portion of the plurality of reference patients have one or more skin lesions. In certain embodiments, the first portion of the plurality of reference patients do not have a skin lesion.
In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of SSc disease state, and a first portion of the plurality of reference patients have SSc, and a second portion of the plurality of reference patients are healthy control. In certain embodiments, the first portion of the plurality of reference patients have one or more skin lesions. In certain embodiments, the first portion of the plurality of reference patients do not have a skin lesion.
In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of lupus disease state or PSO disease state, and a first portion of the plurality of reference patients have lupus, and a second portion of the plurality of reference patients have PSO. In certain embodiments, the plurality of reference patients have one or more skin lesions. In certain embodiments, the plurality of reference patients do not have a skin lesion.
In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of lupus disease state or AD disease state, and a first portion of the plurality of reference patients have lupus, and a second portion of the plurality of reference patients have AD. In certain embodiments, the plurality of reference patients have one or more skin lesions. In certain embodiments, the plurality of reference patients do not have a skin lesion.
In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of lupus disease state or SSc disease state, and a first portion of the plurality of reference patients have lupus, and a second portion of the plurality of reference patients have SSc. In certain embodiments, the plurality of reference patients have one or more skin lesions. In certain embodiments, the plurality of reference patients do not have a skin lesion.
In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of AD disease state or PSO disease state, and a first portion of the plurality of reference patients have AD, and a second portion of the plurality of reference patients have PSO. In certain embodiments, the plurality of reference patients have one or more skin lesions. In certain embodiments, the plurality of reference patients do not have a skin lesion.
In certain embodiments, the first machine learning model is trained to infer whether skin of a patient is indicative of DLE disease state or SCLE disease state, and a first portion of the plurality of reference patients have DLE, and a second portion of the plurality of reference patients have SCLE. In certain embodiments, the plurality of reference patients have one or more skin lesions. In certain embodiments, the plurality of reference patients do not have a skin lesion.
In certain embodiments, the method further comprises reducing dimensionality of the first machine learning model by at least: determining, based on the N features of the first machine learning model that were determined based at least in part on the feature contribution, at least one feature of the set of features that can be omitted from the training of a second machine learning model that is to be derived from the first machine learning model; determining a second set of features for training the second machine learning model, wherein the second set of features lacks the at least one feature; and training the second machine learning model derived from the first machine learning model to infer whether skin of the patient is indicative of the disease state of the patient, based on a second enrichment score of the patient. The second machine learning model can be trained with the reference data set or a second reference data set, wherein the second reference data set comprises a second plurality of individual reference data sets, wherein a second respective individual reference data set of the second plurality of individual reference data sets comprises i) a second enrichment score of a second respective reference patient, and ii) second data regarding whether skin of the second respective reference patient is indicative of the disease state. For the second respective reference patient the second enrichment score can comprise values of the second set of features. Certain aspects are directed to a method of developing the second machine learning model.
In certain aspects, a method for developing a trained machine learning model capable of assessing skin of a patient is described. The method can include training a machine learning model, wherein the machine learning model is trained to infer whether skin of a patient is indicative of the disease state, based on the N features, e.g., as determined using the method comprising steps (a″″), (b″″) and/or (c″″).
The machine learning model, first machine learning model, and/or second machine learning model, can be independently trained using linear regression, logistic regression, Ridge regression, Lasso regression, elastic net (EN) regression, support vector machine (SVM), gradient boosted machine (GBM), k nearest neighbors (kNN), generalized linear model (GLM), naïve Bayes (NB) classifier, neural network, Random Forest (RF), deep learning algorithm, linear discriminant analysis (LDA), decision tree learning (DTREE), adaptive boosting (ADB), Classification and Regression Tree (CART), Hierarchical clustering, or any combination thereof. In certain embodiments, collinear features are removed during training of a machine learning model.
Lupus can be systemic lupus erythematosus (SLE), cutaneous lupus erythematosus (CLE), discoid lupus erythematosus (DLE), acute cutaneous lupus erythematosus (ACLE), subacute cutaneous lupus erythematosus (SCLE), and/or chronic cutaneous lupus erythematosus (CCLE). In certain embodiments, lupus is SLE. In certain embodiments, lupus is CLE. In certain embodiments, lupus is DLE. In certain embodiments, lupus is ACLE. In certain embodiments, lupus is SCLE. In certain embodiments, lupus is CCLE.
Certain aspects are directed to a method for determining a gene set capable of assessing a disease state of a patient. The method can include any one of, any combination of, or all of steps (a1), (b1) and (c1). Step (a1) can include training a first machine learning model with a reference data set, wherein the reference data set comprises a plurality of individual reference data sets, wherein a respective individual reference data set of the plurality of individual reference data sets comprises i) an enrichment score of a respective reference patient, and ii) data regarding a disease state of the respective reference patient, wherein the first machine learning model is trained to infer the disease state of a patient, based on an enrichment score of the patient. Step (b1) can include determining feature contribution of one or more features of the first machine learning model. Step (c1) can include selecting N features of the first machine learning model based at least in part on the feature contribution, wherein N is an integer. The enrichment score of the respective reference patient can comprise one or more table-specific enrichment scores, wherein at least one table-specific enrichment score is generated from each of one or more Tables selected from a group of Tables containing curated lists of genes, and wherein for a respective Table the at least one table-specific enrichment score of the respective Table is generated based on enrichment assessment of expression of at least 1 gene selected from the genes listed in the respective Table in a biological sample from the reference patient. The set of features of the first machine learning model can be selected from the one or more Table-specific enrichment scores. The genes within the Tables corresponding to the N features (e.g., the Tables from which the Table-specific enrichment scores of the N features were generated) form the gene set capable of assessing the disease state of a patient. 
The gene set and/or a machine learning model developed using the gene set can be used for diagnosis and/or treatment of the disease state of a patient.
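A minimal sketch of the last step described above, assuming each selected feature maps to exactly one Table: the gene set is the union of the genes in the Tables behind the N selected features, with duplicate genes counted once. The Table names, gene names and the function name are hypothetical placeholders.

```python
def gene_set_from_features(selected_tables, table_genes):
    """Union of genes across the Tables behind the selected features.
    Keeps first-seen order and drops genes that repeat across Tables."""
    seen = set()
    gene_set = []
    for table in selected_tables:
        for gene in table_genes[table]:
            if gene not in seen:
                seen.add(gene)
                gene_set.append(gene)
    return gene_set

# Hypothetical curated Tables; GENE_B appears in two of them.
table_genes = {
    "Table X": ["GENE_A", "GENE_B"],
    "Table Y": ["GENE_B", "GENE_C"],
    "Table Z": ["GENE_D"],
}
# Suppose the N = 2 selected features came from Table X and Table Y.
gene_set = gene_set_from_features(["Table X", "Table Y"], table_genes)
```

Here the resulting gene set has three genes rather than four, because the overlapping gene is counted only once.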
In certain embodiments, the feature contribution of the one or more features of the first machine learning model can be determined using a SHapley Additive exPlanations (SHAP) method. The feature contribution of the one or more features can be determined based on SHAP values. In certain embodiments, the feature contribution of the one or more features is determined based on SHAP values. In certain embodiments, the feature contribution of the one or more features is determined based on SHAP values for the set of features. The N features can be selected based on the SHAP values. In certain embodiments, the N features selected are the N positively contributing features to the model. In certain embodiments, the N features selected are the top N positively contributing features to the model. In certain embodiments, feature importance values of the features can be calculated from the feature contribution values, and the N features can be selected based on the feature importance values. In certain embodiments, the feature importance value for a respective feature can be the mean absolute SHAP value of the respective feature across the samples. In certain embodiments, the N features selected have the top N feature importance values. In certain embodiments, N is an integer from 2 to 40. In certain embodiments, N is an integer from 10 to 20. In certain embodiments, N is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40, or any range there between.
In certain embodiments, N is an integer from 2 to 15. In certain embodiments, N is an integer from 2 to 5, 2 to 6, 2 to 7, 2 to 8, 2 to 9, 2 to 10, 2 to 11, 2 to 12, 2 to 13, 2 to 14, 2 to 15, 5 to 6, 5 to 7, 5 to 8, 5 to 9, 5 to 10, 5 to 11, 5 to 12, 5 to 13, 5 to 14, 5 to 15, 6 to 7, 6 to 8, 6 to 9, 6 to 10, 6 to 11, 6 to 12, 6 to 13, 6 to 14, 6 to 15, 7 to 8, 7 to 9, 7 to 10, 7 to 11, 7 to 12, 7 to 13, 7 to 14, 7 to 15, 8 to 9, 8 to 10, 8 to 11, 8 to 12, 8 to 13, 8 to 14, 8 to 15, 9 to 10, 9 to 11, 9 to 12, 9 to 13, 9 to 14, 9 to 15, 10 to 11, 10 to 12, 10 to 13, 10 to 14, 10 to 15, 11 to 12, 11 to 13, 11 to 14, 11 to 15, 12 to 13, 12 to 14, 12 to 15, 13 to 14, 13 to 15, or 14 to 15. In certain embodiments, N is an integer from 2, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15. In certain embodiments, N is an integer from at least 2, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14. In certain embodiments, N is an integer from at most 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15. In certain embodiments, N is an integer from 5 to 40. In certain embodiments, N is an integer from 5 to 10, 5 to 11, 5 to 12, 5 to 13, 5 to 14, 5 to 15, 5 to 20, 5 to 25, 5 to 30, 5 to 35, 5 to 40, 10 to 11, 10 to 12, 10 to 13, 10 to 14, 10 to 15, 10 to 20, 10 to 25, 10 to 30, 10 to 35, 10 to 40, 11 to 12, 11 to 13, 11 to 14, 11 to 15, 11 to 20, 11 to 25, 11 to 30, 11 to 35, 11 to 40, 12 to 13, 12 to 14, 12 to 15, 12 to 20, 12 to 25, 12 to 30, 12 to 35, 12 to 40, 13 to 14, 13 to 15, 13 to 20, 13 to 25, 13 to 30, 13 to 35, 13 to 40, 14 to 15, 14 to 20, 14 to 25, 14 to 30, 14 to 35, 14 to 40, 15 to 20, 15 to 25, 15 to 30, 15 to 35, 15 to 40, 20 to 25, 20 to 30, 20 to 35, 20 to 40, 25 to 30, 25 to 35, 25 to 40, 30 to 35, 30 to 40, or 35 to 40. In certain embodiments, N is an integer from 5, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, or 40. In certain embodiments, N is an integer from at least 5, 10, 11, 12, 13, 14, 15, 20, 25, 30, or 35. 
In certain embodiments, N is an integer from at most 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, or 40.
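The selection of N features by feature importance can be sketched as follows, assuming the SHAP values have already been computed by some SHAP implementation and are available as one row per sample and one column per feature. The importance of a feature is taken as its mean absolute SHAP value across samples, as described above. The feature names and numeric values are hypothetical.

```python
def mean_abs_importance(shap_rows, feature_names):
    """Mean absolute SHAP value per feature across samples.
    shap_rows: list of per-sample lists, aligned with feature_names."""
    n = len(shap_rows)
    return {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }

def top_n_features(importance, n):
    """Names of the n features with the largest importance values."""
    return sorted(importance, key=lambda k: importance[k], reverse=True)[:n]

# Hypothetical SHAP values: 2 samples x 3 features.
feature_names = ["Table X score", "Table Y score", "Table Z score"]
shap_rows = [
    [0.5, -0.1, 0.0],
    [-0.3, 0.2, 0.1],
]
importance = mean_abs_importance(shap_rows, feature_names)
selected = top_n_features(importance, n=2)
```

Taking the absolute value before averaging means that a feature pushing predictions strongly in either direction ranks high; selecting only positively contributing features would instead rank on the signed values.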
The biological sample can comprise a tissue sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof. In certain embodiments, the biological sample comprises a tissue sample or any derivative thereof. In certain embodiments, the biological sample comprises a blood sample, or any derivative thereof. In certain embodiments, the biological sample comprises PBMCs or any derivative thereof.
The plurality of individual reference data sets can be obtained from a plurality of reference patients. In certain embodiments, different individual reference data sets are obtained from different reference patients. In certain embodiments, oversampling or undersampling correction is made during training of the machine learning model.
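One simple form of oversampling correction is random oversampling, in which samples of the smaller classes are duplicated at random until every class matches the size of the largest class. The sketch below is one possible approach under that assumption, not necessarily the correction used in the Examples; the labels and sample identifiers are hypothetical.

```python
import random
from collections import Counter

def oversample_minority(samples, labels, seed=0):
    """Randomly duplicate samples of smaller classes until every class
    has as many samples as the largest class. Intended for training
    data only, so that evaluation data stays untouched."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(label)
    return out_samples, out_labels

# Hypothetical training set: 4 disease samples, 2 healthy controls.
samples = ["p1", "p2", "p3", "p4", "c1", "c2"]
labels = ["disease", "disease", "disease", "disease", "control", "control"]
balanced_samples, balanced_labels = oversample_minority(samples, labels)
balanced_counts = Counter(balanced_labels)
```

Undersampling would work in the opposite direction, randomly discarding samples of the larger class down to the size of the smaller one.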
In certain embodiments, the one or more features for which feature contributions are determined in step (b1), can be selected based on feature importance of the set of features of the first machine learning model. The feature importance and feature contribution of features of the first machine learning model can be determined simultaneously, or separately. In certain embodiments, feature importance of the set of features of the first machine learning model is determined, and based on feature importance the one or more features can be selected. In certain embodiments, the one or more features includes all features of the set of features of the machine learning model. In certain embodiments, the one or more features excludes at least one feature from the set of features of the machine learning model.
In certain embodiments, the enrichment score of the respective reference patient comprises at least one table-specific enrichment score from each of the Tables containing curated lists of genes. In certain embodiments, the enrichment score of the respective reference patient comprises one table-specific enrichment score from each of the selected Tables. In some embodiments, the enrichment score of each reference patient of the plurality of reference patients independently comprises at least one table-specific enrichment score from each of one or more Tables selected from the Tables containing curated lists of genes. In some embodiments, the enrichment score of each reference patient of the plurality of reference patients independently comprises at least one table-specific enrichment score from each of the Tables containing curated lists of genes. In some embodiments, the enrichment score of each reference patient of the plurality of reference patients independently comprises a table-specific enrichment score from each of the selected Tables. In certain embodiments, independently for each of the selected Tables, the at least one table-specific enrichment score for a respective selected Table is generated based on enrichment assessment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, or 300 or all genes listed in the respective selected Table, in a biological sample.
In certain embodiments, the enrichment assessment is performed using gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), an enrichment algorithm, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log2 expression analysis, or any combination thereof. In certain embodiments, the enrichment assessment is performed using GSVA. The Table-specific enrichment scores can be GSVA scores. The one or more Table-specific enrichment scores can include one or more GSVA scores, wherein one GSVA score can be generated from each of the selected Tables.
In certain embodiments, the method further comprises reducing dimensionality of the first machine learning model by at least: determining, based on the N features of the first machine learning model that were determined based at least in part on the feature contribution, at least one feature of the set of features that can be omitted from the training of a second machine learning model that is to be derived from the first machine learning model; determining a second set of features for training the second machine learning model, wherein the second set of features lacks the at least one feature; and training the second machine learning model derived from the first machine learning model to infer the disease state of the patient, based on a second enrichment score of the patient. The second machine learning model can be trained with the reference data set or a second reference data set, wherein the second reference data set comprises a second plurality of individual reference data sets, wherein a second respective individual reference data set of the second plurality of individual reference data sets comprises i) a second enrichment score of a second reference patient, and ii) second data regarding the disease state of the second reference patient. For the second reference patient, the second enrichment score can comprise values of the second set of features. Certain aspects are directed to a method of developing the second machine learning model.
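The dimensionality reduction described above can be sketched as a feature-subsetting step, assuming the N selected features are already known: features outside the selection are marked as omitted, and each patient's enrichment score is projected onto the retained features before the second model is trained. Training itself is not shown; the feature names and values are hypothetical.

```python
def second_feature_set(all_features, selected):
    """Split the first model's features into those kept for the second
    model and those omitted from its training."""
    selected = set(selected)
    kept = [f for f in all_features if f in selected]
    omitted = [f for f in all_features if f not in selected]
    return kept, omitted

def project(rows, all_features, kept):
    """Reduce each per-patient feature row to the kept features only."""
    idx = [all_features.index(f) for f in kept]
    return [[row[i] for i in idx] for row in rows]

# Hypothetical first-model features and two reference patients.
all_features = ["Table W score", "Table X score", "Table Y score", "Table Z score"]
rows = [
    [0.1, 0.7, -0.2, 0.0],
    [0.3, -0.5, 0.4, 0.1],
]
kept, omitted = second_feature_set(all_features, ["Table X score", "Table Y score"])
second_rows = project(rows, all_features, kept)
```

The projected rows play the role of the second enrichment scores, that is, the values of the second set of features for each reference patient.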
In certain aspects, a method for developing a trained machine learning model capable of assessing a disease state of a patient is described. The method can include training a machine learning model, wherein the machine learning model is trained to infer about the disease state of the patient, based on one or more of the N features, e.g., as determined using the method comprising steps (a1), (b1) and/or (c1).
The machine learning model, first machine learning model, and/or second machine learning model, can be independently trained using linear regression, logistic regression, Ridge regression, Lasso regression, elastic net (EN) regression, support vector machine (SVM), gradient boosted machine (GBM), k nearest neighbors (kNN), generalized linear model (GLM), naïve Bayes (NB) classifier, neural network, Random Forest (RF), deep learning algorithm, linear discriminant analysis (LDA), decision tree learning (DTREE), adaptive boosting (ADB), Classification and Regression Tree (CART), Hierarchical clustering, or any combination thereof. In certain embodiments, collinear features can be removed during training of a machine learning model.
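One common way to remove collinear features is to drop any feature whose absolute Pearson correlation with an already retained feature exceeds a threshold. The sketch below assumes that approach and a threshold of 0.9; neither is specified in the text, and the feature names and values are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant feature has no defined correlation
    return cov / sqrt(vx * vy)

def drop_collinear(features, threshold=0.9):
    """Greedy filter: keep a feature only if its absolute correlation
    with every feature kept so far is below the threshold.
    features maps feature name -> list of values across samples."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical enrichment-score features across four samples.
features = {
    "Table X score": [1, 2, 3, 4],
    "Table Y score": [2, 4, 6, 8.1],  # nearly a multiple of Table X
    "Table Z score": [4, 1, 3, 2],
}
retained = drop_collinear(features)
```

Because the filter is greedy, the order of the features decides which member of a correlated pair survives; ordering by feature importance first would keep the more informative one.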
In an aspect, the present disclosure provides a method for assessing a skin lesion of a subject, comprising: (a2) assaying a biological sample obtained or derived from the subject to produce a data set comprising gene expression measurements of the biological sample from each of a plurality of skin disease-associated genomic loci, wherein the plurality of skin disease-associated genomic loci comprises at least one gene selected from the group of genes listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, Table 4D, or any combination thereof; (b2) analyzing the data set to classify the skin of the subject as indicative of having a skin disease state; and (c2) electronically outputting a report indicative of the classification of the skin lesion of the subject as indicative of the skin disease state. In certain embodiments, the plurality of skin disease-associated genomic loci comprises a gene list described herein.
In some embodiments, the skin disease is an inflammatory skin disease. In some embodiments, the inflammatory skin disease is lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma). In some embodiments, the present disclosure provides a method for assessing a skin lesion of a subject, comprising: (a2) assaying a biological sample obtained or derived from the subject to produce a data set comprising gene expression measurements of the biological sample from each of a plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci, wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least one gene selected from the group listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, Table 4D, or any combination thereof; (b2) analyzing the data set to classify the skin lesion of the subject as indicative of a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state; and (c2) electronically outputting a report indicative of the classification of the skin lesion of the subject as indicative of the lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state. 
In certain embodiments, the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises a gene list described herein.
In some embodiments, the plurality of skin disease, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma), associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, Table 4D, or any combination thereof. In certain embodiments, the plurality of skin disease-associated genomic loci, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises a gene list described herein.
In some embodiments, the plurality of skin disease, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma), associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group listed in Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, or any combination thereof. In certain embodiments, the plurality of skin disease, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma), associated genomic loci comprises a gene list described herein.
In some embodiments, the method further comprises classifying the skin lesion of the subject as indicative of the skin disease state, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
In some embodiments, the method further comprises classifying the skin lesion of the subject as indicative of the skin disease state, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state with a sensitivity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
In some embodiments, the method further comprises classifying the skin lesion of the subject as indicative of the skin disease state, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state with a specificity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
In some embodiments, the method further comprises classifying the skin lesion of the subject as indicative of the skin disease state, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state with a positive predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
In some embodiments, the method further comprises classifying the skin lesion of the subject as indicative of the skin disease state, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state with a negative predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
In some embodiments, the method further comprises classifying the skin lesion of the subject as indicative of the skin disease state, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state with an Area-Under-Curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than about 0.99.
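The performance measures named above (accuracy, sensitivity, specificity, positive predictive value, negative predictive value and AUC) have standard definitions in terms of true and false positives and negatives. The following sketch computes them for a binary classification, with 1 denoting the disease state and 0 denoting its absence; the labels, predictions and scores are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Standard binary metrics from true labels and predicted labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else None  # undefined when den is 0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }

def auc(y_true, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive sample scores higher; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical results for eight samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
m = classification_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

In this example there are three true positives, three true negatives, one false positive and one false negative, so each of the five threshold-dependent metrics equals 0.75, while the AUC is 15/16 because one negative sample outscores one positive sample.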
In some embodiments, the subject has a skin disease selected from lupus, psoriasis (PSO), atopic dermatitis (AD), and systemic sclerosis (scleroderma, SSc). In some embodiments, the subject is suspected of having a skin disease selected from lupus, psoriasis (PSO), atopic dermatitis (AD), and systemic sclerosis (scleroderma, SSc). In some embodiments, the subject is at elevated risk of having a skin disease selected from lupus, psoriasis, atopic dermatitis, and systemic sclerosis (scleroderma). In some embodiments, the subject is asymptomatic for a skin disease selected from lupus, psoriasis, atopic dermatitis, and systemic sclerosis (scleroderma). Lupus can be SLE, CLE, DLE, ACLE, SCLE, CCLE, or any combination thereof. In certain embodiments, lupus is SLE. In certain embodiments, lupus is CLE. In certain embodiments, lupus is DLE. In certain embodiments, lupus is ACLE. In certain embodiments, lupus is SCLE. In certain embodiments, lupus is CCLE. In some embodiments, the method further comprises administering a treatment to the subject based at least in part on the classification of the skin lesion of the subject as indicative of the lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state. In some embodiments, the treatment is configured to treat lupus, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma) of the subject. In some embodiments, the treatment is configured to reduce a severity of lupus, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma) of the subject. In some embodiments, the treatment is configured to reduce a risk of having lupus, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma) of the subject. In some embodiments, the treatment is configured to treat, reduce a severity of, and/or reduce a risk of developing SLE. 
In some embodiments, the treatment is configured to treat, reduce a severity of, and/or reduce a risk of developing CLE. In some embodiments, the treatment is configured to treat, reduce a severity of, and/or reduce a risk of developing DLE. In some embodiments, the treatment is configured to treat, reduce a severity of, and/or reduce a risk of developing SCLE. In some embodiments, the treatment comprises a pharmaceutical.
In some embodiments, (b2) comprises using a trained machine learning classifier to analyze the data set to classify the skin lesion of the subject as indicative of the skin disease state, e.g., lupus, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma) disease state.
In some embodiments, the trained machine learning classifier is trained using gene expression data obtained by a data analysis tool selected from the group consisting of: a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, and a Gene Set Variation Analysis (GSVA) tool (e.g., P-Scope).
In some embodiments, the trained machine learning classifier is selected from the group consisting of a linear regression, a logistic regression, a Ridge regression, a Lasso regression, an elastic net (EN) regression, a support vector machine (SVM), a gradient boosted machine (GBM), a k nearest neighbors (kNN), a generalized linear model (GLM), a naïve Bayes (NB) classifier, a neural network, a Random Forest (RF), a deep learning algorithm, and a combination thereof.
In some embodiments, (b2) comprises comparing the data set to a reference data set. In some embodiments, the reference data set comprises gene expression measurements of reference biological samples from each of the plurality of skin disease-associated genomic loci, e.g., lupus, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma) disease-associated genomic loci. In some embodiments, the reference biological samples comprise a first plurality of biological samples obtained or derived from subjects having a skin disease state, e.g., lupus, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma), and a second plurality of biological samples obtained or derived from subjects not having a skin disease state, e.g., SLE, lupus, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma) disease state.
In some embodiments, the biological sample comprises a skin biopsy sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof.
In some embodiments, the method further comprises determining a likelihood of the classification of the skin lesion of the subject as indicative of the skin disease state, e.g., lupus, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma) disease state.
In some embodiments, the method further comprises monitoring the skin lesion of the subject, wherein the monitoring comprises assessing the skin lesion of the subject at a plurality of different time points.
In some embodiments, a difference among or between the assessments of the skin lesion of the subject among the plurality of time points is indicative of one or more clinical indications selected from the group consisting of: (i) a diagnosis of the skin lesion of the subject, (ii) a prognosis of the skin lesion of the subject, and (iii) an efficacy or non-efficacy of a course of treatment for treating the skin lesion of the subject.
In an aspect, the present disclosure provides a computer system for assessing a skin lesion of a subject, comprising: a database that is configured to store a dataset comprising gene expression data, wherein the gene expression data is obtained by assaying a biological sample obtained or derived from the subject to produce gene expression measurements of the biological sample from each of a plurality of skin disease-associated genomic loci, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci, wherein the plurality of genomic loci comprises at least one gene selected from the group listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, Table 4D, or any combination thereof; and one or more computer processors operatively coupled to the database, wherein the one or more computer processors are individually or collectively programmed to: (i) analyze the data set to classify the skin lesion of the subject as indicative of, e.g., a lupus, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma) disease state; and (ii) electronically output a report indicative of the classification of the skin lesion of the subject as indicative of the disease state, e.g., lupus, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma) disease state. 
In certain embodiments, the plurality of skin disease-associated genomic loci, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises a gene list described herein.
In an aspect, the present disclosure provides a non-transitory computer readable medium comprising machine-executable code that, upon execution by one or more computer processors, implements a method for assessing a skin lesion of a subject, the method comprising: (a) assaying a biological sample obtained or derived from the subject to produce a data set comprising gene expression measurements of the biological sample from each of a plurality of skin disease-associated genomic loci, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci, wherein the plurality of genomic loci comprises at least one gene selected from the group listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, Table 4D, or any combination thereof; (b) analyzing the data set to classify the skin lesion of the subject as indicative of, e.g., a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state; and (c) electronically outputting a report indicative of the classification of the skin lesion of the subject as indicative of the disease state, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state. 
In certain embodiments, the plurality of skin disease-associated genomic loci, e.g., lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises a gene list described herein.
In another aspect, the present disclosure provides a method of identifying one or more records having a specific phenotype, the method comprising: receiving a plurality of first records, wherein each first record is associated with one or more of a plurality of phenotypes; receiving a plurality of second records, wherein each second record is associated with one or more of the plurality of phenotypes, and wherein the plurality of second records and the plurality of first records are non-overlapping; applying a machine learning algorithm to at least one first record and at least one second record to determine a classifier; receiving a plurality of third records, wherein the third records are distinct from the plurality of first records and the plurality of second records; and applying the classifier to the plurality of third records to identify one or more third records associated with the specific phenotype.
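By way of non-limiting illustration, the record-based workflow above (determine a classifier from first and second records, then apply it to distinct third records) can be sketched as follows. The sketch uses a minimal k-nearest neighbors vote written in plain Python. The two-value feature tuples, the labels "control" and "disease", and the helper names `train_knn` and `predict` are all hypothetical and are chosen only to make the data flow concrete.

```python
import math
from collections import Counter


def train_knn(records, k=3):
    """'Training' a kNN classifier amounts to storing the labeled records."""
    return {"records": list(records), "k": k}


def predict(classifier, features):
    """Majority vote among the k stored records closest to `features`."""
    nearest = sorted(
        classifier["records"],
        key=lambda rec: math.dist(rec[0], features),
    )[: classifier["k"]]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, non-overlapping first and second record sets:
# (feature vector, phenotype label).
first_records = [((0.1, 0.2), "control"), ((0.2, 0.1), "control"),
                 ((0.9, 0.8), "disease")]
second_records = [((0.8, 0.9), "disease"), ((0.15, 0.15), "control"),
                  ((0.85, 0.95), "disease")]

# Third records are unlabeled and distinct from the first and second sets.
third_records = [(0.12, 0.18), (0.88, 0.9), (0.2, 0.2)]

classifier = train_knn(first_records + second_records, k=3)
hits = [rec for rec in third_records if predict(classifier, rec) == "disease"]
print(hits)
```

Here `hits` holds the third records identified as associated with the specific phenotype. Any of the other classifier types recited herein could be substituted for the kNN vote without changing the overall flow.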
In some embodiments, the first records and the second records comprise nucleic acid sequencing data, transcriptome data, genome data, epigenome data, proteome data, metabolome data, virome data, methylome data, lipidomic data, lineage-ome data, nucleosomal occupancy data, a genetic variant, a gene fusion, an insertion or deletion (indel), or any combination thereof. In some embodiments, the first records and the second records are in different formats. In some embodiments, the first records and the second records are from different sources, different studies, or both. In some embodiments, the phenotype comprises a disease state, an organ involvement, a medication response, or any combination thereof. In some embodiments, the classifier comprises an elastic generalized linear model classifier, a k-nearest neighbors classifier, a random forest classifier, or any combination thereof.
In some embodiments, the elastic generalized linear model classifier employs an elastic penalty of about 0.8 to about 1. In some embodiments, the elastic generalized linear model classifier employs an elastic penalty of at least about 0.8, about 0.825, about 0.85, about 0.875, about 0.9, about 0.925, about 0.95, about 0.975, or about 1. In some embodiments, the elastic generalized linear model classifier employs an elastic penalty of at most about 0.8, about 0.825, about 0.85, about 0.875, about 0.9, about 0.925, about 0.95, about 0.975, or about 1. In some embodiments, the elastic generalized linear model classifier employs an elastic penalty of about 0.8 to about 0.825, about 0.8 to about 0.85, about 0.8 to about 0.875, about 0.8 to about 0.9, about 0.8 to about 0.925, about 0.8 to about 0.95, about 0.8 to about 0.975, about 0.8 to about 1, about 0.825 to about 0.85, about 0.825 to about 0.875, about 0.825 to about 0.9, about 0.825 to about 0.925, about 0.825 to about 0.95, about 0.825 to about 0.975, about 0.825 to about 1, about 0.85 to about 0.875, about 0.85 to about 0.9, about 0.85 to about 0.925, about 0.85 to about 0.95, about 0.85 to about 0.975, about 0.85 to about 1, about 0.875 to about 0.9, about 0.875 to about 0.925, about 0.875 to about 0.95, about 0.875 to about 0.975, about 0.875 to about 1, about 0.9 to about 0.925, about 0.9 to about 0.95, about 0.9 to about 0.975, about 0.9 to about 1, about 0.925 to about 0.95, about 0.925 to about 0.975, about 0.925 to about 1, about 0.95 to about 0.975, about 0.95 to about 1, or about 0.975 to about 1. In some embodiments, the elastic generalized linear model classifier employs an elastic penalty of about 0.8, about 0.825, about 0.85, about 0.875, about 0.9, about 0.925, about 0.95, about 0.975, or about 1.
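By way of non-limiting illustration, and assuming that the "elastic penalty" recited above corresponds to the mixing parameter commonly written as alpha in elastic net regression (an interpretation that is not stated in the text), the penalty term can be sketched as follows. Under that convention, alpha = 1 gives a pure Lasso (L1) penalty and alpha = 0 gives a pure Ridge (L2) penalty, so values of about 0.8 to about 1 weight the penalty heavily toward sparsity. The function name, the regularization strength `lam`, and the coefficients are hypothetical.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Elastic net penalty: lam * (alpha * L1 + (1 - alpha) / 2 * L2^2).

    alpha is the assumed mixing parameter (the 'elastic penalty');
    lam is the overall regularization strength.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


# Hypothetical model coefficients.
coefs = [0.5, -1.0, 0.0, 2.0]
penalty_09 = elastic_net_penalty(coefs, lam=0.1, alpha=0.9)
penalty_10 = elastic_net_penalty(coefs, lam=0.1, alpha=1.0)
print(penalty_09, penalty_10)
```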
In some embodiments, the k-nearest neighbors classifier employs a K value of the size of the plurality of distinct first data sets, wherein k is about 1 to about 20. In some embodiments, the k-nearest neighbors classifier employs a K value of the size of the plurality of distinct first data sets, wherein k is at least about 1, about 2, about 3, about 4, about 5, about 6, about 8, about 10, about 12, about 14, about 16, or about 20. In some embodiments, the k-nearest neighbors classifier employs a K value of the size of the plurality of distinct first data sets, wherein k is at most about 1, about 2, about 3, about 4, about 5, about 6, about 8, about 10, about 12, about 14, about 16, or about 20. In some embodiments, the k-nearest neighbors classifier employs a K value of the size of the plurality of distinct first data sets, wherein k is about 1 to about 2, about 1 to about 3, about 1 to about 4, about 1 to about 5, about 1 to about 6, about 1 to about 8, about 1 to about 10, about 1 to about 12, about 1 to about 14, about 1 to about 16, about 1 to about 20, about 2 to about 3, about 2 to about 4, about 2 to about 5, about 2 to about 6, about 2 to about 8, about 2 to about 10, about 2 to about 12, about 2 to about 14, about 2 to about 16, about 2 to about 20, about 3 to about 4, about 3 to about 5, about 3 to about 6, about 3 to about 8, about 3 to about 10, about 3 to about 12, about 3 to about 14, about 3 to about 16, about 3 to about 20, about 4 to about 5, about 4 to about 6, about 4 to about 8, about 4 to about 10, about 4 to about 12, about 4 to about 14, about 4 to about 16, about 4 to about 20, about 5 to about 6, about 5 to about 8, about 5 to about 10, about 5 to about 12, about 5 to about 14, about 5 to about 16, about 5 to about 20, about 6 to about 8, about 6 to about 10, about 6 to about 12, about 6 to about 14, about 6 to about 16, about 6 to about 20, about 8 to about 10, about 8 to about 12, about 8 to about 14, about 8 to about 16, about 8 to about 20, about 10 to about 12, about 10 to about 14, about 10 to about 16, about 10 to about 20, about 12 to about 14, about 12 to about 16, about 12 to about 20, about 14 to about 16, about 14 to about 20, or about 16 to about 20. In some embodiments, the k-nearest neighbors classifier employs a K value of the size of the plurality of distinct first data sets, wherein k is about 1, about 2, about 3, about 4, about 5, about 6, about 8, about 10, about 12, about 14, about 16, or about 20.
In some embodiments, the K-value of the random forest classifier is incremented by 1 if the k-value is an even number. In some embodiments, applying a machine learning algorithm to the third data set comprises applying a machine learning algorithm to a plurality of unique third data sets.
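By way of non-limiting illustration, the rule of deriving a K-value from the size of the training data and incrementing it by 1 when it is even can be sketched as simple arithmetic. The helper name `choose_k` and the default fraction of 0.05 are assumptions made for illustration. An odd value is commonly preferred in two-class neighbor voting because it rules out tied votes, although the text above recites the parity rule for the random forest classifier.

```python
def choose_k(n_train, fraction=0.05):
    """Pick a K-value as a fraction of the training-set size, then
    increment by 1 if the result is even so that K is always odd."""
    if n_train < 1:
        raise ValueError("training set must contain at least one record")
    k = max(1, round(n_train * fraction))
    if k % 2 == 0:
        k += 1
    return k


for n in (10, 100, 200):
    print(n, choose_k(n))
```

For 200 training records the fraction gives 10, which is even and is therefore incremented to 11; for 100 records the value 5 is already odd and is left unchanged.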
In some embodiments, the classifier identifies said one or more third records associated with the specific phenotype with an accuracy of about 70% to about 100%. In some embodiments, the classifier identifies said one or more third records associated with the specific phenotype with an accuracy of at least about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100%. In some embodiments, the classifier identifies said one or more third records associated with the specific phenotype with an accuracy of at most about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100%. In some embodiments, the classifier identifies said one or more third records associated with the specific phenotype with an accuracy of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 95%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 95%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 95%, about 80% to about 100%, about 85% to about 90%, about 85% to about 95%, about 85% to about 100%, about 90% to about 95%, about 90% to about 100%, or about 95% to about 100%. In some embodiments, the classifier identifies said one or more third records associated with the specific phenotype with an accuracy of about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100%.
In some embodiments, the classifier herein enables a specific phenotype association sensitivity of about 70% to about 100%. In some embodiments, the classifier herein enables a specific phenotype association sensitivity of at least 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100%. In some embodiments, the classifier herein enables a specific phenotype association sensitivity of at most 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100%. In some embodiments, the classifier herein enables a specific phenotype association sensitivity of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 95%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 95%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 95%, about 80% to about 100%, about 85% to about 90%, about 85% to about 95%, about 85% to about 100%, about 90% to about 95%, about 90% to about 100%, or about 95% to about 100%. In some embodiments, the classifier herein enables a specific phenotype association sensitivity of about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100%.
In some embodiments, the classifier herein enables a specific phenotype association specificity of about 70% to about 100%. In some embodiments, the classifier herein enables a specific phenotype association specificity of at least 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100%. In some embodiments, the classifier herein enables a specific phenotype association specificity of at most 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100%. In some embodiments, the classifier herein enables a specific phenotype association specificity of about 70% to about 75%, about 70% to about 80%, about 70% to about 85%, about 70% to about 90%, about 70% to about 95%, about 70% to about 100%, about 75% to about 80%, about 75% to about 85%, about 75% to about 90%, about 75% to about 95%, about 75% to about 100%, about 80% to about 85%, about 80% to about 90%, about 80% to about 95%, about 80% to about 100%, about 85% to about 90%, about 85% to about 95%, about 85% to about 100%, about 90% to about 95%, about 90% to about 100%, or about 95% to about 100%. In some embodiments, the classifier herein enables a specific phenotype association specificity of about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100%.
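By way of non-limiting illustration, the accuracy, sensitivity, and specificity values recited above can be computed from a confusion matrix as sketched below. The label strings, the example predictions, and the function name `classification_metrics` are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity (true positive rate), and specificity
    (true negative rate) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative record")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical true phenotypes and classifier calls for ten third records.
y_true = ["disease"] * 4 + ["control"] * 6
y_pred = ["disease", "disease", "disease", "control",
          "control", "control", "control", "control", "control", "disease"]

metrics = classification_metrics(y_true, y_pred, positive="disease")
meets_70_percent = all(value >= 0.70 for value in metrics.values())
print(metrics, meets_70_percent)
```

In this example the classifier yields an accuracy of 0.80, a sensitivity of 0.75, and a specificity of 5/6, each of which meets a threshold of at least about 70%.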
In some embodiments, the method further comprises filtering the first records, the second records, or both. In some embodiments, the filtering comprises removing outliers, removing background noise, removing data without annotation data, normalizing, scaling, variance correcting, Weighted Gene Co-expression Network Analysis, enrichment analysis, dimensionality reduction, or any combination thereof. In some embodiments, the normalizing is performed by Robust Multi-Array Analysis (RMA), Guanine Cytosine Robust Multi-Array Analysis (GCRMA), Linear Models for Microarray Data, variance stabilizing transformation (VST), normal-exponential quantile correction (NEQC), or any combination thereof. In some embodiments, the variance correction comprises employing a local empirical Bayesian shrinkage, adjusting the p-values for multiple hypothesis testing using the Benjamini-Hochberg correction, and removing all data with a set false discovery rate.
In some embodiments, the false discovery rate is about 0.000001 to about 0.2. In some embodiments, the false discovery rate is at least about 0.000001. In some embodiments, the false discovery rate is at most about 0.2. In some embodiments, the false discovery rate is about 0.000001 to about 0.00001, about 0.000001 to about 0.00005, about 0.000001 to about 0.0001, about 0.000001 to about 0.0005, about 0.000001 to about 0.001, about 0.000001 to about 0.005, about 0.000001 to about 0.01, about 0.000001 to about 0.05, about 0.000001 to about 0.2, about 0.00001 to about 0.00005, about 0.00001 to about 0.0001, about 0.00001 to about 0.0005, about 0.00001 to about 0.001, about 0.00001 to about 0.005, about 0.00001 to about 0.01, about 0.00001 to about 0.05, about 0.00001 to about 0.2, about 0.00005 to about 0.0001, about 0.00005 to about 0.0005, about 0.00005 to about 0.001, about 0.00005 to about 0.005, about 0.00005 to about 0.01, about 0.00005 to about 0.05, about 0.00005 to about 0.2, about 0.0001 to about 0.0005, about 0.0001 to about 0.001, about 0.0001 to about 0.005, about 0.0001 to about 0.01, about 0.0001 to about 0.05, about 0.0001 to about 0.2, about 0.0005 to about 0.001, about 0.0005 to about 0.005, about 0.0005 to about 0.01, about 0.0005 to about 0.05, about 0.0005 to about 0.2, about 0.001 to about 0.005, about 0.001 to about 0.01, about 0.001 to about 0.05, about 0.001 to about 0.2, about 0.005 to about 0.01, about 0.005 to about 0.05, about 0.005 to about 0.2, about 0.01 to about 0.05, about 0.01 to about 0.2, or about 0.05 to about 0.2. In some embodiments, the false discovery rate is about 0.000001, about 0.00001, about 0.00005, about 0.0001, about 0.0005, about 0.001, about 0.005, about 0.01, about 0.05, or about 0.2.
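By way of non-limiting illustration, the Benjamini-Hochberg adjustment and the use of a false discovery rate cutoff can be sketched as follows. The sketch assumes that features whose adjusted p-value is at or below the chosen cutoff are retained and all others are removed; the direction of the comparison, the example p-values, and the function name are assumptions made for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five genes.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
q_values = benjamini_hochberg(p_values)

fdr_cutoff = 0.2  # one of the cutoffs recited above
kept = [i for i, q in enumerate(q_values) if q <= fdr_cutoff]
print(q_values, kept)
```

With these inputs the first four genes are retained and the fifth, with an adjusted p-value of 0.6, is removed.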
In some embodiments, the Weighted Gene Co-expression Network Analysis comprises calculating a topology matrix, clustering the data based on the topology matrix, and correlating module eigenvalues for traits on a linear scale by Pearson correlation, for nonparametric traits by Spearman correlation, and for dichotomous traits by point-biserial correlation or t-test. The Pearson correlation, or Product Moment Correlation Coefficient (PMCC), is a number between −1 and 1 that indicates the extent to which two variables are linearly related. The Spearman correlation is a nonparametric measure of rank correlation, that is, of the statistical dependence between the rankings of two variables.
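By way of non-limiting illustration, the three correlation measures named above can be sketched in plain Python. The Spearman correlation is computed as the Pearson correlation of average ranks, and the point-biserial correlation as the Pearson correlation against a 0/1 coded trait. The per-sample module values, the trait values, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def point_biserial(binary_trait, y):
    """Point-biserial correlation: Pearson with a 0/1 coded trait."""
    return pearson([float(b) for b in binary_trait], y)


# Hypothetical per-sample module values and clinical traits.
module_values = [0.1, 0.4, 0.35, 0.8, 0.9]
severity_score = [1, 3, 2, 4, 5]   # ordinal trait
has_disease = [0, 0, 0, 1, 1]      # dichotomous trait

r_linear = pearson(module_values, severity_score)
r_rank = spearman(module_values, severity_score)
r_binary = point_biserial(has_disease, module_values)
print(r_linear, r_rank, r_binary)
```

Because the module values and the severity scores have identical rankings in this toy example, the Spearman correlation is 1 even though the Pearson correlation is slightly lower.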
In some embodiments, the one or more records having a specific phenotype correspond to one or more subjects, and the method further comprises identifying the one or more subjects as (i) having a diagnosis of a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition, (ii) having a prognosis of a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition, (iii) being suitable or not suitable for enrollment in a clinical trial for a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition, (iv) being suitable or not suitable for being administered a therapeutic regimen configured to treat a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition, (v) having an efficacy or not having an efficacy of a therapeutic regimen configured to treat a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition, based at least in part on the specific phenotype corresponding to the one or more subjects.
In another aspect, the present disclosure provides a non-transitory computer-readable storage media encoded with a computer program including instructions executable by a processor to create an application for identifying one or more records having a specific phenotype, the application comprising: a first receiving module receiving a plurality of first records, wherein each first record is associated with one or more of a plurality of phenotypes; a second receiving module receiving a plurality of second records, wherein each second record is associated with one or more of the plurality of phenotypes, and wherein the plurality of second records and the plurality of first records are non-overlapping; a machine learning module applying a machine learning algorithm to at least one first record and at least one second record to determine a classifier; a third receiving module receiving a plurality of third records, wherein the third records are distinct from the plurality of first records and the plurality of second records; and a classifying module applying the classifier to the plurality of third records to identify one or more third records associated with the specific phenotype.
In some embodiments, the first records and the second records comprise nucleic acid sequencing data, transcriptome data, genome data, epigenome data, proteome data, metabolome data, virome data, methylome data, lipidomic data, lineage-ome data, nucleosomal occupancy data, a genetic variant, a gene fusion, an insertion or deletion (indel), or any combination thereof. In some embodiments, the first records and the second records are in different formats. In some embodiments, the first records and the second records are from different sources, different studies, or both. In some embodiments, the phenotype comprises a disease state, an organ involvement, a medication response, or any combination thereof. In some embodiments, the classifier comprises an elastic generalized linear model classifier, a k-nearest neighbors classifier, a random forest classifier, or any combination thereof. In some embodiments, the elastic generalized linear model classifier employs an elastic penalty of about 0.9. In some embodiments, the k-nearest neighbors classifier employs a K-value of about 5% of the size of the plurality of distinct first data sets. In some embodiments, the K-value of the random forest classifier is incremented by 1 if the k-value is an even number. In some embodiments, applying a machine learning algorithm to the third data set comprises applying a machine learning algorithm to a plurality of unique third data sets. In some embodiments, said classifier identifies said one or more third records associated with the specific phenotype with an accuracy of at least about 70%. In some embodiments, the method further comprises filtering the first records, the second records, or both. In some embodiments, the filtering comprises removing outliers, removing background noise, removing data without annotation data, normalizing, scaling, variance correcting, Weighted Gene Co-expression Network Analysis, enrichment analysis, dimensionality reduction, or any combination thereof. 
In some embodiments, the normalizing is performed by Robust Multi-Array Analysis (RMA), Guanine Cytosine Robust Multi-Array Analysis (GCRMA), Linear Models for Microarray Data, variance stabilizing transformation (VST), normal-exponential quantile correction (NEQC), or any combination thereof. In some embodiments, the variance correction comprises employing a local empirical Bayesian shrinkage, adjusting the p-values for multiple hypothesis testing using the Benjamini-Hochberg correction, and removing all data with a false discovery rate of less than 0.2. In some embodiments, the Weighted Gene Co-expression Network Analysis comprises calculating a topology matrix, clustering the data based on the topology matrix, and correlating module eigenvalues for traits on a linear scale by Pearson correlation, for nonparametric traits by Spearman correlation, and for dichotomous traits by point-biserial correlation or t-test.
In another aspect, the present disclosure provides a method for identifying a disease state or a susceptibility thereof of a subject, comprising: (a) using an assay to process a biological sample derived from the subject to generate a quantitative measure of each of a plurality of disease-associated genomic loci, wherein the plurality of disease-associated genomic loci comprises at least 5 genes listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, Table 4D, or any combination thereof; (b) processing the dataset to identify the disease state or the susceptibility thereof of the subject at an accuracy of at least about 70%; and (c) electronically outputting a report indicative of the disease state or the susceptibility thereof of the subject.
In some embodiments, the plurality of quantitative measures comprises gene expression measurements. In some embodiments, the disease state comprises a skin disease state, e.g., an active lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition or an inactive lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition. In some embodiments, the lupus condition is SLE. In some embodiments, the lupus condition is CLE. In some embodiments, the lupus condition is DLE. In some embodiments, the lupus condition is ACLE. In some embodiments, the lupus condition is SCLE. In some embodiments, the lupus condition is CCLE. In some embodiments, the plurality of disease-associated genomic loci comprises one or more genes selected from the group consisting of: RAB4B, ADAR, MRPL44, CDCA5, MYD88, SNN, BRD3, C7orf43, CDC20, SP1, POFUT1, SAMD4B, ATP6V1B2, TSPAN9, SP140, STK26, IRF4, LCP1, LMO2, SF3B4, HIST2H2AA3, CITED4, ADAM8, TICAM1, and HSD17B7.
In another aspect, the present disclosure provides a method for identifying an immunological state of a subject, comprising: (a) using an assay to process a biological sample derived from the subject to generate a quantitative measure of each of a plurality of genomic loci, wherein the plurality of genomic loci comprises at least 5 genes listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, Table 4D, or any combination thereof; (b) processing the dataset to identify the immunological state of the subject at an accuracy of at least about 70%; and (c) electronically outputting a report indicative of the immunological state of the subject.
In some embodiments, the plurality of quantitative measures comprises gene expression measurements. In some embodiments, the immunological state comprises an active or inactive state of each of one or more of the plurality of genomic loci. In some embodiments, the plurality of genomic loci comprises one or more genes selected from the group consisting of: RAB4B, ADAR, MRPL44, CDCA5, MYD88, SNN, BRD3, C7orf43, CDC20, SP1, POFUT1, SAMD4B, ATP6V1B2, TSPAN9, SP140, STK26, IRF4, LCP1, LMO2, SF3B4, HIST2H2AA3, CITED4, ADAM8, TICAM1, and HSD17B7.
In another aspect, the present disclosure provides a method for identifying a disease state or a susceptibility thereof of a subject, comprising: (a) using an assay to process a biological sample derived from the subject to generate a quantitative measure of each of a plurality of disease-associated genomic loci, wherein the plurality of disease-associated genomic loci comprises one or more genes associated with a gene cluster of Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, Table 4D, or any combination thereof; (b) processing the dataset to identify the disease state or the susceptibility thereof of the subject at an accuracy of at least about 70%; and (c) electronically outputting a report indicative of the disease state or the susceptibility thereof of the subject.
In some embodiments, the plurality of quantitative measures comprises gene expression measurements. In some embodiments, the disease state comprises an active lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition or an inactive lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition. In some embodiments, the lupus condition is systemic lupus erythematosus (SLE), discoid lupus erythematosus (DLE), CLE, ACLE, SCLE, CCLE, and/or lupus nephritis (LN). In some embodiments, the plurality of disease-associated genomic loci comprises 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or more than 50 genes associated with the gene cluster.
In another aspect, the present disclosure provides a method for identifying an immunological state of a subject, comprising: (a) using an assay to process a biological sample derived from the subject to generate a quantitative measure of each of a plurality of disease-associated genomic loci, wherein the plurality of disease-associated genomic loci comprises one or more genes associated with a gene cluster of Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, Table 4D, or any combination thereof; (b) processing the dataset to identify the immunological state of the subject at an accuracy of at least about 70%; and (c) electronically outputting a report indicative of the immunological state of the subject.
In some embodiments, the plurality of quantitative measures comprises gene expression measurements. In some embodiments, the immunological state comprises an active lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition or an inactive lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition. In some embodiments, the lupus condition is systemic lupus erythematosus (SLE), discoid lupus erythematosus (DLE), CLE, ACLE, SCLE, CCLE, and/or lupus nephritis (LN). In some embodiments, the plurality of disease-associated genomic loci comprises 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or more than 50 genes associated with the gene cluster.
In another aspect, the present disclosure provides a method for identifying an immunological state of a subject, comprising: (a) using an assay to process a biological sample derived from the subject to generate a quantitative measure of each of a plurality of disease-associated genomic loci, wherein the plurality of disease-associated genomic loci comprises one or more genes associated with a pathway of Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, Table 4D, or any combination thereof; (b) processing the dataset to identify the immunological state of the subject at an accuracy of at least about 70%; and (c) electronically outputting a report indicative of the immunological state of the subject.
In some embodiments, the plurality of quantitative measures comprises gene expression measurements. In some embodiments, the immunological state comprises an active lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition or an inactive lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition. In some embodiments, the lupus condition is systemic lupus erythematosus (SLE), discoid lupus erythematosus (DLE), CLE, ACLE, SCLE, CCLE, and/or lupus nephritis (LN). In some embodiments, the plurality of disease-associated genomic loci comprises 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or more than 50 genes associated with the pathway.
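By way of a non-limiting sketch, the quantitative measures of the genes associated with a gene cluster or pathway can be collapsed into a single per-sample score. The example below uses a plain mean of per-gene z-scores against reference samples. This is a deliberately simplified stand-in and is not the GSVA procedure referred to elsewhere herein; the function names and the gene labels `GENE_A` and `GENE_B` are hypothetical.

```python
from statistics import mean, stdev


def gene_zscore(value, reference_values):
    """Standardize one measurement against reference samples for that gene."""
    spread = stdev(reference_values)
    if spread == 0:
        return 0.0  # no variation in the reference: treat as uninformative
    return (value - mean(reference_values)) / spread


def cluster_score(sample, reference, cluster_genes):
    """Average z-score over the cluster genes that were actually measured.

    sample:    gene -> value for the subject
    reference: gene -> list of values from reference samples (2 or more)
    """
    measured = [g for g in cluster_genes if g in sample and g in reference]
    if not measured:
        raise ValueError("none of the cluster genes were measured")
    return mean(gene_zscore(sample[g], reference[g]) for g in measured)


if __name__ == "__main__":
    reference = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [4.0, 5.0, 6.0]}
    subject = {"GENE_A": 4.0, "GENE_B": 5.0}
    print(cluster_score(subject, reference, ["GENE_A", "GENE_B", "GENE_C"]))
```

Genes in the cluster that were not measured (here `GENE_C`) are skipped rather than imputed, which keeps the score defined when only a subset of the 5, 10, 15, or more cluster genes is available.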
In another aspect, the present disclosure provides a computer-implemented method for assessing a condition of a subject, comprising: (a) receiving a dataset of a biological sample of the subject; (b) selecting one or more data analysis tools, wherein the one or more data analysis tools comprise an analysis tool selected from the group consisting of: a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, a Gene Set Variation Analysis (GSVA) tool (e.g., P-Scope), a CoLTs® (Combined Lupus Treatment Scoring) analysis tool, and a Target Scoring analysis tool, or a combination thereof; (c) processing the dataset using the one or more data analysis tools to generate a data signature of the biological sample of the subject; and (d) based at least in part on the data signature generated in (c), assessing the condition of the subject. For use in the context of the methods set forth in the present disclosure, any tools and methods known to those of skill in the art may be applied, e.g., as described in “Machine Learning Disease Prediction and Treatment Prioritization,” published as U.S. Pat. App. Pub. No. 2021/0104321 (and WO 2020/102043), incorporated herein by reference in its entirety.
In some embodiments, the dataset comprises mRNA gene expression or transcriptome data, DNA genomic data, proteomic data, metabolomic data, or a combination thereof. In some embodiments, the biological sample comprises a whole blood (WB) sample, a PBMC sample, a tissue sample, a cell sample, or any derivative thereof. In some embodiments, assessing the condition of the subject comprises identifying a disease or disorder of the subject.
In some embodiments, the method further comprises identifying a disease or disorder of the subject at a sensitivity or specificity of at least about 70%. In some embodiments, the method further comprises determining a likelihood of the identification of the disease or disorder of the subject. In some embodiments, the method further comprises providing a therapeutic intervention for the disease or disorder of the subject. In some embodiments, the method further comprises monitoring the disease or disorder of the subject, wherein the monitoring comprises assessing the disease or disorder of the subject at a plurality of time points, wherein the assessing is based at least on the disease or disorder identified at each of the plurality of time points.
In some embodiments, selecting the one or more data analysis tools comprises receiving a user selection of the one or more data analysis tools. In some embodiments, selecting the one or more data analysis tools is automatically performed by the computer without receiving a user selection of the one or more data analysis tools.
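The two selection modes described above (user selection and automatic selection) can be sketched, in a non-limiting manner, with a simple tool registry. The two tools registered below are hypothetical placeholders and do not represent the named analysis tools; the registry layout, the minimum-gene rule used for automatic selection, and all function names are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Dataset = Dict[str, float]  # gene symbol -> expression value
Tool = Callable[[Dataset], Dict[str, float]]


def mean_expression(dataset: Dataset) -> Dict[str, float]:
    """Placeholder tool: average expression over all measured genes."""
    return {"mean_expression": sum(dataset.values()) / len(dataset)}


def expression_range(dataset: Dataset) -> Dict[str, float]:
    """Placeholder tool: spread between the highest and lowest value."""
    return {"expression_range": max(dataset.values()) - min(dataset.values())}


# Hypothetical registry: tool name -> (minimum genes required, callable).
TOOLS: Dict[str, Tuple[int, Tool]] = {
    "mean_expression": (1, mean_expression),
    "expression_range": (3, expression_range),
}


def select_tools(dataset: Dataset,
                 user_selection: Optional[List[str]] = None) -> List[str]:
    """Step (b): honor a user selection if given, otherwise pick automatically."""
    if user_selection is not None:
        unknown = [name for name in user_selection if name not in TOOLS]
        if unknown:
            raise KeyError(f"unknown analysis tools: {unknown}")
        return list(user_selection)
    # Automatic mode (assumed rule): every tool whose gene minimum is met.
    return [name for name, (min_genes, _) in TOOLS.items()
            if len(dataset) >= min_genes]


def data_signature(dataset: Dataset,
                   user_selection: Optional[List[str]] = None) -> Dict[str, float]:
    """Step (c): run the selected tools and merge their outputs."""
    signature: Dict[str, float] = {}
    for name in select_tools(dataset, user_selection):
        signature.update(TOOLS[name][1](dataset))
    return signature


if __name__ == "__main__":
    data = {"ADAR": 1.0, "MYD88": 2.0, "IRF4": 6.0}
    print(data_signature(data))                        # automatic selection
    print(data_signature(data, ["expression_range"]))  # user selection
```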
In another aspect, the present disclosure provides a computer system for assessing a condition of a subject, comprising: a database that is configured to store a dataset of a biological sample of the subject; and one or more computer processors operatively coupled to the database, wherein the one or more computer processors are individually or collectively programmed to: (i) select one or more data analysis tools, wherein the one or more data analysis tools comprise an analysis tool selected from the group consisting of: a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, a Gene Set Variation Analysis (GSVA) tool (e.g., P-Scope), a CoLTs® (Combined Lupus Treatment Scoring) analysis tool, and a Target Scoring analysis tool; (ii) process the dataset using the one or more data analysis tools to generate a data signature of the biological sample of the subject; and (iii) based at least in part on the data signature generated in (ii), assess the condition of the subject.
In another aspect, the present disclosure provides a non-transitory computer readable medium comprising machine-executable code that, upon execution by one or more computer processors, implements a method for assessing a condition of a subject, the method comprising: (a) receiving a dataset of a biological sample of the subject; (b) selecting one or more data analysis tools wherein the one or more data analysis tools comprise an analysis tool selected from the group consisting of: a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, a Gene Set Variation Analysis (GSVA) tool (e.g., P-Scope), a CoLTs® (Combined Lupus Treatment Scoring) analysis tool, and a Target Scoring analysis tool; (c) processing the dataset using the one or more data analysis tools to generate a data signature of the biological sample of the subject; and (d) based at least in part on the data signature generated in (c), assessing the condition of the subject. In any embodiment described herein, the one or more data analysis tools may be a plurality of data analysis tools each independently selected from a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, a Gene Set Variation Analysis (GSVA) tool (e.g., P-Scope), a CoLTs® (Combined Lupus Treatment Scoring) analysis tool, and a Target Scoring analysis tool.
Analysis of Genomic Loci Associated with Lupus, Psoriasis, Atopic Dermatitis, and/or Systemic Sclerosis (Scleroderma)
In another aspect, the present disclosure provides a computer-implemented method for assessing a lupus, PSO, AD, and/or SSc condition of a subject, comprising: (a) receiving a dataset of a biological sample of the subject, wherein the dataset comprises quantitative measures of gene expression from each of a plurality of lupus, PSO, AD, and/or SSc-associated genomic loci; (b) processing the dataset to identify one or more differentially expressed (DE) genomic loci among the plurality of lupus, PSO, AD, and/or SSc-associated genomic loci; and (c) based at least in part on the one or more DE genomic loci identified in (b), assessing the lupus, PSO, AD, and/or SSc condition of the subject.
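As a non-limiting sketch of step (b), a genomic locus can be flagged as differentially expressed when the subject's measure differs from a reference cohort by both a minimum log2 fold change and a minimum z-score. The two thresholds, the function name, and the gene labels are assumptions made for illustration; the present disclosure does not prescribe a particular DE criterion, and cohort-level analyses would typically use a dedicated statistical package.

```python
import math
from statistics import mean, stdev


def differentially_expressed(sample, reference,
                             min_abs_log2fc=1.0, min_abs_z=2.0):
    """Return {gene: {"log2fc": ..., "z": ...}} for loci passing both cutoffs.

    sample:    gene -> positive expression value for the subject
    reference: gene -> list of positive values from reference samples
    """
    hits = {}
    for gene, value in sample.items():
        ref = reference.get(gene)
        if not ref or len(ref) < 2:
            continue  # cannot estimate spread without 2 or more references
        ref_mean, ref_sd = mean(ref), stdev(ref)
        if value <= 0 or ref_mean <= 0 or ref_sd == 0:
            continue  # log ratio or z-score would be undefined
        log2fc = math.log2(value / ref_mean)
        z = (value - ref_mean) / ref_sd
        if abs(log2fc) >= min_abs_log2fc and abs(z) >= min_abs_z:
            hits[gene] = {"log2fc": log2fc, "z": z}
    return hits


if __name__ == "__main__":
    reference = {"GENE_A": [10.0, 12.0, 11.0, 9.0],
                 "GENE_B": [50.0, 52.0, 48.0, 50.0]}
    subject = {"GENE_A": 44.0, "GENE_B": 51.0}
    for gene, stats in differentially_expressed(subject, reference).items():
        print(gene, round(stats["log2fc"], 2), round(stats["z"], 1))
```

Requiring both cutoffs guards against loci with a large fold change but noisy reference values, and against loci with a tiny but statistically tight shift.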
In some embodiments, the dataset comprises RNA gene expression or transcriptome data, DNA genomic data, or a combination thereof. In some embodiments, the biological sample comprises a whole blood (WB) sample, a PBMC sample, a tissue sample, a cell sample, or any derivative thereof. In some embodiments, assessing the lupus, PSO, AD, and/or SSc condition of the subject comprises determining a diagnosis of the lupus, PSO, AD, and/or SSc condition, a prognosis of the lupus, PSO, AD, and/or SSc condition, a susceptibility of the lupus, PSO, AD, and/or SSc condition, a treatment for the lupus, PSO, AD, and/or SSc condition, or an efficacy or non-efficacy of a treatment for the lupus, PSO, AD, and/or SSc condition, respectively.
In some embodiments, the method further comprises determining a diagnosis of the lupus, PSO, AD, and/or SSc condition with a sensitivity of at least about 70%. In some embodiments, the method further comprises determining a diagnosis of the lupus, PSO, AD, and/or SSc condition with a specificity of at least about 70%. In some embodiments, the method further comprises determining a diagnosis of the lupus, PSO, AD, and/or SSc condition with a positive predictive value of at least about 70%. In some embodiments, the method further comprises determining a diagnosis of the lupus, PSO, AD, and/or SSc condition with a negative predictive value of at least about 70%. In some embodiments, the method further comprises determining a diagnosis of the lupus, PSO, AD, and/or SSc condition with an Area Under Curve (AUC) of at least about 0.70. In some embodiments, the method further comprises determining a likelihood of the diagnosis of the lupus, PSO, AD, and/or SSc condition of the subject.
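For clarity, the performance measures recited above can be computed from known labels and classifier outputs as in the following non-limiting sketch. The function names, the label convention (1 for the condition, 0 for its absence), and the example data are assumptions made for illustration. The AUC is computed by the standard pairwise (Mann-Whitney) formulation, counting ties as one half.

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, and NPV from binary labels (1 = condition)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


def auc(y_true, scores):
    """Area under the ROC curve: P(score of a positive > score of a negative)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
    calls = [1 if s >= 0.5 else 0 for s in scores]
    print(diagnostic_metrics(labels, calls))
    print(auc(labels, scores))
```

Note that sensitivity, specificity, PPV, and NPV depend on the chosen cutoff (0.5 in the example), whereas the AUC is computed from the scores and is independent of any cutoff.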
In some embodiments, the method further comprises generating a plurality of drug candidates for the lupus, PSO, AD, and/or SSc condition of the subject. In some embodiments, the method further comprises evaluating or predicting a relative efficacy of the plurality of drug candidates for the lupus, PSO, AD, and/or SSc condition of the subject. In some embodiments, the method further comprises providing a therapeutic intervention comprising one or more of the plurality of drug candidates for the lupus, PSO, AD, and/or SSc condition of the subject.
In some embodiments, the method further comprises monitoring the lupus, PSO, AD, and/or SSc condition of the subject, wherein the monitoring comprises assessing the lupus, PSO, AD, and/or SSc condition of the subject at each of a plurality of time points, and processing the plurality of assessments of the lupus, PSO, AD, and/or SSc condition of the subject at each of the plurality of time points.
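The monitoring described above can be illustrated, without limitation, by summarizing a series of dated assessments. The sketch assumes each assessment has already been reduced to a single numeric score; the `tolerance` value, the function name, and the neutral trend labels are hypothetical and carry no clinical meaning by themselves.

```python
from datetime import date


def summarize_monitoring(assessments, tolerance=0.1):
    """Summarize scores assessed at a plurality of time points.

    assessments: iterable of (date, score) pairs, in any order.
    Returns the baseline, the latest score, step-by-step changes, and a
    coarse trend label based on the overall change from baseline.
    """
    ordered = sorted(assessments)  # chronological order
    if len(ordered) < 2:
        raise ValueError("monitoring needs at least two time points")
    scores = [score for _, score in ordered]
    steps = [later - earlier for earlier, later in zip(scores, scores[1:])]
    overall = scores[-1] - scores[0]
    if overall > tolerance:
        trend = "increasing"
    elif overall < -tolerance:
        trend = "decreasing"
    else:
        trend = "stable"
    return {
        "baseline": ordered[0],
        "latest": ordered[-1],
        "step_changes": steps,
        "overall_change": overall,
        "trend": trend,
    }


if __name__ == "__main__":
    visits = [
        (date(2024, 3, 1), 0.4),
        (date(2024, 1, 1), 0.9),
        (date(2024, 2, 1), 0.7),
    ]
    print(summarize_monitoring(visits))
```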
In an aspect, the present disclosure provides systems and methods for using bioinformatics approaches to deconvolute bulk mRNA for various cells and processes involved in lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) organ pathology, including inflammatory cells, endothelial cells, and tissue cells.
In an aspect, the present disclosure provides systems and methods for the delineation of the altered metabolism of cells by using gene expression analysis.
In an aspect, the present disclosure provides systems and methods for using various regression models (e.g., classification and regression trees, linear regression, step-wise regression) to dissect the specific metabolic alterations in individual cell types.
In an aspect, the present disclosure provides systems and methods for using animal models and the ability to translate mouse gene expression into the human equivalent to confirm the results in humans and also analyze the effects of treatment.
In an aspect, the present disclosure provides systems and methods for the delineation of the role of specific cells (myeloid cells) and processes (interferon, mitochondrial dysfunction) in lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) tissue pathology.
In an aspect, the present disclosure provides systems and methods for using non-lymphocyte populations in skin and kidney toward diagnostic and/or prognostic biopsy tests.
In an aspect, the present disclosure provides systems and methods for defining gene signatures in individual cell types in a mixed population such as blood or tissue (e.g., skin, kidney).
In an aspect, the present disclosure provides systems and methods for analyzing sets of metabolism genes and their relationship to function and cell type, including subsets of myeloid cells.
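As a non-limiting illustration of the regression-based aspect described above, an ordinary least-squares line can relate a cell-type signature score to a metabolic signature score across samples. The variable names and the example values are hypothetical; the data are synthetic and exactly linear so that the result is easy to verify, and real analyses may instead use step-wise regression or classification and regression trees.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with 2 or more points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Synthetic per-sample scores (assumption): x = a myeloid cell signature,
    # y = a metabolic gene signature measured in the same samples.
    myeloid_score = [0.0, 1.0, 2.0, 3.0]
    metabolic_score = [1.0, 3.0, 5.0, 7.0]
    print(fit_line(myeloid_score, metabolic_score))
```

A slope that differs from zero with a high coefficient of determination would suggest that the metabolic signature tracks the abundance or activity of the given cell type, although a single-predictor fit cannot by itself exclude contributions from other cell types.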
Another aspect of the present disclosure provides a non-transitory computer readable medium comprising machine executable code that, upon execution by one or more computer processors, implements any of the methods above or elsewhere herein.
Another aspect of the present disclosure provides a system comprising one or more computer processors and computer memory coupled thereto. The computer memory comprises machine executable code that, upon execution by the one or more computer processors, implements any of the methods above or elsewhere herein.
Additional aspects and advantages of the present disclosure will become readily apparent to those skilled in this art from the following detailed description, wherein only illustrative embodiments of the present disclosure are shown and described. As will be realized, the present disclosure is capable of other and different embodiments, and its several details are capable of modifications in various obvious respects, all without departing from the disclosure. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
All publications, including any supplementary materials, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
The novel features of the disclosure are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure are utilized, and the accompanying drawings of which:
1. A method for assessing skin of a patient, comprising:
2. The method of embodiment 1, wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C, Table 4D or any combination thereof.
3. The method of embodiment 1 or embodiment 2, wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, or any combination thereof.
4. The method of any one of embodiments 1 to 3, comprising classifying the skin of the patient as indicative of the lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state with an accuracy of at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
5. The method of any one of embodiments 1 to 4, comprising classifying the skin of the patient as indicative of the lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state with a sensitivity of at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
6. The method of any one of embodiments 1 to 5, comprising classifying the skin lesion of the patient as indicative of the lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state with a specificity of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
7. The method of any one of embodiments 1 to 6, comprising classifying the skin of the patient as indicative of the lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state with a positive predictive value of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
8. The method of any one of embodiments 1 to 7, comprising classifying the skin of the patient as indicative of the lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state with a negative predictive value of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
9. The method of any one of embodiments 1 to 8, comprising classifying the skin of the patient as indicative of the lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state with an Area-Under-Curve (AUC) of at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than about 0.99.
10. The method of any one of embodiments 1 to 9, wherein the patient has lupus, psoriasis (PSO), atopic dermatitis (AD), or systemic sclerosis (scleroderma, SSc).
11. The method of any one of embodiments 1 to 9, wherein the patient is suspected of having lupus, psoriasis (PSO), atopic dermatitis (AD), or systemic sclerosis (scleroderma, SSc).
12. The method of any one of embodiments 1 to 9, wherein the patient is at elevated risk of having lupus, psoriasis (PSO), atopic dermatitis (AD), or systemic sclerosis (scleroderma, SSc).
13. The method of any one of embodiments 1 to 9, wherein the patient is asymptomatic for lupus, psoriasis (PSO), atopic dermatitis (AD), or systemic sclerosis (scleroderma, SSc).
14. The method of any one of embodiments 1 to 13, further comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state.
15. The method of embodiment 14, wherein the treatment is configured to treat lupus, psoriasis (PSO), atopic dermatitis (AD), or systemic sclerosis (scleroderma, SSc) of the patient.
16. The method of embodiment 14, wherein the treatment is configured to reduce a severity of lupus, psoriasis (PSO), atopic dermatitis (AD), or systemic sclerosis (scleroderma, SSc) of the patient.
17. The method of embodiment 14, wherein the treatment is configured to reduce a risk of having lupus, psoriasis (PSO), atopic dermatitis (AD), or systemic sclerosis (scleroderma, SSc) of the patient.
18. The method of embodiment 14, wherein the treatment comprises a pharmaceutical composition.
19. The method of any one of embodiments 1 to 18, wherein (b) comprises using a trained machine learning classifier to analyze the data set to classify the skin of the patient as indicative of having the lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state.
20. The method of embodiment 19, wherein the trained machine learning classifier is trained to infer the classification of the skin of the patient based on a set of N features, the machine learning classifier trained by at least determining, from a training dataset, the N features that are usable to determine a binary classification indicative of whether a training dataset patient has i) skin indicative of at least one of one or more inflammatory skin disease states selected from lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state, or healthy state, or ii) skin indicative of a first inflammatory skin disease state of the one or more inflammatory skin disease states or a second inflammatory skin disease state of the one or more inflammatory skin disease states.
21. The method of embodiment 19 or 20, wherein the trained machine learning classifier is trained using gene expression data obtained by a data analysis tool selected from the group consisting of: a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, and a Gene Set Variation Analysis (GSVA) tool (e.g., P-Scope).
22. The method of embodiment 19 or 21, wherein the trained machine learning classifier is selected from the group consisting of a linear regression, a logistic regression, a Ridge regression, a Lasso regression, an elastic net (EN) regression, a support vector machine (SVM), a gradient boosted machine (GBM), a k nearest neighbors (kNN), a generalized linear model (GLM), a naïve Bayes (NB) classifier, a neural network, a Random Forest (RF), a deep learning algorithm, a linear discriminant analysis (LDA), a decision tree learning (DTREE), an adaptive boosting (ADB), a Classification and Regression Tree (CART), and a combination thereof.
23. The method of any one of embodiments 1 to 22, wherein (b) comprises comparing the data set to a reference data set.
24. The method of embodiment 23, wherein the reference data set comprises gene expression measurements of reference biological samples from each of the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci.
25. The method of embodiment 24, wherein the reference biological samples comprise a first plurality of biological samples obtained or derived from patients having a lupus, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma) disease state and a second plurality of biological samples obtained or derived from patients not having the lupus, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma) disease state.
26. The method of any one of embodiments 1 to 25, wherein the biological sample comprises a skin biopsy sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof.
27. The method of any one of embodiments 1 to 26, further comprising determining a likelihood of the classification of the skin of the patient as indicative of the lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease state.
28. The method of any one of embodiments 1 to 27, further comprising monitoring the skin of the patient, wherein the monitoring comprises assessing the skin of the patient at a plurality of different time points.
29. The method of embodiment 28, wherein a difference in the assessment of the skin of the patient among the plurality of time points is indicative of one or more clinical indications selected from the group consisting of: (i) a diagnosis of the skin of the patient, (ii) a prognosis of the skin of the patient, and (iii) an efficacy or non-efficacy of a course of treatment for treating the skin of the patient.
30. The method of any one of embodiments 1 to 29, wherein the skin of the patient comprises one or more lesions, and wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4B-8, Table 4B-25, Table 4B-14, Table 4A-16, Table 4B-22, Table 4B-10, Table 4A-11, Table 4B-16, Table 4B-26, Table 4A-1, Table 4A-19, Table 4A-15, Table 4B-28, Table 4B-15, Table 4B-23, or any combination thereof, and wherein in step (b) the skin of patient is classified as indicative of the lupus disease state.
31. The method of any one of embodiments 1 to 29, wherein the skin of the patient does not comprise a lesion, and wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4B-26, Table 4A-8, Table 4A-14, Table 4A-16, Table 4B-11, Table 4A-1, Table 4B-6, Table 4A-10, Table 4B-10, Table 4B-16, Table 4B-2, Table 4B-19, Table 4B-13, Table 4B-1, Table 4B-25, or any combination thereof, and wherein in step (b) the skin of patient is classified as indicative of the lupus disease state.
32. The method of any one of embodiments 1 to 29, wherein the skin of the patient comprises one or more lesions, and wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4B-3, Table 4B-25, Table 4B-10, Table 4B-16, Table 4B-8, Table 4B-14, Table 4B-2, Table 4A-7, Table 4B-28, Table 4B-23, Table 4B-20, Table 4B-26, Table 4A-13, Table 4B-18, Table 4A-16, or any combination thereof, and wherein in step (b) the skin of the patient is classified as indicative of the psoriasis disease state.
33. The method of any one of embodiments 1 to 29, wherein the skin of the patient does not comprise a lesion, and wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4B-1, Table 4B-3, Table 4B-12, Table 4A-14, Table 4A-20, Table 4B-17, Table 4B-20, Table 4B-27, Table 4A-9, Table 4A-15, Table 4A-18, Table 4A-13, Table 4B-26, Table 4B-2, Table 4A-5, or any combination thereof, and wherein in step (b) the skin of the patient is classified as indicative of the psoriasis disease state.
34. The method of any one of embodiments 1 to 29, wherein the skin of the patient comprises one or more lesions, and wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4B-10, Table 4B-25, Table 4B-8, Table 4B-22, Table 4B-28, Table 4B-16, Table 4A-16, Table 4B-14, Table 4B-13, Table 4B-23, Table 4B-7, Table 4B-15, Table 4A-12, Table 4B-3, Table 4B-2, or any combination thereof, and wherein in step (b) the skin of the patient is classified as indicative of the atopic dermatitis disease state.
35. The method of any one of embodiments 1 to 29, wherein the skin of the patient does not comprise a lesion, and wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4B-17, Table 4B-28, Table 4A-6, Table 4A-7, Table 4B-2, Table 4B-20, Table 4A-9, Table 4B-18, Table 4A-12, Table 4A-16, Table 4A-13, Table 4B-23, Table 4B-9, Table 4A-3, Table 4A-10, or any combination thereof, and wherein in step (b) the skin of the patient is classified as indicative of the atopic dermatitis disease state.
36. The method of any one of embodiments 1 to 29, wherein the skin of the patient comprises one or more lesions, and wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4A-16, Table 4B-8, Table 4B-25, Table 4B-21, Table 4B-26, Table 4B-10, Table 4B-28, Table 4B-2, Table 4B-27, Table 4B-14, Table 4A-18, Table 4A-6, Table 4A-15, Table 4B-12, Table 4B-23, or any combination thereof, and wherein in step (b) the skin of the patient is classified as indicative of the systemic sclerosis (scleroderma) disease state.
37. The method of any one of embodiments 1 to 29, wherein the skin of the patient comprises one or more lesions, and wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, Table 4B-10, or any combination thereof, and wherein in step (b) the skin of the patient is classified as indicative of the lupus or atopic dermatitis disease state.
38. The method of any one of embodiments 1 to 29, wherein the skin of the patient does not comprise a lesion, and wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, Table 4B-15, or any combination thereof, and wherein in step (b) the skin of the patient is classified as indicative of the lupus or atopic dermatitis disease state.
39. The method of any one of embodiments 1 to 29, wherein the skin of the patient comprises one or more lesions, and wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, Table 4B-7, or any combination thereof, and wherein in step (b) the skin of the patient is classified as indicative of the lupus or psoriasis disease state.
40. The method of any one of embodiments 1 to 29, wherein the skin of the patient does not comprise a lesion, and wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, Table 4A-10, or any combination thereof, and wherein in step (b) the skin of the patient is classified as indicative of the lupus or psoriasis disease state.
41. The method of any one of embodiments 1 to 29, wherein the skin of the patient comprises one or more lesions, and wherein the plurality of lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) disease-associated genomic loci comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, Table 4A-8, or any combination thereof, and wherein in step (b) the skin of the patient is classified as indicative of the lupus or systemic sclerosis (scleroderma) disease state.
42. A computer system for assessing a skin of a patient, comprising:
43. The computer system of embodiment 42, further comprising an electronic display operatively coupled to the one or more computer processors, wherein the electronic display comprises a graphical user interface that is configured to display the report.
44. A non-transitory computer readable medium comprising machine-executable code that, upon execution by one or more computer processors, implements a method for assessing a skin of a patient, the method comprising:
45. A method for assessing a skin of a patient, the method comprising:
46. The method of embodiment 45, wherein in step (a), the enrichment assessment is performed for enrichment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes selected from the group of genes listed in Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, or any combination thereof,
47. The method of embodiment 45 or 46, comprising classifying the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with an accuracy of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
48. The method of any one of embodiments 45 to 47, comprising classifying the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a sensitivity of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
49. The method of any one of embodiments 45 to 48, comprising classifying the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a specificity of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
50. The method of any one of embodiments 45 to 49, comprising classifying the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a positive predictive value of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
51. The method of any one of embodiments 45 to 50, comprising classifying the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a negative predictive value of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
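As a non-limiting illustration, the performance measures recited in embodiments 47 to 51 (accuracy, sensitivity, specificity, positive predictive value, and negative predictive value) can be computed from the counts of true and false positives and negatives. The sketch below assumes binary labels, with 1 denoting skin indicative of the disease state and 0 denoting control; the function name and the example labels are hypothetical and are not taken from the Examples.

```python
def classification_metrics(y_true, y_pred):
    """Compute standard binary classification measures (1 = disease state, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical labels for ten samples (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

A classifier would satisfy, for example, the "at least about 80%" accuracy threshold when `metrics["accuracy"]` is about 0.80 or higher on held-out samples.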
52. The method of any one of embodiments 45 to 51, comprising classifying the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a Receiver operating characteristic (ROC) curve having an Area-Under-Curve (AUC) of at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than about 0.99.
53. The method of any one of embodiments 45 to 52, wherein the patient has lupus, PSO, AD, or SSc.
54. The method of any one of embodiments 45 to 52, wherein the patient is at elevated risk of having lupus, PSO, AD, or SSc.
55. The method of any one of embodiments 45 to 52, wherein the patient is suspected of having lupus, PSO, AD, or SSc.
56. The method of any one of embodiments 45 to 52, wherein the patient is asymptomatic for lupus, PSO, AD, or SSc.
57. The method of any one of embodiments 45 to 56, further comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of lupus, PSO, AD, or SSc disease state of the patient.
58. The method of embodiment 57, wherein the treatment is configured to treat lupus, PSO, AD, or SSc of the patient.
59. The method of embodiment 57, wherein the treatment is configured to reduce a severity of lupus, PSO, AD, or SSc of the patient.
60. The method of embodiment 57, wherein the treatment is configured to reduce a risk of having lupus, PSO, AD, or SSc of the patient.
61. The method of any one of embodiments 57 to 60, wherein the treatment comprises a pharmaceutical composition.
62. The method of any one of embodiments 45 to 61, wherein the biological sample comprises a skin biopsy sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof.
63. The method of any one of embodiments 45 to 62, wherein the enrichment assessment of the data set in step (a) is performed using gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), an enrichment algorithm, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log2 expression analysis, or any combination thereof.
64. The method of any one of embodiments 45 to 63, wherein the enrichment assessment of the data set in step (a) is performed using GSVA of the dataset.
65. The method of embodiment 64, wherein the enrichment score obtained in step (a) comprises one or more GSVA scores of the patient, wherein the one or more GSVA scores are generated using one or more of the Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, or any combination thereof, wherein for a respective Table, at least one GSVA score of the patient is generated for enrichment of expression of at least 2 genes listed in the respective Table.
66. The method of embodiment 65, wherein the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48 Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28.
67. The method of embodiment 65 or embodiment 66, wherein independently for each respective Table of the one or more Tables, the at least one GSVA score is generated, for enrichment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes listed in the respective Table.
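As a non-limiting illustration of generating one enrichment score per Table and per sample, the sketch below uses a simplified rank-based score. It is not the GSVA algorithm itself, which is typically run through a dedicated package; it only mirrors the general idea of scoring a gene set by where its genes fall in a sample's expression ranking. The gene names, expression values, and function name are hypothetical, ties in expression are not handled specially, and the `min_genes=2` default reflects the "at least 2 genes listed in the respective Table" language of embodiment 65.

```python
def rank_signature_scores(expr, gene_sets, min_genes=2):
    """Score each gene set in each sample by the mean scaled rank of its genes.

    expr: {gene: [expression value per sample]}
    gene_sets: {set name: [genes]}
    Returns {set name: [score per sample]}, each score in [-1, 1].
    """
    genes = sorted(expr)
    n_samples = len(expr[genes[0]])
    scores = {}
    for name, members in gene_sets.items():
        # Keep only measured genes; duplicate entries are counted once.
        present = sorted(set(members) & set(genes))
        if len(present) < min_genes:
            continue  # too few genes measured to score this set
        per_sample = []
        for i in range(n_samples):
            ordered = sorted(genes, key=lambda g: expr[g][i])
            rank = {g: r for r, g in enumerate(ordered)}
            # Scale ranks so the lowest gene maps to -1 and the highest to +1.
            scaled = [2.0 * rank[g] / (len(genes) - 1) - 1.0 for g in present]
            per_sample.append(sum(scaled) / len(scaled))
        scores[name] = per_sample
    return scores


# Hypothetical expression values for four genes in two samples.
expr = {
    "GENE_A": [9.0, 1.0],
    "GENE_B": [8.0, 2.0],
    "GENE_C": [1.0, 8.0],
    "GENE_D": [2.0, 9.0],
}
gene_sets = {
    "signature_1": ["GENE_A", "GENE_B", "GENE_A"],  # duplicate is ignored
    "signature_2": ["GENE_C", "GENE_X"],            # only one gene measured
}
scores = rank_signature_scores(expr, gene_sets)
```

In this toy input, `signature_1` scores positively in the first sample and negatively in the second, while `signature_2` is skipped because fewer than two of its genes are present in the data set.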
68. The method of any one of embodiments 45 to 67, wherein (b) comprises using a trained machine learning model to analyze the enrichment score of the patient to classify the skin of the patient as indicative of the lupus, PSO, AD and/or SSc disease state.
69. The method of embodiment 68, wherein the analyzing in step (b) comprises providing the one or more GSVA scores of the patient as an input to the trained machine-learning model, wherein the trained machine-learning model is trained to generate an inference of whether the skin of the patient is indicative of the lupus, PSO, AD and/or SSc disease state of the patient, based at least on the GSVA scores.
70. The method of any one of embodiments 68 to 69, wherein the method further comprises receiving, as an output of the trained machine-learning model, the inference indicating whether the skin of the patient is indicative of the lupus, PSO, AD, and/or SSc disease state of the patient, based at least on the enrichment score of the patient.
71. The method of any one of embodiments 68 to 70, wherein the machine learning model is trained using a linear regression, a logistic regression, a Ridge regression, a Lasso regression, an elastic net (EN) regression, a support vector machine (SVM), a gradient boosted machine (GBM), a k nearest neighbors (kNN), a generalized linear model (GLM), a naïve Bayes (NB) classifier, a neural network, a Random Forest (RF), a deep learning algorithm, a linear discriminant analysis (LDA), a decision tree learning (DTREE), an adaptive boosting (ADB), Classification and Regression Tree (CART), or any combination thereof.
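As a non-limiting illustration of providing enrichment scores as an input to a trained model and receiving an inference as an output, the sketch below fits a logistic regression (one of the model types listed in embodiment 71) by plain gradient descent. The two-signature score vectors, the learning rate, and the epoch count are hypothetical choices made for the example; a practical implementation would more likely rely on an established machine learning library and on held-out validation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights by batch gradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_proba(model, x):
    """Return the modeled probability that a score vector indicates the disease state."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical enrichment scores for two signatures per sample.
disease_samples = [[0.6, 0.4], [0.5, 0.7], [0.7, 0.3]]
control_samples = [[-0.5, -0.3], [-0.6, -0.4], [-0.4, -0.6]]
X = disease_samples + control_samples
y = [1] * len(disease_samples) + [0] * len(control_samples)

model = train_logistic(X, y)
inference = predict_proba(model, [0.55, 0.5]) > 0.5  # True means "indicative"
```

Thresholding the probability at 0.5 is itself an assumption; a different cut-off would trade sensitivity against specificity.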
72. The method of any one of embodiments 65 to 71, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-8, Table 4B-25, Table 4B-14, Table 4A-16, Table 4B-22, Table 4B-10, Table 4A-11, Table 4B-16, Table 4B-26, Table 4A-1, Table 4A-19, Table 4A-15, Table 4B-28, Table 4B-15, and Table 4B-23, and the one or more GSVA scores of the patient are analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient.
73. The method of any one of embodiments 65 to 71, wherein the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-26, Table 4A-8, Table 4A-14, Table 4A-16, Table 4B-11, Table 4A-1, Table 4B-6, Table 4A-10, Table 4B-10, Table 4B-16, Table 4B-2, Table 4B-19, Table 4B-13, Table 4B-1, and Table 4B-25, and the one or more GSVA scores of the patient are analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient.
74. The method of embodiment 72 or 73, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the lupus disease state of the patient.
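As a non-limiting illustration, the choice of Tables in embodiments 72 and 73 can be expressed as a lookup keyed on whether the skin comprises a lesion. The Table identifiers below are copied from those two embodiments; the function name and the notion of an "available" set of Tables (those for which a score could be generated) are assumptions introduced for the example.

```python
# Table groups for the lupus classification, keyed by lesion status.
LUPUS_TABLES = {
    True: [  # skin comprises one or more lesions (embodiment 72)
        "4B-8", "4B-25", "4B-14", "4A-16", "4B-22", "4B-10", "4A-11", "4B-16",
        "4B-26", "4A-1", "4A-19", "4A-15", "4B-28", "4B-15", "4B-23",
    ],
    False: [  # skin does not comprise a lesion (embodiment 73)
        "4B-26", "4A-8", "4A-14", "4A-16", "4B-11", "4A-1", "4B-6", "4A-10",
        "4B-10", "4B-16", "4B-2", "4B-19", "4B-13", "4B-1", "4B-25",
    ],
}


def select_lupus_tables(has_lesion, available):
    """Return the Tables from the relevant group that have a score available."""
    chosen = [t for t in LUPUS_TABLES[has_lesion] if t in available]
    if not chosen:
        raise ValueError("no Table from the relevant group is available")
    return chosen


# Hypothetical case: scores were generated only for Tables 4A-16 and 4B-6.
available = {"4A-16", "4B-6"}
lesional_choice = select_lupus_tables(True, available)
non_lesional_choice = select_lupus_tables(False, available)
```

The same pattern would extend to the other disease states by adding the Table groups recited in the corresponding embodiments.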
75. The method of any one of embodiments 65 to 71, wherein the skin of the patient contains one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-10, Table 4B-25, Table 4B-8, Table 4B-22, Table 4B-28, Table 4B-16, Table 4A-16, Table 4B-14, Table 4B-13, Table 4B-23, Table 4B-7, Table 4B-15, Table 4A-12, Table 4B-3, and Table 4B-2, and the one or more GSVA scores of the patient are analyzed to classify the skin of the patient as indicative of the AD disease state of the patient.
76. The method of any one of embodiments 65 to 71, wherein the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-17, Table 4B-28, Table 4A-6, Table 4A-7, Table 4B-2, Table 4B-20, Table 4A-9, Table 4B-18, Table 4A-12, Table 4A-16, Table 4A-13, Table 4B-23, Table 4B-9, Table 4A-3, and Table 4A-10, and the one or more GSVA scores of the patient are analyzed to classify the skin of the patient as indicative of the AD disease state of the patient.
77. The method of embodiment 75 or 76, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the AD disease state of the patient.
78. The method of any one of embodiments 65 to 71, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-3, Table 4B-25, Table 4B-10, Table 4B-16, Table 4B-8, Table 4B-14, Table 4B-2, Table 4A-7, Table 4B-28, Table 4B-23, Table 4B-20, Table 4B-26, Table 4A-13, Table 4B-18, and Table 4A-16, and the one or more GSVA scores of the patient are analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient.
79. The method of any one of embodiments 65 to 71, wherein the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-1, Table 4B-3, Table 4B-12, Table 4A-14, Table 4A-20, Table 4B-17, Table 4B-20, Table 4B-27, Table 4A-9, Table 4A-15, Table 4A-18, Table 4A-13, Table 4B-26, Table 4B-2, and Table 4A-5, and the one or more GSVA scores of the patient are analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient.
80. The method of embodiment 78 or 79, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the PSO disease state of the patient.
81. The method of any one of embodiments 65 to 71, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4A-16, Table 4B-8, Table 4B-25, Table 4B-21, Table 4B-26, Table 4B-10, Table 4B-28, Table 4B-2, Table 4B-27, Table 4B-14, Table 4A-18, Table 4A-6, Table 4A-15, Table 4B-12, and Table 4B-23, and the one or more GSVA scores of the patient are analyzed to classify the skin of the patient as indicative of the SSc disease state of the patient.
82. The method of embodiment 81, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the SSc disease state of the patient.
83. The method of any one of embodiments 65 to 71, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, and Table 4B-10, and the one or more GSVA scores of the patient are analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient.
84. The method of any one of embodiments 65 to 71, wherein the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, and Table 4B-15, and the one or more GSVA scores of the patient are analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient.
85. The method of embodiment 83 or 84, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the lupus or AD disease state of the patient.
86. The method of any one of embodiments 65 to 71, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, and Table 4B-7, and the one or more GSVA scores of the patient are analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient.
87. The method of any one of embodiments 65 to 71, wherein the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, and Table 4A-10, and the one or more GSVA scores of the patient are analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient.
88. The method of embodiment 86 or 87, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the lupus or PSO disease state of the patient.
89. The method of any one of embodiments 65 to 71, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, and Table 4A-8, and the one or more GSVA scores of the patient are analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient.
90. The method of embodiment 89, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the lupus or SSc disease state of the patient.
91. A method for developing a trained machine learning model capable of assessing skin of a patient, the method comprising:
92. The method of embodiment 91, wherein the method further comprises,
93. The method of embodiment 91 or 92, wherein the N predictors have top N feature importance values.
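As a non-limiting illustration of selecting the N predictors having the top N feature importance values, the sketch below ranks predictors by importance and keeps the first N. The predictor names and importance values are hypothetical, and breaking ties alphabetically by name is an assumption made only so that the result is deterministic.

```python
def top_n_predictors(importances, n):
    """Return the names of the n predictors with the highest importance values."""
    # Sort by descending importance, then by name to break ties reproducibly.
    ranked = sorted(importances.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]


# Hypothetical feature importance values from a first trained model.
importances = {
    "signature_1": 0.31,
    "signature_2": 0.05,
    "signature_3": 0.22,
    "signature_4": 0.22,
    "signature_5": 0.20,
}
selected = top_n_predictors(importances, 3)
```

A second model trained only on the selected predictors would then correspond to the dimensionality reduction described elsewhere in this disclosure.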
94. The method of any one of embodiments 91 to 93, wherein the step (a) further comprises normalizing the data set prior to the enrichment assessment.
95. The method of embodiment 94, wherein the data set is normalized using a Z-score normalization method.
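As a non-limiting illustration of Z-score normalization prior to the enrichment assessment, the sketch below centers each gene on its mean across samples and divides by its standard deviation. Normalizing per gene across samples, and using the population standard deviation, are both assumptions made for the example; the gene names and values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_normalize(expr):
    """Z-score each gene's expression values across samples.

    expr: {gene: [expression value per sample]}
    """
    normalized = {}
    for gene, values in expr.items():
        mu = mean(values)
        sd = pstdev(values)
        # A gene with no variation carries no signal; map it to zeros.
        normalized[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return normalized


# Hypothetical expression values for two genes in three samples.
expr = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [4.0, 4.0, 4.0]}
z = zscore_normalize(expr)
```

After this step every varying gene has mean 0 and standard deviation 1 across samples, so that highly expressed genes do not dominate downstream scores.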
96. The method of any one of embodiments 91 to 95, wherein the first machine learning model and/or the second machine learning model independently classifies the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with an accuracy of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
97. The method of any one of embodiments 91 to 96, wherein the first machine learning model and/or the second machine learning model independently classifies the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a sensitivity of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
98. The method of any one of embodiments 91 to 97, wherein the first machine learning model and/or the second machine learning model independently classifies the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a specificity of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
99. The method of any one of embodiments 91 to 98, wherein the first machine learning model and/or the second machine learning model independently classifies the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a positive predictive value of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
100. The method of any one of embodiments 91 to 99, wherein the first machine learning model and/or the second machine learning model independently classifies the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a negative predictive value of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
101. The method of any one of embodiments 91 to 100, wherein the first machine learning model and/or the second machine learning model independently classifies the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a Receiver operating characteristic (ROC) curve having an Area-Under-Curve (AUC) of at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than about 0.99.
102. The method of any one of embodiments 91 to 101, wherein the first machine learning model and/or second machine learning model is independently trained using a linear regression, a logistic regression, a Ridge regression, a Lasso regression, an elastic net (EN) regression, a support vector machine (SVM), a gradient boosted machine (GBM), a k nearest neighbors (kNN), a generalized linear model (GLM), a naïve Bayes (NB) classifier, a neural network, a Random Forest (RF), a deep learning algorithm, a linear discriminant analysis (LDA), a decision tree learning (DTREE), an adaptive boosting (ADB), Classification and Regression Tree (CART), or any combination thereof.
103. The method of any one of embodiments 91 to 102, wherein in step (a) the enrichment assessment of the data set is performed using gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), enrichment algorithm, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log 2 expression analysis, or any combination thereof.
104. The method of any one of embodiments 91 to 103, wherein in step (a) the enrichment assessment of the data set is performed using GSVA.
105. The method of embodiment 104, wherein an enrichment score of a patient of the plurality of patients comprises one or more GSVA scores of the patient, wherein the one or more GSVA scores are generated from gene expression data of the patient using one or more Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28, wherein for each of the one or more selected Tables at least one GSVA score of the patient is generated based on enrichment of expression of at least 2 genes listed in the Table.
106. The method of embodiment 105, wherein the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48 Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28.
107. The method of embodiment 105 or 106, wherein for each respective Table of the one or more Tables, the at least one GSVA score is generated for enrichment of expression of independently at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes listed in the respective Table.
108. The method of any one of embodiments 91 to 107, wherein the N is an integer from 3 to 40.
109. The method of embodiment 107 or 108, wherein the N predictors have top 10 to 20 feature importance values.
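The performance measures recited in embodiments 96 to 101 can be computed from a model's predictions on labeled reference patients. The following is a minimal sketch, assuming binary labels (1 = disease state, 0 = comparator) and a per-patient model score; the function names are hypothetical and are not taken from the Examples.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = disease state."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV and NPV from predicted labels."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive patient receives the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A model would satisfy, for example, the "at least about 90%" sensitivity threshold when `classification_metrics(...)["sensitivity"]` evaluated on held-out reference patients is about 0.90 or higher.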
110. A method for developing a trained machine learning model capable of characterizing a disease state, the method comprising:
111. The method of embodiment 110, wherein the method further comprises,
112. The method of embodiment 110 or 111, wherein the N predictors have top N feature importance values.
113. The method of any one of embodiments 110 to 112, wherein the step (a) further comprises normalizing the data set prior to enrichment assessment.
114. The method of embodiment 113, wherein the data set is normalized using a Z-score normalization method.
115. The method of any one of embodiments 110 to 114, wherein the first machine learning model and/or the second machine learning model independently classifies the patient as having the disease state with an accuracy of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
116. The method of any one of embodiments 110 to 115, wherein the first machine learning model and/or the second machine learning model independently classifies the patient as having the disease state with a sensitivity of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
117. The method of any one of embodiments 110 to 116, wherein the first machine learning model and/or the second machine learning model independently classifies the patient as having the disease state with a specificity of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
118. The method of any one of embodiments 110 to 117, wherein the first machine learning model and/or the second machine learning model independently classifies the patient as having the disease state with a positive predictive value of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
119. The method of any one of embodiments 110 to 118, wherein the first machine learning model and/or the second machine learning model independently classifies the patient as having the disease state with a negative predictive value of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
120. The method of any one of embodiments 110 to 119, wherein the first machine learning model and/or the second machine learning model independently classifies the patient as having the disease state with a Receiver operating characteristic (ROC) curve having an Area-Under-Curve (AUC) of at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than about 0.99.
121. The method of any one of embodiments 110 to 120, wherein the first machine learning model and/or second machine learning model is independently trained using a linear regression, a logistic regression, a Ridge regression, a Lasso regression, an elastic net (EN) regression, a support vector machine (SVM), a gradient boosted machine (GBM), a k nearest neighbors (kNN), a generalized linear model (GLM), a naïve Bayes (NB) classifier, a neural network, a Random Forest (RF), a deep learning algorithm, a linear discriminant analysis (LDA), a decision tree learning (DTREE), an adaptive boosting (ADB), Classification and Regression Tree (CART), or any combination thereof.
122. The method of any one of embodiments 110 to 121, wherein in step (a) the enrichment assessment of the data set is performed using gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), enrichment algorithm, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log 2 expression analysis, or any combination thereof.
123. The method of any one of embodiments 110 to 122, wherein in step (a) the enrichment assessment of the data set is performed using GSVA, and an enrichment score of a patient comprises one or more GSVA scores of the patient.
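The Z-score normalization of embodiments 113 and 114, followed by a per-patient signature score, can be illustrated as below. This is a sketch under stated assumptions: expression values are held in a plain dictionary keyed by gene, the population standard deviation is used, and the scoring function is a simple mean of Z-scores standing in for an enrichment assessment. It is not GSVA. The gene identifiers, the `min_genes` parameter, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def zscore_by_gene(expression):
    """Z-score each gene across patients.

    expression: {gene: [value for each patient]}
    Genes with zero variance are mapped to 0.0 for every patient.
    """
    normalized = {}
    for gene, values in expression.items():
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation (assumption)
        normalized[gene] = [0.0 if sigma == 0 else (v - mu) / sigma
                            for v in values]
    return normalized


def mean_z_signature_score(z_expression, gene_set, min_genes=2):
    """Simplified stand-in for an enrichment score (NOT GSVA):
    the per-patient mean Z-score over the genes of the set that
    are present in the data."""
    present = [g for g in gene_set if g in z_expression]
    if len(present) < min_genes:
        raise ValueError("too few genes from the set are present in the data")
    n_patients = len(z_expression[present[0]])
    return [mean([z_expression[g][i] for g in present])
            for i in range(n_patients)]


# Hypothetical data: three genes measured in three patients.
expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [5.0, 5.0, 5.0],
}
z = zscore_by_gene(expression)
scores = mean_z_signature_score(z, ["GENE_A", "GENE_B"])
```

Each gene set yields one score per patient, so a collection of gene sets yields one feature vector per patient that can be supplied to a machine learning model.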
124. A method for determining a gene set capable of assessing skin of a patient, the method comprising:
125. The method of embodiment 124, wherein in step (b), the feature contribution of the one or more features of the first machine learning model is determined using a SHapley Additive exPlanations (SHAP) method.
126. The method of embodiment 124 or 125, wherein the feature contribution of the one or more features is determined based on SHAP values for the set of features, and the N features are selected based on the SHAP values.
127. The method of any one of embodiments 124 to 126, wherein N is an integer from 2 to 40.
128. The method of any one of embodiments 124 to 127, wherein N is an integer from 10 to 20.
129. The method of any one of embodiments 124 to 128, wherein the enrichment score of the reference patient comprises at least one table-specific enrichment score from each of Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28.
130. The method of any one of embodiments 124 to 129, wherein independently for each of the selected Tables, the at least one table-specific enrichment score for a respective selected Table is generated based on enrichment assessment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, or all genes listed in the respective selected Table.
131. The method of any one of embodiments 124 to 130, wherein the enrichment score of the reference patient comprises one table-specific enrichment score from each of the selected Tables.
132. The method of any one of embodiments 124 to 131, wherein the enrichment assessment is performed using gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), enrichment algorithm, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log 2 expression analysis, or any combination thereof.
133. The method of any one of embodiments 124 to 132, wherein the enrichment assessment is performed using GSVA, and one or more table-specific enrichment scores comprises one or more GSVA scores.
134. The method of any one of embodiments 124 to 133, wherein the first machine learning model is trained using linear regression, logistic regression, Ridge regression, Lasso regression, elastic net (EN) regression, support vector machine (SVM), gradient boosted machine (GBM), k nearest neighbors (kNN), generalized linear model (GLM), naïve Bayes (NB) classifier, neural network, Random Forest (RF), deep learning algorithm, linear discriminant analysis (LDA), decision tree learning (DTREE), adaptive boosting (ADB), Classification and Regression Tree (CART), or any combination thereof.
135. The method of any one of embodiments 124 to 134, wherein the biological sample comprises a skin biopsy sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof.
136. The method of any one of embodiments 124 to 135, wherein the plurality of individual reference data sets are obtained from a plurality of reference patients.
137. The method of embodiment 136, wherein the first machine learning model is trained to infer whether skin of a patient is indicative of lupus disease state, and a first portion of the plurality of reference patients have lupus, and a second portion of the plurality of reference patients are healthy control.
138. The method of embodiment 137, wherein the first portion of the plurality of reference patients have one or more skin lesions.
139. The method of embodiment 137, wherein the first portion of the plurality of reference patients do not have a skin lesion.
140. The method of embodiment 136, wherein the first machine learning model is trained to infer whether skin of a patient is indicative of AD disease state, and a first portion of the plurality of reference patients have AD, and a second portion of the plurality of reference patients are healthy control.
141. The method of embodiment 140, wherein the first portion of the plurality of reference patients have one or more skin lesions.
142. The method of embodiment 140, wherein the first portion of the plurality of reference patients do not have a skin lesion.
143. The method of embodiment 136, wherein the first machine learning model is trained to infer whether skin of a patient is indicative of PSO disease state, and a first portion of the plurality of reference patients have PSO, and a second portion of the plurality of reference patients are healthy control.
144. The method of embodiment 143, wherein the first portion of the plurality of reference patients have one or more skin lesions.
145. The method of embodiment 143, wherein the first portion of the plurality of reference patients do not have a skin lesion.
146. The method of embodiment 136, wherein the first machine learning model is trained to infer whether skin of a patient is indicative of SSc disease state, and a first portion of the plurality of reference patients have SSc, and a second portion of the plurality of reference patients are healthy control.
147. The method of embodiment 146, wherein the first portion of the plurality of reference patients have one or more skin lesions.
148. The method of embodiment 146, wherein the first portion of the plurality of reference patients do not have a skin lesion.
149. The method of embodiment 136, wherein the first machine learning model is trained to infer whether skin of a patient is indicative of lupus disease state or PSO disease state, and a first portion of the plurality of reference patients have lupus, and a second portion of the plurality of reference patients have PSO.
150. The method of embodiment 149, wherein the plurality of reference patients have one or more skin lesions.
151. The method of embodiment 149, wherein the reference patients do not have a skin lesion.
152. The method of embodiment 136, wherein the first machine learning model is trained to infer whether skin of a patient is indicative of lupus disease state or AD disease state, and a first portion of the plurality of reference patients have lupus, and a second portion of the plurality of reference patients have AD.
153. The method of embodiment 152, wherein the plurality of reference patients have one or more skin lesions.
154. The method of embodiment 152, wherein the reference patients do not have a skin lesion.
155. The method of embodiment 136, wherein the first machine learning model is trained to infer whether skin of a patient is indicative of lupus disease state or SSc disease state, and a first portion of the plurality of reference patients have lupus, and a second portion of the plurality of reference patients have SSc.
156. The method of embodiment 155, wherein the plurality of reference patients have one or more skin lesions.
157. The method of embodiment 155, wherein the reference patients do not have a skin lesion.
158. The method of any one of embodiments 124 to 157, further comprising reducing dimensionality of the first machine learning model by at least:
159. The method of embodiment 158, wherein the second machine learning model is trained with the reference data set or a second reference data set, wherein the second reference data set comprises a second plurality of individual reference data sets, wherein a second respective individual reference data set of the second plurality of individual reference data sets comprises i) the second enrichment score of a second reference patient, and ii) second data regarding whether skin of the second reference patient is indicative of the disease state selected from lupus disease state, PSO disease state, AD disease state, or SSc disease state.
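The SHAP-based selection of N features described in embodiments 125 to 128 can be sketched for the special case of a linear model. For a linear model with independent features, the SHAP value of feature i for a given patient is w_i * (x_i - mean(x_i)). The sketch below ranks features by mean absolute SHAP value and keeps the top N. The feature names, weights, and function names are hypothetical, and other model types would require a different SHAP estimator.

```python
def linear_shap_values(weights, feature_means, row):
    """SHAP values of one patient for a linear model, assuming
    independent features: phi_i = w_i * (x_i - mean(x_i))."""
    return {name: weights[name] * (row[name] - feature_means[name])
            for name in weights}


def top_n_features(weights, rows, n):
    """Rank features by mean absolute SHAP value over all patients
    and return (top-n feature names, importance per feature)."""
    names = list(weights)
    means = {f: sum(r[f] for r in rows) / len(rows) for f in names}
    importance = {f: 0.0 for f in names}
    for row in rows:
        phi = linear_shap_values(weights, means, row)
        for f in names:
            importance[f] += abs(phi[f]) / len(rows)
    ranked = sorted(names, key=lambda f: importance[f], reverse=True)
    return ranked[:n], importance


# Hypothetical enrichment-score features for two reference patients.
weights = {"signature_1": 2.0, "signature_2": -0.5, "signature_3": 0.1}
rows = [
    {"signature_1": 1.0, "signature_2": 0.0, "signature_3": 3.0},
    {"signature_1": -1.0, "signature_2": 2.0, "signature_3": -3.0},
]
selected, importance = top_n_features(weights, rows, n=2)
```

Here `selected` holds the N features with the largest mean absolute contribution, which could then serve as the inputs of a reduced second model.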
160. A method for developing a trained machine learning model capable of assessing skin of a patient, the method comprising:
161. A method for determining a gene set capable of assessing a disease state of a patient, the method comprising:
162. The method of embodiment 161, wherein in step (b), the feature contribution of the one or more features of the first machine learning model is determined using a SHapley Additive exPlanations (SHAP) method.
163. The method of embodiment 161 or 162, wherein the feature contribution of the one or more features is determined based on SHAP values for the set of features, and the N features are selected based on the SHAP values.
164. The method of any one of embodiments 161 to 163, wherein N is an integer from 2 to 40.
165. The method of any one of embodiments 161 to 164, wherein N is an integer from 10 to 20.
166. The method of any one of embodiments 161 to 165, wherein independently for each of the selected Tables, the at least one table-specific enrichment score for a respective selected Table is generated based on enrichment assessment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300 or all, genes listed in the respective selected Table.
167. The method of any one of embodiments 161 to 166, wherein the enrichment score of the reference patient comprises one table-specific enrichment score for each of the selected Tables.
168. The method of any one of embodiments 161 to 167, wherein the enrichment assessment is performed using gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), enrichment algorithm, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, log 2 expression analysis, or any combination thereof.
169. The method of any one of embodiments 161 to 168, wherein the enrichment assessment is performed using GSVA, and one or more table-specific enrichment scores comprises one or more GSVA scores.
170. The method of any one of embodiments 161 to 169, wherein the first machine learning model is trained using linear regression, logistic regression, Ridge regression, Lasso regression, elastic net (EN) regression, support vector machine (SVM), gradient boosted machine (GBM), k nearest neighbors (kNN), generalized linear model (GLM), naïve Bayes (NB) classifier, neural network, Random Forest (RF), deep learning algorithm, linear discriminant analysis (LDA), decision tree learning (DTREE), adaptive boosting (ADB), Classification and Regression Tree (CART), or any combination thereof.
171. The method of any one of embodiments 161 to 170, wherein the biological sample comprises a tissue sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof.
172. The method of any one of embodiments 161 to 171, wherein the plurality of individual reference data sets are obtained from a plurality of reference patients.
173. The method of any one of embodiments 161 to 172, further comprising reducing dimensionality of the first machine learning model by at least:
174. The method of embodiment 173, wherein the second machine learning model is trained with the reference data set or a second reference data set, wherein the second reference data set comprises a second plurality of individual reference data sets, wherein a second respective individual reference data set of the second plurality of individual reference data sets comprises i) the second enrichment score of a second reference patient, and ii) second data regarding the disease state of the patient.
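One possible reading of the dimensionality reduction in embodiments 173 and 174 is to train a second model on only the N selected features of the reference data set. The following minimal sketch uses logistic regression, one of the model types listed in embodiment 170, fitted by plain gradient descent. The feature names, labels, hyperparameters, and function names are hypothetical, and a production workflow would more likely rely on an established machine learning library.

```python
import math


def select_columns(rows, selected):
    """Keep only the selected features, in the given order."""
    return [[row[name] for name in selected] for row in rows]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, epochs=500, lr=0.1):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w, b, X):
    """Return 1 (disease state) when the predicted probability is >= 0.5."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5
            else 0 for xi in X]


# Hypothetical reference data: three enrichment scores per patient.
reference_rows = [
    {"signature_1": 1.5, "signature_2": 1.0, "signature_3": 0.2},
    {"signature_1": 1.0, "signature_2": 2.0, "signature_3": -0.1},
    {"signature_1": -1.5, "signature_2": -1.0, "signature_3": 0.2},
    {"signature_1": -1.0, "signature_2": -2.0, "signature_3": -0.1},
]
labels = [1, 1, 0, 0]  # 1 = disease state, 0 = comparator
selected = ["signature_1", "signature_2"]  # the N features kept

X_reduced = select_columns(reference_rows, selected)
w2, b2 = train_logistic(X_reduced, labels)
predictions = predict(w2, b2, X_reduced)
```

The reduced model takes N inputs instead of the full feature set, so its performance can be compared with that of the first model on the same reference patients.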
175. A method for developing a trained machine learning model capable of assessing a disease state of a patient, the method comprising:
176. A method for assessing skin of a patient, the method comprising: analyzing a data set comprising or derived from gene expression measurements of at least 2 genes selected from the genes listed in Table 1, Table 2, Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, Table 4B-28, Table 4C and Table 4D, in a biological sample from the patient, to classify the skin of the patient as indicative of a lupus, PSO, AD, and/or SSc disease state of the patient.
177. The method of embodiment 176, wherein the at least 2 genes are selected from the genes listed in Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28.
178. The method of embodiment 176 or 177, wherein the at least 2 genes comprise at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, 365, 370, 375, 380, 385, 390, 395, 400, 450, 500, 550, 600, 650, 700, 750, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, or 1650 genes.
179. The method of any one of embodiments 176 to 178, wherein the at least 2 genes comprise independently at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, or all genes, or any value or range therebetween, selected from the genes listed in each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48 Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28.
180. The method of any one of embodiments 176 to 179, comprising classifying the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with an accuracy of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
181. The method of any one of embodiments 176 to 180, comprising classifying the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a sensitivity of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
182. The method of any one of embodiments 176 to 181, comprising classifying the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a specificity of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
183. The method of any one of embodiments 176 to 182, comprising classifying the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a positive predictive value of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
184. The method of any one of embodiments 176 to 183, comprising classifying the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a negative predictive value of at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%.
185. The method of any one of embodiments 176 to 184, comprising classifying the skin of the patient as indicative of the lupus, PSO, AD, and/or SSc disease state of the patient with a Receiver operating characteristic (ROC) curve having an Area-Under-Curve (AUC) of at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than about 0.99.
186. The method of any one of embodiments 176 to 185, wherein the patient has lupus, PSO, AD, or SSc.
187. The method of any one of embodiments 176 to 185, wherein the patient is at elevated risk of having lupus, PSO, AD, or SSc.
188. The method of any one of embodiments 176 to 185, wherein the patient is suspected of having lupus, PSO, AD, or SSc.
189. The method of any one of embodiments 176 to 185, wherein the patient is asymptomatic for lupus, PSO, AD, or SSc.
190. The method of any one of embodiments 176 to 189, further comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of lupus, PSO, AD, or SSc disease state of the patient.
191. The method of embodiment 190, wherein the treatment is configured to treat lupus, PSO, AD, or SSc of the patient.
192. The method of embodiment 190, wherein the treatment is configured to reduce a severity of lupus, PSO, AD, or SSc of the patient.
193. The method of embodiment 190, wherein the treatment is configured to reduce a risk of having lupus, PSO, AD, or SSc of the patient.
194. The method of any one of embodiments 190 to 193, wherein the treatment comprises a pharmaceutical composition.
195. The method of any one of embodiments 176 to 194, wherein the biological sample comprises a skin biopsy sample, a blood sample, isolated peripheral blood mononuclear cells (PBMCs), or any derivative thereof.
196. The method of any one of embodiments 176 to 195, wherein the data set is derived from the gene expression measurement data using gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), enrichment algorithm, multiscale embedded gene co-expression network analysis (MEGENA), weighted gene co-expression network analysis (WGCNA), differential expression analysis, Z-score, log 2 expression analysis, or any combination thereof.
197. The method of any one of embodiments 176 to 196, wherein the data set is derived from the gene expression measurement data using GSVA.
198. The method of embodiment 197, wherein the data set comprises one or more GSVA scores of the patient, wherein the one or more GSVA scores are generated using one or more of the Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28, wherein for a respective Table, at least one GSVA score of the patient is generated for enrichment of expression of at least 2 genes listed in the respective Table, and the one or more GSVA scores comprises the at least one GSVA score from each of the selected Table.
199. The method of embodiment 198, wherein the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, or 48, Tables selected from Table 4A-1, Table 4A-2, Table 4A-3, Table 4A-4, Table 4A-5, Table 4A-6, Table 4A-7, Table 4A-8, Table 4A-9, Table 4A-10, Table 4A-11, Table 4A-12, Table 4A-13, Table 4A-14, Table 4A-15, Table 4A-16, Table 4A-17, Table 4A-18, Table 4A-19, Table 4A-20, Table 4B-1, Table 4B-2, Table 4B-3, Table 4B-4, Table 4B-5, Table 4B-6, Table 4B-7, Table 4B-8, Table 4B-9, Table 4B-10, Table 4B-11, Table 4B-12, Table 4B-13, Table 4B-14, Table 4B-15, Table 4B-16, Table 4B-17, Table 4B-18, Table 4B-19, Table 4B-20, Table 4B-21, Table 4B-22, Table 4B-23, Table 4B-24, Table 4B-25, Table 4B-26, Table 4B-27, and Table 4B-28.
200. The method of embodiment 198 or embodiment 199, wherein independently for each Table of the one or more Tables selected, the at least one GSVA score is generated, for enrichment of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, or 295 genes listed in the respective Table.
201. The method of any one of embodiments 176 to 200, wherein the analyzing the data set comprises using a trained machine learning model to classify the skin of the patient as indicative of the lupus, PSO, AD and/or SSc disease state, wherein the trained machine-learning model is trained to generate an inference of whether the skin of the patient is indicative of the lupus, PSO, AD and/or SSc disease state of the patient, based at least on the data set.
202. The method of embodiment 201, wherein the analyzing the data set comprises providing the one or more GSVA scores of the patient as an input to the trained machine-learning model, wherein the trained machine-learning model is trained to generate the inference based at least on the GSVA scores.
203. The method of any one of embodiments 201 to 202, wherein the method further comprises receiving, as an output of the trained machine learning model, the inference; and/or electronically outputting a report indicating whether the skin of the patient is indicative of the lupus, PSO, AD, and/or SSc disease state.
204. The method of any one of embodiments 201 to 203, wherein the machine learning model is trained using linear regression, logistic regression, Ridge regression, Lasso regression, elastic net (EN) regression, support vector machine (SVM), gradient boosted machine (GBM), k nearest neighbors (kNN), generalized linear model (GLM), naïve Bayes (NB) classifier, neural network, Random Forest (RF), deep learning algorithm, linear discriminant analysis (LDA), decision tree learning (DTREE), adaptive boosting (ADB), Classification and Regression Tree (CART), or any combination thereof.
205. The method of any one of embodiments 176 to 204, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-8, Table 4B-25, Table 4B-14, Table 4A-16, Table 4B-22, Table 4B-10, Table 4A-11, Table 4B-16, Table 4B-26, Table 4A-1, Table 4A-19, Table 4A-15, Table 4B-28, Table 4B-15, and Table 4B-23, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient.
206. The method of any one of embodiments 176 to 204, wherein the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-26, Table 4A-8, Table 4A-14, Table 4A-16, Table 4B-11, Table 4A-1, Table 4B-6, Table 4A-10, Table 4B-10, Table 4B-16, Table 4B-2, Table 4B-19, Table 4B-13, Table 4B-1, and Table 4B-25, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus disease state of the patient.
207. The method of embodiment 205 or 206, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the lupus disease state of the patient.
208. The method of any one of embodiments 176 to 204, wherein the skin of the patient contains one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-10, Table 4B-25, Table 4B-8, Table 4B-22, Table 4B-28, Table 4B-16, Table 4A-16, Table 4B-14, Table 4B-13, Table 4B-23, Table 4B-7, Table 4B-15, Table 4A-12, Table 4B-3, and Table 4B-2, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient.
209. The method of any one of embodiments 176 to 204, wherein the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-17, Table 4B-28, Table 4A-6, Table 4A-7, Table 4B-2, Table 4B-20, Table 4A-9, Table 4B-18, Table 4A-12, Table 4A-16, Table 4A-13, Table 4B-23, Table 4B-9, Table 4A-3, and Table 4A-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the AD disease state of the patient.
210. The method of embodiment 208 or 209, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the AD disease state of the patient.
211. The method of any one of embodiments 176 to 204, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-3, Table 4B-25, Table 4B-10, Table 4B-16, Table 4B-8, Table 4B-14, Table 4B-2, Table 4A-7, Table 4B-28, Table 4B-23, Table 4B-20, Table 4B-26, Table 4A-13, Table 4B-18, and Table 4A-16, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient.
212. The method of any one of embodiments 176 to 204, wherein the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-1, Table 4B-3, Table 4B-12, Table 4A-14, Table 4A-20, Table 4B-17, Table 4B-20, Table 4B-27, Table 4A-9, Table 4A-15, Table 4A-18, Table 4A-13, Table 4B-26, Table 4B-2, and Table 4A-5, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the PSO disease state of the patient.
213. The method of embodiment 211 or 212, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the PSO disease state of the patient.
214. The method of any one of embodiments 176 to 204, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4A-16, Table 4B-8, Table 4B-25, Table 4B-21, Table 4B-26, Table 4B-10, Table 4B-28, Table 4B-2, Table 4B-27, Table 4B-14, Table 4A-18, Table 4A-6, Table 4A-15, Table 4B-12, and Table 4B-23, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the SSc disease state of the patient.
215. The method of embodiment 214, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the SSc disease state of the patient.
216. The method of any one of embodiments 176 to 204, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-7, Table 4B-27, Table 4A-8, Table 4A-9, Table 4B-3, Table 4A-10, Table 4A-4, Table 4B-4, Table 4B-1, Table 4A-15, Table 4B-8, Table 4A-11, Table 4B-13, Table 4A-17, and Table 4B-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient.
217. The method of any one of embodiments 176 to 204, wherein the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-16, Table 4A-14, Table 4B-26, Table 4A-1, Table 4A-15, Table 4B-10, Table 4B-25, Table 4A-8, Table 4A-16, Table 4B-28, Table 4B-1, Table 4A-10, Table 4A-12, Table 4B-13, and Table 4B-15, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or AD disease state of the patient.
218. The method of embodiment 216 or 217, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the lupus or AD disease state of the patient.
219. The method of any one of embodiments 176 to 204, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4B-1, Table 4A-4, Table 4A-7, Table 4A-14, Table 4A-6, Table 4B-3, Table 4B-20, Table 4A-16, Table 4A-15, Table 4B-18, Table 4B-11, Table 4A-11, Table 4B-17, Table 4B-5, and Table 4B-7, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient.
220. The method of any one of embodiments 176 to 204, wherein the skin of the patient does not comprise a lesion, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4A-14, Table 4B-1, Table 4A-16, Table 4A-15, Table 4B-16, Table 4A-12, Table 4A-8, Table 4A-1, Table 4B-25, Table 4B-26, Table 4B-24, Table 4B-22, Table 4A-7, Table 4B-10, and Table 4A-10, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or PSO disease state of the patient.
221. The method of embodiment 219 or 220, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the lupus or PSO disease state of the patient.
222. The method of any one of embodiments 176 to 204, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4A-20, Table 4B-27, Table 4B-11, Table 4B-8, Table 4A-4, Table 4A-19, Table 4A-9, Table 4B-20, Table 4B-16, Table 4B-7, Table 4B-21, Table 4B-23, Table 4A-15, Table 4B-13, and Table 4A-8, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the lupus or SSc disease state of the patient.
223. The method of embodiment 222, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the lupus or SSc disease state of the patient.
224. The method of any one of embodiments 176 to 204, wherein the skin of the patient comprises one or more lesions, and the one or more Tables comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 Tables selected from the group consisting of Table 4A-16, Table 4B-26, Table 4B-25, Table 4B-2, Table 4B-22, Table 4B-14, Table 4A-13, Table 4A-15, Table 4B-4, Table 4B-9, Table 4A-10, Table 4A-12, Table 4B-6, Table 4B-1, and Table 4A-5, and the one or more GSVA scores of the patient is analyzed to classify the skin of the patient as indicative of the DLE or SCLE disease state of the patient.
225. The method of embodiment 224, comprising administering a treatment to the patient based at least in part on the classification of the skin of the patient as indicative of the DLE or SCLE disease state of the patient.
Unless otherwise defined, all technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
As used herein, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. Any reference to “or” herein is intended to encompass “and/or” unless otherwise stated.
As used herein, the term “about” refers to an amount that is near the stated amount by 10%, 5%, or 1%, including increments therein.
As used herein, the phrases “at least one”, “one or more”, and “and/or” are open-ended expressions that are both conjunctive and disjunctive in operation. For example, each of the expressions “at least one of A, B and C”, “at least one of A, B, or C”, “one or more of A, B, and C”, “one or more of A, B, or C” and “A, B, and/or C” means A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B and C together.
As used herein, the term “Gini impurity” refers to a measure of how often a randomly chosen element from a set would be incorrectly labeled if it were randomly labeled according to the distribution of labels in the set.
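As a non-limiting illustration, the definition above corresponds to the quantity one minus the sum of the squared class proportions. The following minimal sketch computes it for a list of labels; the function name and the example labels are hypothetical and are used only for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 - sum(p_k ** 2), where p_k is the proportion of class k."""
    n = len(labels)
    if n == 0:
        return 0.0  # an empty set is treated as pure in this sketch
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# A set holding a single class has zero impurity; an even two-class
# split has the maximum two-class impurity of 0.5.
pure = gini_impurity(["active"] * 4)
mixed = gini_impurity(["active", "active", "inactive", "inactive"])
```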
As used herein, the term “lesion” refers to a potential disease lesion, e.g., a skin lesion potentially associated with and/or potentially directly resulting from lupus, psoriasis, atopic dermatitis, systemic sclerosis (scleroderma), or a combination thereof, as determined by one of skill in the art. In some embodiments, the lesion does not include a traumatic injury, e.g., a cut, scrape, scratch, burn, etc., and/or a skin affliction of any known origin not associated with a disease state indicated by the skin classification, e.g., contact dermatitis, a food allergy, and/or a drug reaction. In some embodiments, the skin lesion does not include a lesion that is not potentially associated with and/or potentially directly resulting from lupus, psoriasis, atopic dermatitis, systemic sclerosis (scleroderma), or a combination thereof.
Reference in the specification to “embodiments,” “certain embodiments,” “preferred embodiments,” “specific embodiments,” “some embodiments,” “an embodiment,” “one embodiment” or “other embodiments” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least some embodiments, but not necessarily all embodiments, of the present disclosure.
Many complex and multi-systemic diseases and conditions currently pose major diagnostic and therapeutic challenges. Despite the wealth of records from, for example, genetic, epigenetic, and gene expression data that has emerged in the past few years, physicians often still rely on clinical evaluation and laboratory tests, including measurement of autoantibodies and complement levels.
Successful relation of records (e.g., gene expression records) to a specific disease phenotype activity has been attempted, including efforts to identify individual genes that predicted subsequent flares, and through the determination of a discrete group of differentially expressed (DE) genes that may be found in a particular record. Despite these advances, however, no such approach is available with sufficient predictive value to utilize in evaluation and treatment.
As such, there is a need for a predictive tool for evaluating patients at both the chemical and cellular levels to advance personalized treatment. Data analytical techniques such as machine learning may enable proper correlation between genetic records and phenotypes.
The machine learning models tested here provide the basis of personalized medicine. Integration of the methods herein with emerging high-throughput record sampling technologies may unlock the potential to develop a simple blood test to predict phenotypic activity. The disclosures herein may be generalized to predict other manifestations, such as organ involvement. A better understanding of the cellular processes that drive pathogenesis may eventually lead to customized therapeutic strategies based on records' unique patterns of cellular activation.
One aspect disclosed herein is a method of identifying one or more records (e.g., raw gene expression data, whole gene expression data, blood gene expression data, or informative gene modules). The method may comprise receiving a plurality of first records, receiving a plurality of second records, receiving a plurality of third records, applying a machine learning algorithm to at least one first record and at least one second record to determine a classifier (e.g., a machine learning classifier), and applying the classifier to the plurality of third records. Applying the classifier to the plurality of third records may identify one or more third records associated with the specific phenotype. In some embodiments, applying a machine learning algorithm to the third data set comprises applying a machine learning algorithm to a plurality of unique third data sets.
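As a non-limiting illustration, the flow described above (determine a classifier from labeled first and second records, then apply it to third records) may be sketched as follows. A nearest-centroid rule is used here only as a simple stand-in for the machine learning algorithm; the labels "active" and "inactive", the two-feature records, and all function names are hypothetical assumptions.

```python
import math


def fit_centroids(records, labels):
    """Determine a classifier: one mean feature vector per label."""
    sums, counts = {}, {}
    for record, label in zip(records, labels):
        acc = sums.setdefault(label, [0.0] * len(record))
        for i, value in enumerate(record):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, record):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], record))


# Hypothetical records: each tuple is a pair of module enrichment scores.
first_records = [(2.0, 1.8), (2.2, 2.1)]     # assumed labeled "active"
second_records = [(0.1, 0.0), (-0.2, 0.3)]   # assumed labeled "inactive"
third_records = [(1.9, 2.0), (0.0, 0.1), (2.5, 1.7)]  # unlabeled

classifier = fit_centroids(
    first_records + second_records,
    ["active"] * len(first_records) + ["inactive"] * len(second_records),
)

# Identify the third records associated with the phenotype of interest.
identified = [i for i, rec in enumerate(third_records)
              if classify(classifier, rec) == "active"]
```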
The records may comprise, for example, raw gene expression data, whole gene expression data, blood gene expression data, informative gene modules, or any combination thereof. The records may be generated by Weighted Gene Co-expression Network Analysis (WGCNA). In some embodiments, at least one of the first records and the second records comprise nucleic acid sequencing data, transcriptome data, genome data, epigenome data, proteome data, metabolome data, virome data, methylome data, lipidomic data, lineage-ome data, nucleosomal occupancy data, a genetic variant, a gene fusion, an insertion or deletion (indel), or any combination thereof. In some embodiments, the first records and the second records are in different formats. In some embodiments, the first records and the second records are from different sources, different studies, or both.
In some embodiments each record is associated with a specific phenotype (e.g., a disease state, an organ involvement, or a medication response). Each first record may be associated with one or more of a plurality of phenotypes. The plurality of second records and the plurality of first records may be non-overlapping. The third records may be distinct from the plurality of first records, the plurality of second records, or both. The third records may comprise a plurality of unique third data sets.
The records may be received from the Gene Expression Omnibus (GEO, publicly available from the National Center for Biotechnology Information, e.g., on the website operated by National Library of Medicine, National Institutes of Health). The records may be associated with purified cell populations, whole blood gene expression, or both. A data set may comprise records comprising microarray, next-generation sequencing, and any other form of high-throughput functional genomic data known to those of skill in the art. The records received from a Gene Expression Omnibus source may comprise GSE10325, GSE26975, GSE38351, GSE39088, GSE45291, GSE49454, GSE72535, GSE52471, GSE81071, GSE109248, GSE100093, GSE120809, GSE117239, GSE117468, GSE130588, GSE58095, GSE95065, GSE121212, GSE137430, GSE157194, GSE130955, or any combination thereof. The records received from a Gene Expression Omnibus source may comprise GSE32583, GSE49898, GSE72410, GSE153021, GSE32591, GSE86423, GSE8642, or any combination thereof.
For example, as the most important genes may be involved in a number of functions other than interferon signaling, such as RNA processing, ubiquitylation, and mitochondrial processes, these pathways may play important roles in directing, or at least be indicative of, phenotypic activity. CD4 T cells originally may contribute the most important modules. However, when the modules are de-duplicated, CD14 monocyte-derived modules prove important as unique genes expressed by CD14 monocytes in tandem with interferon genes may be informative in the study of cell-specific methods of pathogenesis.
In some embodiments, the phenotype comprises a disease state, an organ involvement, a medication response, or any combination thereof. The disease state may comprise an active disease state, or an inactive disease state. At least one of the active disease state and the inactive disease state may be characterized by standard clinical composite outcome measures. The active disease state may comprise a Disease Activity Index of 6 or greater.
The disease may comprise an acute disease, a chronic disease, a clinical disease, a flare-up disease, a progressive disease, a refractory disease, a subclinical disease, or a terminal disease. The disease may comprise a localized disease, a disseminated disease, or a systemic disease. The disease may comprise an immune disease, a cancer, a genetic disease, a metabolic disease, an endocrine disease, a neurological disease, a musculoskeletal disease, or a psychiatric disease. The active disease state may comprise a Systemic Lupus Erythematosus Disease Activity Index (SLEDAI) of 6 or greater.
The organ involvement may comprise a possibly involved organ. The possibly involved organ may comprise bone, skin, hematopoietic system, spleen, liver, lung, mucosa, eye, ear, pituitary, or any combination thereof. The medication response may comprise an ultra-rapid metabolizer response, an extensive metabolizer response, an intermediate metabolizer response, or a poor metabolizer response. The ultra-rapid metabolizer response may refer to a record with substantially increased metabolic activity. The extensive metabolizer response may refer to a record with normal metabolic activity. The intermediate metabolizer response may refer to a record with reduced metabolic activity. The poor metabolizer response may refer to a record with little to no functional metabolic activity.
The classifiers described herein may be used in machine learning algorithms. A variety of machine learning classifiers exist, wherein each classifier produces a unique machine learning process and/or output. The machine learning algorithms may comprise a biased algorithm or an unbiased algorithm. The biased algorithm may comprise Gene Set Variation Analysis (GSVA) enrichment of phenotype-associated cell-specific modules. The unbiased approach may employ all available phenotypic data. The machine learning algorithm may comprise an elastic generalized linear model (GLM), a k-nearest neighbors classifier (KNN), a random forest (RF) classifier, or any combination thereof. GLM, KNN, and RF machine learning algorithms may be performed using the glmnet, caret, and randomForest R packages, respectively.
The random forest classifier is able to sort through the inherent heterogeneity of the plurality of records to identify one or more third records associated with the specific phenotype. In some embodiments, the classifier identifies said one or more third records associated with the specific phenotype with an accuracy of at least about 70%. The implementation of the random forest classifier herein enables a specific phenotype association sensitivity of 85% and a specific phenotype association specificity of 83%. Further classifier optimization, however, may yield improved results.
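As a non-limiting illustration, accuracy, sensitivity, and specificity such as those recited above may be computed from true and predicted labels as sketched below. The labels and predictions shown are hypothetical and do not reproduce any result described herein.

```python
def confusion_counts(y_true, y_pred, positive="active"):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Fraction of truly positive records that were called positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of truly negative records that were called negative."""
    return tn / (tn + fp)


def accuracy(tp, fp, tn, fn):
    """Fraction of all records that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


# Hypothetical labels for eight records.
y_true = ["active"] * 4 + ["inactive"] * 4
y_pred = ["active", "active", "active", "inactive",
          "inactive", "inactive", "inactive", "active"]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
acc = accuracy(tp, fp, tn, fn)
```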
KNN may classify unknown samples based on their proximity to a set number K of known samples. K may be 5% of the size of the pluralities of first, second, and third records. Alternatively, K may be 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, or any increment therein. A large K value may enable more precise calculations with less overall noise. Alternatively, the K value may be determined through cross-validation by using an independent set of records to validate the K value. If the initial value of K is even, 1 may be added in order to avoid ties. RF may generate 500 decision trees which vote on the class of each sample. The Gini impurity index, a standard measure of misclassification error, may be used to rank the importance of the variables on which the decision trees split. In addition, pooled predictions may be assigned based on the average class probabilities across the three classifiers.
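As a non-limiting illustration, the selection of K, the neighbor vote, and the pooling of class probabilities described above may be sketched as follows. Euclidean distance, the rounding rule, the example records, and the GLM and RF probabilities are hypothetical assumptions made only for this sketch.

```python
import math
from collections import Counter


def choose_k(n_records, fraction=0.05):
    """Take K as a fraction of the records and add 1 if K is even."""
    k = max(1, round(n_records * fraction))
    if k % 2 == 0:
        k += 1  # an odd K avoids tied votes between two classes
    return k


def knn_probabilities(train_x, train_y, sample, k):
    """Class probabilities as the share of the K nearest known samples."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], sample))
    nearest = ranked[:k]
    votes = Counter(label for _, label in nearest)
    return {label: votes[label] / len(nearest) for label in set(train_y)}


def pool_probabilities(per_classifier):
    """Average the class probabilities across classifiers."""
    labels = per_classifier[0].keys()
    n = len(per_classifier)
    return {label: sum(p[label] for p in per_classifier) / n for label in labels}


train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["inactive"] * 3 + ["active"] * 3

knn_probs = knn_probabilities(train_x, train_y, (0.2, 0.1), k=3)
glm_probs = {"active": 0.3, "inactive": 0.7}  # hypothetical GLM output
rf_probs = {"active": 0.2, "inactive": 0.8}   # hypothetical RF output

pooled = pool_probabilities([knn_probs, glm_probs, rf_probs])
pooled_class = max(pooled, key=pooled.get)
```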
The GLM algorithm may carry out logistic regression with a tunable elastic penalty term to find a balance between an L1 (LASSO) penalty and an L2 (ridge) penalty, whereby the penalties facilitate variable selection in order to generate sparse solutions. Least Absolute Shrinkage and Selection Operator (LASSO) is a regularization and feature selection technique to reduce overfitting in regression problems. Ridge regression employs a penalty term to shrink the coefficient values. In some embodiments, the elastic generalized linear model classifier employs an elastic penalty of about 0.9, wherein the penalty is 90% LASSO and 10% ridge. The elastic penalty may be 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, or any increments therein.
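As a non-limiting illustration, the mixing of the L1 and L2 penalties may be sketched as below. The sketch assumes the parameterization documented for the glmnet package, in which the penalty is lambda multiplied by (alpha times the L1 norm plus (1 - alpha)/2 times the squared L2 norm); the coefficient values and the value of lambda are hypothetical.

```python
def elastic_net_penalty(coefs, lam, alpha=0.9):
    """Penalty lam * (alpha * L1 + (1 - alpha) / 2 * squared L2).

    alpha = 1 gives a pure LASSO penalty and alpha = 0 a pure ridge penalty.
    """
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2_squared)


coefs = [1.0, -2.0, 0.0]  # hypothetical regression coefficients

mostly_lasso = elastic_net_penalty(coefs, lam=1.0, alpha=0.9)
pure_lasso = elastic_net_penalty(coefs, lam=1.0, alpha=1.0)
pure_ridge = elastic_net_penalty(coefs, lam=1.0, alpha=0.0)
```

With alpha at 0.9, most of the penalty grows linearly with coefficient magnitude, which tends to drive small coefficients to exactly zero, while the small ridge share shrinks the remaining coefficients.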
Records may be classified as active or inactive using two different methodologies: (1) a leave-one-study-out cross-validation approach or (2) a 10-fold cross-validation approach. GLM, KNN, and RF classifiers may be tasked with identifying active and inactive state records based on whole blood (WB) gene expression data and module enrichment data.
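As a non-limiting illustration, the two cross-validation methodologies described above may be sketched as index splits. The study identifiers, the record count, and the random seed are hypothetical assumptions.

```python
import random


def leave_one_study_out(study_ids):
    """Yield (held_out_study, train_indices, test_indices) per study."""
    for study in sorted(set(study_ids)):
        test = [i for i, s in enumerate(study_ids) if s == study]
        train = [i for i, s in enumerate(study_ids) if s != study]
        yield study, train, test


def k_fold(n_records, k=10, seed=0):
    """Yield (train_indices, test_indices) for k shuffled folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Hypothetical study membership for six records.
study_ids = ["study_A", "study_A", "study_B", "study_C", "study_C", "study_C"]

loso_splits = list(leave_one_study_out(study_ids))
ten_fold_splits = list(k_fold(25, k=10))
```

In the leave-one-study-out case every test record comes from a study the classifier has never seen, whereas in the 10-fold case each test fold is generally drawn from the same studies as the training records.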
Supervised classification approaches using elastic generalized linear modeling, k-nearest neighbors, and random forest classifiers may be implemented. The trends in performance when cross-validating by one of the pluralities of records or cross-validating 10-fold display the potential advantages and disadvantages of diagnostic tests incorporating gene expression data or module enrichment. Cross-validating by one of the pluralities of records may be used to generalize 1-fold cross validation as a suboptimal scenario, whereas a 10-fold cross-validation is in fact more optimal. Although classification of active and inactive records from the pluralities of different records with 1-fold cross-validation may be suboptimal, module enrichment may be employed to smooth out much of the technical variation between data sets. 10-fold cross-validation may enable a more standardized diagnostic test. Although the plurality of second records and the plurality of first records are non-overlapping, the test set employs overlapping records to facilitate proper classification.
Furthermore, modules that may be negatively associated with phenotypic activity may be just as important in classification as positively associated modules. Further study of underrepresented categories of transcripts may enhance understanding and correlation of phenotypic activity.
Reduction of technical noise may improve classification. For example, RNA-Seq platforms, which produce transcript count records rather than probe intensity values, may display less technical variation across records if all samples are processed in the same way.
The strong performance of the random forest classifier indicates that nonlinear, decision tree-based methods of classification may be ideal because decision trees ask questions about new records sequentially and adaptively. Random forest does not apply a one-size-fits-all approach to each of the different types of records, which allows for classification of records whose expression patterns make them a minority within their phenotype. As such, active records that do not resemble the majority of active records still have a strong chance of being properly classified by random forest. By contrast, other methods may approach variables from new records all at once.
In some embodiments, the method further comprises filtering the first records, the second records, or both. In some embodiments, the filtering comprises normalizing, variance correction, removing outliers, removing background noise, removing data without annotation data, scaling, Weighted Gene Co-expression Network Analysis, enrichment analysis, dimensionality reduction, or any combination thereof.
In some embodiments, the normalizing is performed by Robust Multi-Array Analysis (RMA), Guanine Cytosine Robust Multi-Array Analysis (GCRMA), Linear Models for Microarray Data, variance stabilizing transformation (VST), normal-exponential quantile correction (NEQC), or any combination thereof. RMA may summarize the perfect matches through a median polish algorithm, quantile normalization, or both. Variance-stabilizing transformation may simplify considerations in graphical exploratory data analysis, allow the application of simple regression-based or analysis of variance techniques, or both. Normalized expression values may be variance corrected using local empirical Bayesian shrinkage, and DE may be assessed using the Linear Models for Microarray Data (LIMMA) package. Resulting p-values may be adjusted for multiple hypothesis testing using the Benjamini-Hochberg correction, which results in a false discovery rate (FDR). Significant genes within each study may be filtered to retain DE genes with an FDR < 0.2, which may be considered statistically significant. The FDR threshold may be selected a priori to diminish the number of genes that may be excluded as false negatives.
In some embodiments, the variance correction comprises employing a local empirical Bayesian shrinkage, adjusting the p-values for multiple hypothesis testing using the Benjamini-Hochberg correction, removing all data with a false discovery rate of 0.2 or greater, or any combination thereof. The Benjamini-Hochberg procedure may control the false discovery rate, i.e., the expected proportion of incorrectly rejected true null hypotheses among all rejected hypotheses.
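As a non-limiting illustration, the Benjamini-Hochberg adjustment and the subsequent filtering of genes by FDR may be sketched as follows. The gene names and p-values are hypothetical, and the 0.2 threshold is taken from the filtering of DE genes described above.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # stay monotone non-decreasing in the raw p-value ranking.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


genes = ["gene_a", "gene_b", "gene_c", "gene_d"]  # hypothetical names
p_values = [0.01, 0.04, 0.03, 0.5]                # hypothetical p-values

fdr = benjamini_hochberg(p_values)

# Retain the genes whose adjusted value falls below the threshold.
retained = [gene for gene, q in zip(genes, fdr) if q < 0.2]
```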
In some embodiments, the Weighted Gene Co-expression Network Analysis comprises calculating a topology matrix, clustering the data based on the topology matrix, correlating module eigengenes with traits (by Pearson correlation for traits on a linear scale, by Spearman correlation for nonparametric traits, and by point-biserial correlation or t-test for dichotomous traits), or any combination thereof. A topology matrix may specify the connections between vertices in a directed multigraph.
Log 2-normalized microarray expression values from purified CD4, CD14, CD19, CD33, and low density granulocyte (LDG) populations may be used as input to WGCNA to conduct an unsupervised clustering analysis, resulting in co-expression “modules,” or groups of densely interconnected genes which may correspond to comparably regulated biologic pathways. For each experiment, an approximately scale-free topology matrix (TOM) may be first calculated to encode the network strength between probes. Probes may be clustered into WGCNA modules based on TOM distances. Resultant dendrograms of correlation networks may be trimmed to isolate individual modular groups of probes by partitioning around medoids and labeled using color assignments based on module size. Expression profiles of genes within modules may be summarized by a module eigengene (ME), which may be analogous to the module's first principal component. MEs act as characteristic expression values for their respective modules and may be correlated with sample traits such as SLEDAI or cell type by Pearson correlation for continuous or semi-continuous traits and by point-biserial correlation for dichotomous traits.
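As a non-limiting illustration, the correlation of a module eigengene with a continuous trait and with a dichotomous trait may be sketched as follows. The eigengene values, SLEDAI scores, and group coding are hypothetical; the point-biserial correlation is computed here as a Pearson correlation against a 0/1 coding of the trait, to which it is numerically equivalent.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def point_biserial(values, groups):
    """Correlation of a continuous variable with a 0/1 coded trait."""
    return pearson(values, [float(g) for g in groups])


# Hypothetical module eigengene values for five samples.
eigengene = [-0.4, -0.2, 0.0, 0.2, 0.4]
sledai = [2, 4, 6, 8, 10]      # hypothetical continuous trait
is_active = [0, 0, 0, 1, 1]    # hypothetical dichotomous trait

r_sledai = pearson(eigengene, sledai)
r_active = point_biserial(eigengene, is_active)
```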
WGCNA modules from CD4, CD14, CD19, and CD33 cells may be tested for correlation to SLEDAI. Plasma cell modules may be generated by differential expression analysis and not WGCNA, but may be included because of the established importance of plasma cells in SLE pathogenesis.
Removing the outliers may be performed by statistical analysis using R and relevant Bioconductor packages. Non-normalized arrays may be inspected for visual artifacts or poor hybridization using Affy QC plots. Principal Component Analysis (PCA) plots may be used to inspect the raw data files for outliers. Data sets culled of outliers may be cleaned of background noise and normalized using RMA, GCRMA, or NEQC where appropriate. Data sets may be then filtered to remove probes with low intensity values and probes without gene annotation data. WB gene expression data sets may be filtered to only include genes that passed quality control in all data sets. Differential expression (DE) analysis and WGCNA may then be carried out on data sets. WB gene expression data sets may then be further processed before machine learning analysis. WB gene expression values may be centered and scaled to have zero-mean and unit-variance within each data set and the standardized expression values from each data set may be joined for classification.
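By way of non-limiting illustration, the centering, scaling, and joining of data sets described above may be sketched as follows. The gene names and values are hypothetical. Standardizing each gene separately within each data set, and using the population standard deviation, are assumptions of this sketch rather than details stated in the text.

```python
import statistics


def standardize_within(dataset):
    """Z-score each gene across the samples of one data set.

    dataset maps a gene name to its list of expression values.
    """
    out = {}
    for gene, values in dataset.items():
        mean = statistics.fmean(values)
        sd = statistics.pstdev(values)
        # A gene with no variation carries no signal; map it to zeros.
        out[gene] = [(v - mean) / sd if sd > 0 else 0.0 for v in values]
    return out


def join_datasets(datasets):
    """Keep genes present in every data set and concatenate the samples."""
    shared = set.intersection(*(set(d) for d in datasets))
    standardized = [standardize_within(d) for d in datasets]
    return {g: [v for d in standardized for v in d[g]] for g in sorted(shared)}


# Hypothetical data sets measured on very different scales.
dataset_1 = {"GENE1": [1.0, 2.0, 3.0], "GENE2": [10.0, 10.0, 10.0]}
dataset_2 = {"GENE1": [100.0, 300.0], "GENE3": [5.0, 7.0]}
joined = join_datasets([dataset_1, dataset_2])
```

Because each data set is standardized before joining, the differing scales of the two hypothetical platforms do not dominate the combined values.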
The GSVA-R package may be used as a non-parametric method for estimating the variation of pre-defined gene sets in WB gene expression data sets. Standardized expression values from WB data sets may be used to test for enrichment of cell-specific WGCNA gene modules using the Single-sample Gene Set Enrichment Analysis (ssGSEA) method, which scores single samples in isolation and may be thus shielded from technical variation within and among data sets. Statistical analysis of GSVA enrichment scores may be performed by Spearman correlation or Welch's unequal variances t-test, where appropriate. GSVA may be performed on three WB datasets using 25 WGCNA modules made from purified cells with correlation or published relationship to SLEDAI.
Patterns of enrichment of WGCNA modules that are derived from isolated cell populations of WB that are correlated to the phenotype may be more useful than gene expression across the pluralities of records to identify active versus inactive state records. To characterize the relationships between gene signatures from various records and phenotypic activity, WGCNA may be used to generate co-expression gene modules from purified populations of cells from records with an active disease state. Such records may be subsequently tested for enrichment in whole blood of other records. WGCNA analysis of leukocyte subsets may result in several gene modules with significant Pearson correlations to SLEDAI (all |r|>0.47, p<0.05). CD4, CD14, CD19, and CD33 cells had 3, 6, 8, and 4 significant modules, respectively. Two low-density granulocyte (LDG) modules may be created by performing WGCNA analysis of LDGs along with either neutrophils or HC neutrophils and merging the modules most strongly expressed by LDGs. Two plasma cell (PC) modules may be created by using the most increased and decreased transcripts of isolated plasma cells compared to naïve and memory B cells.
Gene Ontology (GO) analysis of the genes within each of the records indicates that some processes, such as those related to interferon signaling, RNA transcription, and protein translation, may be shared among cell types, whereas other processes may be unique to certain cell types and may be used to improve classification of records.
GSVA enrichment may be performed using the 25 cell-specific gene modules in WB from 156 records (82 active, 74 inactive). Of the 25 cell-specific modules, 12 had enrichment scores with significant Spearman correlations to SLEDAI (p<0.05), and 14 had enrichment scores with significant differences between active and inactive state records by Welch's unequal variances t-test (p<0.05). Notably, each cell type produced at least one module with a significant correlation to SLEDAI in WB and at least one module with a significant difference in enrichment scores between active and inactive records, demonstrating a relationship between phenotypic activity in specific cellular subsets and overall phenotypic activity in WB. However, as the Spearman's rho values ranged from −0.40 to +0.36, no one module may have a substantial predictive value. Furthermore, the effect sizes as measured by Cohen's d when testing active versus inactive enrichment scores ranged from −0.85 to +0.79. The CD4 Floralwhite and Orangered4 modules, which had the largest positive and negative effect sizes, respectively, showed a high degree of overlap in the enrichment scores of active and inactive records, where error bars indicate mean±standard deviation. WB may be unable to fully separate active records from inactive records.
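By way of non-limiting illustration, the effect size (Cohen's d) and the Welch's unequal variances t statistic used to compare enrichment scores of active and inactive records may be computed as sketched below. The enrichment scores are hypothetical, and the conversion of the t statistic to a p-value (which requires the t distribution) is omitted.

```python
import math
import statistics


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled_sd


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    # Squared standard errors of the two group means.
    sa = statistics.variance(a) / na
    sb = statistics.variance(b) / nb
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df


# Hypothetical module enrichment scores, for illustration only.
active_scores = [0.2, 0.4, 0.6]
inactive_scores = [-0.1, 0.1, 0.3]

d = cohens_d(active_scores, inactive_scores)
t_stat, dof = welch_t(active_scores, inactive_scores)
```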
Analysis of individual phenotypic activity associated peripheral cellular subset gene modules may not be sufficient to predict phenotypic activity in unrelated WB data sets, since no single module from any cell type may be able to separate active from inactive state records. Although no single module had a sufficiently high predictive value, many cell-specific gene modules may be combined and optimized to predict phenotypes of active records. Moreover, the results emphasized the need for more advanced analysis to employ gene expression analysis to predict phenotypic activity.
When training and testing sets are formed by holding out entire data sets, machine learning algorithms using raw gene expression data had an average classification accuracy of only 53 percent. However, converting this gene expression data to module enrichment improved classification accuracy to 71 percent. When training and testing sets are formed by mixing records from the three data sets, module enrichment remained at a 70 percent classification accuracy. However, classification accuracy using raw gene expression increased to a mean of 79 percent. The best overall performance came from the random forest classifier, which had a predictive accuracy of 84 percent.
The performance of each machine learning algorithm may be determined by evaluating 2 different forms of cross-validation. A random 10-fold cross-validation may randomly assign each record to one of 10 groups. A leave-one-study-out cross-validation may determine the effects of systematic technical differences among data sets on classification performance. For each pass of cross-validation, one fold or study may be held out as a test set, whereby the classifiers are trained on the remaining data. Accuracy may be assessed as the proportion of records correctly classified across all testing folds. Performance metrics such as sensitivity and specificity may be assessed after cross-validation by agglomerating class probabilities and assignments from each fold or study. Receiver Operating Characteristic (ROC) curves may be generated using the pROC R package.
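By way of non-limiting illustration, the two forms of cross-validation described above may be sketched as follows. The record identifiers and study labels are hypothetical, and the classifier itself is left out; only the splitting of records and the pooling of accuracy across test folds are shown.

```python
import random


def random_k_fold(record_ids, k=10, seed=0):
    """Randomly assign each record to one of k folds."""
    ids = list(record_ids)
    random.Random(seed).shuffle(ids)
    # Striding over the shuffled list gives k folds of near-equal size.
    return [ids[i::k] for i in range(k)]


def leave_one_study_out(study_of):
    """Yield (held_out_study, train_ids, test_ids) for each study."""
    for study in sorted(set(study_of.values())):
        test = [r for r, s in study_of.items() if s == study]
        train = [r for r, s in study_of.items() if s != study]
        yield study, train, test


def pooled_accuracy(true_labels, predicted_labels):
    """Proportion of records classified correctly across all test folds."""
    hits = sum(predicted_labels[r] == true_labels[r] for r in predicted_labels)
    return hits / len(predicted_labels)


# Hypothetical record identifiers and study labels, for illustration only.
study_names = ("study_a", "study_b", "study_c")
study_of = {f"rec{i}": study_names[i % 3] for i in range(30)}
folds = random_k_fold(study_of, k=10)
splits = list(leave_one_study_out(study_of))
```

In either scheme each record appears in exactly one test set, so predictions from all passes can be pooled before computing accuracy, sensitivity, and specificity.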
In almost all cases, the random forest classifier outperformed the GLM and KNN classifiers, although the results may not be significantly different when assessed by testing for equality of proportions (p>0.05). Pooled predictions based on the class probabilities from the three classifiers may not improve overall performance.
When cross-validating by study, the use of expression values may achieve an accuracy of only 53 percent, which is consistent with the findings that gene expression values may provide less value towards classifying unfamiliar records. When the training records and test records are greatly heterogeneous, the patterns learned by the classifiers may be less helpful for classifying test records. Remarkably, the use of module enrichment scores improved accuracy to approximately 70 percent.
The 10-fold cross-validation with raw gene expression values may result in better performance compared to the leave-one-study-out cross-validation. This increase in performance may be attributed to the presence of records from each of the pluralities of first, second, and third records in both the training and test sets. In this case, the classifiers may learn patterns inherent to each set of records. In this circumstance, the random forest classifier may be the strongest performer with 84% accuracy (85% sensitivity, 83% specificity), whereby the ROC curve demonstrates an excellent tradeoff between recall and fall-out. The performance of module enrichment, however, may not be substantially different between 10-fold cross-validation and leave-one-study-out cross-validation.
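By way of non-limiting illustration, the relationship among accuracy, sensitivity (recall), specificity, and fall-out may be sketched as follows. The counts are hypothetical and assume a balanced set of 100 active and 100 inactive records; they are not the counts underlying the figures reported above.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute summary metrics from the four confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recall, the true positive rate
    specificity = tn / (tn + fp)  # the true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Fall-out is the false positive rate, the x-axis of a ROC curve.
        "fall_out": 1.0 - specificity,
    }


# Hypothetical counts: 100 active records (85 detected) and
# 100 inactive records (83 correctly called inactive).
metrics = classification_metrics(tp=85, fn=15, tn=83, fp=17)
```

A ROC curve plots recall against fall-out as the decision threshold varies, so a classifier with high sensitivity and high specificity at a single threshold sits near the upper left corner of that plot.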
Overall, in a study-by-study approach (leave-one-study-out cross-validation), module enrichment may be more successful than raw gene expression. Importantly, when using the 10-fold cross-validation approach, raw gene expression may outperform module enrichment. Thus, phenotypic activity classification based on raw gene expression may be sensitive to technical variability, whereas classification based on module enrichment may cope better with variation among data sets.
Variable importance in a random forest may provide insight into the drivers of the identification of phenotypic activity. Random forest classifiers may be trained on all records from each of the pluralities of records in order to identify the most important genes and modules as determined by mean decrease in the Gini impurity, a measure of node impurity related to the probability of misclassification.
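By way of non-limiting illustration, the Gini impurity of a set of records and the decrease in impurity produced by a single split may be computed as sketched below. The labels are hypothetical. In a random forest, such decreases would be accumulated for each variable over the splits that use it and averaged over the trees; that aggregation is not shown here.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: one minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted


# Hypothetical split of eight records on one variable.
parent = ["active"] * 4 + ["inactive"] * 4
left = ["active"] * 3 + ["inactive"] * 1
right = ["active"] * 1 + ["inactive"] * 3
decrease = gini_decrease(parent, left, right)
```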
The most important genes and modules identified a wide array of cell types and biological functions. The most important genes encompass such diverse functions as interferon signaling, pattern recognition receptor signaling, and control of survival and proliferation. Notably, the most influential modules may be skewed away from B cell-derived modules and towards T cell- and myeloid cell-derived modules. As some of these modules had overlapping genes, the variable importance experiment may be repeated with modules that may be first scrubbed of any genes that appeared in more than one module before GSVA enrichment scoring. The relative variable importance scores of the de-duplicated modules correlated strongly with those of the original modules (Spearman's rho=0.73, p=5.18E-5), indicating that module behavior may be partly driven by the overlapping genes but strongly driven by unique genes. Variable importance of top 25 individual genes. LDG: low-density granulocyte; PC: plasma cell.
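By way of non-limiting illustration, scrubbing modules of any genes that appear in more than one module may be sketched as follows. The module names and gene identifiers are hypothetical placeholders and do not reflect actual module membership.

```python
from collections import Counter


def deduplicate_modules(modules):
    """Remove every gene that appears in more than one module.

    modules maps a module name to its list of gene identifiers.
    """
    # Count the number of modules that contain each gene.
    membership = Counter(g for genes in modules.values() for g in set(genes))
    return {
        name: [g for g in genes if membership[g] == 1]
        for name, genes in modules.items()
    }


# Hypothetical modules with placeholder gene identifiers.
modules = {
    "module_1": ["GENE_A", "GENE_B", "GENE_C"],
    "module_2": ["GENE_B", "GENE_D"],
    "module_3": ["GENE_C", "GENE_E", "GENE_F"],
}
unique_modules = deduplicate_modules(modules)
```

The de-duplicated modules can then be scored and ranked in the same way as the original modules, so that the two sets of importance scores can be compared.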
CD4_Floralwhite and CD14_Yellow, two interferon-related modules which maintained high importance after deduplication, may be further analyzed to study the effect of unique genes on module importance. Gene lists may be tested for statistical overrepresentation of Gene Ontology biological process terms with FDR correction on pantherdb.org. CD4_Floralwhite did not show any significant enrichment, but CD14_Yellow, which had the highest importance after deduplication, may be highly enriched for genes with the “Immune Effector Process” designation (26/77 genes, FDR=9.38E-11 by Fisher's exact test). This suggests that CD14+ monocytes express unique genes that may play important roles in the initiation of phenotypic activity.
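By way of non-limiting illustration, a one-sided Fisher's exact test for overrepresentation of an annotation term in a gene list reduces to an upper tail of the hypergeometric distribution, as sketched below. The counts of 26 annotated genes among 77 are taken from the text; the background size and the number of annotated background genes are hypothetical assumptions, and the multiple-testing correction is not shown.

```python
from math import comb


def overrepresentation_p(k, n, annotated, background):
    """P(X >= k) for X ~ Hypergeometric(background, annotated, n).

    k: annotated genes observed in the list
    n: size of the gene list
    annotated: genes carrying the term in the background
    background: total number of background genes
    """
    upper = min(n, annotated)
    total = comb(background, n)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, n - i)
        for i in range(k, upper + 1)
    )
    # Exact integer arithmetic is kept until this final division.
    return tail / total


# Small worked case: 4 or more hits among 5 draws from 10 genes, 5 annotated.
p_small = overrepresentation_p(k=4, n=5, annotated=5, background=10)

# 26 of 77 genes from the text; background numbers are hypothetical.
p_module = overrepresentation_p(k=26, n=77, annotated=1000, background=20000)
```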
Several important findings on the topic of gene expression heterogeneity within and across data sets have been elucidated by this study. First, DE analysis of active vs inactive records may be insufficient for proper classification of phenotypic activity, as systematic differences between data sets render conventional bioinformatics techniques largely non-generalizable.
Further, WGCNA modules created from the cellular components of WB and correlated to SLEDAI phenotypic activity may improve classification of phenotypic activity in records. The use of cell-specific gene modules based on a priori knowledge about their relevance to disease fared slightly better than raw gene expression, as it generated informative enrichment patterns, and many of the modules maintained significant correlations with SLEDAI in WB. However, these enrichment scores failed to completely separate active records from inactive records by hierarchical clustering.
Conventional bioinformatics approaches do not satisfactorily identify one or more records having a specific phenotype. DE analysis of a plurality of first records, a plurality of second records, and a plurality of third records having an active disease state and a non-active disease state displayed the major differences and heterogeneity. First, the 100 most significant DE genes by FDR in the plurality of first, second, and third records may be used to carry out hierarchical clustering of active and inactive disease state records. Active disease state records are clearly separated from inactive records, but only partially separated from inactive records.
Out of 6,640 unique DE genes from the three pluralities of records, 5,170 genes are unique to one of the plurality of records, 1,234 are shared by two of the plurality of records, and 36 are shared by all three of the plurality of records. There is minimal overlap of the 100 most significant genes by FDR in each of the pluralities of records. The only overlaps among the top 100 DE genes in each study by FDR are: TYW3 and EHBP1, shared between the plurality of first records and the plurality of third records; and LZIC, shared between the plurality of first records and plurality of second records. Furthermore, the fold change distributions of the 100 most significant DE genes in each of the pluralities of records varied considerably. In the plurality of first records, 94 of the 100 most significant genes are downregulated in active disease state records; in the plurality of second records, all of the top 100 genes are upregulated in active disease state records; and in the plurality of third records, the top 100 genes are more evenly distributed (41 up, 59 down). Orange bars denote active state records, whereas black bars denote inactive state records.
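By way of non-limiting illustration, tallying how many genes are unique to one list, shared by two, or shared by all three may be sketched as follows. The gene identifiers are hypothetical placeholders arranged to mirror the overlap pattern described above: three lists of 100 genes in which two genes are shared by the first and third lists and one gene is shared by the first and second lists.

```python
from collections import Counter


def overlap_summary(gene_lists):
    """Count genes by the number of lists in which they appear."""
    appearances = Counter(g for genes in gene_lists for g in set(genes))
    by_count = Counter(appearances.values())
    return {
        "total_unique": len(appearances),
        "in_one": by_count[1],
        "in_two": by_count[2],
        "in_three": by_count[3],
    }


# Hypothetical placeholder identifiers, 100 per list.
first = ["shared_13_a", "shared_13_b", "shared_12_a"] \
    + [f"first_{i}" for i in range(97)]
second = ["shared_12_a"] + [f"second_{i}" for i in range(99)]
third = ["shared_13_a", "shared_13_b"] + [f"third_{i}" for i in range(98)]

summary = overlap_summary([first, second, third])
```

With three pairwise-shared genes among 300 list entries, the combined list contains 297 unique genes.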
The plurality of first, second, and third records may represent different populations and may be collected on different microarray platforms. The lack of commonality among the genes most descriptive of active state records and inactive state records in each of the pluralities of records casts doubt on whether active and inactive states from the different pluralities of records may be easily determined using conventional techniques.
Records from the pluralities of first, second, and third records may then be joined to evaluate whether unsupervised techniques may separate active state records from inactive state records. Hierarchical clustering on the 297 unique most significant DE genes by FDR showed considerable heterogeneity, and active records and inactive records did not consistently separate, per the heat map of the top 100 DE genes by FDR from each of the pluralities of records (combined total of 297 unique genes from the plurality of first, second, and third records) expressed in all records. As such, conventional techniques failed to identify active records, highlighting the need for more advanced algorithms.
In some embodiments, the platforms, systems, media, and methods described herein include a digital processing device, or use of the same. In further embodiments, the digital processing device includes one or more hardware central processing units (CPUs) or general purpose graphics processing units (GPGPUs) that carry out the device's functions. In still further embodiments, the digital processing device further comprises an operating system configured to perform executable instructions. In some embodiments, the digital processing device is optionally connected to a computer network. In further embodiments, the digital processing device is optionally connected to the Internet such that it accesses the World Wide Web. In still further embodiments, the digital processing device is optionally connected to a cloud computing infrastructure. In other embodiments, the digital processing device is optionally connected to an intranet. In other embodiments, the digital processing device is optionally connected to a data storage device.
In accordance with the description herein, suitable digital processing devices include, by way of non-limiting examples, server computers, desktop computers, laptop computers, notebook computers, sub-notebook computers, netbook computers, netpad computers, set-top computers, media streaming devices, handheld computers, Internet appliances, mobile smartphones, tablet computers, personal digital assistants, video game consoles, and vehicles. Those of skill in the art will recognize that many smartphones are suitable for use in the system described herein. Those of skill in the art will also recognize that select televisions, video players, and digital music players with optional computer network connectivity are suitable for use in the system described herein. Suitable tablet computers include those with booklet, slate, and convertible configurations, known to those of skill in the art.
In some embodiments, the digital processing device includes an operating system configured to perform executable instructions. The operating system is, for example, software, including programs and data, which manages the device's hardware and provides services for execution of applications. Those of skill in the art will recognize that suitable server operating systems include, by way of non-limiting examples, FreeBSD, OpenBSD, NetBSD®, Linux, Apple® Mac OS X Server®, Oracle® Solaris®, Windows Server®, and Novell® NetWare®. Those of skill in the art will recognize that suitable personal computer operating systems include, by way of non-limiting examples, Microsoft® Windows®, Apple® Mac OS X®, UNIX®, and UNIX-like operating systems such as GNU/Linux®. In some embodiments, the operating system is provided by cloud computing. Those of skill in the art will also recognize that suitable mobile smart phone operating systems include, by way of non-limiting examples, Nokia® Symbian® OS, Apple® iOS®, Research In Motion® BlackBerry OS®, Google® Android®, Microsoft® Windows Phone® OS, Microsoft® Windows Mobile® OS, Linux®, and Palm® WebOS®. Those of skill in the art will also recognize that suitable media streaming device operating systems include, by way of non-limiting examples, Apple TV®, Roku®, Boxee®, Google TV®, Google Chromecast®, Amazon Fire®, and Samsung® HomeSync®. Those of skill in the art will also recognize that suitable video game console operating systems include, by way of non-limiting examples, Sony® PS3®, Sony® PS4®, Microsoft® Xbox 360®, Microsoft Xbox One, Nintendo® Wii®, Nintendo® Wii U®, and Ouya®.
In some embodiments, the device includes a storage and/or memory device. The storage and/or memory device is one or more physical apparatuses used to store data or programs on a temporary or permanent basis. In some embodiments, the device is volatile memory and requires power to maintain stored information. In some embodiments, the device is non-volatile memory and retains stored information when the digital processing device is not powered. In further embodiments, the non-volatile memory comprises flash memory. In some embodiments, the volatile memory comprises dynamic random-access memory (DRAM). In some embodiments, the non-volatile memory comprises ferroelectric random access memory (FRAM). In some embodiments, the non-volatile memory comprises phase-change random access memory (PRAM). In other embodiments, the device is a storage device including, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, magnetic disk drives, magnetic tape drives, optical disk drives, and cloud computing-based storage. In further embodiments, the storage and/or memory device is a combination of devices such as those disclosed herein.
In some embodiments, the digital processing device includes a display to send visual information to a user. In some embodiments, the display is a liquid crystal display (LCD). In further embodiments, the display is a thin film transistor liquid crystal display (TFT-LCD). In some embodiments, the display is an organic light emitting diode (OLED) display. In various further embodiments, an OLED display is a passive-matrix OLED (PMOLED) or active-matrix OLED (AMOLED) display. In some embodiments, the display is a plasma display. In other embodiments, the display is a video projector. In yet other embodiments, the display is a head-mounted display in communication with the digital processing device, such as a VR headset. In further embodiments, suitable VR headsets include, by way of non-limiting examples, HTC Vive, Oculus Rift, Samsung Gear VR, Microsoft HoloLens, Razer OSVR, FOVE VR, Zeiss VR One, Avegant Glyph, Freefly VR headset, and the like. In still further embodiments, the display is a combination of devices such as those disclosed herein.
In some embodiments, the digital processing device includes an input device to receive information from a user. In some embodiments, the input device is a keyboard. In some embodiments, the input device is a pointing device including, by way of non-limiting examples, a mouse, trackball, track pad, joystick, game controller, or stylus. In some embodiments, the input device is a touch screen or a multi-touch screen. In other embodiments, the input device is a microphone to capture voice or other sound input. In other embodiments, the input device is a video camera or other sensor to capture motion or visual input. In further embodiments, the input device is a Kinect, Leap Motion, or the like. In still further embodiments, the input device is a combination of devices such as those disclosed herein.
In some embodiments, the platforms, systems, media, and methods disclosed herein include one or more non-transitory computer readable storage media encoded with a program including instructions executable by the operating system of an optionally networked digital processing device. In further embodiments, a computer readable storage medium is a tangible component of a digital processing device. In still further embodiments, a computer readable storage medium is optionally removable from a digital processing device. In some embodiments, a computer readable storage medium includes, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, solid state memory, magnetic disk drives, magnetic tape drives, optical disk drives, cloud computing systems and services, and the like. In some cases, the program and instructions are permanently, substantially permanently, semi-permanently, or non-transitorily encoded on the media.
In some embodiments, the platforms, systems, media, and methods disclosed herein include at least one computer program, or use of the same. A computer program includes a sequence of instructions, executable in the digital processing device's CPU, written to perform a specified task. Computer readable instructions may be implemented as program modules, such as functions, objects, Application Programming Interfaces (APIs), data structures, and the like, that perform particular tasks or implement particular abstract data types. In light of the disclosure provided herein, those of skill in the art will recognize that a computer program may be written in various versions of various languages.
The functionality of the computer readable instructions may be combined or distributed as desired in various environments. In some embodiments, a computer program comprises one sequence of instructions. In some embodiments, a computer program comprises a plurality of sequences of instructions. In some embodiments, a computer program is provided from one location. In other embodiments, a computer program is provided from a plurality of locations. In various embodiments, a computer program includes one or more software modules. In various embodiments, a computer program includes, in part or in whole, one or more web applications, one or more mobile applications, one or more standalone applications, one or more web browser plug-ins, extensions, add-ins, or add-ons, or combinations thereof.
In some embodiments, a computer program includes a web application. In light of the disclosure provided herein, those of skill in the art will recognize that a web application, in various embodiments, utilizes one or more software frameworks and one or more database systems. In some embodiments, a web application is created upon a software framework such as Microsoft® .NET or Ruby on Rails (RoR). In some embodiments, a web application utilizes one or more database systems including, by way of non-limiting examples, relational, non-relational, object oriented, associative, and XML database systems. In further embodiments, suitable relational database systems include, by way of non-limiting examples, Microsoft® SQL Server, MySQL™, and Oracle®. Those of skill in the art will also recognize that a web application, in various embodiments, is written in one or more versions of one or more languages. A web application may be written in one or more markup languages, presentation definition languages, client-side scripting languages, server-side coding languages, database query languages, or combinations thereof. In some embodiments, a web application is written to some extent in a markup language such as Hypertext Markup Language (HTML), Extensible Hypertext Markup Language (XHTML), or eXtensible Markup Language (XML). In some embodiments, a web application is written to some extent in a presentation definition language such as Cascading Style Sheets (CSS). In some embodiments, a web application is written to some extent in a client-side scripting language such as Asynchronous Javascript and XML (AJAX), Flash® Actionscript, Javascript, or Silverlight®. In some embodiments, a web application is written to some extent in a server-side coding language such as Active Server Pages (ASP), ColdFusion®, Perl, Java™, JavaServer Pages (JSP), Hypertext Preprocessor (PHP), Python™, Ruby, Tcl, Smalltalk, WebDNA®, or Groovy.
In some embodiments, a web application is written to some extent in a database query language such as Structured Query Language (SQL). In some embodiments, a web application integrates enterprise server products such as IBM® Lotus Domino®. In some embodiments, a web application includes a media player element. In various further embodiments, a media player element utilizes one or more of many suitable multimedia technologies including, by way of non-limiting examples, Adobe® Flash®, HTML 5, Apple® QuickTime®, Microsoft® Silverlight®, Java™, and Unity®.
In some embodiments, a computer program includes a standalone application, which is a program that is run as an independent computer process, not an add-on to an existing process, e.g., not a plug-in. Those of skill in the art will recognize that standalone applications are often compiled. A compiler is a computer program(s) that transforms source code written in a programming language into binary object code such as assembly language or machine code. Suitable compiled programming languages include, by way of non-limiting examples, C, C++, Objective-C, COBOL, Delphi, Eiffel, Java™, Lisp, Python™, Visual Basic, and VB .NET, or combinations thereof. Compilation is often performed, at least in part, to create an executable program. In some embodiments, a computer program includes one or more executable compiled applications.
In some embodiments, the computer program includes a web browser plug-in (e.g., extension, etc.). In computing, a plug-in is one or more software components that add specific functionality to a larger software application. Makers of software applications support plug-ins to enable third-party developers to create abilities which extend an application, to support easily adding new features, and to reduce the size of an application. When supported, plug-ins enable customizing the functionality of a software application. For example, plug-ins are commonly used in web browsers to play video, generate interactivity, scan for viruses, and display particular file types. Those of skill in the art will be familiar with several web browser plug-ins including Adobe® Flash® Player, Microsoft® Silverlight®, and Apple® QuickTime®.
In view of the disclosure provided herein, those of skill in the art will recognize that several plug-in frameworks are available that enable development of plug-ins in various programming languages, including, by way of non-limiting examples, C++, Delphi, Java™, PHP, Python™, and VB .NET, or combinations thereof.
Web browsers (also called Internet browsers) are software applications, designed for use with network-connected digital processing devices, for retrieving, presenting, and traversing information resources on the World Wide Web. Suitable web browsers include, by way of non-limiting examples, Microsoft® Internet Explorer®, Mozilla® Firefox®, Google® Chrome, Apple® Safari®, Opera Software® Opera®, and KDE Konqueror. In some embodiments, the web browser is a mobile web browser. Mobile web browsers (also called microbrowsers, mini-browsers, and wireless browsers) are designed for use on mobile digital processing devices including, by way of non-limiting examples, handheld computers, tablet computers, netbook computers, subnotebook computers, smartphones, music players, personal digital assistants (PDAs), and handheld video game systems. Suitable mobile web browsers include, by way of non-limiting examples, Google® Android® browser, RIM BlackBerry® Browser, Apple® Safari®, Palm® Blazer, Palm® WebOS® Browser, Mozilla® Firefox® for mobile, Microsoft® Internet Explorer® Mobile, Amazon® Kindle® Basic Web, Nokia® Browser, Opera Software® Opera® Mobile, and Sony® PSP™ browser.
In some embodiments, the platforms, systems, media, and methods disclosed herein include software, server, and/or database modules, or use of the same. In view of the disclosure provided herein, software modules are created by techniques known to those of skill in the art using machines, software, and languages known to the art. The software modules disclosed herein are implemented in a multitude of ways. In various embodiments, a software module comprises a file, a section of code, a programming object, a programming structure, or combinations thereof. In further various embodiments, a software module comprises a plurality of files, a plurality of sections of code, a plurality of programming objects, a plurality of programming structures, or combinations thereof. In various embodiments, the one or more software modules comprise, by way of non-limiting examples, a web application, a mobile application, and a standalone application. In some embodiments, software modules are in one computer program or application. In other embodiments, software modules are in more than one computer program or application. In some embodiments, software modules are hosted on one machine. In other embodiments, software modules are hosted on more than one machine. In further embodiments, software modules are hosted on cloud computing platforms. In some embodiments, software modules are hosted on one or more machines in one location. In other embodiments, software modules are hosted on one or more machines in more than one location.
In some embodiments, the platforms, systems, media, and methods disclosed herein include one or more databases, or use of the same. In view of the disclosure provided herein, those of skill in the art will recognize that many databases are suitable for identifying one or more records having a specific phenotype. In various embodiments, suitable databases include, by way of non-limiting examples, relational databases, non-relational databases, object oriented databases, object databases, entity-relationship model databases, associative databases, and XML databases. Further non-limiting examples include SQL, PostgreSQL, MySQL, Oracle, DB2, and Sybase. In some embodiments, a database is internet-based. In further embodiments, a database is web-based. In still further embodiments, a database is cloud computing-based. In other embodiments, a database is based on one or more local computer storage devices.
The present disclosure provides systems and methods to perform data analysis using drug or target scoring algorithms and/or big data analysis tools. In various aspects, such drug or target scoring algorithms and/or big data analysis tools may be used to perform analysis of data sets including, for example, mRNA gene expression or transcriptome data, DNA genomic data, proteomic data, metabolomic data, other types of “-omic” data, or a combination thereof.
In an aspect, the present disclosure provides a computer-implemented method for assessing a condition of a subject, comprising: (a) receiving a dataset of a biological sample of the subject; (b) selecting one or more data analysis tools, wherein the one or more data analysis tools comprise an analysis tool selected from the group consisting of: a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, a Gene Set Variation Analysis (GSVA) tool (e.g., P-Scope), a CoLTs® (Combined Lupus Treatment Scoring) analysis tool, and a Target Scoring analysis tool; (c) processing the dataset using the one or more data analysis tools to generate a data signature of the biological sample of the subject; and (d) based at least in part on the data signature generated in (c), assessing the condition of the subject.
In some embodiments, the dataset comprises mRNA gene expression or transcriptome data, DNA genomic data, proteomic data, metabolomic data, or a combination thereof. In some embodiments, the biological sample comprises a whole blood (WB) sample, a PBMC sample, a tissue sample, a cell sample, or any derivative thereof. In some embodiments, assessing the condition of the subject comprises identifying a disease or disorder of the subject.
In some embodiments, the method further comprises identifying a disease or disorder of the subject at a sensitivity or specificity of at least about 70%. In some embodiments, the method further comprises determining a likelihood of the identification of the disease or disorder of the subject. In some embodiments, the method further comprises providing a therapeutic intervention for the disease or disorder of the subject. In some embodiments, the method further comprises monitoring the disease or disorder of the subject, wherein the monitoring comprises assessing the disease or disorder of the subject at a plurality of time points, wherein the assessing is based at least on the disease or disorder identified at each of the plurality of time points.
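By way of non-limiting illustration, the sensitivity and specificity criterion described above could be checked as in the following Python sketch. The function name, the boolean label encoding, and the example data are hypothetical and are provided only to make the calculation concrete; they are not part of the disclosed method.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary labels (True = disease)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth and pred:
            tp += 1          # diseased subject correctly identified
        elif truth and not pred:
            fn += 1          # diseased subject missed
        elif not truth and not pred:
            tn += 1          # healthy subject correctly identified
        else:
            fp += 1          # healthy subject incorrectly flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy subject")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 5 diseased subjects followed by 5 healthy subjects.
truth = [True] * 5 + [False] * 5
calls = [True, True, True, True, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(truth, calls)

# The 0.70 cut-off mirrors the "at least about 70%" language above.
meets_threshold = sens >= 0.70 and spec >= 0.70
```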
In some embodiments, selecting the one or more data analysis tools comprises receiving a user selection of the one or more data analysis tools. In some embodiments, selecting the one or more data analysis tools is automatically performed by the computer without receiving a user selection of the one or more data analysis tools.
In another aspect, the present disclosure provides a computer system for assessing a condition of a subject, comprising: a database that is configured to store a dataset of a biological sample of the subject; and one or more computer processors operatively coupled to the database, wherein the one or more computer processors are individually or collectively programmed to: (i) select one or more data analysis tools comprising: a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, a Gene Set Variation Analysis (GSVA) tool (e.g., P-Scope), a CoLTs® (Combined Lupus Treatment Scoring) analysis tool, a Target Scoring analysis tool, or a combination thereof; (ii) process the dataset using the one or more data analysis tools to generate a data signature of the biological sample of the subject; and (iii) based at least in part on the data signature generated in (ii), assess the condition of the subject.
In another aspect, the present disclosure provides a non-transitory computer readable medium comprising machine-executable code that, upon execution by one or more computer processors, implements a method for assessing a condition of a subject, the method comprising: (a) receiving a dataset of a biological sample of the subject; (b) selecting one or more data analysis tools, wherein the one or more data analysis tools comprise an analysis tool selected from the group consisting of: a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, a Gene Set Variation Analysis (GSVA) tool (e.g., P-Scope), a CoLTs® (Combined Lupus Treatment Scoring) analysis tool, and a Target Scoring analysis tool; (c) processing the dataset using the one or more data analysis tools to generate a data signature of the biological sample of the subject; and (d) based at least in part on the data signature generated in (c), assessing the condition of the subject. In any embodiment described herein, the one or more data analysis tools may be a plurality of data analysis tools each independently selected from a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, a Gene Set Variation Analysis (GSVA) tool (e.g., P-Scope), a CoLTs® (Combined Lupus Treatment Scoring) analysis tool, and a Target Scoring analysis tool.
To obtain a blood sample, various techniques may be used, e.g., a syringe or other vacuum suction device. A blood sample may be optionally pre-treated or processed prior to use. A sample, such as a blood sample, may be analyzed under any of the methods and systems herein within 4 weeks, 2 weeks, 1 week, 6 days, 5 days, 4 days, 3 days, 2 days, 1 day, 12 hr, 6 hr, 3 hr, 2 hr, or 1 hr from the time the sample is obtained, or longer if frozen. When obtaining a sample from a subject (e.g., blood sample), the amount may vary depending upon subject size and the condition being screened. In some embodiments, at least 10 mL, 5 mL, 1 mL, 0.5 mL, 250, 200, 150, 100, 50, 40, 30, 20, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 μL of a sample is obtained. In some embodiments, 1-50, 2-40, 3-30, or 4-20 μL of sample is obtained. In some embodiments, more than 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100 μL of a sample is obtained.
The sample may be taken before and/or after treatment of a subject with a disease or disorder. Samples may be obtained from a subject during a treatment or a treatment regime. Multiple samples may be obtained from a subject to monitor the effects of the treatment over time. The sample may be taken from a subject known or suspected of having a disease or disorder for which a definitive positive or negative diagnosis is not available via clinical tests. The sample may be taken from a subject suspected of having a disease or disorder. The sample may be taken from a subject experiencing unexplained symptoms, such as fatigue, nausea, weight loss, aches and pains, weakness, or bleeding. The sample may be taken from a subject having explained symptoms. The sample may be taken from a subject at risk of developing a disease or disorder due to factors such as familial history, age, hypertension or pre-hypertension, diabetes or pre-diabetes, overweight or obesity, environmental exposure, lifestyle risk factors (e.g., smoking, alcohol consumption, or drug use), or presence of other risk factors.
In some embodiments, a sample may be taken at a first time point and assayed, and then another sample may be taken at a subsequent time point and assayed. Such methods may be used, for example, for longitudinal monitoring purposes to track the development or progression of a disease. In some embodiments, the progression of a disease may be tracked before treatment, after treatment, or during the course of treatment, to determine the treatment's effectiveness. For example, a method as described herein may be performed on a subject prior to, and after, treatment with a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition therapy to measure the disease's progression or regression in response to the lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition therapy.
After obtaining a sample from the subject, the sample may be processed to generate datasets indicative of a disease or disorder of the subject. For example, a presence, absence, or quantitative assessment of nucleic acid molecules of the sample from a panel of condition-associated genomic loci may be indicative of a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition of the subject. Processing the sample obtained from the subject may comprise (i) subjecting the sample to conditions that are sufficient to isolate, enrich, or extract a plurality of nucleic acid molecules, and (ii) assaying the plurality of nucleic acid molecules to generate the dataset (e.g., microarray data, nucleic acid sequences, or quantitative polymerase chain reaction (qPCR) data). Methods of assaying may include any assay known in the art or described in the literature, for example, a microarray assay, a sequencing assay (e.g., DNA sequencing, RNA sequencing, or RNA-Seq), or a quantitative polymerase chain reaction (qPCR) assay.
In some embodiments, a plurality of nucleic acid molecules is extracted from the sample and subjected to sequencing to generate a plurality of sequencing reads. The nucleic acid molecules may comprise ribonucleic acid (RNA) or deoxyribonucleic acid (DNA). The extraction method may extract all RNA or DNA molecules from a sample. Alternatively, the extraction method may selectively extract a portion of RNA or DNA molecules from a sample. Extracted RNA molecules from a sample may be converted to cDNA molecules by reverse transcription (RT).
The sample may be processed without any nucleic acid extraction. For example, the disease or disorder may be identified or monitored in the subject by using probes configured to selectively enrich nucleic acid (e.g., RNA or DNA) molecules corresponding to a panel of condition-associated genomic loci. The probes may be nucleic acid primers. The probes may have sequence complementarity with nucleic acid sequences from one or more of the panel of condition-associated genomic loci. The panel of condition-associated genomic loci may comprise at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least about 25, at least about 30, at least about 35, at least about 40, at least about 45, at least about 50, at least about 55, at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, at least about 90, at least about 95, at least about 100, or more condition-associated genomic loci.
The probes may be nucleic acid molecules (e.g., RNA or DNA) having sequence complementarity with nucleic acid sequences (e.g., RNA or DNA) of one or more genomic loci (e.g., condition-associated genomic loci). These nucleic acid molecules may be primers or enrichment sequences. The assaying of the sample using probes that are selective for the one or more genomic loci (e.g., condition-associated genomic loci) may comprise use of array hybridization, polymerase chain reaction (PCR), or nucleic acid sequencing (e.g., RNA sequencing or DNA sequencing, such as RNA-Seq).
The assay readouts may be quantified at one or more genomic loci (e.g., condition-associated genomic loci) to generate the data indicative of the disease or disorder. For example, quantification of array hybridization or polymerase chain reaction (PCR) corresponding to a plurality of genomic loci (e.g., condition-associated genomic loci) may generate data indicative of the disease or disorder. Assay readouts may comprise quantitative PCR (qPCR) values, digital PCR (dPCR) values, digital droplet PCR (ddPCR) values, fluorescence values, etc., or normalized values thereof.
The present disclosure provides systems and methods to perform data analysis using drug or target scoring algorithms and/or big data analysis tools. In various aspects, such drug or target scoring algorithms and/or big data analysis tools may be used to perform analysis of data sets including, for example, mRNA gene expression or transcriptome data, DNA genomic data, proteomic data, metabolomic data, other types of “-omic” data, or a combination thereof. Systems and methods of the present disclosure may use one or more of the following: a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, a Gene Set Variation Analysis (GSVA) tool (e.g., P-Scope), a CoLTs® (Combined Lupus Treatment Scoring) analysis tool, and a Target Scoring analysis tool.
A non-limiting example of a workflow of a method to assess a condition of a subject using one or more data analysis tools and/or algorithms may comprise receiving a dataset of a biological sample of a subject. Next, the method may comprise selecting one or more data analysis tools and/or algorithms. For example, the data analysis tools and/or algorithms may comprise a BIG-C™ big data analysis tool, an I-Scope™ big data analysis tool, a T-Scope™ big data analysis tool, a CellScan big data analysis tool, an MS (Molecular Signature) Scoring™ analysis tool, a Gene Set Variation Analysis (GSVA) tool (e.g., P-Scope), a CoLTs® (Combined Lupus Treatment Scoring) analysis tool, a Target Scoring analysis tool, or a combination thereof. Next, the method may comprise processing the dataset using selected data analysis tools and/or algorithms to generate a data signature of the biological sample of the subject. Next, the method may comprise assessing the condition of the subject based on the data signature.
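By way of non-limiting illustration, the workflow above (receiving a dataset, selecting tools, generating a data signature, and assessing the condition) could be organized as in the following Python sketch. The two toy tools, the registry, and the threshold-based classifier are hypothetical placeholders; they do not represent the BIG-C™, I-Scope™, T-Scope™, or other named tools.

```python
from typing import Callable, Dict, List, Tuple

Dataset = Dict[str, float]                     # gene symbol -> expression value
Tool = Callable[[Dataset], Dict[str, float]]   # a tool returns part of a signature


def mean_expression_tool(dataset: Dataset) -> Dict[str, float]:
    """Toy tool: mean expression across all genes in the dataset."""
    return {"mean_expression": sum(dataset.values()) / len(dataset)}


def gene_count_tool(dataset: Dataset) -> Dict[str, float]:
    """Toy tool: number of genes measured."""
    return {"n_genes": float(len(dataset))}


# Hypothetical registry from which tools are selected by name.
TOOL_REGISTRY: Dict[str, Tool] = {
    "mean": mean_expression_tool,
    "count": gene_count_tool,
}


def assess_condition(
    dataset: Dataset,
    tool_names: List[str],
    classify: Callable[[Dict[str, float]], str],
) -> Tuple[Dict[str, float], str]:
    """Select tools, build a data signature, then assess the condition."""
    tools = [TOOL_REGISTRY[name] for name in tool_names]   # selecting tools
    signature: Dict[str, float] = {}
    for tool in tools:                                     # processing the dataset
        signature.update(tool(dataset))
    return signature, classify(signature)                  # assessing the condition


# Hypothetical usage with an arbitrary placeholder threshold.
example_dataset = {"GENE_A": 2.0, "GENE_B": 4.0}
signature, assessment = assess_condition(
    example_dataset,
    ["mean", "count"],
    lambda sig: "flagged" if sig["mean_expression"] > 2.5 else "not flagged",
)
```

The tool selection is passed in as a list of names so that the same function can serve either a user selection or an automatic selection made by the computer.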
The BIG-C (Biologically Informed Gene Clustering) tool may be configured to sort large groups of genes into a set of functional groups (e.g., 53 functional groups). The functional groups are created utilizing publicly available information from online tools and databases including UniProtKB/Swiss-Prot, GO Terms, KEGG pathways, NCBI PubMed, and the Interactome. The functional groups may include one or more of: Active RNA, Anti-apoptosis, anti-proliferation, autophagy, chromatin remodeling, cytoplasm and biochemistry, cytoskeleton, DNA repair, endocytosis, endoplasmic reticulum, endosome and vesicles, fatty acid biosynthesis, cell surface, transcription, glycolysis and gluconeogenesis, Golgi, immune cell surface, immune secreted, immune signaling, integrin pathway, interferon stimulated genes, intracellular signaling, lysosome, melanosome, MHC class I, MHC class II, microRNA processing, microRNA, mitochondrial transcription, mitochondria, mitochondria oxidative phosphorylation, mitochondrial TCA cycle, mRNA processing, mRNA splicing, non-coding RNA, nuclear receptor, nucleus and nucleolus, palmitoylation, pattern recognition receptors, peroxisomes, pro-apoptosis, pro-cell cycle, proteasome, pseudogenes, RAS superfamily, reactive oxygen species protection, secreted and extracellular matrix, transcription factors, transporters, transposon control, ubiquitylation and sumoylation, unfolded protein and stress, and unknown. Enrichment scores for each group are calculated based on an overlap p value to determine the functional groups over- or under-expressed in the gene expression dataset. The BIG-C may be configured such that each gene is sorted into only one of the 53 functional groups, allowing for a quick and relatively simple understanding of types of genes enriched and co-expressed in a big dataset.
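By way of non-limiting illustration, an overlap p value of the kind referred to above could be computed as in the following Python sketch. The sketch assumes a one-sided hypergeometric test, which is one common way to score overlap between a gene list and a functional group; the disclosure does not specify the exact statistic, and the function name and example counts are hypothetical.

```python
from math import comb


def overlap_p_value(n_universe, n_category, n_selected, n_overlap):
    """
    Probability of observing at least `n_overlap` category genes when
    `n_selected` genes are drawn without replacement from a universe of
    `n_universe` genes, `n_category` of which belong to the category.
    """
    max_overlap = min(n_category, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_category, k) * comb(n_universe - n_category, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes in total, 5 in the functional group,
# 5 differentially expressed, all 5 of which fall in the group.
p_enriched = overlap_p_value(20, 5, 5, 5)

# With no required overlap, the tail covers every outcome.
p_trivial = overlap_p_value(20, 5, 5, 0)
```

Because each gene is assigned to only one functional group, the per-group counts passed to such a function would not double-count any gene.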
The I-Scope™ tool may be configured to identify immune infiltrates. Hematopoietic cells are unique in that they move throughout the body patrolling for threats to the host, and may infiltrate tissue sites not normally home to immune cells. I-Scope™ may be configured to identify hematopoietic cells through an iterative search of more than 17,000 genes identified in more than 50 microarray datasets. From this search, 1226 candidate genes are identified and researched for restriction in hematopoietic cells as determined by the HPA, GTEx and FANTOM5 datasets (e.g., available at proteinatlas.org). 926 genes meet the criteria for being mainly restricted to hematopoietic lineages (brain, reproductive organ exclusions were permitted). These genes are researched for immune cell specific expression in 27 hematopoietic sub-categories: alpha beta T cell, T cell, regulatory T cell, activated T cell, anergic T cell, gamma delta T cells, CD8 T, NK/NKT cell, NK cell, T & B cells, B cells, germinal center B cells, B cell and plasmacytoid dendritic cell, T & B & myeloid, B & myeloid, T & myeloid, MHC Class II expressing cell, monocyte, dendritic cell, plasmacytoid dendritic cells, myeloid cell, plasma cell, erythrocyte, neutrophil, low density granulocyte, granulocyte, and platelet. Transcripts are entered into I-Scope™ and the number of transcripts in each category determined. Odds ratios are calculated with confidence intervals using the Fisher's exact test in R.
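By way of non-limiting illustration, the odds ratio and Fisher's exact test calculation described above could be approximated as in the following Python sketch. The disclosure refers to the Fisher's exact test in R; this sketch is a simplified stand-in that reports the sample odds ratio with a Wald-type (log-scale) interval and a two-sided exact p-value, which may differ from the conditional estimate and exact interval reported by R. The function names and the example 2x2 table are hypothetical.

```python
from math import comb, exp, log, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Sample odds ratio with a Wald interval; every cell must be nonzero."""
    if min(a, b, c, d) == 0:
        raise ValueError("Wald interval is undefined when a cell is zero")
    ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return ratio, exp(log(ratio) - z * se), exp(log(ratio) + z * se)


# Hypothetical table for one cell-type category:
# rows = in category / not in category,
# columns = in the transcript list / not in the transcript list.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
ratio, ci_low, ci_high = odds_ratio_with_ci(8, 2, 1, 5)
```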
The T-Scope™ tool may be configured to help identify types of non-hematopoietic cells in gene expression datasets. T-Scope™ may be configured by downloading approximately 10,000 tissue enriched and 8,000 cell line enriched genes from the Human Protein Atlas along with their tissue or cell line designation (e.g., available at proteinatlas.org). Genes found in more than four tissues are eliminated. Housekeeping genes described in the gene expression study by She et al. are also removed (e.g., as described by She et al., “Definition, conservation and epigenetics of housekeeping and tissue-enriched genes,” BMC Genomics 2009, 10:269, which is incorporated herein by reference in its entirety). This list is further curated by removing genes differentially expressed in 34 hematopoietic cell gene expression datasets and adding kidney specific genes from datasets downloaded from the GEO repository and processed by Ampel BioSolutions. The resulting categories of genes represent genes enriched in the following 42 tissue/cell specific categories: adrenal gland, breast, cartilage, cerebral cortex, uterine cervix, chondrocyte, colon, duodenum, endometrium, epididymis, esophagus fallopian tube, esophagus, fibroblast, heart muscle, keratinocyte, kidney, liver, lung, melanocyte, ovary, pancreas, parathyroid gland, placenta, podocyte, prostate, rectum, salivary gland, seminal vesicle, skeletal muscle, skin, small intestine, smooth muscle, stomach, synoviocyte, testis, kidney loop of Henle, kidney proximal tubule, kidney distal tubule, and kidney collecting duct.
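By way of non-limiting illustration, the curation steps described above (eliminating genes found in more than four tissues and removing housekeeping genes) could be expressed as in the following Python sketch. The function name, data layout, and gene identifiers are hypothetical and are not taken from the Human Protein Atlas or from She et al.

```python
def curate_tissue_genes(gene_to_tissues, housekeeping_genes, max_tissues=4):
    """
    Keep genes that are annotated to at most `max_tissues` distinct tissues
    and that are not in the housekeeping set.
    """
    return {
        gene: sorted(set(tissues))
        for gene, tissues in gene_to_tissues.items()
        if len(set(tissues)) <= max_tissues and gene not in housekeeping_genes
    }


# Hypothetical annotations.
annotations = {
    "GENE_SKIN": ["skin"],
    "GENE_BROAD": ["skin", "liver", "lung", "colon", "kidney"],  # five tissues
    "GENE_HK": ["liver"],                                        # housekeeping
}
curated = curate_tissue_genes(annotations, housekeeping_genes={"GENE_HK"})
```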
The CellScan tool may be a combination of I-Scope™ and T-Scope™, and may be configured to analyze tissues with suspected immune infiltrations that may also have tissue-specific genes. CellScan may potentially be more stringent than either I-Scope™ or T-Scope™ because it may be used to distinguish resident tissue cells from non-resident hematopoietic cells.
The MS (Molecular Signature) Scoring tool may be configured to assess specific pathways in a disease state. Information on genes that encode proteins that participate in a specific signaling pathway, and whether the gene product promotes or inhibits the pathway, is compiled and curated through literature mining. Curated pathways presented by the company include CD40-CD40 ligand, IL-6, IL-12/23, TNF, IL-17, IL-21, S1P1, IL-13 and PDE4, but this method may be used for any known signaling pathway with available data. To determine if a signaling pathway is over- or under-expressed in a microarray dataset, the gene list for each signaling pathway may be queried against the limma differentially expressed genes from a disease state compared to healthy controls, and the differentially expressed genes in the signaling pathway may be identified for each set. The fold changes for genes that promote the pathway may be added together and the fold changes for genes that inhibit the pathway may be subtracted from the score. This total score may be normalized based on the number of genes that may be detected on the specific microarray platform used for the experiment. Activation scores of −100 to +100 may be determined using this method with negative scores indicating an inhibition of the specific pathway in the disease state and positive scores indicating an up-regulation of a specific pathway in the disease state. Fisher's exact test may be performed to determine whether there is sufficient overlap of genes between the experimental differentially expressed genes and the genes in the signaling pathway.
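By way of non-limiting illustration, the signed summation described above could be sketched in Python as follows. The exact normalization and the scaling to the −100 to +100 range are not specified here, so the sketch simply divides the signed sum by the number of pathway genes detectable on the platform; that choice, the function name, and the gene identifiers are assumptions made for illustration only.

```python
def ms_score(pathway_roles, fold_changes, detectable_genes):
    """
    pathway_roles:    gene -> +1 if the gene product promotes the pathway,
                      -1 if it inhibits the pathway.
    fold_changes:     gene -> fold change for differentially expressed genes
                      (limma typically reports these on a log2 scale).
    detectable_genes: genes that can be measured on the platform used.
    """
    detectable_pathway_genes = [g for g in pathway_roles if g in detectable_genes]
    if not detectable_pathway_genes:
        raise ValueError("no pathway genes are detectable on this platform")
    # Add fold changes of promoting genes, subtract those of inhibiting genes.
    raw = sum(
        pathway_roles[g] * fold_changes[g]
        for g in detectable_pathway_genes
        if g in fold_changes
    )
    # Assumed normalization: divide by the number of detectable pathway genes.
    return raw / len(detectable_pathway_genes)


# Hypothetical pathway and differential expression results.
roles = {"PROMOTER_1": 1, "PROMOTER_2": 1, "INHIBITOR_1": -1, "PROMOTER_3": 1}
changes = {"PROMOTER_1": 2.0, "INHIBITOR_1": -1.0, "UNRELATED": 3.0}
platform = {"PROMOTER_1", "PROMOTER_2", "INHIBITOR_1", "UNRELATED"}
score = ms_score(roles, changes, platform)
```

In this example a down-regulated inhibitor contributes positively to the score, consistent with a positive score indicating up-regulation of the pathway.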
Gene Set Variation Analysis (GSVA) may be performed (for example, as described in Catalina et al., 2019, Communications Biology, “Gene expression analysis delineates the potential roles of multiple interferons in systemic lupus erythematosus,” which is incorporated herein by reference in its entirety) to determine enrichment of signaling pathways in individual patient samples. Gene set variation analysis may be performed using an open source software package for the programming language R available at the R Bioconductor (bioconductor.org), e.g., as described by Hänzelmann et al. (“GSVA: gene set variation analysis for microarray and RNA-Seq data,” BMC Bioinformatics, 2013, which is incorporated herein by reference in its entirety). The modules of genes to interrogate the datasets may be developed. Modules of genes determined to represent a specific signaling pathway or process may be identified (e.g., using publicly available datasets). For example, the IFNB1 signaling pathway is taken from a publicly available gene expression dataset of peripheral blood cells treated with IFNB1 in vitro. Genes co-expressed in this dataset (genes either all increased or decreased compared to control treated peripheral blood) are used to create modules of genes representing the IFNB1 signaling pathway, and GSVA is used to determine the enrichment of this set of genes and hence the IFNB1 signaling pathway in individual patient and control samples.
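By way of non-limiting illustration, the idea of scoring a gene module in each individual sample could be sketched in Python as follows. This is not the GSVA algorithm, which is implemented in the R Bioconductor package and uses a different rank-based statistic; the sketch substitutes a much simpler mean z-score per sample, and the function name, data layout, and gene identifiers are hypothetical.

```python
from statistics import mean, pstdev


def module_scores(expression, module_genes):
    """
    expression:   gene -> list of expression values, one per sample, with the
                  same sample order for every gene.
    module_genes: genes that make up the module of interest.
    Returns one score per sample: the mean z-score of the module genes.
    """
    z_scores = {}
    for gene in module_genes:
        values = expression.get(gene)
        if values is None:
            continue                      # gene not measured in this dataset
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue                      # constant gene carries no information
        z_scores[gene] = [(v - mu) / sd for v in values]
    if not z_scores:
        raise ValueError("no usable module genes")
    n_samples = len(next(iter(z_scores.values())))
    return [mean([zs[i] for zs in z_scores.values()]) for i in range(n_samples)]


# Hypothetical two-gene module measured in two samples (control, patient).
example_expression = {"G1": [1.0, 3.0], "G2": [2.0, 6.0], "G3": [5.0, 5.0]}
scores = module_scores(example_expression, ["G1", "G2", "G3", "G_MISSING"])
```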
The CoLTs®, or Combined Lupus Treatment Scoring, may be configured to rank identified drugs or therapies by a number of essential characteristics, including scientific rationale, experience in lupus mice/human cells (preclinical), previous clinical experience in autoimmunity, drug properties, and safety profile, including adverse events. Face and test validities may be established by scoring SOC medications and confirming the scores with a panel of lupus clinicians. The final result may be the CoLTs® score. A CoLTs® algorithm may also be configured for drugs in development (DID), which typically do not have drug metabolism and adverse event information available.
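By way of non-limiting illustration, ranking candidate therapies by a composite of the characteristics listed above could be sketched in Python as follows. The category names, the per-category point values, the simple summation, and the choice to omit the drug properties and safety categories for drugs in development are all assumptions made for illustration; they do not reproduce the actual CoLTs® scoring scheme.

```python
# Hypothetical category names mirroring the characteristics listed above.
CATEGORIES = (
    "scientific_rationale",
    "preclinical_experience",
    "clinical_experience",
    "drug_properties",
    "safety_profile",
)
# Assumed subset for drugs in development, which typically lack drug
# metabolism and adverse event information.
DID_CATEGORIES = CATEGORIES[:3]


def composite_score(category_scores, in_development=False):
    """Sum the per-category scores that apply to this kind of drug."""
    categories = DID_CATEGORIES if in_development else CATEGORIES
    missing = [c for c in categories if c not in category_scores]
    if missing:
        raise ValueError(f"missing category scores: {missing}")
    return sum(category_scores[c] for c in categories)


def rank_drugs(drugs):
    """Return drug names ordered from highest to lowest composite score."""
    return sorted(drugs, key=lambda name: composite_score(drugs[name]), reverse=True)


# Hypothetical drugs with arbitrary point values.
candidates = {
    "drug_b": dict(zip(CATEGORIES, (1, 1, 0, 0, -2))),
    "drug_a": dict(zip(CATEGORIES, (2, 1, 1, 1, -1))),
}
ranking = rank_drugs(candidates)
did_score = composite_score({c: 1 for c in DID_CATEGORIES}, in_development=True)
```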
The target scoring algorithm may be configured to prioritize a specific gene or protein that is potentially a good choice to target with a drug in lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) patients. It may be utilized even if there is currently no drug available to the target gene or protein. The algorithm may be based on the addition of 18 data-based determinations plus the overall scientific rationale, and may generate scores from −13 (not a good target in SLE) to 27 (very promising target in SLE).
BIG-C® is a fast and efficient cloud-based tool to functionally categorize gene products. With coverage of over 80% of the genome, BIG-C® leverages publicly available databases such as UniProtKB/Swiss-Prot, GO terms, KEGG pathways, NCBI PubMed and Interactome to place genes into 53 functional categories. The sorting into only one of 53 functional groups allows for a quick and relatively simple understanding of types of genes enriched and co-expressed in a big dataset. This assists in deriving further insights from genes expressed for a given disease state in human or pre-clinical mouse models.
BIG-C® may be used to functionally categorize immunological genes that are not covered in cancer databases such as GO and KEGG (e.g., as described by Grammer et al. 2016, “Drug repositioning in SLE: crowd-sourcing, literature-mining and Big Data analysis,” Lupus, 25(10), 1150-1170, which is incorporated herein by reference in its entirety). Using a knowledge base of over 5000 patients with systemic lupus erythematosus (SLE), over 16432 genes are each placed into one of 53 BIG-C® functional categories, and statistical analysis is performed to identify enriched categories. BIG-C® categories are cross-examined with the GO and KEGG terms to obtain additional information and insights.
A sample BIG-C® workflow may comprise the following steps. First, SLE genomic datasets are derived from whole blood, peripheral blood mononuclear cells, affected tissues, and purified immune cells. Second, datasets are analyzed using DE analysis (as shown by a differential expression heatmap) or Weighted Gene Coexpression Network Analysis (WGCNA) (as shown by a gene coexpression plot). Third, expressed genes are annotated using publicly available databases (e.g., UniProtKB/Swiss-Prot database, Human Immunodeficiencies database, Mouse MGI database, Entrez Molecular Sequence database, PubMed, and the Human Tissue Atlas). Fourth, signatures are cross-referenced with purified single-cell microarray datasets and RNAseq experiments. Fifth, BIG-C® is leveraged to separate the individual annotated genes into one of 53 functional categories (e.g., as described by Labonte et al. 2018, “Identification of alterations in macrophage activation associated with disease activity in systemic lupus erythematosus,” PloS one, 13(12), e0208132, which is incorporated herein by reference in its entirety). Sixth, chi-squared analysis is used to determine enriched categories of interest from overlap p-values. Seventh, enriched categories are cross-examined with GO and KEGG terms to derive key insights for further analysis.
I-Scope™ may be a tool configured for cross-examining the presence and activity of varying types of immune cell infiltrates with observed gene expression patterns. It may take annotated gene expression data and analyze it for hematopoietic cell lineage. I-Scope™ may be used downstream of the BIG-C® (Biologically Informed Gene-Clustering) tool in that it helps to provide even more insight into the nature of the genes being expressed after categorization.
I-Scope™ addresses the need to understand the involvement of specific cells for a given disease state. While it is helpful to understand the relative up-regulation and down-regulation at the gene expression level, it is even more informative to understand specifically in which cells this is occurring. I-Scope™ may be configured to identify hematopoietic cells through an iterative search of more than 17,000 genes identified in more than 50 microarray datasets (e.g., as described by Hubbard et al., “Analysis of Lupus Synovitis Gene Expression Reveals Dysregulation of Pathogenic Pathways Activated within Infiltrating Immune Cells,” Arthritis Rheumatol, 2018; 70 (suppl 10), which is incorporated herein by reference in its entirety). I-Scope™ may function by restricting the analysis to genes of hematopoietic cell heritage and allow for cross-checking against purified single-cell experiments or datasets. The cross-check confirms and categorizes specific transcript signatures to the 28 hematopoietic cell sub-categories, ultimately allowing for cellular activity analysis across multiple samples and disease states. When combined with BIG-C® categories, the cellular activity may be correlated to specific functions within a given cell type.
A sample I-Scope™ workflow may comprise the following steps. First, candidate genes are identified from SLE (systemic lupus erythematosus) datasets potentially associated with immune cell expression. Second, using HPA, GTEx, and FANTOM5 datasets, expression signatures associated with hematopoietic cell lineage are identified. Third, signatures are cross-referenced with purified single-cell microarray datasets and RNAseq experiments. Fourth, transcripts are categorized into 28 hematopoietic cell sub-categories and cellular expression is assessed across different samples and disease states. Odds ratios are calculated with confidence intervals using the Fisher's exact test in R. An I-Scope™ signature analysis for a given sample may lead to the I-Scope™ signature analysis across multiple samples and disease states.
The T-Scope™ tool may be configured for cross-examining gene expression signatures of a given sample with a database of non-hematopoietic cell types (e.g., as described by Hubbard et al., “Analysis of Gene Expression from Systemic Lupus Erythematosus Synovium Reveals Unique Pathogenic Mechanisms” [Abstract], Annual Meeting of the American College of Rheumatology; June 2019; Chicago, IL, which is incorporated herein by reference in its entirety). T-Scope™ may comprise a database of 704 transcripts allocated to 45 independent categories. Transcripts detected in the sample are matched to one of the cellular categories within the T-Scope™ tool to derive further insights on tissue cell activity. T-Scope™ may be used downstream of the BIG-C® (Biologically Informed Gene-Clustering) tool to understand which tissue cell types are present. In conjunction with I-Scope™ (which provides information related to immune cells), T-Scope™ may be performed to provide a complete view of all possible cell activity in a given sample.
T-Scope™ addresses the need to understand the involvement of specific tissue cells for a given disease state. While it is helpful to understand the relative up-regulation and down-regulation at the gene expression level, it is even more informative to understand specifically in which cells this is occurring. T-Scope™ may be configured by downloading a set of approximately 10,000 tissue enriched and 8,000 cell line enriched genes from the Human Protein Atlas along with their tissue or cell line designation. Genes differentially expressed in hematopoietic cell datasets are removed and kidney specific genes are added from the GEO repository. T-Scope™ may function by restricting the analysis to genes of known tissue cell heritage and allow for cross-checking against purified single-cell experiments or datasets. The cross-check confirms and categorizes specific transcript signatures to the 45 tissue cell sub-categories, ultimately allowing for cellular activity analysis across multiple samples and disease states. When combined with BIG-C® categories, the cellular activity may be correlated to specific functions within a given tissue cell type.
A sample T-Scope™ workflow may comprise the following steps. First, candidate genes are identified from SLE (systemic lupus erythematosus) differential expression datasets potentially associated with tissue cell expression. Second, using publicly available databases, expression signatures associated with potential tissue cell activity are identified. Third, signatures are cross-referenced with microarray, scRNAseq or RNAseq experiments. Fourth, transcripts are categorized into 45 tissue cell sub-categories and cellular expression is assessed across different samples and disease states. Results may be obtained using T-Scope™ in combination with I-Scope™ for identification of cells post-DE-analysis.
A cloud-based genomic platform may be configured to provide users with access to CellScan™, which comprises a suite of tools for the identification, analysis, and prioritization of targets for drug development and/or repositioning. This platform is powered by a database containing the genomic information gathered from 5000+ autoimmune patients. The cloud-based genomic platform may leverage results from RNAseq and microarray experiments in conjunction with clinical information, such as medication and lab tests, to provide undiscovered insights.
CellScan™ may go beyond typical ‘omics analysis by performing one or more of the following: functionally categorizing genes and their products (e.g., using BIG-C®); deconvolving gene expression data to identify unique immunological cell types from blood or biopsy samples (e.g., using I-Scope™); identifying tissue-specific cells from biopsy samples (e.g., using T-Scope™); identifying receptor-ligand interactions and subsequent signaling pathways (e.g., using MS-Scoring™); ranking genes and their products for targeting by drugs and miRNA mimetics (e.g., using Target-Scoring™); and prioritizing FDA-approved drugs and drugs-in-development for treatment in patients or pre-clinical models (e.g., using CoLTs®).
CellScan™ applications may include one or more of: Biomarker Discovery, Disease Mechanisms, Drug Mechanism of Action, Drug Mechanism of Toxicity, and Target Identification and Validation. Experimental approaches supported by CellScan™ may include one or more of lncRNA, Metabolomics, MicroArray, miRNA, mRNA, qPCR, Proteomics, and RNAseq.
Data analysis and interpretation with CellScan™ may build on comprehensive, manually curated content of a knowledge base. Powerful, quick, and efficient tools may be used to perform deep analysis of NGS and miRNA data to identify gene function, immunological and tissue cell type, pathways, and target/drug appropriate for a specific disease state.
CellScan™ features may be configured to optimize or maximize the impact of information that surfaces in an analysis so that interpretation of a dataset is comprehensive and elucidates actionable insights. These features may include one or more of: NGS RNAseq data analysis, biomarker scoring, and prioritizing targets and drugs for human clinical trials and/or pre-clinical models. The NGS RNAseq data analysis may comprise interrogating RNA and miRNA data for function, cell-type (immunological or tissue) and pathways. The biomarker scoring may comprise using a knowledge base and gene expression data to assess and prioritize biomarkers associated with a target disease or phenotype. The target/drug prioritization may comprise leveraging objective scoring of targets and drugs based on parameters such as scientific rationale, evidence in mouse/human cells, prior clinical data, overall drug properties, and the risk of adverse events.
The knowledge base may be a repository created from millions of individual pieces of information gathered about genes, cells, tissues, drugs, and diseases, manually reviewed for accuracy, and including rich contextual details and links to original publications. The knowledge base may enable access to relevant and substantiated knowledge from primary literature as well as public and private databases for comprehensive interpretation of NGS/RNAseq data, elucidating function/pathways and prioritizing targets/drugs for given disease states. An example list of reference databases may be used for the content in CellScan™, with both human and mouse species-specific identifiers supported.
MS-Scoring™ may be configured to identify receptor-ligand interactions and predict ongoing signaling pathways. In addition, MS-Scoring™ may be used to validate molecular pathways as potential targets for new or repurposed drug therapies. The specificity of next-generation drug therapies requires a way to understand the potential of a given therapy to act on the intended biochemical target. Moreover, a potential application of this is the repositioning of drug therapies that may have the correct biochemical targeting to address multiple clinical needs beyond the initial intended therapeutic value.
MS-Scoring™ may be specifically developed to address gaps in the QIAGEN IPA® (Ingenuity Pathway Analysis) tool, which does not contain many immunologically relevant pathways. Similar to IPA®, MS-Scoring™ 1 may use log-fold change information to score the target and its signaling pathway to verify the viability of the targets. If the fold-change of the genes of a signaling pathway appears to be upregulated or inhibitors appear to be downregulated, MS-Scoring™ 1 may provide a score of +1. Conversely, if the genes of a signaling pathway appear downregulated or the inhibitors upregulated, MS-Scoring™ 1 may provide a score of −1. A score of zero may be provided if no fold-change is observed. The scores may then be summed and normalized across the entire pathway to yield a final % score between −100 (inhibition) and +100 (up-regulation). Scores of higher absolute magnitude (i.e., scores that are close to −100 or +100) may indicate a high potential for therapeutic targeting. Fisher's exact test may be performed to determine if there is sufficient overlap of genes between the experimental differentially expressed genes and the genes in the signaling pathway.
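As a minimal sketch of the scoring arithmetic described above, and not the MS-Scoring™ 1 implementation itself, the following Python example assigns +1, −1, or 0 to each pathway gene, reverses the sign for inhibitors, and normalizes the sum to a percent score. The function names, the `threshold` parameter, the placeholder gene names, normalization by the number of measured pathway genes, and the one-sided form of the exact test are assumptions made for illustration.

```python
from math import comb


def pathway_percent_score(log_fold_changes, inhibitors=frozenset(), threshold=0.0):
    """Return a percent score between -100 and +100 for one pathway.

    log_fold_changes: dict mapping gene name -> log fold change.
    inhibitors: genes that inhibit the pathway (their vote is reversed).
    threshold: hypothetical cutoff; |change| <= threshold counts as no change.
    """
    if not log_fold_changes:
        raise ValueError("pathway has no measured genes")
    total = 0
    for gene, lfc in log_fold_changes.items():
        if abs(lfc) <= threshold:
            vote = 0                      # no fold change observed
        else:
            vote = 1 if lfc > 0 else -1   # up-regulated or down-regulated
        if gene in inhibitors:
            vote = -vote                  # inhibitor down => pathway up
        total += vote
    # Assumed normalization: divide by the number of measured pathway genes.
    return 100.0 * total / len(log_fold_changes)


def fisher_exact_overlap_p(n_overlap, n_pathway, n_de, n_universe):
    """One-sided Fisher's exact test p-value for gene-set overlap.

    Probability of observing at least n_overlap pathway genes among n_de
    differentially expressed genes drawn from n_universe genes in total.
    """
    max_k = min(n_pathway, n_de)
    numerator = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_de - k)
        for k in range(n_overlap, max_k + 1)
    )
    return numerator / comb(n_universe, n_de)


# Placeholder genes; GENE_I is treated as a pathway inhibitor.
example = {"GENE_A": 1.2, "GENE_B": 0.8, "GENE_C": -0.5, "GENE_I": -1.0}
score = pathway_percent_score(example, inhibitors={"GENE_I"})
```

In this example three of the four genes vote for activation and one votes against, so the pathway scores halfway between neutral and full up-regulation.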
A sample MS-Scoring™ 1 workflow may comprise the following steps. First, potential drugs and pathways are identified by LINCS (Library of Integrated Network-Based Cellular Signatures) as candidates for therapeutic intervention. Second, MS-Scoring™ 1 is used to evaluate individual transcript elements of the target pathway. Third, signatures are cross-referenced with purified single-cell microarray datasets and RNAseq experiments. Fourth, scores are compiled and normalized to provide an overall % score for the pathway and higher absolute magnitude scores indicate a higher potential for therapeutic targeting.
MS-Scoring™ 1 may be performed on IL-12 and IL-23 related pathways for targeting using ustekinumab for SLE (systemic lupus erythematosus) drug repositioning (e.g., as described by Grammer et al., 2016, “Drug repositioning in SLE: crowd-sourcing, literature-mining and Big Data analysis,” Lupus, 25(10), 1150-1170, which is incorporated herein by reference in its entirety).
MS-Scoring™ 2 may utilize custom-defined gene modules that represent a signaling pathway or process and is particularly useful for gene expression datasets from microarray or RNAseq. The MS-Scoring™ 2 tool may be configured to take a deeper look at signaling pathways analyzed using the MS-Scoring™ 1. The tool may analyze raw gene expression data and assess enrichment by the Gene Set Variation Analysis (as described herein), which assigns an indexed score to the individual co-expressed pathways between −1 and +1 indicating levels of down-regulation and up-regulation respectively.
A sample MS-Scoring™ 2 workflow may comprise the following steps. First, a signaling pathway of interest is selected from the MS-Scoring™ 2 menu. Second, raw gene expression data are input into the MS-Scoring™ 2 tool. Third, enrichment of signaling pathway(s) is assessed on a patient-by-patient basis. Fourth, the data may then be used to drive insight for the target signaling pathways in individual patient samples.
Results from GSVA analysis on SLE (systemic lupus erythematosus) signaling pathways may be obtained, e.g., as described by Hanzelmann et al., “GSVA: Gene Set Variation Analysis for Microarray and RNA-Seq Data,” BMC Bioinformatics, vol. 14, no. 1, 2013, p. 7, which is incorporated herein by reference in its entirety.
A scoring method called CoLTs®, or Combined Lupus Treatment Scoring, may be configured to assess and prioritize the repositioning potential of drug therapies. CoLTs® may rank identified drugs/therapies by a number of essential characteristics, including scientific rationale, experience in lupus mice/human cells (preclinical), previous clinical experience in autoimmunity, drug properties, and safety profile, including adverse events. Face and test validities may be established by scoring standard of care (SOC) medications and confirming the scores with a panel of lupus clinicians. The final result may be the CoLTs® score. A CoLTs® algorithm may also be configured for drugs in development (DID) since they typically do not have drug metabolism and adverse event information available.
CoLTs® may be configured to perform objective scoring of drug molecules based on a hypothesis-based literature search of publicly available databases. The tool has the ability to rank drug molecules from both FDA-approved and non-approved classes based upon parameters such as scientific rationale, evidence in mouse/human cells, prior clinical data, overall drug properties, and the risk of adverse events. The parameters are used within five independent drug therapy categories: small molecules, biologics, complementary and alternative therapies, and drugs in development.
CoLTs® may address the need for a systematic and objective way to evaluate the potential of drug therapies to be repositioned for treatment of autoimmune diseases, initially within SLE (systemic lupus erythematosus). The composite score may embody all the accessible information in literature databases, inclusive of efficacy and adverse reactions, to be able to assist in the prioritization of drug development. While the composite score takes into account many aspects of a drug, it may heavily weigh the risk of adverse events and ranges from −16 to +11. CoLT Scoring® may be validated through repeated scoring of 215 potential therapies using a total of over 5000 reference data points as well as by clinicians specializing in the field of rheumatology. Specifically, CoLTs®' prediction of Stelara/Ustekinumab to be a top priority biologic for lupus drug repositioning is validated by a successful Phase 2 clinical trial (e.g., as described by Vollenhoven et al., “Efficacy and Safety of Ustekinumab, an IL-12 and IL-23 Inhibitor, in Patients with Active Systemic Lupus Erythematosus: Results of a Multicentre, Double-Blind, Phase 2, Randomised, Controlled Study.” The Lancet, vol. 392, no. 10155, 2018, pp. 1330-1339, which is incorporated herein by reference in its entirety). CoLTs® may be calibrated on SoC (Standard of Care) therapies for the individual autoimmune disease being assessed.
Within the ten major categories, rationale ranges from 0 to +3, mouse/human in vitro experience ranges from −1 to +1, clinical properties are on a scale of −3 to +3, the adverse effect of inducing lupus ranges from −1 to 0, metabolic properties range from −2 to 0, and finally adverse events (such as toxicity, infection, carcinogenic, etc.) were given a score of −5 to 0 (e.g., as described by Grammer et al., 2016, “Drug repositioning in SLE: crowd-sourcing, literature-mining and Big Data analysis,” Lupus, 25(10), 1150-1170, which is incorporated herein by reference in its entirety). For example, CoLT Scoring® of SOC Therapies in Lupus (Belimumab, HCQ, and Rituximab) may be performed.
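The additive structure of such a composite score can be sketched as follows. This is an illustration only, not the CoLTs® algorithm: it uses just the six category ranges enumerated in the preceding paragraph, so it does not reproduce the full −16 to +11 composite range stated above, which presumably reflects additional categories or sub-scores not itemized here. The dictionary keys, the function name, and the example scores are hypothetical.

```python
# Per-category ranges taken from the paragraph above; the key names are
# hypothetical labels chosen for this sketch.
CATEGORY_RANGES = {
    "rationale": (0, 3),
    "mouse_human_in_vitro": (-1, 1),
    "clinical_properties": (-3, 3),
    "induces_lupus": (-1, 0),
    "metabolic_properties": (-2, 0),
    "adverse_events": (-5, 0),
}


def composite_score(category_scores):
    """Sum per-category scores after checking each against its range."""
    total = 0
    for name, (low, high) in CATEGORY_RANGES.items():
        value = category_scores[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
        total += value
    return total


# Hypothetical scores for an unnamed candidate therapy.
candidate = {
    "rationale": 3,
    "mouse_human_in_vitro": 1,
    "clinical_properties": 2,
    "induces_lupus": 0,
    "metabolic_properties": -1,
    "adverse_events": -2,
}
total = composite_score(candidate)
```

Range checking before summation keeps a single mis-entered category from silently shifting a therapy's rank.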
The Target scoring algorithm may be configured to prioritize a specific gene or protein that would potentially be a good choice to target with a drug in lupus patients. It may be utilized even if there is currently no drug available to the target gene or protein. The algorithm may be based on the addition of 18 data based determinations plus the overall scientific rationale and generates scores from −13 (not a good target in SLE) to 27 (very promising target in SLE).
Target-Scoring™ may be configured to assess and prioritize the potential of molecular targets for further development of drug therapies. The Target-Scoring™ tool is very similar to CoLTs® except it approaches the need for new SLE therapies from a different angle. Target-Scoring™ may be configured to perform an objective assessment of molecular targets for the development of new or repurposed drug therapies. Like CoLTs®, it also derives data from a hypothesis-based literature search and generates a composite score based on the publicly available information. Leveraging the composite score, researchers may better prioritize the development of novel drug therapies addressing the assessed targets of interest.
Target-Scoring™ may utilize 19 different scoring categories to derive a composite score that ranges from −13 to +27 for the suitability of a gene target for SLE therapy development. Target-Scoring™ may be validated through repeated scoring of potential therapies as well as by clinicians (e.g., clinicians specializing in the field of immunology).
A non-limiting example of a method to assess a condition of a subject, e.g., an SLE, DLE, PSO, AD, or SSc condition, may comprise one or more of the following operations. A dataset of a biological sample of a subject is received. The dataset may comprise quantitative measures of gene expression from each of a plurality of lupus-associated genomic loci.
A patient sample may be harvested using any method known to those of skill in the art. To obtain a blood sample, various techniques may be used, e.g., a syringe or other vacuum suction device. A blood sample may be optionally pre-treated or processed prior to use. A sample, such as a blood sample, may be analyzed under any of the methods and systems herein within 4 weeks, 2 weeks, 1 week, 6 days, 5 days, 4 days, 3 days, 2 days, 1 day, 12 hr, 6 hr, 3 hr, 2 hr, or 1 hr from the time the sample is obtained, or longer if frozen. When obtaining a sample from a subject (e.g., blood sample), the amount may vary depending upon subject size and the condition being screened. In some embodiments, at least 10 mL, 5 mL, 1 mL, 0.5 mL, 250, 200, 150, 100, 50, 40, 30, 20, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 μL of a sample is obtained. In some embodiments, 1-50, 2-40, 3-30, or 4-20 μL of sample is obtained. In some embodiments, more than 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100 μL of a sample is obtained.
To obtain a skin biopsy sample, various techniques may be used. The skin biopsy sample can include skin samples removed from the body of the subject. In certain embodiments, the skin biopsy sample includes cells and/or tissues from the cutaneous, intradermal, or subcutaneous layer, or from any abnormal tissue in these layers. In certain embodiments, the skin biopsy sample includes cutaneous tissues. In certain embodiments, the skin biopsy sample includes subcutaneous tissues. Skin biopsy can be performed using any suitable technique known to those of skill in the art. In certain embodiments, skin biopsy can be performed using shave biopsy, punch biopsy, excisional biopsy, or any combination thereof. In certain embodiments, the shave biopsy procedure includes removing a small section of the top layers of skin (epidermis and a portion of the dermis). In certain embodiments, the punch biopsy procedure includes using a tool, such as a circular tool, to remove a small core of skin, including deeper layers (epidermis, dermis and superficial fat). In certain embodiments, the excisional biopsy procedure includes removing a lump, lesion and/or an area of abnormal skin. In certain particular embodiments, the entire or effectively the entire lump, lesion and/or area of abnormal skin is removed. In certain particular embodiments, the lump, lesion and/or area of abnormal skin is removed through the fatty layer of the skin. The area, size, and amount of the skin biopsy sample may vary depending upon the condition being analyzed. In some embodiments, the skin sample is obtained via one, two, three, four, five, or more shave biopsies, punch biopsies, incisional or excisional biopsies, or a combination of the above. In some embodiments, the skin sample comprises, for example, elastin and/or collagen.
The skin sample may be obtained from any desired anatomical location on the body, including, for example, one or more of the scalp, face, neck, chest, arms, legs, hands, back, buttocks, upper or lower extremities, or genitalia. The skin sample may have any appropriate depth. In some embodiments the skin sample has a depth of about 2 mm to about 25 mm. In some embodiments the skin sample has a depth of about 2 mm to about 3 mm, about 2 mm to about 4 mm, about 2 mm to about 5 mm, about 2 mm to about 6 mm, about 2 mm to about 7 mm, about 2 mm to about 8 mm, about 2 mm to about 9 mm, about 2 mm to about 10 mm, about 2 mm to about 15 mm, about 2 mm to about 20 mm, about 2 mm to about 25 mm, about 3 mm to about 4 mm, about 3 mm to about 5 mm, about 3 mm to about 6 mm, about 3 mm to about 7 mm, about 3 mm to about 8 mm, about 3 mm to about 9 mm, about 3 mm to about 10 mm, about 3 mm to about 15 mm, about 3 mm to about 20 mm, about 3 mm to about 25 mm, about 4 mm to about 5 mm, about 4 mm to about 6 mm, about 4 mm to about 7 mm, about 4 mm to about 8 mm, about 4 mm to about 9 mm, about 4 mm to about 10 mm, about 4 mm to about 15 mm, about 4 mm to about 20 mm, about 4 mm to about 25 mm, about 5 mm to about 6 mm, about 5 mm to about 7 mm, about 5 mm to about 8 mm, about 5 mm to about 9 mm, about 5 mm to about 10 mm, about 5 mm to about 15 mm, about 5 mm to about 20 mm, about 5 mm to about 25 mm, about 6 mm to about 7 mm, about 6 mm to about 8 mm, about 6 mm to about 9 mm, about 6 mm to about 10 mm, about 6 mm to about 15 mm, about 6 mm to about 20 mm, about 6 mm to about 25 mm, about 7 mm to about 8 mm, about 7 mm to about 9 mm, about 7 mm to about 10 mm, about 7 mm to about 15 mm, about 7 mm to about 20 mm, about 7 mm to about 25 mm, about 8 mm to about 9 mm, about 8 mm to about 10 mm, about 8 mm to about 15 mm, about 8 mm to about 20 mm, about 8 mm to about 25 mm, about 9 mm to about 10 mm, about 9 mm to about 15 mm, about 9 mm to about 20 mm, about 9 mm to about 25 mm, about 10 mm to about 15 mm, about 10 mm to about 20 mm, about 10 mm to about 25 mm, about 15 mm to about 20 mm, about 15 mm to about 25 mm, or about 20 mm to about 25 mm. In some embodiments the skin sample has a depth of about 2 mm, about 3 mm, about 4 mm, about 5 mm, about 6 mm, about 7 mm, about 8 mm, about 9 mm, about 10 mm, about 15 mm, about 20 mm, or about 25 mm. In some embodiments the skin sample has a depth of at least about 2 mm, about 3 mm, about 4 mm, about 5 mm, about 6 mm, about 7 mm, about 8 mm, about 9 mm, about 10 mm, about 15 mm, or about 20 mm. In some embodiments the skin sample has a depth of at most about 3 mm, about 4 mm, about 5 mm, about 6 mm, about 7 mm, about 8 mm, about 9 mm, about 10 mm, about 15 mm, about 20 mm, or about 25 mm. The skin sample can include one or more layers of the epidermis, dermis, and hypodermis. Skin sampling techniques for disease analysis are widely described in the literature, e.g., by the Mayo Clinic on their website (mayoclinic.org), available under “Skin Biopsy” (Mayo Clinic, Rochester, MN), incorporated herein by reference in its entirety.
The sample may be taken before and/or after treatment of a subject with a disease or disorder. Samples may be obtained from a subject during a treatment or a treatment regime. Multiple samples may be obtained from a subject to monitor the effects of the treatment over time. The sample may be taken from a subject known or suspected of having a disease or disorder for which a definitive positive or negative diagnosis is not available via clinical tests. The sample may be taken from a subject suspected of having a disease or disorder. The sample may be taken from a subject experiencing unexplained symptoms, such as fatigue, nausea, weight loss, aches and pains, weakness, or bleeding. The sample may be taken from a subject having explained symptoms. The sample may be taken from a subject at risk of developing a disease or disorder due to factors such as familial history, age, hypertension or pre-hypertension, diabetes or pre-diabetes, overweight or obesity, environmental exposure, lifestyle risk factors (e.g., smoking, alcohol consumption, or drug use), or presence of other risk factors.
In some embodiments, a sample may be taken at a first time point and assayed, and then another sample may be taken at a subsequent time point and assayed. Such methods may be used, for example, for longitudinal monitoring purposes to track the development or progression of a disease or disorder (e.g., an SLE condition). In some embodiments, the progression of a disease may be tracked before treatment, after treatment, or during the course of treatment, to determine the treatment's effectiveness. For example, a method as described herein may be performed on a subject prior to, and after, treatment with an SLE therapy to measure the disease's progression or regression in response to the SLE therapy.
After obtaining a sample from the subject, the sample may be processed to generate datasets indicative of a condition (e.g., an SLE condition) of the subject. For example, a presence, absence, or quantitative assessment of nucleic acid molecules of the sample from a panel of condition-associated (e.g., SLE-associated) genomic loci may be indicative of a condition (e.g., an SLE condition) of the subject. Processing the sample obtained from the subject may comprise (i) subjecting the sample to conditions that are sufficient to isolate, enrich, or extract a plurality of nucleic acid molecules, and (ii) assaying the plurality of nucleic acid molecules to generate the dataset (e.g., microarray data, nucleic acid sequences, or quantitative polymerase chain reaction (qPCR) data). Methods of assaying may include any assay known in the art or described in the literature, for example, a microarray assay, a sequencing assay (e.g., DNA sequencing, RNA sequencing, or RNA-Seq), or a quantitative polymerase chain reaction (qPCR) assay.
In some embodiments, a plurality of nucleic acid molecules is extracted from the sample and subjected to sequencing to generate a plurality of sequencing reads. The nucleic acid molecules may comprise ribonucleic acid (RNA) or deoxyribonucleic acid (DNA). The extraction method may extract all RNA or DNA molecules from a sample. Alternatively, the extraction method may selectively extract a portion of RNA or DNA molecules from a sample. Extracted RNA molecules from a sample may be converted to cDNA molecules by reverse transcription (RT).
The sample may be processed without any nucleic acid extraction. For example, the disease or disorder may be identified or monitored in the subject by using probes configured to selectively enrich nucleic acid (e.g., RNA or DNA) molecules corresponding to a panel of SLE-associated genomic loci. The probes may be nucleic acid primers. The probes may have sequence complementarity with nucleic acid sequences from one or more of the panel of condition-associated (e.g., SLE-associated) genomic loci. The panel of condition-associated genomic loci may comprise at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least about 25, at least about 30, at least about 35, at least about 40, at least about 45, at least about 50, at least about 55, at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, at least about 90, at least about 95, at least about 100, or more condition-associated genomic loci.
The probes may be nucleic acid molecules (e.g., RNA or DNA) having sequence complementarity with nucleic acid sequences (e.g., RNA or DNA) of one or more genomic loci (e.g., condition-associated genomic loci). These nucleic acid molecules may be primers or enrichment sequences. The assaying of the sample using probes that are selective for the one or more genomic loci (e.g., condition-associated genomic loci) may comprise use of array hybridization, polymerase chain reaction (PCR), or nucleic acid sequencing (e.g., RNA sequencing or DNA sequencing, such as RNA-Seq).
The assay readouts may be quantified at one or more genomic loci (e.g., condition-associated genomic loci) to generate the data indicative of the disease or disorder. For example, quantification of array hybridization or polymerase chain reaction (PCR) corresponding to a plurality of genomic loci (e.g., condition-associated genomic loci) may generate data indicative of the disease or disorder. Assay readouts may comprise quantitative PCR (qPCR) values, digital PCR (dPCR) values, digital droplet PCR (ddPCR) values, fluorescence values, etc., or normalized values thereof.
In some embodiments, the present disclosure provides a system, method, or kit having data analysis realized in a software application, computing hardware, or both. In various embodiments, the analysis application or system includes at least a data receiving module, a data pre-processing module, a data analysis module, a data interpretation module, or a data visualization module. In one embodiment, the data receiving module may comprise computer systems that connect laboratory hardware or instrumentation with computer systems that process laboratory data. In one embodiment, the data pre-processing module may comprise hardware systems or computer software that performs operations on the data in preparation for analysis. Examples of operations that may be applied to the data in the pre-processing module include affine transformations, denoising operations, data cleaning, reformatting, or subsampling. A data analysis module, which may be specialized for analyzing genomic data from one or more genomic materials, can, for example, take assembled genomic sequences and perform probabilistic and statistical analysis to identify abnormal patterns related to a disease, pathology, state, risk, condition, or phenotype. A data interpretation module may use analysis methods, for example, drawn from statistics, mathematics, or biology, to support understanding of the relation between the identified abnormal patterns and health conditions, functional states, prognoses, or risks. A data visualization module may use methods of mathematical modeling, computer graphics, or rendering to create visual representations of data that may facilitate the understanding or interpretation of results.
Feature sets may be generated from datasets obtained using one or more assays of a biological sample obtained or derived from a subject, and a trained algorithm may be used to process one or more of the feature sets to identify or assess a condition (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) of a subject. For example, the trained algorithm may be used to apply a machine learning classifier to a plurality of condition-associated genomic loci that are associated with two or more classes of individuals inputted into a machine learning model, in order to classify a subject into one of the two or more classes of individuals. For example, the trained algorithm may be used to apply a machine learning classifier to a plurality of condition-associated genomic loci that are associated with individuals with known conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) and individuals not having the condition (e.g., healthy individuals, or individuals who do not have a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition), in order to classify a subject as having the condition (e.g., positive test outcome) or not having the condition (e.g., negative test outcome).
The trained algorithm may be configured to identify the presence (e.g., positive test result) or absence (e.g., negative test result) of one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than 99%. This accuracy may be achieved for a set of at least about 25, at least about 50, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500, at least about 1,000, or more than about 1,000 independent samples.
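How such an accuracy figure may be computed over a set of independent samples can be sketched as follows. The example also computes the related measures named elsewhere in this disclosure (sensitivity, specificity, positive predictive value, and negative predictive value). The function name, the encoding of positive as 1 and negative as 0, and the toy label lists are assumptions made for illustration; the labels are not results from any study.

```python
def classification_metrics(true_labels, predicted_labels):
    """Compute binary classification metrics from paired label lists.

    Labels are assumed to be encoded as 1 (condition present) and
    0 (condition absent). A metric whose denominator is zero is None.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
    }


# Toy labels for eight hypothetical independent samples.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(truth, predicted)
```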
The trained algorithm may comprise a machine learning algorithm, such as a supervised machine learning algorithm. The supervised machine learning algorithm may comprise, for example, a Random Forest, a support vector machine (SVM), a neural network, or a deep learning algorithm. The trained algorithm may comprise a classification and regression tree (CART) algorithm. The trained algorithm may comprise an unsupervised machine learning algorithm.
The trained algorithm may comprise a classifier configured to accept as input a plurality of input variables or features (e.g., condition-associated genomic loci) and to produce or output one or more output values based on the plurality of input variables or features (e.g., condition-associated genomic loci). The plurality of input variables or features may comprise one or more datasets indicative of the presence (e.g., positive test result) or absence (e.g., negative test result) of one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition). For example, an input variable or feature may comprise a number of sequences corresponding to or aligning to each of the plurality of condition-associated genomic loci.
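As one illustration of the last example, a feature vector may be built by counting the sequences that align to each locus in a panel. The sketch below assumes alignment has already been performed and that each read has been reduced to the name of the locus it aligned to; the function name and the locus identifiers are hypothetical placeholders.

```python
from collections import Counter


def locus_count_features(aligned_loci, panel):
    """Return per-locus read counts in the fixed order given by panel.

    aligned_loci: iterable of locus names, one entry per aligned read.
    panel: ordered list of condition-associated loci used as features.
    Reads aligning outside the panel are ignored.
    """
    counts = Counter(aligned_loci)
    return [counts[locus] for locus in panel]


panel = ["LOCUS_1", "LOCUS_2", "LOCUS_3"]
reads = ["LOCUS_2", "LOCUS_1", "LOCUS_2", "OFF_PANEL"]
features = locus_count_features(reads, panel)
```

Keeping the panel order fixed means that every sample yields a vector with the same feature positions, which a downstream classifier requires.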
The plurality of input variables or features may also include clinical information of a subject, such as health data. For example, the health data of a subject may comprise one or more of: a diagnosis of one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition), a prognosis of one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition), a risk of having one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition), a treatment history of one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition), a history of previous treatment for one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition), a history of prescribed medications, a history of prescribed medical devices, age, height, weight, sex, smoking status, and one or more symptoms of the subject.
For example, the disease or disorder may comprise one or more of: systemic lupus erythematosus (SLE), discoid lupus erythematosus (DLE), lupus nephritis (LN), psoriasis (PSO), atopic dermatitis (AD), or systemic sclerosis (scleroderma, SSc). As another example, the symptoms may include one or more of: alopecia, anti-dsDNA seropositivity, arthritis, fever, hematuria, leukopenia, low serum complement, mucosal ulcer, myositis, pericarditis, pleurisy, proteinuria, pyuria, rash, thrombocytopenia, urinary cast, vasculitis, visual disturbance, or a combination thereof. As another example, the prescribed medications or drugs may include one or more of: antimalarials, corticosteroids, immunosuppressants, and nonsteroidal anti-inflammatory drugs (NSAIDs).
The trained algorithm may comprise a classifier, such that each of the one or more output values comprises one of a fixed number of possible values (e.g., a linear classifier, a logistic regression classifier, etc.) indicating a classification of the sample by the classifier. The trained algorithm may comprise a binary classifier, such that each of the one or more output values comprises one of two values (e.g., {0, 1}, {positive, negative}, or {high-risk, low-risk}) indicating a classification of the sample by the classifier. The trained algorithm may be another type of classifier, such that each of the one or more output values comprises one of more than two values (e.g., {0, 1, 2}, {positive, negative, or indeterminate}, or {high-risk, intermediate-risk, or low-risk}) indicating a classification of the sample by the classifier.
The classifier may be configured to classify samples by assigning output values, which may comprise descriptive labels, numerical values, or a combination thereof. Some of the output values may comprise descriptive labels. Such descriptive labels may provide an identification or indication of the presence (e.g., positive test result) or absence (e.g., negative test result) of one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) of the subject, and may comprise, for example, positive, negative, high-risk, intermediate-risk, low-risk, or indeterminate. Such descriptive labels may provide an identification of a treatment for the one or more conditions of the subject, and may comprise, for example, a therapeutic intervention, a duration of the therapeutic intervention, and/or a dosage of the therapeutic intervention suitable to treat the one or more conditions of the subject. Such descriptive labels may provide an identification of secondary clinical tests that may be appropriate to perform on the subject, and may comprise, for example, an imaging test, a blood test, a computed tomography (CT) scan, a magnetic resonance imaging (MRI) scan, an ultrasound scan, a chest X-ray, a positron emission tomography (PET) scan, a PET-CT scan, or any combination thereof. For example, such descriptive labels may provide a prognosis of the one or more conditions of the subject. As another example, such descriptive labels may provide a relative assessment of the one or more conditions of the subject. Some descriptive labels may be mapped to numerical values, for example, by mapping “positive” to 1 and “negative” to 0.
The classifier may be configured to classify samples by assigning output values that comprise numerical values, such as binary, integer, or continuous values. Such binary output values may comprise, for example, {0, 1}, {positive, negative}, or {high-risk, low-risk}. Such integer output values may comprise, for example, {0, 1, 2}. Such continuous output values may comprise, for example, a probability value of at least 0 and no more than 1. Such continuous output values may comprise, for example, an un-normalized probability value of at least 0. Such continuous output values may indicate a prognosis of the one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) of the subject. Some numerical values may be mapped to descriptive labels, for example, by mapping 1 to “positive” and 0 to “negative.”
The classifier may be configured to classify samples by assigning output values based on one or more cutoff values. For example, a binary classification of samples may assign an output value of “positive” or 1 if the sample indicates that the subject has at least a 50% probability of having one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition), thereby assigning the subject to a class of individuals receiving a positive test result. As another example, a binary classification of samples may assign an output value of “negative” or 0 if the sample indicates that the subject has less than a 50% probability of having one or more conditions (e.g., a disease or disorder), thereby assigning the subject to a class of individuals receiving a negative test result. In this case, a single cutoff value of 50% is used to classify samples into one of the two possible binary output values or classes of individuals (e.g., those receiving a positive test result and those receiving a negative test result). Examples of single cutoff values may include about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, and about 99%.
As another example, the classifier may be configured to classify samples by assigning an output value of “positive” or 1 if the sample indicates that the subject has a probability of having one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more. The classification of samples may assign an output value of “positive” or 1 if the sample indicates that the subject has a probability of having one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, more than about 85%, more than about 90%, more than about 91%, more than about 92%, more than about 93%, more than about 94%, more than about 95%, more than about 96%, more than about 97%, more than about 98%, or more than about 99%.
The classifier may be configured to classify samples by assigning an output value of “negative” or 0 if the sample indicates that the subject has a probability of having one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) of less than about 50%, less than about 45%, less than about 40%, less than about 35%, less than about 30%, less than about 25%, less than about 20%, less than about 15%, less than about 10%, less than about 9%, less than about 8%, less than about 7%, less than about 6%, less than about 5%, less than about 4%, less than about 3%, less than about 2%, or less than about 1%. The classification of samples may assign an output value of “negative” or 0 if the sample indicates that the subject has a probability of having one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) of no more than about 50%, no more than about 45%, no more than about 40%, no more than about 35%, no more than about 30%, no more than about 25%, no more than about 20%, no more than about 15%, no more than about 10%, no more than about 9%, no more than about 8%, no more than about 7%, no more than about 6%, no more than about 5%, no more than about 4%, no more than about 3%, no more than about 2%, or no more than about 1%.
The classifier may be configured to classify samples by assigning an output value of “indeterminate” or 2 if the sample is not classified as “positive”, “negative”, 1, or 0. In this case, a set of two cutoff values is used to classify samples into one of the three possible output values or classes of individuals (e.g., corresponding to outcome groups of individuals having “low risk,” “intermediate risk,” and “high risk” of having one or more conditions, such as a disease or disorder). Examples of sets of cutoff values may include {1%, 99%}, {2%, 98%}, {5%, 95%}, {10%, 90%}, {15%, 85%}, {20%, 80%}, {25%, 75%}, {30%, 70%}, {35%, 65%}, {40%, 60%}, and {45%, 55%}. Similarly, sets of n cutoff values may be used to classify samples into one of n+1 possible output values or classes of individuals, where n is any positive integer.
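As a non-limiting illustration, the cutoff-based assignment described above may be sketched as follows. The sketch assumes that the classifier has already produced a probability between 0 and 1, and that a probability equal to a cutoff value falls into the higher class (consistent with the "at least 50%" example above). The function name `classify_by_cutoffs` and its parameters are hypothetical and are used for illustration only.

```python
from bisect import bisect_right


def classify_by_cutoffs(probability, cutoffs, labels):
    """Map a probability to one of len(cutoffs) + 1 ordered output values.

    `cutoffs` must be sorted in ascending order, and `labels` must list the
    output values from the lowest-probability class to the highest.
    A probability equal to a cutoff is assigned to the higher class
    (an assumption made for this sketch).
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if list(cutoffs) != sorted(cutoffs):
        raise ValueError("cutoffs must be sorted in ascending order")
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("n cutoffs require exactly n + 1 labels")
    # bisect_right counts how many cutoffs are <= probability,
    # which is exactly the index of the class the sample falls into.
    return labels[bisect_right(list(cutoffs), probability)]


# Single cutoff of 50%: two possible output values.
binary = classify_by_cutoffs(0.50, [0.50], ["negative", "positive"])

# Set of two cutoffs {20%, 80%}: three possible output values.
ternary = classify_by_cutoffs(
    0.50, [0.20, 0.80], ["negative", "indeterminate", "positive"]
)
```

In this sketch, a set of n cutoff values yields n+1 possible output values, so the same function covers the binary case, the three-class case, and larger sets of cutoff values.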
The trained algorithm may be trained with a plurality of independent training samples. Each of the independent training samples may comprise a sample from a subject, associated datasets obtained by assaying the sample (as described elsewhere herein), and one or more known output values or classes of individuals corresponding to the sample (e.g., a clinical diagnosis, prognosis, absence, or treatment efficacy of a condition of the subject). Independent training samples may comprise samples and associated datasets and outputs obtained or derived from a plurality of different subjects. Independent training samples may comprise samples and associated datasets and outputs obtained at a plurality of different time points from the same subject (e.g., on a regular basis such as weekly, biweekly, or monthly), as part of a longitudinal monitoring of a subject before, during, and after a course of treatment for one or more conditions of the subject. Independent training samples may be associated with presence of the condition (e.g., training samples comprising samples and associated datasets and outputs obtained or derived from a plurality of subjects known to have the condition). Independent training samples may be associated with absence of the condition (e.g., training samples comprising samples and associated datasets and outputs obtained or derived from a plurality of subjects who are known to not have a previous diagnosis of the condition or who have received a negative test result for the condition).
The trained algorithm may be trained with at least about 5, at least about 10, at least about 15, at least about 20, at least about 25, at least about 30, at least about 35, at least about 40, at least about 45, at least about 50, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, or at least about 500 independent training samples. The independent training samples may comprise samples associated with presence of the condition and/or samples associated with absence of the condition. The trained algorithm may be trained with no more than about 500, no more than about 450, no more than about 400, no more than about 350, no more than about 300, no more than about 250, no more than about 200, no more than about 150, no more than about 100, or no more than about 50 independent training samples associated with presence of the condition (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition). The trained algorithm may be trained with no more than about 500, no more than about 450, no more than about 400, no more than about 350, no more than about 300, no more than about 250, no more than about 200, no more than about 150, no more than about 100, or no more than about 50 independent training samples associated with absence of the condition (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition). In some embodiments, the sample is independent of samples used to train the trained algorithm.
The trained algorithm may be trained with a first number of independent training samples associated with a presence of the condition (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) and a second number of independent training samples associated with an absence of the condition (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition). The first number of independent training samples associated with presence of the condition (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) may be no more than the second number of independent training samples associated with absence of the condition (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition). The first number of independent training samples associated with a presence of the condition (e.g., a disease or disorder) may be equal to the second number of independent training samples associated with an absence of the condition (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition). The first number of independent training samples associated with a presence of the condition (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) may be greater than the second number of independent training samples associated with an absence of the condition (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition).
The trained algorithm may comprise a classifier configured to identify the presence (e.g., positive test result) or absence (e.g., negative test result) of one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) at an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more; for at least about 5, at least about 10, at least about 15, at least about 20, at least about 25, at least about 30, at least about 35, at least about 40, at least about 45, at least about 50, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, or at least about 500 independent training samples. The accuracy of identifying the presence (e.g., positive test result) or absence (e.g., negative test result) of the one or more conditions by the trained algorithm may be calculated as the percentage of independent test samples (e.g., subjects known to have the condition or subjects with negative clinical test results for the condition) that are correctly identified or classified as having or not having the condition.
The trained algorithm may comprise a classifier configured to identify one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) with a positive predictive value (PPV) of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more. The PPV of identifying the condition using the trained algorithm may be calculated as the percentage of samples identified or classified as having the condition that correspond to subjects that truly have the condition.
The trained algorithm may comprise a classifier configured to identify one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) with a negative predictive value (NPV) of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more. The NPV of identifying the condition using the trained algorithm may be calculated as the percentage of samples identified or classified as not having the condition that correspond to subjects that truly do not have the condition.
The trained algorithm may comprise a classifier configured to identify one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) with a clinical sensitivity of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, at least about 99.1%, at least about 99.2%, at least about 99.3%, at least about 99.4%, at least about 99.5%, at least about 99.6%, at least about 99.7%, at least about 99.8%, at least about 99.9%, at least about 99.99%, at least about 99.999%, or more. The clinical sensitivity of identifying the condition using the trained algorithm may be calculated as the percentage of independent test samples associated with presence of the condition (e.g., subjects known to have the condition) that are correctly identified or classified as having the condition.
The trained algorithm may comprise a classifier configured to identify one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) with a clinical specificity of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, at least about 99.1%, at least about 99.2%, at least about 99.3%, at least about 99.4%, at least about 99.5%, at least about 99.6%, at least about 99.7%, at least about 99.8%, at least about 99.9%, at least about 99.99%, at least about 99.999%, or more. The clinical specificity of identifying the condition using the trained algorithm may be calculated as the percentage of independent test samples associated with absence of the condition (e.g., subjects with negative clinical test results for the condition) that are correctly identified or classified as not having the condition.
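As a non-limiting illustration, the accuracy, PPV, NPV, clinical sensitivity, and clinical specificity defined above may be calculated from the counts of true positives, false positives, true negatives, and false negatives among independent test samples. The sketch below assumes binary labels in which 1 denotes presence and 0 denotes absence of the condition. The function name `performance_metrics` and the example labels are hypothetical.

```python
def performance_metrics(y_true, y_pred):
    """Compute classification metrics as fractions between 0 and 1.

    y_true: known status of each test sample (1 = condition present).
    y_pred: status assigned by the classifier (1 = classified as present).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(numerator, denominator):
        # A metric is undefined when its denominator is zero.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "ppv": ratio(tp, tp + fp),          # classified positive and truly positive
        "npv": ratio(tn, tn + fn),          # classified negative and truly negative
        "sensitivity": ratio(tp, tp + fn),  # truly positive and classified positive
        "specificity": ratio(tn, tn + fp),  # truly negative and classified negative
    }


# Hypothetical test set: 4 subjects with the condition, 6 without.
known = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
assigned = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = performance_metrics(known, assigned)
```

Multiplying each returned fraction by 100 gives the corresponding percentage as recited above.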
The trained algorithm may comprise a classifier configured to identify the presence (e.g., positive test result) or absence (e.g., negative test result) of one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) with an Area-Under-Curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.81, at least about 0.82, at least about 0.83, at least about 0.84, at least about 0.85, at least about 0.86, at least about 0.87, at least about 0.88, at least about 0.89, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more. The AUC may be calculated as an integral of the Receiver Operating Characteristic (ROC) curve (e.g., the area under the ROC curve) associated with the trained algorithm in classifying samples as having or not having the condition.
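As a non-limiting illustration, the area under the ROC curve may be approximated by sweeping a cutoff over the classifier's continuous output values and integrating the resulting curve with the trapezoidal rule. The sketch below assumes binary labels (1 = condition present) and treats samples with identical output values as a single step of the curve. The function name `roc_auc` and the example values are hypothetical.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve, integrated with the trapezoidal rule."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present to compute an AUC")

    # Visit samples from the highest output value to the lowest, which is
    # equivalent to lowering the cutoff step by step.
    ordered = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)

    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    area = 0.0
    i = 0
    while i < len(ordered):
        cutoff = ordered[i][0]
        # Consume every sample tied at this output value before adding a point.
        while i < len(ordered) and ordered[i][0] == cutoff:
            if ordered[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        tpr = tp / n_pos  # sensitivity at this cutoff
        fpr = fp / n_neg  # 1 - specificity at this cutoff
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_tpr, prev_fpr = tpr, fpr
    return area


# Hypothetical output values for two subjects without and two with the condition.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

An AUC of 0.5 corresponds to output values that carry no information about the condition, and an AUC of 1.0 corresponds to perfect separation of the two classes.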
Classifiers of the trained algorithm may be adjusted or tuned to improve or optimize one or more performance metrics, such as accuracy, PPV, NPV, clinical sensitivity, clinical specificity, AUC, or a combination thereof (e.g., a performance index incorporating a plurality of such performance metrics, such as by calculating a weighted sum therefrom), of identifying the presence (e.g., positive test result) or absence (e.g., negative test result) of the condition. The classifiers may be adjusted or tuned by adjusting parameters of the classifiers (e.g., a set of cutoff values used to classify a sample as described elsewhere herein, or weights of a neural network) to improve or optimize the performance metrics. The one or more classifiers may be adjusted or tuned so as to reduce an overall classification error (e.g., an “out-of-bag” or OOB error rate for a Random Forest classifier). The one or more classifiers may be adjusted or tuned continuously during the training process (e.g., as sample datasets are added to the training set) or after the training process has completed.
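As a non-limiting illustration, one way of tuning a cutoff value is to evaluate a performance index at each of several candidate cutoff values and retain the candidate with the highest index. The sketch below assumes a performance index formed as a weighted sum of clinical sensitivity and clinical specificity; the choice of these two metrics, the equal default weights, the candidate values, and the function names are assumptions made for this example only.

```python
def sensitivity_specificity(y_true, probabilities, cutoff):
    """Sensitivity and specificity when 'positive' means probability >= cutoff."""
    tp = fn = tn = fp = 0
    for truth, prob in zip(y_true, probabilities):
        predicted_positive = prob >= cutoff
        if truth == 1:
            tp += predicted_positive
            fn += not predicted_positive
        else:
            fp += predicted_positive
            tn += not predicted_positive
    return tp / (tp + fn), tn / (tn + fp)


def tune_cutoff(y_true, probabilities, candidates, w_sens=0.5, w_spec=0.5):
    """Return (best_cutoff, best_index) over the candidate cutoff values.

    The performance index is a weighted sum of sensitivity and specificity.
    When two candidates tie, the one listed first is kept.
    """
    best_cutoff, best_index = None, float("-inf")
    for cutoff in candidates:
        sens, spec = sensitivity_specificity(y_true, probabilities, cutoff)
        index = w_sens * sens + w_spec * spec
        if index > best_index:
            best_cutoff, best_index = cutoff, index
    return best_cutoff, best_index


# Hypothetical tuning set: four subjects without and four with the condition.
known = [0, 0, 0, 0, 1, 1, 1, 1]
probs = [0.1, 0.3, 0.4, 0.7, 0.45, 0.6, 0.8, 0.9]
cutoff, index = tune_cutoff(known, probs, [0.25, 0.35, 0.45, 0.55, 0.65])
```

Changing the weights shifts the selected cutoff toward higher sensitivity or higher specificity, which may be appropriate where false negatives and false positives carry different clinical costs.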
The trained algorithm may comprise a plurality of classifiers (e.g., an ensemble) such that the plurality of classifications or outcome values of the plurality of classifiers may be combined to produce a single classification or outcome value for the sample. For example, a sum or a weighted sum of the plurality of classifications or outcome values of the plurality of classifiers may be calculated to produce a single classification or outcome value for the sample. As another example, a majority vote of the plurality of classifications or outcome values of the plurality of classifiers may be identified to produce a single classification or outcome value for the sample. In this manner, a single classification or outcome value may be produced for the sample having greater confidence or statistical significance than the individual classifications or outcome values produced by each of the plurality of classifiers.
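As a non-limiting illustration, the outputs of a plurality of classifiers may be combined by a majority vote or by a weighted sum. In the sketch below, the handling of a tied vote (returning "indeterminate") and the normalization of the weighted sum by the total weight are assumptions made for this example, and the function names are hypothetical.

```python
from collections import Counter


def majority_vote(classifications, tie_label="indeterminate"):
    """Return the most frequent classification among the classifiers.

    If the two most frequent classifications are tied, `tie_label` is
    returned instead (an assumption made for this sketch).
    """
    ranked = Counter(classifications).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_label
    return ranked[0][0]


def weighted_combination(outcome_values, weights):
    """Weighted sum of numerical outcome values, normalized by total weight.

    Normalizing keeps the result on the same scale as the inputs, so a
    combination of probabilities remains a value between 0 and 1.
    """
    if len(outcome_values) != len(weights):
        raise ValueError("one weight is required per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(v * w for v, w in zip(outcome_values, weights)) / total


# Three hypothetical classifiers assessing the same sample.
label = majority_vote(["positive", "negative", "positive"])      # "positive"
probability = weighted_combination([0.9, 0.6, 0.3], [2, 1, 1])   # approx. 0.675
```

The combined probability may then be compared against one or more cutoff values to obtain the single classification for the sample.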
After the trained algorithm is initially trained, a subset of the inputs may be identified as most influential or most important to be included for making high-quality classifications (e.g., having highest permutation feature importance). For example, a subset of the panel of condition-associated genomic loci may be identified as most influential or most important to be included for making high-quality classifications or identifications of conditions (or sub-types of conditions). The panel of condition-associated genomic loci, or a subset thereof, may be ranked based on classification metrics indicative of the influence or importance of each individual condition-associated genomic locus toward making high-quality classifications or identifications of conditions (or sub-types of conditions). Such metrics may be used to reduce, in some cases significantly, the number of input variables (e.g., predictor variables) that may be used to train the one or more classifiers of the trained algorithm to a desired performance level (e.g., based on a desired minimum accuracy, PPV, NPV, clinical sensitivity, clinical specificity, AUC, or a combination thereof).
For example, if training a classifier of the trained algorithm with a plurality comprising several dozen or hundreds of input variables to the classifier results in an accuracy of classification of more than 99%, then training the classifier of the trained algorithm instead with only a selected subset of no more than about 5, no more than about 10, no more than about 15, no more than about 20, no more than about 25, no more than about 30, no more than about 35, no more than about 40, no more than about 45, no more than about 50, or no more than about 100 such most influential or most important input variables among the plurality may yield decreased but still acceptable accuracy of classification (e.g., at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%).
As another example, if training a classifier of the trained algorithm with a plurality comprising several dozen or hundreds of input variables to the classifier results in a sensitivity or specificity of classification of more than 99%, then training the classifier of the trained algorithm instead with only a selected subset of no more than about 5, no more than about 10, no more than about 15, no more than about 20, no more than about 25, no more than about 30, no more than about 35, no more than about 40, no more than about 45, no more than about 50, or no more than about 100 such most influential or most important input variables among the plurality may yield decreased but still acceptable sensitivity or specificity of classification (e.g., at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%).
The subset of the plurality of input variables (e.g., the panel of condition-associated genomic loci) to the classifier of the trained algorithm may be selected by rank-ordering the entire plurality of input variables and selecting a predetermined number (e.g., no more than about 5, no more than about 10, no more than about 15, no more than about 20, no more than about 25, no more than about 30, no more than about 35, no more than about 40, no more than about 45, no more than about 50, or no more than about 100) of input variables with the best classification metrics (e.g., permutation feature importance).
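As a non-limiting illustration, permutation feature importance may be estimated by shuffling the values of one input variable at a time and measuring the resulting decrease in accuracy, after which the input variables may be rank-ordered and a predetermined number selected. The sketch below assumes that the classifier is available as a callable that maps one row of input values to a class label. The function names, the locus names, and the toy classifier are hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model returns the known label."""
    return sum(1 for row, y in zip(rows, labels) if model(row) == y) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=5, seed=0):
    """Mean drop in accuracy when each input variable is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)  # break the link between variable j and the label
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def select_top_features(names, importances, k):
    """Rank-order the input variables and keep the k most important."""
    ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy data: the label depends only on the first of two hypothetical loci.
rows = [[i % 2, (i * 7) % 5] for i in range(20)]
labels = [row[0] for row in rows]
toy_model = lambda row: row[0]  # stand-in for a trained classifier

scores = permutation_importance(toy_model, rows, labels)
selected = select_top_features(["LOCUS_A", "LOCUS_B"], scores, k=1)
```

In practice the importance would be estimated on samples held out from training, and the classifier would then be retrained on the selected subset to confirm that the desired performance level is still met.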
Upon identifying the subject as having one or more conditions (e.g., a disease or disorder, such as a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition), the subject may be optionally provided with a therapeutic intervention (e.g., prescribing an appropriate course of treatment to treat the one or more conditions of the subject). The therapeutic intervention may comprise a prescription of an effective dose of a drug, a further testing or evaluation of the condition, a further monitoring of the condition, or a combination thereof. If the subject is currently being treated for the condition with a course of treatment, the therapeutic intervention may comprise a subsequent different course of treatment (e.g., to increase treatment efficacy due to non-efficacy of the current course of treatment).
The therapeutic intervention may include prescribed medications or drugs, which may include one or more of: antimalarials, corticosteroids, immunosuppressants, and nonsteroidal anti-inflammatory drugs (NSAIDs). The therapeutic intervention may be effective to alleviate or decrease one or more symptoms, which may include one or more of: alopecia, anti-dsDNA seropositivity, arthritis, fever, hematuria, leukopenia, low serum complement, mucosal ulcer, myositis, pericarditis, pleurisy, proteinuria, pyuria, rash, thrombocytopenia, urinary cast, vasculitis, visual disturbance, or a combination thereof.
The therapeutic intervention may comprise recommending the subject for a secondary clinical test to confirm a diagnosis of the condition. This secondary clinical test may comprise an imaging test, a blood test, a computed tomography (CT) scan, a magnetic resonance imaging (MRI) scan, an ultrasound scan, a chest X-ray, a positron emission tomography (PET) scan, a PET-CT scan, or any combination thereof.
The feature sets (e.g., comprising quantitative measures of a panel of condition-associated genomic loci) may be analyzed and assessed (e.g., using a trained algorithm comprising one or more classifiers) over a duration of time to monitor a patient (e.g., subject who has a condition or who is being treated for a condition). In such cases, the feature sets of the patient may change during the course of treatment. For example, the quantitative measures of the feature sets of a patient with decreasing risk of the condition due to an effective treatment may shift toward the profile or distribution of a healthy subject (e.g., a subject without the condition). Conversely, for example, the quantitative measures of the feature sets of a patient with increasing risk of the condition due to an ineffective treatment may shift toward the profile or distribution of a subject with higher risk of the condition or a more advanced stage or severity of the condition.
The condition of the subject may be monitored by monitoring a course of treatment for treating the condition of the subject. The monitoring may comprise assessing the condition of the subject at two or more time points. The assessing may be based at least on the feature sets (e.g., quantitative measures of a panel of condition-associated genomic loci) determined at each of the two or more time points. The therapeutic intervention may include prescribed medications or drugs, which may include one or more of: antimalarials, corticosteroids, immunosuppressants, and nonsteroidal anti-inflammatory drugs (NSAIDs). The therapeutic intervention may be effective to alleviate or decrease one or more symptoms, which may include one or more of: alopecia, anti-dsDNA seropositivity, arthritis, fever, hematuria, leukopenia, low serum complement, mucosal ulcer, myositis, pericarditis, pleurisy, proteinuria, pyuria, rash, thrombocytopenia, urinary cast, vasculitis, visual disturbance, or a combination thereof. The assessing may be based at least on the presence, absence, or severity of one or more symptoms, such as alopecia, anti-dsDNA seropositivity, arthritis, fever, hematuria, leukopenia, low serum complement, mucosal ulcer, myositis, pericarditis, pleurisy, proteinuria, pyuria, rash, thrombocytopenia, urinary cast, vasculitis, visual disturbance, or a combination thereof.
In some embodiments, a difference in the feature sets (e.g., quantitative measures of a panel of condition-associated genomic loci) determined between the two or more time points may be indicative of one or more clinical indications, such as (i) a diagnosis of the condition of the subject, (ii) a prognosis of the condition of the subject, (iii) an increased risk of the condition of the subject, (iv) a decreased risk of the condition of the subject, (v) an efficacy of the course of treatment for treating the condition of the subject, and (vi) a non-efficacy of the course of treatment for treating the condition of the subject.
In some embodiments, a difference in the feature sets (e.g., quantitative measures of a panel of condition-associated genomic loci) determined between the two or more time points may be indicative of a diagnosis of the condition of the subject. For example, if the condition was not detected in the subject at an earlier time point but was detected in the subject at a later time point, then the difference is indicative of a diagnosis of the condition of the subject. A clinical action or decision may be made based on this indication of diagnosis of the condition of the subject, such as, for example, prescribing a new therapeutic intervention for the subject. The clinical action or decision may comprise recommending the subject for a secondary clinical test to confirm the diagnosis of the condition. This secondary clinical test may comprise an imaging test, a blood test, a computed tomography (CT) scan, a magnetic resonance imaging (MRI) scan, an ultrasound scan, a chest X-ray, a positron emission tomography (PET) scan, a PET-CT scan, or any combination thereof.
In some embodiments, a difference in the feature sets (e.g., quantitative measures of a panel of condition-associated genomic loci) determined between the two or more time points may be indicative of a prognosis of the condition of the subject.
In some embodiments, a difference in the feature sets (e.g., quantitative measures of a panel of condition-associated genomic loci) determined between the two or more time points may be indicative of the subject having an increased risk of the condition. For example, if the condition was detected in the subject both at an earlier time point and at a later time point, and if the difference is a negative difference (e.g., the quantitative measures of a panel of condition-associated genomic loci increased from the earlier time point to the later time point), then the difference may be indicative of the subject having an increased risk of the condition. A clinical action or decision may be made based on this indication of the increased risk of the condition, e.g., prescribing a new therapeutic intervention or switching therapeutic interventions (e.g., ending a current treatment and prescribing a new treatment) for the subject. The clinical action or decision may comprise recommending the subject for a secondary clinical test to confirm the increased risk of the condition. This secondary clinical test may comprise an imaging test, a blood test, a computed tomography (CT) scan, a magnetic resonance imaging (MRI) scan, an ultrasound scan, a chest X-ray, a positron emission tomography (PET) scan, a PET-CT scan, or any combination thereof.
In some embodiments, a difference in the feature sets (e.g., quantitative measures of a panel of condition-associated genomic loci) determined between the two or more time points may be indicative of the subject having a decreased risk of the condition. For example, if the condition was detected in the subject both at an earlier time point and at a later time point, and if the difference is a positive difference (e.g., the quantitative measures of a panel of condition-associated genomic loci decreased from the earlier time point to the later time point), then the difference may be indicative of the subject having a decreased risk of the condition. A clinical action or decision may be made based on this indication of the decreased risk of the condition (e.g., continuing or ending a current therapeutic intervention) for the subject. The clinical action or decision may comprise recommending the subject for a secondary clinical test to confirm the decreased risk of the condition. This secondary clinical test may comprise an imaging test, a blood test, a computed tomography (CT) scan, a magnetic resonance imaging (MRI) scan, an ultrasound scan, a chest X-ray, a positron emission tomography (PET) scan, a PET-CT scan, or any combination thereof.
In some embodiments, a difference in the feature sets (e.g., quantitative measures of a panel of condition-associated genomic loci) determined between the two or more time points may be indicative of an efficacy of the course of treatment for treating the condition of the subject. For example, if the condition was detected in the subject at an earlier time point but was not detected in the subject at a later time point, then the difference may be indicative of an efficacy of the course of treatment for treating the condition of the subject. A clinical action or decision may be made based on this indication of the efficacy of the course of treatment for treating the condition of the subject, e.g., continuing or ending a current therapeutic intervention for the subject. The clinical action or decision may comprise recommending the subject for a secondary clinical test to confirm the efficacy of the course of treatment for treating the condition. This secondary clinical test may comprise an imaging test, a blood test, a computed tomography (CT) scan, a magnetic resonance imaging (MRI) scan, an ultrasound scan, a chest X-ray, a positron emission tomography (PET) scan, a PET-CT scan, or any combination thereof.
In some embodiments, a difference in the feature sets (e.g., quantitative measures of a panel of condition-associated genomic loci) determined between the two or more time points may be indicative of a non-efficacy of the course of treatment for treating the condition of the subject. For example, if the condition was detected in the subject both at an earlier time point and at a later time point, and if the difference is a negative or zero difference (e.g., the quantitative measures of a panel of condition-associated genomic loci increased or remained at a constant level from the earlier time point to the later time point), and if an efficacious treatment was indicated at an earlier time point, then the difference may be indicative of a non-efficacy of the course of treatment for treating the condition of the subject. A clinical action or decision may be made based on this indication of the non-efficacy of the course of treatment for treating the condition of the subject, e.g., ending a current therapeutic intervention and/or switching to (e.g., prescribing) a different new therapeutic intervention for the subject. The clinical action or decision may comprise recommending the subject for a secondary clinical test to confirm the non-efficacy of the course of treatment for treating the condition. This secondary clinical test may comprise an imaging test, a blood test, a computed tomography (CT) scan, a magnetic resonance imaging (MRI) scan, an ultrasound scan, a chest X-ray, a positron emission tomography (PET) scan, a PET-CT scan, or any combination thereof.
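As a non-limiting illustration, the mapping described above from a change between two time points to a clinical indication can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `classify_change`, the use of the mean of the panel measures as a summary, the `on_treatment` flag, and the tolerance `tol` are all hypothetical and are not specified in the disclosure.

```python
from statistics import mean


def classify_change(earlier, later, detected_earlier, detected_later,
                    on_treatment=False, tol=0.0):
    """Map a change in panel measures between two time points to an indication.

    earlier, later: quantitative measures of the panel at each time point
    (hypothetical summary: the mean of the panel is compared).
    detected_earlier, detected_later: whether the condition was detected.
    on_treatment: whether a course of treatment was under way (assumption).
    tol: changes within +/- tol are treated as "no change" (assumption).
    """
    if not detected_earlier and detected_later:
        return "diagnosis"
    if detected_earlier and not detected_later:
        return "efficacy"
    if not detected_earlier and not detected_later:
        return "not detected"

    # Condition detected at both time points: look at the direction of change.
    delta = mean(later) - mean(earlier)
    if delta < -tol:
        # Measures decreased (a "positive difference" in the text above).
        return "decreased risk"
    if on_treatment:
        # Measures increased or stayed constant despite treatment.
        return "non-efficacy"
    if delta > tol:
        # Measures increased (a "negative difference" in the text above).
        return "increased risk"
    return "unchanged"


if __name__ == "__main__":
    print(classify_change([1.0, 2.0], [2.0, 3.0], True, True))
    print(classify_change([2.0, 3.0], [1.0, 2.0], True, True))
    print(classify_change([2.0, 3.0], [2.0, 3.0], True, True, on_treatment=True))
```

In practice, any such indication would inform, rather than replace, the clinical action or secondary clinical test described above.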
In various embodiments, machine learning methods are applied to distinguish samples in a population of samples. In one embodiment, machine learning methods are applied to distinguish samples between healthy and diseased (e.g., a lupus condition such as SLE or DLE, psoriasis, atopic dermatitis, or systemic sclerosis (scleroderma)) samples.
The present disclosure provides kits for identifying or monitoring a disease or disorder (e.g., a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) of a subject. A kit may comprise probes for identifying a quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of a panel of condition-associated genomic loci in a sample of the subject. A quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of a panel of condition-associated genomic loci in the sample may be indicative of the disease or disorder (e.g., a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition) of the subject. The probes may be selective for the sequences at the panel of condition-associated genomic loci in the sample. A kit may comprise instructions for using the probes to process the sample to generate datasets indicative of a quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of the panel of condition-associated genomic loci in a sample of the subject.
The probes in the kit may be selective for the sequences at the panel of condition-associated genomic loci in the sample. The probes in the kit may be configured to selectively enrich nucleic acid (e.g., RNA or DNA) molecules corresponding to the panel of condition-associated genomic loci. The probes in the kit may be nucleic acid primers. The probes in the kit may have sequence complementarity with nucleic acid sequences from one or more of the panel of condition-associated genomic loci. The panel of condition-associated genomic loci or genomic regions may comprise at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, or more distinct condition-associated genomic loci.
The instructions in the kit may comprise instructions to assay the sample using the probes that are selective for the sequences at the panel of condition-associated genomic loci in the cell-free biological sample. These probes may be nucleic acid molecules (e.g., RNA or DNA) having sequence complementarity with nucleic acid sequences (e.g., RNA or DNA) from one or more of the panel of condition-associated genomic loci. These nucleic acid molecules may be primers or enrichment sequences. The instructions to assay the cell-free biological sample may comprise instructions to perform array hybridization, polymerase chain reaction (PCR), or nucleic acid sequencing (e.g., DNA sequencing or RNA sequencing) to process the sample to generate datasets indicative of a quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of the panel of condition-associated genomic loci in the sample. A quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of a panel of condition-associated genomic loci in the sample may be indicative of a disease or disorder (e.g., a lupus, psoriasis, atopic dermatitis, and/or systemic sclerosis (scleroderma) condition).
The instructions in the kit may comprise instructions to measure and interpret assay readouts, which may be quantified at one or more of the panel of condition-associated genomic loci to generate the datasets indicative of a quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of the panel of condition-associated genomic loci in the sample. For example, quantification of array hybridization or polymerase chain reaction (PCR) corresponding to the panel of condition-associated genomic loci may generate the datasets indicative of a quantitative measure (e.g., indicative of a presence, absence, or relative amount) of sequences at each of the panel of condition-associated genomic loci in the sample. Assay readouts may comprise quantitative PCR (qPCR) values, digital PCR (dPCR) values, digital droplet PCR (ddPCR) values, fluorescence values, etc., or normalized values thereof. Various systems, methods, classifiers and kits are described in WO 2020/102043, which is entirely incorporated herein by reference.
In some embodiments, the dataset comprises RNA gene expression or transcriptome data, DNA genomic data, or a combination thereof. In some embodiments, the biological sample comprises a whole blood (WB) sample, a PBMC sample, a tissue sample, a cell sample or any derivative thereof. In some embodiments, assessing the SLE condition of the subject comprises determining a diagnosis of the SLE condition, a prognosis of the SLE condition, a susceptibility of the SLE condition, a treatment for the SLE condition, or an efficacy or non-efficacy of a treatment for the SLE condition.
In some embodiments, the method further comprises determining a diagnosis of the SLE condition with a sensitivity of at least about 70%. In some embodiments, the method further comprises determining a diagnosis of the SLE condition with a specificity of at least about 70%. In some embodiments, the method further comprises determining a diagnosis of the SLE condition with a positive predictive value of at least about 70%. In some embodiments, the method further comprises determining a diagnosis of the SLE condition with a negative predictive value of at least about 70%. In some embodiments, the method further comprises determining a diagnosis of the SLE condition with an Area Under Curve (AUC) of at least about 70%. In some embodiments, the method further comprises determining a likelihood of the diagnosis of the SLE condition of the subject.
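As a non-limiting illustration, the performance measures recited above (sensitivity, specificity, positive predictive value, negative predictive value, and AUC) can be computed from known labels and classifier outputs as sketched below. The function names and the example data are hypothetical. The AUC is computed here by the rank-based (pairwise) definition, which is one standard way to obtain it and is not stated in the disclosure.

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV from binary labels (1 = disease)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def auc(y_true, scores):
    """Area under the ROC curve as the fraction of (diseased, healthy) pairs
    in which the diseased sample has the higher score; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels, predicted classes, and classifier scores.
    y_true = [1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.6]
    print(diagnostic_metrics(y_true, y_pred))
    print(auc(y_true, scores))
```

A threshold such as "at least about 70%" would then be checked against each returned value, with AUC expressed on a 0 to 1 scale (0.70 corresponding to 70%).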
In some embodiments, the method further comprises generating a plurality of drug candidates for the SLE condition of the subject. In some embodiments, the method further comprises evaluating or predicting a relative efficacy of the plurality of drug candidates for the SLE condition of the subject. In some embodiments, the method further comprises providing a therapeutic intervention comprising one or more of the plurality of drug candidates for the SLE condition of the subject.
In some embodiments, the method further comprises monitoring the SLE condition of the subject, wherein the monitoring comprises assessing the SLE condition of the subject at each of a plurality of time points, and processing the plurality of assessments of the SLE condition of the subject at each of the plurality of time points.
The following illustrative examples are representative of embodiments of the software applications, systems, and methods described herein and are not meant to be limiting in any way.
In an aspect, the present disclosure provides systems and methods for using bioinformatics approaches to deconvolute bulk mRNA for various cells and processes involved in lupus organ pathology, including inflammatory cells, endothelial cells, and tissue cells.
In an aspect, the present disclosure provides systems and methods for the delineation of the altered metabolism of cells by using gene expression analysis.
In an aspect, the present disclosure provides systems and methods for using various regression models (e.g., classification and regression trees, linear regression, step-wise regression) to dissect the specific metabolic alterations in individual cell types.
In an aspect, the present disclosure provides systems and methods for using animal models and the ability to translate mouse gene expression into the human equivalent to confirm the results in humans and also analyze the effects of treatment.
In an aspect, the present disclosure provides systems and methods for the delineation of the role of specific cells (myeloid cells) and processes (interferon, mitochondrial dysfunction) in lupus tissue pathology.
In an aspect, the present disclosure provides systems and methods for using non-lymphocyte populations in skin and kidney toward diagnostic and/or prognostic biopsy tests.
In an aspect, the present disclosure provides systems and methods for defining gene signatures in individual cell types in a mixed population such as blood or tissue (e.g., skin, kidney).
In an aspect, the present disclosure provides systems and methods for analyzing sets of metabolism genes and their relationship to function and cell type, including subsets of myeloid cells (e.g., new subsets of myeloid cells).
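As a non-limiting illustration of scoring a gene set (for example, a cellular or metabolic signature) in each sample of a bulk expression data set, a simplified per-sample score is sketched below. This sketch is a stand-in chosen for brevity and is not the gene set variation analysis (GSVA) method referenced elsewhere herein: it z-scores each gene across samples and averages the z-scores of the genes in the set. The gene names and expression values are hypothetical placeholders.

```python
from statistics import mean, pstdev


def zscore_rows(expr):
    """Z-score each gene across samples.

    expr: dict mapping gene name -> list of expression values, one per sample.
    Genes with zero variance are assigned a z-score of 0 in every sample.
    """
    z = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return z


def gene_set_scores(expr, gene_set):
    """Return one score per sample: the mean z-score of the genes in gene_set.

    Genes in gene_set that are absent from expr are ignored.
    """
    z = zscore_rows(expr)
    present = [g for g in gene_set if g in z]
    if not present:
        raise ValueError("no genes from the gene set are in the data")
    n_samples = len(z[present[0]])
    return [mean([z[g][i] for g in present]) for i in range(n_samples)]


if __name__ == "__main__":
    # Hypothetical expression matrix: three genes measured in three samples.
    expr = {
        "GENE_A": [1.0, 2.0, 3.0],
        "GENE_B": [2.0, 4.0, 6.0],
        "GENE_C": [5.0, 5.0, 5.0],
    }
    print(gene_set_scores(expr, ["GENE_A", "GENE_B"]))
```

A higher score indicates that the genes of the set are, on average, more highly expressed in that sample relative to the other samples in the data set.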
To compare lupus pathogenesis in disparate tissues, we analyzed gene expression profiles of human discoid lupus erythematosus (DLE) and lupus nephritis (LN). We found common increases in myeloid cell-defining gene sets and decreases in genes controlling glucose and lipid metabolism in lupus-affected skin and kidney. Regression models in DLE indicated increased glycolysis was correlated with keratinocyte, endothelial, and inflammatory cell transcripts, and decreased tricarboxylic acid (TCA) cycle genes were correlated with the keratinocyte signature. In LN, regression models demonstrated decreased glycolysis and TCA cycle genes were correlated with increased endothelial or decreased kidney cell transcripts, respectively. Less severe glomerular LN exhibited similar alterations in metabolism and tissue cell transcripts before monocyte/myeloid cell infiltration in some patients. Additionally, changes to mitochondrial and peroxisomal transcripts were associated with specific cells rather than global signal changes. Examination of murine LN gene expression demonstrated metabolic changes were not driven by acute exposure to type I interferon and may be restored after immunosuppression. Finally, expression of HAVCR1, a tubule damage marker, was negatively correlated with the TCA cycle signature in LN models. These results indicate that metabolic dysfunction is a common, reversible change in lupus-affected tissues and appears to reflect damage downstream of immunologic processes.
Systemic lupus erythematosus (SLE) is a complex autoimmune disease that affects multiple tissues within the body, including the skin and kidneys [1,2]. Although the primary mechanisms of SLE pathogenesis involve hyperactivity of both the innate and adaptive immune systems, evidence surrounding the involvement of perturbed metabolic activity has recently emerged [3-5]. Whereas systemic metabolic dysregulation has been associated with lupus-related morbidities, such as atherosclerosis [6], the contribution of cellular metabolic abnormalities in human lupus-affected tissues has yet to be fully explored. Metabolic derangements in tissues and their contributions to disease pathology have been investigated in a number of inflammatory and rheumatic diseases. For example, abnormalities in mitochondrial functions contribute to immune/inflammatory skin diseases [7], and skin cells have been shown to upregulate the pentose phosphate pathway (PPP) under oxidative stress [8]. Moreover, rheumatoid arthritis synoviocytes shift their metabolism to glycolysis because of local hypoxia [9] and osteoarthritis exhibited increased synovial tricarboxylic acid (TCA) cycle intermediates [10]. Similarly, metabolic impairment has been found in many forms of kidney disease [11,12], including defects in fatty acid oxidation (FAO) that have been correlated with fibrosis progression in the kidney tubulointerstitium [11].
At the cellular level, metabolic abnormalities in inflammatory diseases may be related to the nature of the immune cell infiltrate and its activation status. For example, macrophage polarization is regulated by glycolysis and FAO [13], and T cells reallocate glucose and upregulate glycolysis following activation [14]. Moreover, macrophage markers have been associated with increased PPP activity in kidney diseases, including lupus nephritis (LN) [15]. Additionally, exhausted T cells with altered mitochondrial function have been found in murine LN [16], and T cells isolated from SLE patients exhibit alterations in lipid composition [17]. The idea of targeting T cell metabolism for lupus therapy has been advocated [18,19] based on the finding that CD4 T cells in lupus-prone mice had elevated glycolysis and oxidative metabolism that may be normalized with metformin and 2-Deoxy-D-glucose (2DG) treatment resulting in disease improvement [4].
Taken together, these studies suggest metabolic abnormalities in either infiltrating inflammatory cells or tissue cells contribute to and/or reflect tissue damage. To examine this in greater detail, we analyzed gene expression profiles in human and murine lupus tissues to discern the nature of abnormal metabolic pathways, elucidate cellular origins of metabolic abnormalities, and determine how inflammatory cells contribute to physiologic processes involved in tissue inflammation and damage.
Dysregulation of metabolic gene signatures was found to be common among lupus-affected tissues. Despite thousands of differentially expressed genes (DEGs) in discoid lupus erythematosus (DLE), World Health Organization (WHO) class III/IV LN glomerulus (GL) and WHO class III/IV LN tubulointerstitium (TI), there were only 559 increased and 324 decreased transcripts in common (
As there is considerable transcriptomic heterogeneity among lupus patients [20,21], we sought to examine expression of genes controlling metabolism at the individual patient level, and, therefore, employed gene set variation analysis (GSVA) [22] (Table 2). Generally, lupus tissues exhibited lower GSVA scores indicative of metabolic pathways, whereas controls had higher metabolism GSVA scores (
It was found that increased myeloid cell signatures and decreased tissue cell signatures characterize the majority of lupus patients. To determine whether cellular changes accounted for the decreased metabolic signatures, we examined enrichment of immune and non-hematopoietic cell signatures in lupus tissues. Increased tissue enrichment of immune cell signatures was variable, with DLE and LN GL demonstrating more enrichment of inflammatory cell signatures as compared to LN TI (
Although the monocyte/MC signature was consistently increased among lupus-affected tissues, its nature in each tissue varied. Linear regression of the monocyte/MC signature and FCN1 expression indicated DLE and LN GL contained inflammatory monocyte-derived macrophages [23], but the correlation in LN TI was weak (
It was found that class II LN GL is molecularly similar to class III/IV LN GL. To elucidate whether the same metabolic signature changes were present in less severe LN, we expanded our analysis to incorporate WHO class II LN samples, where fewer immune cells have been observed histologically [26]. Unexpectedly, we found that genes controlling glycolysis, the TCA cycle, FAO, and AA metabolism were decreased in class II LN GL (
As in the glomerulus, class II LN TI was not statistically different from class III/IV LN TI (
It was found that cellular signatures are associated with metabolic gene signature dysregulation in lupus-affected tissues. Stepwise regression, which identifies the independent variables that best explain the dependent variable [27], was employed to analyze cellular signature associations with metabolism gene signature changes. To improve precision, highly collinear cellular signatures in DLE were combined for stepwise analysis. In DLE, the inflammatory cell, EC, and keratinocyte signatures were positively correlated with the glycolysis signature, as indicated by a positive regression coefficient (
We then implemented both stepwise regression and classification and regression tree (CART) analysis, which partitions data by independent variables [28], to determine the cellular signatures that were most associated with the metabolic signature changes in all classes of LN and controls. In LN GL, all metabolic signatures except for the PPP signature exhibited some dependence on the kidney cell signature by either stepwise regression or CART (
The contribution of kidney-specific cell signatures to most metabolic changes was also observed in the TI (
We sought to confirm these findings by referencing data from single-cell RNA-sequencing (scRNA-seq) of LN biopsies [30]. In LN, tissue-resident macrophages exhibited decreased OXPHOS genes (
It was found that mitochondrial and peroxisomal signature changes and local hypoxia contribute to changes in metabolic gene expression in specific cells. As mitochondria and peroxisomes are the primary organelles responsible for metabolic processes such as OXPHOS and FAO [31], we sought to examine whether there were detectable changes to organelle-specific gene expression that may explain the altered metabolic state. There was no evidence of global mitochondrial gene expression changes, although mitochondrial genes were decreased in approximately half of DLE patients, and mitochondrial transcription was increased in 15/30 LN GL patients (
Since hypoxia has been cited as a driver of kidney disease in the TI [32], and a hypoxic microenvironment may result in decreased oxidative metabolism, we examined the contribution of HIF1A to metabolic gene signatures. Some lupus patients had increased expression of HIF1A by GSVA (
It was found that metabolic gene expression changes occur independent of acute IFN stimulation. The IFN gene signature (IGS), a known hallmark of lupus [33,34] and lupus-affected tissues [35], has been implicated in metabolic alteration of MCs [36-39]. To determine the functional relationship between IFN stimulation and metabolic alteration, we examined metabolic gene expression longitudinally in the IFNα-accelerated NZB/W murine model of LN, where the IGS increases at early timepoints following injection of IFNα adenovirus and then increases again when kidney disease develops (
It was found that metabolic and cellular gene expression changes in murine LN are corrected by immunosuppressive treatment. To determine the robustness of the observed metabolic and cellular gene expression changes and determine whether dysregulation may be reversed by immunosuppressive therapy, we analyzed gene expression in pre- and post-treatment kidneys of lupus mice. Although some metabolic changes had been identified in the kidneys of untreated and treated NZB/W and NZM2410 mice [41], no analysis of the nature of the affected cells was carried out.
Metabolic gene expression was significantly altered in NZM2410, NZB/W, IFNα-accelerated NZB/W (GSE72410), and NZW/BXSB mice. Treatment of NZM2410 mice with BAFF-R-Ig and proteinuric NZB/W mice with a combination of cyclophosphamide (CTX)+CTLA4-Ig+anti-CD154 restored TCA cycle, FAAO, FABO, and AA metabolism gene expression (
GSVA of cellular changes in murine LN models demonstrated similar results to human LN, although inflammatory infiltrate and changes in the EC and podocyte signatures were less robust (
It was found that metabolic changes correlate with expression of genes indicating tubular damage. Finally, we examined whether changes in genes controlling metabolism occurred synchronously with changes in expression of HAVCR1 (KIM1) and LCN2, which are known markers of tubular damage [42,43]. Although HAVCR1 expression was increased in class III/IV human LN TI patients, there were no significant changes to LCN2 in any class of LN TI (
Multi-pronged bioinformatic analyses of gene expression data from human lupus tissues revealed that, despite intra-tissue heterogeneity, metabolic dysfunction was present in all tissues. Immune effector cells have high metabolic needs [13,14,44,45] and, therefore, we initially hypothesized immune infiltration may be responsible for the observed lupus tissue-wide metabolic dysregulation. Although kidney-infiltrating CD8 T cells in murine LN are functionally exhausted with defective mitochondria [16], anergic/activated T cell markers were not found in these human LN samples, and regression models indicated T cells contributed minimally to changes in renal metabolic gene expression. Similarly, monocyte/MCs, which were increased in some patients from all tissues, might be expected to contribute to enhanced glucose metabolism, either glycolysis (M1 macrophages) or OXPHOS (M2 macrophages) [44]. Indeed, monocyte/MC signatures were inversely correlated with OXPHOS in both LN tissues, and positively correlated with glycolysis in LN GL, suggesting they are likely M1 in nature and may contribute to the altered metabolic landscape of intact tissues. Although gene expression revealed differing origins of the renal MC populations, as they reflect monocyte-derived macrophages in LN GL and tissue-resident macrophages in LN TI, the consistent MC presence aligns with their prominent role in tissue damage [46]. Observed increases in the monocyte/MC signature and strong inverse correlations between MCs and metabolism may reflect the role of MCs in tissue damage, even when T and B cells are not yet abundant.
To examine whether metabolic abnormalities represented primarily tissue cell defects as opposed to changes in inflammatory cell metabolism, we analyzed gene expression in class II LN, in which less inflammatory infiltrate is evident histologically [26]. Even though class II LN samples had evidence of increased immune/inflammatory cell signatures that coincided with changes to metabolic signatures, examples were observed in which the changes in metabolic signatures were found in the absence of a monocyte/MC signature, suggesting alterations in metabolic signatures may be initiated immediately following immune complex (IC) deposition and complement activation. Subsequent monocyte/MC and other inflammatory cell activation/infiltration may then contribute to further damage of tissue cells. Indeed, changes in kidney gene expression may occur following early IC deposition, but before microscopic detection of inflammation [47], consistent with our transcriptomic results in class II LN.
Our findings of altered metabolism in lupus tissues align with changes seen in other forms of tissue pathology. We observed a positive association between the keratinocyte and glycolysis signatures, and upregulation of glycolysis has been observed in keratinocytes during cutaneous infection [48]. Increased glycolysis [49,50] and decreased PPAR signaling, TCA cycle, and OXPHOS have been reported in dermal fibroblasts subjected to radiation [50], indicating dermal fibroblasts have the potential to contribute to inflammation-induced alterations in metabolism. However, in the current study, stepwise regression indicated that the fibroblast signature was negatively associated with the glycolysis, PPP, and OXPHOS signatures, whereas associations with FAO did not achieve statistical significance. The negative correlation between the fibroblast and PPP signatures contrasts with the observed upregulation of the PPP in cultured fibroblasts and keratinocytes that were exposed to UV-induced oxidative stress [8]. This suggests that in vivo fibroblasts are altered by signals different from those mediated by UV light, as expected since fibroblasts are deep in the dermis and shielded from such ambient exposure.
Metabolic dysregulation is also common in kidney disease [12,15,29,51,52]. In non-diabetic chronic kidney disease, TCA cycle abnormalities measured in urine metabolites coincided with changes to kidney gene expression [51], supporting our conclusions that metabolic dysregulation primarily reflects altered renal cell function as opposed to changes in immune cell metabolism. Moreover, defects in FAO have been correlated with fibrosis progression in TI disease [11]. Both human and murine models with TI fibrosis exhibited decreased expression of FAO enzymes and resultant increases in lipid deposition, which was reversed by correcting the metabolic abnormalities [11]. We similarly found that FAO signatures are substantially decreased in the TI; however, regression models indicated that altered FAO transcripts were most associated with decreased kidney cell signatures. Although fibroblasts were not predicted by the models, fibrosis or fibroblast enrichment may contribute to decreased kidney cell and proximal tubule transcripts in the TI.
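As a non-limiting illustration of how a tree-based regression model can identify the cellular signature most associated with a metabolic signature, the first partitioning step of a regression tree is sketched below. This is a minimal sketch, assuming a single split chosen to minimize the within-group sum of squared errors; it is not the full classification and regression tree (CART) analysis used in the study, and the signature names and scores are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(predictors, response):
    """Find the single predictor and threshold that best partition the response.

    predictors: dict mapping signature name -> list of per-sample scores.
    response: list of per-sample scores for the signature being explained.
    Returns (name, threshold, total_sse) for the split with the lowest
    combined sum of squared errors, or None if no split is possible.
    """
    best = None
    for name, xs in predictors.items():
        levels = sorted(set(xs))
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(levels, levels[1:]):
            thr = (lo + hi) / 2
            left = [y for x, y in zip(xs, response) if x <= thr]
            right = [y for x, y in zip(xs, response) if x > thr]
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (name, thr, total)
    return best


if __name__ == "__main__":
    # Hypothetical per-sample signature scores for four samples.
    predictors = {
        "kidney_cell": [0.1, 0.2, 0.8, 0.9],
        "t_cell": [0.5, 0.1, 0.6, 0.2],
    }
    fao_signature = [-1.0, -0.9, 0.9, 1.0]
    print(best_split(predictors, fao_signature))
```

In this hypothetical data, the split falls on the kidney cell signature because partitioning on it leaves the least unexplained variation in the response; a full tree would repeat the same search within each resulting group.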
In the glomerulus, ECs appear to play an additional role in disease. Endothelial activation in LN has been suggested [53], and EC transcripts were increased in 83% of all LN GL samples. Increased EC transcripts may reflect altered EC function, potentially resulting from cytokine/growth factor stimulation and/or hypoxia-induced cellular damage. Abnormal angiogenesis has been reported in diabetic nephropathy resulting in leaky vessels [54], but the function of glomerular ECs in LN is less well-defined. Indeed, glomerular ECs have been found to be dependent upon podocyte stimulation for differentiation [55], whereas other studies suggest EC damage precedes podocyte injury [56]. Our findings from class II LN support the latter, as signature changes to ECs occurred in the absence of podocyte changes in some patients, suggesting that ECs are early participants in LN.
The relationship between the EC signature and metabolic gene expression changes implied an alteration in EC physiology in LN. Although healthy ECs are highly glycolytic [57], all regression techniques in LN GL indicated an inverse correlation between the EC and glycolysis signatures. Consistent with our findings, in diabetic nephropathy stalled glycolytic flux has been observed in ECs [57]. These data suggest that glomerular ECs are metabolically altered, perhaps because of IC and complement stimulation and/or cytokine exposure, making them less capable of maintaining normal function. Indeed, in the GL, the EC signature had a positive regression coefficient with the FABO signature, and quiescent ECs have been reported to upregulate FABO [58], supporting the conclusion that ECs are functionally deranged in LN GL.
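As a non-limiting illustration of how the sign of a regression coefficient indicates a positive or inverse relationship between two signatures, an ordinary least-squares fit of one signature on another is sketched below. This is a minimal single-predictor sketch; the function name and the per-sample scores are hypothetical and are not data from the study.

```python
from math import sqrt


def linear_fit(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    A negative slope indicates an inverse relationship between x and y.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary across samples")
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


if __name__ == "__main__":
    # Hypothetical per-sample scores for an EC signature and a glycolysis signature.
    ec = [-0.4, -0.1, 0.2, 0.5]
    glycolysis = [0.6, 0.2, -0.1, -0.5]
    slope, intercept, r = linear_fit(ec, glycolysis)
    print(slope, intercept, r)
```

In this hypothetical data the slope is negative, which corresponds to the kind of inverse relationship described above between the EC and glycolysis signatures.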
Analysis of metabolism-associated genes in cell clusters derived from scRNA-seq of LN biopsies [30] further supports our finding that changes in metabolism are most closely related to kidney cell gene expression, with minor contributions from resident or infiltrating immune cells. Tissue-resident macrophages exhibited decreased OXPHOS genes, whereas CD4 T cell metabolism was unclear. Notably, the kidney epithelial cell cluster reported many metabolism-associated genes, suggesting decreased glycolysis and TCA cycle, but increased OXPHOS. Proximal tubules, which have the most mitochondria of any kidney epithelial cell, are known to be dependent on oxidative metabolism [59], and this further supports the idea that diminished OXPHOS in bulk RNA in part reflects decreased kidney epithelial cell transcripts. Additionally, because scRNA-seq looks at expression of individual cells as opposed to the bulk environment, the detected kidney epithelial cells are likely the residual functionally normal ones. However, because of technical issues including cell yield and read depth, there are difficulties in determining the status of individual cell metabolism from the scRNA-seq data. Altogether, scRNA-seq appeared to be less effective than deconvolution of bulk RNA to detect important but subtle changes in cellular metabolism.
To determine whether defective organelles were responsible for metabolic alteration, we examined gene expression specific for both mitochondrial and peroxisomal function. We observed changes to the mitochondrial and peroxisomal gene signals in some patients, and correlation analysis suggested these changes were associated with specific cell types. Notably, the peroxisome biogenesis signature was positively associated with signatures for ECs, kidney cells, and proximal tubules. Moreover, as the kidney is highly susceptible to hypoxia [60], we investigated the propensity for hypoxia to contribute to altered metabolism. Although GSVA demonstrated no significant increases in HIF1A expression, there was an association between HIF1A and the PPP and glycolysis signatures in LN GL and LN TI, respectively, suggesting that hypoxia may have specific effects on metabolism, and may contribute to some of the metabolic changes observed in the tissues.
The relationship between the IGS and metabolic signature changes in IFNα-accelerated LN mice indicated that acute type I IFN exposure may not explain the observed metabolic changes. The mouse studies were particularly informative because IFNα exposure was regulated. Although it has been demonstrated that type I IFN stimulation may increase OXPHOS and FAO in DCs and MCs [36,38], inhibit isocitrate dehydrogenase (part of the TCA cycle) in macrophages [39], and alter oxidative metabolism in other cells [61], IFNα overexpression did not change metabolic signatures at early timepoints. However, there was an inverse relationship between the IGS and metabolic defects after LN onset, when the IGS recurred. These results support the conclusion that downregulation of metabolic pathways is unlikely to be explained by the known actions of type I IFN alone, but rather during LN progression, decreased metabolic signatures may be parallel reflections of continued IGS exposure and inflammation.
Metabolic gene expression was altered in four murine LN models and immunosuppressive treatment, not known to directly affect cellular metabolism, restored metabolic gene expression. Although combination therapy diminished inflammatory cell abundance in the kidneys of NZB/W mice, we also observed restoration of metabolic and kidney cell gene expression after treatment in models with little inflammatory infiltrate or those in which inflammatory cells were not significantly decreased with treatment. This suggests that although inflammatory cells play a critical role in mediating tissue cell damage and metabolic dysfunction, damage is not related only to local inflammatory cells, as intensive therapy with CTX restores tissue cell defects without significant changes to inflammatory cells. Importantly, these results demonstrate that metabolic abnormalities in tissue cells are reversible with immunosuppressive therapy and restored metabolic gene expression might be considered a goal of effective lupus treatment. Moreover, monitoring tissue metabolism may be especially important in situations where anti-metabolic drugs, such as metformin or 2DG, are employed. Whereas these drugs may be promising for correction of individual immune cell defects, they have potential consequences for already metabolically deranged tissue cells.
Additionally, these studies reveal subtle differences in pathology of glomerular and tubulointerstitial involvement in LN. In both kidney regions, we observed decreased resident non-hematopoietic cell signatures and increased monocyte/MC signatures. However, monocyte/MCs in the glomerulus were likely monocyte-derived, whereas those in the tubulointerstitium appeared more like tissue-resident macrophages. Moreover, tubulointerstitial disease in class III/IV LN was characterized by less inflammatory infiltrate than was observed in the glomerulus, and although metabolic signatures were comparably decreased in all classes of glomerular LN, metabolic signature changes in class II tubulointerstitial LN were less consistently regulated. These results align with studies showing that tubulointerstitial damage occurs later in LN and predicts end-stage renal disease [62]. Poorer outcomes in class III/IV may be related to persistence of abnormalities or inhibition of repair mechanisms that might contribute to progressive renal disease. Nonetheless, it is noteworthy that even the more modest immunologic damage in class II LN was associated with marked changes in metabolic signatures.
Detectable changes to immune cell, EC, kidney cell, and metabolic gene signatures in all classes of LN GL are notable. We found that gene expression may detect cellular changes with greater sensitivity than immunohistochemistry when little or no inflammatory infiltrate is observed histologically. Gene expression may provide an advance over current classification or diagnostic techniques, as gene expression changes are detectable before discernible immunohistochemical changes, and transcriptomic analysis of metabolism may elucidate potential functional rather than merely histopathologic changes.
Our study is not without limitations. We performed post-hoc analysis of bulk gene expression in three lupus-affected tissues comprising limited numbers of lupus patients. Moreover, 37.5% of LN patients were being treated with immunosuppressives [53], which may have affected gene signatures. Additionally, the gene signature we identified as reflecting general kidney cells may be more specific for tubule cells, despite the strong representation in the glomerulus. Furthermore, regression analyses provided an estimate of the cellular variables that are most associated with each metabolic signature, but accuracy may be limited by sample size, and there is a chance of overfitting. Future work with larger cohorts, which are not currently available, would be necessary to validate these results. Additionally, direct assessment of functional metabolism may be necessary to assay how metabolic changes at the gene expression level reflect changes in protein content and cellular function.
In conclusion, prominent alterations in cellular metabolism signatures are characteristic of lupus tissue pathology. Systems bioinformatics and assessment with regression modeling techniques revealed that the monocyte/MC signature, including both monocyte-derived macrophages and tissue-resident macrophages, was increased in many lupus patients, kidney cell signatures were decreased in LN, the EC signature was increased in LN GL, and these cell signature changes were associated with altered metabolism signatures. Moreover, apoptotic mitochondrial gene changes were associated with MC genes in DLE and LN GL. In murine LN, metabolic dysregulation correlated with tubular damage marker expression and metabolic gene changes were reverted to normal by immunosuppressive therapy. Altogether, altered metabolism may serve as a promising biomarker or therapeutic target for lupus tissue disease, especially as metabolic gene expression changes precede expression of the renal damage biomarker LCN2 in human LN, and coincide with changes to HAVCR1. Indeed, urinalysis has been used to measure metabolite biomarkers in the kidney [51,52] and metabolism transcripts may be used to estimate degree of kidney cell damage and assess treatment efficacy. Although treatment strategies aimed at metabolic restoration are not straightforward, the current findings support the conclusion that immunosuppressive therapy may restore metabolic function, and, thereby, may ameliorate damage in specific lupus-affected tissues.
Human and mouse gene expression datasets were analyzed as follows. Raw data from publicly available human and murine lupus datasets were derived from the Gene Expression Omnibus (GEO) repository.
GSE72535 comprises microarray analysis of lesional skin biopsies from human patients with DLE with no systemic involvement with Cutaneous Lupus Activity and Severity Index (CLASI)≥2 [63]. Some DLE patients were treated with various therapies including corticosteroids, immunomodulators, and hydroxychloroquine [63].
GSE32591 comprises microarray analysis of human renal biopsies that were originally derived from the European Renal cDNA Bank (ERCB) [53]. LN patients from the ERCB (n=32) had an average age of 35.1±2.4 years, average proteinuria of 2.9±0.6 g/day, and eGFR (MDRD) of 63.7±5.4 ml/min/1.73 m². LN patients were treated with various therapies, including steroids and immunosuppressants [53]. Two patients from this group were excluded from our analyses because they had non-inflammatory class V LN.
GSE86423 comprises samples from the IFNα-accelerated NZB/W LN model. Female NZB/W mice were injected with an adenovirus vector expressing recombinant murine IFNα (5×10⁹ particles) at 9 weeks [40]. Kidney gene expression was measured in mice at 0, 1, 2, 3, 4, 5, 7, and 9 weeks post IFNα injection [40].
GSE32583 and GSE49898 comprise samples from three murine lupus models. Kidney gene expression was measured in NZM2410 mice including 6-8 week pre-disease mice, 22-30 week diseased mice, and treated mice in remission (Tx+15w) [41,53]. NZM2410 mice in the Tx+15w group were treated with adenovirus expressing BAFF-R-Ig at 22 weeks, and then sacrificed at 30-35 weeks or 55 weeks [41]. Kidney gene expression in NZB/W mice was measured at 16w, 23w, 36w, and after treatment [41,53]. Both 23w and 36w mice shown in the main figures had confirmed proteinuria. Some NZB/W mice with proteinuria (>300 mg/dl) at two timepoints were treated with combination therapy: one dose of 50 mg/kg CTX, six doses of 100 μg CTLA-4-Ig, and one dose of 250 μg anti-CD40L [41]. Mice were determined to be in remission if they achieved proteinuria of ≤30 mg/dl at two timepoints [41]. One group was sacrificed 3-4 weeks after remission (Tx Rem.+3-4w) and another was sacrificed >5-14 weeks after remission (Tx Rem.+>5w) [41]. The latter group had confirmed histologic relapse [41]. Kidney gene expression from NZW/BXSB mice was measured at 17w (prenephritic mice) or at 18-21w (mice with confirmed proteinuria) [53].
GSE72410 comprises samples from the IFNα-accelerated NZB/W LN model. NZB/W mice were treated with adenovirus-expressing murine IFNα at 14-15w (1.2×10⁸ IFNαrAd5-CMV) [64]. Kidney gene expression was measured in 17w naïve NZB/W (naïve), as well as IFNα-accelerated NZB/W mice: 17w mice 21 days post IFNα injection (IFN W3), and in mice treated with vehicle or CTX for four weeks beginning three weeks post IFNα injection (IFN W7+Veh or IFN W7+CTX) [64].
GSE153021 comprises samples from the MRL/lpr LN model. MRL/lpr mice were treated with vehicle, prednisone, mycophenolate mofetil (MMF), FK506, or all three (Multi-target) for eight weeks beginning at week 8 or age-matched wildtype MRL/MpJ mice treated with vehicle [65].
Quality control and data normalization were performed as follows. Microarray data were processed [66] using free, open source programs (GEOquery, affy, affycoretools, limma, and simpleaffy). Unnormalized arrays were inspected for visual artifacts or poor RNA hybridization using QC plots. Datasets were annotated using their native chip definition files (CDFs). Probes missing gene annotation data were discarded. Raw data (CEL files) from the Affymetrix platform were background corrected and normalized using guanine cytosine robust multiarray average (GCRMA) or robust multichip average (RMA) algorithms, whereas raw data files from Illumina chip were read and normalized using neqc (limma R package).
RNA-seq data (GSE72410 and GSE153021) were processed from FASTQ files as described by Daamen [67] and as outlined below. SRA files were downloaded and converted into FASTQ format. Read ends and adapters were trimmed with Trimmomatic (v0.38) using sliding window, ILLUMINACLIP, and HEADCROP filters. The reads were head cropped at 6 bp and adapters were removed before read alignment. Reads were mapped to the mouse reference genome m17 using STAR, and the .sam files were converted to sorted .bam files using Sambamba. The mouse reference genome was downloaded from GENCODE. Read counts were summarized using the featureCounts function of the Subread package (v1.61). The DESeq2 workflow was used to filter RNA-seq genes with low expression (i.e., genes with very few reads). The filtered raw counts were normalized using the DESeq method and then log2 transformed.
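The normalization step can be illustrated with a minimal Python sketch of median-of-ratios size factors followed by a log2 transform. This is not the DESeq2 code used in the study: the function names, the toy input layout, and the pseudocount of 1 are assumptions made for illustration only.

```python
from math import exp, log, log2
from statistics import median


def size_factors(counts):
    """counts: {gene: [raw count per sample]}; returns one size factor per sample.

    Median-of-ratios sketch: each sample is compared with a per-gene
    geometric-mean reference, and the median log ratio is exponentiated.
    """
    n_samples = len(next(iter(counts.values())))
    # Genes with any zero count have no finite log geometric mean and are skipped.
    log_geo_means = {
        gene: sum(log(c) for c in row) / n_samples
        for gene, row in counts.items()
        if all(c > 0 for c in row)
    }
    return [
        exp(median(log(counts[gene][s]) - lgm for gene, lgm in log_geo_means.items()))
        for s in range(n_samples)
    ]


def normalize_log2(counts, pseudocount=1.0):
    """Divide counts by the sample size factor, then log2-transform.

    The pseudocount (assumed here, not stated in the source) avoids log2(0).
    """
    factors = size_factors(counts)
    return {
        gene: [log2(c / f + pseudocount) for c, f in zip(row, factors)]
        for gene, row in counts.items()
    }
```

In this sketch, a sample sequenced at twice the depth of another receives twice the size factor, so the normalized values for an unchanged gene agree across the two samples.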
Principal component analysis was used to inspect the raw data files from each dataset for outliers. All log 2 transformed data was formatted into R expression set objects (E-sets).
DEG analysis was performed as follows. For human dataset DEG analysis, Affymetrix probes were additionally annotated with custom BrainArray (BA) chip definition files (CDFs) [66, 68]. Any probes with different Affymetrix and BA gene annotations were excluded. GCRMA-normalized expression values were variance corrected using local empirical Bayesian shrinkage before calculation of DEGs using the ebayes function in the BioConductor LIMMA package. P-values were adjusted for multiple hypothesis testing using the Benjamini-Hochberg correction, which resulted in a False Discovery Rate (FDR). Significant Affymetrix and BA probes within each study were merged and filtered to retain probes with a pre-set FDR<0.2 which were considered statistically significant. This FDR was employed to avoid falsely excluding genes of interest. This list was further filtered to retain only the most significant probe per gene in order to remove duplicate genes.
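The multiple-testing and probe-filtering steps can be sketched in Python as follows. This is an illustrative reimplementation under stated assumptions, not the LIMMA code used in the study: the function names and the tuple layout of the input are hypothetical, and "most significant" is taken here to mean the smallest adjusted p-value.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def best_probe_per_gene(probes, fdr_cutoff=0.2):
    """probes: list of (probe_id, gene, p_value) tuples (hypothetical layout).

    Keeps probes with adjusted p-value below the cutoff and, for each gene,
    retains only the probe with the smallest adjusted p-value.
    """
    fdr = benjamini_hochberg([p for _, _, p in probes])
    best = {}
    for (probe_id, gene, _), q in zip(probes, fdr):
        if q < fdr_cutoff and (gene not in best or q < best[gene][1]):
            best[gene] = (probe_id, q)
    return best
```

The default cutoff of 0.2 mirrors the pre-set FDR described above; a stricter value can be passed when fewer false positives are preferred over sensitivity.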
Network analysis and visualization were performed as follows. Cytoscape (V3.8.0) [69] with the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) (V1.5.1) and ClusterMaker2 (V1.3.1) [70] plugins was used to create and visualize protein-protein interactions among the 883 common DEGs. Clusters were generated with the Molecular Complex Detection (MCODE) clustering algorithm within ClusterMaker2 and a node score cutoff of 0.2, k-Core of 2, and a max depth of 100 were set.
Functional Enrichment Analysis was performed as follows. Functional enrichment of Cytoscape-derived MCODE clusters was performed using BIG-C, a clustering tool developed to categorize the biologic function of large lists of genes [66]. The top three significant BIG-C categories (p<0.05, OR >1) were reported.
Gene Set Variation Analysis (GSVA) was performed as follows. GSVA [22] for R/Bioconductor was used as a non-parametric, unsupervised method for estimating the variation of pre-defined gene sets in dataset samples. For each dataset only one CDF, Affymetrix or Illumina, was used for each probe. For genes with multiple Affymetrix probe identifiers, only the probe with the highest inter-quartile range (IQR) of expression [71] was retained. Genes with IQR=0 were removed. GSVA enrichment scores were calculated non-parametrically using a Kolmogorov-Smirnov (KS)-like rank statistic [22]; a negative value for a particular sample and gene set means that the gene set has lower expression than the same gene set in a sample with a positive value. The same GSVA probes used in calculations for class III/IV LN GL and TI were specified for calculations in all classes of LN.
GSVA Gene Sets were analyzed as follows. Metabolism gene sets were created from literature mining. Hematopoietic gene sets were derived from Immune/Inflammatory-Scope (I-Scope), a tool developed to identify immune cell-specific genes in big data [72]. Non-hematopoietic cellular gene sets were derived from T-Scope [72] or literature mining. Many gene sets in T-Scope were derived from The Human Protein Atlas (www.proteinatlas.org) [73,74]. The keratinocyte signature was derived from the keratinocyte-specific genes of Gazel et al. [75]. Kidney-specific lists generated from the Human Protein Atlas and single-cell data were additionally modified to incorporate genes found by both transcriptomics and immunohistochemistry [76]. The mesangial cell signature was derived from PanglaoDB [77]. In all tissues, the hematopoietic, EC, and fibroblast gene sets were evaluated. For non-hematopoietic gene signatures, those relevant to each tissue were reported (DLE: keratinocyte, melanocyte; LN GL: kidney cell, podocyte, mesangial cell; LN TI: kidney cell, proximal tubule, Loop of Henle (LoH) cell, distal tubule, collecting duct cell). Mitochondrial and peroxisomal gene sets were derived from literature mining and BIG-C, and were also compared to signatures in the MSIG database. The Apoptotic Mitochondrial Changes signature (M7482) was accessed on the MSIG database [78,79] and is derived from GO_0008637. GO_Mitochondrial_Fission (M12786) and GOBP_Peroxisome_Fission (M22828) signatures were accessed on the MSIG database and modified slightly. The IGS is the type I IFN core signature [35].
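The probe-selection rule applied before GSVA (one probe per gene, chosen by highest IQR, with zero-IQR genes dropped) can be sketched as below. The function names and dictionary layout are hypothetical, and the quartile definition (Python's "inclusive" method, which corresponds to R's default quantile type) is an assumption about how the IQR was computed.

```python
from statistics import quantiles


def iqr(values):
    """Inter-quartile range using linear interpolation between data points."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def select_probes(expression, probe_to_gene):
    """Pick one probe per gene by highest IQR.

    expression: {probe_id: [expression value per sample]}
    probe_to_gene: {probe_id: gene symbol}
    Probes with IQR == 0 are skipped, so a gene whose probes all have
    zero IQR is absent from the result.
    """
    chosen = {}
    for probe, values in expression.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # unannotated probe
        spread = iqr(values)
        if spread == 0:
            continue
        if gene not in chosen or spread > chosen[gene][1]:
            chosen[gene] = (probe, spread)
    return {gene: probe for gene, (probe, _) in chosen.items()}
```

Selecting on IQR favors the probe that varies most across samples, which is the probe most likely to carry signal for a rank-based enrichment score.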
All human metabolism, non-hematopoietic cell, and IFN gene sets were converted to murine gene sets using the homologene R package and human2mouse function. Genes that were not converted programmatically were manually converted by GeneCards and Mouse Genome Informatics orthologs. Murine hematopoietic cell gene sets were curated by literature mining. For murine datasets with expression of two or more anergic/activated T cell markers, the anergic/activated T cell signature was combined with the T cell signature for GSVA analysis.
Although the same gene sets were input for each category in each tissue (Table 2), the genes used in calculation of the GSVA enrichment score for each tissue differ slightly based upon the gene measurement platform and expression within that sample. All reported GSVA enrichment scores, except for the HIF1A gene signature, were calculated based upon a minimum of three genes. The HIF1A gene signature only includes HIF1A because there were no additional genes determined to be specific to hypoxia only. Although two genes were well co-expressed, the DC signature in LN TI did not meet the minimum three gene requirement for GSVA, nor did the LDG signature in LN GL and LN TI.
Hierarchical clustering was performed as follows. Human lupus tissue samples were hierarchically clustered by the Euclidean distance of their GSVA enrichment scores into two (k=2) or four (k=4) clusters using the heatmap.2 function in R.
Regression Models were analyzed as follows. For all linear models, GSVA scores for cellular signatures in all tissue samples were input as independent variables, and the pathway GSVA score (metabolism signature, mitochondrial signature, or peroxisomal signature) was input as the dependent variable. As GSVA scales the expression of a signature from −1 to 1, the value for each input cellular or metabolic signature in each sample is relative to the same signature in other samples and to the other signatures in the same sample. To ensure that collinearity between immune cell signatures did not confound results for stepwise regression analyses in DLE, we combined the pDC, skin-specific DC, monocyte/MC, T cell, anergic/activated T cell, B cell, and plasma cell signatures into the “inflammatory cell” signature, because the genes were highly co-expressed. The list of all genes used as the “inflammatory cell” signature may be found in Table 2. This reduced the number of input variables for DLE stepwise analysis to ten cell signatures.
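As a rough illustration of clustering samples by the Euclidean distance of their GSVA scores, the following Python sketch merges samples agglomeratively until k clusters remain. Complete linkage is an assumption here (the linkage method is not stated above), and the function name and input layout are hypothetical.

```python
from math import dist


def complete_linkage_clusters(scores, k):
    """Agglomerative clustering stopped at k clusters.

    scores: {sample name: [GSVA enrichment score per gene signature]}
    Returns a list of k sets of sample names.
    """
    clusters = [{name} for name in scores]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(dist(scores[x], scores[y]) for x in a for y in b)

    while len(clusters) > k:
        # Find the closest pair of clusters and merge them.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters
```

Stopping the merges at k clusters gives the same partition as cutting the corresponding dendrogram into k groups.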
Visualization was performed as follows. Final figures were generated in GraphPad Prism or Adobe Illustrator.
Overlap p-values and ORs for functional enrichment of DEGs were calculated in R using the two-sided fisher.test with confidence level=0.95. Because of known heterogeneity in lupus patients, the number of lupus patients in each tissue who fell above or below the control mean±1 standard deviation (SD) was then reported in order to determine whether individual patients exhibited an increased or decreased signature when the population did not achieve statistical significance. Calculation of the mean and SD for the control samples for each GSVA score in each tissue was performed in Microsoft Excel. All analyses in GraphPad Prism were carried out with version 8.2.0 (435) or later versions. Control and lupus sample populations of GSVA scores for each gene set were assessed for normality using the D'Agostino-Pearson test in GraphPad Prism, and the distributions for 75% or more of the gene sets in each population were determined to be normal. Welch's t-test with Bonferroni correction for GSVA enrichment in human samples was performed in GraphPad Prism. Bonferroni correction for metabolic signatures, immune cell signatures, non-hematopoietic signatures, or mitochondrial/peroxisomal signatures was performed separately. The Mann-Whitney U test for GSVA enrichment in murine samples was performed in GraphPad Prism. Univariate (simple) linear regression and Pearson correlation analyses were carried out in GraphPad Prism. Hedges' g effect sizes were calculated in R using the cohen.d function with Hedges' correction under the “effsize” package. Stepwise regression was performed in R using the lm function followed by the stepAIC function. Variance inflation factor (VIF)<10 was confirmed for each independent variable. Although some data points were determined to be influential to the stepwise equation using Cook's D, no samples were removed from the models in an effort to capture the heterogeneity present in lupus.
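Two of the simpler calculations described above, counting patients outside the control mean±1 SD and computing Hedges' g, can be sketched in Python. This is illustrative only: the function names are hypothetical, the sample (n−1) SD is assumed, and the small-sample correction uses the common approximation 1 − 3/(4(n1+n2) − 9), which may differ slightly from the correction applied by the effsize package.

```python
from statistics import mean, stdev


def count_outside_control_range(control, patients):
    """Count patients above mean+1SD and below mean-1SD of the control scores."""
    mu, sd = mean(control), stdev(control)  # sample SD (n - 1), an assumption
    above = sum(1 for x in patients if x > mu + sd)
    below = sum(1 for x in patients if x < mu - sd)
    return above, below


def hedges_g(group1, group2):
    """Standardized mean difference with an approximate small-sample correction."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (
        (n1 - 1) * stdev(group1) ** 2 + (n2 - 1) * stdev(group2) ** 2
    ) / (n1 + n2 - 2)
    d = (mean(group1) - mean(group2)) / pooled_var ** 0.5
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction
```

Reporting the per-patient counts alongside a group-level effect size helps separate a signature that shifts in a subset of patients from one that shifts modestly in all of them.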
CART analysis was performed in R using the rpart function with the “anova” method. Each resulting decision tree was pruned once except for the glycolysis and PPP signatures in LN TI. GSVA scores for all individual samples (patient or control) are presented as individual data points in either dot or violin plots. The number of samples for each group may be found in the figure legends. Information regarding the statistical comparisons made and level of significance is mentioned in the figure legends.
Data availability: All microarray datasets in this publication are available on NCBI's GEO database. All bioinformatic software used in this publication is open source, and freely available for R. Example code used here (LIMMA, GSVA, stepwise regression, and CART) is available at figshare, www.figshare.com, as “AMPEL BioSolutions LIMMA Differential Expression Analysis Code,” “AMPEL BioSolutions GSVA AFFY nonzeroIQR Code,” and “AMPEL BioSolutions Stepwise and CART Code,” respectively.
Inflammatory skin diseases have unique clinical features but may have both selective and overlapping responses to targeted therapies. To determine the unique and shared molecular features of inflammatory skin diseases, we carried out a comprehensive analysis of gene expression from cutaneous lupus erythematosus (CLE) and compared it to that of psoriasis, atopic dermatitis, and systemic sclerosis. Using gene set variation analysis (GSVA), we found that lesional samples from each condition had unique features, but all four diseases displayed common enrichment in multiple inflammatory cell and pathway gene signatures, including the interferon, tumor necrosis factor, and IL-23 gene signatures. These findings were confirmed by both classification and regression tree (CART) analysis and machine learning (ML) models. Nonlesional samples from each disease also differed from normal samples and each other by ML. Notably, the features used in classification of nonlesional disease compared to control were more distinct than their lesional counterparts and GSVA confirmed unique features of nonlesional disease. These data show that lesional and nonlesional skin samples from CLE and other inflammatory skin diseases have unique profiles of gene expression abnormalities, especially in nonlesional skin. The results suggest a model in which disease-specific abnormalities in “pre-lesional” skin may permit environmental stimuli to trigger inflammatory responses leading to both the unique and shared manifestations of each disease. Dissection of molecular pathways enriched in both clinically involved and uninvolved skin can advance the understanding of the pathogenesis of these conditions and identify novel therapies.
Autoimmune and inflammatory diseases, such as systemic lupus erythematosus (SLE), can affect many organs, including the skin. Indeed, skin manifestations of lupus, known as cutaneous lupus erythematosus (CLE), are common and occur in 70-85% of lupus patients (1-3). Historically, CLE is classified into three subtypes based on clinical and serological features: acute CLE (ACLE), subacute CLE (SCLE), and chronic CLE (CCLE) (4). The heterogeneity of CLE makes it difficult to determine particular cytokines or inflammatory pathways to target therapeutically and, as a result, no therapies are specifically approved for CLE (5). Both an innate immune response, coordinated through Toll-like receptor activation, and multiple adaptive immune responses have been reported in the initiation and propagation of CLE (2). Targeting B cells with belimumab (6) and type I interferon (IFN) with anifrolumab (7) shows some benefit by decreasing cutaneous manifestations of SLE. In contrast, other inflammatory skin diseases, such as psoriasis (PSO), have numerous approved therapies (8), and dupilumab, an inhibitor of IL-4 receptor signaling, is an effective therapy for both atopic dermatitis (AD) and PSO (9). This overlap of central, nonredundant pathways between PSO and AD illustrates that diseases with markedly different clinical phenotypes may have similar immunopathogenic underpinnings.
Although independent transcriptomic analyses have provided insight into the molecular landscape of CLE, a complete molecular characterization of the disease is limited by small patient cohorts (10-17). Previous bulk gene expression studies focused on specific aspects of lupus skin disease, such as the presence of T helper (Th) 17 cells (10) or specific macrophage populations (11), the correlation between inflammatory cell populations and fibroblast marker expression (12), cytokine expression (13), inflammasome signaling (14) or IFN signaling (15-17). However, there remains a need to examine the interplay of inflammatory cells, non-hematopoietic cells, and pathway perturbations to understand the molecular events in CLE pathogenesis in further detail.
Whereas the dissimilarities in clinical manifestations of CLE and other inflammatory skin diseases have been well documented, the molecular differences between CLE and other inflammatory skin diseases are less completely studied. For instance, keratinocytes, one of the predominant non-hematopoietic cell populations in the skin, have been implicated in PSO pathogenesis (18) and shown to be hypersensitive to IFN signaling in CLE (15), yet understanding of their role in CLE is limited. Moreover, systemic sclerosis (SSc), another inflammatory skin disease, is characterized by fibrosis and vascular damage due to excessive deposits of extracellular matrix and differentiation of fibroblasts to myofibroblasts (19), but little is known about the role of fibrosis in the pathogenesis of CLE. Finally, AD is characterized by an allergic reaction owing to a loss in skin barrier function, fibrosis, and Th2 cell signaling (20), but these functions in CLE have not been explored. Detailed comparison among the molecular signatures of CLE, PSO, AD, and SSc could achieve better understanding of the primary pathogenic mechanisms and provide direction for new therapeutic avenues in these conditions.
In this study, we compared the gene expression signatures of four inflammatory skin diseases: CLE, PSO, AD, and SSc. In order to achieve a read depth sufficient to maintain the in vivo proportions of cellular signals in the biopsies without technical distortion and to capture the majority of molecular pathways, we analyzed bulk RNA. In addition, we employed analytic tools to deconvolute transcriptomic data and determine cellular and pathway signals enriched across heterogeneous cohorts of patients from each of the diseases. Using gene set variation analysis (GSVA), we determined that lesional skin of the four skin diseases expressed both shared and unique molecular signatures. Machine learning (ML) demonstrated that both lesional and nonlesional samples of each disease could be classified as distinct from control samples as well as from each other. Notably, nonlesional skin of each disease was more distinct than lesional skin, as there were more common features in ML classification of lesional skin among the four diseases. GSVA confirmed the molecular differences between uninvolved skin of the various conditions. These results suggest a model in which the nonlesional skin of patients with inflammatory skin diseases harbors unique abnormalities that potentially make the skin differentially sensitive to specific environmental stimuli. Inciting stimuli appear to induce responses with many overlapping inflammatory molecular features shared by the diseases. Altogether, this suggests that therapeutics employed in the treatment of one inflammatory skin disease may be useful in the treatment of additional diseases and confirms the utility of gene expression analysis in understanding the immunopathogenesis of clinical and pre-clinical disease.
Comprehensive Gene Expression Analysis of DLE Reveals Similarities and Differences with Other Inflammatory Skin Diseases.
We carried out a comprehensive transcriptomic analysis of five independent datasets of samples biopsied from both patients with DLE, the most frequent subset of CLE (4), and healthy controls (Table 3). In order to examine cellular and pathway signaling on an individual patient level, we carried out GSVA using a total of 48 informative gene signatures (Table 4A-1 to 4A-20, and 4B-1 to 4B-28). Hierarchical clustering of GSVA enrichment scores demonstrated that DLE was molecularly separable from healthy skin (
To understand the molecular landscape of cutaneous lupus in the context of other inflammatory skin diseases, we examined gene expression data derived from skin biopsies of patients with PSO, AD or SSc. Overall, there was enrichment of most myeloid and lymphoid-derived cell signatures across all four diseases as compared to control, whereas expression of skin-specific dendritic cells (DCs) differed among the diseases (left,
Next, we employed classification and regression tree (CART) analysis using GSVA enrichment scores of 48 cellular and pathway gene signatures to discern the gene expression variables that best classified the inflammatory skin diseases (
To distinguish inflammatory skin diseases more precisely and confirm the major transcriptomic contributors, we employed several ML algorithms. First, we examined distinct binary classification of pooled lesional DLE, PSO, AD, and SSc compared to pooled control samples using the ensemble decision tree, random forest (RF), with the 48 cellular and pathway signature GSVA scores as input features. The areas under the receiver operating characteristic (AuROC) curves and precision-recall (AuPR) curves for each binary classification were greater than 0.96 in all cases, indicating excellent performance and appropriate binary classification for each disease compared to control samples (
Next, we directly compared gene expression signatures of DLE samples with those of other inflammatory skin diseases. Distinctions in cellular and pathway signature enrichment among DLE and PSO samples were observed using hierarchical clustering of GSVA scores in two datasets with samples from these inflammatory skin conditions (
Transcriptomic Profiles of Nonlesional Skin Samples Distinguish Inflammatory Skin Diseases from Each Other
Although the molecular characteristics of lesional skin in each independent disease have been well-studied, less is known about the transcriptomic profiles of uninvolved skin. To determine whether there are underlying immunological abnormalities contributing to disease, we examined gene expression profiles in nonlesional skin samples from patients with DLE, PSO and AD to assess the extent to which they differed from either control or lesional skin; nonlesional SSc data was not available for analysis. Analysis of GSVA enrichment scores demonstrated that nonlesional samples were transcriptionally different from lesional samples (
Given that nonlesional skin of each disease was distinct from control skin and appeared to be distinct from other diseases, we next compared our binary classification of nonlesional DLE compared to nonlesional PSO or nonlesional AD using balance strategies described above (
The data demonstrated that ML employed specific gene signatures for the classification of nonlesional samples from patients with inflammatory skin diseases. To probe the differences in nonlesional skin in greater detail, we carried out an additional analysis using GSVA. For this analysis, we pooled the control samples and nonlesional samples from all datasets and employed Z-score normalization to scale data from samples obtained from different datasets. We found that nonlesional DLE samples compared to control samples show upregulation of B cells, melanocytes and complement protein gene signatures (
To confirm these GSVA results, we also employed a mean of Z-scores calculation for enrichment of signatures and found similar results (
Because the ML pipeline was able to determine specific signatures that separated related diseases, we sought to determine whether subtypes of CLE could be distinguished by analyzing gene expression profiles. We previously observed a robust upregulation of immune cell gene signatures, including the pDC, monocyte, T cell and B cell signatures by GSVA comparison of DLE and control samples (refer to
Keratinocyte and T cell signatures are often upregulated in inflammatory skin diseases. Because the ML analyses to this point have focused on gene signatures previously implicated in lupus (17, 21, 22), with less emphasis on those implicated in other inflammatory skin diseases (23, 24), we examined whether previously published PSO- or AD-specific gene signatures were also differentially enriched among the four lesional inflammatory skin diseases as compared to healthy controls. To accomplish this, we first evaluated gene sets derived from keratinocytes stimulated with various cytokines (Table 4C). We found, in all diseases, that many of the keratinocyte gene signatures were highly enriched (
An inflammatory skin disease risk score was calculated to understand the activity of cellular and immune pathways in lesional skin diseases. GSVA of 48 gene signatures representing cells (immune and non-hematopoietic) (Tables 4A-1 to 4A-20) and pathways (Tables 4B-1 to 4B-28) was run independently on 16 datasets including samples from DLE, AD, PSO, and SSc. When pooled, the datasets comprised 90 lesional DLE, 183 lesional PSO, 132 lesional AD, 97 lesional SSc and 164 normal skin controls. To derive a model by which the inflammatory skin disease risk score could be generated, the GSVA scores in each sample were binarized, where GSVA scores >0 became 1, and GSVA scores <0 became 0. Logistic regression with ridge penalty was then run on each iteration by randomly sampling, with replacement, 41 samples from each disease (41 lesional DLE, 41 lesional PSO, 41 lesional AD, and 41 lesional SSc), thereby totaling 164 lesional skin disease samples, together with 164 non-disease skin controls, with the 48 binarized GSVA scores in each sample serving as features. Coefficients were calculated for each iteration and final coefficients were obtained by taking the average over all iterations.
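The resampling scheme described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the penalty strength, learning rate, number of epochs, the treatment of a score of exactly 0, the reuse of all controls on every iteration, and the final `risk_score` function (a predicted probability from the averaged coefficients) are all hypothetical choices, since the text specifies only the binarization, the ridge-penalized logistic regression, the per-disease sampling with replacement, and the averaging of coefficients.

```python
import math
import random

def binarize(gsva_scores):
    """Map GSVA scores to 1 (score > 0) or 0 (assumption: a score of 0 maps to 0)."""
    return [1 if s > 0 else 0 for s in gsva_scores]

def fit_ridge_logistic(X, y, l2=1.0, lr=0.1, epochs=300):
    """Fit logistic regression with an L2 (ridge) penalty by gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        # The ridge term shrinks the coefficients but not the intercept.
        w = [wj - lr * (gw / n + l2 * wj / n) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def bootstrap_coefficients(disease_groups, controls, per_group, n_iter, seed=0):
    """Average coefficients over repeated fits on resampled disease samples."""
    rng = random.Random(seed)
    p = len(controls[0])
    sum_w, sum_b = [0.0] * p, 0.0
    for _ in range(n_iter):
        X, y = [], []
        for group in disease_groups:
            # Sample with replacement from each disease group.
            X += [rng.choice(group) for _ in range(per_group)]
            y += [1] * per_group
        X += controls  # assumption: all controls are used on every iteration
        y += [0] * len(controls)
        w, b = fit_ridge_logistic(X, y)
        sum_w = [a + c for a, c in zip(sum_w, w)]
        sum_b += b
    return [a / n_iter for a in sum_w], sum_b / n_iter

def risk_score(binary_features, w, b):
    """Hypothetical score: predicted probability under the averaged model."""
    z = b + sum(wj * xj for wj, xj in zip(w, binary_features))
    return 1.0 / (1.0 + math.exp(-z))
```

With the numbers given in the text, `disease_groups` would hold four lists of binarized 48-feature vectors, `per_group` would be 41, and `controls` would hold the 164 control vectors.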
Analysis of the Transcriptomic Profiles of Rheumatic Skin Diseases Reveals Disease-specific Endotypes. Patients with rheumatic skin diseases such as cutaneous lupus erythematosus (CLE) and systemic sclerosis (SSc) can be distinguished from individuals with healthy skin using the enrichment of specific molecular signatures. However, as patient heterogeneity is a well-known feature of these diseases, it is important to ascertain whether there are distinct patient molecular phenotypes (endotypes). Moreover, identifying specific disease endotypes from analysis of peripheral blood might increase the capacity to recognize patient endotypes.
Gene expression data derived from publicly available lesional skin biopsies of CLE, specifically discoid lupus erythematosus (DLE), the most severe CLE subtype, and SSc patients was analyzed using gene set variation analysis (GSVA) of informative gene modules. Paired blood samples were also analyzed when available. K-means clustering was then applied to the GSVA enrichment scores to identify molecular endotypes in both skin diseases.
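The clustering step can be illustrated with a small k-means sketch operating on per-sample vectors of GSVA enrichment scores. It is a simplified stand-in written with the standard library only; the initialization, iteration limit, and seed are assumptions, and the text does not state which k-means implementation or settings were used.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=50, seed=0):
    """Cluster samples (lists of GSVA scores) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: random initial centroids
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each sample joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(col) / len(members)
                                      for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids
```

Each returned label identifies a candidate endotype; samples sharing a label have similar enrichment profiles across the chosen gene modules.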
In DLE, k-means clustering revealed three subsets with distinct features (
Both DLE and SSc skin exhibit distinct subsets based upon their molecular profile (endotype). In addition, the most severe SSc skin subset can be identified from blood gene expression. Identifying specific molecular endotypes of patients with inflammatory skin disease may facilitate matching individual patients with effective therapies.
Here, we employed a comprehensive analysis of gene expression profiles to characterize the molecular features of four inflammatory skin diseases. Although considerable inter- and intra-dataset heterogeneity was observed, we documented molecular gene signatures that define both lesional and nonlesional skin of the various conditions. Notable among the findings were the shared and unique features of lesional skin among the four diseases and the unique features of nonlesional uninvolved skin. Altogether, this analysis demonstrates the informative power of transcriptomics to determine pathological characteristics of specific stages of each disease.
Our analyses involved multiple bioinformatic and statistical approaches that allowed us to understand the molecular pathways underlying the pre-clinical and clinical stages of inflammatory skin diseases. First and foremost, we assessed numerous datasets for each disease so that we could capture the transcriptional landscape of each condition and overcome the heterogeneity among patients and datasets. Second, each dataset was independently evaluated by GSVA using informative gene signatures we previously employed in the analysis of lupus (17, 21, 22, 27), gene signatures derived from interrogation of other inflammatory skin diseases (23, 24), as well as additional signatures we generated because of their relevance to skin pathogenesis. This analysis allowed us to observe unique patterns in the enrichment of inflammatory pathway signatures among and between the diseases and document that the diseases were molecularly separable. We employed ML models, including CART and random forest, to determine that effective classification between disease and control or between diseases was achievable and to identify the most important features labeling the conditions. The ML models not only permitted the effective classification of samples, but also allowed for dimensionality reduction, scaling the original 48 input gene signatures down to 15 features most important in each classification.
Despite previously noted heterogeneity (28, 29), our analysis revealed that the molecular landscape of DLE was more homogeneous, and that datasets comprising patients from different centers were sufficiently similar to permit accurate classification. Similarly, datasets including patients with SSc demonstrated consistent gene expression patterns. In contrast, we and others found datasets comprising patients with PSO and AD to be more molecularly heterogeneous (30). Nevertheless, we identified definitional transcriptional elements for each of the various conditions that included both shared and specific molecular perturbations. Comparison of the lesional DLE, PSO, AD and SSc transcriptomes using GSVA demonstrated that these four inflammatory skin diseases have numerous inflammatory pathways in common. IFN, TNF, IL-12 complex, IL-23 complex, T cell IL-23, anti-inflammation and unfolded protein gene signatures were commonly upregulated among lesional biopsies from the four inflammatory diseases. Indeed, CART analysis, which was used as a first-pass algorithm to detect important discriminators within the data, demonstrated that the IFN and IL-12 complex gene signatures were the two most important features in distinguishing lesional DLE, PSO, AD and SSc from their respective control samples. Moreover, ML algorithms documented that of the 15 features necessary for accurate classification of each disease from control, seven features are common among all four diseases, including the IGS, IL-12 complex, IL-23 complex, TNF, plasma cell, T cell IL-23 and anti-inflammation gene signatures. Altogether, there were six shared and upregulated features between the GSVA and ML methods, suggesting that despite different genetic predispositions and disease manifestations, lesional DLE, PSO, AD and SSc have a common inflammatory microenvironment that differentiates them from control skin. 
This was further supported by the overlapping enrichment of numerous signatures among at least two of the four diseases. For example, GSVA demonstrated the neutrophil signature was upregulated in the majority of PSO and SSc patients, whereas the pDC, monocyte, monocyte/myeloid cell and B cell signatures were increased in the majority of DLE and SSc patients; the T cell, IL-21 complex, inflammasome and cell cycle signatures were increased in all diseases, except SSc.
Despite the similarities among the diseases, however, we detected unique characteristics of each inflammatory skin disease. Indeed, we observed clear molecular distinctions between lesional samples from patients with DLE, PSO, AD, and SSc. The GSVA analysis revealed that the NK cell signature was only upregulated in lesional DLE compared to control samples, whereas the IL-1 cytokine signature was uniquely upregulated in lesional PSO compared to controls. Notably, however, neither signature proved to be of particular importance in ML classification of either disease from controls. Similarly, although the proteasome and Langerhans cell signatures were uniquely enriched in AD compared to control samples, and the endothelial cell and fibroblast signatures were uniquely enriched in lesional SSc compared to controls, these signatures were not of particular importance in ML classification of either disease. Despite this complexity, ML was able to delineate the most important features for classification of each condition from normal. For example, although increased in some patients from all diseases, the monocyte, T cell and B cell signatures were more important in the classification of DLE. Moreover, the keratinocyte and neutrophil signatures were most important in classifying PSO, and not the other diseases, a finding that is consistent with the role of keratinocyte proliferation and neutrophil infiltration in PSO (18, 31-33). In addition, the IL-21 complex signature was upregulated in all diseases except SSc, but was unique to ML classification of AD, consistent with the role of IL-21 in allergic skin diseases (34, 35). Finally, the TGFB fibroblast signature was important in classification of SSc, which aligns with the central role of fibrosis in this disease (23, 36). 
Furthermore, ML demonstrated that the pDC, fibroblast, and glycolysis signatures are important in classifying lesional DLE from the other lesional diseases, illustrating that ML can identify molecular changes as effective classification features among samples. These findings strongly imply there are unique molecular features in lesional biopsies of inflammatory skin disease, along with a panoply of shared features.
Although there are numerous reports of gene expression abnormalities in lesional skin, less is known about the architecture of clinically uninvolved skin as compared to healthy skin. Examination of nonlesional skin in DLE, PSO, and AD provided new insights into the molecular processes operating in uninvolved skin and suggested a unique pre-clinical set of abnormalities in each condition. Application of both ML and Z-score based approaches as orthogonal analytic tools to assess the differences between nonlesional and normal skin revealed unique patterns of abnormalities in each inflammatory skin condition. Notably, only the apoptosis signature was one of the top 15 features employed by ML to classify nonlesional DLE, PSO, and AD versus pooled controls (
Of note, unlike lesional disease, we did not observe a prominent role for the IGS in nonlesional skin from DLE, PSO, or AD. This contrasts with some previous studies suggesting that nonlesional skin from patients with SLE or DLE is influenced by type 1 IFN (45-47). However, this contention was based largely on single-cell RNA-seq analysis of nonlesional keratinocytes and their expression of the IGS (45, 46), whereas our studies have evaluated expression of the IGS by deconvolution of bulk tissue gene expression. Our data revealed increased IGS in a few DLE samples, which may align with the increase of IFN action in only select cell clusters from single-cell RNA-seq analysis, but not in the majority of samples. Altogether, this suggests that IFN is not a dominant factor of nonlesional disease in either CLE or PSO and may instead reflect the concurrent exposure to UV light or presence of specific autoantibodies, both of which are associated with upregulation of the IGS (48-50).
Taken together, the data suggest a model in which patients with inflammatory skin disease manifest a specific set of pre-clinical molecular abnormalities that could predispose a patient to the development of typical clinical features, perhaps after encountering an environmental trigger (such as UV light, bacterial products or allergens). Upon development of cutaneous inflammation, common molecular features are upregulated, although the lesional disease maintains a unique gene expression profile (
Previous reports were not able to separate molecular features of DLE from those of SCLE despite the dramatic differences in clinical phenotype (51). Both DLE and SCLE are characterized by interface dermatitis, but the differences in clinical manifestations suggest different molecular underpinnings. By GSVA, gene expression profiles of these two entities were similar to each other. In fact, GSVA analysis showed the same gene signatures were significantly enriched in each subtype compared to control. However, the effect size was greater in significantly enriched modules in DLE compared to SCLE. The quantitative differences were sufficient for ML to classify DLE from SCLE, predominantly using the plasma cell, neutrophil, pDC, melanocyte and GC B cell features as well as the TNF, IL-12 and IL-1 cytokine inflammatory features.
The results of this analysis lend insight into future treatment strategies for CLE, PSO, AD, and SSc based on the observed common and distinct molecular characteristics. For example, IL-17 is a well-known target for PSO treatment (52) and has been explored in therapy for lupus (53) and AD (54); however, we did not observe consistent upregulation of the IL-17 complex signature among the lesional manifestations of DLE, AD, and SSc, suggesting IL-17 neutralizing therapy may be best suited for lesional PSO alone. However, we observed upregulation of IL-17 complex and Th17 gene signatures in nonlesional PSO and AD samples, as well as upregulation of the IL-17 complex signature in DLE as compared to control samples, suggesting that IL-17 targeting might be appropriate to prevent the emergence of typical skin lesions in all three diseases as well as to treat established plaques in lesional PSO. Of note, two of five lesional DLE datasets demonstrated significant upregulation of the IL-17 complex and Th17 signatures, suggesting that a subset of DLE patients might be responsive to IL-17 neutralization, and a study investigating the role of secukinumab, a monoclonal antibody to IL-17A, in DLE is ongoing (55). In addition, the consistent upregulation of the TNF signature in each lesional inflammatory skin disease supports the possibility that TNF neutralizing agents may ameliorate inflammation in all four conditions. To date, TNF neutralizing agents are effective in treating PSO (26) (etanercept (56), infliximab (57), adalimumab (58), and certolizumab (59)), while others report their possible efficacy in SLE (60) and AD (61, 62). Notably, a recent phase II trial found that intradermal injection of a TNF neutralizing agent, etanercept, as opposed to traditional systemic injection, induced remission in DLE (63, 64), supporting the conclusion that local presence of TNF in the skin lesion is pathogenic in DLE. 
The IL-12 signature was important in classifying all four lesional skin diseases, suggesting potential efficacy for the IL-12/23 inhibitor, ustekinumab, which is approved for treating PSO (65). Recent phase III trials in lupus were unsuccessful (66, 67), but improvement of skin and mucocutaneous lesions was noted in phase II trials (68). Finally, consistent enrichment of the IGS was noted in lesional skin of all four diseases, suggesting the potential for efficacy of interferon inhibitors such as anifrolumab. Indeed, anifrolumab treatment, which was recently approved for SLE, caused a significant reduction in skin involvement in CLE compared to patients receiving placebo (7).
Some of the individual datasets had small sample numbers; therefore, it was necessary to pool lesional samples from each disease, nonlesional samples from each disease and controls to achieve sufficient sample numbers for ML. In addition, some datasets had few or no controls, and thereby, nonlesional skin could not be compared to control samples by GSVA without the pooling of samples and employing normalization steps. Despite this, we found a number of changes in nonlesional DLE similar to those previously reported by other techniques, for example the decrease in Langerhans cells via immunohistochemistry (69). Moreover, because many of the widely used keratinocyte gene signatures were highly correlated with each other, ML analysis on cutaneous gene signatures previously reported in PSO and AD was not possible (30, 70, 71). Despite these caveats and the intra- and inter-dataset heterogeneity, we identified gene signatures both similar and distinct in lesional and nonlesional inflammatory skin diseases.
In summary, this transcriptomic analysis is one of the first comprehensive studies to evaluate four inflammatory skin diseases concurrently and introduce comparative analyses of both lesional and nonlesional samples with control samples. We elucidated similarities and differences among both lesional and nonlesional DLE, PSO, AD, and SSc. Overall, our combined GSVA/ML analysis demonstrated that although there are seven shared features for classifying lesional DLE, PSO, AD, and SSc from pooled controls, nonlesional skin samples among diseases are molecularly more distinct from one another than lesional samples. This reveals that nonlesional skin samples are extremely informative about the underlying disease process and could be used in a subset of patients for future clinical trials (72). Indeed, nonlesional skin may be more useful in identifying the driving features underlying pathogenesis, since during chronic lesional disease the inflammatory milieu among diseases becomes more similar. In addition, although enrichment analysis of all cell types and pathways is important in the overall definition of disease pathology and necessary to understand for treatment, specific features may be more important in molecular diagnostics for identifying one disease from another.
Experimental Design: 15 publicly available gene expression datasets (accessed from the Gene Expression Omnibus (GEO)) were analyzed (Table 3), including 11 Affymetrix/Illumina microarray datasets (GSE52471 (10), GSE72535 (11), GSE81071 (14, 15, 68, 69), GSE109248 (13), GSE100093 (16), GSE120809 (12), GSE117239 (75), GSE117468 (52), GSE130588 (76), GSE58095 (77), GSE95065) and 4 RNA-seq datasets (GSE121212 (30), GSE137430 (54), GSE157194 (71), GSE130955 (78)). GSE81071 was split into two parts based on the submission date on GEO (GSE81071 from 2017 referred to in the text as GSE81071_A and GSE81071 from 2019 referred to in the text as GSE81071_B). All datasets comprise gene expression derived from skin biopsies of lesional or nonlesional skin from patients with an inflammatory skin disease, including PSO, AD, SSc, and CLE subtypes including DLE, SCLE, and ACLE, or skin biopsies derived from healthy control subjects. For GSE117239 (75), GSE117468 (52), GSE130588 (76), GSE137430 (54), and GSE157194 (71), only lesional and nonlesional samples at baseline without drug treatment were included in the analysis.
Statistical Analysis: Statistical differences between cohorts were evaluated in GraphPad PRISM using an unpaired t-test with Welch's correction for GSVA enrichment scores of lesional and nonlesional samples and for mean Z-scores of nonlesional samples versus control, and a paired t-test with Welch's correction for the lesional versus nonlesional comparison. Calculation of mean and standard deviation (SD) for each GSVA score in each tissue was performed in Microsoft Excel. The number of samples for each dataset is detailed in Table 5B. Further statistical details can be found below.
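For readers unfamiliar with Welch's correction, the following Python sketch computes the unpaired Welch t statistic and the Welch-Satterthwaite degrees of freedom for two cohorts of GSVA scores. It is illustrative only: the analysis itself was run in GraphPad PRISM, the function name is hypothetical, and the p-value lookup from the t distribution is omitted.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for an unpaired t-test that does not assume equal variances."""
    # Squared standard error contributed by each cohort (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

Because the degrees of freedom are estimated from the two sample variances, the test remains appropriate when disease and control cohorts differ in size and spread, as they do across the pooled datasets.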
Microarray data: Microarray data was normalized using either GeneChip Robust Multiarray Average (GCRMA), Robust Multiarray Average (RMA), or normexp background correction (NEQC) based on the microarray platform. Outliers and batch effects were identified using principal component analysis (PCA) plots. For the dataset with known batch effects, GSE81071, raw gene expression values were normalized using 11 housekeeping genes, which were shown to not vary significantly across datasets (79). These 11 housekeeping genes were: chromosome 1 open reading frame 43 (C1orf43), charged multivesicular body protein 2A (CHMP2A), ER membrane protein complex subunit 7 (EMC7), glucose-6-phosphate isomerase (GPI), proteasome subunit beta type 2 (PSMB2), proteasome subunit beta type 4 (PSMB4), RAB7A, member RAS oncogene family (RAB7A), receptor accessory protein 5 (REEP5), small nuclear ribonucleoprotein D3 (SNRPD3), valosin containing protein (VCP), and vacuolar protein sorting 29 homolog (VPS29).
RNAseq data: SRA toolkit (NCBI Sequence Read Archive, Version 2.10) was used to fetch .sra files from GEO and convert them to .fastq files. Quality of the FASTQ files was checked using FASTQC software (Babraham Institute Bioinformatics, Version 0.11.9). Adapters were removed using Trimmomatic software (80) (Version 0.4) and appropriate head crop parameters. Trimmed reads were aligned to the human reference genome (hg38) using STAR aligner (81) (Version 2.7). STAR output .sam files were converted to .bam files using sambamba (82) (Version 0.8). Read summarization was performed using the featureCounts (83) function of the Subread (84) (Version 2.0) package. Count normalization and regularized log transformation were carried out using the rlog function in the DESeq2 (85) (Version 1.32) R package.
Gene Set Variation Analysis: Gene Set Variation Analysis (86) (GSVA) is a non-parametric, unsupervised method for estimating variations in gene set enrichment among the samples of an expression dataset. The GSVA algorithm was implemented using the R Bioconductor open-source package gsva (version 1.40). GSVA was carried out in one of the following ways:
When individual datasets were analyzed, the preprocessed log 2 gene expression matrix of each dataset was used as the GSVA input. GSVA was run on each dataset separately. Before running GSVA, input genes were filtered and only those with interquartile range (IQR) of expression >0 across all the samples were considered for analysis. All analysis in
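The gene filter applied before GSVA can be illustrated with a short Python sketch that keeps only genes whose interquartile range across samples is greater than 0. The function names and the dictionary layout (gene symbol mapped to a list of log2 expression values) are assumptions made for illustration; the quartiles use the "inclusive" method of the standard library, which corresponds to the default quantile definition in R.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range (third quartile minus first quartile)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

def filter_genes_by_iqr(expression):
    """Keep genes whose expression varies across samples (IQR > 0)."""
    return {gene: vals for gene, vals in expression.items() if iqr(vals) > 0}
```

Genes with an IQR of 0 carry no rank information across samples, so removing them avoids feeding uninformative rows into the enrichment calculation.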
For the analysis of pooled nonlesional and control samples, log2 gene expression values generated from independent preprocessing of all 16 datasets were concatenated to create a matrix whose rows consisted of 8425 genes detected across all datasets and whose columns consisted of the 1065 samples comprising DLE, nonlesional (NL) DLE, ACLE, SCLE, PSO, NL PSO, AD, NL AD, SSc, and CTLs. Log2 values were then transformed to Z-scores using the scale() function in R. Z-score transformation converts each sample to have expression values with a mean of 0 and unit variance (87, 88). This transformation permitted comparison of nonlesional disease samples to control directly. GSVA was then run on the following three inputs: 1) 21 pooled nonlesional DLE and 168 pooled control samples, 2) 132 pooled nonlesional AD and 168 pooled control samples, and 3) 163 pooled nonlesional PSO and 168 pooled control samples. The data presented in
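The per-sample Z-score transformation can be sketched in Python as below. The sketch mirrors the column-wise behavior of scaling a genes-by-samples matrix (each sample column is centered and divided by its sample standard deviation); the function name and list-of-lists layout are assumptions, and the sketch presumes that no sample has constant expression across all genes.

```python
from statistics import mean, stdev

def zscore_samples(matrix):
    """Scale each sample (column) of a genes x samples matrix to mean 0, SD 1."""
    columns = list(zip(*matrix))  # one tuple of gene values per sample
    scaled_columns = []
    for col in columns:
        m, s = mean(col), stdev(col)  # sample SD (n - 1 denominator)
        scaled_columns.append([(v - m) / s for v in col])
    # Transpose back so that rows are genes and columns are samples.
    return [list(row) for row in zip(*scaled_columns)]
```

After this step, samples from datasets with different dynamic ranges sit on a common scale, which is what allows pooled nonlesional samples to be compared with pooled controls.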
GSVA Gene Sets: The gene sets used for GSVA can be found in Tables 4A-C. Cellular and pathway signatures: Gene sets employed in our GSVA analysis included 48 annotated and novel cellular and pathway signatures that have been implicated in lupus (4, 5, 6) or inflammatory skin diseases (7, 8). Immune cell gene sets were previously evaluated (21, 27) or amended slightly based upon data from the Human Protein Atlas (89, 90). Non-hematopoietic cell signatures were derived from the Human Protein Atlas (89, 90), previously published gene sets (91), and literature mining as previously described and employed. Pathway gene signatures were previously evaluated in lupus (21, 27), previously published (23, 24), or newly adopted by literature mining (92, 93). The output GSVA scores of each signature were used as features for training and validating ML classifiers. 40 of the 48 cellular and pathway gene signatures were used to implement the GSVA analysis on pooled nonlesional and control samples. The following signatures were excluded from this analysis because of insufficient gene numbers (≤2) in the 8,425 genes used: LDG, GC B cell, erythrocyte, IL1 cytokines, IL12 complex, IL21 complex, IL23 complex, and the immunoproteasome.
Keratinocyte signatures: 30 gene sets specific to keratinocytes treated with individual cytokines were created from previously published studies (18, 70, 94-107). Only those genes that are upregulated in keratinocytes when treated with various cytokines were included in these sets.
T cell signatures: Gene sets for T cells were created from literature mining (108-111) and the Human Protein Atlas (89, 90) to distinguish seven different T cell subsets that have been implicated in inflammatory skin disease.
Classification and Regression Tree (CART): The library rpart (Version 4.1) was used to implement the CART algorithm for classification described previously (112, 113) and library rpart.plot (Version 3.1) was used to visualize classification trees. GSVA enrichment scores of cellular and pathway signatures were used as independent variables and specific disease (either DLE, PSO, AD, SSc, or CTL) was used as the dependent variable for analysis. Classification trees were built independently for each disease
Creating input for ML: The input for ML was created by pooling GSVA enrichment scores of cellular and pathway gene signatures from multiple skin datasets based on classification of skin disease or sample (Table 5B). GSVA enrichment scores, which range from −1 to +1, were concatenated across datasets, providing a sufficiently large cohort to train and validate various ML algorithms. 14 input data frames were created for 14 separate binary ML classifications (Table 5A). Seven of the 14 binary classifications involve comparing control samples (164 CTL) with either lesional samples (DLE, PSO, AD or SSc) or nonlesional samples (DLE, PSO or AD) of inflammatory skin diseases (Table 5A A-D and I-K), whereas another six binary classifications involve comparing lesional DLE samples with lesional samples of other diseases (either PSO, AD or SSc) (Table 5A E-H) and nonlesional DLE samples with nonlesional samples of other diseases (Table 5A L-M). The remaining binary classification consisted of comparing nonlesional PSO and nonlesional AD (Table 5A). For lesional skin classification, pooled samples were 90 DLE, 132 AD, 97 SSc, and 183 PSO. For nonlesional skin classification, pooled samples were 21 DLE, 163 PSO, and 132 AD, and for healthy skin pooled samples were 164 CTL (Table 5B).
Class balance strategies: Four class balance strategies were used for classifications with class imbalance: random undersampling (Table 5A C), random oversampling (Table 5A E, K), removing samples from an entire dataset (Table 5A F), and Synthetic Minority Oversampling Technique (SMOTE) (114) (Table 5A I, L, M). The random undersampling strategy involves randomly selecting samples from the majority class, whereas the random oversampling strategy involves randomly duplicating examples from the minority class. SMOTE functions by randomly selecting a sample from the minority class, finding its k nearest neighbors, randomly selecting one of those neighbors, and generating a synthetic sample at a randomly selected point between the two samples in the feature space. As previously noted, we used random undersampling to trim the number of examples in the majority class and then used SMOTE to oversample the minority class to balance the class distribution. The purpose of all class balancing strategies was to have balanced representation of both classes for ML. The dataset was split into 70% training and 30% validation, and class balancing strategies were applied to the training dataset. ML algorithms were then implemented, and evaluation metrics were recorded. Receiver Operating Characteristic (ROC) curves and Precision-Recall (PR) curves were plotted using the matplotlib (Version 3.3.4) library of Python. A ROC curve is a graphical way to visualize the trade-off between sensitivity and specificity. A high area under the ROC curve represents a low false-positive rate and a high true-positive rate. A PR curve is a measure of classification performance when classes are imbalanced. A high area under the PR curve represents both high recall and high precision, where high precision relates to a low false-positive rate, and high recall relates to a low false-negative rate. 
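The three resampling strategies described above can be sketched in Python as follows. This is a simplified illustration using only the standard library, not the implementation used in the study; the function names, the choice of k, and the use of Euclidean distance are assumptions, and the sketch presumes the minority class has at least two samples.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def random_undersample(majority, n, rng):
    """Keep n randomly chosen majority-class samples (without replacement)."""
    return rng.sample(majority, n)

def random_oversample(minority, n, rng):
    """Grow the minority class to n samples by duplicating random examples."""
    return minority + [rng.choice(minority) for _ in range(n - len(minority))]

def smote(minority, n_new, k, rng):
    """Generate n_new synthetic minority samples by neighbor interpolation."""
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority-class neighbors of x, excluding x itself.
        others = [p for p in minority if p is not x]
        neighbors = sorted(others, key=lambda p: sq_dist(p, x))[:k]
        nb = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from x to the neighbor
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic
```

In keeping with the text, any of these functions would be applied to the 70% training portion after the split, so that the validation samples remain real, unduplicated observations.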
For our analysis, we were interested in features that contributed the most towards separation of classes; hence, RF was chosen as the primary ML classifier because it gives impurity-based feature importance. The top 15 features with decreasing Gini index from each classification were summarized in a bar graph using the ggplot2 (Version 3.3.5) library in R. The capability of the top 15 features alone to separate the two respective classes was tested by repeating the 14 binary ML classifications using only the top 15 features. Various overlaps between the top 15 features of multiple classifications were visualized in Venn diagrams.
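To make the notion of impurity-based importance concrete, the sketch below ranks features by the largest decrease in Gini impurity that a single threshold split on each feature can achieve. It is a deliberately simplified stand-in: a random forest accumulates such decreases over many splits, trees, and bootstrap samples, whereas this sketch evaluates one split per feature. The function names and the dictionary layout are assumptions.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_gini_decrease(values, labels):
    """Largest impurity decrease achievable by one threshold split on a feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    for threshold in sorted(set(values)):
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        if not left or not right:
            continue
        # Weighted impurity of the two child nodes.
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best

def rank_features(feature_columns, labels, top_n=15):
    """Return the top_n feature names ordered by decreasing Gini decrease."""
    scores = {name: best_gini_decrease(col, labels)
              for name, col in feature_columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Here `feature_columns` would map each gene signature name to its GSVA scores across samples, and `labels` would mark disease (1) versus control (0).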
Binary ML classification: 14 separate binary ML classifications were carried out using the scikit-learn (Version 0.24.1) library in Python (Version 3.8.2). For each binary classification, the performance of several ML algorithms, including Logistic Regression (LR), K-Nearest Neighbor (KNN), Naïve Bayes (NB), Support Vector Machines (SVM), Random Forest (RF), and Gradient Boosting (GB), was evaluated based on sensitivity, specificity, Cohen's kappa score, F1 score, and accuracy.
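The evaluation metrics named above follow directly from the binary confusion matrix, as the following Python sketch shows. The function name is hypothetical, labels are assumed to be coded 1 for the positive class and 0 for the negative class, and the sketch does not guard against empty classes or a classifier that never predicts the positive class.

```python
def binary_metrics(y_true, y_pred):
    """Compute common binary classification metrics from true and predicted labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # recall, true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Cohen's kappa: agreement beyond what is expected by chance.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "f1": f1, "kappa": kappa}
```

Reporting kappa alongside accuracy is useful for the imbalanced comparisons in this analysis, because kappa discounts the agreement a classifier would obtain by chance.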
Feature correlation: Before carrying out binary ML classification, feature selection was necessary in order to remove noninformative or redundant features. We assessed feature redundancy by calculating the Pearson correlation between each feature and every other feature. Pearson correlation between features was computed using the cor function in R. The corrplot library in R was used to plot 22 Pearson correlation plots (
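The pairwise redundancy check can be sketched in Python as follows. The analysis itself used R; this sketch only illustrates the calculation, and the 0.9 cutoff for flagging a pair as redundant is a hypothetical value, since the text does not state a threshold.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def redundant_pairs(features, threshold=0.9):
    """List feature pairs whose absolute correlation meets the (assumed) cutoff."""
    names = sorted(features)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs
```

Each flagged pair identifies two gene signatures whose GSVA scores move together across samples, so one of the two adds little independent information to a classifier.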
Statistical Analysis: Statistical differences between cohorts were evaluated using Welch's t-test for lesional disease versus control GSVA scores from a single dataset, for nonlesional versus control GSVA scores from the combined dataset, and for mean Z-scores of nonlesional samples versus mean Z-scores of control samples of a single gene signature; a paired t-test was used for the lesional versus nonlesional comparison. The magnitude of each difference (the effect size) was estimated using Hedges' g (115), calculated as below
Here, cohort 1 and cohort 2 could be either disease samples and their respective control samples from a single dataset, nonlesional samples and control samples from the combined dataset, mean Z-scores of nonlesional samples and mean Z-scores of control samples of a single gene signature, or lesional samples and their paired nonlesional samples from a single dataset. All statistical analyses were carried out using the effectsize (version 0.8.1) and stats (version 3.6.2) libraries in R.
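The two-cohort statistics described above can be sketched in Python as follows. The sketch assumes the textbook definition of Hedges' g, namely the difference in cohort means divided by the pooled standard deviation, multiplied by the common approximate small-sample correction J = 1 - 3/(4(n1 + n2) - 9). Whether the analysis applied a correction, and in what form, is not stated in the text, so the `correct` flag is an assumption. The Welch statistic is returned with its Welch-Satterthwaite degrees of freedom; the p-value and the paired t-test are omitted. The function names and toy scores are hypothetical, and the analysis itself was carried out in R.

```python
import math
from statistics import mean, variance


def hedges_g(cohort1, cohort2, correct=True):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(cohort1), len(cohort2)
    pooled_sd = math.sqrt(
        ((n1 - 1) * variance(cohort1) + (n2 - 1) * variance(cohort2)) / (n1 + n2 - 2)
    )
    g = (mean(cohort1) - mean(cohort2)) / pooled_sd
    if correct:
        # Approximate small-sample bias correction (assumed, see lead-in).
        g *= 1 - 3 / (4 * (n1 + n2) - 9)
    return g


def welch_t(cohort1, cohort2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    se1 = variance(cohort1) / len(cohort1)  # squared standard error, cohort 1
    se2 = variance(cohort2) / len(cohort2)  # squared standard error, cohort 2
    t = (mean(cohort1) - mean(cohort2)) / math.sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (
        se1 ** 2 / (len(cohort1) - 1) + se2 ** 2 / (len(cohort2) - 1)
    )
    return t, df


# Hypothetical GSVA scores for one gene signature.
lesional = [2.0, 4.0, 6.0]
control = [1.0, 2.0, 3.0]
g_value = hedges_g(lesional, control)
t_value, df_value = welch_t(lesional, control)
print(f"Hedges' g = {g_value:.3f}, Welch t = {t_value:.3f}, df = {df_value:.2f}")
```

Unlike Student's t-test, the Welch form does not assume equal variances in the two cohorts, which is why its degrees of freedom are generally not an integer.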
Data Visualization: Heatmaps of GSVA Hedges' g effect sizes and violin plots of GSVA enrichment scores were visualized using GraphPad Prism (Version 9.2.0). GSVA enrichment scores of gene signatures were visualized using violin plots in Prism or, for hierarchical clustering, using the ComplexHeatmap (116) package (Version 2.8) in R. Figures were made using Adobe Illustrator Creative Cloud (Version 25.3.1).
Data and materials availability: All transcriptomic data have been previously published and are available in the NCBI Gene Expression Omnibus (GSE52471, GSE72535, GSE81071, GSE109248, GSE100093, GSE120809, GSE117239, GSE117468, GSE130588, GSE137430, GSE157194, GSE121212, GSE95065, GSE58095, GSE130955), as seen in Table 3. All data are available in the main text or the supplementary materials. All bioinformatic software used in this publication is open source and freely available for R and Python. Additionally, example code used in this paper for GSVA, CART, and ML is available at figshare, www.figshare.com. File names are “AMVPEL BioSolutions GSVA Code AFFY nonzeroIQR Code”, “AMVPEL BioSolutions Stepwise and CARTCode”, “AMVIPEL BioSolutions MVL Binary”.
While preferred embodiments have been shown and described herein, such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the scope of the disclosure. It may be understood that various alternatives to the embodiments described herein may be employed in practice. Numerous different combinations of embodiments described herein are possible, and such combinations are considered part of the present disclosure. In addition, all features discussed in connection with any one embodiment herein may be readily adapted for use in other embodiments herein. It is intended that the following claims define the scope of the disclosure and that methods and structures within the scope of these claims and their equivalents be covered thereby.
This application claims priority to U.S. Provisional Patent Applications No. 63/216,999, filed Jun. 30, 2021; No. 63/246,726, filed Sep. 21, 2021; No. 63/313,177, filed Feb. 23, 2022; and No. 63/343,855, filed May 19, 2022, all of which are incorporated in full herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2022/035552 | 6/29/2022 | WO | |

Number | Date | Country |
---|---|---|
63216999 | Jun 2021 | US |
63246726 | Sep 2021 | US |
63313177 | Feb 2022 | US |
63343855 | May 2022 | US |