METHYLATION MARKERS FOR DIAGNOSING CANCER

BACKGROUND OF THE DISCLOSURE

Cancer is a leading cause of death worldwide, with annual cases expected to increase from 14 million in 2012 to 22 million during the next two decades (WHO). Diagnostic procedures for liver cancer, in some cases, begin only after a patient already presents with symptoms, leading to costly, invasive, and sometimes time-consuming procedures. In addition, inaccessible areas sometimes prevent an accurate diagnosis. Further, high cancer morbidities and mortalities are associated with late diagnosis.

SUMMARY OF THE DISCLOSURE

In certain embodiments, disclosed herein is a method of selecting a subject suspected of having cancer for treatment, comprising: (a) contacting treated DNA with at least one probe from a probe panel to generate an amplified product, wherein the at least one probe hybridizes under high stringency condition to a target sequence of a cg marker selected from Table 1, Table 2, Table 7, Table 8, or Table 13, and wherein the treated DNA is processed from a biological sample obtained from the subject; (b) analyzing the amplified product to generate a methylation profile of the cg marker; (c) comparing the methylation profile to a reference model relating methylation profiles of cg markers from Tables 1, 2, 7, 8, and 13 to a set of cancers; (d) based on the comparison of step c), determining: (i) whether the subject has cancer; and (ii) which cancer type the subject has; and (e) administering an effective amount of a therapeutic agent to the subject if the subject is determined to have cancer and the cancer type is determined.

In certain embodiments, disclosed herein is a method of detecting the methylation status of a set of cg markers, comprising: (a) processing a biological sample obtained from a subject with a deaminating agent to generate treated DNA comprising deaminated nucleotides; (b) contacting the treated DNA with at least one probe that hybridizes under high stringency condition to a target sequence of a cg marker from Table 1, Table 2, Table 7, Table 8, Table 13, Table 14, or Table 20; and (c) quantitatively detecting the methylation status of the cg marker, wherein said detection comprises a real-time quantitative probe-based PCR or a digital probe-based PCR.
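The quantitative detection step above can be illustrated with a minimal sketch. It assumes a bisulfite-style deamination in which unmethylated cytosines are converted and methylated cytosines are not, so that the methylation value of a cg marker is the fraction of methylated signal. The function names and the read counts are hypothetical and are not taken from the disclosure.

```python
def methylation_value(methylated: int, unmethylated: int) -> float:
    """Fraction of methylated signal at one CpG site (0.0 to 1.0)."""
    total = methylated + unmethylated
    if total == 0:
        raise ValueError("no signal covers this CpG site")
    return methylated / total


def methylation_profile(counts: dict) -> dict:
    """Map each marker to its methylation value.

    counts: marker id -> (methylated count, unmethylated count).
    """
    return {marker: methylation_value(m, u) for marker, (m, u) in counts.items()}


# Hypothetical counts for two markers named in this disclosure.
example_profile = methylation_profile({
    "cg10673833": (30, 70),
    "cg25459300": (5, 95),
})
```

In a probe-based PCR readout, the two counts would correspond to the signals from the methylation-specific and non-methylation-specific probes; that mapping is likewise an assumption of this sketch.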

Disclosed herein, in certain embodiments, is a method of detecting a methylation pattern of a set of biomarkers in a subject suspected of having a cancer, the method comprising: (a) processing an extracted genomic DNA with a deaminating agent to generate a genomic DNA sample comprising deaminated nucleotides, wherein the extracted genomic DNA is obtained from a biological sample from the subject suspected of having a cancer; and (b) detecting the methylation pattern of one or more biomarkers selected from Table 1, Table 2, Table 7, Table 8, Table 13, Table 14, or Table 20 from the extracted genomic DNA by contacting the extracted genomic DNA with a set of probes, wherein the set of probes hybridizes to the one or more biomarkers, and performing a DNA sequencing analysis to determine the methylation pattern of the one or more biomarkers. In some embodiments, said detecting comprises a real-time quantitative probe-based PCR or a digital probe-based PCR. In some embodiments, the digital probe-based PCR is a digital droplet PCR. In some embodiments, the set of probes comprises a set of padlock probes. In some embodiments, step b) comprises detecting the methylation pattern of one or more biomarkers selected from Table 2, Table 13, Table 14, or Table 20.
In some embodiments, step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg19516279, cg06100368, cg25945732, cg19155007, cg17952661, cg04072843, cg01250961, cg08131100, cg03788131, cg17528648, cg07784526, cg18948743, cg23986470, cg00846300, cg01029638, cg08350814, cg05098590, cg18085998, cg06532037, cg15313226, cg16232979, cg26149167, cg01237565, cg16561543, cg13771313, cg13771313, cg08169020, cg08169020, cg21153697, cg07326648, cg14309384, cg20923716, cg09095222, cg22220310, cg21950459, cg13332729, cg10802543, cg20707333, cg13169641, cg25352342, cg09921682, cg02504622, cg17373759, cg06547203, cg06826710, cg00902147, cg17609887, cg15721142, cg08116711, cg00736681, cg18834029, cg06969479, cg24630516, cg16901821, cg20349803, cg23610994, cg19313373, cg16508600, cg24096323, cg24746106, cg12288267, cg10430690, cg24408776, cg05630192, cg12028674, cg24820270, cg12028674, cg26718707, cg10349880, cg09921682, cg25934700, cg14164596, cg24461337, cg23041410, cg07366553, cg26859666, cg06405341, cg08557188, cg00690392, cg03421440, cg07077277, or cg20702527. In some embodiments, the subject is suspected of having a breast cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg19516279, cg06100368, cg20349803, cg23610994, cg19313373, cg16508600, and cg24096323. In some embodiments, the subject is determined to have a breast cancer if: at least one of the cg markers cg19516279 and cg06100368 is hypermethylated; at least one of the cg markers cg20349803, cg23610994, cg19313373, cg16508600, and cg24096323 is hypomethylated; or a combination thereof. In some embodiments, the subject is suspected of having a liver cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg25945732, cg19155007, cg17952661, cg25934700, cg14164596, cg24461337, cg23041410, cg07366553, cg26859666, and cg00456086.
In some embodiments, the subject is determined to have a liver cancer if: at least one of the cg markers cg25945732, cg19155007, or cg17952661 is hypermethylated; at least one of the cg markers cg25934700, cg14164596, cg24461337, cg23041410, cg07366553, cg26859666, or cg00456086 is hypomethylated; or a combination thereof. In some embodiments, the subject is suspected of having a liver cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from 3-49757316, 8-27183116, 8-141607252, 17-29297711, 3-49757306, 19-43979341, 8-141607236, 5-176829755, 18-13382140, 15-65341965, 3-13152305, 17-29297770, 8-27183316, 5-176829740, 19-41316693, 18-43830649, 15-65341957, 20-44539531, 7-30265625, 2-131129567, 5-176829665, 3-13152273, 8-27183348, 3-49757302, 19-41316697, 8-61821442, 20-44539525, 10-102883105, 11-65849129, 5-176829639, 15-91129457, 2-1625431, 6-151373292, 6-151373294, 20-25027093, 6-14284198, 10-4049295, 19-59023222, 1-184197132, 2-131004117, 2-8995417, 12-10782319, 20-25027033, 6-151373256, 8-86100970, 9-4839459, 17-41221574, 1-153926715, 20-25027044, 20-20177325, 2-1625443, 20-25027085, 11-69420728, 1-229234865, 6-13408877, 22-50643735, 6-151373308, 1-232119750, 8-134361508, or 6-13408858. 
In some embodiments, the subject is determined to have a liver cancer if: at least one of the markers 3-49757316, 8-27183116, 8-141607252, 17-29297711, 3-49757306, 19-43979341, 8-141607236, 5-176829755, 18-13382140, 15-65341965, 3-13152305, 17-29297770, 8-27183316, 5-176829740, 19-41316693, 18-43830649, 15-65341957, 20-44539531, 7-30265625, 2-131129567, 5-176829665, 3-13152273, 8-27183348, 3-49757302, 19-41316697, 8-61821442, 20-44539525, 10-102883105, 11-65849129, or 5-176829639 is hypermethylated; at least one of the markers 15-91129457, 2-1625431, 6-151373292, 6-151373294, 20-25027093, 6-14284198, 10-4049295, 19-59023222, 1-184197132, 2-131004117, 2-8995417, 12-10782319, 20-25027033, 6-151373256, 8-86100970, 9-4839459, 17-41221574, 1-153926715, 20-25027044, 20-20177325, 2-1625443, 20-25027085, 11-69420728, 1-229234865, 6-13408877, 22-50643735, 6-151373308, 1-232119750, 8-134361508, or 6-13408858 is hypomethylated; or a combination thereof. In some embodiments, the subject is suspected of having an ovarian cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg04072843, cg01250961, cg24746106, cg12288267, and cg10430690. In some embodiments, the subject is determined to have an ovarian cancer if: at least one of the cg markers cg04072843 and cg01250961 is hypermethylated; at least one of the cg markers cg24746106, cg12288267, and cg10430690 is hypomethylated; or a combination thereof. In some embodiments, the subject is suspected of having a colorectal cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg08131100, cg03788131, cg17528648, cg07784526, cg18948743, cg23986470, cg00846300, cg25352342, cg09921682, cg02504622, cg17373759, cg12028674, cg24820270, cg12028674, cg26718707, cg10349880, and cg09921682. 
In some embodiments, the subject is determined to have a colorectal cancer if: at least one of the cg markers cg08131100, cg03788131, cg17528648, cg07784526, cg18948743, cg23986470, or cg00846300 is hypermethylated; at least one of the cg markers cg25352342, cg09921682, cg02504622, cg17373759, cg12028674, cg24820270, cg12028674, cg26718707, cg10349880, or cg09921682 is hypomethylated; or a combination thereof. In some embodiments, the subject is suspected of having a colorectal cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg10673833, cg10493436, cg10428836, cg27284288, cg16959747, cg17494199, cg23678254, cg24067911, or cg25459300. In some embodiments, the subject is suspected of having a colorectal cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg05205843, cg11841704, cg06699564, cg08924619, cg11959316, cg08924619, cg06699564, cg01824933, cg08924619, cg05205842, cg08924619, cg04049981, cg09026722, cg03616722, cg08924619, cg05928904, cg08704934, cg09776772, cg17494199, cg01824933, cg16296417, cg09776772, cg09776772, cg05338167, cg10493436, cg011251410, cg16391792, cg06393830, cg09366118, cg22513455, cg17583432, cg23881926, cg09638208, cg12441066, cg27284288, cg04441857, cg17583432, cg10673833, cg19757176, cg08670281, cg17583432, cg04460364, cg16959747, cg15011734, or cg25754195. 
In some embodiments, the subject is determined to have a colorectal cancer if: at least one of the cg markers cg06393830, cg09366118, cg22513455, cg17583432, cg23881926, cg09638208, cg12441066, cg27284288, cg04441857, cg17583432, cg10673833, cg19757176, cg08670281, cg17583432, cg04460364, cg16959747, cg15011734, or cg25754195 is hypermethylated; at least one of the cg markers cg05205843, cg11841704, cg06699564, cg08924619, cg11959316, cg08924619, cg06699564, cg01824933, cg08924619, cg05205842, cg08924619, cg04049981, cg09026722, cg03616722, cg08924619, cg05928904, cg08704934, cg09776772, cg17494199, cg01824933, cg16296417, cg09776772, cg09776772, cg05338167, cg10493436, cg011251410, or cg16391792 is hypomethylated; or a combination thereof. In some embodiments, the subject is suspected of having a prostate cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg01029638, cg08350814, cg05098590, cg18085998, cg06532037, cg15313226, cg16232979, cg26149167, cg06547203, cg06826710, cg00902147, cg17609887, and cg15721142. In some embodiments, the subject is determined to have a prostate cancer if: at least one of the cg markers cg01029638, cg08350814, cg05098590, cg18085998, cg06532037, cg15313226, cg16232979, or cg26149167 is hypermethylated; at least one of the cg markers cg06547203, cg06826710, cg00902147, cg17609887, or cg15721142 is hypomethylated; or a combination thereof. In some embodiments, the subject is suspected of having a pancreatic cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg01237565, cg16561543, and cg08116711. In some embodiments, the subject is determined to have a pancreatic cancer if: at least one of the cg markers cg01237565 or cg16561543 is hypermethylated; cg marker cg08116711 is hypomethylated; or a combination thereof. 
In some embodiments, the subject is suspected of having acute myeloid leukemia and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg13771313, cg13771313, and cg08169020. In some embodiments, the subject is suspected of having cervical cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg08169020, cg21153697, cg07326648, cg14309384, cg20923716, cg22220310, cg21950459, cg13332729, cg10802543, cg20707333, and cg13169641. In some embodiments, the subject is determined to have cervical cancer if: at least one of the cg markers cg08169020, cg21153697, cg07326648, cg14309384, or cg20923716 is hypermethylated; at least one of the cg markers cg22220310, cg21950459, cg13332729, cg10802543, cg20707333, or cg13169641 is hypomethylated; or a combination thereof. In some embodiments, the subject is suspected of having sarcoma and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg09095222. In some embodiments, the subject is determined to have sarcoma if at least cg marker cg09095222 is hypermethylated. In some embodiments, the subject is suspected of having stomach cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg00736681 and cg18834029. In some embodiments, the subject is determined to have stomach cancer if at least one of the cg markers cg00736681 or cg18834029 is hypomethylated. In some embodiments, the subject is suspected of having thyroid cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg06969479, cg24630516, and cg16901821. In some embodiments, the subject is determined to have thyroid cancer if at least one of the cg markers cg06969479, cg24630516, or cg16901821 is hypomethylated. 
In some embodiments, the subject is suspected of having mesothelioma and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg05630192. In some embodiments, the subject is determined to have mesothelioma if cg marker cg05630192 is hypomethylated. In some embodiments, the subject is suspected of having glioblastoma and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg06405341. In some embodiments, the subject is suspected of having lung cancer and step b) comprises detecting the methylation pattern of one or more biomarkers selected from cg08557188, cg00690392, cg03421440, and cg07077277. In some embodiments, the subject is determined to have lung cancer if at least one of the cg markers cg08557188, cg00690392, cg03421440, or cg07077277 is hypomethylated. In some embodiments, the biological sample is a blood sample, a urine sample, a saliva sample, a sweat sample, or a tear sample. In some embodiments, the biological sample is a cell-free DNA sample. In some embodiments, the biological sample comprises circulating tumor cells.
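The marker-based determinations recited above share one logical form: a cancer type is indicated if at least one marker in a hypermethylated set is hypermethylated, at least one marker in a hypomethylated set is hypomethylated, or both. A minimal sketch of that rule follows, using the pancreatic cancer markers named above. The reference values, the `min_delta` threshold that defines "hyper" and "hypo", and the function name are illustrative assumptions; the disclosure does not specify them here.

```python
# Marker sets for pancreatic cancer as recited in this summary.
PANCREATIC_HYPER = ("cg01237565", "cg16561543")
PANCREATIC_HYPO = ("cg08116711",)


def indicates_cancer(profile, reference, hyper_markers, hypo_markers,
                     min_delta=0.2):
    """Apply the 'hypermethylated, hypomethylated, or a combination' rule.

    profile / reference: marker id -> methylation value (0.0 to 1.0).
    min_delta: assumed minimum difference from the reference value for a
    marker to count as hyper- or hypomethylated (hypothetical threshold).
    """
    measured = [m for m in hyper_markers if m in profile and m in reference]
    hyper = any(profile[m] - reference[m] >= min_delta for m in measured)
    measured = [m for m in hypo_markers if m in profile and m in reference]
    hypo = any(reference[m] - profile[m] >= min_delta for m in measured)
    return hyper or hypo


# Hypothetical methylation values for one subject and a normal reference.
reference = {"cg01237565": 0.30, "cg16561543": 0.28, "cg08116711": 0.65}
subject = {"cg01237565": 0.75, "cg16561543": 0.30, "cg08116711": 0.60}
subject_flagged = indicates_cancer(subject, reference,
                                   PANCREATIC_HYPER, PANCREATIC_HYPO)
```

The same function applies to any of the other cancer types by swapping in the corresponding marker sets; a reference model as described in the first embodiment would replace the fixed threshold used here.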

In certain embodiments, disclosed herein is a kit comprising a set of nucleic acid probes that hybridizes to target sequences of cg markers illustrated in Table 1, Table 2, Table 7, Table 8, Table 13, Table 14, Table 20, or a combination thereof. In some embodiments, the set of nucleic acid probes hybridizes to target sequences of cg markers selected from Table 1. In some embodiments, the set of nucleic acid probes hybridizes to target sequences of cg markers selected from Table 2. In some embodiments, the set of nucleic acid probes hybridizes to target sequences of cg markers selected from Table 7. In some embodiments, the set of nucleic acid probes hybridizes to target sequences of cg markers selected from Table 8. In some embodiments, the set of nucleic acid probes hybridizes to target sequences of cg markers selected from Table 13. In some embodiments, the set of nucleic acid probes hybridizes to target sequences of cg markers selected from Table 14. In some embodiments, the set of nucleic acid probes hybridizes to target sequences of cg markers selected from Table 20. 
In some embodiments, the set of nucleic acid probes hybridizes to target sequences of cg markers selected from cg19516279, cg06100368, cg25945732, cg19155007, cg17952661, cg04072843, cg01250961, cg08131100, cg03788131, cg17528648, cg07784526, cg18948743, cg23986470, cg00846300, cg01029638, cg08350814, cg05098590, cg18085998, cg06532037, cg15313226, cg16232979, cg26149167, cg01237565, cg16561543, cg13771313, cg13771313, cg08169020, cg08169020, cg21153697, cg07326648, cg14309384, cg20923716, cg09095222, cg22220310, cg21950459, cg13332729, cg10802543, cg20707333, cg13169641, cg25352342, cg09921682, cg02504622, cg17373759, cg06547203, cg06826710, cg00902147, cg17609887, cg15721142, cg08116711, cg00736681, cg18834029, cg06969479, cg24630516, cg16901821, cg20349803, cg23610994, cg19313373, cg16508600, cg24096323, cg24746106, cg12288267, cg10430690, cg24408776, cg05630192, cg12028674, cg24820270, cg12028674, cg26718707, cg10349880, cg09921682, cg25934700, cg14164596, cg24461337, cg23041410, cg07366553, cg26859666, cg06405341, cg08557188, cg00690392, cg03421440, cg07077277, cg00456086, and cg20702527. In some embodiments, the set of nucleic acid probes hybridizes to target sequences of cg10673833, cg10493436, cg10428836, cg27284288, cg16959747, cg17494199, cg23678254, cg24067911, or cg25459300. In some embodiments, the set of nucleic acid probes hybridizes to target sequences of cg10673833 or cg25462303. In some embodiments, the set of nucleic acid probes hybridizes to target sequences of cg05205843, cg11841704, cg06699564, cg08924619, cg11959316, cg08924619, cg06699564, cg01824933, cg08924619, cg05205842, cg08924619, cg04049981, cg09026722, cg03616722, cg08924619, cg05928904, cg08704934, cg09776772, cg17494199, cg01824933, cg16296417, cg09776772, cg09776772, cg05338167, cg10493436, cg011251410, or cg16391792. 
In some embodiments, the set of nucleic acid probes hybridizes to target sequences of cg06393830, cg09366118, cg22513455, cg17583432, cg23881926, cg09638208, cg12441066, cg27284288, cg04441857, cg17583432, cg10673833, cg19757176, cg08670281, cg17583432, cg04460364, cg16959747, cg15011734, or cg25754195. In some embodiments, the set of nucleic acid probes comprises a set of padlock probes.

BRIEF DESCRIPTION OF THE DRAWINGS

Various aspects of the disclosure are set forth with particularity in the appended claims. The file of this patent contains at least one drawing/photograph executed in color. Copies of this patent with color drawing(s)/photograph(s) will be provided by the Office upon request and payment of the necessary fee. A better understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure are utilized, and the accompanying drawings of which:

FIG. 1 illustrates the methylation status of biomarker 7-1577016.

FIG. 2 illustrates the methylation status of biomarker 11-67177103.

FIG. 3 illustrates the methylation status of biomarker 19-10445516 (cg17126555).

FIG. 4 illustrates the methylation status of biomarker 12-122277360.

FIG. 5 illustrates the methylation status of biomarker 6-72130742 (cg24772267).

FIG. 6 illustrates the methylation status of biomarker 3-15369681.

FIG. 7 illustrates the methylation status of biomarker 3-131081177.

FIG. 8 illustrates a workflow chart of data generation and analysis. Whole genome methylation data on HCC and normal lymphocytes were used to identify 401 candidate markers. Diagnostic marker selection: LASSO and random forest analyses were applied to a training cohort of 715 HCC and 560 normal patients to identify a final selection of ten markers. These ten markers were applied to a validation cohort of 383 HCC and 275 normal patients. Prognostic marker selection: univariate Cox and LASSO-Cox analyses were applied to a training cohort of 680 HCC patients with survival data to identify a final selection of eight markers. These eight markers were applied to a validation cohort of 369 HCC patients with survival data.

FIG. 9A-FIG. 9H illustrate cfDNA methylation analysis for diagnosis of HCC. FIG. 9A shows the heatmap of methylation of 28 pairs of matched HCC tumor DNA and plasma cfDNA, with a mean methylation value threshold of 0.1 as a cutoff. FIG. 9B shows the methylation values and standard deviations of ten diagnostic markers in normal plasma, HCC tumor DNA, and HCC patient cfDNA. FIG. 9C and FIG. 9D show the confusion tables of binary results of the diagnostic prediction model in the training (FIG. 9C) and validation datasets (FIG. 9D). FIG. 9E and FIG. 9F illustrate ROC of the diagnostic prediction model with methylation markers in the training (FIG. 9E) and validation datasets (FIG. 9F). FIG. 9G and FIG. 9H show the unsupervised hierarchical clustering of ten methylation markers selected for use in the diagnostic prediction model in the training (FIG. 9G) and validation datasets (FIG. 9H).
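For readers unfamiliar with the confusion tables and ROC curves referenced for FIG. 9C-FIG. 9F, the following sketch shows how such summaries are commonly computed from binary labels and model scores. The labels, scores, and the 0.5 cutoff are invented for illustration; they are not data from the disclosure, and the disclosure does not state that its figures were produced this way.

```python
def confusion_table(actual, predicted):
    """Count true/false positives and negatives from paired binary labels."""
    table = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a and p:
            table["tp"] += 1
        elif p:
            table["fp"] += 1
        elif a:
            table["fn"] += 1
        else:
            table["tn"] += 1
    return table


def sensitivity_specificity(table):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    positives = table["tp"] + table["fn"]
    negatives = table["tn"] + table["fp"]
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return table["tp"] / positives, table["tn"] / negatives


def roc_auc(actual, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for a, s in zip(actual, scores) if a]
    neg = [s for a, s in zip(actual, scores) if not a]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = HCC, 0 = normal; scores from some prediction model.
actual = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed cutoff of 0.5
table = confusion_table(actual, predicted)
sens, spec = sensitivity_specificity(table)
auc = roc_auc(actual, scores)
```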

FIG. 10A-FIG. 10K illustrate cfDNA methylation analysis and tumor burden, treatment response, and staging. FIG. 10A and FIG. 10B show the combined diagnosis score (cd-score) (FIG. 10A) and AFP (FIG. 10B) in healthy controls, individuals with liver diseases (HBV/HCV infection, cirrhosis, and fatty liver), and HCC patients. FIG. 10C shows the cd-score in normal controls and HCC patients with and without detectable tumor burden. FIG. 10D shows the cd-score in normal controls, HCC patients before treatment, with treatment response, and with progression. FIG. 10E shows the cd-score in normal controls and HCC patients before surgery, after surgery, and with recurrence. FIG. 10F shows the cd-score in normal controls and HCC patients from stage I-IV. FIG. 10G shows the ROC of cd-score and AFP for HCC diagnosis in the whole HCC cohort. FIG. 10H and FIG. 10I show the cd-score (FIG. 10H) and AFP (FIG. 10I) in HCC patients with initial diagnosis (before surgery or other treatment), with treatment response, with progression, and with recurrence. FIG. 10J and FIG. 10K show the cd-score (FIG. 10J) and AFP (FIG. 10K) in HCC patients from stage I-IV.

FIG. 11A-FIG. 11G illustrate cfDNA methylation analysis for prognostic prediction of HCC survival. FIG. 11A and FIG. 11B show the overall survival curves of HCC patients with low or high risk of death at 6 months, according to the combined prognosis score (cp-score) in the training (FIG. 11A) and validation datasets (FIG. 11B). FIG. 11C and FIG. 11D show the survival curves of HCC patients with stage I/II and stage III/IV in the training (FIG. 11C) and validation datasets (FIG. 11D). FIG. 11E and FIG. 11F show the ROC for the cp-score, stage, and cp-score combined with stage in the training (FIG. 11E) and validation datasets (FIG. 11F). FIG. 11G shows the survival curves of HCC patients with combinations of cp-score risk and stage in the whole HCC cohort.
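Survival curves of the kind referenced for FIG. 11A-FIG. 11D and FIG. 11G are often estimated with the Kaplan-Meier method. The disclosure does not name the estimator at this point, so the following is only a sketch under that assumption, with invented follow-up times.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times:  follow-up time for each patient (e.g., months).
    events: True if death was observed at that time, False if censored.
    Returns a list of (time, survival probability) at each observed death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for tt, e in data if tt == t]
        deaths = sum(1 for e in same_time if e)
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= len(same_time)  # deaths and censored cases leave the risk set
        i += len(same_time)
    return curve


# Hypothetical follow-up data for five patients (months, death observed?).
km_curve = kaplan_meier([2, 4, 4, 6, 8], [True, True, False, True, False])
```

Running the function separately on the low-risk and high-risk groups defined by a prognosis score would give the two curves that are compared in such figures.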

FIG. 12 illustrates an unsupervised hierarchical clustering of top 1000 methylation markers differentially methylated between HCC tumor DNA and normal blood. Each column represents an individual patient and each row represents a CpG marker.

FIG. 13A-FIG. 13B illustrate an exemplary region encompassing two Blocks of Correlated Methylation (BCM, also referred to as MCB) in cfDNA samples from HCC patients and normal controls. FIG. 13A shows a genomic neighborhood of the BCM displayed within the UCSC genome browser. The Pearson correlation track shows correlation data obtained by summing r values for a marker within a BCM. The cg marker names below the Pearson correlation graph (cg14999168, cg14088196, cg25574765) are methylation markers from TCGA. Gene names and common SNPs are also listed. FIG. 13B shows a not-to-scale representation of a set of analyzed cg markers belonging to two BCMs in this region. Boundaries between blocks are indicated by a black rectangle, whereas red squares indicate correlated methylation (r>0.5) between two nearby markers. Correlation between any two markers is represented by a square at the intersection of (virtual) perpendicular lines originating from these two markers. White color indicates no significant correlation. The 10 newly identified methylation markers in the left MCB, anchored by marker cg14999168, and the 11 newly identified methylation markers in the right MCB, anchored by cg14088196/cg25574765, were highly consistent and correlated among HCC ctDNA, normal cfDNA, and HCC tissue DNA. Using markers within the same MCB can significantly enhance allele-calling accuracy. Vertical lines at the bottom of FIG. 13B are genomic coordinates of the boundaries of the two MCBs.
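The block structure described for FIG. 13B can be sketched as follows: markers are ordered by genomic position, the Pearson correlation r of methylation values across samples is computed for each pair of neighboring markers, and a block boundary is placed wherever r does not exceed 0.5. The grouping procedure, the site names, and the values are illustrative assumptions; only the r>0.5 cutoff comes from the figure description.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant marker is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def correlated_blocks(ordered_markers, values, r_cutoff=0.5):
    """Group neighboring markers into blocks of correlated methylation.

    ordered_markers: marker ids sorted by genomic coordinate.
    values: marker id -> methylation values across the same samples.
    """
    if not ordered_markers:
        return []
    blocks = [[ordered_markers[0]]]
    for prev, cur in zip(ordered_markers, ordered_markers[1:]):
        if pearson(values[prev], values[cur]) > r_cutoff:
            blocks[-1].append(cur)   # extend the current block
        else:
            blocks.append([cur])     # boundary: start a new block
    return blocks


# Hypothetical methylation values for four neighboring sites in four samples.
site_values = {
    "site_a": [0.1, 0.5, 0.9, 0.3],
    "site_b": [0.2, 0.6, 0.8, 0.4],
    "site_c": [0.9, 0.1, 0.5, 0.5],
    "site_d": [0.8, 0.2, 0.5, 0.5],
}
blocks = correlated_blocks(["site_a", "site_b", "site_c", "site_d"], site_values)
```

With these invented values the four sites fall into two blocks, mirroring the two-block region described for the figure.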

FIG. 14 illustrates an unsupervised hierarchical clustering of exemplary methylation markers for Stage I-Stage IV HCC tumor.

FIG. 15A-FIG. 15B show methylation values correlated with treatment outcomes in HCC patients with serial plasma samples. FIG. 15A shows a change in cd-score comparing patients after surgery, with clinical response, and with disease progression (***p<0.001). FIG. 15B shows cd-score trends in individual patients after complete surgical resection, with treatment response, and with disease progression. PRE: pre-treatment; POST: after-treatment.

FIG. 16 illustrates dynamic monitoring of treatment outcomes in individual patients with cd-score and AFP. Dates of treatments are indicated by vertical blue arrows. PD, progressive disease; PR, partial response; SD, stable disease; TACE, trans-catheter arterial chemoembolization.

FIG. 17A-FIG. 17C illustrate data analysis of an exemplary marker cg10673833.

FIG. 18 illustrates a workflow for building the diagnostic and prognostic models. Whole genome methylation data on HCC, LUNC, and normal blood were used to identify candidate markers for probe design. Left panel: diagnostic marker selection: LASSO analysis was applied to a training cohort of 444 HCC, 299 LUNC, and 1123 normal patients to identify a final selection of 77 markers. These 77 markers were applied to a validation cohort of 445 HCC, 300 LUNC, and 1124 normal patients. Right panel: prognostic marker selection: LASSO-Cox analysis was applied to a training cohort of 433 HCC and 299 LUNC patients with survival data to identify a final selection of 20 markers. These 20 markers were applied to a validation cohort of 434 HCC and 300 LUNC patients with survival data.

FIG. 19A-FIG. 19D illustrate cfDNA methylation analysis for diagnosis of LUNC and HCC. FIG. 19A shows receiver operating characteristic (ROC) curves and the associated areas under the curve (AUCs) of the diagnostic prediction model (cd-score) using cfDNA methylation analysis in the validation cohort. FIG. 19B shows box plots of composite scores used to classify normal and cancer patients (left), and LUNC and HCC patients (right). FIG. 19C and FIG. 19D show unsupervised hierarchical clustering of methylation markers differentially methylated between cancer (HCC and LUNC) and normal (FIG. 19C) and between HCC and LUNC (FIG. 19D). Each row represents an individual patient and each column represents an MCB marker.

FIG. 20A-FIG. 20D illustrate methylation profiling in healthy controls, high-risk patients, and cancer patients. FIG. 20A shows that methylation profiling differentiates HCC from high-risk liver disease patients or normal controls. High-risk liver diseases were defined as hepatitis, liver cirrhosis, and fatty liver disease. FIG. 20B shows that serum AFP differentiates HCC from high-risk liver disease patients or normal controls. FIG. 20C shows that methylation profiling differentiates LUNC from patients who smoke and normal controls. FIG. 20D shows that serum CEA differentiates LUNC from high-risk (smoking) patients.

FIG. 21A-FIG. 21R illustrate that cfDNA methylation analysis can predict tumor burden, staging, and treatment response using a composite diagnosis score in LUNC and HCC patients. cd-score in patients with and without detectable tumor burden in LUNC (FIG. 21A) when compared to CEA (FIG. 21I) and HCC (FIG. 21E) when compared to AFP (FIG. 21M); cd-score of patients with stage I/II and stage III/IV disease in LUNC (FIG. 21B) when compared to CEA (FIG. 21J) and HCC (FIG. 21F) patients when compared to AFP (FIG. 21N); cd-score in patients before intervention, after surgery, and with recurrence in LUNC (FIG. 21C) when compared to CEA (FIG. 21K) and HCC (FIG. 21G) when compared to AFP (FIG. 21O); cd-score in patients before intervention, with treatment response, and with worsening progression in LUNC (FIG. 21D) when compared to CEA (FIG. 21L) and HCC (FIG. 21H) when compared to AFP (FIG. 21P); FIG. 21Q: The ROC curve and the AUC of cd-score and AFP for LUNC diagnosis in the entire LUNC cohort. FIG. 21R: The ROC curve and the AUC of cd-score and AFP for HCC diagnosis in the entire HCC cohort.

FIG. 22A-FIG. 22F illustrate prognostic prediction in HCC and LUNC survival based on cfDNA methylation profiling. FIG. 22A shows the overall survival curves of HCC patients with low or high risk of death, according to the combined prognosis score (cp-score) in the validation cohort. FIG. 22B shows the overall survival curves of LUNC patients with low or high risk of death, according to the combined prognosis score (cp-score) in the validation dataset. FIG. 22C shows the survival curves of HCC patients with stage I/II and stage III/IV in the validation cohort. FIG. 22D shows the survival curves of patients with stage I/II and stage III/IV LUNC in the validation cohort. FIG. 22E and FIG. 22F show the ROC for 12-month survival predicted by cp-score, CEA, AFP, stage, and cp-score combined with stage of HCC (FIG. 22E) and LUNC (FIG. 22F) in the validation cohort.

FIG. 23A-FIG. 23B illustrate early detection of LUNC using a cfDNA methylation panel. 208 patients who smoke, with lung nodules between 10 mm and 30 mm in size, were enrolled in a prospective trial, and a cfDNA LUNC methylation panel was measured. Patients were divided into a training and a testing cohort (FIG. 23A); receiver operating characteristic (ROC) curves and the associated areas under the curve (AUCs) of the prediction of Stage I LUNC versus benign lung nodules in the validation cohort with 91.4% accuracy (FIG. 23B); table showing prediction results between Stage I LUNC versus benign lung nodules showing high sensitivity and specificity in the validation cohort.

FIG. 24A-FIG. 24D illustrate that methylation markers can differentiate between HCC and liver cirrhosis and detect progression from liver cirrhosis to HCC. A prediction model was first built using 217 HCC and 241 cirrhosis patients, who were divided into a training and a testing cohort (FIG. 24A); receiver operating characteristic (ROC) curves and the associated areas under the curve (AUCs) of the prediction of Stage I HCC versus liver cirrhosis in the validation cohort with 89.9% accuracy (FIG. 24B); table showing prediction results between Stage I HCC and liver cirrhosis in a validation cohort (FIG. 24C); table showing prediction results on progression from liver cirrhosis to Stage I HCC with high sensitivity (89.5%) and specificity (98%) (FIG. 24D).

FIG. 25A illustrates unsupervised hierarchical clustering of top 1000 methylation markers differentially methylated in DNA in HCC and LUNC primary tissues versus normal blood.

FIG. 25B shows unsupervised hierarchical clustering of the top 1000 methylation markers differentially methylated between HCC and LUNC tissue DNA. Each column represents an individual patient and each row represents a CpG marker.

FIG. 25C shows global view of supervised hierarchical clustering of all 888 MCBs in the entire cfDNA dataset.

FIG. 26 illustrates box plots showing the features of MCBs in cohorts. Top plot: mean values and deviations of LASSO MCBs in each one-versus-rest comparison.

FIG. 27 illustrates methylation values correlated with treatment outcomes in HCC and LUNC patients with serial plasma samples. Summary graphs of change in methylation value comparing patients after surgery, with clinical response (Partial Remission (PR) or Stable Disease (SD)), or with disease progression/recurrence (PD).

FIG. 28A shows dynamic monitoring of treatment outcomes using the total methylation copy numbers of an MCB in LUNC patients.

FIG. 28B shows dynamic monitoring of treatment outcomes with the methylation value of an MCB in LUNC patients. PD, progressive disease; PR, partial response; SD, stable disease; chemo, chemotherapy.

FIG. 29 illustrates dynamic monitoring of treatment outcomes using the total methylation copy numbers of an MCB and CEA in HCC patients.

FIG. 30 shows dynamic monitoring of treatment outcomes with the methylation rate of an MCB in HCC patients. Dates of treatments are indicated in the figure. PD, progressive disease; PR, partial response; SD, stable disease; chemo, chemotherapy; TACE, trans-catheter arterial chemoembolization.

FIG. 31A-FIG. 31B illustrate the workflow charts described in Example 5. FIG. 31A illustrates an exemplary workflow for building the diagnostic model and the prognostic model, and for generating the subtypes based on ctDNA methylation. FIG. 31B shows the enrollment and outcomes of the prospective screening cohort study.

FIG. 32A-FIG. 32H illustrate cfDNA methylation analysis for diagnosis of CRC. FIG. 32A: exemplary workflow for building the diagnostic models. FIG. 32B: Unsupervised hierarchical clustering of methylation markers differentially methylated between cancer (CRC) and normal in the training and the validation (FIG. 32C) testing cohort. Each row represents an individual patient and each column represents a CpG marker. FIG. 32D: Receiver operating characteristic (ROC) curves and the associated areas under the curve (AUCs) of the diagnostic prediction model (cd-score) using cfDNA methylation analysis in the training and the validation (FIG. 32E) testing cohort. FIG. 32F: ROC curves and corresponding AUCs of cd-score and CEA for CRC diagnosis. FIG. 32G: Confusion matrices built from diagnostic model prediction in the training and the validation (FIG. 32H) testing cohort.

FIG. 33A-FIG. 33E illustrate prognostic prediction in CRC survival based on cfDNA methylation profiling. FIG. 33A: an exemplary workflow for building the prognostic models. FIG. 33B: Overall survival curves of CRC patients with low or high risk of death, according to the combined prognosis score (cp-score), in the training cohort. FIG. 33C: Overall survival curves of CRC patients with low or high risk of death, according to the cp-score, in the validation cohort. FIG. 33D and FIG. 33E: ROC curves and corresponding AUCs for 12-month survival predicted by cp-score, primary tumor location, TNM stage, CEA status, and all of these combined in the training (FIG. 33D) and validation (FIG. 33E) cohorts.

FIG. 34A illustrates a nomogram for predicting one-year overall survival of CRC patients using cp-score and other clinical factors.

FIG. 34B illustrates a calibration plot of the nomogram in external validation.

FIG. 35A-FIG. 35E illustrate cfDNA methylation subtyping analysis in 801 patients with CRC. FIG. 35A: A schematic diagram showing the core algorithm utilized in the sample clustering. FIG. 35B: Iterative unsupervised clustering of cfDNA methylation markers identified two subtypes/clusters in the training data. Clinical and molecular features are indicated by the annotation bars above the heatmap. Patients without such information are colored in white. Mutation status was defined by a mutation detected in one of the following genes: BRAF, KRAS, NRAS and PIK3CA. FIG. 35C: Silhouette analysis of the clusters in the last iteration. FIG. 35D: Predicted subtypes/clusters of the validation data using the 45 markers. FIG. 35E: upper panel: overall survival for each of the cfDNA methylation subtypes (log-rank test, p<0.05); lower panel: proportion of stage III-IV CRC patients in the two subtypes (Chi-squared test, **P<0.01, *P<0.05; left, training cohort; right, validation cohort).

FIG. 36A-FIG. 36B present a list of Methylation Correlated Blocks (MCBs) used for cd-score generation. FIG. 36A: MCB markers selected by multi-class LASSO. FIG. 36B: Diagnostic marker selection: LASSO-based feature selection identified 13 markers and Random Forest-based feature selection identified 22 markers for discriminating cancer versus normal. There were 9 overlapping markers between these two methods.

FIG. 37A-FIG. 37F show that cfDNA methylation analysis could predict tumor burden, staging, and treatment response using a cd-score in CRC patients. FIG. 37A: cd-score in patients with and without detectable tumor burden; FIG. 37B: cd-score of patients with stage I/II and stage III/IV disease; FIG. 37C: cd-score in patients with primary tumor location on the left or on the right; FIG. 37D: CEA in patients with stage I/II and stage III/IV CRC; FIG. 37E: cd-score in patients before treatment, after surgery, and with tumor recurrence; FIG. 37F: CEA in CRC patients before treatment, after surgery, and with tumor recurrence. Recurrence was defined as a tumor that initially disappeared after treatment/surgery but recurred after a defined period.

FIG. 38A-FIG. 38C illustrate a comparison of subtype markers, diagnosis markers and prognosis markers. FIG. 38A: Venn diagram showing the intersections of the three marker lists. Patients in cluster 2 had higher cp-scores than those in cluster 1 in both the training cohort (FIG. 38B) and the validation cohort (FIG. 38C).

FIG. 39 illustrates patient treatment monitoring with marker methylation level. Dynamic monitoring of treatment outcomes with the methylation value of CpG site cg10673833 (upper panel) and CEA (lower panel) in CRC patients #1-6. Dates of treatments are indicated in the figure. PD, progressive disease; PR, partial response; SD, stable disease; chemo, chemotherapy.

FIG. 40A-FIG. 40B illustrate methylation values correlated with treatment outcomes in CRC patients with serial plasma samples. FIG. 40A: Summary graphs of change in methylation value comparing patients after surgery, with clinical response (Partial Remission (PR) or Stable Disease (SD)), or with disease progression/recurrence (PD). FIG. 40B: Methylation value trends in individual patients after complete surgical resection, with treatment response, and with disease progression. Delta methylation rate denotes the difference in methylation value between before treatment and after treatment. PRE: pre-treatment; POST: post-treatment.

FIG. 41 illustrates the methylation status of CpG marker cg00456086.

FIG. 42 illustrates the methylation status of biomarkers 3-49757316, 8-27183116, 8-141607252, 17-29297711, and 3-49757306.

FIG. 43 illustrates the methylation status of biomarkers 19-43979341, 8-141607236, 5-176829755, 18-13382140, and 15-65341965.

FIG. 44 illustrates the methylation status of biomarkers 15-91129457, 2-1625431, 6-151373292, 6-151373294, and 20-25027093.

FIG. 45 illustrates the methylation status of biomarkers 6-14284198, 10-4049295, 19-59023222, 1-184197132, and 2-131004117.

FIG. 46 illustrates the methylation status of biomarkers 3-13152305, 17-29297770, 8-27183316, 5-176829740, and 19-41316693.

FIG. 47 illustrates the methylation status of biomarkers 18-43830649, 15-65341957, 20-44539531, 7-30265625, and 2-131129567.

FIG. 48 illustrates the methylation status of biomarkers 2-8995417, 12-10782319, 20-25027033, 6-151373256, and 8-86100970.

FIG. 49 illustrates the methylation status of biomarkers 9-4839459, 17-41221574, 1-153926715, 20-25027044, and 20-20177325.

FIG. 50 illustrates the methylation status of biomarkers 176829665, 3-13152273, 8-27183348, 3-49757302, and 19-41316697.

FIG. 51 illustrates the methylation status of biomarkers 8-61821442, 20-44539525, 10-102883105, 11-65849129, and 5-176829639.

FIG. 52 illustrates the methylation status of biomarkers 2-1625443, 20-25027085, 11-69420728, 1-229234865, and 6-13408877.

FIG. 53 illustrates the methylation status of biomarkers 22-50643735, 6-151373308, 1-232119750, 8-134361508, and 6-13408858.

DETAILED DESCRIPTION OF THE DISCLOSURE

Cancer is characterized by an abnormal growth of a cell caused by one or more mutations or modifications of a gene leading to a dysregulated balance of cell proliferation and cell death. DNA methylation silences expression of tumor suppressor genes, and presents itself as one of the first neoplastic changes. Methylation patterns found in neoplastic tissue and plasma demonstrate homogeneity, and in some instances are utilized as a sensitive diagnostic marker. For example, the cMethDNA assay has been shown in one study to be about 91% sensitive and about 96% specific when used to diagnose metastatic breast cancer. In another study, circulating tumor DNA (ctDNA) was about 87.2% sensitive and about 99.2% specific when it was used to identify KRAS gene mutation in a large cohort of patients with metastatic colon cancer (Bettegowda et al., Detection of Circulating Tumor DNA in Early- and Late-Stage Human Malignancies. Sci. Transl. Med., 6(224):224ra24, 2014). The same study further demonstrated that ctDNA is detectable in >75% of patients with advanced pancreatic, ovarian, colorectal, bladder, gastroesophageal, breast, melanoma, hepatocellular, and head and neck cancers (Bettegowda et al.).

Additional studies have demonstrated that CpG methylation pattern correlates with neoplastic progression. For example, in one study of breast cancer methylation patterns, P16 hypermethylation has been found to correlate with early stage breast cancer, while TIMP3 promoter hypermethylation has been correlated with late stage breast cancer. In addition, BMP6, CST6 and TIMP3 promoter hypermethylation have been shown to associate with metastasis into lymph nodes in breast cancer.

In some embodiments, DNA methylation profiling provides higher clinical sensitivity and dynamic range compared to somatic mutation analysis for cancer detection. In other instances, altered DNA methylation signature has been shown to correlate with the prognosis of treatment response for certain cancers. For example, one study illustrated that in a group of patients with advanced rectal cancer, ten differentially methylated regions were used to predict patients' prognosis. Likewise, in a different study, RASSF1A DNA methylation measurement in serum was used to predict a poor outcome in breast cancer patients undergoing adjuvant therapy. In addition, SRBC gene hypermethylation was associated with poor outcome in patients with colorectal cancer treated with oxaliplatin in a different study. Another study has demonstrated that ESR1 gene methylation correlates with clinical response in breast cancer patients receiving tamoxifen. Additionally, ARM gene promoter hypermethylation was shown to be a predictor of long-term survival in breast cancer patients not treated with tamoxifen.

In some embodiments, disclosed herein are methods, probes, and kits for diagnosing the presence of cancer and/or a cancer type. In some instances, described herein is a method of profiling the methylation status of a set of CpG markers (or cg markers). In other instances, described herein is a method for selecting a patient for treatment based on the methylation status of a set of CpG markers (or cg markers).

Methods of Use

DNA methylation is the attachment of a methyl group at the C5-position of the nucleotide base cytosine or the N6-position of adenine. Methylation of adenine primarily occurs in prokaryotes, while methylation of cytosine occurs in both prokaryotes and eukaryotes. In some instances, methylation of cytosine occurs in the CpG dinucleotide motif. In other instances, cytosine methylation occurs in, for example, CHG and CHH motifs, where H is adenine, cytosine or thymine. In some instances, one or more CpG dinucleotide motifs or CpG sites form a CpG island, a short DNA sequence rich in CpG dinucleotides. In some instances, a CpG island is present in the 5′ region of about one half of all human genes. CpG islands are typically, but not always, between about 0.2 and about 1 kb in length. Cytosine methylation further comprises 5-methylcytosine (5-mCyt) and 5-hydroxymethylcytosine.

The CpG (cytosine-phosphate-guanine) or CG motif refers to regions of a DNA molecule where a cytosine nucleotide occurs next to a guanine nucleotide in the linear strand. In some instances, a cytosine in a CpG dinucleotide is methylated to form 5-methylcytosine. In some instances, a cytosine in a CpG dinucleotide is methylated to form 5-hydroxymethylcytosine.
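By way of a non-limiting illustration, the definition above (a cytosine immediately followed by a guanine on the same strand) can be sketched as a simple sequence scan. The function name and the example sequence are hypothetical and are not taken from the disclosure.

```python
def find_cpg_sites(sequence):
    """Return 0-based positions of every cytosine that is immediately
    followed by a guanine on the given strand (a CpG dinucleotide)."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i] == "C" and seq[i + 1] == "G"]


# Hypothetical example sequence, for illustration only.
example = "ACGTTCGCGA"
sites = find_cpg_sites(example)
print(sites)
```

Each returned position marks the cytosine of one CpG site; a cg marker, as used herein, refers to one such site.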

In some instances, one or more DNA regions are hypermethylated. In such cases, hypermethylation refers to an increase in methylation events in a region relative to a reference region. In some cases, hypermethylation is observed in one or more cancer types, and is useful, for example, as a diagnostic marker and/or a prognostic marker.

In some instances, one or more DNA regions are hypomethylated. In some cases, hypomethylation refers to a loss of the methyl group in the 5-methylcytosine nucleotide in a first region relative to a reference region. In some cases, hypomethylation is observed in one or more cancer types, and is useful, for example, as a diagnostic marker and/or a prognostic marker.
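As a minimal sketch of the relative definitions above, a region can be labeled by comparing its methylation level with that of a reference region. The methylation levels are assumed here to be fractions between 0 and 1, and the 0.2 cutoff is an arbitrary illustrative assumption; the disclosure does not specify a threshold.

```python
def classify_methylation(sample_level, reference_level, threshold=0.2):
    """Label a region relative to a reference region.

    Levels are fractions of methylated cytosines (0.0 to 1.0).
    `threshold` is a hypothetical cutoff chosen only for illustration.
    """
    delta = sample_level - reference_level
    if delta >= threshold:
        return "hypermethylated"
    if delta <= -threshold:
        return "hypomethylated"
    return "unchanged"


print(classify_methylation(0.8, 0.3))   # gain of methylation versus reference
print(classify_methylation(0.1, 0.5))   # loss of methylation versus reference
```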

In some embodiments, disclosed herein are CpG methylation markers for diagnosis of a cancer in a subject. In some instances, also disclosed herein is a method of selecting a subject suspected of having cancer for treatment. In some instances, the method comprises (a) contacting treated DNA with at least one probe from a probe panel to generate an amplified product, wherein the at least one probe hybridizes under high stringency condition to a target sequence of a cg marker selected from Table 1, Table 2, Table 7, Table 8, or Table 13, and wherein the treated DNA is processed from a biological sample obtained from the subject; (b) analyzing the amplified product to generate a methylation profile of the cg marker; (c) comparing the methylation profile to a reference model relating methylation profiles of cg markers from Tables 1, 2, 7, 8, and 13 to a set of cancers; (d) based on the comparison of step c), determining: (i) whether the subject has cancer; and (ii) which cancer type the subject has; and (e) administering an effective amount of a therapeutic agent to the subject if the subject is determined to have cancer and the cancer type is determined.

In some instances, the method comprises (a) contacting treated DNA with the probe panel to generate amplified products, wherein each probe of the probe panel hybridizes under high stringency condition to a target sequence of a cg marker selected from Table 1, Table 2, Table 7, or Table 8; (b) analyzing the amplified products to generate a methylation profile of the cg markers targeted by the probe panel; (c) comparing the methylation profile to the reference model relating methylation profiles of cg markers from Tables 1, 2, 7, and 8 to a set of cancers; (d) evaluating an output from the model to determine: (i) whether the subject has cancer; and (ii) which cancer type the subject has; and (e) administering an effective amount of a therapeutic agent to the subject if the subject is determined to have cancer and the cancer type is determined.

In some cases, the biological sample is treated with a deaminating agent to generate the treated DNA.
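For illustration only, the effect of a deaminating agent on a DNA strand can be simulated in silico. The sketch assumes a bisulfite-type agent, which converts unmethylated cytosine to uracil (read as thymine after amplification) while leaving methylated cytosine unchanged; the function name, sequence, and methylated positions are hypothetical.

```python
def deaminate(sequence, methylated_positions):
    """Simulate bisulfite-type deamination of one DNA strand.

    Unmethylated cytosines are reported as 'T' (uracil is read as thymine
    after amplification); cytosines at `methylated_positions` (0-based)
    are protected and remain 'C'.
    """
    protected = set(methylated_positions)
    converted = []
    for i, base in enumerate(sequence.upper()):
        if base == "C" and i not in protected:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical strand in which only the cytosine at position 5 is methylated.
print(deaminate("ACGTTCGCGA", [5]))
```

After such treatment, a probe that distinguishes a retained C from a converted T at a cg marker reports whether that site was methylated.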

In some cases, the at least one probe from the probe panel is a padlock probe.

In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from Table 1.

In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from Table 2.

In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from Table 4.

In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from Table 5.

In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from Table 7.

In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from Table 8.

In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from Table 13.

In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg19516279, cg06100368, cg25945732, cg19155007, cg17952661, cg04072843, cg01250961, cg08131100, cg03788131, cg17528648, cg07784526, cg18948743, cg23986470, cg00846300, cg01029638, cg08350814, cg05098590, cg18085998, cg06532037, cg15313226, cg16232979, cg26149167, cg01237565, cg16561543, cg13771313, cg13771313, cg08169020, cg08169020, cg21153697, cg07326648, cg14309384, cg20923716, cg09095222, cg22220310, cg21950459, cg13332729, cg10802543, cg20707333, cg13169641, cg25352342, cg09921682, cg02504622, cg17373759, cg06547203, cg06826710, cg00902147, cg17609887, cg15721142, cg08116711, cg00736681, cg18834029, cg06969479, cg24630516, cg16901821, cg20349803, cg23610994, cg19313373, cg16508600, cg24096323, cg24746106, cg12288267, cg10430690, cg24408776, cg05630192, cg12028674, cg24820270, cg12028674, cg26718707, cg10349880, cg09921682, cg25934700, cg14164596, cg24461337, cg23041410, cg07366553, cg26859666, cg06405341, cg08557188, cg00690392, cg03421440, cg07077277, cg00456086, or cg20702527. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg19516279, cg06100368, cg20349803, cg23610994, cg19313373, cg16508600, or cg24096323. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg25945732, cg19155007, cg17952661, cg25934700, cg14164596, cg24461337, cg23041410, cg07366553, cg26859666, or cg00456086. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg04072843, cg01250961, cg24746106, cg12288267, or cg10430690. 
In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg08131100, cg03788131, cg17528648, cg07784526, cg18948743, cg23986470, cg00846300, cg25352342, cg09921682, cg02504622, cg17373759, cg12028674, cg24820270, cg12028674, cg26718707, cg10349880, or cg09921682. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg01029638, cg08350814, cg05098590, cg18085998, cg06532037, cg15313226, cg16232979, cg26149167, cg06547203, cg06826710, cg00902147, cg17609887, or cg15721142. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg01237565, cg16561543, or cg08116711. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg13771313, cg13771313, or cg08169020. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg08169020, cg21153697, cg07326648, cg14309384, cg20923716, cg22220310, cg21950459, cg13332729, cg10802543, cg20707333, or cg13169641. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker cg09095222. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg00736681 or cg18834029. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg06969479, cg24630516, or cg16901821. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker cg24408776. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker cg05630192. 
In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker cg06405341. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg08557188, cg00690392, cg03421440, or cg07077277. In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a cg marker selected from cg10673833, cg10493436, cg10428836, cg27284288, cg16959747, cg17494199, cg23678254, cg24067911, or cg25459300.

In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a gene selected from a gene panel consisting of BMPR1A, PSD, ARHGAP25, KLF3, PLAC8, ATXN1, Chromosome 6:170, Chromosome 6:3, ATAD2, and Chromosome 8:20.

In some instances, the at least one probe hybridizes under high stringency conditions to a target sequence of a gene selected from a gene panel consisting of MYO1G, ADAMTS4, BMPR1A, CD6, RBP5, Chr 13:10, LGAP5, ATXN1, and Chr 8:20.

In some instances, the reference model comprises methylation profiles of cg markers from Tables 1 and 2 generated from samples of known cancer types. In some cases, the reference model further comprises methylation profiles of cg markers from Tables 1 and 2 generated from normal samples. In some cases, the reference model comprises methylation profiles of cg markers from Tables 1 and 2 generated from tissue samples.

In some instances, the reference model comprises methylation profiles of cg markers from Tables 7 and 8 generated from samples of known cancer types. In some cases, the reference model further comprises methylation profiles of cg markers from Tables 7 and 8 generated from normal samples. In some cases, the reference model comprises methylation profiles of cg markers from Tables 7 and 8 generated from tissue samples.

In some instances, the reference model comprises methylation profiles of cg markers from Table 13 generated from samples of known cancer types. In some cases, the reference model further comprises methylation profiles of cg markers from Table 13 generated from normal samples. In some cases, the reference model comprises methylation profiles of cg markers from Table 13 generated from tissue samples.

In some cases, the reference model is developed using an algorithm selected from one or more of the following: a principal component analysis, a logistic regression analysis, a nearest neighbor analysis, a support vector machine, and a neural network model.
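As one non-limiting sketch of the listed algorithms, a nearest neighbor analysis can compare a subject's methylation profile with reference profiles of known class and return the label of the closest one. The marker values, class labels, and function names below are hypothetical placeholders and do not come from the Tables; a practical reference model would be trained on many samples per class.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length methylation profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor_predict(profile, reference):
    """Return the label of the reference profile closest to `profile`.

    `reference` is a list of (label, profile) pairs, where each profile
    holds methylation values for the same ordered set of cg markers.
    """
    label, _ = min(reference, key=lambda item: euclidean(profile, item[1]))
    return label


# Hypothetical reference model with three markers per profile.
reference_model = [
    ("normal", [0.10, 0.10, 0.10]),
    ("cancer type A", [0.90, 0.20, 0.80]),
    ("cancer type B", [0.20, 0.90, 0.30]),
]

print(nearest_neighbor_predict([0.85, 0.25, 0.70], reference_model))
```

The single output answers both determinations of step (d): a non-normal label indicates cancer, and the label itself names the cancer type.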

In some embodiments, the analyzing described above comprises quantitatively detecting the methylation status of the amplified product. In some cases, the detection comprises a real-time quantitative probe-based PCR or a digital probe-based PCR. In some cases, the detection comprises a real-time quantitative probe-based PCR. In other cases, the detection comprises a digital probe-based PCR, optionally, a digital droplet PCR.
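By way of example only, a quantitative methylation value can be derived from digital droplet PCR counts as sketched below. The sketch assumes two probe channels (one reporting the methylated allele, one the unmethylated allele) and applies the standard Poisson correction for droplets that contain more than one template; these assumptions, the function names, and the droplet counts are illustrative and are not specified in the disclosure.

```python
import math


def copies_from_droplets(positive, total):
    """Estimate template copies from droplet counts using the Poisson
    correction: mean copies per droplet = -ln(1 - positive / total)."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    mean_per_droplet = -math.log(1.0 - positive / total)
    return mean_per_droplet * total


def methylation_value(methylated_positive, unmethylated_positive, total):
    """Fraction of target copies that are methylated."""
    m = copies_from_droplets(methylated_positive, total)
    u = copies_from_droplets(unmethylated_positive, total)
    if m + u == 0:
        raise ValueError("no target copies detected")
    return m / (m + u)


# Hypothetical counts: 10,000 droplets read in both channels.
print(methylation_value(1200, 4800, 10000))
```

Here `copies_from_droplets` corresponds to a total methylation copy number and `methylation_value` to a methylation rate, the two readouts plotted in the monitoring figures.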

In some embodiments, the treatment comprises a chemotherapeutic agent or an agent for a targeted therapy. Exemplary chemotherapeutic agents include, but are not limited to, cisplatin, doxorubicin, fluoropyrimidine, gemcitabine, irinotecan, mitoxantrone, oxaliplatin, thalidomide, or a combination thereof. In some cases, the chemotherapeutic agent comprises cisplatin, doxorubicin, fluoropyrimidine, gemcitabine, irinotecan, mitoxantrone, oxaliplatin, thalidomide, or a combination thereof. In some instances, the treatment comprises an agent for a targeted therapy. In additional instances, the treatment comprises surgery.

In some instances, the biological sample is a blood sample, a urine sample, a saliva sample, a sweat sample, or a tear sample. In some cases, the biological sample is a blood sample or a urine sample. In some cases, the biological sample is a tissue biopsy sample. In some cases, the biological sample is a cell-free DNA sample. In some cases, the biological sample comprises circulating tumor cells.

In some embodiments, also disclosed herein is a method of detecting the methylation status of a set of cg markers. In some embodiments, the method comprises (a) processing a biological sample obtained from a subject with a deaminating agent to generate treated DNA comprising deaminated nucleotides; (b) contacting the treated DNA with at least one probe that hybridizes under high stringency condition to a target sequence of a cg marker from Table 1, Table 2, Table 7, Table 8, Table 13, Table 14, or Table 20; and (c) quantitatively detecting the methylation status of the cg marker, wherein said detection comprises a real-time quantitative probe-based PCR or a digital probe-based PCR.

In some embodiments, the method of detecting the methylation status of a set of cg markers comprises (a) processing a biological sample obtained from a subject with a deaminating agent to generate treated DNA comprising deaminated nucleotides; (b) contacting the treated DNA with at least one probe that hybridizes under high stringency condition to a target sequence of a cg marker from Table 1 or Table 2; and (c) quantitatively detecting the methylation status of the cg marker, wherein said detection comprises a real-time quantitative probe-based PCR or a digital probe-based PCR.

In some cases, the detection comprises a real-time quantitative probe-based PCR or a digital probe-based PCR. In some cases, the detection comprises a real-time quantitative probe-based PCR. In other cases, the detection comprises a digital probe-based PCR, optionally, a digital droplet PCR.