DNA METHYLATION SIGNATURES OF CANCER IN HOST PERIPHERAL BLOOD MONONUCLEAR CELLS AND T CELLS

Information

  • Patent Application
  • 20190345559
  • Publication Number
    20190345559
  • Date Filed
    June 23, 2016
  • Date Published
    November 14, 2019
Abstract
Disclosed is a DNA methylation signature in peripheral blood mononuclear cells (PBMC) for predicting hepatocellular carcinoma (HCC) stages and chronic hepatitis, the signature consisting of CG IDs. Also disclosed are kits and uses for the DNA methylation signature.
Description
FIELD OF THE INVENTION

The invention relates to DNA methylation signatures in human DNA, particularly in the field of molecular diagnostics.


BACKGROUND OF THE INVENTION

Hepatocellular carcinoma (HCC) is the fifth most common cancer worldwide (1). It is particularly prevalent in Asia, and its occurrence is highest in areas where hepatitis B is prevalent, indicating a possible causal relationship (2). Follow-up of high-risk populations such as chronic hepatitis patients and early diagnosis of transitions from chronic hepatitis to HCC would improve cure rates. The survival rate of hepatocellular carcinoma is currently extremely low because it is almost always diagnosed at late stages. Liver cancer could be effectively treated, with cure rates of >80%, if diagnosed early (1). Advances in imaging have improved noninvasive detection of HCC (3, 4). However, current diagnostic methods, which include imaging and immunoassays with single proteins such as alpha-fetoprotein, often fail to diagnose HCC early (2). These challenges are not limited to HCC but are common to other cancers as well. Molecular diagnosis of cancer is focused on tumors and biomaterial originating in the tumor, including tumor DNA in plasma (5, 6), circulating tumor cells (7) and the tumor-host microenvironment (8, 9). The prevailing and widely accepted hypothesis is that molecular changes that drive cancer initiation and progression originate primarily in the tumor itself and that relevant changes in the host occur primarily in the tumor microenvironment. The identity of immune cells in the tumor microenvironment has therefore attracted significant attention (10, 11).


DNA methylation, a covalent modification of DNA that is a primary mechanism of epigenetic regulation of genome function, is ubiquitously altered in tumors (12-15), including HCC (16). DNA methylation profiles of tumors distinguish different stages of tumor progression and are potentially robust tools for tumor classification, prognosis and prediction of response to chemotherapy (17). The major drawback of using tumor DNA methylation in early diagnosis is that it requires invasive procedures and anatomical visualization of the suspected tumor. Circulating tumor cells are a noninvasive source of tumor DNA and are used for measuring DNA methylation in tumor suppressor genes (18). Hypomethylation of HCC DNA is detectable in patients' blood (19), and genome wide bisulfite sequencing was recently applied to detect hypomethylated DNA in plasma from HCC patients (20). However, this source is limited, particularly at early stages of cancer, and the DNA methylation profiles are confounded by host DNA methylation profiles.


The idea that host immuno-surveillance plays an important role in tumorigenesis by eliminating tumor cells and suppressing tumor growth was proposed by Paul Ehrlich (21, 22) more than a century ago and has since fallen out of favor. However, accumulating data from both animal and human clinical studies suggest that the host immune system plays an important role in tumorigenesis through “immuno-editing”, which involves three stages: elimination, equilibrium and escape (23-25). The presence of tumor-infiltrating cytotoxic CD8+ T cells was associated with better prognosis in several clinical studies of human regressive melanoma (26-31), esophageal (32), ovarian (33, 34), and colorectal cancer (35-37). The immune system is believed to be responsible for the phenomenon of cancer dormancy, in which circulating cancer cells are detectable in the absence of clinical symptoms (15, 38). Interestingly, recent DNA methylation and transcriptome analysis of tumors revealed tumor stage specific immune signatures of infiltrating lymphocytes (39, 40). However, these signatures represent targeted immune cells in the tumor microenvironment, and utilization of such signatures for early diagnosis requires invasive procedures. The tumor-infiltrating immune cells represent only a minor fraction of peripheral blood cells (41-44). Global DNA methylation changes were previously reported in leukocytes, and EWAS studies revealed differences in DNA methylation in leukocytes from bladder, head and neck, and ovarian cancer patients; these differences were independent of differences in white blood cell distribution (45). These studies were mainly aimed at identifying underlying DNA methylation changes in cancer genes that might serve as surrogate markers for changes in DNA methylation in the tumor. However, the question of whether the peripheral host immune system exhibits a distinct DNA methylation response to the cancer state that correlates with cancer progression has not been addressed.


SUMMARY OF THE INVENTION

The inventors have found that cancer progression is associated with distinct DNA methylation profiles in the host peripheral immune cells. The present invention also shows that these DNA methylation markers differentiate between cancer and the underlying chronic inflammatory liver disease.


The present invention illustrates these DNA methylation profiles in a discovery set of 69 people from the Beijing area of China (10 controls, 10 patients in each of the hepatitis B, hepatitis C and HCC stage 1, 2 and 3 groups, and 9 patients with HCC stage 4), with HCC staged using the EASL-EORTC Clinical Practice Guidelines for HCC (Table 1). The present invention used a whole genome approach (Illumina 450k arrays) to delineate DNA methylation profiles without preconceived bias on the type of genes that might be involved. This invention demonstrates for the first time specific DNA methylation profiles of hepatitis B and C that are distinct from HCC, as well as DNA methylation profiles for each of the different stages of HCC, in peripheral blood mononuclear cells. These profiles do not show a significant overlap with the DNA methylation profiles of HCC tumors that have been previously described (16), suggesting that they reflect changes in the genomic functions of peripheral blood mononuclear cells and are not surrogates of changes in tumor DNA methylation. Thus, this invention reveals the DNA methylation changes in the host immune system in cancer. This invention also reveals a DNA methylation signature in host T cells in people suffering from cancer. The present invention also shows that there is a significant overlap between DNA methylation profiles delineated in PBMCs and T cells. The present invention validates 4 genes that were differentially methylated in T cells from HCC patients in the discovery cohort by pyrosequencing of T cell DNA in a separate cohort of patients (n=79).


The present invention demonstrates the utility of these DNA methylation signatures in predicting cancer and stage of cancer of unknown samples using statistical models based on the signatures. This invention has important implications for understanding the mechanisms of the disease and its treatment, and provides noninvasive diagnostics of cancer using peripheral blood mononuclear cell DNA. This invention could be used by any person skilled in the art to derive DNA methylation signatures in the immune system of any cancer using any method for genome wide methylation mapping that is available to those skilled in the art, such as, for example, genome wide bisulfite sequencing, capture sequencing, methylated DNA Immunoprecipitation (MeDIP) sequencing and any other method of genome wide methylation mapping that becomes available.


Preferred embodiments of the present invention are as follows.


In the first aspect, the present invention provides a DNA methylation signature of cancer in peripheral blood mononuclear cells (PBMC) for predicting cancer, wherein said DNA methylation signature is derived using genome wide DNA methylation mapping methods, such as Illumina 450K or 850K arrays, genome wide bisulfite sequencing, methylated DNA Immunoprecipitation (MeDIP) sequencing or hybridization with oligonucleotide arrays.


In one embodiment, the DNA methylation signature is CG IDs derived from PBMC DNA listed below for predicting hepatocellular carcinoma (HCC) stages and chronic hepatitis using either PBMC or T cells DNA methylation levels of said CG IDs.



















cg05375333
cg24304617
cg08649216
cg15775914
cg06098530
cg04536922


cg23679141
cg26009832
cg06908855
cg21585138
cg15514380
cg20838429


cg01546046
cg27090007
cg11412036
cg00744866
cg19988492
cg21542922


cg10036013
cg24958366
cg23824801
cg08306955
cg00361155
cg11356004


cg12829666
cg17479131
cg27408285
cg15009198
cg05423018
cg19140262


cg15011899
cg27644327
cg01810593
cg18878210
cg13710613
cg05033369


cg02001279
cg11031737
cg19795616
cg02717454
cg07072643
cg09048334


cg15188939
cg09800500
cg27284331
cg22344162
cg04018625
cg04385818


cg23311108
cg02313495
cg08575688
cg26923863
cg01238991
cg01214050


cg09789584
cg16324306
cg05486191
cg15447825
cg17741339
cg14361741


cg22301128
cg02914652
cg04171808
cg04771084
cg18132851
cg16292016


cg11737318
cg11057824
cg14276584
cg23981150
cg02556954
cg14783904


cg07118376
cg26407558
cg03496780
cg24383056
cg01359822
cg26250154


cg13978347
cg09451574
cg14375111
cg24232444
cg22747380
cg02758552


cg23544996
cg21156970
cg08944236
cg22281935
cg00211609
cg21811450


cg16306870
cg01732538
cg02142483
cg22110158
cg11911769
cg03432151


cg03731740
cg10312296
cg23102014
cg04398282
cg15755348
cg08455089


cg02749789
cg17704839
cg25683268
cg08946713
cg25195795
cg17766305


cg08123444
cg24742520
cg20460227
cg24056269
cg06151145
cg06349546


cg15747825
cg14983135
cg17163729
cg15118835
cg00568910
cg23017594


cg23829949
cg21164050
cg01417062
cg14189441
cg15146122
cg12813441


cg16712679
cg06879746
cg13146484
cg16111924
cg13615971
cg01411912


cg12820627
cg27057509
cg18417954
cg27089675
cg06194421
cg15374754


cg17534034
cg23857976
cg13913085
cg07128102
cg01966878
cg00093544


cg05591270
cg05228338
cg12705693
cg18556587
cg16565409
cg14711743


cg13219008
cg24783785
cg21579239
cg02863594
cg03044573
cg00483304


cg15607708
cg27457290
cg10274682
cg08577341
cg10469659
cg24376286


cg22475353
cg14199837
cg19389852
cg12306086
cg16240816
cg27638509


cg27296330
cg25104397
cg01839860
cg21700582
cg21487856
cg11300809


cg24449629
cg20592700
cg20222519
cg14774438
cg23486701
cg09244071


cg12177922
cg27010159
cg02272851
cg15123819
cg24640156
cg00014638


cg23004466
cg14898127
cg14734614
cg00759807
cg05086021
cg00697672


cg01696603
cg11783497
cg27120934
cg07929642
cg03899643
cg01116137


cg03639671
cg08861115
cg10078703
cg08134863
cg11556164
cg20250700


cg10203922
cg15966610
cg05099186
cg20228731
cg25135755
cg15867698


cg13749822
cg13299325
cg11767757
cg23493018
cg08113187
cg11151251


cg12263794
cg22547775
cg09545443
cg04071270
cg27588356
cg05577016


cg23157190
cg22945413
cg20427318
cg20750319
cg01611777
cg01933228


cg21406217
cg15046123
cg01698579
cg12050434
cg12299554
cg11006453


cg08247053
cg26405097
cg12691488
cg00458932
cg14356440
cg03555836


cg26576206
cg03483626
cg08568561
cg25708982
cg18482303
cg02482718


cg07212747
cg14531436
cg13943141
cg12592365
cg15323084
cg24065504


cg22872033
cg20587236
cg13619522
cg19780570
cg22876402
cg09340198


cg27186013
cg24284882
cg05502766
cg20187173
cg17092349
cg22143698


cg19851487
cg17226602
cg06445016
cg07772781
cg02782634
cg07065759


cg03481488
cg22707529
cg10895875
cg01828328
cg09987993
cg21751540


cg12598524
cg19945957
cg08634082
cg05725404
cg26401541
cg20956548


cg10761639
cg05460226
cg20944521
cg14426660
cg00248242
cg18731803


cg00350932
cg25364972
cg03252499
cg04998202
cg09514545
cg09639931


cg14914552
cg00754989
cg14762436
cg07381872
cg16476382
cg16810031


cg07504763
cg01994308
cg19266387
cg14193653
cg00189276
cg10861953


cg25279586
cg23837109
cg17934470
cg22675447
cg08858441
cg12628061


cg12019814
cg10892950
cg00758915
cg09479286
cg20874210
cg06874640


cg05941376
cg02976588
cg27143049
cg00426720
cg00321614
cg15006843


cg23044884
cg24576298
cg23880736
cg05999692
cg08226047
cg25522867


cg15891076
cg12344600
cg04090347
cg10784548
cg02265379
cg01124132


cg07145988
cg27544294
cg22515654
cg12201380
cg19925215
cg10536529


cg09635768
cg00448395
cg03062944
cg05961707
cg10995381
cg16517298


cg18882449
cg03909800









In one embodiment, the DNA methylation signature is CG IDs derived from T cells listed below for predicting HCC stages and chronic hepatitis using PBMC or T cells DNA methylation levels of said CG IDs.



















cg00014638
cg02015053
cg03568507
cg06098530
cg08313420
cg10918327


cg00052964
cg02086310
cg03692651
cg06168204
cg08479516
cg10923662


cg00167275
cg02132714
cg03764364
cg06279274
cg08566455
cg11065621


cg00168785
cg02142483
cg03853208
cg06445016
cg08641990
cg11080540


cg00257775
cg02152108
cg03894796
cg06477663
cg08644463
cg11157127


cg00399683
cg02193146
cg03909800
cg06488150
cg08826152
cg11231949


cg00404641
cg02314201
cg03911306
cg06568880
cg08946713
cg11262262


cg00431894
cg02322400
cg03942932
cg06652329
cg09122035
cg11556164


cg00434461
cg02490460
cg03976645
cg06816239
cg09259081
cg11692124


cg00452133
cg02536838
cg04083575
cg06822816
cg09324669
cg11706775


cg00500229
cg02556954
cg04116354
cg06850005
cg09555124
cg11718162


cg00674365
cg02710015
cg04192168
cg06895913
cg09639931
cg11909467


cg00772991
cg02717454
cg04398282
cg07019386
cg09681977
cg11955727


cg00804338
cg02750262
cg04536922
cg07052063
cg09696535
cg11958644


cg00815832
cg02849693
cg04656070
cg07065759
cg09750084
cg12019814


cg00898013
cg02863594
cg04771084
cg07145988
cg10036013
cg12099423


cg01044293
cg02914652
cg04864807
cg07249730
cg10061361
cg12161228


cg01116137
cg02939781
cg04998202
cg07266910
cg10091662
cg12299554


cg01124132
cg02976588
cg05084827
cg07381872
cg10167378
cg12315391


cg01254303
cg02991085
cg05107535
cg07385778
cg10184328
cg12427303


cg01305421
cg03035849
cg05132077
cg07721852
cg10185424
cg12549858


cg01359822
cg03151810
cg05157625
cg07772781
cg10196532
cg12583076


cg01366985
cg03204322
cg05217983
cg07834396
cg10274682
cg12649038


cg01405107
cg03215181
cg05304366
cg07850527
cg10341310
cg12691488


cg01413790
cg03400131
cg05348875
cg07912766
cg10530883
cg12727605


cg01557792
cg03441844
cg05429448
cg08038033
cg10549831
cg12777448


cg01832672
cg03461110
cg05460226
cg08113187
cg10555744
cg12789173


cg01921773
cg03541331
cg05512157
cg08123444
cg10584024
cg12856392


cg01927745
cg03544320
cg05554346
cg08280368
cg10890302
cg12868738


cg01992590
cg03546163
cg05759347
cg08306955
cg10909506
cg12880685


cg12906381
cg15009198
cg17335387
cg19795616
cg22404498
cg24919348


cg12963656
cg15011899
cg17372657
cg19841369
cg22589728
cg25100962


cg12970155
cg15046123
cg17597631
cg19930116
cg22656550
cg25104397


cg13260278
cg15109018
cg17718703
cg19988492
cg22668906
cg25174412


cg13286116
cg15145341
cg17741339
cg20197130
cg22675447
cg25188006


cg13308137
cg15302376
cg17765025
cg20222519
cg22747380
cg25310233


cg13401703
cg15331834
cg17766305
cg20478129
cg22945413
cg25353287


cg13404054
cg15514380
cg17775490
cg20585841
cg23299919
cg25459280


cg13405775
cg15514896
cg17786894
cg20587236
cg23486701
cg25461186


cg13435137
cg15598244
cg17837517
cg20606062
cg23771949
cg25502144


cg13466988
cg15695738
cg17988310
cg20625523
cg23824902
cg25673720


cg13679714
cg15704219
cg18031596
cg20769177
cg23829949
cg25779483


cg13896699
cg15720112
cg18051353
cg20781967
cg23880736
cg25784220


cg13904970
cg15747825
cg18128914
cg20995304
cg23944804
cg25891647


cg13912027
cg15756407
cg18132851
cg21092324
cg24056269
cg25964728


cg13939291
cg15867698
cg18182216
cg21222426
cg24065504
cg26015683


cg14140403
cg16111924
cg18214661
cg21226442
cg24070198
cg26250154


cg14242995
cg16218221
cg18273840
cg21358380
cg24142603
cg26325335


cg14276584
cg16259904
cg18297196
cg21384492
cg24169486
cg26402555


cg14326196
cg16292016
cg18370682
cg21386573
cg24232444
cg26405097


cg14362178
cg16306870
cg18417954
cg21487856
cg24383056
cg26407558


cg14376836
cg16496269
cg18766900
cg21816330
cg24405716
cg26465602


cg14419424
cg16512390
cg18804667
cg21833076
cg24453118
cg26475911


cg14734614
cg16763089
cg18808261
cg21918548
cg24536818
cg26594335


cg14762436
cg16810031
cg19095568
cg22088248
cg24616553
cg26803268


cg14774438
cg16894855
cg19140262
cg22143698
cg24631428
cg26827373


cg14858267
cg16924102
cg19193595
cg22256433
cg24680439
cg26856443


cg14898127
cg17144149
cg19266387
cg22301128
cg24716416
cg26876834


cg14914552
cg17173975
cg19760965
cg22303909
cg24729928
cg26963367


cg15000827
cg17221813
cg19768229
cg22374742
cg24742520
cg27010159


cg27098685
cg27113419
cg27186013
cg27207470
cg27247736
cg27300829


cg27406664
cg27408285
cg27544294
cg27576694









In one embodiment, the DNA methylation signature is CG IDs listed below for predicting different stages of HCC using DNA methylation measurements of said CG IDs in T cells or PBMC obtained by using statistical models such as penalized regression or clustering analysis.


Target CG IDs for separating HCC stage 1 from controls: cg14983135, cg10203922, cg05941376, cg14762436, cg12019814, cg14426660, cg18882449, cg02914652;


Target CG IDs for separating HCC stage 2 from controls: cg05941376, cg15188939, cg12344600, cg03496780, cg12019814;


Target CG IDs for separating HCC stage 3 from controls: cg05941376, cg02782634, cg27284331, cg12019814, cg23981150;


Target CG IDs for separating HCC stage 4 from controls: cg02782634, cg05941376, cg10203922, cg12019814, cg14914552, cg21164050, cg23981150;


Target CG IDs for separating HCC stage 1 from hepatitis B: cg05941376, cg10203922, cg11767757, cg04398282, cg11151251, cg24742520, cg14711743;


Target CG IDs for separating HCC stage 1 from stage 2-4: cg03252499, cg03481488, cg04398282, cg10203922, cg11783497, cg13710613, cg14762436, cg23486701;


Target CG IDs for separating HCC stage 2 from stage 3-4: cg02914652, cg03252499, cg11783497, cg11911769, cg12019814, cg14711743, cg15607708, cg20956548, cg22876402, cg24958366;


Target CG IDs for separating HCC stage 1-3 from stage 4: cg02782634, cg11151251, cg24958366, cg06874640, cg27284331, cg16476382, cg14711743.


In one embodiment, the DNA methylation signature is CG IDs listed below for predicting stages of HCC using DNA methylation measurements of said CG IDs in T cells or PBMC obtained by using statistical models such as penalized regression or clustering analysis:


















cg14983135
cg10203922
cg05941376
cg14762436
cg12019814


cg03496780
cg02782634
cg27284331
cg23981150
cg14914552


cg13710613
cg23486701
cg11911769
cg14711743
cg15607708


cg14426660
cg18882449
cg02914652
cg15188939
cg12344600


cg21164050
cg03252499
cg03481488
cg04398282
cg11783497


cg20956548
cg22876402
cg24958366
cg11151251
cg06874640


cg16476382
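As an illustration only, the pairwise target lists and the combined list above can be held in a simple lookup structure. The sketch below is not part of the disclosed method; the dictionary keys and the helper name `union_of_panels` are labels invented for this example. In this transcription, the combined list coincides with the union of the stage-related panels once the stage 1 versus hepatitis B panel is left out.

```python
# Pairwise target panels transcribed from the lists above. The dictionary keys
# are illustrative labels chosen for this sketch, not names from the application.
TARGET_PANELS = {
    "stage1_vs_controls": [
        "cg14983135", "cg10203922", "cg05941376", "cg14762436",
        "cg12019814", "cg14426660", "cg18882449", "cg02914652",
    ],
    "stage2_vs_controls": [
        "cg05941376", "cg15188939", "cg12344600", "cg03496780", "cg12019814",
    ],
    "stage3_vs_controls": [
        "cg05941376", "cg02782634", "cg27284331", "cg12019814", "cg23981150",
    ],
    "stage4_vs_controls": [
        "cg02782634", "cg05941376", "cg10203922", "cg12019814",
        "cg14914552", "cg21164050", "cg23981150",
    ],
    "stage1_vs_hepatitis_b": [
        "cg05941376", "cg10203922", "cg11767757", "cg04398282",
        "cg11151251", "cg24742520", "cg14711743",
    ],
    "stage1_vs_stage2_4": [
        "cg03252499", "cg03481488", "cg04398282", "cg10203922",
        "cg11783497", "cg13710613", "cg14762436", "cg23486701",
    ],
    "stage2_vs_stage3_4": [
        "cg02914652", "cg03252499", "cg11783497", "cg11911769", "cg12019814",
        "cg14711743", "cg15607708", "cg20956548", "cg22876402", "cg24958366",
    ],
    "stage1_3_vs_stage4": [
        "cg02782634", "cg11151251", "cg24958366", "cg06874640",
        "cg27284331", "cg16476382", "cg14711743",
    ],
}

# The combined list of CG IDs given directly above.
COMBINED_SIGNATURE = [
    "cg14983135", "cg10203922", "cg05941376", "cg14762436", "cg12019814",
    "cg03496780", "cg02782634", "cg27284331", "cg23981150", "cg14914552",
    "cg13710613", "cg23486701", "cg11911769", "cg14711743", "cg15607708",
    "cg14426660", "cg18882449", "cg02914652", "cg15188939", "cg12344600",
    "cg21164050", "cg03252499", "cg03481488", "cg04398282", "cg11783497",
    "cg20956548", "cg22876402", "cg24958366", "cg11151251", "cg06874640",
    "cg16476382",
]


def union_of_panels(panels, exclude=()):
    """Return the set of CG IDs found in any panel whose name is not excluded."""
    ids = set()
    for name, cg_ids in panels.items():
        if name not in exclude:
            ids.update(cg_ids)
    return ids


all_ids = union_of_panels(TARGET_PANELS)
stage_ids = union_of_panels(TARGET_PANELS, exclude=("stage1_vs_hepatitis_b",))
print(len(all_ids), len(stage_ids), stage_ids == set(COMBINED_SIGNATURE))
```

Keeping each comparison as a named panel makes it straightforward to pull the CG IDs needed for one question (for example, stage 1 versus controls) without re-reading the full list.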









In the second aspect, the present invention provides a kit for predicting cancer, comprising means and reagents for detecting DNA methylation measurements of the DNA methylation signature.


In one embodiment, the present invention provides a kit for predicting hepatocellular carcinoma (HCC) stages and chronic hepatitis, comprising means and reagents for detecting DNA methylation measurements of the CG IDs of Table 3 of the embodiments.


In one embodiment, the present invention provides a kit for predicting HCC stages and chronic hepatitis, comprising means and reagents for detecting DNA methylation measurements of the CG IDs of Table 6 of the embodiments.


In one embodiment, the present invention provides a kit for predicting different stages of HCC, comprising means and reagents for detecting DNA methylation measurements of the CG IDs of Table 4 of the embodiments.


In one embodiment, the present invention provides a kit for predicting stages of HCC, comprising means and reagents for detecting DNA methylation measurements of the CG IDs of Table 5 of the embodiments.


In the third aspect, the present invention provides gene pathways that are epigenetically regulated in the peripheral immune system in cancer.


In the fourth aspect, the present invention provides use of the CG IDs disclosed in the present invention. In one embodiment, the present invention provides use of DNA pyrosequencing methylation assays for predicting HCC by using the CG IDs listed above, for example using the primers disclosed below for AHNAK (outside forward GGATGTGTCGAGTAGTAGGGT, outside reverse CCTATCATCTCCACACTAACGCT, nested forward TGTTAGGGGTGATTTTTAGAGG, nested reverse ATTAACCCCATTTCCATCCTAACTATCTT, and sequencing primer TTTTAGAGGAGTTTTTTTTTTTTA);


SLFN2L (outside forward GTGATYTTGGTYAYTGTAAYYT, Outside reverse TCTCATCTTTCCATARACATTTATTTAR, forward nested AGGGTTTYAYTATATTAGYYAGGTTGG, reverse nested ATRCAAACCATRCARCCCTTTTRC, sequencing primer YYYAAAATAYTGAGATTATAGGTGT);


AKAP7 (outside forward TAGGAGAAAGGGTTTATTGTGGT, outside reverse ACACACCCTACCTTTTTCACTCCA, nested forward GGTATTGATTTATGGTTAGGGATTTATAG, nested reverse AAACAAAAAAAACTCCACCTCCAATCC, sequencing primer GGGATTTATAGTTTTGTGAGA); and


STAP1 (outside forward AGTYATGTYTTYTGYAAATAAAAATGGAYAYY, outside reverse TTRCTTTTTAACCACCAACACTACC, nested forward YYGTTTYTTTYATYTTYTGGTGATGTTAA, nested reverse ARARRRCAATCTCTRRRTAATCCACATRTR, sequencing primer GGTGATGTTAATYTTYTGTTTA).


In one embodiment, the present invention provides use of Receiver operating characteristics (ROC) assays for predicting HCC by using the CG IDs listed above, for example STAP1 (cg04398282). In one embodiment, the present invention provides use of hierarchical clustering analysis for predicting HCC by using the CG IDs listed above.


In the fifth aspect, the present invention provides a method for identifying a DNA methylation signature for predicting disease, comprising the step of performing statistical analysis on DNA methylation measurements obtained from samples.


In one embodiment, the method comprises the step of performing statistical analysis on DNA methylation measurements obtained from samples, said DNA methylation measurements are obtained by performing Illumina Beadchip 450K or 850K assay of DNA extracted from sample. In one embodiment, said DNA methylation measurements are obtained by performing DNA pyrosequencing, mass spectrometry based (Epityper™) or PCR based methylation assays of DNA extracted from sample.


In one embodiment, the method comprises the step of performing statistical analysis on DNA methylation measurements obtained from samples; said statistical analysis includes Pearson correlation.
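A minimal sketch of the Pearson correlation step is given below, assuming that each sample carries a methylation level for one CG site and a disease stage coded as a number (0 for controls, 1 to 4 for HCC stages). The coding, the function name `pearson_r` and all values are made up for illustration and are not data from this application.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical example: stage coded 0-4 and made-up methylation levels
# (fraction methylated) at a single CG site that loses methylation with stage.
stage = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
methylation = [0.82, 0.80, 0.74, 0.71, 0.63, 0.66, 0.52, 0.55, 0.41, 0.44]

r = pearson_r(stage, methylation)
print(f"r = {r:.3f}")
```

A strongly negative `r` in such a sketch would correspond to a site that loses methylation as the stage increases, and a strongly positive `r` to a site that gains methylation.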


In one embodiment, said statistical analysis includes Receiver operating characteristics (ROC) assays.


In one embodiment, said statistical analysis includes hierarchical clustering analysis assays.


Definitions

As used herein, the term “CG” refers to a di-nucleotide sequence in DNA containing cytosine and guanine bases. These di-nucleotide sequences could become methylated in human and other animal DNA. The CG ID reveals its position in the human genome as defined by the Illumina 450K manifest (the annotation of the CGs listed herein is publicly available at https://bioconductor.org/packages/release/data/annotation/html/IlluminaHumanMethylation450k.db.html and installed as an R package IlluminaHumanMethylation450k.db as described in Triche T Jr. IlluminaHumanMethylation450k.db: Illumina Human Methylation 450k annotation data. R package version 2.0.9.).


As used herein, the term “penalized regression” refers to a statistical method aimed at identifying the smallest number of predictors required to predict an outcome out of a larger list of biomarkers as implemented for example in the R statistical package “penalized” as described in Goeman, J. J., L1 penalized estimation in the Cox proportional hazards model. Biometrical Journal 52(1), 70-84.
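To make the idea concrete, the sketch below fits an L1-penalized logistic regression with a plain proximal-gradient loop. It is an illustration under stated assumptions, not the R “penalized” package referred to above: the function names, the penalty value `lam`, the step size and the two made-up markers (one strongly and one weakly associated with the outcome) are all hypothetical.

```python
from math import exp


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    e = exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become exactly zero."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam, step=1.0, iters=500):
    """L1-penalized logistic regression by proximal gradient descent.

    X: list of feature rows, y: list of 0/1 outcomes, lam: L1 penalty weight.
    The intercept b is not penalized. Returns (weights, intercept).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(iters):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Made-up, centered methylation values (value minus cohort mean) for two
# hypothetical CG sites: column 0 separates the groups, column 1 barely does.
X = [
    [-0.2, -0.02], [-0.2, 0.02], [-0.3, -0.02], [-0.3, -0.02],  # cases (y = 1)
    [0.2, 0.02], [0.2, -0.02], [0.3, 0.02], [0.3, 0.02],        # controls (y = 0)
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_l1_logistic(X, y, lam=0.03)
print("moderate penalty:", w)

w_big, _ = fit_l1_logistic(X, y, lam=0.2)
print("large penalty:", w_big)
```

With a moderate penalty the weak marker's coefficient is held at zero while the strong marker is retained, which is how a penalized model narrows a long candidate list down to a few predictors. With a large enough penalty every coefficient is held at zero.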


As used herein, the term “clustering” refers to the grouping of a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters).


As used herein, the term “Hierarchical clustering” refers to a statistical method that builds a hierarchy of “clusters” based on how similar (close) or dissimilar (distant) are the clusters from each other as described for example in Kaufman, L.; Rousseeuw, P. J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis (1 ed). New York: John Wiley. ISBN 0-471-87876-6.
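The following sketch shows one simple form of hierarchical clustering, bottom-up (agglomerative) merging with average linkage and Euclidean distance. The linkage and distance choices, the function names and the six made-up methylation profiles are assumptions for illustration; the application does not specify them.

```python
def euclidean(a, b):
    """Euclidean distance between two methylation profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(points, c1, c2):
    """Mean pairwise distance between members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns (clusters, merges): clusters is a list of sorted index lists and
    merges records each merge as (members_a, members_b, distance).
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(points, clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters], merges


# Made-up methylation levels at three hypothetical CG sites for six samples.
profiles = [
    (0.82, 0.75, 0.20),  # samples 0-2: control-like profiles
    (0.80, 0.72, 0.22),
    (0.85, 0.78, 0.18),
    (0.45, 0.35, 0.60),  # samples 3-5: HCC-like profiles
    (0.42, 0.30, 0.65),
    (0.48, 0.33, 0.62),
]

clusters, merges = agglomerate(profiles, n_clusters=2)
print(clusters)
```

The recorded merge order and distances are the information a dendrogram displays; stopping at two clusters corresponds to cutting the tree at its top split.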


As used herein, the term “gene pathways” refers to a group of genes that encode proteins that are known to interact with each other in physiological pathways or processes. These pathways are characterized using bio-computational methods such as Ingenuity Pathway Analysis: http://www.ingenuity.com/products/ipa.


As used herein, the term “Receiver operating characteristics (ROC) assay” refers to a statistical method that creates a graphical plot that illustrates the performance of a predictor. The true positive rate of prediction is plotted against the false positive rate at various threshold settings for the predictor (i.e. different % of methylation) as described for example in Hanley, James A.; McNeil, Barbara J. (1982). “The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve”. Radiology 143 (1): 29-36.
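The construction described above can be sketched as follows. The example assumes a hypothetical marker that is less methylated in cases than in controls, so the negated methylation percentage is used as the score. The values, the function names and the direction of the marker are assumptions for illustration, not results from this application.

```python
def roc_points(scores, labels):
    """Return ROC points as (false positive rate, true positive rate) pairs.

    labels: 1 for cases, 0 for controls; a higher score means "more case-like".
    One point is produced per distinct threshold, from strict to lenient.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up methylation percentages at one hypothetical CG site.
case_methylation = [35, 40, 42, 55]
control_methylation = [50, 60, 65, 70]

# Lower methylation is assumed to indicate disease, so negate to get a score.
scores = [-m for m in case_methylation + control_methylation]
labels = [1] * len(case_methylation) + [0] * len(control_methylation)

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print("AUC =", auc)
```

Each point corresponds to one methylation cut-off; an area close to 1 indicates that the marker separates cases from controls at most thresholds, while an area near 0.5 indicates no discrimination.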


As used herein, the term “Multivariate linear regression” refers to a statistical method that estimates the relationship between multiple “independent variables” or “predictors” such as percentage of methylation, age, sex etc. and an “outcome” or a “dependent variable” such as cancer or stage of cancer. This method determines the statistical significance of each “predictor” (independent variable) in predicting the “outcome” (dependent variable) when several “independent variables” are included in the model.
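As a rough illustration of the coefficient-estimation part of this definition, the sketch below fits a linear model with two predictors by solving the normal equations. It does not compute the statistical significance of each predictor, and the predictors (methylation percentage and age), the outcome values and the function names are made up for this example.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def fit_ols(X, y):
    """Least-squares fit of y on the columns of X plus an intercept.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Made-up predictors: (methylation %, age in years) for six hypothetical subjects.
X = [(80, 40), (70, 55), (60, 45), (50, 60), (40, 50), (30, 65)]

# Made-up outcome generated from known coefficients so the fit can be checked.
y = [5.0 - 0.05 * m + 0.01 * a for m, a in X]

coefficients = fit_ols(X, y)
print(coefficients)
```

Because the outcome here is generated without noise, the fit recovers the generating coefficients; with real measurements each coefficient would also need a standard error and test statistic to judge its significance.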





BRIEF DESCRIPTIONS OF THE DRAWINGS


FIG. 1. Genome wide distribution of cancer specific DNA methylation signatures in peripheral blood mononuclear cells.



FIG. 1A. A genome wide view (IGV genome browser) of the escalating differences in DNA methylation from healthy controls (Ref.), chronic hepatitis B (HepB) and C (HepC), and progressive stages of HCC (CAN1, CAN2, CAN3, CAN4);



FIG. 1B. The top box plot represents beta values of DNA methylation of sites that lose methylation as HCC progresses. The bottom box plot represents beta values of DNA methylation of sites that gain DNA methylation during progression of HCC.



FIG. 2. DNA methylation signature of HCC progression in 69 individuals comprising normal controls, chronic hepatitis patients and patients at each stage of HCC. Each column represents a subject and each row represents a CG site; the level of methylation is indicated by gray level. Black represents most methylated, white represents least methylated and grey represents intermediate methylated.



FIG. 3.



FIG. 3A. Overlap in number of CG sites that are differentially methylated between stages of HCC (CAN1, CAN2, CAN3, CAN4);



FIG. 3B. Number of CGs that become either hypo or hypermethylated during HCC progression (CAN1, CAN2, CAN3, CAN4).



FIG. 4. Prediction of 49 chronic hepatitis and HCC patients using the DNA methylation signature derived for stage 1 HCC (20 patients). Black represents most methylated, white represents least methylated and grey represents intermediate methylated.



FIG. 5. Prediction of 49 chronic hepatitis and HCC patients using the DNA methylation signature derived for stage 2 HCC (20 patients). Black represents most methylated, white represents least methylated and grey represents intermediate methylated.



FIG. 6. Prediction of 49 chronic hepatitis and HCC patients using the DNA methylation signature derived for stage 3 HCC (20 patients). Black represents most methylated, white represents least methylated and grey represents intermediate methylated.



FIG. 7. Prediction of 49 chronic hepatitis and HCC patients using the DNA methylation signature derived for stage 4 HCC (20 patients). Black represents most methylated, white represents least methylated and grey represents intermediate methylated.



FIG. 8. Prediction of 69 controls, chronic hepatitis and HCC patients using the 350 CG DNA methylation signature (Table 3). Black represents most methylated, white represents least methylated and grey represents intermediate methylated.



FIG. 9. Prediction of 69 controls, chronic hepatitis and HCC patients using a 31 CG DNA methylation signature (Table 5). Black represents most methylated, white represents least methylated and grey represents intermediate methylated.



FIG. 10.



FIG. 10A. Prediction (0 to 1 probability) differentiating HCC stages 2-4 from stage 1 using measurements of DNA methylation of the following predictive CGs described in this invention, Target CG IDs: cg03252499, cg03481488, cg04398282, cg10203922, cg11783497, cg13710613, cg14762436, cg23486701;



FIG. 10B. Prediction (0 to 1 probability) differentiating HCC stages 3-4 from stages 1 and 2 using measurements of DNA methylation of the following predictive CGs described in this invention, Target CG IDs: cg02914652, cg03252499, cg11783497, cg11911769, cg12019814, cg14711743, cg15607708, cg20956548, cg22876402, cg24958366;



FIG. 10C. Prediction (0 to 1 probability) differentiating HCC stage 4 from stages 1 to 3 using measurements of DNA methylation of the following predictive CGs described in this invention, Target CG IDs: cg02782634, cg11151251, cg24958366, cg06874640, cg27284331, cg16476382, cg14711743.



FIG. 11. Differences in DNA methylation profiles between T cells from healthy controls (n=10; TCTRL-1 to TCTRL-10) and HCC stages (n=10; TCAN1, TCAN2, TCAN3, TCAN4).



FIG. 12. Prediction of HCC using measurements of DNA methylation in PBMC DNA of the 370 CGs derived from T cells (Table 6).



FIG. 13.



FIG. 13A. Prediction of HCC using measurements of DNA methylation in T cell DNA of 350 CGs derived from PBMC DNA (Table 3).



FIG. 13B. Overlap between differentially methylated CGs in T cell DNA from different stages of HCC (TCAN1-4) and in DNA from PBMC from different stages of HCC (PBMCCAN1, PBMCCAN2, PBMCCAN4).



FIG. 13C. Prediction of HCC using measurements of DNA methylation in T cell DNA of 31 CGs derived from PBMC DNA (Table 5).



FIG. 14. Validation by pyrosequencing of differences in DNA methylation in 4 genes between all control samples and early stages of HCC in T cell DNA from a replication set.



FIG. 15. Receiver Operating Characteristic (ROC) measuring sensitivity (fraction of true positives) (Y axis) and specificity (absence of false positives) (X axis) of STAP1 methylation as a biomarker for discriminating HCC from healthy controls using T cell DNA (Illumina 450K data) (FIG. 15A) or HCC from all controls (healthy and chronic hepatitis) in PBMC (FIG. 15B).



FIG. 16. Receiver Operating Characteristic (ROC) measuring sensitivity (Y axis) and specificity (X axis) of STAP1 methylation (measured using pyrosequencing) in T cells as a biomarker for discriminating HCC from healthy controls (FIG. 16A) and all controls (FIG. 16B).





EMBODIMENTS OF THE INVENTION
Embodiment 1. DNA Methylation Signatures in Peripheral Blood Mononuclear Cells (PBMC) that Correlate with HCC Cancer Stages

Patient Samples


HCC was staged according to the EASL-EORTC Clinical Practice Guidelines: Management of hepatocellular carcinoma. The patients were divided into four groups: stage 0 (1), stage A (2), stage B (3) and stage C+D (4). For simplicity, the present invention refers to stages 1-4 in the figures and embodiments. The diagnosis of chronic hepatitis B was confirmed using the AASLD practice guideline for chronic hepatitis B, and the diagnosis of chronic hepatitis C followed the AASLD recommendations for testing, managing and treating hepatitis C. A strict exclusion criterion was any other known inflammatory disease (bacterial or viral infection with the exception of hepatitis B or C, diabetes, asthma, autoimmune disease, active thyroid disease) that could alter the characteristics of T cells and monocytes. Clinical characteristics of patients are provided in Tables 1 and 2. The participants in the study provided consent according to the regulations of the Capital Medical School. The study received ethical approval from the Capital Medical School in Beijing and McGill University (IRB Study Number A02-M34-13B).









TABLE 1
Clinical data of training cohort.

ID | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible]
1_9 | M | [illegible] | HCC-BC[illegible]C-O | No | 15 y | TACE | 5.8 | <500
1_6 | M | 45 | HCC-BC[illegible]C-O | No | No | No | 2.25 | <500
1_5 | M | 55 | HCC-BC[illegible]C-O | 20 y | No | No | [illegible] | 4.80E+04
1_10 | M | [illegible] | HCC-BC[illegible]C-O | No | 30 y | TACE | 81.08 | <500
1_8 | M | 44 | HCC-BC[illegible]C-O | 25 y | No | No | 50.12 | [illegible]E+04
1_2 | M | 50 | HCC-BC[illegible]C-O | 15 y | seldom | No | [illegible] | <500
1_1 | M | [illegible] | HCC-BC[illegible]C-O | No | No | No | 4.72 | 2.46E+05 | <1000
1_7 | M | 58 | HCC-BC[illegible]C-O | No | No | [illegible] | [illegible] | 5.41E+02
1_3 | M | 47 | HCC-BC[illegible]C-O | 20 y | 20 y | No | 3.07 | 3.52E+05
1_4 | [illegible] | [illegible] | HCC-BC[illegible]C-O | No | seldom | No | 13.4 | <500 | [illegible]
2_8 | F | 50 | HCC-BC[illegible]C-A | No | No | TACE + ADV − TKS [illegible] | [illegible] | <500 U/ml
2_3 | M | 55 | HCC-BC[illegible]C-A | quit | No | TACE + RFA | [illegible] | <500
2_4 | M | [illegible] | HCC-BC[illegible]C-A | quit | 30 y | TACE | [illegible] | 5.42E+04
2_1 | M | 48 | HCC-BC[illegible]C-A | quit | seldom | No | 0.82 | <500
2_2 | M | 34 | HCC-BC[illegible]C-A | No | seldom | No | 3178 | [illegible]E+04
2_10 | M | 76 | HCC-BC[illegible]C-A | No | No | TACE + RFA | [illegible] | | <1000
2_5 | M | 73 | HCC-BC[illegible]C-A | No | No | No | [illegible] | <500
2_6 | M | 41 | HCC-BC[illegible]C-A | seldom | seldom | [illegible] + RFA | 2.31 | 8.59E+02
2_7 | [illegible] | 53 | HCC-BC[illegible]C-A | No | seldom | RFA | 117.4 | [illegible]E+08
2_9 | M | 44 | HCC-BC[illegible]C-A | 25 y | No | [illegible] | 32.76 | <500
3_8 | M | 52 | HCC-BC[illegible]C-B | No | No | TACE + RFA | [illegible] | <500
3_10 | M | 58 | HCC-BC[illegible]C-B | No | No | TACE + RFA | 86.72 | [illegible]E+05
3_3 | M | 60 | HCC-BC[illegible]C-B | 40 y | No | TACE | [illegible] | 4.61E+08
3_[illegible] | M | 53 | HCC-BC[illegible]C-B | 30 y | 30 y | No | 3481 | 7.47E+05
3_1 | M | 53 | HCC-BC[illegible]C-B | 30 y | 20 y | TACE | 254.3 | [illegible]E+03
3_7 | M | 48 | HCC-BC[illegible]C-B | 25 y | 25 y | No | [illegible] | <500
3_4 | M | [illegible] | HCC-BC[illegible]C-B | quit | 40 y | TACE | 28.84 | [illegible]E+04
3_5 | M | [illegible] | HCC-BC[illegible]C-B | quit | 30 y | TACE | [illegible] | <500
3_6 | F | 59 | HCC-BC[illegible]C-B | No | No | [illegible] | 3.25 | <500
3_2 | M | [illegible] | HCC-BC[illegible]C-B | No | 30 y | TACE | 31474 | [illegible]E+04
4_3 | M | 48 | HCC-BC[illegible]C-C + D | No | 5 y | [illegible] | 1087 | <500
4_[illegible] | M | 48 | HCC-BC[illegible]C-C + D | No | No | TACE + RFA | 1304 | <500
4_2 | M | 58 | HCC-BC[illegible]C-C + D | quit | 30 y | No | 67.44
4_5 | [illegible] | 47 | HCC-BC[illegible]C-C + D | No | No | [illegible] + [illegible] + RFA | [illegible] | <500
4_6 | M | [illegible] | HCC-BC[illegible]C-C + D | 20 y | seldom | No | 97.91 | [illegible]E+05
4_8 | M | 76 | HCC-BC[illegible]C-C + D | 50 y | seldom | TACE + RFA | [illegible] | <500
4_1 | [illegible] | 28 | HCC-BC[illegible]C-C + D | No | No | [illegible] | [illegible] | <500
4_4 | M | 59 | HCC-BC[illegible]C-C + D | No | No | RFA | 32.51 | [illegible]E+02
4_[illegible] | M | 31 | HCC-BC[illegible]C-C + D | No | No | [illegible] + RFA | 2.3 | [illegible]E+03
C1 | M | 47 | hepatitis C | No | No | | 2.65
C6 | M | 54 | hepatitis C | No | No | | 1.66
C4 | M | 31 | hepatitis C | 10 y | No | | 2.58
C2 | [illegible] | 43 | hepatitis C | No | seldom | | 2.78
C6 | M | 57 | hepatitis C | No | No | | [illegible]
C7 | M | 32 | hepatitis C | 10 y | No | | [illegible]
C10 | [illegible] | 26 | hepatitis C | No | No
C8 | [illegible] | 41 | hepatitis C | 10 y | No | | [illegible]
C9 | M | 28 | hepatitis C | No | seldom | | 2.09
C[illegible] | M | [illegible] | hepatitis C | No | No | | 3.56
B3 | M | 83 | hepatitis B | 30 y | No | | 28 | [illegible]E+05
B4 | M | 19 | hepatitis B | No | No | | | 1.85E+07
B2 | M | 36 | hepatitis B | No | 10 y | | 3686 | 4.85E+05
B7 | M | 43 | hepatitis B | 30 y | No | | 4842 | 2.02E+08
B5 | M | 42 | hepatitis B | 20 y | seldom | | [illegible] | 6.01E+04
B1 | M | 40 | hepatitis B | 10 y | 25 y | | [illegible] | [illegible]E+04
B8 | [illegible] | 31 | hepatitis B | No | No | | [illegible] | [illegible]E+04
B9 | M | 37 | hepatitis B | No | No | | 48.34 | [illegible]E+04
B6 | M | 38 | hepatitis B | 10 | 14 y | | 3.78 | [illegible]E+03
B10 | [illegible] | 30 | hepatitis B | No | No | | | [illegible]E+02
H1 | M | 30 | healthy | 10 y | seldom
H2 | [illegible] | [illegible] | healthy | No | No
H3 | M | 40 | healthy | 10 y | seldom
H4 | F | 42 | healthy | No | No
H[illegible] | [illegible] | 53 | healthy | No | No
H[illegible] | [illegible] | 25 | healthy | No | No
H7 | F | [illegible] | healthy | No | No
H8 | [illegible] | 28 | healthy | No | No
H9 | [illegible] | 36 | healthy | No | No
H10 | M | 29 | healthy | No | No

DNA was prepared from PBMC for all patients. T cells were isolated from all healthy controls and from HCC patients (patient IDs: 1-1, 1-3, 1-6, 2-2, 2-3, 2-4, 3-6, 4-2, 4-3).
[illegible] indicates data missing or illegible when filed.














TABLE 2
Clinical data of test (replication) cohort.

ID | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible]
I-11 | M | [illegible] | HCC-BC[illegible]C-O | No | No | No | | <500
I-14 | M | 30 | HCC-BC[illegible]C-O | [illegible] | No | No | [illegible] | <500
I-18 | M | 65 | HCC-BC[illegible]C-O | 30 y | No | No | [illegible] | <500
I-19 | M | [illegible] | HCC-BC[illegible]C-O | No | No | No | [illegible] | <500
I-22 | M | [illegible] | HCC-BC[illegible]C-O | [illegible] | [illegible] | No | 3.13 | <500
I-23 | M | 62 | HCC-BC[illegible]C-O | No | No | No | 2358 | [illegible]
I-24 | M | 53 | HCC-BC[illegible]C-O | 20 y | No | No
I-30 | M | 58 | HCC-BC[illegible]C-O | No | No | No | [illegible] | <500
I-[illegible] | M | 57 | HCC-BC[illegible]C-A | No | No | No | 1210 | <500
I-[illegible] | M | [illegible] | HCC-BC[illegible]C-A | No | 40 y | No | [illegible] | [illegible]
I-26 | M | 72 | HCC-BC[illegible]C-A | 30 y | No | No | [illegible] | <500
I-[illegible] | M | 41 | HCC-BC[illegible]C-A | No | No | No | [illegible] | <500
I-[illegible] | M | 43 | HCC-BC[illegible]C-A | No | No | No | [illegible] | <500
I-15 | M | 71 | HCC-BC[illegible]C-A | quit | No | No | 39.11 | <500
I-16 | F | [illegible] | HCC-BC[illegible]C-A | No | No | No | 4578 | <500
I-20 | M | 58 | HCC-BC[illegible]C-A | No | No | No | 3.01 | <500
I-25 | F | 68 | HCC-BC[illegible]C-A | No | No | No | 0.8 | <500
II-11 | M | 47 | HCC-BC[illegible]C-A | 20 y | 10 y | No | [illegible] | <500
II-12 | M | [illegible] | HCC-BC[illegible]C-A | 20 y | 17 y | No | 5.9 | <500
II-15 | M | 62 | HCC-BC[illegible]C-A | No | seldom | No | [illegible] | <500
I-21 | M | [illegible] | HCC-BC[illegible]C-B | 20 y | 30 y | No | 852.3 | [illegible]
II-13 | M | [illegible] | HCC-BC[illegible]C-B | 20 y | 30 y | No | [illegible] | [illegible]
II-14 | M | [illegible] | HCC-BC[illegible]C-B | 40 y | 40 y | No | 442.3 | [illegible]
II-16 | M | 52 | HCC-BC[illegible]C-B | [illegible] | 20 y | No | 37.08 | <500
II-17 | M | 47 | HCC-BC[illegible]C-B | 30 y | 20 y | No | 2.54
II-18 | M | [illegible] | HCC-BC[illegible]C-B | 40 y | [illegible] | No | [illegible] | [illegible]
II-19 | M | 49 | HCC-BC[illegible]C-B | [illegible] | No | No | [illegible] | [illegible]
II-20 | M | [illegible] | HCC-BC[illegible]C-B | No | No | No | 171.4 | [illegible]
III-16 | M | 34 | HCC-BC[illegible]C-B | 40 y | No | No | [illegible] | [illegible]
III-17 | M | 34 | HCC-BC[illegible]C-B | [illegible] | [illegible] | No | 41524 | [illegible]
III-18 | M | 45 | HCC-BC[illegible]C-B | No | [illegible] | No | 796.6 | <500
I-28 | M | [illegible] | HCC-BC[illegible]C-C | No | No | No | [illegible] | <500
I-29 | M | 47 | HCC-BC[illegible]C-C | No | No | No | [illegible] | [illegible]
III-13 | M | 50 | HCC-BC[illegible]C-C | [illegible] | 10 y | No | [illegible] | [illegible]
III-14 | M | 53 | HCC-BC[illegible]C-C | [illegible] | 20 y | No | [illegible] | <500
III-15 | F | [illegible] | HCC-BC[illegible]C-C | No | No | No | 3.61 | [illegible]
III-19 | M | [illegible] | HCC-BC[illegible]C-C | 40 y | seldom | No | [illegible] | <500
IV-13 | M | [illegible] | HCC-BC[illegible]C-C | quit | [illegible] | No | [illegible] | <500
IV-15 | M | 20 | HCC-BC[illegible]C-C | 10 y | No | No | 121000 | [illegible]
IV-16 | M | [illegible] | HCC-BC[illegible]C-C | 20 y | 20 y | No | 4282 | [illegible]
IV-17 | M | [illegible] | HCC-BC[illegible]C-C | No | No | No | 343.6 | [illegible]
IV-18 | M | 42 | HCC-BC[illegible]C-C | No | No | No | 4.95 | [illegible]
IV-19 | M | 50 | HCC-BC[illegible]C-C | 20 y | 17 y | No | 1383 | [illegible]
IV-20 | M | 50 | HCC-BC[illegible]C-C | No | [illegible] years | No | 4040 | <500
III-11 | M | [illegible] | HCC-BC[illegible]C-D | 40 y | 40 y | No | 496.4 | <500
III-12 | M | [illegible] | HCC-BC[illegible]C-D | [illegible] | [illegible] | No | 23.47 | [illegible]
III-20 | F | 72 | HCC-BC[illegible]C-D | quit | No | No | [illegible] | [illegible]
IV-11 | M | [illegible] | HCC-BC[illegible]C-D | quit | 30 y | No | [illegible] | [illegible]
IV-12 | F | 62 | HCC-BC[illegible]C-D | No | No | No | 10.56 | [illegible]
IV-14 | M | 42 | HCC-BC[illegible]C-D | 20 y | No | No | 743.0 | [illegible]
B11 | M | 54 | hepatitis B | No | No | | 181.8 | [illegible]
B12 | F | [illegible] | hepatitis B | No | No | | [illegible] | <500
B13 | M | [illegible] | hepatitis B | [illegible] | No | | [illegible] | [illegible]
B14 | M | [illegible] | hepatitis B | No | No | | 3 | <500
B15 | M | [illegible] | hepatitis B | [illegible] | No | | [illegible] | <500
B16 | M | 63 | hepatitis B | No | No | | 20.73 | [illegible]
B17 | M | [illegible] | hepatitis B | 40 y | No | | 4.67 | <500
B18 | F | [illegible] | hepatitis B | No | No | | [illegible] | [illegible]
B19 | M | [illegible] | hepatitis B | No | No | | [illegible] | [illegible]
B20 | F | [illegible] | hepatitis B | [illegible] | No | | 4.28 | <500
C11 | M | 19 | hepatitis C | No | No | | 1.72 | | 2.01E+06
C12 | F | [illegible] | hepatitis C | No | No | | 8.67 | | 1.25E+06
C13 | M | 32 | hepatitis C | No | No | | 3.13 | <500 | [illegible]
C14 | M | 60 | hepatitis C | [illegible] | No | | [illegible] | | 3.87E+06
C15 | M | [illegible] | hepatitis C | 30 y | 20 y | | 4.25
C16 | F | [illegible] | hepatitis C | No | No | | 4.25 | | 2.22E+5
C17 | F | 48 | hepatitis C | No | No | | 1.82 | | [illegible]
C18 | F | 62 | hepatitis C | No | No | | [illegible] | | [illegible]
C19 | M | 69 | hepatitis C | No | quit | | 3.08 | | [illegible]
C20 | F | [illegible] | hepatitis C | No | No | | 3.4 | | 6.40E+04
H11 | M | 31 | healthy
H12 | M | 37 | healthy
H13 | M | 25 | healthy
H14 | M | 44 | healthy
H15 | M | 38 | healthy
H16 | F | 42 | healthy
H17 | F | 34 | healthy
H18 | F | 23 | healthy
H19 | M | 39 | healthy
H20 | F | 32 | healthy

AFP: alpha-fetoprotein; HBV: hepatitis B virus; HCV: hepatitis C virus; TACE: transcatheter arterial chemoembolization; RFA: radiofrequency ablation.
[illegible] indicates data missing or illegible when filed.







Illumina Beadchip 450K Analysis


Blood was drawn from patients into EDTA-coated tubes, and peripheral blood mononuclear cells were isolated by centrifugation on a Ficoll-Hypaque density gradient using standard protocols. Because of their lower density, mononuclear cells were collected on top of the Ficoll-Hypaque layer and were separated from platelets by washing (46). DNA was extracted from the cells using commercial human DNA extraction kits (Qiagen), bisulfite converted, and subjected to Illumina HumanMethylation450K BeadChip hybridization and scanning using standard protocols recommended by the manufacturer. Samples were randomized with respect to slide and position on the arrays, and all samples were hybridized and scanned concurrently to mitigate batch effects, as recommended by the McGill Genome Quebec Innovation Center according to the Illumina Infinium HD technology user guide. Illumina array hybridization and scanning were performed by the McGill Genome Quebec Innovation Center according to the manufacturer's guidelines. Illumina arrays were analyzed using the ChAMP Bioconductor package in R (47). IDAT files were used as input to the champ.load function using the minfi quality control and normalization options. Raw data were filtered to remove probes with a detection P value >0.01 in at least one sample. Probes on the X or Y chromosome were filtered out to mitigate sex effects, as were probes with SNPs and probes that align to multiple locations, as identified in (48). Batch effects were analyzed on the non-normalized data using the function champ.SVD. Five of the first 6 principal components were associated with group and batch (slides). Intra-array normalization to adjust the data for bias introduced by the Infinium type 2 probe design was performed using beta-mixture quantile normalization (BMIQ) with the function champ.norm (norm=“BMIQ”) (47). Batch effects were corrected after BMIQ normalization using the champ.runCombat function.
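The probe filtering rules described above can be illustrated with a short Python sketch. This is not the ChAMP implementation; it only restates the three exclusion rules (detection P value, sex chromosomes, SNP or multi-location probes) on toy data. The `Probe` class, its field names and the probe identifiers are hypothetical and were introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Probe:
    probe_id: str             # hypothetical identifier, not a real CG ID
    chromosome: str
    detection_p: List[float]  # one detection P value per sample
    has_snp: bool = False     # flagged as overlapping a SNP
    multi_hit: bool = False   # flagged as aligning to multiple locations


def keep_probe(probe: Probe, max_detection_p: float = 0.01) -> bool:
    """Return True if the probe passes all three exclusion rules."""
    # Rule 1: drop the probe if detection failed in at least one sample.
    if any(p > max_detection_p for p in probe.detection_p):
        return False
    # Rule 2: drop probes on the sex chromosomes.
    if probe.chromosome in {"X", "Y"}:
        return False
    # Rule 3: drop SNP-containing and multi-location probes.
    if probe.has_snp or probe.multi_hit:
        return False
    return True


probes = [
    Probe("probe_a", "1", [0.001, 0.002]),
    Probe("probe_b", "1", [0.001, 0.2]),    # fails detection in one sample
    Probe("probe_c", "X", [0.001, 0.001]),  # sex chromosome
    Probe("probe_d", "7", [0.0, 0.0], has_snp=True),
    Probe("probe_e", "12", [0.0, 0.005], multi_hit=True),
    Probe("probe_f", "19", [0.004, 0.009]),
]

kept_ids = [p.probe_id for p in probes if keep_probe(p)]
print(kept_ids)
```

In practice the SNP and multi-location flags would come from published annotation lists such as the one cited in (48); the sketch simply assumes they are already available per probe.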


Cell count analysis of the peripheral blood mononuclear cell distribution in the samples of this invention was performed according to the Houseman algorithm (49) using the function estimateCellCounts with FlowSorted.Blood.450k data as reference. The beta values of the batch-corrected, normalized data were used for downstream statistical analyses.


To compute the linear correlation between HCC stages and the quantitative distribution of DNA methylation at the 450K CG sites, the Pearson correlation between the normalized DNA methylation values and the stages of HCC (with stage codes of 0 for control, 1 and 2 for hepatitis B and C respectively, and 3-6 for the 4 stages of HCC) was computed using the Pearson cor function in R, correcting for multiple testing using the “fdr” method of Benjamini-Hochberg (adjusted P value (Q) of <0.05) as well as the conservative Bonferroni correction (Q<1×10^-7). A similar approach could be used with newer generations of Illumina arrays such as Illumina 850K arrays.
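As a rough illustration of this step, the Python sketch below correlates stage codes with beta values for two made-up sites and then applies the Benjamini-Hochberg adjustment and a Bonferroni threshold. It is a sketch under stated assumptions, not the R code used in the study: the site names and beta values are invented, and the P value uses the Fisher z approximation rather than the t-based test that R reports.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_two_sided_p(r, n):
    """Approximate two-sided P value via the Fisher z transform (assumption:
    an approximation chosen to stay within the standard library)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (Q), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Stage codes as in the text: 0 control, 1-2 hepatitis B/C, 3-6 HCC stages.
stage_codes = [0, 0, 1, 2, 3, 4, 5, 6]
site_betas = {  # invented beta values for two hypothetical sites
    "site_a": [0.80, 0.78, 0.75, 0.74, 0.60, 0.52, 0.41, 0.30],
    "site_b": [0.50, 0.52, 0.49, 0.51, 0.50, 0.48, 0.52, 0.50],
}

names = list(site_betas)
r_values = [pearson_r(stage_codes, site_betas[s]) for s in names]
p_values = [approx_two_sided_p(r, len(stage_codes)) for r in r_values]
q_values = benjamini_hochberg(p_values)
bonferroni_alpha = 0.05 / len(names)  # per-test threshold for Bonferroni

for name, r, q in zip(names, r_values, q_values):
    print(name, round(r, 3), q < 0.05)
```

With roughly 450,000 sites, the same Bonferroni rule gives a per-test threshold of about 1×10^-7, which matches the cutoff quoted in the text.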


Correlation Between Quantitative Distribution of Site-Specific DNA Methylation Levels and Progression of HCC


The analysis reveals a broad signature of DNA methylation that correlates with progression of HCC (160,904 sites). The analysis of this invention focuses on 3924 sites with the most robust changes (r>0.8 or r<−0.8; delta beta >0.2 or <−0.2; p<10^-7). A genome-wide view of the intensifying changes in DNA methylation at these sites during HCC progression, relative to chronic hepatitis B and C and controls, is shown in FIG. 1A. A box plot of the DNA methylation levels of sites that either gain or lose methylation during HCC confirms that the changes in DNA methylation intensify as HCC progresses, with the extent of hypomethylation increasing at later stages (FIG. 1B). Clustering using one minus Pearson correlation shows that these sites cluster all individual HCC patients away from control and hepatitis B and C individuals, with the exception of patient CAN1-5, who clusters on the boundary between hepatitis C and HCC; this indicates strong consistency across individual members of the different groups (FIG. 2).
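The selection of the most robust sites can be expressed as a simple filter on three per-site statistics. The Python sketch below assumes these statistics (correlation coefficient, delta beta and P value) have already been computed; the site names and numbers are invented, and how delta beta is defined between groups is left as an assumption, since the text does not spell it out.

```python
def is_robust_site(r, delta_beta, p_value,
                   min_abs_r=0.8, min_abs_delta=0.2, max_p=1e-7):
    """Apply the thresholds quoted in the text: |r| > 0.8,
    |delta beta| > 0.2 and p < 1e-7."""
    return (abs(r) > min_abs_r
            and abs(delta_beta) > min_abs_delta
            and p_value < max_p)


# Hypothetical per-site statistics: (r, delta beta, P value).
site_stats = {
    "site_a": (0.91, 0.35, 1e-12),    # passes all three thresholds
    "site_b": (-0.85, -0.27, 3e-9),   # passes (hypomethylated direction)
    "site_c": (0.95, 0.05, 1e-15),    # effect size too small
    "site_d": (0.60, 0.40, 1e-9),     # correlation too weak
    "site_e": (0.88, 0.30, 1e-5),     # not significant enough
}

robust_sites = sorted(
    name for name, stats in site_stats.items() if is_robust_site(*stats)
)
print(robust_sites)
```

Using absolute values keeps sites that gain methylation and sites that lose methylation with progression in the same list.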


Utility of DNA Methylation Signature of HCC in Peripheral Blood Mononuclear Cells for Differentiating Cancer Samples from Controls


These DNA methylation signatures therefore have utility for classifying the stage of HCC in a patient sample. The heat map in FIG. 2 reveals the intensification of DNA methylation differences with progression of HCC. Importantly, the combined analyses of this invention show that DNA methylation signatures differentiate individual HCC patients at the earliest stage from hepatitis B and C patients, which is a critical challenge in the early diagnosis of HCC. Further, the analysis shows that changes in DNA methylation in PBMC from HCC patients can be distinguished from changes induced by virus-triggered chronic inflammation. Based on the description of this invention, any person skilled in the art could derive similar DNA methylation signatures for other cancers.


Embodiment 2. Unique and Overlapping Differentially Methylated Sites Associate with Different HCC Stages and Differentiate HCC from Hepatitis B and C

The inventors of the present invention delineated differentially methylated CGs between healthy controls and each of the HCC stages independently using the Bioconductor package Limma (50) as implemented in ChAMP. The number of differentially methylated CG sites (p<1×10^-7) between each stage of HCC and healthy controls increases with advancing stage: 14375 for stage 1, 22018 for stage 2, 30709 for stage 3 and 54580 for stage 4. The significance of the overlap between two groups was determined using the hypergeometric Fisher exact test in R. There is a significant overlap between the stages of cancer (FIG. 3A), suggesting that common markers are affected in all HCC stages (p<1.9e−297).
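The overlap test can be illustrated with the upper tail of the hypergeometric distribution, which is what a one-sided Fisher exact test on a 2×2 overlap table evaluates. The Python sketch below is a minimal stand-in for the R call, not a reproduction of it; the function name is hypothetical and the universe size, list sizes and overlap in the example are small made-up numbers.

```python
from fractions import Fraction
from math import comb


def hypergeom_upper_tail(universe, size_a, size_b, overlap):
    """Probability of seeing at least `overlap` shared sites when a list of
    `size_b` sites is drawn at random from `universe` sites, of which
    `size_a` belong to the other list."""
    total = comb(universe, size_b)
    favourable = sum(
        comb(size_a, i) * comb(universe - size_a, size_b - i)
        for i in range(overlap, min(size_a, size_b) + 1)
    )
    return Fraction(favourable, total)


# Toy example: 10 sites tested, lists of 4 and 3 sites, 2 shared.
p_overlap = hypergeom_upper_tail(universe=10, size_a=4, size_b=3, overlap=2)
print(p_overlap, float(p_overlap))
```

Exact fractions are convenient for small examples; for list sizes in the tens of thousands, as in the text, a log-space or library implementation would be the practical choice.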


The fraction of sites that are hypomethylated relative to hypermethylated sites in HCC also increases, from 26% in stage 1 to 57% in stage 4 (FIG. 3B). This increase in the number of hypomethylated sites with progression of HCC was also observed in the results of the Pearson correlation analysis (FIGS. 1 and 2). For each HCC stage, a set of highly robust CG methylation markers was derived using thresholds of p<1×10^-7 (genome-wide significance after Bonferroni correction) and delta beta of +/−0.3 for HCC stage 1, and p<10^-10 and delta beta of +/−0.3 for stages 2-4 (a more stringent threshold was used for the later stages to reduce the number of sites used for analysis). These markers were used for further analysis (74 for stage 1, 14 for stage 2, 58 for stage 3, and 298 for stage 4). By combining the lists of markers derived independently for each stage and removing CG sites that are redundant between stages, a combined non-redundant list of 350 CGs (Table 3) was derived.
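Combining the per-stage marker lists into one non-redundant list amounts to an order-preserving union. The Python sketch below shows one way to do this; the function name and the site identifiers are hypothetical placeholders rather than entries from Table 3.

```python
def combine_marker_lists(*stage_lists):
    """Merge per-stage marker lists, keeping the first occurrence of each
    site and dropping sites already contributed by an earlier stage."""
    seen = set()
    combined = []
    for markers in stage_lists:
        for site in markers:
            if site not in seen:
                seen.add(site)
                combined.append(site)
    return combined


# Hypothetical per-stage lists with some sites shared between stages.
stage_1 = ["site_a", "site_b", "site_c"]
stage_2 = ["site_b", "site_d"]
stage_3 = ["site_c", "site_d", "site_e"]
stage_4 = ["site_a", "site_e", "site_f"]

non_redundant = combine_marker_lists(stage_1, stage_2, stage_3, stage_4)
print(len(non_redundant), non_redundant)
```

Applied to the stage lists in the text (74, 14, 58 and 298 markers), the same operation removes the sites shared between stages and leaves the 350 unique CGs.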









TABLE 3
List of the top 350 significant CG IDs derived from PBMC DNA that are differentially methylated between stages of HCC and healthy controls.

cg05375333 cg24304617 cg08649216 cg15775914 cg06098530 cg04536922
cg23679141 cg26009832 cg06908855 cg21585138 cg15514380 cg20838429
cg01546046 cg27090007 cg11412036 cg00744866 cg19988492 cg21542922
cg10036013 cg24958366 cg23824801 cg08306955 cg00361155 cg11356004
cg12829666 cg17479131 cg27408285 cg15009198 cg05423018 cg19140262
cg15011899 cg27644327 cg01810593 cg18878210 cg13710613 cg05033369
cg02001279 cg11031737 cg19795616 cg02717454 cg07072643 cg09048334
cg15188939 cg09800500 cg27284331 cg22344162 cg04018625 cg04385818
cg23311108 cg02313495 cg08575688 cg26923863 cg01238991 cg01214050
cg09789584 cg16324306 cg05486191 cg15447825 cg17741339 cg14361741
cg22301128 cg02914652 cg04171808 cg04771084 cg18132851 cg16292016
cg11737318 cg11057824 cg14276584 cg23981150 cg02556954 cg14783904
cg07118376 cg26407558 cg03496780 cg24383056 cg01359822 cg26250154
cg13978347 cg09451574 cg14375111 cg24232444 cg22747380 cg02758552
cg23544996 cg21156970 cg08944236 cg22281935 cg00211609 cg21811450
cg16306870 cg01732538 cg02142483 cg22110158 cg11911769 cg03432151
cg03731740 cg10312296 cg23102014 cg04398282 cg15755348 cg08455089
cg02749789 cg17704839 cg25683268 cg08946713 cg25195795 cg17766305
cg08123444 cg24742520 cg20460227 cg24056269 cg06151145 cg06349546
cg15747825 cg14983135 cg17163729 cg15118835 cg00568910 cg23017594
cg23829949 cg21164050 cg01417062 cg14189441 cg15146122 cg12813441
cg16712679 cg06879746 cg13146484 cg16111924 cg13615971 cg01411912
cg12820627 cg27057509 cg18417954 cg27089675 cg06194421 cg15374754
cg17534034 cg23857976 cg13913085 cg07128102 cg01966878 cg00093544
cg05591270 cg05228338 cg12705693 cg18556587 cg16565409 cg14711743
cg13219008 cg24783785 cg21579239 cg02863594 cg03044573 cg00483304
cg15607708 cg27457290 cg10274682 cg08577341 cg10469659 cg24376286
cg22475353 cg14199837 cg19389852 cg12306086 cg16240816 cg27638509
cg27296330 cg25104397 cg01839860 cg21700582 cg21487856 cg11300809
cg24449629 cg20592700 cg20222519 cg14774438 cg23486701 cg09244071
cg12177922 cg27010159 cg02272851 cg15123819 cg24640156 cg00014638
cg23004466 cg14898127 cg14734614 cg00759807 cg05086021 cg00697672
cg01696603 cg11783497 cg27120934 cg07929642 cg03899643 cg01116137
cg03639671 cg08861115 cg10078703 cg08134863 cg11556164 cg20250700
cg10203922 cg15966610 cg05099186 cg20228731 cg25135755 cg15867698
cg13749822 cg13299325 cg11767757 cg23493018 cg08113187 cg11151251
cg12263794 cg22547775 cg09545443 cg04071270 cg27588356 cg05577016
cg23157190 cg22945413 cg20427318 cg20750319 cg01611777 cg01933228
cg21406217 cg15046123 cg01698579 cg12050434 cg12299554 cg11006453
cg08247053 cg26405097 cg12691488 cg00458932 cg14356440 cg03555836
cg26576206 cg03483626 cg08568561 cg25708982 cg18482303 cg02482718
cg07212747 cg14531436 cg13943141 cg12592365 cg15323084 cg24065504
cg22872033 cg20587236 cg13619522 cg19780570 cg22876402 cg09340198
cg27186013 cg24284882 cg05502766 cg20187173 cg17092349 cg22143698
cg19851487 cg17226602 cg06445016 cg07772781 cg02782634 cg07065759
cg03481488 cg22707529 cg10895875 cg01828328 cg09987993 cg21751540
cg12598524 cg19945957 cg08634082 cg05725404 cg26401541 cg20956548
cg10761639 cg05460226 cg20944521 cg14426660 cg00248242 cg18731803
cg00350932 cg25364972 cg03252499 cg04998202 cg09514545 cg09639931
cg14914552 cg00754989 cg14762436 cg07381872 cg16476382 cg16810031
cg07504763 cg01994308 cg19266387 cg14193653 cg00189276 cg10861953
cg25279586 cg23837109 cg17934470 cg22675447 cg08858441 cg12628061
cg12019814 cg10892950 cg00758915 cg09479286 cg20874210 cg06874640
cg05941376 cg02976588 cg27143049 cg00426720 cg00321614 cg15006843
cg23044884 cg24576298 cg23880736 cg05999692 cg08226047 cg25522867
cg15891076 cg12344600 cg04090347 cg10784548 cg02265379 cg01124132
cg07145988 cg27544294 cg22515654 cg12201380 cg19925215 cg10536529
cg09635768 cg00448395 cg03062944 cg05961707 cg10995381 cg16517298
cg18882449 cg03909800









HCC patients in the study and in the clinical setting are a heterogeneous group with respect to alcohol, smoking (52-55), sex (56) and age (57), and each of these factors is known to affect DNA methylation. In addition, peripheral blood mononuclear cells are a heterogeneous mixture of cells, and differences in cell distribution between individuals might affect DNA methylation as well. The cell count distribution was therefore first determined for each case using the Houseman algorithm (49). Two-way ANOVA followed by pairwise comparisons and correction for multiple testing found no significant difference in cell count between the groups. Multifactorial ANOVA with group, sex and age as cofactors was performed for CGs that were shortlisted for association with HCC using the loop_anova lmFit function with Bonferroni adjustment for multiple testing. Multivariate linear regression was performed on the shortlisted CG sites that were found to associate with HCC, using the lmFit function in R, to test whether these associations survive when cell counts, sex, age, and alcohol abuse are used as covariates in the linear regression model. Comparison of differentially methylated (relative to control) gene lists in different groups was performed using Venny (http://bioinfogp.cnb.csic.es/tools/venny/). Hierarchical clustering was performed using one minus Pearson correlation, and heatmaps were generated in the Broad Institute GENE-E application (http://www.broadinstitute.org/cancer/software/GENE-E/).


Then, a multivariate linear regression was performed on the normalized beta values of the 350 CG sites that differentiate HCC from all other groups, using group (HCC versus non-HCC), sex, alcohol, smoking, age, and cell count as covariates. All CG sites remained highly significant for the group covariate even after the other covariates were included in the model. Following Bonferroni correction for 350 measurements, 342 CG sites remained highly significant for group (HCC versus non-HCC). A multifactorial ANOVA was also performed with the beta values of the 350 sites as dependent variables and group (HCC versus non-HCC), sex and age as independent variables, to determine whether there are possible interactions between sex and group, between age and group, or between sex+age and group on DNA methylation.


While group remained significant for all 350 CGs, no significant interactions with sex or age were found after Bonferroni correction. In summary, these data show robust DNA methylation differences in PBMC DNA between HCC patients and non-HCC individuals, including hepatitis B and hepatitis C patients.


Embodiment 3. Utility of Cancer Stage Specific DNA Methylation Markers to Predict Unknown Samples from Patients Using One Minus Pearson Cluster Analysis, Detect Early Stages of HCC Cancer and Differentiate them from Chronic Hepatitis

The differentially methylated sites for each of the HCC stages were derived by comparing 10 healthy controls with 10 stage-specific HCC samples. The samples from the other stages and the hepatitis B and C samples were not used for “training” (“training” refers to the samples the model uses to derive the differentially methylated sites) and served as “cross-validation” sets of “unknown” samples to address the following questions. First, would the markers derived for one stage of cancer correctly cluster HCC samples that were not used to “train” these markers? Second, would DNA methylation markers that were “trained” to differentiate HCC from healthy controls also differentiate HCC from hepatitis B and hepatitis C? Differentiating HCC from chronic hepatitis is a critical challenge for early diagnosis of HCC, since a notable fraction of HCC patients progress from chronic hepatitis to HCC.


Hierarchical clustering was performed by one minus Pearson correlation for all HCC and hepatitis samples, using for each individual analysis a set of CG methylation markers that were “discovered” by testing only one stage of HCC and controls. All other stages were “naïve” to these markers and served as “cross-validation”. Cross-validation refers to a statistical strategy whereby a small subset of the samples in the study is used to “discover” a list of markers (predictors) that differentiate two groups from each other (e.g. “cancer” and “control”). These “discovered” markers are then tested as predictors in other “new” samples in the study. As demonstrated in FIGS. 4 to 7, each independently derived set of markers for a specific stage of HCC was “cross-validated”: it correctly predicted HCC in a group of samples that included “new” HCC and non-HCC cases (FIG. 4 uses stage 1 markers, FIG. 5 uses stage 2 markers, FIG. 6 uses stage 3 markers and FIG. 7 uses stage 4 markers). Remarkably, the CG markers that were discovered by comparing only one stage of HCC to healthy controls correctly predicted HCC in a different set of samples that included HCC and chronic hepatitis cases. This provides further evidence that chronic hepatitis and cancer have different DNA methylation profiles, which could be utilized for predicting whether a patient still has chronic hepatitis or has transitioned to HCC. Interestingly, the same markers also correctly predicted the hepatitis B and C cases (FIGS. 4-7).
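The one minus Pearson clustering described above can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not name the linkage rule, so average linkage is assumed here, and the sample names and beta values are hypothetical placeholders rather than study data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def one_minus_pearson(x, y):
    """Distance between two profiles: 0 if perfectly correlated, 2 if anti-correlated."""
    return 1.0 - pearson(x, y)


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering on one-minus-Pearson distances (average linkage assumed).

    profiles maps a sample name to its list of beta values, one per CG marker.
    Returns a sorted list of clusters, each a sorted list of sample names.
    """
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [one_minus_pearson(profiles[a], profiles[b])
                         for a in clusters[i] for b in clusters[j]]
                d = sum(dists) / len(dists)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the two closest clusters and drop the absorbed one.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical beta values at four marker CGs (not study data).
samples = {
    "HCC_a": [0.80, 0.20, 0.75, 0.30],
    "HCC_b": [0.85, 0.25, 0.70, 0.35],
    "ctrl_a": [0.30, 0.70, 0.20, 0.80],
    "ctrl_b": [0.25, 0.75, 0.30, 0.85],
}
print(average_linkage(samples, n_clusters=2))
```

Because the distance depends on the correlation of the methylation pattern across markers rather than on absolute levels, samples with the same direction of change at the marker CGs fall into the same cluster.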


The overlap between independently derived CG markers that differentiate each of the HCC stages (FIG. 3A) is significant for all possible overlaps between the stages using the Fisher hypergeometric test (p<1.921718×10−297). The highly significant overlap between the markers derived for each stage independently, using only 10 cases and 10 controls, strongly validates the robustness of these markers and illustrates the utility of these differentially methylated CGs as peripheral markers of HCC that could be used for early detection.
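For readers unfamiliar with the hypergeometric overlap test, a minimal Python sketch follows. It computes the one-sided probability of seeing an overlap at least as large as the observed one when two marker lists are drawn at random from the same set of tested sites. The function name and the numbers in the example are hypothetical and are not taken from the study.

```python
from math import comb


def overlap_p_value(universe, set_a, set_b, overlap):
    """One-sided hypergeometric p-value for an overlap of at least `overlap` sites.

    universe: number of CG sites tested; set_a, set_b: sizes of the two marker
    lists; overlap: number of sites present in both lists.
    """
    total = comb(universe, set_b)
    upper = min(set_a, set_b)
    # Sum the probabilities of every overlap size from the observed one upward.
    tail = sum(comb(set_a, k) * comb(universe - set_a, set_b - k)
               for k in range(overlap, upper + 1))
    return tail / total


# Hypothetical example: two lists of 50 and 40 sites out of 1000 share 20 sites.
print(overlap_p_value(universe=1000, set_a=50, set_b=40, overlap=20))
```

Exact integer arithmetic is used for the binomial coefficients, so very small p-values are limited only by the final conversion to a floating-point number.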


Although there is a large overlap between CGs that are differentially methylated at the different stages of cancer, the overlap is partial. The present invention demonstrates that the 350 CG list described above (Table 3) can be used to differentiate HCC stages from each other. Hierarchical clustering by one minus Pearson correlation of all samples using these 350 CGs correctly clustered the HCC cases by stage, while hepatitis B and C cases were clustered with healthy controls. Although there is a large overlap between sites that are differentially methylated from healthy controls at different stages of HCC, the intensity of differential methylation is enhanced with progression of HCC. Thus, the level of methylation of these 350 CG sites could also be used to differentiate stages of HCC. A kit, comprising means and reagents for detecting DNA methylation measurements of the CG IDs of Table 3, could be used for predicting hepatocellular carcinoma (HCC) stages and chronic hepatitis. Note that although the DNA methylation marker list was derived by comparing only healthy controls and single stages of HCC, this list nevertheless correctly predicted other “new” hepatitis B and C cases as non-HCC (FIG. 8).


The disclosure of this invention reveals differentially methylated CGs in PBMC from HCC patients that can be used to distinguish particular stages of HCC from controls and from chronic hepatitis patients.


Embodiment 4. Stage Specific CG Methylation Markers that Differentiate Early from Late Stages of HCC Using Penalized Regression

The data suggest that PBMC DNA methylation markers differentiate stages of HCC. The present invention then defined the minimal list of CG sites required to differentiate stages of HCC from each other. “Penalized regression” of the 350 CG sites was performed between stage samples using the R package “penalized” for fitting penalized regression models (51). The penalized R package uses likelihood cross-validation, and predictions are made on each left-out subject. The fitted models identified 8 CGs that predict stage 1 versus control, 5 CGs that predict stage 2 versus control, 5 CGs that differentiate stage 3 versus control, 7 CGs that differentiate stage 4 versus control and 7 CGs that are sufficient to differentiate stage 1 from hepatitis B (Table 4). In addition, 8 CGs were selected that differentiate stage 1 from later stages 2-4, 10 CGs that differentiate stages 1 and 2 from later stages 3-4 and 7 CGs that differentiate stage 4 from all earlier stages (stages 1-3) (Table 4). DNA methylation measurements in PBMC of the combined list of 31 CG stage-separators (after removing duplicates; Table 5) accurately predicted all HCC cases and their stages using one minus Pearson clustering (FIG. 9). A kit, comprising means and reagents for detecting DNA methylation measurements of the CG IDs of Table 4 or 5, could be used for predicting hepatocellular carcinoma (HCC) stages.
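To illustrate how a penalty can reduce a long marker list to a few CGs, a minimal Python sketch of L1-penalized (lasso) linear regression by coordinate descent is given below. It is not the method of the R package “penalized”, which fits penalized likelihood models with cross-validation; the L1 penalty, the linear model, the penalty value and the data are all assumptions chosen for the example.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become exactly 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Fit a linear model with an L1 penalty by cyclic coordinate descent.

    X: list of samples, each a list of centered beta values (one per CG).
    y: centered outcome (for example +0.5 for one group and -0.5 for the other).
    Minimizes (1 / 2n) * sum((y - X b)^2) + lam * sum(|b_j|).
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of CG j with the residual left by all other CGs.
            rho = 0.0
            for i in range(n):
                pred_others = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_others)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            coef[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return coef


# Hypothetical centered data: the first CG tracks the group, the other two do not.
X = [[0.3, 0.01, 0.02], [0.3, -0.01, -0.02], [-0.3, 0.01, -0.02], [-0.3, -0.01, 0.02]]
y = [0.5, 0.5, -0.5, -0.5]
coef = lasso_coordinate_descent(X, y, lam=0.05)
selected = [j for j, c in enumerate(coef) if c != 0.0]
print(selected)  # only the first CG keeps a non-zero coefficient
```

The soft-threshold step is what sets the coefficients of uninformative CGs to exactly zero, which is why penalized models return short lists of predictors.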









TABLE 4

CG markers differentiating different stages of HCC from control and hepatitis B and C using penalized regression models.

Target CG IDs for separating HCC stage 1 from controls: cg14983135, cg10203922, cg05941376, cg14762436, cg12019814, cg14426660, cg18882449, cg02914652

Target CG IDs for separating HCC stage 2 from controls: cg05941376, cg15188939, cg12344600, cg03496780, cg12019814

Target CG IDs for separating HCC stage 3 from controls: cg05941376, cg02782634, cg27284331, cg12019814, cg23981150

Target CG IDs for separating HCC stage 4 from controls: cg02782634, cg05941376, cg10203922, cg12019814, cg14914552, cg21164050, cg23981150

Target CG IDs for separating HCC stage 1 from hepatitis B: cg05941376, cg10203922, cg11767757, cg04398282, cg11151251, cg24742520, cg14711743

Target CG IDs for separating HCC stage 1 from stage 2-4: cg03252499, cg03481488, cg04398282, cg10203922, cg11783497, cg13710613, cg14762436, cg23486701

Target CG IDs for separating HCC stage 2 from stage 3-4: cg02914652, cg03252499, cg11783497, cg11911769, cg12019814, cg14711743, cg15607708, cg20956548, cg22876402, cg24958366

Target CG IDs for separating HCC stage 1-3 from stage 4: cg02782634, cg11151251, cg24958366, cg06874640, cg27284331, cg16476382, cg14711743
















TABLE 5

Combined list of 31 CGs differentiating different stages of HCC from control and hepatitis B and C using penalized regression models (after removing the duplicated CGs).

cg14983135  cg10203922  cg05941376  cg14762436  cg12019814
cg03496780  cg02782634  cg27284331  cg23981150  cg14914552
cg13710613  cg23486701  cg11911769  cg14711743  cg15607708
cg14426660  cg18882449  cg02914652  cg15188939  cg12344600
cg21164050  cg03252499  cg03481488  cg04398282  cg11783497
cg20956548  cg22876402  cg24958366  cg11151251  cg06874640
cg16476382









Embodiment 5. Utility of the CG Penalized Regression Model to Predict Unknown Samples as Different Stage Cancer with 100% Specificity and Sensitivity

The penalized models derived for differentiating the specific stages using the CGs listed in Table 4 were then applied to other “naïve” samples (new samples that were not used for the discovery of the markers), namely HCC cases and hepatitis B and C controls, to predict the likelihood of each case being at a given stage of HCC. The results of these analyses are shown in FIG. 10. The penalized models predicted the stage of all samples with 100% sensitivity and 100% specificity.
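Sensitivity and specificity, as used above, can be computed from the true and predicted labels of the “naïve” samples. The short Python sketch below shows the calculation; the labels are hypothetical and chosen so that one HCC case is missed, which is not the result reported here.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="HCC"):
    """Return (sensitivity, specificity) for a two-class prediction.

    Sensitivity is the fraction of true positives that are called positive;
    specificity is the fraction of true negatives that are called negative.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels (not study data): one HCC sample is missed.
truth = ["HCC", "HCC", "control", "control"]
calls = ["HCC", "control", "control", "control"]
print(sensitivity_specificity(truth, calls))
```

A result of 100% for both values means that no HCC sample was called non-HCC and no non-HCC sample was called HCC.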


Embodiment 6. DNA Methylation Markers that Differentiate Between HCC and Healthy Controls Using DNA Extracted from T Cells

Multivariate analysis suggests that the differences in PBMC DNA methylation between HCC and the other groups (control and chronic hepatitis) remain even when differences in cell count are taken into account. Further, to determine whether differences in DNA methylation between cancer and control would disappear once the complexity of cell composition is partly reduced by isolation of a specific cell type (although heterogeneity in T cell subtypes remains), DNA methylation profiles were compared between T cells isolated from 10 of the 39 HCC patients included in the study (samples from each of the HCC stages, indicated in the legend to Table 1) and T cells from all healthy controls (n=10).


T cells were isolated using anti-CD3 immunomagnetic beads (Dynabeads, Life Technologies). Linear (mixed effects) regression using the ChAMP package on normalized DNA methylation values between HCC and healthy controls revealed 24863 differentially methylated sites at a threshold of p<1×10−7. Of these, 370 robust differentially methylated CGs were shortlisted at a threshold of p<1×10−7 and delta beta >0.3 or <−0.3 (Table 6), and hierarchical clustering of the healthy control and HCC T cell DNA by one minus Pearson correlation was performed (FIG. 11). These 370 CGs correctly cluster all samples into two groups: HCC and controls. A kit, comprising means and reagents for detecting DNA methylation measurements of the CG IDs of Table 3, could be used for predicting hepatocellular carcinoma (HCC) stages and chronic hepatitis.
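The shortlisting rule above (p<1×10−7 and delta beta >0.3 or <−0.3) can be expressed as a simple filter. The Python sketch below assumes that delta beta is the mean beta in HCC minus the mean beta in controls, which the text does not state explicitly; the site IDs and values are hypothetical placeholders.

```python
def shortlist_sites(results, p_cutoff=1e-7, delta_cutoff=0.3):
    """Keep CG sites with p below p_cutoff and |delta beta| above delta_cutoff.

    results maps a CG ID to (p_value, mean_beta_hcc, mean_beta_control).
    Delta beta is taken as HCC minus control (an assumption for this sketch).
    """
    kept = []
    for cg_id, (p_value, beta_hcc, beta_ctrl) in results.items():
        delta_beta = beta_hcc - beta_ctrl
        if p_value < p_cutoff and abs(delta_beta) > delta_cutoff:
            kept.append(cg_id)
    return sorted(kept)


# Hypothetical regression output (placeholder IDs, not study data).
results = {
    "cg_demo_1": (1e-9, 0.80, 0.40),  # significant, large gain of methylation
    "cg_demo_2": (1e-9, 0.55, 0.40),  # significant, but the difference is too small
    "cg_demo_3": (1e-3, 0.90, 0.30),  # large difference, but p is too high
    "cg_demo_4": (1e-8, 0.20, 0.60),  # significant, large loss of methylation
}
print(shortlist_sites(results))
```

Requiring both a small p-value and a large effect size retains sites whose methylation difference is big enough to be measured reliably in a diagnostic assay.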









TABLE 6

List of top significant 370 CG IDs derived from T cells that differentiate HCC from healthy control in cell DNA.

cg00014638  cg02015053  cg03568507  cg06098530  cg08313420  cg10918327
cg00052964  cg02086310  cg03692651  cg06168204  cg08479516  cg10923662
cg00167275  cg02132714  cg03764364  cg06279274  cg08566455  cg11065621
cg00168785  cg02142483  cg03853208  cg06445016  cg08641990  cg11080540
cg00257775  cg02152108  cg03894796  cg06477663  cg08644463  cg11157127
cg00399683  cg02193146  cg03909800  cg06488150  cg08826152  cg11231949
cg00404641  cg02314201  cg03911306  cg06568880  cg08946713  cg11262262
cg00431894  cg02322400  cg03942932  cg06652329  cg09122035  cg11556164
cg00434461  cg02490460  cg03976645  cg06816239  cg09259081  cg11692124
cg00452133  cg02536838  cg04083575  cg06822816  cg09324669  cg11706775
cg00500229  cg02556954  cg04116354  cg06850005  cg09555124  cg11718162
cg00674365  cg02710015  cg04192168  cg06895913  cg09639931  cg11909467
cg00772991  cg02717454  cg04398282  cg07019386  cg09681977  cg11955727
cg00804338  cg02750262  cg04536922  cg07052063  cg09696535  cg11958644
cg00815832  cg02849693  cg04656070  cg07065759  cg09750084  cg12019814
cg00898013  cg02863594  cg04771084  cg07145988  cg10036013  cg12099423
cg01044293  cg02914652  cg04864807  cg07249730  cg10061361  cg12161228
cg01116137  cg02939781  cg04998202  cg07266910  cg10091662  cg12299554
cg01124132  cg02976588  cg05084827  cg07381872  cg10167378  cg12315391
cg01254303  cg02991085  cg05107535  cg07385778  cg10184328  cg12427303
cg01305421  cg03035849  cg05132077  cg07721852  cg10185424  cg12549858
cg01359822  cg03151810  cg05157625  cg07772781  cg10196532  cg12583076
cg01366985  cg03204322  cg05217983  cg07834396  cg10274682  cg12649038
cg01405107  cg03215181  cg05304366  cg07850527  cg10341310  cg12691488
cg01413790  cg03400131  cg05348875  cg07912766  cg10530883  cg12727605
cg01557792  cg03441844  cg05429448  cg08038033  cg10549831  cg12777448
cg01832672  cg03461110  cg05460226  cg08113187  cg10555744  cg12789173
cg01921773  cg03541331  cg05512157  cg08123444  cg10584024  cg12856392
cg01927745  cg03544320  cg05554346  cg08280368  cg10890302  cg12868738
cg01992590  cg03546163  cg05759347  cg08306955  cg10909506  cg12880685
cg12906381  cg15009198  cg17335387  cg19795616  cg22404498  cg24919348
cg12963656  cg15011899  cg17372657  cg19841369  cg22589728  cg25100962
cg12970155  cg15046123  cg17597631  cg19930116  cg22656550  cg25104397
cg13260278  cg15109018  cg17718703  cg19988492  cg22668906  cg25174412
cg13286116  cg15145341  cg17741339  cg20197130  cg22675447  cg25188006
cg13308137  cg15302376  cg17765025  cg20222519  cg22747380  cg25310233
cg13401703  cg15331834  cg17766305  cg20478129  cg22945413  cg25353287
cg13404054  cg15514380  cg17775490  cg20585841  cg23299919  cg25459280
cg13405775  cg15514896  cg17786894  cg20587236  cg23486701  cg25461186
cg13435137  cg15598244  cg17837517  cg20606062  cg23771949  cg25502144
cg13466988  cg15695738  cg17988310  cg20625523  cg23824902  cg25673720
cg13679714  cg15704219  cg18031596  cg20769177  cg23829949  cg25779483
cg13896699  cg15720112  cg18051353  cg20781967  cg23880736  cg25784220
cg13904970  cg15747825  cg18128914  cg20995304  cg23944804  cg25891647
cg13912027  cg15756407  cg18132851  cg21092324  cg24056269  cg25964728
cg13939291  cg15867698  cg18182216  cg21222426  cg24065504  cg26015683
cg14140403  cg16111924  cg18214661  cg21226442  cg24070198  cg26250154
cg14242995  cg16218221  cg18273840  cg21358380  cg24142603  cg26325335
cg14276584  cg16259904  cg18297196  cg21384492  cg24169486  cg26402555
cg14326196  cg16292016  cg18370682  cg21386573  cg24232444  cg26405097
cg14362178  cg16306870  cg18417954  cg21487856  cg24383056  cg26407558
cg14376836  cg16496269  cg18766900  cg21816330  cg24405716  cg26465602
cg14419424  cg16512390  cg18804667  cg21833076  cg24453118  cg26475911
cg14734614  cg16763089  cg18808261  cg21918548  cg24536818  cg26594335
cg14762436  cg16810031  cg19095568  cg22088248  cg24616553  cg26803268
cg14774438  cg16894855  cg19140262  cg22143698  cg24631428  cg26827373
cg14858267  cg16924102  cg19193595  cg22256433  cg24680439  cg26856443
cg14898127  cg17144149  cg19266387  cg22301128  cg24716416  cg26876834
cg14914552  cg17173975  cg19760965  cg22303909  cg24729928  cg26963367
cg15000827  cg17221813  cg19768229  cg22374742  cg24742520  cg27010159
cg27098685  cg27113419  cg27186013  cg27207470  cg27247736  cg27300829
cg27406664  cg27408285  cg27544294  cg27576694









Embodiment 7. Utility of DNA Methylation Marker Discovered in T Cells to Predict “Untrained” HCC and Chronic Hepatitis Patients

These 370 CG sites that differentiate T cells of HCC patients from T cells of healthy controls (Table 6) could be used to cluster different “untrained” chronic hepatitis and healthy control PBMC samples (n=69). The clustering analysis presented in FIG. 12 shows that the 370 CG sites that are differentially methylated in T cell DNA cluster individual HCC, hepatitis and healthy control DNA from PBMC with 100% accuracy. Thus, the differentially methylated CGs discovered using T cell DNA were “cross-validated” on different patients (29 different patients with HCC, and 20 with chronic hepatitis) using DNA methylation measurements in PBMC.


Embodiment 8. Utility of 350 CG Sites (Table 3) and 31CG Sites (Table 5) Derived from Analysis of PBMC DNA in Predicting HCC Cancer Using T Cell DNA

The 350 CGs that were derived by analysis of PBMC DNA clustered the T cell healthy controls and HCC samples correctly (FIG. 13A). There is a highly significant overlap between the significant CGs (Fisher, p<1×10−7) that differentiate healthy controls from HCC using T cell DNA and CGs that differentiate the different HCC stages and controls using PBMC DNA (FIG. 13B).


The present invention also shows that the shortlisted 31 CGs derived by penalized regression from PBMC DNA methylation measurements (Table 5) also accurately cluster and stage T cell DNA methylation measurements from HCC patients and controls using one minus Pearson correlations (FIG. 13C). These data demonstrate that the differences in DNA methylation between HCC and other samples remain even when the complexity of cell types is reduced by isolation of particular cell types, and provide further “cross-validation” for the association of these CGs with HCC and for their predictive value.


Embodiment 9. Differentially Methylated Genes in PBMC in HCC are Enriched in Immune Related Canonical Pathways

Progression of HCC has a broad footprint in the methylome (the genome-wide DNA methylation profile) (FIG. 1). To gain insight into the functional footprint of the differentially methylated genes in PBMC and T cells from HCC patients, the gene lists generated from the differential methylation analyses were subjected to a gene set enrichment analysis using Ingenuity Pathway Analysis (IPA). Genes associated with CGs that show a linear correlation with stages of HCC in the Pearson correlation analysis (FIG. 1) (r>0.8 or r<−0.8; delta beta>0.2 or delta beta<−0.2) were analyzed first. Notably, the top upstream regulators of genes associated with these CGs are TGFbeta (p<1.09×10−17), TNF (p<7.32×10−15), dexamethasone (p<7.74×10−12) and estradiol (p<4×10−12), which are major inflammation and stress regulators of the immune system. Top diseases identified were cancer (p<1×10−5 to 2×10−51) and hepatic disease (p<1.24×10−5 to 1.11×10−25). A strong signal was noted for liver hyperplasia (p<6.19×10−1 to 1.11×10−25) and hepatocellular carcinoma (p<5.2×10−1 to 3.76×10−25). An inspection of the genes that are differentially methylated reveals a large representation of immune regulatory molecules such as IL2, IL4, IL5, IL16, IL7, IL10, IL18, IL24 and IL1B, and interleukin receptors such as IL12RB2, IL1B, IL1R1, IL1R2, IL2RA, IL4R and IL5RA; chemokines such as CCL1, CCL7, CCL18 and CCL24, as well as chemokine receptors such as CCR6, CCR7 and CCR9; cellular receptors such as CD2, CD6, CD14, CD38, CD44, CD80 and CD83; and TGFbeta3, TGFbetaI, NFKB, STAT1, STAT3 and TNFa.


A comparative IPA analysis of the genes differentially methylated in PBMC and in T cells revealed NFKB, TNF, VEGF, IL4 and NFAT as common upstream regulators. Overall, the DNA methylation alterations in HCC PBMC and T cells show a strong signature in immune modulation functions. Differentially methylated promoters between HCC and noncancerous liver tissue were previously delineated (16, 58). The present invention determined whether there was an overlap between the promoters that are differentially methylated in HCC in the cancer biopsies (1983 promoters) and in peripheral blood mononuclear cells (545 promoters), and found an overlap of 44 promoters, which was not statistically significant as determined by the Fisher hypergeometric test (p=0.76). These data show that the changes in DNA methylation seen in peripheral blood mononuclear cells reflect changes in the immune system in HCC and that these differentially methylated CGs are most probably not a footprint of circulating DNA from tumors or “surrogates” of DNA methylation changes occurring in the tumor. The utility of these pathways lies in providing new targets for cancer therapeutics in the peripheral immune system.


Embodiment 10. Predicting HCC and Cancer by Pyrosequencing of Differentially Methylated CGs

Pyrosequencing was performed using the PyroMark Q24 machine and results were analyzed with PyroMark® Q24 Software (Qiagen). All data were expressed as mean±standard error of the mean (SEM). The statistical analysis was undertaken using R. Primers used for the analysis are listed in Table 7.









TABLE 7

Pyrosequencing assays for HCC predictors: AHNAK, SLFN12L, AKAP7, STAP1.

Gene      Primers                  Sequence (5′-3′)

AHNAK     out Forward              GGATGTGTCGAGTAGTAGGGT
          out Reverse              CCTATCATCTCCACACTAACGCT
          nest Forward             TGTTAGGGGTGATTTTTAGAGG
          nest Reverse (biotin)    ATTAACCCCATTTCCATCCTAACTATCTT
          sequencing primer        TTTTAGAGGAGTTTTTTTTTTTTA

SLFN12L   out Forward              GTGATYTTGGTYAYTGTAAYYT
          out Reverse              TCTCATCTTTCCATARACATTTATTTAR
          nest Forward             AGGGTTTYAYTATATTAGYYAGGTTGG
          nest Reverse (biotin)    ATRCAAACCATRCARCCCTTTTRC
          sequencing primer        YYYAAAATAYTGAGATTATAGGTGT

AKAP7     out Forward              TAGGAGAAAGGGTYTTATTGTGGT
          out Reverse              ACACACCCTACCTTTTTCACTCCA
          nest Forward             GGTATTGATTTATGGTTAGGGATTTATAG
          nest Reverse (biotin)    AAACAAAAAAAACTCCACCTCCAATCC
          sequencing primer        GGGATTTATAGTTTTGTGAGA

STAP1     out Forward              AGTYATGTYTTYTGYAAATAAAAATGGAYAYY
          out Reverse              TTRCTTTTTAACCACCAACACTACC
          nest Forward             YYGTTTYTTTYATYTTYTGGTGATGTTAA
          nest Reverse (biotin)    ARARRRCAATCTCTRRRTAATCCACATRTR
          sequencing primer        GGTGATGTTAATYTTYTGTTTA









For the replication set, T cell DNA was used to reduce cell composition issues. The replication set included 79 people: 10 healthy controls, 10 individuals from each of the hepatitis B and hepatitis C groups, 10 individuals from each of 3 cancer stages, and 19 stage 1 samples (Table 2). The following genes, which were found to be significantly differentially methylated in T cells from HCC patients in the discovery set, were examined: STAP1 (cg04398282) (also included in Table 6), AKAP7 (cg12700074) and SLFN12L (cg00974761). One additional gene that is hypomethylated in HCC was included: neuroblast differentiation-associated protein (AHNAK) (cg14171514). Linear regression between all controls (healthy and hepatitis B and C) and HCC stages 1 and 2 (BCLC 0+A) revealed a significant association with HCC stages 1 and 2 for all 4 CGs after correction for multiple testing (STAP1 p=4.04×10−7; AKAP7 p=0.046; SLFN12L p=0.012; AHNAK p=0.003436). Linear regression between all controls and all stages of HCC revealed a significant association with HCC for STAP1 (p=6.6×10−6) and AHNAK (p=0.026) after correction for multiple testing.


ANOVA revealed a significant difference in methylation between the control group (healthy controls and hepatitis B and C) and the group of early HCC (stages 1 and 2; BCLC 0+A) in all 4 CGs that were validated. A group comparison between all controls and all HCC revealed a significant difference in methylation for STAP1 (p=1.7×10−6), AKAP7 (p=0.042) and AHNAK (p=0.0062), whereas the difference for SLFN12L showed a trend but was not significant (p=0.071). ANOVA revealed a significant effect of diagnosis (F=10.017; p=7.49×10−6) on STAP1 methylation.


Pairwise analysis after correction for multiple testing on the 5 different diagnosis subgroups, comprising controls (healthy controls, chronic hepatitis B and chronic hepatitis C) and early HCC (stages 1 and 2, or BCLC 0 and A), revealed significant differences between stage 1 (BCLC 0) HCC and either healthy controls (p=0.00037), chronic hepatitis B (p=0.00849) or hepatitis C (p=0.00698), and between stage 2 (BCLC A) and either healthy controls (p=0.00018), hepatitis B (p=0.00670) or hepatitis C (p=0.00534). While there was also an effect of diagnosis on SLFN12L methylation (F=3.9376; p=0.00810), AHNAK methylation (F=3.0219; p=0.02809) and AKAP7 methylation (F=3.4; p=0.01633), pairwise comparisons between the different diagnosis subgroups were not significant for these genes.


These data illustrate that these 4 CG sites could be used to predict early stages of HCC and differentiate them from controls (FIG. 14).


Embodiment 11. Utility of the Discovered List of Differentially Methylated CGs to Predict HCC by Receiver Operating Characteristic (ROC) Analysis; the Example of STAP1

A measure of the diagnostic value of a biomarker is the Receiver Operating Characteristic (ROC) curve, which plots “sensitivity” (the fraction of true cases that are detected) against one minus “specificity” (the fraction of controls that are falsely called positive). The ROC test determines a threshold value (i.e. percentage of methylation at a particular CG) that provides the most accurate prediction (the highest fraction of “true discoveries” and the least number of “false discoveries”) (59) (FIG. 15). The DNA methylation level of each sample is compared to a threshold DNA methylation value, and the sample is then classified as either control or HCC. The present invention first determined ROC characteristics for the normalized Illumina 450K beta values for T cells from healthy controls and HCC (FIG. 15A). The STAP1 gene cg04398282 behaves as a perfect biomarker. With a threshold DNA methylation beta value of 0.757 (any sample with a higher value is classified as HCC and any sample with a value lower than 0.757 as control), the accuracy for calling HCC samples was 100%, the AUC was 1 and both sensitivity and specificity were 100%. The STAP1 biomarker was discovered by comparing T cell DNA methylation from HCC and healthy controls. The biomarker properties of STAP1 cg04398282 could therefore be cross-validated by examining the ROC characteristics using normalized beta values from the PBMC DNA samples, which included hepatitis B and hepatitis C patients as well as 29 additional HCC patients that were not included in the T cell DNA methylation analysis (FIG. 15B). The accuracy of predicting all HCC samples (all stages) using PBMC DNA was 96% using a threshold beta value of 0.6729, and the AUC was 0.9741379 (sensitivity 0.975 and specificity 0.973). The ROC characteristics were then examined using pyrosequencing values of STAP1 in the replication set of T cell DNA (FIG. 16). The methylation values of this STAP1 CG site as quantified by pyrosequencing were overall lower than the Illumina 450K values. At a threshold of DNA methylation of 40.2% for STAP1 cg04398282, the accuracy of calling HCC from all other controls (healthy and hepatitis B and C) is 82.2%. The area under the curve (AUC) for discrimination between HCC and all controls is 0.8 (85% sensitivity and 73% specificity) (FIG. 16A). At a threshold of 50.12% methylation of STAP1 cg04398282, the accuracy of calling HCC stage 1 from all controls is 83.6% and the AUC is 0.89 (84% sensitivity and 83% specificity). The accuracy of differentiating HCC stage 1 from healthy controls (FIG. 16A) is 93% at a threshold methylation level of 47.2% and the AUC is 0.94 (94% sensitivity and 94% specificity) (FIG. 16B). In summary, STAP1 illustrates that DNA methylation biomarkers in HCC peripheral blood mononuclear cells could be used for discriminating stage 1 from chronic hepatitis and healthy controls, which is a critical hurdle in early diagnosis of liver cancer. STAP1 was identified using T cell DNA and was validated in the replication set (FIG. 14).
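The ROC quantities used above (sensitivity and specificity at each threshold, the AUC, and the threshold giving the highest accuracy) can be computed as in the following Python sketch. The function names are illustrative, the methylation values are hypothetical, and a sample is called HCC when its value is above the threshold, as in the 450K example in the text.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for every observed score.

    labels are 1 for HCC and 0 for control; a sample is called HCC when its
    score is strictly above the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, lab in zip(scores, labels) if lab == 1 and s > threshold)
        tn = sum(1 for s, lab in zip(scores, labels) if lab == 0 and s <= threshold)
        points.append((threshold, tp / n_pos, tn / n_neg))
    return points


def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_threshold(scores, labels):
    """Threshold with the highest accuracy (the lowest one if several tie)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = max(roc_points(scores, labels),
               key=lambda pt: (pt[1] * n_pos + pt[2] * n_neg) / len(labels))
    return best[0]


# Hypothetical beta values (not study data): three HCC and three control samples.
scores = [0.82, 0.79, 0.81, 0.60, 0.70, 0.65]
labels = [1, 1, 1, 0, 0, 0]
print(auc(scores, labels), best_threshold(scores, labels))
```

An AUC of 1 means that every HCC sample has a higher value than every control, so a single threshold separates the two groups without error.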


The methods used here to measure DNA methylation are provided only as examples and do not exclude measurement of DNA methylation by other acceptable methods. It should be noted that any person skilled in the art could measure DNA methylation of STAP1 and other differentially methylated sites using a number of accepted and available methods that are well documented in the public domain, including, for example, Illumina 850K arrays, mass spectrometry based methods such as EpiTYPER (Sequenom), PCR amplification using methylation specific primers (MS-PCR), high resolution melting (HRM), DNA methylation sensitive restriction enzymes and bisulfite sequencing.


Applications of this Invention


The applications of this invention are in the field of molecular diagnostics of HCC and cancer in general. Any person skilled in the art could use this invention to derive similar biomarkers for other cancers. Moreover, the genes and the pathways derived from the genes can guide the development of new drugs that focus on the peripheral immune system, using the targets listed in Embodiment 9. The focus of DNA methylation studies in cancer to date has been on the tumor, the tumor microenvironment (8, 9) and circulating tumor DNA (5, 6), and major advances have been made in this respect. However, the question remains whether there are DNA methylation changes in host systems that could instruct us on the system-wide mechanisms of the disease and/or serve as noninvasive predictors of cancer. HCC is a very interesting example since it frequently progresses from preexisting chronic hepatitis and liver cirrhosis (2) and could provide a tractable clinical paradigm for addressing this question. This invention reveals that the qualities of the host immune system might define the clinical emergence and trajectory of cancer.


Importantly, the present invention shows a sharp boundary between stage 1 of HCC and chronic hepatitis B and C that could be used to diagnose early transition from chronic hepatitis to HCC, as illustrated in the embodiments of this invention. The present invention also shows how stages of cancer can be separated from each other. All assays will require a set of known samples with methylation values for the CG IDs disclosed in this invention to train the models using hierarchical clustering, ROC or penalized regression; unknown samples will then be analyzed using these models, as illustrated in the embodiments of this invention.


The fact that the present invention mentions different dependent claims does not mean that a combination of these claims cannot be used for predicting cancer. The examples disclosed here for measuring, statistically analyzing and predicting cancer, stages of cancer and chronic hepatitis should not be considered limiting. Various other methods for measuring DNA methylation in cancer patients will be apparent to those skilled in the art, such as Illumina 850K arrays, capture array sequencing, next generation sequencing, methylation specific PCR, EpiTYPER, restriction enzyme based analyses and other methods found in the public domain. Similarly, there are numerous statistical methods in the public domain, in addition to those listed here, that can be used with this invention for prediction of cancer in patient samples.


REFERENCES



  • 1. El-Serag H B. Hepatocellular carcinoma. N Engl J Med. 2011; 365:1118-27.

  • 2. Flores A, Marrero J A. Emerging trends in hepatocellular carcinoma: focus on diagnosis and therapeutics. Clinical Medicine Insights Oncology. 2014; 8:71-6.

  • 3. Tan C H, Low S C, Thng C H. APASL and AASLD Consensus Guidelines on Imaging Diagnosis of Hepatocellular Carcinoma: A Review. International journal of hepatology. 2011; 2011:519783.

  • 4. Valente S, Liu Y, Schnekenburger M, Zwergel C, Cosconati S, Gros C, et al. Selective non-nucleoside inhibitors of human DNA methyltransferases active in cancer including in cancer stem cells. J Med Chem. 2014; 57:701-13.

  • 5. Jiao L, Zhu J, Hassan M M, Evans D B, Abbruzzese J L, Li D. K-ras mutation and p16 and preproenkephalin promoter hypermethylation in plasma DNA of pancreatic cancer patients: in relation to cigarette smoking. Pancreas. 2007; 34:55-62.

  • 6. Park J W, Baek I H, Kim Y T. Preliminary study analyzing the methylated genes in the plasma of patients with pancreatic cancer. Scand J Surg. 2012; 101:38-44.

  • 7. Dirix L, Van Dam P, Vermeulen P. Genomics and circulating tumor cells: promising tools for choosing and monitoring adjuvant therapy in patients with early breast cancer? Curr Opin Oncol. 2005; 17:551-8.

  • 8. Finak G, Laferriere J, Hallett M, Park M. [The tumor microenvironment: a new tool to predict breast cancer outcome]. Med Sci (Paris). 2009; 25:439-41.

  • 9. Finak G, Sadekova S, Pepin F, Hallett M, Meterissian S, Halwani F, et al. Gene expression signatures of morphologically normal breast tissue identify basal-like tumors. Breast Cancer Res. 2006; 8:R58.

  • 10. Sehouli J, Loddenkemper C, Cornu T, Schwachula T, Hoffmuller U, Grutzkau A, et al. Epigenetic quantification of tumor-infiltrating T-lymphocytes. Epigenetics. 2011; 6:236-46.

  • 11. Jeschke J, Collignon E, Fuks F. DNA methylome profiling beyond promoters: taking an epigenetic snapshot of the breast tumor microenvironment. FEBS J. 2014.

  • 12. Baylin S B, Esteller M, Rountree M R, Bachman K E, Schuebel K, Herman J G. Aberrant patterns of DNA methylation, chromatin formation and gene expression in cancer. Hum Mol Genet. 2001; 10:687-92.

  • 13. Issa J P, Vertino P M, Wu J, Sazawal S, Celano P, Nelkin B D, et al. Increased cytosine DNA-methyltransferase activity during colon cancer progression. J Natl Cancer Inst. 1993; 85:1235-40.

  • 14. Ehrlich M. DNA methylation in cancer: too much, but also too little. Oncogene. 2002; 21:5400-13.

  • 15. Aguirre-Ghiso J A. Models, mechanisms and clinical evidence for cancer dormancy. Nat Rev Cancer. 2007; 7:834-46.

  • 16. Stefanska B, Huang J, Bhattacharyya B, Suderman M, Hallett M, Han Z G, et al. Definition of the landscape of promoter DNA hypomethylation in liver cancer. Cancer Res. 2011; 71:5891-903.

  • 17. Stefansson O A, Moran S, Gomez A, Sayols S, Arribas-Jorba C, Sandoval J, et al. A DNA methylation-based definition of biologically distinct breast cancer subtypes. Mol Oncol. 2014.

  • 18. Radpour R, Barekati Z, Kohler C, Lv Q, Burki N, Diesch C, et al. Hypermethylation of tumor suppressor genes involved in critical regulatory pathways for developing a blood-based test in breast cancer. PLoS One. 2011; 6:e16080.

  • 19. Ramzy I I, Omran D A, Hamad O, Shaker O, Abboud A. Evaluation of serum LINE-1 hypomethylation as a prognostic marker for hepatocellular carcinoma. Arab journal of gastroenterology: the official publication of the Pan-Arab Association of Gastroenterology. 2011; 12:139-42.

  • 20. Chan K C, Jiang P, Chan C W, Sun K, Wong J, Hui E P, et al. Noninvasive detection of cancer-associated genome-wide hypomethylation and copy number aberrations by plasma DNA bisulfite sequencing. Proc Natl Acad Sci USA. 2013; 110:18761-8.

  • 21. Blair G E, Cook G P. Cancer and the immune system: an overview. Oncogene. 2008; 27:5868.

  • 22. Ehrlich P. Ueber den jetzigen Stand der Karzinomforschung. Ned Tijdschr Geneeskd. 1909; 5:273-90.

  • 23. Vesely M D, Kershaw M H, Schreiber R D, Smyth M J. Natural innate and adaptive immunity to cancer. Annual review of immunology. 2011; 29:235-71.

  • 24. Dunn G P, Bruce A T, Ikeda H, Old L J, Schreiber R D. Cancer immunoediting: from immunosurveillance to tumor escape. Nature immunology. 2002; 3:991-8.

  • 25. Swann J B, Smyth M J. Immune surveillance of tumors. The Journal of clinical investigation. 2007; 117:1137-46.

  • 26. Mackensen A, Ferradini L, Carcelain G, Triebel F, Faure F, Viel S, et al. Evidence for in situ amplification of cytotoxic T-lymphocytes with antitumor activity in a human regressive melanoma. Cancer research. 1993; 53:3569-73.

  • 27. Ferradini L, Mackensen A, Genevee C, Bosq J, Duvillard P, Avril M F, et al. Analysis of T cell receptor variability in tumor-infiltrating lymphocytes from a human regressive melanoma. Evidence for in situ T cell clonal expansion. The Journal of clinical investigation. 1993; 91:1183-90.

  • 28. Zorn E, Hercend T. A natural cytotoxic T cell response in a spontaneously regressing human melanoma targets a neoantigen resulting from a somatic point mutation. European journal of immunology. 1999; 29:592-601.

  • 29. Zorn E, Hercend T. A MAGE-6-encoded peptide is recognized by expanded lymphocytes infiltrating a spontaneously regressing human primary melanoma lesion. European journal of immunology. 1999; 29:602-7.

  • 30. Carcelain G, Rouas-Freiss N, Zorn E, Chung-Scott V, Viel S, Faure F, et al. In situ T-cell responses in a primary regressive melanoma and subsequent metastases: a comparative analysis. International journal of cancer Journal international du cancer. 1997; 72:241-7.

  • 31. Knuth A, Danowski B, Oettgen H F, Old L J. T-cell-mediated cytotoxicity against autologous malignant melanoma: analysis with interleukin 2-dependent T-cell cultures. Proceedings of the National Academy of Sciences of the United States of America. 1984; 81:3511-5.

  • 32. Schumacher K, Haensch W, Roefzaad C, Schlag P M. Prognostic significance of activated CD8(+) T cell infiltrations within esophageal carcinomas. Cancer research. 2001; 61:3932-6.

  • 33. Conejo-Garcia J R, Benencia F, Courreges M C, Gimotty P A, Khang E, Buckanovich R J, et al. Ovarian carcinoma expresses the NKG2D ligand Letal and promotes the survival and expansion of CD28− antitumor T cells. Cancer research. 2004; 64:2175-82.

  • 34. Sato E, Olson S H, Ahn J, Bundy B, Nishikawa H, Qian F, et al. Intraepithelial CD8+ tumor-infiltrating lymphocytes and a high CD8+/regulatory T cell ratio are associated with favorable prognosis in ovarian cancer. Proceedings of the National Academy of Sciences of the United States of America. 2005; 102:18538-43.

  • 35. Naito Y, Saito K, Shiiba K, Ohuchi A, Saigenji K, Nagura H, et al. CD8+ T cells infiltrated within cancer cell nests as a prognostic factor in human colorectal cancer. Cancer research. 1998; 58:3491-4.

  • 36. Galon J, Costes A, Sanchez-Cabo F, Kirilovsky A, Mlecnik B, Lagorce-Pages C, et al. Type, density, and location of immune cells within human colorectal tumors predict clinical outcome. Science. 2006; 313:1960-4.

  • 37. Pages F, Berger A, Camus M, Sanchez-Cabo F, Costes A, Molidor R, et al. Effector memory T cells, early metastasis, and survival in colorectal cancer. The New England journal of medicine. 2005; 353:2654-66.

  • 38. Teng M W, Vesely M D, Duret H, McLaughlin N, Towne J E, Schreiber R D, et al. Opposing roles for IL-23 and IL-12 in maintaining occult cancer in an equilibrium state. Cancer Res. 2012; 72:3987-96.

  • 39. Finak G, Bertos N, Pepin F, Sadekova S, Souleimanova M, Zhao H, et al. Stromal gene expression predicts clinical outcome in breast cancer. Nat Med. 2008; 14:518-27.

  • 40. Kristensen V N, Vaske C J, Ursini-Siegel J, Van Loo P, Nordgard S H, Sachidanandam R, et al. Integrated molecular profiles of invasive breast tumors and ductal carcinoma in situ (DCIS) reveal differential vascular and interleukin signaling. Proc Natl Acad Sci USA. 2011.

  • 41. Teschendorff A E, Menon U, Gentry-Maharaj A, Ramus S J, Gayther S A, Apostolidou S, et al. An epigenetic signature in peripheral blood predicts active ovarian cancer. PLoS One. 2009; 4:e8274.

  • 42. Widschwendter M, Apostolidou S, Raum E, Rothenbacher D, Fiegl H, Menon U, et al. Epigenotyping in peripheral blood cell DNA and breast cancer risk: a proof of principle study. PLoS One. 2008; 3:e2656.

  • 43. Xu Z, Bolick S C, DeRoo L A, Weinberg C R, Sandler D P, Taylor J A. Epigenome-wide association study of breast cancer using prospectively collected sister study samples. J Natl Cancer Inst. 2013; 105:694-700.

  • 44. Koestler D C, Marsit C J, Christensen B C, Accomando W, Langevin S M, Houseman E A, et al. Peripheral blood immune cell methylation profiles are associated with nonhematopoietic cancers. Cancer Epidemiol Biomarkers Prev. 2012; 21:1293-302.

  • 45. Langevin S M, Houseman E A, Accomando W P, Koestler D C, Christensen B C, Nelson H H, et al. Leukocyte-adjusted epigenome-wide association studies of blood from solid tumor patients. Epigenetics. 2014; 9:884-95.

  • 46. Kanof M E, Smith P D, Zola H. Preparation of human mononuclear cell populations and subpopulations. Current Protocols in Immunology.

  • 47. Morris T J, Butcher L M, Feber A, Teschendorff A E, Chakravarthy A R, Wojdacz T K, et al. ChAMP: 450k Chip Analysis Methylation Pipeline. Bioinformatics. 2014; 30:428-30.

  • 48. Marzouka N A, Nordlund J, Backlin C L, Lonnerholm G, Syvanen A C, Carlsson Almlof J. CopyNumber450kCancer: baseline correction for accurate copy number calling from the 450k methylation array. Bioinformatics. 2015.

  • 49. Houseman E A, Accomando W P, Koestler D C, Christensen B C, Marsit C J, Nelson H H, et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics. 2012; 13:86.

  • 50. Smyth G K, Michaud J, Scott H S. Use of within-array replicate spots for assessing differential expression in microarray experiments. Bioinformatics. 2005; 21:2067-75.

  • 51. Goeman J J. L1 penalized estimation in the Cox proportional hazards model. Biometrical journal Biometrische Zeitschrift. 2010; 52:70-84.

  • 52. Wan E S, Qiu W, Carey V J, Morrow J, Bacherman H, Foreman M G, et al. Smoking Associated Site Specific Differential Methylation in Buccal Mucosa in the COPDGene Study. Am J Respir Cell Mol Biol. 2014.

  • 53. Allione A, Marcon F, Fiorito G, Guarrera S, Siniscalchi E, Zijno A, et al. Novel Epigenetic Changes Unveiled by Monozygotic Twins Discordant for Smoking Habits. PLoS One. 2015; 10:e0128265.

  • 54. Cheng L, Liu J, Li B, Liu S, Li X, Tu H. Cigarette smoke-induced hypermethylation of the GCLC gene is associated with chronic obstructive pulmonary disease. Chest. 2015.

  • 55. Li H, Hedmer M, Wojdacz T, Hossain M B, Lindh C H, Tinnerberg H, et al. Oxidative stress, telomere shortening, and DNA methylation in relation to low-to-moderate occupational exposure to welding fumes. Environ Mol Mutagen. 2015.

  • 56. Liu J, Morgan M, Hutchison K, Calhoun V D. A study of the influence of sex on genome wide methylation. PLoS One. 2010; 5:e10028.

  • 57. Horvath S. DNA methylation age of human tissues and cell types. Genome Biol. 2013; 14:R115.

  • 58. Stefanska B, Huang J, Bhattacharyya B, Suderman M, Hallett M, Han Z G, et al. Definition of the landscape of promoter DNA hypomethylation in liver cancer. Cancer Res. 2011.

  • 59. Mandrekar J N. Receiver operating characteristic curve in diagnostic test assessment. J Thorac Oncol. 2010; 5:1315-6.

  • 60. Di Bisceglie A M. Hepatitis B and hepatocellular carcinoma. Hepatology. 2009; 49:S56-60.

  • 61. Hayashi P H, Di Bisceglie A M. The progression of hepatitis B- and C-infections to chronic liver disease and hepatocellular carcinoma: epidemiology and pathogenesis. Med Clin North Am. 2005; 89:371-89.


Claims
  • 1. A DNA methylation signature of cancer in peripheral blood mononuclear cells (PBMC) for predicting cancer, wherein said DNA methylation signature is derived using a genome-wide DNA methylation mapping method selected from the group consisting of Illumina 450K or 850K arrays, genome-wide bisulfite sequencing, methylated DNA immunoprecipitation (MeDIP) sequencing, and hybridization with oligonucleotide arrays.
  • 2. The DNA methylation signature according to claim 1, wherein said DNA methylation signature is CG IDs derived from PBMC DNA for predicting hepatocellular carcinoma (HCC) stages and chronic hepatitis using either PBMC or T cells DNA methylation levels of said CG IDs, and wherein said CG IDs are selected from the group consisting of:
  • 3. The DNA methylation signature according to claim 1, wherein said DNA methylation signature is CG IDs derived from T cells for predicting HCC stages and chronic hepatitis using PBMC or T cells DNA methylation levels of said CG IDs, and wherein said CG IDs are selected from the group consisting of:
  • 4. The DNA methylation signature according to claim 1, wherein said DNA methylation signature is CG IDs for predicting different stages of HCC using DNA methylation measurements of said CG IDs in T cells or PBMC obtained by using statistical models comprising penalized regression or clustering analysis, and wherein said CG IDs are selected from the group consisting of:
      Target CG IDs for separating HCC stage 1 from controls: cg14983135, cg10203922, cg05941376, cg14762436, cg12019814, cg14426660, cg18882449, cg02914652;
      Target CG IDs for separating HCC stage 2 from controls: cg05941376, cg15188939, cg12344600, cg03496780, cg12019814;
      Target CG IDs for separating HCC stage 3 from controls: cg05941376, cg02782634, cg27284331, cg12019814, cg23981150;
      Target CG IDs for separating HCC stage 4 from controls: cg02782634, cg05941376, cg10203922, cg12019814, cg14914552, cg21164050, cg23981150;
      Target CG IDs for separating HCC stage 1 from hepatitis B: cg05941376, cg10203922, cg11767757, cg04398282, cg11151251, cg24742520, cg14711743;
      Target CG IDs for separating HCC stage 1 from stage 2-4: cg03252499, cg03481488, cg04398282, cg10203922, cg11783497, cg13710613, cg14762436, cg23486701;
      Target CG IDs for separating HCC stage 2 from stage 3-4: cg02914652, cg03252499, cg11783497, cg11911769, cg12019814, cg14711743, cg15607708, cg20956548, cg22876402, cg24958366; and
      Target CG IDs for separating HCC stage 1-3 from stage 4: cg02782634, cg11151251, cg24958366, cg06874640, cg27284331, cg16476382, cg14711743.
  • 5. The DNA methylation signature according to claim 1, wherein said DNA methylation signature is CG IDs for predicting stages of HCC using DNA methylation measurements of said CG IDs in T cells or PBMC obtained by using statistical models comprising penalized regression or clustering analysis, and wherein said CG IDs are selected from the group consisting of:
  • 6. A kit for predicting cancer, comprising means and reagents for detecting DNA methylation measurements of the DNA methylation signature according to claim 1.
  • 7. A kit for predicting hepatocellular carcinoma (HCC) stages and chronic hepatitis, comprising means and reagents for detecting DNA methylation measurements of the DNA methylation signature according to claim 2.
  • 8. A kit for predicting HCC stages and chronic hepatitis, comprising means and reagents for detecting DNA methylation measurements of the DNA methylation signature according to claim 3.
  • 9. A kit for predicting different stages of HCC, comprising means and reagents for detecting DNA methylation measurements of the DNA methylation signature according to claim 4.
  • 10. A kit for predicting stages of HCC, comprising means and reagents for detecting DNA methylation measurements of the DNA methylation signature according to claim 5.
  • 11. Gene pathways that are epigenetically regulated in cancer in the peripheral immune system.
  • 12. A method for predicting HCC using at least one DNA methylation signature of claim 1 in DNA pyrosequencing methylation assays.
  • 13. A method for predicting HCC using a DNA methylation signature of claim 2 in receiver operating characteristics (ROC) assays, wherein said DNA methylation signature is STAP1 (cg04398282).
  • 14. A method for predicting HCC using CG IDs of claim 2 in hierarchical clustering analysis.
  • 15. A method for identifying a DNA methylation signature for predicting disease, comprising the step of performing statistical analysis on DNA methylation measurements obtained from samples.
  • 16. The method according to claim 15, wherein said DNA methylation measurements are obtained by performing an Illumina Beadchip 450K or 850K assay of DNA extracted from a sample.
  • 17. The method according to claim 15, wherein said DNA methylation measurements are obtained by performing DNA pyrosequencing, mass spectrometry based (Epityper™) or PCR based methylation assays of DNA extracted from a sample.
  • 18. The method according to claim 15, wherein said statistical analysis comprises Pearson correlation.
  • 19. The method according to claim 15, wherein said statistical analysis comprises receiver operating characteristics (ROC) assays.
  • 20. The method according to claim 15, wherein said statistical analysis comprises hierarchical clustering analysis assays.
  • 21. A method for predicting HCC using at least one DNA methylation signature of claim 2 in DNA pyrosequencing methylation assays.
  • 22. A method for predicting HCC using at least one DNA methylation signature of claim 3 in DNA pyrosequencing methylation assays.
  • 23. A method for predicting HCC using at least one DNA methylation signature of claim 4 in DNA pyrosequencing methylation assays.
  • 24. A method for predicting HCC using at least one DNA methylation signature of claim 5 in DNA pyrosequencing methylation assays.
  • 25. A method for predicting HCC using at least one DNA methylation signature of claim 2 in hierarchical clustering analysis.
  • 26. A method for predicting HCC using at least one DNA methylation signature of claim 3 in hierarchical clustering analysis.
  • 27. A method for predicting HCC using at least one DNA methylation signature of claim 4 in hierarchical clustering analysis.
  • 28. A method for predicting HCC using at least one DNA methylation signature of claim 5 in hierarchical clustering analysis.
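
For illustration only, and not as part of the claims: the ROC analysis named in claims 13 and 19 can be sketched for a single CG ID as follows. This is a minimal sketch, assuming methylation is expressed as beta values between 0 and 1; the function name `roc_auc` and all sample values are hypothetical and are not data from this application. The area under the ROC curve (AUC) is computed here through its equivalence with the Mann-Whitney statistic, which may differ from the software actually used by the applicants.

```python
from typing import Sequence


def roc_auc(case_values: Sequence[float], control_values: Sequence[float]) -> float:
    """Area under the ROC curve for one marker.

    Equals the probability that a randomly chosen case has a higher
    value than a randomly chosen control, with ties counted as 0.5
    (the Mann-Whitney formulation of the AUC).
    """
    if not case_values or not control_values:
        raise ValueError("both groups need at least one measurement")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical beta values for one CG ID (e.g. cg04398282); NOT real data.
hcc_beta = [0.42, 0.38, 0.51, 0.35]
control_beta = [0.61, 0.58, 0.49, 0.66]

auc = roc_auc(hcc_beta, control_beta)
# An AUC far from 0.5 in either direction indicates separation; values
# below 0.5 mean the marker tends to be lower in cases than in controls.
print(f"AUC = {auc:.4f}, discrimination = {max(auc, 1.0 - auc):.4f}")
```

An AUC of 0.5 corresponds to no discrimination between the groups, and reporting `max(auc, 1 - auc)` makes the summary independent of whether the site is hypermethylated or hypomethylated in cases.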
PCT Information
  • Filing Document: PCT/CN2016/086845
  • Filing Date: 6/23/2016
  • Country: WO
  • Kind: 00