The innovation relates to the identification of biomarkers and their clinical use for predicting cancer risk.
Colorectal cancer (CRC) is the third most common cancer diagnosed in both men and women and is the second leading cause of cancer-related death in the United States. The early detection of CRC significantly improves the prognosis of patients and is a key factor in reducing the mortality rate from CRC. CRC can be readily cured by surgery if it is diagnosed early, specifically before metastasis is established. The 5-year relative survival rate for early-stage CRC is 90%; for advanced stage IV CRC, the survival rate drops to about 5%. However, only about 4 out of 10 CRC patients are diagnosed at the early stage, partially due to poor patient acceptance and/or sensitivity of available screening modalities. In comparison to colonoscopy, a blood-based test is non-invasive, convenient, and cost-effective with high acceptance by individuals, which can lead to greater screening compliance in the general population and a reduction in the incidence and mortality rates of this disease.
There is evidence that the risk of CRC can be modified by diet, lifestyle and environmental factors, which suggests epigenetic mechanisms are associated with CRC initiation and progression. Epigenetic mechanisms are heritable chemical modifications of DNA and chromatin involving alterations in DNA methylation, histone modifications and small noncoding microRNAs (miRNAs), which induce chromatin structural changes, thereby affecting gene activity. DNA methylation represents a more stable source of biological information than RNA or the expression of most proteins. It is the most common modification in the mammalian genome and occurs when a methyl group is added onto the C5 position of cytosine, thereby modifying gene function and affecting gene expression. Most DNA methylation occurs at cytosine residues that precede guanine residues, or CpG dinucleotides, which tend to cluster in DNA domains known as CpG islands.
The relationship between methylation and gene expression is complex. In general, DNA methylation of gene promoters is associated with transcriptional silencing, whereas methylation in gene bodies is associated with increased gene expression. Strong correlations between gene expression and CpG islands and island shores were demonstrated. Global hypomethylation is thought to influence CRC development by inducing chromosomal instability.
DNA methylation patterns in peripheral blood can be informative noninvasive biomarkers of cancer risk and prognosis with a high sensitivity and specificity. DNA methylation pattern alterations in the blood cells may reflect the microenvironment components which support cancer initiation and methylome changes in blood may also reflect changes that occur in colon cells during CRC progression.
In previous studies, a variety of epigenetic biomarkers have been evaluated in colorectal cancer for early detection and prognosis prediction; however, most of the studies focused on a single gene. For example, SEPT9 showed abnormal hypermethylation at its promoter and was considered to be a biomarker for CRC detection. However, the sensitivity of SEPT9 is 48.2% for CRC stages I-IV and much lower (11.2%) for the precancerous condition, advanced adenoma.
In recent years, genome-wide methylation profiling has helped to clarify the molecular mechanisms involved in CRC initiation and progression. There is a need for a less invasive, accurate test for detecting CRC, especially in the early stage of the disease.
The following presents a simplified summary of the innovation in order to provide a basic understanding of some aspects of the innovation. This summary is not an extensive overview of the innovation. It is not intended to identify key/critical elements of the innovation or to delineate the scope of the innovation. Its sole purpose is to present some concepts of the innovation in a simplified form as a prelude to the more detailed description that is presented later.
Most previous studies on blood-based DNA methylation biomarkers have relied on testing a limited number of pre-selected genes and on the use of non-quantitative detection methods, such as gel-based methylation-specific PCR.
A method according to the innovation can include genome-wide methylation profiling to investigate DNA methylation alterations in peripheral blood during colorectal cancer (CRC) initiation and progression. In particular, the method employs accurate non-invasive biomarkers to facilitate the early diagnosis of CRC. In one embodiment, this may be addressed in a comprehensive fashion by identifying DNA methylation alterations during CRC progression and development in blood and tissue specimens and integrating them with gene transcriptional changes.
In one embodiment, a bisulfite sequencing method can be performed to identify differentially methylated regions (DMRs) in peripheral blood samples from CRC patients versus a control group. The bisulfite sequencing analysis can result in the identification of a plurality of alterations in the methylome landscape in peripheral blood of patients with CRC.
The results indicate that differentially methylated regions (DMRs) associated with the gene body regions of Ras-related genes are hypermethylated. The activation of Ras signaling is involved in CRC initiation and development. Because methylation in gene bodies is generally associated with increased gene expression, it can reasonably be deduced that DNA methylation alterations in peripheral blood contribute to the activation of Ras signaling and colorectal tumorigenesis. In contrast, most DMRs associated with gene body regions of Rac-related genes were hypomethylated, suggesting DNA methylation alterations may inhibit Rac signaling, which is an important regulator of Arp2/3-dependent actin polymerization and phagocytosis of invading pathogens. DNA methylation alterations can also occur in the endocytosis pathway. One important innate immune defense mechanism is the ingestion of extracellular macromolecules through endocytosis or phagocytosis of whole bacteria in order to remove the inflammatory stimuli. DNA methylation alterations in CRC greatly compromise the ability of intestinal epithelial cells to respond to invading pathogens.
A genome-wide methylation analysis was conducted in CRC tumors (N=10) compared to adjacent normal tissues (N=10) to reveal functional genes with significant aberrant DNA methylation during carcinogenesis. The age and gender of tissue donors were comparable to blood donors.
Integrated analysis between the transcriptome and methylation profile in CRC was performed to reveal the DNA methylation changes in both CRC peripheral blood and tumors, and the underlying regulatory mechanisms of the impact of DNA methylation alteration on gene expression during CRC development. Genes with overlapping DMRs in blood and tumors and altered gene expression were selected as potential candidate biomarkers. Correlation analysis on hypermethylated or hypomethylated overlapping DMRs in both tumor tissue and blood using integrated reduced representation bisulfite sequencing (RRBS) and RNA-Seq analysis was then performed.
According to an aspect, the innovation provides a method for identifying cancer-related DNA methylation alterations in peripheral blood. In one embodiment, genome-wide DNA methylation analysis may be performed to identify cancer-related DNA methylation alterations. In one embodiment, the DNA methylation alterations may be used to identify CRC. In one embodiment, the DNA methylation alterations may be used to diagnose the stage (e.g., early vs. late stage) of the CRC. In one embodiment, DNA methylation alterations in the Ras/Rac signaling pathway may be used as blood-based diagnostic markers.
In one embodiment, the integrated analysis method according to the innovation allows for the efficient mapping of tumor-specific DNA methylation alterations in whole blood and tumor with an accompanying gene expression change, and was used to screen a list of 96 genes associated with aberrant DMRs (shown in Table 3). For example, some of the genes (including MAPK9, RXRA, NR4A1, VAV2, ARHGEF4 and CELSR3) exhibit significant accuracy for the detection of CRC (e.g., AUC of CRC vs. control=1 (95% CI: 1, 1) for all DMRs). Some of the genes (MAPK9, LRP1, ARAP1, COL4A2 and ARPC1B) can discriminate early-stage from late-stage cancer patients using peripheral blood DNA (e.g., AUC of late stage vs. early stage=0.884 (95% CI: 0.675, 1), 0.859 (95% CI: 0.612, 1), 0.848 (95% CI: 0.628, 1), 0.807 (95% CI: 0.571, 0.987) and 0.798 (95% CI: 0.55, 0.995), respectively).
To the accomplishment of the foregoing and related ends, certain illustrative aspects of the innovation are described herein in connection with the following description and the annexed drawings. These aspects are indicative, however, of but a few of the various ways in which the principles of the innovation can be employed and the subject innovation is intended to include all such aspects and their equivalents. Other advantages and novel features of the innovation will become apparent from the following detailed description of the innovation when considered in conjunction with the drawings.
The innovation is now described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the subject innovation. It may be evident, however, that the innovation can be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate describing the innovation.
Table 1 is a glossary of abbreviations used in the disclosure herein.
According to an aspect, the innovation provides a method to identify biomarkers for blood-based early detection of CRC using genome-wide methylation sequencing data to detect DNA methylation alterations in peripheral blood of CRC patients or suspected CRC patients (see
Functional analyses revealed that DNA methylation alterations in peripheral blood contribute to the activation of Ras signaling which is involved in CRC initiation and development (
Most DMRs associated with gene body regions of Rac-related genes can be hypomethylated. DNA methylation alterations can inhibit Rac signaling, which is an important regulator of Arp2/3-dependent actin polymerization and phagocytosis of invading pathogens. DNA methylation alterations can also occur in the endocytosis pathway (
The method can reveal key functional genes with significant aberrant DNA methylation during carcinogenesis using genome-wide methylation analysis on CRC tumors (N=10) and comparing to adjacent normal tissues (N=10). In a test embodiment, ages and genders of tissue donors were comparable to blood donors. 65,680 DMRs were identified between CRC tumor and normal tissues and compared to 6,025 DMRs between CRC peripheral blood and healthy controls. To evaluate whether DNA methylation alteration in peripheral blood is associated with CRC tumor, we overlapped these DMRs identified separately from CRC tumor and peripheral blood (
The method employs RNA-Seq data (N=10) from the same CRC tissue donors to determine transcriptome-wide changes in CRC, compared to adjacent normal tissues. In a test embodiment, the method was used to perform an integrated analysis between the transcriptome and methylation profile in CRC to reveal the DNA methylation changes and the underlying regulatory mechanisms of the impact of DNA methylation alteration on gene expression during CRC development. The method determines genes having overlapping DMRs in blood and tumors and selects those with altered gene expression as candidate biomarkers. For example, APC-stimulated guanine nucleotide-exchange factor (ARHGEF4) is a binding partner of adenomatous polyposis coli (APC), which is an important tumor suppressor gene in the development of CRC. The method determines that the ARHGEF4 gene was hypermethylated in the promoter region in both CRC peripheral blood and tumors (
The method evaluates the accuracy of genes with DMRs for early detection of CRC. In one example embodiment, the method calculated receiver operating characteristic (ROC) curves and areas under the ROC curve (AUCs) of the selected DMRs. According to an aspect, genes that are associated with overlapping DMRs in blood and tumors and that have altered gene expression can be selected by the method as potential candidate biomarkers, as shown in Table 3.
To evaluate the accuracy of genes with DMRs for early detection of CRC, the method calculates receiver operating characteristic (ROC) curves and areas under the ROC curve (AUCs) of the selected DMRs. For example, genes including MAPK9, RXRA, NR4A1, VAV2, ARHGEF4 and CELSR3 exhibited significant accuracy for the detection of CRC [AUC of CRC vs. control=1 (95% CI: 1, 1) for all DMRs] (
In an exemplary embodiment, the mechanisms and cell regulatory effects of DNA methylation alterations in CRC were investigated.
Study Population.
Whole blood samples (n=20) from CRC patients were obtained from the Cooperative Human Tissue Network (CHTN) (Table 2). The DNA methylation data of whole blood samples from healthy controls (n=10) were obtained from a publicly available database (NCBI GEO; accession number GSE85928). Paired CRC tumor and adjacent normal tissues from 10 age- and gender-matched CRC patients were also used.
DNA Extraction, RRBS Library Preparation and Sequencing.
An RRBS library was prepared. Genomic DNA was digested overnight with MspI (New England Biolabs, USA) followed by end-repair and ligation of sequencing adaptors. A DNA library was prepared using the NEXTflex Bisulfite-Seq Kit (Bioo Scientific) following a standard procedure. Bisulfite conversion of non-methylated cytosines was performed using the EZ DNA Methylation-Gold kit (Zymo Research Corp.) following the manufacturer's instructions. All PCR reactions for RRBS were purified using AMPure XP (Beckman Coulter, Brea, USA), and analyzed on a bioanalyzer. Sequencing was performed on the Illumina HiSeq2500 for a paired-end 2×50 bp run, with 150 million reads from each direction. Data quality checking was done with the Illumina Sequencing Analysis Viewer (SAV). De-multiplexing was performed with the Illumina bcl2fastq2 v2.17 program.
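The logic behind bisulfite sequencing can be illustrated with a minimal sketch. It assumes a single top-strand sequence and 0-based cytosine positions; the function names `bisulfite_convert` and `percent_methylation` are hypothetical and are not part of any kit or pipeline named above. Unmethylated cytosines read as T after conversion, while methylated cytosines remain C, so the methylation level at a site can be estimated from the share of reads that still show C.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite conversion of one DNA strand.

    Unmethylated C is read as T after conversion and PCR;
    C at a position listed in methylated_positions (0-based) stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def percent_methylation(c_reads, t_reads):
    """Estimate percent methylation at one cytosine from read counts.

    c_reads: reads still showing C (protected, i.e. methylated)
    t_reads: reads showing T (converted, i.e. unmethylated)
    """
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return 100 * c_reads / total


if __name__ == "__main__":
    # The C at index 1 is methylated; the Cs at indices 4 and 5 are not.
    print(bisulfite_convert("ACGTCCGG", {1}))
    print(percent_methylation(8, 2))
```

In practice an aligner such as Bismark performs this comparison against the reference genome on both strands; the sketch only shows the principle.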
Bioinformatics and Statistical Analysis.
The quality of the raw reads was examined with FastQC. Adapter trimming and filtering of high-quality reads were carried out with Cutadapt v1.8.3 and Trim Galore v0.4.0 with the --rrbs option. Quality-processed reads were mapped to the human genome (hg19) using Bismark assisted by Bowtie2. Before DMC and DMR analyses, methylation calls were filtered by discarding bases with coverage below 5× and bases with coverage above the 99.9th percentile in each sample. CpG sites on the sex chromosomes and the mitochondrial genome were excluded from the analyses.
Individual DMCs were identified between the CRC and control groups using logistic regression with the R package methylKit. Read coverage was normalized between samples. A minimum of three individuals per group was required for a CpG site to be analyzed. CpGs with at least a 10% methylation difference and a q-value <0.05 were considered to be differentially methylated. DMRs were determined using the R package eDMR with default parameters. To be considered significant, a DMR needed to contain at least one DMC, three CpG sites, and an absolute mean methylation difference greater than 5%.
The identified DMRs were annotated using UCSC RefSeq gene models, with promoter regions defined as being 2 kb upstream from the transcription start site (TSS). CpG islands were defined based on UCSC annotation (http://genome.ucsc.edu/). Functional Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of involved genes were performed using DAVID bioinformatics resources (version 6.8; http://david.abcc.ncifcrf.gov/). The p-value was calculated using the modified Fisher's exact test, and the GO categories and KEGG pathways were identified as significantly enriched when the p-value was <0.05. Additional parameters were set to the default values. The Mann-Whitney U test was used for comparisons between the groups of subjects. A p-value <0.05 was defined as statistically significant.
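The coverage filter and the DMC thresholds described above can be sketched in a few lines. This is an illustrative sketch only, not the methylKit implementation: the record layout, the function names, the nearest-rank percentile rule, and the chromosome labels (`chrX`, `chrY`, `chrM`) are assumptions made for the example.

```python
from math import ceil

EXCLUDED_CHROMS = {"chrX", "chrY", "chrM"}  # assumed hg19-style labels


def percentile_cutoff(values, pct):
    """Nearest-rank percentile (pct in 0-100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def filter_calls(calls, min_cov=5, upper_pct=99.9):
    """Filter one sample's CpG calls.

    calls: list of dicts with keys 'chrom', 'pos', 'coverage'.
    Keeps calls with coverage >= min_cov and <= the upper_pct percentile
    of that sample, then drops sex-chromosome and mitochondrial sites.
    """
    if not calls:
        return []
    cap = percentile_cutoff([c["coverage"] for c in calls], upper_pct)
    return [
        c for c in calls
        if min_cov <= c["coverage"] <= cap and c["chrom"] not in EXCLUDED_CHROMS
    ]


def is_dmc(mean_case, mean_control, q_value, min_diff=10.0, q_cut=0.05):
    """Apply the stated DMC thresholds: >= 10 points difference and q < 0.05."""
    return abs(mean_case - mean_control) >= min_diff and q_value < q_cut


if __name__ == "__main__":
    sample = [
        {"chrom": "chr1", "pos": 100, "coverage": 3},
        {"chrom": "chr1", "pos": 250, "coverage": 10},
        {"chrom": "chrX", "pos": 900, "coverage": 20},
        {"chrom": "chr2", "pos": 400, "coverage": 8},
    ]
    print(filter_calls(sample))
    print(is_dmc(45.0, 30.0, 0.01))
```

With only a handful of sites the 99.9th percentile equals the maximum coverage, so the upper cutoff only takes effect on realistically sized samples.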
The ROC curves were constructed and the areas under the ROC curves (AUCs) were calculated to evaluate the accuracy of DMRs for predicting CRC. The bootstrap method with 500 bootstrap samples was used to obtain the 95% confidence interval (CI) of the AUC. A 95% CI of AUCs not including 0.5 indicates a significant result.
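A minimal sketch of the AUC calculation and its bootstrap confidence interval follows. It assumes a percentile bootstrap with cases and controls resampled separately, which the text above does not specify; the function names and the fixed seed are likewise hypothetical. The AUC is computed as the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half.

```python
import random


def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    wins = 0.0
    for x in case_scores:
        for y in control_scores:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def bootstrap_auc_ci(case_scores, control_scores, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the AUC.

    Cases and controls are resampled with replacement within their own group
    (an assumption); n_boot=500 follows the number stated in the text.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        cases = [rng.choice(case_scores) for _ in case_scores]
        controls = [rng.choice(control_scores) for _ in control_scores]
        stats.append(auc(cases, controls))
    stats.sort()
    alpha = (1 - level) / 2
    lower = stats[int(alpha * (n_boot - 1))]
    upper = stats[int((1 - alpha) * (n_boot - 1))]
    return lower, upper


if __name__ == "__main__":
    crc = [0.81, 0.77, 0.90, 0.85, 0.79]      # hypothetical mean DMR methylation
    control = [0.40, 0.35, 0.52, 0.47, 0.44]  # hypothetical mean DMR methylation
    print(auc(crc, control), bootstrap_auc_ci(crc, control))
```

For a hypomethylated DMR, where cases score lower than controls, the scores can be negated (or 1 minus the AUC reported) so that 0.5 still marks the no-discrimination point.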
Distribution of DMRs Identified in CRC.
Through differential methylation analysis, 6,961 DMRs between the CRC and control groups were identified. These DMRs were mapped to gene regions (
Functional Analysis of Genes Associated with DMRs.
These DMRs were further selected using non-parametric methods. The average methylation level across each DMR was calculated for each subject. The Mann-Whitney U test was then performed to determine significant differences between CRC and control groups, and differences between early (stage I and II) and late stage (stage III and IV) CRC patients. In total, 1,852 DMRs located in the promoter or gene body regions with significant differences in mean methylation levels between the CRC and control groups were identified.
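The per-subject averaging and group comparison can be sketched as follows. This is a simplified illustration, assuming a two-sided Mann-Whitney U test with a normal approximation and no tie or continuity correction; for groups as small as those in this study an exact test from a statistics package would normally be preferred. The function names and example values are hypothetical.

```python
from math import erfc, sqrt


def mean_methylation(cpg_levels):
    """Average methylation (%) across the CpG sites of one DMR in one subject."""
    return sum(cpg_levels) / len(cpg_levels)


def mann_whitney_u(group_a, group_b):
    """Two-sided Mann-Whitney U test (normal approximation, no tie correction).

    Returns (U statistic for group_a, approximate p-value).
    """
    n1, n2 = len(group_a), len(group_b)
    u = 0.0
    for x in group_a:
        for y in group_b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = erfc(abs(z) / sqrt(2))
    return u, p_value


if __name__ == "__main__":
    # Hypothetical per-subject mean methylation (%) for one DMR.
    crc = [mean_methylation(s) for s in ([70, 74], [72, 72], [75, 75], [78, 78], [80, 80])]
    control = [50.0, 52.0, 55.0, 58.0, 60.0]
    print(mann_whitney_u(crc, control))
```

The same function applies unchanged to the early-stage versus late-stage comparison by passing the two stage groups instead.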
Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Gene Ontology (GO) biological process (GO-BP) analyses of the 1,852 genes associated with selected DMRs were used to understand how DNA methylation contributes to CRC development. The top 10 KEGG pathways and top 10 GO-BPs are summarized in
The top 10 significantly enriched GO biological processes include: axonogenesis, cell morphogenesis involved in neuron differentiation, axon development, regulation of small GTPase-mediated signal transduction, neuron projection guidance, regulation of Ras protein signal transduction, regulation of neuron projection development, axon guidance, positive regulation of cell morphogenesis involved in differentiation, and plasma membrane organization.
DNA Methylation Alterations of RAS Signaling Pathways.
It was observed that most of the DMRs that were associated with the gene body regions of Ras-related genes were hypermethylated. For example, in the Ras signaling pathway (hsa04014), fibroblast growth factor receptor 3 (FGFR3) and rap guanine nucleotide exchange factor 2 (RAPGEF2), upstream of RAS, were associated with hypermethylated DMRs in their coding sequence (CDS) and intron region, respectively. Ral guanine nucleotide dissociation stimulator (RALGDS) and mitogen-activated protein kinase kinase 2 (MAP2K2), downstream of RAS, were associated with hypermethylated DMRs in their intron regions. Ras GTPase-activating protein 3 (RASA3), a negative regulator of the Ras signaling pathway, was associated with a hypomethylated DMR in the intron region.
Most DMRs that were associated with the gene body regions of Rac-related genes were hypomethylated. Rac family small GTPase 1 (Rac1) is one of the key signaling components to control actin cytoskeleton organization and to suppress endometrial cancer metastasis. The Rac1-actin-related protein 2/3 (Arp2/3) pathway plays a critical role in phagocytosis of invading pathogens through cytoskeletal rearrangements. Guanine nucleotide exchange factor (VAV2), upstream of Rac1, was associated with a hypomethylated DMR in the CDS region. Although not at the top of the DMR list, Rac1 is associated with a hypomethylated DMR (p of CRC vs. control=4.38×10⁻⁵) in the CDS region. The downstream genes, LIM-kinase1 (LIMK1), BAI1 associated protein 2 (BAIAP2) and Arp2/3 Complex Subunit 1B (ARPC1B), were associated with a hypomethylated DMR in their CDS, intron and CDS region, respectively.
It was also observed that DNA methylation was altered in the endocytosis pathway (hsa04144). Rabaptin, Rab GTPase binding effector protein 1 (RABEP1), was associated with a hypermethylated DMR in the intron region. Cytohesin 2 (CYTH2) was associated with a hypermethylated DMR in the CDS region. Pleckstrin and sec7 domain containing 3 (PSD3) was associated with a hypomethylated DMR in the intron region. ArfGAP with RhoGAP domain, ankyrin repeat and PH domain 1 (ARAP1) was associated with a hypomethylated DMR in the CDS region.
DNA Methylation Alterations in Other Cancer-related Pathways.
In the PI3K-Akt signaling pathway (hsa04151), AKT serine/threonine kinase 1 (AKT1) was associated with a hypomethylated DMR in the CDS region. Regulatory associated protein of mTOR complex 1 (RPTOR) was associated with a hypermethylated DMR in the intron region. Retinoid X receptor alpha (RXRA) and its heterodimerization partner, nuclear receptor subfamily 4 group A member 1 (NR4A1), exhibit pro-oncogenic activity and enhance survival and/or cell proliferation. NR4A1 was associated with a hypomethylated DMR in the 5′ UTR region and RXRA was associated with a hypermethylated DMR in the intron region.
Notch signaling is overexpressed or constitutively activated in many cancers including CRC. Overexpression of C-terminal binding protein 1 (CTBP1) contributes to colon adenoma initiation and has been reported to be associated with poor prognosis. Consistently, in the Notch signaling pathway (hsa04330), it was observed that Notch homolog 1 (NOTCH1) and CTBP1 were associated with hypermethylated DMRs in their CDS regions. In the MAPK signaling pathway (hsa04010), mitogen-activated protein kinase 9 (MAPK9) is a member of the MAP kinase family and blocks the ubiquitination of tumor suppressor p53. A hypomethylated DMR in the intron region of MAPK9 was observed. Abnormal choline metabolism may be a metabolic hallmark associated with oncogenesis and tumor progression. There were 6 genes involved in the choline metabolism in cancer pathway (hsa05231), including AKT1, MAPK9, MAP2K2, RALGDS, RAC1 and platelet derived growth factor subunit B (PDGFB). There were 3 genes involved in the calcium signaling pathway (hsa04020), including calcium voltage-gated channel subunit α1B (CACNA1B), guanine nucleotide-binding protein G subunit alpha (GNAS) and guanine nucleotide-binding protein subunit alpha-11 (GNA11). Moreover, RXRA (described above) is also an important gene in the calcium/vitamin D pathway.
In addition, some selected DMRs are associated with genes that are related to CRC development. Plexin D1 (PLXND1) mediates anti-angiogenic signaling and recent findings suggest it is upregulated in CRC. A hypermethylated DMR in its CDS region was observed. Upregulation of N-Myc downstream regulated 1 (NDRG1) has been associated with poor prognosis in CRC, and the data suggest that it is associated with a hypermethylated DMR in the intron region. LDL receptor-related protein 1 (LRP1) mediates the clearance of many extracellular enzymes involved in the spread of cancer cells, namely metalloproteinases and serine proteinases. Decrease of LRP1 activity or loss of LRP1 expression correlates with increased aggressiveness of cancer cells in certain types of cancer. A hypomethylated DMR in the CDS region of LRP1 was observed. Collagen type IV α2 chain (COL4A2) encodes one of the six subunits of type IV collagen, the major structural component of basement membranes. It also functions as an inhibitor of angiogenesis and tumor growth. Consistent with previous findings, a hypomethylated DMR in the CDS region of COL4A2 was observed.
In order to reveal specific functional genes with significant aberrant DNA methylation during carcinogenesis, genome-wide methylation analysis was conducted in CRC tumors (N=10) compared to adjacent normal tissues (N=10). The age and gender of tissue donors were comparable to blood donors. 65,680 DMRs were identified between CRC tumor and normal tissues and compared to 6,025 DMRs between CRC peripheral blood and healthy controls. To evaluate whether DNA methylation alteration in peripheral blood is associated with CRC tumor, these DMRs identified separately from CRC tumor and peripheral blood were overlapped (
RNA-Seq data (N=10) from the same CRC tissue donors were then used to determine the transcriptome-wide changes in CRC, compared to adjacent normal tissues. Integrated analysis between the transcriptome and methylation profile in CRC was performed to reveal the DNA methylation changes and the underlying regulatory mechanisms of the impact of DNA methylation alteration on gene expression during CRC development. Genes that had overlapping DMRs in blood and tumors and altered gene expression were selected as potential candidate biomarkers. This integrated analysis method allowed us to efficiently map tumor-specific DNA methylation alterations in whole blood and tumor with an accompanying gene expression change. The average methylation levels in tumor tissue and in blood across overlapping DMRs were calculated for each subject. The Spearman correlation coefficient can be used to determine the correlation of methylation levels of overlapping DMRs between tumor and blood DNA, and hierarchical clustering can be used to identify significantly correlated DMRs. Significantly correlated DMRs are therefore a potential measure of the degree of association between the DNA methylation alterations in blood and tumor tissue.
Using integrated RRBS and RNA-Seq, 96 genes associated with overlapping DMRs were screened as listed in Table 3. For example, APC-stimulated guanine nucleotide-exchange factor (ARHGEF4) is a binding partner of adenomatous polyposis coli (APC), which is a tumor suppressor gene of importance in the development of CRC. The method found that the ARHGEF4 gene was hypermethylated in the promoter region in both CRC peripheral blood and tumors (
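The overlap and correlation steps described above can be sketched as follows. The sketch assumes DMRs are given as (chromosome, start, end) tuples with inclusive coordinates and that any shared base counts as an overlap; these conventions, the function names, and the example values are assumptions for illustration, and the hierarchical clustering step is not shown.

```python
from math import sqrt


def overlapping_dmrs(blood_dmrs, tumor_dmrs):
    """Return (blood_dmr, tumor_dmr) pairs that share at least one base."""
    pairs = []
    for b in blood_dmrs:
        for t in tumor_dmrs:
            same_chrom = b[0] == t[0]
            if same_chrom and b[1] <= t[2] and t[1] <= b[2]:
                pairs.append((b, t))
    return pairs


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    blood = [("chr1", 100, 200), ("chr2", 50, 80)]
    tumor = [("chr1", 150, 300), ("chr2", 100, 120)]
    print(overlapping_dmrs(blood, tumor))
    # Hypothetical mean methylation (%) of overlapping DMRs in blood and tumor.
    print(spearman([12.0, 35.0, 48.0, 60.0], [20.0, 41.0, 55.0, 90.0]))
```

The nested loop is quadratic in the number of DMRs; for the tens of thousands of regions reported here, a sorted sweep or an interval-tree library would be a more practical choice.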
Accuracy of DMRs for Detecting CRC and Discriminating Early-stage Patients.
To evaluate the accuracy of DMRs for early detection of CRC, receiver operating characteristic (ROC) curves and areas under the ROC curve (AUCs) of the selected DMRs were calculated. Table 3 shows the ROC results demonstrating the ability of these DMRs to detect CRC and to differentiate early-stage cancer. Overall, 96 genes associated with overlapping DMRs were identified by the method. These DMRs either show mean methylation levels that increase or decrease progressively relative to control, or can be used to differentiate early-stage from late-stage CRC. For example, DMRs associated with MAPK9, RXRA, NR4A1, VAV2 and CELSR3 have a high ability to detect CRC [AUC of CRC vs. control=1 (95% CI: 1, 1) for all DMRs]. DMRs associated with MAPK9, LRP1, ARAP1, COL4A2 and ARPC1B have a high ability to differentiate early-stage cancer [AUC of late stage vs. early stage=0.884 (95% CI: 0.675, 1), 0.859 (95% CI: 0.612, 1), 0.848 (95% CI: 0.628, 1), 0.807 (95% CI: 0.571, 0.987) and 0.798 (95% CI: 0.55, 0.995), respectively].
What has been described above includes examples of the innovation. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the subject innovation, but one of ordinary skill in the art may recognize that many further combinations and permutations of the innovation are possible. Accordingly, the innovation is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims. Furthermore, to the extent that the term “includes” is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term “comprising” as “comprising” is interpreted when employed as a transitional word in a claim.
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 62/678,655 entitled “BLOOD-BASED BIOMARKERS FOR THE DETECTION OF COLORECTAL CANCER” filed on May 31, 2018, the entirety of which is incorporated by reference herein.
Number | Date | Country
---|---|---
62678655 | May 2018 | US