The present invention relates to methods of detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer, and associated methods of selecting a treatment or ascertaining whether a treatment is effective. The present invention also relates to a method for determining a solid cancer circulating free DNA (cfDNA) methylome signature for use in the detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of the solid cancer in a sample obtained from a subject.
Prostate cancer is the most common cancer among men in many parts of the world. Prostate cancer is the second leading cause of cancer death in men in the United States.
Currently, the most frequently used methods for detecting prostate cancer are a digital rectal examination and a blood test to determine levels of prostate-specific antigen (PSA) produced by the prostate gland. However, these diagnostic tools can lack the sensitivity required to detect very early prostate lesions or to detect progression. Biopsies are invasive, and can lead to false-negatives and repeat biopsies, as they do not sample the entire prostate. As the cancer progresses, metastasis can occur and currently metastatic prostate cancer is generally diagnosed using further PSA testing together with MRI/PSMA imaging. PSMA imaging involves the use of a radiolabelled monoclonal antibody for prostate-specific membrane antigen. Detection of the radiolabelled antibody enables the clinician to identify if cancerous cells have spread in the body. These methods of detection and diagnosis have various disadvantages: they are expensive to use; PSA has come under much scrutiny recently for unreliable results and overdiagnosis; and imaging modalities are only able to detect a secondary tumour once it has reached a certain size.
Plasma tumour DNA tests have shown clinical utility for cancer detection, risk stratification and response assessment. Molecular analysis of circulating cell-free DNA (cfDNA) and cell-free RNA (cfRNA) has been found to be a useful approach in some circumstances. It is particularly convenient as samples can be obtained without any invasive procedure being necessary. A common approach is to detect or measure the abundance of genomic alterations that are used to distinguish tumour from normal DNA. However, this approach can be limited by the low prevalence of recurrent genomic changes, the relatively small number that are tumour specific and the low abundance in circulation of these aberrations that can overlap with other non-tumour aberrations, for example those resulting from clonal haematopoiesis. Overall these factors limit the sensitivity of genomic tests for screening for prostate cancer.
Methylation changes are tissue- or cancer-specific. Detection of methylation changes thus provides a promising approach for the diagnosis and assessment of cancers, including prostate cancer. In WO2014/043763 and WO2017/212428, there are described methods for the assessment of diseases, in particular cancers by the analysis of methylation patterns in cell-free DNA.
Regarding prostate cancer in particular, for example Kirby et al. (BMC Cancer (2017), 17:273) reported that DNA methylation patterns are altered in prostate cancer tissue in comparison to benign-adjacent tissue. They noted patterns of DNA methylation that can distinguish prostate cancers with good specificity and sensitivity in multiple patient tissue cohorts. The authors also identified transcription factors binding in these differentially methylated regions that may play a role in prostate cancer development. The methods developed by Kirby and by others require a very large amount of DNA to be sequenced and analysed in order for a reliable assessment to be made.
Metastatic castration-resistant prostate cancer (mCRPC) patients with a range of genomic aberrations, including androgen receptor (AR) copy number gain or TP53 mutations, detected in plasma prior to AR targeting with abiraterone or enzalutamide have a shorter duration of treatment benefit and overall survival. mCRPC exhibits a variable clinical course, and biomarkers to stratify patients are urgently required to optimize management. As tumour biopsies from metastatic sites can be difficult to obtain and repeated sampling of multiple metastases is usually not feasible, a minimally-invasive liquid biopsy-based analysis method would be helpful for clinicians. There thus remains a need for improved methods of detection and screening in this field.
The present invention provides a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises circulating free DNA (cfDNA), the method comprising
To concurrently study the plasma genome and methylome and overcome the inherent challenges of methylation analysis resulting from the high variance in methylation data, the inventors selected plasma samples from a focused cohort of mCRPC patients with genomic information. The inventors surprisingly found that methylation data obtained by analysis of metastatic cancer patients' cell-free DNA in the plasma samples could very accurately estimate tumour fraction and can be used, for example, to improve liquid biopsy patient stratification.
The present invention provides a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises circulating free DNA (cfDNA), the method comprising:
characterizing the methylome sequence of a plurality of cfDNA molecules in the sample, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule;
determining the average methylation ratio at 10 or more genomic regions, each genomic region being selected from the group consisting of:
a 100 to 200 bp region comprising or having a genomic location defined in Table 8, and
a 2 to 99 bp region within a genomic location defined in Table 8 and comprising at least one CpG locus,
and wherein each of the genomic regions is covered by at least one sequence read of at least one characterized methylome sequence;
calculating a methylation score using the average methylation ratio for each of the genomic regions;
analyzing the methylation score to determine whether the sample comprises cfDNA derived from a prostate cancer subtype.
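As a minimal sketch of the steps above, assuming the per-region average methylation ratios have already been computed upstream, the following Python fragment keeps only the regions covered by at least one sequence read, enforces the minimum of 10 regions, and summarises the retained regions as a median-based methylation score. The function names, the half-open coordinate convention and the example values are illustrative assumptions and are not part of the described method.

```python
from statistics import median

MIN_REGIONS = 10  # the method requires 10 or more genomic regions


def is_covered(region, reads):
    """Return True if at least one read overlaps the region.

    Regions and reads are (chromosome, start, end) tuples with half-open
    coordinates; this convention is an assumption made for the sketch.
    """
    chrom, start, end = region
    return any(r_chrom == chrom and r_start < end and start < r_end
               for r_chrom, r_start, r_end in reads)


def methylation_score(avg_ratio_by_region, reads):
    """Median of the average methylation ratios over covered regions."""
    covered = [ratio for region, ratio in avg_ratio_by_region.items()
               if is_covered(region, reads)]
    if len(covered) < MIN_REGIONS:
        raise ValueError(
            f"only {len(covered)} covered regions; need {MIN_REGIONS}")
    return median(covered)


# Hypothetical data: 12 regions of 150 bp on chr1, one read over each.
regions = {("chr1", 1000 * i, 1000 * i + 150): 0.1 * (i % 5)
           for i in range(12)}
reads = [("chr1", 1000 * i + 20, 1000 * i + 120) for i in range(12)]
score = methylation_score(regions, reads)
```

The final analysis step, in which the score is interpreted, is deliberately left out of the sketch because it depends on the reference chosen for comparison.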
The inventors surprisingly found that methylation data extracted from metastatic cancer patient plasma DNA could identify clinically-relevant subtypes, and in particular a sub-group of cancers characterized by a more aggressive clinical course and enriched for AR copy number gain, and thus can be used, for example, to improve liquid biopsy patient stratification.
The present invention also provides an in-vitro diagnostic kit for use in the detection, screening, monitoring, staging, classification and prognostication of prostate cancer, comprising one or more reagents for detecting the presence or absence of at least 25 DNA molecules having a DNA sequence corresponding to all or part of a genomic location defined in Tables 1 to 4 and/or Table 8.
The present invention further provides a computer product comprising a non-transitory computer readable medium storing a plurality of instructions that when executed control a computer system to perform the method of the present invention for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject.
The present invention further provides a computer-implemented method for detection, screening, monitoring, staging, classification and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA.
The present invention further provides a computer-implemented method for classifying a prostate cancer patient into one or more of a plurality of treatment categories, the method comprising determining the level of prostate cancer fraction of cfDNA in a sample obtained from a subject, wherein the sample comprises cfDNA.
The present invention further provides a method for treating prostate cancer comprising treating the subject using a therapeutic agent for the treatment of prostate cancer, surgery, and/or radiotherapy.
The present invention further provides a method of determining one or more suitable therapeutic agents for the treatment of prostate cancer for a subject having prostate cancer.
The present invention further provides a method of determining a suitable treatment regimen for a subject having prostate cancer.
The present invention further provides a method for determining a solid cancer circulating free DNA (cfDNA) methylome signature for use in detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, prognostication and/or treatment of the solid cancer, the method comprising:
(i) characterizing the methylome sequence of a plurality of cfDNA molecules in a first sample comprising cfDNA from a subject known to have the solid cancer, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule;
(ii) determining the respective number of characterised cfDNA molecules corresponding to a CpG locus or a genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) in the first sample by aligning the methylome sequences;
(iii) determining the methylation ratio of each CpG locus and/or average methylation ratio of each genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) in the first sample;
repeating steps (i) to (iii) for one or more further samples comprising cfDNA each from subjects known to have the solid cancer;
performing a variance analysis of all or a selection of the methylation ratios of the CpG loci and/or all or a selection of average methylation ratios of the genomic regions of the samples;
selecting a group of CpG loci and/or genomic regions associated with a feature of the samples; and
selecting CpG loci and/or genomic regions in the group to provide the cfDNA methylome signature.
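The workflow above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: a per-region variance ranking stands in for the variance analysis (principal component analysis is named in this description as one option, but is not implemented here), the sample feature is taken to be a genomically determined tumour fraction, and the cutoffs `top_n` and `min_abs_r` are hypothetical.

```python
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def select_signature(ratios_by_region, feature, top_n=3, min_abs_r=0.9):
    """Pick regions that vary most across samples and track the feature.

    ratios_by_region maps a region name to its average methylation ratio
    in each sample; feature holds one value per sample, in the same order.
    """
    # Variance analysis (simplified): rank regions by variance across samples.
    ranked = sorted(ratios_by_region,
                    key=lambda r: pvariance(ratios_by_region[r]),
                    reverse=True)[:top_n]
    # Keep the high-variance regions that correlate with the feature.
    return [r for r in ranked
            if abs(pearson(ratios_by_region[r], feature)) >= min_abs_r]


# Hypothetical data: four samples with known tumour fractions.
tumour_fraction = [0.05, 0.20, 0.50, 0.80]
ratios = {
    "region_A": [0.10, 0.25, 0.55, 0.85],  # rises with tumour fraction
    "region_B": [0.90, 0.75, 0.45, 0.15],  # falls with tumour fraction
    "region_C": [0.50, 0.51, 0.49, 0.50],  # nearly constant
    "region_D": [0.80, 0.10, 0.70, 0.20],  # variable but unrelated
}
signature = select_signature(ratios, tumour_fraction)
```

In this toy data set, the nearly constant region is dropped by the variance ranking and the variable but unrelated region is dropped by the correlation filter, leaving the two regions that track the feature.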
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly recognized by one of ordinary skill in the art to which this invention belongs.
As used herein “DNA methylation” refers to the addition of a methyl group to a DNA nucleotide. DNA methylation most commonly occurs on the 5′ carbon of cytosine residues (i.e. 5-methylcytosines) of a CpG dinucleotide (referred to herein as a “CpG locus”). DNA methylation may also occur in cytosines in other contexts, for example CHG and CHH, where H is adenine, cytosine or thymine. Cytosine methylation may also be in the form of 5-hydroxymethylcytosine. Non-cytosine methylation, such as N6-methyladenine, may also occur.
As used herein, the term “CpG locus” refers to a region of DNA where a cytosine nucleotide is followed by a guanine nucleotide in the linear sequence of bases along its 5′ to 3′ direction. A CpG site can become methylated in human and other animal DNA.
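For illustration only, CpG loci in the sense of this definition can be located in a sequence written 5′ to 3′ by scanning for a cytosine immediately followed by a guanine. The function name and the use of 0-based positions are assumptions made for this sketch.

```python
def find_cpg_loci(sequence):
    """Return 0-based positions of the cytosine in every CpG dinucleotide.

    The sequence is read 5' to 3'; letter case is ignored.
    """
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


loci = find_cpg_loci("ACGTTCGCGA")  # CpG loci at positions 1, 5 and 7
```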
As used herein, a “methylome” is the set of nucleic acid methylation modifications in a subject's genome in a particular cell, tissue or cancer. The methylome may correspond to all of the genome, a substantial part of the genome, or relatively small portion(s) of the genome.
As used herein the term “plasma methylome” is the methylome determined from the plasma or serum of a subject (e.g., a human). The plasma methylome is an example of a “cell-free DNA methylome” since plasma includes cfDNA. The plasma methylome is an example of a mixed methylome because the plasma may comprise cfDNA from a variety of sources, for example, cfDNA from different tissues, non-cancerous and cancerous tissues.
As used herein the term “methylation profile” is the information related to DNA methylation for a DNA molecule. Information related to DNA methylation can include, but is not limited to, a methylation index of a CpG locus, a methylation density of CpG sites in a DNA molecule, a distribution of CpG sites over a contiguous region, a pattern or level of methylation for each individual CpG site within a region that contains more than one CpG site, and non-CpG methylation.
As used herein the term “methylome sequence” is the DNA sequence and the methylation profile of the whole or a portion of a DNA molecule, for example a cfDNA molecule. For example, the methylome sequence may be the methylome sequence of the whole or a portion of a cfDNA molecule. The methylome sequence may correspond to all of the genome, a substantial part of the genome, or portion(s) of the genome.
As used herein the term “circulating free DNA” (cfDNA) means the DNA fragments that have been released into the blood plasma and are found freely circulating in the bloodstream, as well as in the urine. cfDNA is generally double-stranded DNA consisting of small fragments (70 to 200 bp).
As used herein the term “sequence read” refers to a sequence of the base pairs inferred from the whole or a portion of a single molecule of DNA, for example the whole or a portion of a single molecule of cfDNA. A single read may be of 20 to 500 base pairs, or even up to 1500 base pairs. The sequence of a specific single molecule of DNA may be read once or read multiple times and each sequence is taken to be representative of a single molecule of DNA.
As used herein the term “tumour fraction cfDNA” is cfDNA derived from DNA of a cancer cell. As used herein the term “prostate cancer fraction cfDNA” is cfDNA derived from DNA of a prostate cancer cell.
As used herein, the term “genomic region” refers to a region of a genome, e.g. the genome of a subject, for example a human. A genomic region may also be referred to as a “segment”. It may be referred to using the genomic location of the region, for example using the coordinates of the start position and end position of the location in a specific chromosome. For a human subject a genomic region is suitably described by a genomic location, and in particular a genomic location with reference to a reference genome (for example, a digital nucleic acid sequence database, assembled from a representative example of a species' set of genes).
As used herein, the term “genomic location” refers to the location of a region of a genome, e.g. the genome of a subject, for example a human. It may be referred to using the coordinates of the start position and end position of the location in a specific chromosome. For a human subject a genomic location is suitably described by reference to a reference genome (for example, a digital nucleic acid sequence database, assembled from a representative example of a species' set of genes). For example, for a human subject, the genomic location may be described with reference to the human reference genome GRCh37 (also referred to as Human Genome 19 (hg19)) or human reference genome GRCh38 (also referred to as Human Genome 38 (hg38)). For the present invention, preferably the reference genome is human reference genome GRCh37 (also known as hg19). As such, a genomic location for a human may be described using the coordinates of the start position and end position of the location in a specific chromosome with reference to the Genome Reference Consortium Human Build 37 (GRCh37) (also referred to as Human Genome 19 (hg19)). Suitably, a genomic location according to the present invention is a location that covers 2 to 200 bp of DNA. A genomic location according to the present invention preferably includes at least one CpG locus, and suitably includes at least two CpG loci, for example 2, 3, 4, 5, 6, 7 or 8 CpG loci, and preferably 2, 3, 4, 5 or 6 CpG loci.
As used herein the term “plurality” is at least 2, for example at least 10, at least 100, at least 1000, at least 10,000, at least 100,000, at least 10^6, at least 10^7, at least 10^8 or at least 10^9 or more.
As used herein the term “a level of prostate cancer fraction” is the level of cfDNA derived from prostate cancer cells in a cfDNA sample compared to the cfDNA that is not derived from prostate cancer cells. cfDNA that is not derived from prostate cancer cells in a cfDNA sample may be derived from blood cells, for example white blood cells (leukocytes), and other non-prostate tissues.
As used herein the term “blood cell fraction cfDNA” is cfDNA derived from DNA of a blood cell, for example a white blood cell (leukocyte).
As used herein, a “subject” refers to an animal, including mammals such as humans. Preferably, the subject is a human subject. As used herein, an “individual” can be a subject. As used herein, a “patient” refers to a human subject. In one embodiment, the subject is known or suspected to have a cancer (for example prostate cancer), and/or is known or suspected to have a risk of developing cancer (for example prostate cancer), or is known to have cancer and is known or suspected to have metastatic cancer (for example prostate cancer) or to have a risk of developing metastatic cancer (for example prostate cancer). In some embodiments, the subject is a subject who has been identified as being at risk of developing a cancer, in particular at risk of developing a prostate cancer.
As used herein, a “healthy subject” refers to a subject that has not been diagnosed with a type of cancer (for example prostate cancer), and preferably has not been diagnosed with any type of cancer. Thus, for example, in a method relating to prostate cancer, a “healthy subject” has no prostate cancer, and preferably no other type of cancer. Preferably, a healthy subject has not been diagnosed with a type of cancer (for example prostate cancer), and is not suspected of having a type of cancer, and suitably has not been diagnosed with any type of cancer (for example prostate cancer), and is not suspected of having any type of cancer.
The term “sample” as used herein means a biological sample derived from a patient to be screened in a method of the invention. The biological sample may be any suitable sample known in the art in which cfDNA can be detected and/or isolated. Included are individual cells and cell populations obtained from bodily tissues or fluids. Examples of suitable body fluids that may be used as samples according to the present invention are plasma, blood, and urine.
As used herein the term “methylation ratio” refers to the proportion of cytosine residues (C) that are methylated across all sequence reads covering a CpG locus (“G” is a guanine residue) within a population or pool of DNA, such as a sample of cfDNA obtained from the plasma of a subject. When the methylation profile is measured using bisulfite conversion, the unmethylated CpG loci are converted to UpG (“U” is a uracil residue), while methylated CpG sites remain the same. The uracil residues are read as thymine residues during the DNA sequencing step following bisulfite conversion. The methylation ratio may be calculated using formula (X), which takes the cytosine (C) and thymine (T) counts from multiple sequence reads of a specific CpG locus:
methylation ratio = C / (C + T)   (X)
For example, a CpG locus having a methylation ratio of 0.5 is methylated in 50% of the sequencing reads covering the specific CpG locus and unmethylated in 50% of the reads covering the specific CpG. A CpG locus having a methylation ratio of 0.75 is methylated in 75% of the sequencing reads covering the specific CpG locus and unmethylated in 25% of the reads covering the specific CpG. A CpG locus having a methylation ratio of 1.0 is methylated in 100% of the sequencing reads covering the specific CpG locus and unmethylated in 0% of the reads covering the specific CpG.
The methylation ratio of a specific CpG locus describes the degree of methylation of that specific CpG locus in the population or pool of DNA (for example the degree of methylation of that specific CpG locus in a sample of cfDNA obtained from the plasma of a subject).
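By way of a hedged illustration, the methylation ratio of a single CpG locus can be computed from the C and T counts observed after bisulfite conversion, taking the ratio as the number of C reads divided by the total of C and T reads. The function names below are assumptions made for this sketch; in practice the counts would come from an aligner and methylation caller such as those listed in this description.

```python
def methylation_ratio(c_count, t_count):
    """Fraction of reads at a CpG locus that still read C after conversion."""
    total = c_count + t_count
    if total == 0:
        raise ValueError("no reads cover this CpG locus")
    return c_count / total


def ratio_from_calls(base_calls):
    """Compute the ratio from the bases read at the cytosine position.

    base_calls is an iterable of single-letter base calls, one per read;
    calls other than C or T are ignored.
    """
    calls = [b.upper() for b in base_calls]
    return methylation_ratio(calls.count("C"), calls.count("T"))


half = methylation_ratio(10, 10)           # methylated in 50% of reads
three_quarters = ratio_from_calls("CCCT")  # methylated in 75% of reads
```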
Tools such as BSMAP (PMID: 19635165), Bismark (PMID: 21493656), gemBS (PMID: 30137223), Arioc (PMID: 29554207), BS-Seeker2 (PMID: 24206606), MethylCoder (PMID: 21724594) or BatMeth2 (PMID: 30669962) can be used to determine methylation ratios. These programs can also align the sequencing reads from bisulfite sequencing before determining methylation ratios.
As used herein the term “reference methylation ratio” is the methylation ratio of a CpG locus in a reference sample or reference methylome, for example the methylation ratio of a CpG locus in one of the following:
As regards using a cfDNA sample from a different subject having prostate cancer, wherein the level of prostate cancer fraction in the cfDNA sample from the different subject is known, the level of prostate cancer fraction in a cfDNA sample from a different subject can be determined by, for example, using methods that estimate tumour fraction using genomic markers. Due to the low sensitivity of such methods, generally the lowest percentage level of tumour fraction in a cfDNA sample that can be detected is around 5 to 10% tumour fraction.
As used herein the term “average methylation ratio” is the average of the methylation ratios of all the CpGs within a given genomic region. The average methylation ratio can be calculated by determining the sum of the methylation ratios of all CpGs within a given genomic region and dividing the sum by the number of CpGs within the given genomic region. The average methylation ratio may also be referred to as the mean methylation ratio. If a genomic region has only 1 CpG locus, the average methylation ratio is the same as the methylation ratio for the single CpG locus in the genomic region. Programs such as methylKit R package v1.6.2 (Akalin, A. et al. Genome Biol 13, R87 (2012)) can be used to calculate average methylation ratio.
The average methylation ratio of a specific genomic region describes the degree of methylation of that specific genomic region in the population or pool of DNA (for example the degree of methylation of that specific genomic region in a sample of cfDNA obtained from the plasma of a subject).
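The calculation described above can be sketched as follows, assuming that per-CpG C and T counts are available keyed by position and that the region is given as a half-open interval. Both the data layout and the function name are assumptions made for illustration; this is not the methylKit implementation.

```python
def average_methylation_ratio(cpg_counts, start, end):
    """Mean of the per-CpG methylation ratios inside [start, end).

    cpg_counts maps a CpG position to its (C count, T count) pair.
    CpG loci without any covering read are skipped.
    """
    ratios = [c / (c + t)
              for pos, (c, t) in cpg_counts.items()
              if start <= pos < end and (c + t) > 0]
    if not ratios:
        raise ValueError("no covered CpG locus in the region")
    return sum(ratios) / len(ratios)


# Hypothetical counts for four CpG loci on one chromosome.
counts = {105: (5, 5), 130: (3, 1), 180: (4, 0), 400: (0, 8)}
region_average = average_methylation_ratio(counts, 100, 200)  # three CpGs
single_cpg = average_methylation_ratio(counts, 390, 410)      # one CpG
```

For the region containing a single CpG locus, the result equals that locus's own methylation ratio, as stated in the definition.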
The term “hypermethylated region” as used herein refers to a genomic region of cfDNA that is indicative of cancer when there is an increase in the average methylation ratio in the region (i.e. hypermethylation) compared to the average methylation ratio of the same genomic region in one or more of the following:
The term “hypomethylated region” as used herein refers to a genomic region of cfDNA that is indicative of cancer when there is a decrease in the average methylation ratio in the region (i.e. hypomethylation) compared to the average methylation ratio of the same genomic region in one or more of the following:
The term “methylation score” as used herein is a value that is indicative of the methylation state of a sub-population or fraction of DNA in a sample. For example a “methylation score” may be indicative of the methylation state of the genomic regions in a sample that have the average methylation ratio determined. The methylation score may be, for example:
In certain embodiments, preferably the methylation score is, for example:
The term “reference methylation score” as used herein is a methylation score for a reference sample or a reference methylome. The reference sample or reference methylome may be selected from the group consisting of:
The reference methylation score is preferably calculated (for example calculated using the average methylation ratio) for the same genomic regions as the genomic regions for a methylation score to which the reference methylation score is to be compared.
For example, if a methylation score is the median of the average methylation ratios for all genomic regions that have had the average methylation ratios determined, then preferably the reference methylation score is the median of the average methylation ratios for the same genomic regions in a reference sample or reference methylome. If a methylation score is the mean of the average methylation ratios for all genomic regions that have had the average methylation ratios determined, then preferably the reference methylation score is the mean of the average methylation ratios for the same genomic regions in a reference sample or reference methylome.
If a methylation score is the median of the average methylation ratios for a first group of genomic regions (resulting in a first methylation score) and/or the median of the average methylation ratios for a second group of genomic regions (resulting in a second methylation score), then preferably the reference methylation score is the median of the average methylation ratios for the same first group of genomic regions (resulting in a first reference methylation score) and/or the median of the average methylation ratios for the same second group of genomic regions (resulting in a second reference methylation score) in a reference sample or reference methylome.
If a methylation score is the mean of the average methylation ratios for a first group of genomic regions (resulting in a first methylation score) and/or the mean of the average methylation ratios for a second group of genomic regions (resulting in a second methylation score), then preferably the reference methylation score is the mean of the average methylation ratios for the same first group of genomic regions (resulting in a first reference methylation score) and/or the mean of the average methylation ratios for the same second group of genomic regions (resulting in a second reference methylation score) in a reference sample or reference methylome.
If a methylation score is the methylation ratio score for each genomic region that has had the average methylation ratio determined, wherein a methylation ratio score is determined by comparing the average methylation ratio at each genomic region to a reference methylation ratio for each genomic region, the reference methylation score is preferably the reference methylation ratio score for each of the same genomic regions in a reference sample.
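The pairing of a methylation score with its reference methylation score can be illustrated with the short sketch below. It assumes that the sample and the reference are each represented as a mapping from a region name to an average methylation ratio, and it restricts both scores to the regions present in both, using the same summary statistic for each. The function name, region names and values are hypothetical.

```python
from statistics import mean, median


def paired_scores(sample, reference, statistic=median):
    """Return (methylation score, reference methylation score).

    Both scores are computed with the same statistic over the same
    genomic regions, namely those present in both mappings.
    """
    shared = sorted(set(sample) & set(reference))
    if not shared:
        raise ValueError("sample and reference share no genomic regions")
    return (statistic([sample[r] for r in shared]),
            statistic([reference[r] for r in shared]))


# Hypothetical average methylation ratios per region.
sample = {"r1": 0.8, "r2": 0.6, "r3": 0.9, "r4": 0.2}
reference = {"r1": 0.1, "r2": 0.2, "r3": 0.3, "r5": 0.9}

median_pair = paired_scores(sample, reference)            # median-based
mean_pair = paired_scores(sample, reference, statistic=mean)  # mean-based
```

Regions `r4` and `r5` are ignored here because each appears in only one of the two mappings, so the two scores remain directly comparable.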
As used herein an “abnormal level of PSA” is a level of PSA in the blood indicative of a risk of a patient having prostate cancer. For example an abnormal level of PSA in the blood may be a level of at least 4.0 ng/mL. An “abnormal level of PSA” may additionally be an increase in the level of PSA in the blood compared to the level at initial diagnosis or the level at the previous time PSA was tested in the subject (for example an increase of 0.1 ng/mL or more, 0.2 ng/mL or more, 0.5 ng/mL or more, 1.0 ng/mL or more compared to the level at initial diagnosis or the level at the previous time PSA was tested in the subject).
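As a small, non-authoritative illustration of this definition, the check below flags a PSA level as abnormal if it reaches the example level of 4.0 ng/mL or if it has risen by at least a chosen amount since an earlier measurement. The function name and the default increase of 0.5 ng/mL (one of the example values listed above) are assumptions made for the sketch.

```python
PSA_THRESHOLD_NG_ML = 4.0  # example absolute level given in the definition


def is_abnormal_psa(level, previous=None, min_increase=0.5):
    """Return True if a PSA level (ng/mL) counts as abnormal.

    previous is an earlier PSA level for the same subject, if known;
    min_increase is the rise (ng/mL) treated as abnormal.
    """
    if level >= PSA_THRESHOLD_NG_ML:
        return True
    if previous is not None and level - previous >= min_increase:
        return True
    return False


above_threshold = is_abnormal_psa(4.2)                 # absolute level
rising = is_abnormal_psa(2.0, previous=1.2)            # rise of 0.8 ng/mL
stable = is_abnormal_psa(2.0, previous=1.8)            # rise of 0.2 ng/mL
```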
The term “oligonucleotide(s)” refers to nucleic acids that are usually between 5 and 100 contiguous bases, for example between 5-10, 5-20, 10-20, 10-50, 15-50, 15-100, 20-50, or 20-100 contiguous bases. An oligonucleotide may be capable of hybridising to a target of interest, e.g., a sequence that is at least 10 nucleotides in length. An oligonucleotide for hybridising to a target may comprise at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides or at least 60 nucleotides. An oligonucleotide can be used as a primer, a probe, included in a microarray, or used in polynucleotide-based identification methods. An oligonucleotide may be capable of hybridising to a DNA genomic region of the invention, for example a DNA genomic region as defined in Tables 1 to 4, or a DNA genomic region comprising a DNA genomic region as defined in Tables 1 to 4, or a 2 to 99 bp DNA genomic region within a DNA genomic region defined in Tables 1 to 4 and comprising at least one CpG locus.
The term “comprising” as used in this specification and claims means “consisting at least in part of” or “consisting of”, that is to say when interpreting statements in this specification and claims which include the term, the features, prefaced by that term in each statement, all need to be present but other features can also be present. Related terms such as “comprise” and “comprised” are to be interpreted in a similar manner.
As used herein a “subtype of a cancer” (for example a “subtype of prostate cancer”) is a subset of a type of cancer based on characteristics of the cancer cells, and in particular molecular and genetic characteristics of the cells. Different cancer subtypes can have different disease progression and can respond or not respond to different treatments. The subtype of a cancer is, for example, used to assist in planning treatment and determine prognosis of the patient having that cancer subtype.
As used herein a “solid cancer cfDNA methylome signature” is a set of CpG loci and/or genomic regions that have a certain state of methylation in cfDNA derived from solid cancer cells. The pattern or fingerprint of methylation at the set of CpG loci and/or genomic regions is indicative of the solid cancer, and can provide information relating to the solid cancer, for example the level of solid cancer fraction in the cfDNA sample, a subtype of cancer (for example a genomic subtype), the aggression of the cancer, the prognosis of the cancer, and/or the tumour response to a treatment. A CpG locus or genomic region of a solid cancer cfDNA methylome signature may be tissue specific (for example, a certain state of methylation present in a particular tissue type, i.e. the tissue from which the cancer is derived) and/or cancer specific (for example, a certain state of methylation present in a particular cancer type). A CpG locus or genomic region of a solid cancer cfDNA methylome signature may have increased methylation compared to, for example, the methylation of the same locus or genomic region in a white blood cell and/or non-tumour cell and/or a different tissue to the cancer tissue, and especially compared to the methylation of the same locus or genomic region in a white blood cell. A CpG locus or genomic region of a solid cancer cfDNA methylome signature may have decreased methylation compared to, for example, the methylation of the same locus or genomic region in a white blood cell and/or non-tumour cell and/or a different tissue to the cancer tissue, and especially compared to the methylation of the same locus or genomic region in a white blood cell.
Tumour DNA circulates in the plasma of cancer patients admixed with DNA from non-cancerous cells. The genomic landscape of plasma DNA has been characterized in prostate cancer, for example in metastatic castration-resistant prostate cancer (mCRPC), but the plasma methylome has not been extensively explored. The identification of circulating methylation biomarkers can be challenging due to the heterogeneity of methylation. The traditional way to identify methylation markers starts with a comparison between cancer tissue and normal tissue methylation patterns; cancer-specific methylation loci are chosen and later validated in plasma samples. The present inventors have used an innovative approach and workflow to characterize the plasma methylome in mCRPC and have identified a unique set of methylation markers through an experimental design that uses an unbiased approach to investigate the methylation profile of tumour-derived cfDNA. The inventors' approach starts from profiling the plasma pan-methylome. They then applied unbiased dimensionality reduction algorithms, such as principal component analysis (PCA), and selected the regions most highly correlated with genomically-determined tumour fraction or the subtype of interest. The methylation markers identified by this approach can be used as cancer-specific methylation signatures in methods of the invention for high sensitivity and accurate tracking of tumour DNA in subjects with, for example, suspected or confirmed untreated or treated prostate cancer and/or for subtyping prostate cancer patients.
Furthermore, due to the large number of regions that the inventors have found to highly correlate with prostate and prostate cancer specific DNA methylation patterns and that show the greatest variance when compared to non-cancer plasma DNA in age-matched men, the inventors have been able to develop methods that are applicable to, for example, low-pass whole genome bisulfite sequencing data, and thus will be cost-effective and clinically scalable methods for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer.
Additionally, due to the methylation markers of the signatures of the present invention being based on variance compared to non-cancer plasma DNA in age-matched men, the signatures can be used in the methods described here to provide increased sensitivity and specificity for determining the level of prostate cancer fraction in a cfDNA sample, and in particular to detect significantly lower levels than is possible using genomic screening of cfDNA. Also, as methylation markers are not affected by clonal hematopoiesis in older populations (i.e. the formation of a genetically distinct subpopulation of blood cells), which can introduce false positives in genomic alteration-based tests, the methods of the present invention are applicable to subjects of all ages. Furthermore, as the methods of the invention determine methylation levels at multiple different methylation markers of the signatures of the present invention, the methods are not biased by inter-patient differences and genomic changes that could occur in normal cells and that could introduce a false positive result in the case of genomic testing.
Surprisingly, and due to the innovative workflow of the present invention, the methylation signatures of the present invention include methylation markers that are specific to either normal prostate tissue or prostate cancer tissue. The approach can be adapted and applied to other tumour types to identify circulating tumour-specific methylation signatures that can be used to accurately detect a tumour at earlier stages and quantitate tumour fraction. Also surprisingly, the signature found by the inventors did not include genes whose methylation status has been previously reported as diagnostic of prostate cancer, such as GSTP1, APC, RASSF1 and HOXD3 (Massie, C. E. et al. J Steroid Biochem Mol Biol 166, 1-15 (2017)). Although not wishing to be bound by theory, the present inventors postulate that this finding could be explained by highly variable methylation levels at the genomic regions of the signature in non-cancer plasma DNA compared to cancer plasma DNA. The inventors therefore understand that, in view of the signatures being found by the innovative workflow of the present invention, only the most stably methylated regions in non-cancer plasma DNA are identified as discriminators between non-cancer plasma DNA and cancer plasma DNA and are included in the signatures.
The present invention finds particular utility in risk stratification of men diagnosed with localised prostate cancer. Men with prostate cancer DNA detected in plasma using methods of the present invention can be staged, classified, and/or offered additional treatment with the aim of maximising cure whilst minimising over-treatment of men who do not require it. Furthermore, the methods of the invention can be used to identify patients with poorer prognosis, i.e. patients with a high tumour fraction level in the plasma or who have an aggressive subtype of cancer, so that a more intensive primary treatment can be selected. The methods can also be used for monitoring whether a treatment for prostate cancer is working or not, and for selecting further treatment, if necessary. Also, the half-life of plasma DNA is approximately 1 hour, so changes can be seen within days when a cancer is responding or not responding. Thus testing after the start of treatment (for example days or weeks after the start of treatment) could be used to identify men for whom treatment is ineffective and to guide a change to a more effective alternative, potentially improving outcomes and minimising unnecessary side-effects.
Currently, PSA testing is used to determine biochemical progression, and whole-body MRI scanning/PSMA testing is used for detecting metastases. PSA testing has come under much scrutiny for its reliability and overdiagnosis. Imaging modalities cannot detect metastatic disease as early as a ctDNA test. Imaging can only detect lesions >0.5-1 cm, i.e. 1 million cells or more. On the other hand, it is possible to detect DNA from a few hundred tumour cells in circulation. The methods of the present invention can therefore complement or replace imaging for more accurate detecting, screening, monitoring, staging, classification and prognostication of prostate cancer, and in particular metastatic prostate cancer.
Furthermore, the methods and approaches employed by the present inventors to find the signatures described herein can be used in methods to find further signatures useful for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of other solid cancers in a sample obtained from a subject, wherein the sample comprises circulating free DNA (cfDNA).
DNA cytosine methylation, also called DNA methylation or CpG methylation, plays an important part in multiple biological processes by interacting with specific methyl-CpG binding proteins or specific methyl-CpG binding domains (MBDs), a key messenger to other transcriptional regulators which result in histone modification, chromatin re-arrangement, and differential gene expression (Ballestar, E. & Esteller, M. Biochem Cell Biol 83, 374-384 (2005); Nakayama, T. et al. Lab Invest 80, 1789-1796 (2000)). Some DNA methylation is believed to remain constant in tumour clones and to have a unique inheritance, while some methylation consequences may be later events and result in a more malignant form of cancer (Beltran, H. et al. Nat Med 22, 298-305 (2016)). Therefore, it has been hypothesized that DNA methylation signatures could be an important indicator for both early carcinogenesis and advanced tumour progression.
Methods of the Invention to Determine the Level of Prostate Cancer Fraction in a cfDNA Sample
The present invention provides a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises circulating free DNA (cfDNA), the method comprising:
Tables 1 to 4 are provided below. The genomic locations have been separated into 4 tables based on whether a region including, having, or within the genomic location is a hypermethylated region (i.e. indicative of cancer when there is an increase in methylation level for the region) or a hypomethylated region (i.e. indicative of cancer when there is a decrease in methylation level for the region) when used in the method, and whether a region including, having, or within the genomic location is indicative of a methylation pattern specific to prostate tissue or is indicative of a methylation pattern specific to prostate cancer. The genomic locations of Tables 1 (and Table 1b) to 4 are locations with reference to hg19.
In Tables 1 to 4, where the gene indicated is “n/a” this means that the genomic location defined in the table is a non-coding region of DNA or not within the location of a known gene. In certain embodiments, the set of genomic locations listed in Table 1 does not include the genomic locations listed in Table 1b below:
The method is for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer. The prostate cancer may be any type of prostate cancer. Suitably, it may be acinar adenocarcinoma prostate cancer, ductal adenocarcinoma prostate cancer, transitional cell cancer of the prostate, squamous cell cancer of the prostate, or small cell prostate cancer. For example, it may be acinar adenocarcinoma prostate cancer or ductal adenocarcinoma prostate cancer. Alternatively, or additionally, the prostate cancer may be castration sensitive prostate cancer or castration resistant prostate cancer. Alternatively, or additionally, the prostate cancer may be metastatic prostate cancer, or it may be non-metastatic prostate cancer. In certain embodiments, it may be metastatic prostate cancer. In certain embodiments, the prostate cancer may be metastatic castration resistant prostate cancer or non-metastatic castration resistant prostate cancer. For example, it may be metastatic castration resistant prostate cancer.
The method is especially suitable for the detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of metastatic prostate cancer.
The method is also especially suitable for the detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of castration resistant prostate cancer.
The sample is a sample that comprises cfDNA. The sample may suitably be a blood sample, a plasma sample, or a urine sample. Preferably, the sample is a blood sample or a plasma sample. More preferably, the sample is a plasma sample.
The method may further comprise isolating the cfDNA from the sample. cfDNA can be isolated from the sample using a variety of techniques known in the art. For example, DNA (e.g., cfDNA) can be isolated by a column-based approach and/or a bead-based approach. In some embodiments, DNA (e.g., cfDNA) is isolated by means of a column-based approach, for example using a commercially available kit such as QIAamp circulating nucleic acid kit (Qiagen qiagen.com/ch/products/discovery-and-translational-research/dna-rna-purification/dna-purification/cell-free-dna/qiaamp-circulating-nucleic-acid-kit/#orderinginformation). In some embodiments, DNA (e.g., cfDNA) is isolated by means of a bead-based approach, for example an automated cf-DNA extraction system using a commercially available kit such as Maxwell RSC ccfDNA Plasma Kit (Promega (https://www.promega.co.uk/resources/protocols/technical-manuals/101/maxwell-rsc-ccfdna-plasma-kit-protocol/)).
The isolated cfDNA may be amplified before analysis. Thus the method may further comprise amplification of the isolated cfDNA. Amplification techniques are known to those of ordinary skill in the art and include, but are not limited to, cloning, polymerase chain reaction (PCR), polymerase chain reaction of specific alleles (PASA), polymerase chain ligation, nested polymerase chain reaction, and so forth.
The method comprises characterizing the methylome sequence of a plurality of cfDNA molecules in the sample, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule. The methylome sequence of a cfDNA molecule may be characterised by using methylation aware sequencing, by genome sequencing followed by methylation profiling, or by targeted approaches that capture specific DNA sequences (for example using DNA probes). Examples of methylation aware sequencing include bisulfite sequencing, bisulfite-free methylation-aware sequencing, methylation arrays (for example methylation microarrays), enzymatic methylation sequencing, methylation-sensitive restriction enzyme digestion, methylation-specific PCR, methylation aware PCR based assays, methylation-dependent DNA precipitation, and methylated DNA binding proteins/peptides. In certain embodiments, the methylome sequence of a plurality of cfDNA molecules in the sample is characterised using bisulfite sequencing, methylation microarrays, enzymatic methylation sequencing, bisulfite-free methylation-aware sequencing, or methylation aware PCR based assays.
Examples of targeted approaches that capture specific DNA sequences (for example using DNA probes) include cell-free methylated DNA immunoprecipitation and high-throughput sequencing (cfMeDIP-seq), methylation-dependent DNA precipitation, and methylated DNA binding proteins/peptides.
Bisulfite sequencing may comprise massive parallel sequencing with bisulfite conversion, for example treating the DNA molecule with sodium bisulfite and performing sequencing of the treated DNA molecule. Methylation assay sequencing may comprise treating the DNA molecule with sodium bisulfite, whole genome amplification, and hybridisation to a methylation-specific probe or a non-methylation probe, for example attached to a bead or chip.
Enzymatic methylation sequencing may comprise enzymatic treatment of the DNA molecule to convert methylated cytosine sites, followed by sequencing of the treated DNA. For example enzymatic methylation sequencing may comprise enzymatic treatment of the DNA molecule to convert methylated cytosine sites into a form protected from deamination, followed by deamination to convert unprotected cytosine to uracils, and sequencing of the treated DNA. An example of an enzymatic methylation sequencing kit includes NEBNext® Enzymatic Methyl-seq Kit (https://www.neb.com/products/e7120-nebnext-enzymatic-methyl-seq-kit#).
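The conversion-based approaches described above (bisulfite or enzymatic) share a common read-out: after conversion, an unmethylated cytosine is read as thymine while a methylated cytosine is still read as cytosine. The following minimal sketch illustrates how the methylation state of each CpG could be inferred from a single converted read. It assumes, for simplicity, that the read is already aligned to the reference on the same strand with no insertions or deletions; the function name is hypothetical and the sketch is not a substitute for a dedicated methylation caller.

```python
def call_cpg_methylation(reference, converted_read):
    """Infer CpG methylation from one converted, gap-free aligned read.

    reference and converted_read are equal-length strings on the same strand.
    Returns {position of the C in each CpG: True if methylated, False if not}.
    """
    if len(reference) != len(converted_read):
        raise ValueError("read and reference must be the same length")
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            base = converted_read[i]
            if base == "C":        # cytosine retained: methylated
                calls[i] = True
            elif base == "T":      # cytosine converted: unmethylated
                calls[i] = False
            # any other base (sequencing error or variant) is not called
    return calls
```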
Examples of methylation aware PCR based assays include digital droplet PCR and qPCR (quantitative PCR).
An example of bisulfite-free methylation-aware sequencing is Oxford Nanopore sequencing (Oxford Nanopore Technologies, https://nanoporetech.com/).
In certain embodiments, the methylome sequence of a plurality of cfDNA molecules in the sample is characterised using whole genome bisulfite sequencing, for example low pass whole genome bisulfite sequencing. In another embodiment, the methylome sequence of a plurality of cfDNA molecules in the sample is characterised using reduced representation bisulfite treatments. In certain embodiments, the methylome sequence of a plurality of cfDNA molecules in the sample is characterised using methylation arrays, for example methylation microarrays, such as an Illumina Methylation Assay.
A variety of genome sequencing procedures are known in the art and may be used to practice the methods disclosed herein. Examples include Sanger sequencing, Polony sequencing, 454 pyrosequencing, Combinatorial probe anchor synthesis, SOLiD sequencing, Ion Torrent semiconductor sequencing, DNA nanoball sequencing, Heliscope single molecule sequencing, Single molecule real time (SMRT) sequencing, Nanopore DNA sequencing, Microfluidic Sanger sequencing and Illumina dye sequencing.
A plurality of cfDNA molecules may be, for example, at least 100, at least 1000, at least 10,000, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000 (10⁶), at least 5,000,000 (5×10⁶), at least 10,000,000 (10⁷), at least 100,000,000 (10⁸), or at least 1,000,000,000 (10⁹). Preferably, a plurality of cfDNA molecules may be, for example, at least 10,000, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000 (10⁶), at least 5,000,000 (5×10⁶), at least 10,000,000 (10⁷), at least 100,000,000 (10⁸), or at least 1,000,000,000 (10⁹). More preferably, a plurality of cfDNA molecules may be, for example, at least 100,000, at least 500,000, at least 1,000,000 (10⁶), at least 5,000,000 (5×10⁶), at least 10,000,000 (10⁷), at least 100,000,000 (10⁸), or at least 1,000,000,000 (10⁹).
The method may further comprise aligning the methylome sequences with a reference genome for the subject, for example by aligning the methylome sequences with hg38, hg19, hg18, hg17 or hg16. The alignment can, for example, be carried out using a variety of techniques known in the art. For example, a DNA sequence alignment tool, (e.g., BSMAP (PMID: 19635165), Bismark (PMID: 21493656), gemBS (PMID: 30137223), Arioc (PMID: 29554207), BS-Seeker2 (PMID: 24206606), MethylCoder (PMID: 21724594) or BatMeth2 (PMID: 30669962)) can be used to align the reads to the reference genome (for example hg38, hg19, hg18, hg17 or hg16).
The genomic location assigned to each methylome sequence in the alignment is based on the reference genome adopted. The genomic locations listed in Tables 1, 1b, and 2 to 9 disclosed herein correspond to reference genome hg19. The corresponding locations in a different reference genome can be found using publicly available tools known in the art. An example of these tools is LiftOver (http://genome.ucsc.edu/).
In certain embodiments, the method comprises removing duplications of reads of the same DNA molecule (i.e. duplications of reads of the same cfDNA molecule). In this step, sequence reads having exactly the same sequence and start and end base pairs (i.e. the same unclipped alignment start and unclipped alignment end of the sequence) are removed, as they are likely to be duplicate sequence reads of the same sequence (i.e. duplicate of reads of the same cfDNA molecule). For example, PCR duplications can be removed as part of the aligning step, such as using Picard tools v2.1.0 (http://broadinstitute.github.io/picard).
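The duplicate-removal criterion described above (same sequence, same unclipped alignment start and same unclipped alignment end) can be illustrated with the following minimal sketch. It is not the algorithm implemented by Picard tools; the `AlignedRead` record and the function name are hypothetical, and the sketch assumes that one representative of each group of duplicates is retained.

```python
from typing import NamedTuple

class AlignedRead(NamedTuple):
    chrom: str       # chromosome of the alignment
    start: int       # unclipped alignment start
    end: int         # unclipped alignment end
    sequence: str    # read sequence

def remove_duplicate_reads(reads):
    """Keep the first read for each (chrom, start, end, sequence) combination."""
    seen = set()
    unique = []
    for read in reads:
        key = (read.chrom, read.start, read.end, read.sequence)
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique
```

Reads that share start and end positions but differ in sequence are kept in this sketch, since the criterion above requires both the coordinates and the sequence to match.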
The method comprises determining the average methylation ratio at 10 or more of the genomic regions for which the average methylation ratio has been determined, each genomic region being selected from the group consisting of:
In certain embodiments, each genomic region for which the average methylation ratio has been determined is covered by at least one sequence read of at least two characterized methylome sequences, for example at least one sequence read of at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 1000, 10,000 characterized methylome sequences. Preferably each genomic region is covered by at least one sequence read of at least two characterized methylome sequences, for example at least one sequence read of at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or 1000 characterized methylome sequences. In certain preferred embodiments, each genomic region is covered by at least one sequence read of at least 10 characterized methylome sequences, for example at least one sequence read of at least 10, at least 15, at least 20, at least 25, at least 50, at least 100, or at least 1000 characterized methylome sequences.
In certain embodiments, each genomic region for which the average methylation ratio has been determined is covered by at least 2 sequence reads, for example at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 50, 100, 200, 300, 400, 500, 1000, or 10,000 sequence reads. Preferably, each genomic region is covered by at least 5 sequence reads, for example at least 6, 7, 8, 9, 10, 12, 15, 20, 25, 50, 100, 200, 300, 400, 500, 1000, or 10,000 sequence reads. More preferably, each genomic region is covered by at least 10 sequence reads, for example at least 12, 15, 20, 25, 50, 100, 200, 300, 400, 500, 1000, or 10,000 sequence reads.
In embodiments wherein each genomic region for which the average methylation ratio has been determined is covered by at least 2 sequence reads (for example at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 50, 100, 200, 300, 400, 500, 1000, or 10,000 sequence reads) preferably each sequence read or the majority of the sequence reads (for example at least 50%, 60%, 70%, 80% or 90% of the sequence reads) are from different characterized methylome sequences. More preferably, each sequence read or at least 60%, 70%, 80% or 90% of the sequence reads are from different characterized methylome sequences.
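The coverage criteria set out above can be expressed as a simple filter applied to each genomic region before its average methylation ratio is used. The sketch below is illustrative only: the function name and default thresholds are assumptions, and the requirement that a given proportion of reads come from different characterized methylome sequences is interpreted here as the number of distinct source molecules divided by the number of reads.

```python
def region_passes_coverage(read_molecule_ids, min_reads=10,
                           min_distinct_fraction=0.5):
    """Check read depth and molecule diversity for one genomic region.

    read_molecule_ids: one entry per sequence read covering the region,
    giving an identifier of the cfDNA molecule the read came from.
    """
    n_reads = len(read_molecule_ids)
    if n_reads < min_reads:
        return False                      # not enough sequence reads
    n_molecules = len(set(read_molecule_ids))
    return n_molecules / n_reads >= min_distinct_fraction
```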
In certain embodiments the method comprises determining the average methylation ratio at 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, 150 or more genomic regions, 200 or more genomic regions, 300 or more genomic regions, 400 or more genomic regions, 500 or more genomic regions, 600 or more genomic regions, 700 or more genomic regions, 800 or more genomic regions, 900 or more genomic regions, or 1000 genomic regions. Each genomic region may be selected from the group consisting of:
The genomic regions are preferably each different from each other. In certain preferred embodiments, the method comprises determining the average methylation ratio at 100 or more genomic regions, 125 or more genomic regions, 150 or more genomic regions, 200 or more genomic regions, 300 or more genomic regions, 400 or more genomic regions, 500 or more genomic regions, 600 or more genomic regions, 700 or more genomic regions, 800 or more genomic regions, 900 or more genomic regions, or 1000 genomic regions. Each genomic region may be selected from the group consisting of:
In certain preferred embodiments, the method comprises determining the average methylation ratio at 500 or more genomic regions, 600 or more genomic regions, 700 or more genomic regions, 800 or more genomic regions, 900 or more genomic regions, or 1000 genomic regions. Each genomic region may be selected from the group consisting of:
In certain preferred embodiments, the method comprises determining the average methylation ratio at 800 or more genomic regions, 900 or more genomic regions, or 1000 genomic regions. Each genomic region may be selected from the group consisting of:
In one embodiment, each genomic region is selected from the group consisting of:
In such embodiments, preferably the method comprises determining the average methylation ratio at 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, 150 or more genomic regions, 200 or more genomic regions, 300 or more genomic regions, or 400 or more genomic regions. For example, the method comprises determining the average methylation ratio at 100 or more genomic regions.
In certain embodiments, each genomic region is selected from the group consisting of:
In such embodiments, preferably the method comprises determining the average methylation ratio at 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, 150 or more genomic regions, 200 or more genomic regions, 300 or more genomic regions, or 400 or more genomic regions. For example, the method comprises determining the average methylation ratio at 100 or more genomic regions.
In certain embodiments, each genomic region is selected from the group consisting of:
In such embodiments, preferably the method comprises determining the average methylation ratio at 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, 150 or more genomic regions, 200 or more genomic regions, 300 or more genomic regions, or 400 or more genomic regions. For example, the method comprises determining the average methylation ratio at 100 or more genomic regions.
In certain preferred embodiments, each genomic region is selected from the group consisting of:
In such preferred embodiments, preferably the method comprises determining the average methylation ratio at 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, 150 or more genomic regions, 200 or more genomic regions, 300 or more genomic regions, or 400 or more genomic regions. For example, the method comprises determining the average methylation ratio at 100 or more genomic regions.
In certain embodiments, each genomic region is selected from the group consisting of:
In such embodiments, preferably the method comprises determining the average methylation ratio at 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, 150 or more genomic regions, 200 or more genomic regions, 300 or more genomic regions, or 400 or more genomic regions. For example, the method comprises determining the average methylation ratio at 100 or more genomic regions.
In one preferred embodiment, each genomic region is selected from the group consisting of:
In such embodiments, preferably the method comprises determining the average methylation ratio at 10 or more genomic regions, 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, 150 or more genomic regions, 200 or more genomic regions, or 250 genomic regions. For example, the method comprises determining the average methylation ratio at 10 or more genomic regions, 50 or more genomic regions or 100 or more genomic regions.
In another preferred embodiment, each genomic region is selected from the group consisting of:
In such embodiments, preferably the method comprises determining the average methylation ratio at 10 or more genomic regions, at 12 or more genomic regions, for example at 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, or 150 genomic regions. For example, the method comprises determining the average methylation ratio at 10 or more genomic regions, 50 or more genomic regions or 100 or more genomic regions.
In another preferred embodiment, each genomic region is selected from the group consisting of:
a 100 to 200 bp region comprising or having a genomic location defined in Table 7, and a 2 to 99 bp region within a genomic location defined in Table 7 and comprising at least one CpG locus. More suitably, each genomic region is selected from the group consisting of: a 100 to 150 bp region comprising or having a genomic location defined in Table 7, and a 10 to 99 bp region within a genomic location defined in Table 7 and comprising at least one CpG locus. More suitably, each genomic region is selected from the group consisting of: a 100 to 120 bp region comprising or having a genomic location defined in Table 7, and a 50 to 99 bp region within a genomic location defined in Table 7 and comprising at least one CpG locus. More suitably, each genomic region is selected from the group consisting of: a 100 to 120 bp region comprising or having a genomic location defined in Table 7, and an 80 to 99 bp region within a genomic location defined in Table 7 and comprising at least one CpG locus. For example, each genomic region is selected from a 100 bp region having a genomic location defined in Table 7.
In such embodiments, preferably the method comprises determining the average methylation ratio at 10 or more genomic regions, at 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, or 100 genomic regions. For example, the method comprises determining the average methylation ratio at 10 or more genomic regions, 50 or more genomic regions or 100 genomic regions.
In certain preferred embodiments, at least 25% of the genomic regions comprise, have or are within a genomic location defined in Tables 1 and/or 2. For example, at least 25% of the genomic regions comprise or have a genomic location defined in Tables 1 and/or 2.
In certain preferred embodiments, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or all of the genomic regions comprise, have or are within a genomic location defined in Tables 1 and/or 2. For example, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or all of the genomic regions comprise or have a genomic location defined in Tables 1 and/or 2.
In certain embodiments, at least 25% of the genomic regions comprise, have or are within a genomic location defined in Tables 3 and/or 4. For example, at least 25% of the genomic regions comprise or have a genomic location defined in Tables 3 and/or 4.
In certain embodiments, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or all of the genomic regions comprise, have or are within a genomic location defined in Tables 3 and/or 4. For example, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or all of the genomic regions comprise or have a genomic location defined in Tables 3 and/or 4.
In certain embodiments, at least 25% of the genomic regions comprise, have or are within a genomic location defined in Tables 1 and/or 3. For example, at least 25% of the genomic regions comprise or have a genomic location defined in Tables 1 and/or 3.
In certain embodiments, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or all of the genomic regions comprise, have or are within a genomic location defined in Tables 1 and/or 3. For example, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or all of the genomic regions comprise or have a genomic location defined in Tables 1 and/or 3.
In certain embodiments, at least 25% of the genomic regions comprise, have or are within a genomic location defined in Tables 2 and/or 4. For example, at least 25% of the genomic regions comprise or have a genomic location defined in Tables 2 and/or 4.
In certain embodiments, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or all of the genomic regions comprise, have or are within a genomic location defined in Tables 2 and/or 4. For example, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or all of the genomic regions comprise or have a genomic location defined in Tables 2 and/or 4.
In certain preferred embodiments, determining the average methylation ratio for a genomic region comprises calculating the sum of the methylation ratios of all CpGs within the genomic region and dividing the sum by the number of CpGs within the genomic region. In such embodiments, the average methylation ratio may also be referred to as the mean methylation ratio. For the avoidance of doubt, if a genomic region has only one CpG locus, the average methylation ratio for the genomic region is the same as the methylation ratio for the single CpG locus in the genomic region.
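The calculation described above can be illustrated with a short sketch. The averaging step (sum of the per-CpG methylation ratios divided by the number of CpGs) follows the description; the way each per-CpG ratio is obtained here, as methylated read count divided by total read count at that CpG, is an assumption made for the example, as are the function name and the input format.

```python
def average_methylation_ratio(cpg_counts):
    """Mean of per-CpG methylation ratios for one genomic region.

    cpg_counts: list of (methylated_reads, total_reads), one pair per CpG.
    CpGs with no covering reads are skipped; returns None if none are covered.
    """
    ratios = [methylated / total for methylated, total in cpg_counts if total > 0]
    if not ratios:
        return None
    return sum(ratios) / len(ratios)

# Two CpGs with ratios 0.75 and 0.25 give a regional average of 0.5;
# a region with a single CpG simply takes that CpG's ratio.
example = average_methylation_ratio([(3, 4), (1, 4)])
```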
The method of the present invention comprises calculating a methylation score using the average methylation ratio for each genomic region for which the average methylation ratio has been determined.
In certain embodiments, calculating a methylation score using the average methylation ratio for each genomic region comprises:
In one preferred embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises:
In one preferred embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises:
In one preferred embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises
In very preferred embodiments wherein calculating a methylation score using the average methylation ratio for each genomic region comprises determining the median (or the mean) of the average methylation ratios for a first group of genomic regions to obtain a first methylation score and/or determining the median (or the mean) of the average methylation ratios for a second group of genomic regions to obtain a second methylation score, the first group of genomic regions are all of the hypermethylated genomic regions (i.e. all hypermethylated genomic regions for which an average methylation ratio has been determined in the method, i.e. selected from those comprising, having or within a genomic location defined in Table 1 or 2), and the second group of genomic regions are all of the hypomethylated genomic regions (i.e. all hypomethylated genomic regions for which an average methylation ratio has been determined in the method, i.e. selected from those comprising, having or within a genomic location defined in Table 3 or 4, or Table 6).
In another embodiment, the first group of genomic regions are all of the genomic regions (for which the average methylation ratio has been determined) having a methylation pattern specific to prostate tissue (i.e. selected from those comprising, having or within a genomic location defined in Table 1 or 3), and the second group of genomic regions are all of the genomic regions (for which the average methylation ratio has been determined) having a methylation pattern specific to prostate cancer (i.e. selected from those comprising, having or within a genomic location defined in Table 2 or 4).
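By way of illustration, the first and second methylation scores described above may be computed as in the following sketch. The region identifiers and values are hypothetical placeholders, and the assignment of regions to each group would in practice follow the relevant tables.

```python
from statistics import median


def group_methylation_score(avg_ratio_by_region, group_regions, use_mean=False):
    """Median (or mean) of the average methylation ratios for a group of regions.

    avg_ratio_by_region: dict mapping region identifier -> average methylation ratio.
    group_regions: identifiers of the regions belonging to the group.
    Regions for which no average methylation ratio was determined are skipped.
    """
    values = [avg_ratio_by_region[r] for r in group_regions if r in avg_ratio_by_region]
    if not values:
        raise ValueError("no average methylation ratio determined for this group")
    return sum(values) / len(values) if use_mean else median(values)


# Hypothetical average methylation ratios for five regions.
avg_ratios = {"hyper_1": 0.75, "hyper_2": 0.5, "hyper_3": 1.0,
              "hypo_1": 0.25, "hypo_2": 0.125}

# First group: hypermethylated regions; second group: hypomethylated regions.
first_score = group_methylation_score(avg_ratios, ["hyper_1", "hyper_2", "hyper_3"])
second_score = group_methylation_score(avg_ratios, ["hypo_1", "hypo_2"])
print(first_score, second_score)
```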
In one preferred embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises
In one preferred embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises
In one preferred embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises
In one embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises comparing the average methylation ratio at each genomic region to a reference methylation ratio for each genomic region to determine a methylation ratio score for each genomic region. In such embodiments, preferably the reference methylation ratio is the average methylation ratio for the same genomic region in or covered by:
In one preferred embodiment, the reference methylation ratio is the average methylation ratio for the same genomic region in or covered by
The method of the present invention comprises analyzing the methylation ratio scores to determine the level of prostate cancer fraction in the cfDNA sample. For example, no level (for example no detectable level) of prostate cancer fraction in the cfDNA sample may be determined. Alternatively, a level of cancer fraction in the cfDNA sample may be determined. The minimum percentage level of prostate cancer fraction in the cfDNA sample that may be determined may be 0.01% of cancer fraction in the cfDNA sample. In certain embodiments, the minimum percentage level of prostate cancer fraction in the cfDNA sample that may be determined may be 0.02%, 0.03%, 0.04%, 0.05%, 0.06%, 0.07%, 0.08%, 0.09%, 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, or 1% of cancer fraction in the cfDNA sample. For example, the minimum percentage level of prostate cancer fraction in the cfDNA sample that may be determined may be 0.01%, 0.05%, 0.1% or 0.5% of cancer fraction in the cfDNA sample. Preferably, the minimum percentage level of prostate cancer fraction in the cfDNA is 0.01%.
The method comprises analyzing the methylation score to determine the level of prostate cancer fraction in the cfDNA sample.
Preferably, analyzing the methylation score to determine the level of prostate cancer fraction in the cfDNA sample comprises comparing the methylation score to one or more reference methylation scores. For example, the method may comprise comparing the methylation score to one reference methylation score. In certain embodiments, the method comprises comparing the methylation score to two or more reference methylation scores, for example 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 20, 30, 50, 100, 200, 300, 400, 500 or 1000 reference methylation scores. In certain embodiments, the method comprises comparing the methylation score to 5 or more reference methylation scores, for example 10 or more, 15 or more, 20 or more, 30 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, or 1000 or more reference methylation scores.
In embodiments wherein the method comprises comparing the methylation score to two or more reference methylation scores, the reference methylation scores may come from different types of reference samples and/or reference methylomes (for example a cfDNA sample from a healthy subject and a cancer cell line sample) and/or the same type of reference samples or reference methylomes but from different sources (for example, two or more cfDNA samples each from a different healthy subject).
A reference methylation score is a methylation score calculated for the same genomic regions (for example, calculated using the average methylation ratio for the same genomic regions) in a reference sample or reference methylome. A reference sample or reference methylome may be selected from the group consisting of:
a cfDNA sample from a healthy subject, for example a healthy age-matched subject;
a tissue sample from a healthy subject, for example a prostate tissue sample from a healthy subject;
a cancer biopsy sample from a cancer patient, for example a prostate cancer biopsy sample from a prostate cancer patient;
a cancer cell line sample, for example a prostate cancer cell line sample from a prostate cancer cell line;
a sample of white blood cells from a subject, for example the subject or a healthy subject;
a cfDNA sample from a different subject having prostate cancer, wherein the level of prostate cancer fraction in the cfDNA sample from the different subject is known (preferably multiple cfDNA samples (for example at least 2, 3, 4, 5, 10, 20, 40, 50 or 100 samples) each from a different subject having prostate cancer, wherein the level of prostate cancer fraction in each cfDNA sample from the different subjects is known, and preferably wherein each cfDNA sample has a different level of prostate cancer fraction);
a characterized methylome sequence of a white blood cell;
a characterized methylome sequence of a prostate cancer cell line;
a characterized methylome sequence of a cancerous prostate cell; and/or
a characterized methylome sequence of a non-cancerous prostate cell.
A reference sample or reference methylome may be one that can be used to represent a sample having 0% tumour fraction, for example a reference sample or reference methylome selected from one or more of the following:
a cfDNA sample from a healthy subject, for example a healthy age-matched subject;
a sample of white blood cells from a subject, for example the subject or a healthy subject; and/or
a characterized methylome sequence of a white blood cell.
A reference sample or reference methylome may be one that can be used to represent a sample having 100% tumour fraction, for example a reference sample or reference methylome selected from one or more of the following:
a cancer biopsy sample from a cancer patient, for example a prostate cancer biopsy sample from a prostate cancer patient;
a cancer cell line sample, for example a prostate cancer cell line sample from a prostate cancer cell line;
a characterized methylome sequence of a prostate cancer cell line; and/or
a characterized methylome sequence of a cancerous prostate cell.
A reference sample or reference methylome may be one that can be used to represent a sample having 10 to 90% tumour fraction, for example one or more cfDNA samples from different subjects having prostate cancer, wherein the level of prostate cancer fraction in each cfDNA sample from the different subjects is known. The level of prostate cancer fraction in each such cfDNA sample can be determined by looking at genomic markers.
Preferably, analyzing the methylation score to determine the level of prostate cancer fraction in the cfDNA sample comprises comparing the methylation score to one or more reference methylation scores that can be used to represent a sample having 100% tumour fraction, and can be used to represent a sample having 0% tumour fraction, and optionally can be used to represent a sample having 10-90% tumour fraction. For example, analyzing the methylation score to determine the level of prostate cancer fraction in the cfDNA sample comprises:
comparing the methylation score to one or more reference methylation scores for a reference sample or reference methylome selected from the group consisting of:
Preferably, the reference methylation score for a reference sample or reference methylome that a methylation ratio score is compared to is calculated in the same way as the methylation score for the sample obtained from the subject (i.e. the sample that the method of the invention is being carried out in respect of). For example, if the methylation ratio for the selected genomic regions of the sample obtained from the subject is calculated by determining the median (or the mean) of the average methylation ratios for a first group of genomic regions to obtain a first methylation score and/or determining the median (or the mean) of the average methylation ratios for a second group of genomic regions to obtain a second methylation score, the reference methylation score for a reference sample or reference methylome is calculated by determining the median (or the mean) of the average methylation ratios for the same first group of genomic regions to obtain a first reference methylation score and/or determining the median (or the mean) of the average methylation ratios for the same second group of genomic regions to obtain a second reference methylation score.
Or, for example, if the methylation ratio for the selected genomic regions of the sample obtained from the subject is calculated by determining the median (or the mean) of the average methylation ratios for all genomic regions, the reference methylation score for a reference sample or reference methylome is calculated by determining the median (or the mean) of the average methylation ratios for the same genomic regions.
In embodiments wherein the method comprises comparing the average methylation ratio at each genomic region to a reference methylation ratio for each genomic region to determine a methylation ratio score for each genomic region, analyzing the methylation ratio scores to determine the level of prostate cancer fraction in the cfDNA sample may comprise determining how many methylation ratio scores are indicative of prostate cancer fraction in the cfDNA sample.
In certain embodiments, analyzing the methylation score to determine the level of prostate cancer fraction in the cfDNA sample comprises using a mathematical model, such as a linear regression model or another linear model (for example, a general linear model, a heteroscedastic model, a generalised linear model, or a hierarchical linear model).
In certain embodiments, analyzing the methylation score to determine the level of prostate cancer fraction in the cfDNA sample comprises using a mathematical model that compares the methylation score for the sample to reference methylation scores that can be used to represent a sample having 100% tumour fraction, and can be used to represent a sample having 0% tumour fraction, and optionally can be used to represent a sample having 10-90% tumour fraction. For example, the method comprises using a mathematical model that compares the methylation score for the sample to reference methylation scores for a cfDNA sample from a healthy subject, for example a healthy age-matched subject (0% tumour fraction) and/or a characterized methylome sequence of a white blood cell (0% tumour fraction) and/or a sample of white blood cells from a subject, for example the subject or a healthy subject, (0% tumour fraction) and/or a characterized methylome sequence of a prostate cancer cell line (100% tumour fraction) and/or a prostate cancer biopsy sample from a prostate cancer patient (100% tumour fraction) and/or one or more cfDNA samples (for example at least 2, 3, 4, 5, 10, 20, 40, 50, 100, 200, 300 or 500 samples) each from a different subject having prostate cancer, wherein the level of prostate cancer fraction in each cfDNA sample from the different subjects is known, and preferably wherein each cfDNA sample has a different level of prostate cancer fraction (10-90% tumour fraction).
In one embodiment, the method comprises using a mathematical model that compares the methylation score for the sample to reference methylation scores for a cfDNA sample from a healthy subject, for example a healthy age-matched subject (0% tumour fraction) and/or a characterized methylome sequence of a prostate cancer cell line (100% tumour fraction) and/or a prostate cancer biopsy sample from a prostate cancer patient (100% tumour fraction) and/or one or more cfDNA samples (for example at least 2, 3, 4, 5, 10, 20, 40, 50, 100, 200, 300 or 500 samples) each from a different subject having prostate cancer, wherein the level of prostate cancer fraction in each cfDNA sample from the different subjects is known, and preferably wherein each cfDNA sample has a different level of prostate cancer fraction (10-90% tumour fraction).
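As a non-limiting illustration of such a mathematical model, the following sketch fits a simple ordinary least squares line relating reference methylation scores to their known tumour fractions, and then uses that line to estimate the tumour fraction of a test sample. The single-predictor form, the function names and all reference values are assumptions made for the purpose of the example; other linear models of the kinds listed above could equally be used.

```python
def fit_line(scores, fractions):
    """Ordinary least squares fit: fraction = slope * score + intercept."""
    n = len(scores)
    if n < 2 or n != len(fractions):
        raise ValueError("need at least two (score, fraction) reference pairs")
    mean_x = sum(scores) / n
    mean_y = sum(fractions) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    if sxx == 0:
        raise ValueError("reference methylation scores must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, fractions))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_tumour_fraction(score, slope, intercept):
    """Predict the tumour fraction for a sample score, clamped to [0, 1]."""
    return min(1.0, max(0.0, slope * score + intercept))


# Hypothetical reference methylation scores (e.g. for hypermethylated regions):
# healthy cfDNA (0%), a cfDNA sample of known 50% fraction, a cell line (100%).
reference_scores = [0.125, 0.5, 0.875]
reference_fractions = [0.0, 0.5, 1.0]

slope, intercept = fit_line(reference_scores, reference_fractions)
print(estimate_tumour_fraction(0.3, slope, intercept))
```

Clamping the prediction to the range 0 to 1 is a design choice of this sketch, made so that scores lying outside the range spanned by the references do not yield impossible fractions.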
The method may further comprise measuring the level of prostate-specific antigen (PSA) in a sample of blood from the subject. It may also comprise determining if the subject has an abnormal level of PSA in the blood (for example a level of PSA in the blood of at least 4.0 ng/mL). An abnormal level of PSA in the blood may be, for example, a level of PSA in the blood of at least 4.0 ng/mL. A normal level of PSA in the blood may, for example, be a level of PSA in the blood of 4.0 ng/mL or less.
In one preferred embodiment, the method is for screening, monitoring, and/or prognostication of prostate cancer, wherein prostate cancer with a poor prognosis is predicted when a level of prostate cancer is determined, for example a detectable level of prostate cancer, for example a percentage level of prostate cancer fraction of at least 0.01%. For example, a prostate cancer with a poor prognosis is predicted when at least 0.01% prostate cancer fraction is determined, or for example, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.1%, at least 0.5% or at least 1% prostate cancer fraction is determined.
In some instances, a “poor” prognosis refers to a low likelihood that a subject will respond favorably to a drug or set of drugs, will enter complete or partial remission, or will show a decrease and/or a stop in the progression of prostate cancer. In some instances, a “poor” prognosis refers to a survival of a subject that is expected to be from less than 5 years to less than 1 month. In some instances, a “poor” prognosis refers to a survival of a subject in which the survival of the subject upon treatment is expected to be from less than 5 years to less than 1 month.
In one preferred embodiment, the method is for detection of prostate cancer, wherein prostate cancer is detected when a level of prostate cancer is determined, for example a detectable level of prostate cancer, for example a percentage level of prostate cancer fraction of at least 0.01%, or for example, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.1%, at least 0.5% or at least 1% prostate cancer fraction.
In one preferred embodiment, the method is for screening, monitoring, and/or prognostication of prostate cancer, wherein prostate cancer with a poor prognosis is predicted when a level of prostate cancer is determined, for example a detectable level of prostate cancer, for example a percentage level of prostate cancer fraction of at least 0.01%, for example at least 0.01% prostate cancer fraction, or for example, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.1%, at least 0.5% or at least 1% prostate cancer fraction.
In one preferred embodiment, the method is for detecting, screening and/or prognostication of metastatic prostate cancer, wherein metastatic prostate cancer is predicted when a level of prostate cancer is determined, for example a detectable level of prostate cancer, for example a percentage level of prostate cancer fraction of at least 0.01%, or for example, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.1%, at least 0.5% or at least 1% prostate cancer fraction.
In one preferred embodiment, the method is for selecting treatment of prostate cancer or ascertaining whether treatment is working in prostate cancer, wherein a new treatment is selected when a level of prostate cancer is determined, for example a detectable level of prostate cancer, for example a percentage level of prostate cancer fraction of at least 0.01%, or for example, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.1%, at least 0.5% or at least 1% prostate cancer fraction.
In one preferred embodiment, the method is for ascertaining whether treatment of prostate cancer is working, wherein it is determined that the treatment is not working when a level of prostate cancer is determined, for example a detectable level of prostate cancer, for example a percentage level of prostate cancer fraction of at least 0.01%, or for example, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.1%, at least 0.5% or at least 1% prostate cancer fraction.
The method may further comprise repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, and comparing the level of prostate cancer fraction in each sample. Preferably, the second sample is of the same type as the first sample, for example if the first sample is a plasma sample then the second sample is a plasma sample. The method may further comprise repeating the method on a third, and optionally a 4th, 5th, 6th, 7th, 8th, 9th and/or 10th, sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the third, and optionally the 4th, 5th, 6th, 7th, 8th, 9th and/or 10th, sample comprises circulating free DNA (cfDNA), and comparing the level of prostate cancer fraction in each sample. Preferably, all samples are of the same type as the first sample, for example if the first sample is a plasma sample then all other samples are plasma samples.
In one preferred embodiment, the method is for monitoring of prostate cancer, wherein the method comprises repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, and comparing the level of prostate cancer fraction in each sample.
In one preferred embodiment, the method is for selecting treatment of prostate cancer, comprising repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, and comparing the level of prostate cancer fraction in each sample, wherein a new treatment is selected if the level of prostate cancer is increased in the second sample, for example an increase of at least 0.01%.
In one preferred embodiment, the method is for ascertaining whether treatment of prostate cancer is working, comprising repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, wherein it is determined that the treatment is not working if the level of prostate cancer is increased in the second sample, for example an increase of at least 0.01%.
In one preferred embodiment, the method is for prognostication of prostate cancer, comprising repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, wherein it is determined that the prognosis is poor if the level of prostate cancer is increased in the second sample, for example an increase of at least 0.01%. In one preferred embodiment, the method is for prognostication of prostate cancer, comprising repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, wherein it is determined that the prognosis is good if the level of prostate cancer is decreased in the second sample, for example a decrease of at least 0.01%. In some instances, a “good” prognosis refers to the likelihood that a subject will likely respond favorably to a drug or set of drugs, leading to a complete or partial remission, or a decrease and/or a stop in the progression of prostate cancer. In some instances, a “good” prognosis refers to the survival of a subject of from at least 1 month to at least 90 years. In some instances, a “good” prognosis refers to the survival of a subject in which the survival of the subject upon treatment is from at least 1 month to at least 90 years.
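The comparison of the level of prostate cancer fraction between a first sample and a second sample, as described above, may be sketched as follows. The function name is hypothetical, and the sketch assumes that the 0.01% threshold is an absolute difference in percentage points rather than a relative change, which is one possible reading of the text.

```python
def compare_cancer_fraction(first_pct, second_pct, min_change_pct=0.01):
    """Classify the change in prostate cancer fraction between two samples.

    first_pct, second_pct: cancer fraction in each sample, in percent.
    min_change_pct: smallest absolute change (percentage points) treated as
    an increase or a decrease; assumed here to be 0.01.
    """
    change = second_pct - first_pct
    if change >= min_change_pct:
        return "increased"   # e.g. treatment not working / select a new treatment
    if change <= -min_change_pct:
        return "decreased"   # e.g. consistent with a good prognosis
    return "unchanged"


# Hypothetical serial samples taken before and after treatment.
print(compare_cancer_fraction(0.5, 0.75))
print(compare_cancer_fraction(0.5, 0.25))
print(compare_cancer_fraction(0.5, 0.5))
```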
In certain preferred embodiments, the method of the present invention comprises the additional step of obtaining a biological sample from a subject.
The methods of the invention can be used with the kits, methods of treatment, therapeutic agents for the treatment of prostate cancer, methods of determining one or more suitable therapeutic agents for the treatment of prostate cancer, methods for determining a treatment regimen, computerized (or computer implemented) methods, computer-assisted methods, computer products and/or computer implemented software described herein. Embodiments and preferred embodiments for the methods of the invention are equally applicable to the kits, methods of treatment, therapeutic agents for the treatment of prostate cancer, methods of determining one or more suitable therapeutic agents for the treatment of prostate cancer, methods for determining a treatment regimen, computerized (or computer implemented) methods, computer-assisted methods, computer products and/or computer implemented software described herein.
Methods of the Invention to Determine Whether a Sample Comprises cfDNA Derived from a Prostate Cancer Subtype
The present invention also provides a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA, the method comprising:
Table 8 is provided below. The genomic locations of Table 8 are locations with reference to hg19.
The prostate cancer subtype is one that has an aggressive clinical course and/or androgen receptor (AR) copy number gain, for example an androgen-insensitive prostate cancer subtype. The prostate cancer subtype may be a subtype (i.e. one having an aggressive clinical course and/or androgen receptor (AR) copy number gain) of acinar adenocarcinoma prostate cancer, ductal adenocarcinoma prostate cancer, transitional cell cancer of the prostate, squamous cell cancer of the prostate, or small cell prostate cancer. For example, it may be a subtype (i.e. one having an aggressive clinical course and/or androgen receptor (AR) copy number gain) of acinar adenocarcinoma prostate cancer or ductal adenocarcinoma prostate cancer. Alternatively, or additionally, the prostate cancer may be castration sensitive prostate cancer or castration resistant prostate cancer. Alternatively, or additionally, the prostate cancer may be metastatic prostate cancer, or it may be non-metastatic prostate cancer. In certain embodiments, it may be metastatic prostate cancer. In certain embodiments, the prostate cancer may be metastatic castration resistant prostate cancer or non-metastatic castration resistant prostate cancer. For example, it may be metastatic castration resistant prostate cancer.
The method is especially suitable for the detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of metastatic prostate cancer and/or castration resistant prostate cancer, and particularly prostate cancer subtypes that have an aggressive clinical course and androgen receptor (AR) copy number gain, for example an androgen-insensitive prostate cancer subtype.
The sample is a sample that comprises cfDNA. The sample may suitably be a blood sample, a plasma sample, or a urine sample. Preferably, the sample is a blood sample or a plasma sample. More preferably, the sample is a plasma sample.
The method may further comprise isolating the cfDNA from the sample. cfDNA can be isolated from the sample using a variety of techniques known in the art. For example, DNA (e.g., cfDNA) can be isolated by a column-based approach and/or a bead-based approach. In some embodiments, DNA (e.g., cfDNA) is isolated by means of a column-based approach, for example using a commercially available kit such as QIAamp circulating nucleic acid kit (Qiagen, qiagen.com/ch/products/discovery-and-translational-research/dna-rna-purification/dna-purification/cell-free-dna/qiaamp-circulating-nucleic-acid-kit/#orderinginformation). In some embodiments, DNA (e.g., cfDNA) is isolated by means of a bead-based approach, for example an automated cfDNA extraction system using a commercially available kit such as Maxwell RSC ccfDNA Plasma Kit (Promega, https://www.promega.co.uk/resources/protocols/technical-manuals/101/maxwell-rsc-ccfdna-plasma-kit-protocol/).
The isolated cfDNA may be amplified before analysis. Thus the method may further comprise amplification of the isolated cfDNA. Amplification techniques are known to those of ordinary skill in the art and include, but are not limited to, cloning, polymerase chain reaction (PCR), polymerase chain reaction of specific alleles (PASA), polymerase chain ligation, nested polymerase chain reaction, and so forth.
The method comprises characterizing the methylome sequence of a plurality of cfDNA molecules in the sample, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule. The methylome sequence of a cfDNA molecule may be characterised by using methylation aware sequencing, by genome sequencing followed by methylation profiling, or by targeted approaches that capture specific DNA sequences (for example using DNA probes). Examples of methylation aware sequencing include bisulfite sequencing, bisulfite-free methylation-aware sequencing, methylation arrays (for example methylation microarrays), enzymatic methylation sequencing, methylation-sensitive restriction enzyme digestion, methylation-specific PCR, methylation aware PCR based assays, methylation-dependent DNA precipitation, methylated DNA binding proteins/peptides, and single molecule sequencing without sodium bisulfite treatment. In certain embodiments, the methylome sequence of a plurality of cfDNA molecules in the sample is characterised using bisulfite sequencing, methylation microarrays, enzymatic methylation sequencing, bisulfite-free methylation-aware sequencing, or methylation aware PCR based assays.
Examples of targeted approaches that capture specific DNA sequences (for example using DNA probes) include cell-free methylated DNA immunoprecipitation and high-throughput sequencing (cfMeDIP-seq), methylation-dependent DNA precipitation, and methylated DNA binding proteins/peptides.
Bisulfite sequencing may comprise massive parallel sequencing with bisulfite conversion, for example treating the DNA molecule with sodium bisulfite and performing sequencing of the treated DNA molecule. Methylation assay sequencing may comprise treating the DNA molecule with sodium bisulfite, whole genome amplification, and hybridisation to a methylation-specific probe or a non-methylation probe, for example attached to a bead or chip.
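For illustration, a per-CpG methylation ratio can be derived from bisulfite sequencing reads as in the following sketch. It relies on the general principle that bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines remain C. The function name and the input representation (one base call per read covering the CpG cytosine) are assumptions made for the example, not a description of any particular pipeline.

```python
def cpg_methylation_ratio(base_calls):
    """Fraction of reads supporting methylation at a single CpG cytosine.

    base_calls: iterable of bases observed at the cytosine position after
    bisulfite conversion, one per covering read ('C' = methylated,
    'T' = unmethylated). Other characters (e.g. 'N') are ignored.
    """
    methylated = sum(1 for b in base_calls if b == "C")
    unmethylated = sum(1 for b in base_calls if b == "T")
    total = methylated + unmethylated
    if total == 0:
        raise ValueError("no informative reads cover this CpG locus")
    return methylated / total


# Hypothetical CpG covered by four informative reads and one ambiguous call.
print(cpg_methylation_ratio("CCCTN"))
```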
Enzymatic methylation sequencing may comprise enzymatic treatment of the DNA molecule to convert methylated cytosine sites, followed by sequencing of the treated DNA. For example enzymatic methylation sequencing may comprise enzymatic treatment of the DNA molecule to convert methylated cytosine sites into a form protected from deamination, followed by deamination to convert unprotected cytosine to uracils, and sequencing of the treated DNA. An example of an enzymatic methylation sequencing kit includes NEBNext® Enzymatic Methyl-seq Kit (https://www.neb.com/products/e7120-nebnext-enzymatic-methyl-seq-kit#).
Examples of methylation aware PCR based assays include digital droplet PCR and qPCR (quantitative PCR).
An example of bisulfite-free methylation-aware sequencing is Oxford Nanopore sequencing (Oxford Nanopore Technologies, https://nanoporetech.com/).
In certain embodiments, the methylome sequence of a plurality of cfDNA molecules in the sample is characterised using whole genome bisulfite sequencing, for example low pass whole genome bisulfite sequencing. In another embodiment, the methylome sequence of a plurality of cfDNA molecules in the sample is characterised using reduced representation bisulfite sequencing. In certain embodiments, the methylome sequence of a plurality of cfDNA molecules in the sample is characterised using methylation arrays, for example methylation microarrays, such as an Illumina Methylation Assay.
A variety of genome sequencing procedures are known in the art and may be used to practice the methods disclosed herein. Examples include Sanger sequencing, Polony sequencing, 454 pyrosequencing, Combinatorial probe anchor synthesis, SOLiD sequencing, Ion Torrent semiconductor sequencing, DNA nanoball sequencing, Heliscope single molecule sequencing, Single molecule real time (SMRT) sequencing, Nanopore DNA sequencing, Microfluidic Sanger sequencing and Illumina dye sequencing.
A plurality of cfDNA molecules may be, for example, at least 100, at least 1000, at least 10,000, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000 (10^6), at least 5,000,000 (5×10^6), at least 10,000,000 (10^7), at least 100,000,000 (10^8), or at least 1,000,000,000 (10^9). Preferably, a plurality of cfDNA molecules may be, for example, at least 10,000, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000 (10^6), at least 5,000,000 (5×10^6), at least 10,000,000 (10^7), at least 100,000,000 (10^8), or at least 1,000,000,000 (10^9). More preferably, a plurality of cfDNA molecules may be, for example, at least 100,000, at least 500,000, at least 1,000,000 (10^6), at least 5,000,000 (5×10^6), at least 10,000,000 (10^7), at least 100,000,000 (10^8), or at least 1,000,000,000 (10^9).
The method may further comprise aligning the methylome sequences with a reference genome for the subject, for example by aligning the methylome sequences with hg38, hg19, hg18, hg17 or hg16. The alignment can, for example, be carried out using a variety of techniques known in the art. For example, a DNA sequence alignment tool, (e.g., BSMAP (PMID: 19635165), Bismark (PMID: 21493656), gemBS (PMID: 30137223), Arioc (PMID: 29554207), BS-Seeker2 (PMID: 24206606), MethylCoder (PMID: 21724594) or BatMeth2 (PMID: 30669962)) can be used to align the reads to the reference genome (for example hg38, hg19, hg18, hg17 or hg16).
The genomic location assigned to each methylome sequence in the alignment is based on the reference genome adopted. The genomic locations listed in Tables 1, 1b, 2 to 9 disclosed herein correspond to reference genome hg19. The corresponding locations in a different reference genome can be found using public available tools known in the art. An example of these tools is LiftOver (http://genome.ucsc.edu/).
In certain embodiments, the method comprises removing duplications of reads of the same DNA molecule (i.e. duplications of reads of the same cfDNA molecule). In this step, sequence reads having exactly the same sequence and start and end base pairs (for example the same unclipped alignment start and unclipped alignment end of the sequence) are removed, as they are likely to be duplicate sequence reads of the same sequence (i.e. duplicates of reads of the same cfDNA molecule). For example, PCR duplications can be removed as part of the aligning step, such as using Picard tools v2.1.0 (http://broadinstitute.github.io/picard).
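By way of illustration only, the duplicate-removal criterion described above (same sequence, same unclipped alignment start and same unclipped alignment end) can be sketched as follows. This is a minimal sketch and not the Picard implementation; the `AlignedRead` record, its field names and the function `remove_duplicate_reads` are hypothetical names introduced for this example, and the choice to retain the first read of each duplicate set is an assumption.

```python
from typing import NamedTuple


class AlignedRead(NamedTuple):
    """Hypothetical minimal record for one aligned sequence read."""
    chrom: str
    unclipped_start: int
    unclipped_end: int
    sequence: str


def remove_duplicate_reads(reads):
    """Keep the first read seen for each (chrom, start, end, sequence) key.

    Reads sharing all four values are treated as duplicates of the same
    cfDNA molecule; keeping the first one is an assumption of this sketch.
    """
    seen = set()
    unique = []
    for read in reads:
        key = (read.chrom, read.unclipped_start, read.unclipped_end, read.sequence)
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique


reads = [
    AlignedRead("chr1", 100, 250, "ACGT"),
    AlignedRead("chr1", 100, 250, "ACGT"),  # exact duplicate, dropped
    AlignedRead("chr1", 100, 251, "ACGT"),  # different end position, kept
]
deduplicated = remove_duplicate_reads(reads)
```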
The method comprises determining the average methylation ratio at 10 or more of the genomic regions for which the average methylation ratio has been determined, each genomic region being selected from the group consisting of:
In one preferred embodiment, the method comprises determining the average methylation ratio at 10 or more of the genomic regions for which the average methylation ratio has been determined, each genomic region being selected from the group consisting of:
a 100 to 200 bp region comprising or having a genomic location defined in Table 9, and
a 2 to 99 bp region within a genomic location defined in Table 9 and comprising at least one CpG locus,
and wherein each of the genomic regions is covered by at least one sequence read of at least one characterized methylome sequence.
In certain embodiments, each genomic region for which the average methylation ratio has been determined is covered by at least one sequence read of at least two characterized methylome sequences, for example at least one sequence read of at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 1000, 10,000 characterized methylome sequences. Preferably each genomic region is covered by at least one sequence read of at least two characterized methylome sequences, for example at least one sequence read of at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, or 1000 characterized methylome sequences. In certain preferred embodiments, each genomic region is covered by at least one sequence read of at least 10 characterized methylome sequences, for example at least one sequence read of at least 10, at least 15, at least 20, at least 25, at least 50, at least 100, or at least 1000 characterized methylome sequences.
In certain embodiments, each genomic region for which the average methylation ratio has been determined is covered by at least 2 sequence reads, for example at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 50, 100, 200, 300, 400, 500, 1000, or 10,000 sequence reads. Preferably, each genomic region is covered by at least 5 sequence reads, for example at least 6, 7, 8, 9, 10, 12, 15, 20, 25, 50, 100, 200, 300, 400, 500, 1000, or 10,000 sequence reads. More preferably, each genomic region is covered by at least 10 sequence reads, for example at least 12, 15, 20, 25, 50, 100, 200, 300, 400, 500, 1000, or 10,000 sequence reads.
In embodiments wherein each genomic region for which the average methylation ratio has been determined is covered by at least 2 sequence reads (for example at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 50, 100, 200, 300, 400, 500, 1000, or 10,000 sequence reads) preferably each sequence read or the majority of the sequence reads (for example at least 50%, 60%, 70%, 80% or 90% of the sequence reads) are from different characterized methylome sequences. More preferably, each sequence read or at least 60%, 70%, 80% or 90% of the sequence reads are from different characterized methylome sequences.
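For illustration, the coverage conditions described above can be expressed as a simple per-region check. The sketch below is not taken from the method itself: the function name `region_passes_coverage`, the use of an identifier per source molecule, and the approximation of "reads from different characterized methylome sequences" by the ratio of distinct identifiers to total reads are all assumptions. The default thresholds (10 reads, 50%) are merely two of the example values listed above.

```python
def region_passes_coverage(read_molecule_ids, min_reads=10, min_distinct_fraction=0.5):
    """Return True if a genomic region meets the example coverage criteria.

    read_molecule_ids: one entry per sequence read covering the region,
    naming the (hypothetical) source molecule that the read came from.
    """
    n_reads = len(read_molecule_ids)
    if n_reads < min_reads:
        return False
    # Approximation (assumption): fraction of reads from different molecules
    # is taken as distinct molecule identifiers divided by total reads.
    distinct = len(set(read_molecule_ids))
    return distinct / n_reads >= min_distinct_fraction


all_distinct = region_passes_coverage(list(range(10)))
one_molecule = region_passes_coverage(["m1"] * 10)
too_few_reads = region_passes_coverage(["m1", "m2", "m3"])
```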
In certain embodiments the method comprises determining the average methylation ratio at 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, 150 or more genomic regions, 200 or more genomic regions, 300 or more genomic regions, 400 or more genomic regions, or 500 or more genomic regions. Each genomic region may be selected from the group consisting of:
The genomic regions are preferably each different from each other. In certain preferred embodiments, the method comprises determining the average methylation ratio at 100 or more genomic regions, 125 or more genomic regions, 150 or more genomic regions, 200 or more genomic regions, 300 or more genomic regions, 400 or more genomic regions, or 500 or more genomic regions. Each genomic region may be selected from the group consisting of:
In such embodiments, preferably the method comprises determining the average methylation ratio at 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, 150 or more genomic regions, 200 or more genomic regions, 300 or more genomic regions, 400 or more genomic regions, or 500 or more genomic regions. For example, the method comprises determining the average methylation ratio at 100 or more genomic regions.
In certain embodiments the method comprises determining the average methylation ratio at 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, or 150 genomic regions. Each genomic region may be selected from the group consisting of:
The genomic regions are preferably each different from each other.
In certain embodiments, each genomic region is selected from the group consisting of:
More suitably, each genomic region is selected from the group consisting of: a 100 to 150 bp region comprising or having a genomic location defined in Table 8, and 10 to 99 bp region within a genomic location defined in Table 8 and comprising at least one CpG locus. More suitably, each genomic region is selected from the group consisting of: a 100 to 120 bp region comprising or having a genomic location defined in Table 8, and 50 to 99 bp region within a genomic location defined in Table 8 and comprising at least one CpG locus. More suitably, each genomic region is selected from the group consisting of: a 100 to 120 bp region comprising or having a genomic location defined in Table 8, and 80 to 99 bp region within a genomic location defined in Table 8 and comprising at least one CpG locus. For example, each genomic region is selected from a 100 bp region having a genomic location defined in Table 8.
In such embodiments, preferably the method comprises determining the average methylation ratio at 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, 150 or more genomic regions, 200 or more genomic regions, 300 or more genomic regions, or 400 or more genomic regions. For example, the method comprises determining the average methylation ratio at 100 or more genomic regions.
In certain embodiments, each genomic region is selected from the group consisting of:
More suitably, each genomic region is selected from the group consisting of: a 100 to 150 bp region comprising or having a genomic location defined in Table 9, and 10 to 99 bp region within a genomic location defined in Table 9 and comprising at least one CpG locus. More suitably, each genomic region is selected from the group consisting of: a 100 to 120 bp region comprising or having a genomic location defined in Table 9, and 50 to 99 bp region within a genomic location defined in Table 9 and comprising at least one CpG locus. More suitably, each genomic region is selected from the group consisting of: a 100 to 120 bp region comprising or having a genomic location defined in Table 9, and 80 to 99 bp region within a genomic location defined in Table 9 and comprising at least one CpG locus. For example, each genomic region is selected from a 100 bp region having a genomic location defined in Table 9.
In such embodiments, preferably the method comprises determining the average methylation ratio at 12 or more genomic regions, for example 15 or more genomic regions, 20 or more genomic regions, 25 or more genomic regions, 30 or more genomic regions, 50 or more genomic regions, 75 or more genomic regions, 100 or more genomic regions, 125 or more genomic regions, or 150 genomic regions. For example, the method comprises determining the average methylation ratio at 100 or more genomic regions.
In certain preferred embodiments, determining the average methylation ratio for a genomic region comprises calculating the sum of the methylation ratios of all CpGs within the genomic region and dividing the sum by the number of CpGs within the genomic region. In such embodiments, the average methylation ratio may also be referred to as the mean methylation ratio. For the avoidance of doubt, if a genomic region has only one CpG locus, the average methylation ratio for the genomic region is the same as the methylation ratio for the single CpG locus in the genomic region.
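As a worked illustration of the calculation just described, the following minimal sketch sums the methylation ratios of the CpGs in a genomic region and divides by the number of CpGs. The function name `average_methylation_ratio` is hypothetical, and the sketch assumes that a methylation ratio between 0 and 1 has already been obtained for every CpG locus in the region.

```python
def average_methylation_ratio(cpg_ratios):
    """Mean of the per-CpG methylation ratios within one genomic region."""
    if not cpg_ratios:
        raise ValueError("a genomic region must contain at least one CpG locus")
    return sum(cpg_ratios) / len(cpg_ratios)


# Region with three CpG loci: (0.25 + 0.5 + 0.75) / 3
three_cpg_region = average_methylation_ratio([0.25, 0.5, 0.75])

# Region with a single CpG locus: the average equals that locus's ratio
single_cpg_region = average_methylation_ratio([0.9])
```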
The method of the present invention comprises calculating a methylation score using the average methylation ratio for each genomic region for which the average methylation ratio has been determined.
In certain embodiments, calculating a methylation score using the average methylation ratio for each genomic region comprises:
In one preferred embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises:
In one preferred embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises:
In one preferred embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises
In embodiments wherein calculating a methylation score using the average methylation ratio for each genomic region comprises determining the median (or the mean) of the average methylation ratios for a first group of genomic regions to obtain a first methylation score and/or determining the median (or the mean) of the average methylation ratios for a second group of genomic regions to obtain a second methylation score, the first group of genomic regions are all of the hypermethylated genomic regions (i.e. all hypermethylated genomic regions for which an average methylation ratio has been determined in the method, i.e. selected from those comprising, having or within a genomic location defined in Table 8), and the second group of genomic regions are all of the hypomethylated genomic regions (i.e. all hypomethylated genomic regions for which an average methylation ratio has been determined in the method, i.e. selected from those comprising, having or within a genomic location defined in Table 8).
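The grouping described above can be sketched as follows, purely by way of example. The function name `group_methylation_scores`, the region identifiers and the numeric values are hypothetical; which regions belong to the hypermethylated or hypomethylated group is assumed to be supplied by the caller from the relevant table. The median is used here, and the mean could be substituted.

```python
from statistics import median


def group_methylation_scores(region_ratios, hypermethylated, hypomethylated):
    """Return (first_score, second_score) as medians over the two groups.

    region_ratios maps a region identifier to its average methylation ratio.
    Only regions for which a ratio has been determined are used.
    """
    first = median(region_ratios[r] for r in hypermethylated if r in region_ratios)
    second = median(region_ratios[r] for r in hypomethylated if r in region_ratios)
    return first, second


region_ratios = {"r1": 0.9, "r2": 0.7, "r3": 0.8, "r4": 0.1, "r5": 0.3}
first_score, second_score = group_methylation_scores(
    region_ratios,
    hypermethylated=["r1", "r2", "r3"],
    hypomethylated=["r4", "r5"],
)
```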
In one preferred embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises
In one embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises
In one especially preferred embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises
In one embodiment, calculating a methylation score using the average methylation ratio for each genomic region comprises comparing the average methylation ratio at each genomic region to a reference methylation ratio for each genomic region to determine a methylation ratio score for each genomic region. In such embodiments, preferably the reference methylation ratio is the average methylation ratio for the same genomic region in or covered by:
In one preferred embodiment, the reference methylation ratio is the average methylation ratio for the same genomic region in or covered by
In one preferred embodiment, the reference methylation ratio is the average methylation ratio for the same genomic region in or covered by
The method of the present invention comprises analyzing the methylation ratio scores to determine whether the sample comprises cfDNA derived from a prostate cancer subtype and/or determine the level of cfDNA in the sample that is derived from a prostate cancer subtype. For example, no level (for example no detectable level) of cfDNA derived from a prostate cancer subtype in the cfDNA sample may be determined. Alternatively, a level of cfDNA derived from a prostate cancer subtype in the cfDNA sample may be determined. The minimum percentage level of cfDNA derived from a prostate cancer subtype in the cfDNA sample that may be determined may be 0.01% of cfDNA derived from a prostate cancer subtype in the cfDNA sample. In certain embodiments, the minimum percentage level of cfDNA derived from a prostate cancer subtype in the cfDNA sample that may be determined may be 0.02%, 0.03%, 0.04%, 0.05%, 0.06%, 0.07%, 0.08%, 0.09%, 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, 1%, 2%, 3%, 4%, 5%, 10%, 15%, 20%, 30%, 40% or 50% of cfDNA derived from a prostate cancer subtype in the cfDNA sample. For example, the minimum percentage level of cfDNA derived from a prostate cancer subtype in the cfDNA sample that may be determined may be 0.01%, 0.05%, 0.1% or 0.5%. Preferably, the minimum percentage level of cfDNA derived from a prostate cancer subtype in the cfDNA sample is 0.01%.
The method comprises analyzing the methylation score to determine the level of cfDNA derived from a prostate cancer subtype in the cfDNA sample.
If a level of cfDNA derived from a prostate cancer subtype in the cfDNA sample is determined, the subject can be classed as having the subtype. As such, analyzing the methylation score to determine whether there is a level of cfDNA derived from a prostate cancer subtype in the cfDNA sample may also be referred to as analyzing the methylation score to determine whether a subject has a prostate cancer subtype.
Preferably, analyzing the methylation score to determine the level of cfDNA derived from a prostate cancer subtype in the cfDNA sample comprises comparing the methylation score to one or more reference methylation scores. For example, the method may comprise comparing the methylation score to one reference methylation score. In certain embodiments, the method comprises comparing the methylation score to two or more reference methylation scores, for example 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 20, 30, 50, 100, 200, 300, 400, 500 or 1000 reference methylation scores. In certain embodiments, the method comprises comparing the methylation score to 5 or more reference methylation scores, for example 10 or more, 15 or more, 20 or more, 30 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, or 1000 or more reference methylation scores.
In embodiments wherein the method comprises comparing the methylation score to two or more reference methylation scores, the reference methylation scores may come from different types of reference samples and/or reference methylomes (for example a cfDNA sample from a healthy subject and a cancer cell line sample) and/or the same type of reference samples or reference methylomes but from different sources (for example, two or more cfDNA samples each from a different healthy subject).
A reference methylation score is a methylation score calculated for the same genomic regions (for example, calculated using the average methylation ratio for the same genomic regions) in a reference sample or reference methylome. A reference sample or reference methylome may be selected from the group consisting of:
A reference sample or reference methylome may be one that can be used to represent a sample having no cfDNA derived from the prostate cancer subtype (for example an undetectable level of cfDNA derived from the prostate cancer subtype in the cfDNA sample), for example a reference sample or reference methylome selected from one or more of the following
A reference sample or reference methylome may be one that can be used to represent a sample having 100% cfDNA derived from the prostate cancer subtype, for example a reference sample or reference methylome selected from one or more of the following
A reference sample or reference methylome may be one that can be used to represent a sample having 10 to 90% cfDNA derived from a prostate cancer subtype, for example one or more cfDNA samples from different subjects having prostate cancer known to have the prostate cancer subtype, wherein the level of cfDNA derived from the prostate cancer subtype in each cfDNA sample from the different subjects is/are known. A level of cfDNA derived from the prostate cancer subtype in each cfDNA sample can be determined by looking at genomic markers.
Preferably, analyzing the methylation score to determine the level of cfDNA derived from the prostate cancer subtype in the cfDNA sample comprises comparing the methylation score to one or more reference methylation scores that can be used to represent a sample having 100% cfDNA derived from the prostate cancer subtype, and can be used to represent a sample having 0% cfDNA derived from the prostate cancer subtype, and optionally can be used to represent a sample having 10-90% cfDNA derived from the prostate cancer subtype. For example, analyzing the methylation score to determine the level of cfDNA derived from a prostate cancer subtype in the cfDNA sample comprises:
comparing the methylation score to one or more reference methylation scores for a reference sample or reference methylome selected from the group consisting of:
Preferably, the reference methylation score for a reference sample or reference methylome that a methylation ratio score is compared to is calculated in the same way as the methylation score for the sample obtained from the subject (i.e. the sample that the method of the invention is being carried out in respect of). For example, if the methylation ratio for the selected genomic regions of the sample obtained from the subject is calculated by determining the median (or the mean) of the average methylation ratios for a first group of genomic regions to obtain a first methylation score and/or determining the median (or the mean) of the average methylation ratios for a second group of genomic regions to obtain a second methylation score, the reference methylation score for a reference sample or reference methylome is calculated by determining the median (or the mean) of the average methylation ratios for the same first group of genomic regions to obtain a first reference methylation score and/or determining the median (or the mean) of the average methylation ratios for the same second group of genomic regions to obtain a second reference methylation score.
Or, for example, if the methylation ratio for the selected genomic regions of the sample obtained from the subject is calculated by determining the median (or the mean) of the average methylation ratios for all genomic regions, the reference methylation score for a reference sample or reference methylome is calculated by determining the median (or the mean) of the average methylation ratios for the same genomic regions.
In embodiments wherein the method comprises comparing the average methylation ratio at each genomic region to a reference methylation ratio for each genomic region to determine a methylation ratio score for each genomic region, analyzing the methylation ratio scores to determine the level of cfDNA derived from the prostate cancer subtype in the cfDNA sample may comprise determining how many methylation ratio scores are indicative of the prostate cancer subtype.
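One possible way of counting the methylation ratio scores that are indicative of the prostate cancer subtype is sketched below. The criterion for a score being "indicative" is not specified in this passage, so the sketch makes explicit assumptions: the methylation ratio score for a region is taken as the difference between the sample ratio and the reference ratio, and a region is counted when that difference is at least `min_delta` in the direction expected for the region ("hyper" or "hypo"). The function name, the direction labels and the 0.2 threshold are all hypothetical.

```python
def count_indicative_regions(sample_ratios, reference_ratios, expected_direction, min_delta=0.2):
    """Count regions whose ratio differs from the reference in the expected direction.

    All three mappings are keyed by the same (hypothetical) region identifiers.
    min_delta is an assumed threshold, not a value taken from the method.
    """
    count = 0
    for region, ratio in sample_ratios.items():
        delta = ratio - reference_ratios[region]  # assumed per-region score
        if expected_direction[region] == "hyper" and delta >= min_delta:
            count += 1
        elif expected_direction[region] == "hypo" and -delta >= min_delta:
            count += 1
    return count


sample_ratios = {"a": 0.9, "b": 0.1, "c": 0.5}
reference_ratios = {"a": 0.2, "b": 0.8, "c": 0.45}
expected_direction = {"a": "hyper", "b": "hypo", "c": "hyper"}
n_indicative = count_indicative_regions(sample_ratios, reference_ratios, expected_direction)
```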
In certain embodiments, analyzing the methylation score to determine the level of cfDNA derived from the prostate cancer subtype in the cfDNA sample comprises using a mathematical model, such as a linear regression model or another linear model (for example, a general linear model, a heteroscedastic model, a generalised linear model, or a hierarchical linear model).
In certain embodiments, analyzing the methylation score to determine the level of cfDNA derived from the prostate cancer subtype in the cfDNA sample comprises using a mathematical model that compares the methylation score for the sample to reference methylation scores that can be used to represent a sample having 100% cfDNA derived from the prostate cancer subtype in the cfDNA, and can be used to represent a sample having 0% cfDNA derived from the prostate cancer subtype in the cfDNA, and optionally can be used to represent a sample having 10-90% cfDNA derived from the prostate cancer subtype in the cfDNA. For example, the method comprises using a mathematical model that compares the methylation score for the sample to reference methylation scores for a cfDNA sample from a healthy subject, for example a healthy age-matched subject (0% cfDNA derived from the prostate cancer subtype in the cfDNA) and/or a characterized methylome sequence of a white blood cell (0% cfDNA derived from the prostate cancer subtype in the cfDNA) and/or a sample of white blood cells from a subject, for example the subject or a healthy subject, (0% cfDNA derived from the prostate cancer subtype in the cfDNA sample) and/or a characterized methylome sequence of a prostate cancer cell line (100% cfDNA derived from the prostate cancer subtype in the cfDNA sample) and/or a prostate cancer biopsy sample from a prostate cancer patient (100% cfDNA derived from the prostate cancer subtype in the cfDNA sample) and/or one or more cfDNA samples (for example at least 2, 3, 4, 5, 10, 20, 40, 50, 100, 200, 300 or 500 samples) each from a different subject having prostate cancer, wherein the level of cfDNA derived from the prostate cancer subtype in each cfDNA sample from the different subjects is known, and preferably wherein each cfDNA sample has a different level of cfDNA derived from the prostate cancer subtype (10-90% cfDNA derived from the prostate cancer subtype in the cfDNA sample).
In one embodiment, the method comprises using a mathematical model that compares the methylation score for the sample to reference methylation scores for a cfDNA sample from a healthy subject, for example a healthy age-matched subject (0% cfDNA derived from the prostate cancer subtype in the cfDNA sample) and/or a characterized methylome sequence of a prostate cancer cell line (100% cfDNA derived from the prostate cancer subtype in the cfDNA sample) and/or a prostate cancer biopsy sample from a prostate cancer patient (100% cfDNA derived from the prostate cancer subtype in the cfDNA sample) and/or one or more cfDNA samples (for example at least 2, 3, 4, 5, 10, 20, 40, 50, 100, 200, 300 or 500 samples) each from a different subject having prostate cancer, wherein the level of cfDNA derived from the prostate cancer subtype in each cfDNA sample from the different subjects is known, and preferably wherein each cfDNA sample has a different level of cfDNA derived from the prostate cancer subtype in the cfDNA sample (10-90% cfDNA derived from the prostate cancer subtype in the cfDNA sample).
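A linear model of the kind mentioned above could, for example, be fitted to reference methylation scores with known levels of subtype-derived cfDNA and then applied to the sample score. The sketch below uses ordinary least squares with a single predictor, which is only one of the possible linear models. The function names, the reference values and the clamping of the estimate to the range 0 to 1 are assumptions made for illustration.

```python
def fit_linear_model(reference_scores, known_fractions):
    """Ordinary least squares fit of fraction = slope * score + intercept."""
    n = len(reference_scores)
    mean_x = sum(reference_scores) / n
    mean_y = sum(known_fractions) / n
    sxx = sum((x - mean_x) ** 2 for x in reference_scores)
    if sxx == 0:
        raise ValueError("reference scores must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(reference_scores, known_fractions))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_fraction(score, slope, intercept):
    """Estimated fraction of subtype-derived cfDNA, clamped to [0, 1] (assumption)."""
    return min(1.0, max(0.0, slope * score + intercept))


# Hypothetical references: healthy cfDNA (0%), a patient sample with a known
# level (50%), and a prostate cancer cell line methylome (100%).
reference_scores = [0.10, 0.50, 0.90]
known_fractions = [0.0, 0.5, 1.0]
slope, intercept = fit_linear_model(reference_scores, known_fractions)
estimated = estimate_fraction(0.30, slope, intercept)
```

With only a 0% and a 100% reference, the same fit reduces to linear interpolation between the two reference scores.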
The method may further comprise measuring the level of prostate-specific antigen (PSA) in a sample of blood from the subject. It may also comprise determining if the subject has an abnormal level of PSA in the blood (for example a level of PSA in the blood of at least 4.0 ng/mL). An abnormal level of PSA in the blood may be, for example, a level of PSA in the blood of at least 4.0 ng/mL. A normal level of PSA in the blood may, for example, be a level of PSA in the blood of 4.0 ng/mL or less.
In one preferred embodiment, the method is for screening, monitoring, and/or prognostication of prostate cancer, wherein prostate cancer with a poor prognosis is predicted when a level of cfDNA derived from the prostate cancer subtype in the cfDNA sample is determined, for example a detectable level of cfDNA derived from the prostate cancer subtype in the sample, for example a percentage level of cfDNA derived from the prostate cancer subtype in the sample of at least 0.01%. For example, a prostate cancer with a poor prognosis is predicted when at least 0.01% cfDNA derived from the prostate cancer subtype in the sample is determined, or for example, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.1%, at least 0.5% or at least 1% cfDNA derived from the prostate cancer subtype in the sample is determined.
In some instances, a “poor” prognosis refers to a low likelihood that a subject will likely respond favorably to a drug or set of drugs, is in complete or partial remission, or there is a decrease and/or a stop in the progression of prostate cancer. In some instances, a “poor” prognosis refers to a survival of a subject that is expected to be from less than 5 years to less than 1 month (for example less than 3 years to less than 1 month, or less than 3 years to less than 6 months). In some instances, a “poor” prognosis refers to a survival of a subject in which the survival of the subject upon treatment is expected to be from less than 5 years to less than 1 month.
In one preferred embodiment, the method is for detection of prostate cancer, wherein the prostate cancer subtype is detected when a level of cfDNA derived from a prostate cancer subtype in the cfDNA sample is determined, for example a detectable level of cfDNA derived from the prostate cancer subtype in the cfDNA sample, for example a percentage level of cfDNA derived from the prostate cancer subtype in the cfDNA sample of at least 0.01%, or for example, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.1%, at least 0.5% or at least 1% cfDNA derived from the prostate cancer subtype in the cfDNA sample.
In one preferred embodiment, the method is for screening, monitoring, and/or prognostication of prostate cancer, wherein prostate cancer with a poor prognosis is predicted when a level of cfDNA derived from the prostate cancer subtype in the cfDNA sample is determined, for example a detectable level of prostate cancer, for example a percentage level of cfDNA derived from the prostate cancer subtype in the cfDNA sample of at least 0.01%, for example at least 0.01% cfDNA derived from the prostate cancer subtype in the cfDNA sample, or for example, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.1%, at least 0.5% or at least 1% cfDNA derived from the prostate cancer subtype in the cfDNA sample.
In one preferred embodiment, the method is for detecting, screening and/or prognostication of metastatic prostate cancer, wherein metastatic prostate cancer is predicted when a level of cfDNA derived from the prostate cancer subtype in the cfDNA sample is determined, for example a detectable level of cfDNA derived from the prostate cancer subtype in the cfDNA sample, for example a percentage level of cfDNA derived from the prostate cancer subtype in the cfDNA sample of at least 0.01%, or for example, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.1%, at least 0.5% or at least 1% cfDNA derived from the prostate cancer subtype in the cfDNA sample.
In one preferred embodiment, the method is for selecting treatment of prostate cancer or ascertaining whether treatment is working in prostate cancer, wherein a new treatment is selected when a level of cfDNA derived from the prostate cancer subtype in the cfDNA sample is determined, for example a detectable level of cfDNA derived from the prostate cancer subtype in the cfDNA sample, for example a percentage level of cfDNA derived from the prostate cancer subtype in the cfDNA sample of at least 0.01%, or for example, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.1%, at least 0.5% or at least 1% cfDNA derived from the prostate cancer subtype in the cfDNA sample.
In one preferred embodiment, the method is for ascertaining whether treatment of prostate cancer is working, wherein it is determined that the treatment is not working when a level of prostate cancer is determined, for example a detectable level of cfDNA derived from the prostate cancer subtype in the cfDNA sample, for example a percentage level of cfDNA derived from the prostate cancer subtype in the cfDNA sample of at least 0.01%, or for example, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.1%, at least 0.5% or at least 1% cfDNA derived from the prostate cancer subtype in the cfDNA sample.
The method may further comprise repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, and comparing the level of cfDNA derived from the prostate cancer subtype in each sample. Preferably, the second sample is of the same type as the first sample, for example if the first sample is a plasma sample then the second sample is a plasma sample. The invention may further comprise repeating the method on a third, and optionally a 4th, 5th, 6th, 7th, 8th, 9th and/or 10th, sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the third, and optionally the 4th, 5th, 6th, 7th, 8th, 9th and/or 10th, sample comprises cfDNA, and comparing the level of cfDNA derived from the prostate cancer subtype in each sample. Preferably, all samples are of the same type as the first sample, for example if the first sample is a plasma sample then all other samples are plasma samples.
In one preferred embodiment, the method is for monitoring of prostate cancer, wherein the method comprises repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, and comparing the level of cfDNA derived from the prostate cancer subtype in each cfDNA sample.
In one preferred embodiment, the method is for selecting treatment of prostate cancer, comprising repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, and comparing the level of cfDNA derived from the prostate cancer subtype in each cfDNA sample, wherein a new treatment is selected if the level of prostate cancer is increased in the second sample, for example an increase of at least 0.01%.
In one preferred embodiment, the method is for ascertaining whether treatment of prostate cancer is working, comprising repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, wherein it is determined that the treatment is not working if the level of cfDNA derived from the prostate cancer subtype is increased in the second sample, for example an increase of at least 0.01%.
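The comparison of levels between a first sample and a second sample taken after treatment can be illustrated with the short sketch below. It assumes that levels are expressed as percentages of cfDNA derived from the prostate cancer subtype and that "an increase of at least 0.01%" means an absolute change of 0.01 percentage points; that reading, the function name and the returned labels are assumptions of this example.

```python
def compare_serial_levels(first_level_pct, second_level_pct, min_change_pct=0.01):
    """Classify the change in subtype-derived cfDNA level between two samples.

    Levels are percentages (e.g. 0.05 means 0.05%). The threshold is read as
    an absolute change in percentage points, which is an assumption.
    Values lying exactly on the threshold are subject to floating-point rounding.
    """
    change = second_level_pct - first_level_pct
    if change >= min_change_pct:
        return "increased"
    if change <= -min_change_pct:
        return "decreased"
    return "unchanged"


after_progression = compare_serial_levels(0.05, 0.50)  # level rose after treatment
after_response = compare_serial_levels(0.50, 0.05)     # level fell after treatment
stable = compare_serial_levels(0.050, 0.052)           # change below the threshold
```

Under the embodiments described here, an "increased" result would correspond to a determination that the treatment is not working, or that a new treatment is to be selected.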
In one preferred embodiment, the method is for prognostication of prostate cancer, comprising repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, wherein it is determined that the prognosis is poor if the level of cfDNA derived from the prostate cancer subtype is increased in the second sample, for example an increase of at least 0.01%. In one preferred embodiment, the method is for prognostication of prostate cancer, comprising repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, wherein it is determined that the prognosis is good if the level of cfDNA derived from the prostate cancer subtype is decreased in the second sample, for example a decrease of at least 0.01%. In some instances, a “good” prognosis refers to the likelihood that a subject will likely respond favorably to a drug or set of drugs, leading to a complete or partial remission, or a decrease and/or a stop in the progression of prostate cancer. In some instances, a “good” prognosis refers to the survival of a subject of from at least 1 month to at least 90 years. In some instances, a “good” prognosis refers to the survival of a subject in which the survival of the subject upon treatment is from at least 1 month to at least 90 years.
In certain preferred embodiments, the method of the present invention comprises the additional step of obtaining a biological sample from a subject.
The methods can be used with the kits, methods of treatment, therapeutic agents for the treatment of prostate cancer, methods of determining one or more suitable therapeutic agents for the treatment of prostate cancer, methods for determining a treatment regimen, computerized (or computer implemented) methods, computer-assisted methods, computer products and/or computer implemented software described herein. Embodiments and preferred embodiments for the methods are equally applicable to the kits, methods of treatment, therapeutic agents for the treatment of prostate cancer, methods of determining one or more suitable therapeutic agents for the treatment of prostate cancer, methods for determining a treatment regimen, computerized (or computer implemented) methods, computer-assisted methods, computer products and/or computer implemented software described herein.
In a further aspect, the invention provides an in-vitro diagnostic kit for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA. Preferably, the kits of the invention comprise one or more reagents for detecting the presence or absence of at least 10 DNA molecules having a DNA sequence corresponding to all or part of a genomic location comprising at least one CpG locus defined in Tables 1 to 4.
In certain embodiments, the kit comprises DNA sampling reagents and, preferably, methylome analysis reagents, such as bisulfite reagents. In certain embodiments, the kit comprises DNA amplification agents, for example primers for amplification of specific DNA molecules, for example for amplification of at least 10 DNA molecules having a DNA sequence corresponding to all or part of a genomic location comprising at least one CpG locus defined in Tables 1 to 4.
In one preferred embodiment, the kit comprises instructions for use. In certain embodiments, the kit comprises instructions for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample using the kit. For example, the kit comprises instructions for use which define how to determine the level of prostate cancer fraction in a sample comprising cfDNA from a subject, for example by following a method of the invention defined herein.
In one preferred embodiment, the kit comprises a computer product or a computer-executable software for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample using the kit. In certain embodiments, the computer product comprises a non-transitory computer readable medium storing a plurality of instructions that when executed control a computer system to perform a method of the invention. In certain embodiments, the computer-executable software comprises software for performing a method of the invention.
In certain embodiments, the kit comprises one or more containers and may also include sampling equipment, for example, bottles, bags (such as intravenous fluid bags), vials, syringes, and test tubes. Other components may include needles, diluents, wash reagents and buffers. Usefully, the kit may include at least one container comprising a pharmaceutically-acceptable buffer, such as phosphate-buffered saline, Ringer's solution and dextrose solution.
If a reagent is for detecting the presence or absence of a DNA molecule having a DNA sequence corresponding to all of a genomic location defined in Tables 1 to 4, the reagent is able to detect the presence of a DNA sequence having or comprising a genomic location defined in Tables 1 to 4. For example, the reagent is able to detect the presence of a DNA sequence having a genomic location defined in Tables 1 to 4 or comprising a genomic location defined in Tables 1 to 4 and having a sequence length of 101 to 200 bp, for example having a sequence length of 101 to 180, a sequence length of 101 to 150, a sequence length of 101 to 140, a sequence length of 101 to 130, a sequence length of 101 to 120, or a sequence length of 101 to 110 bp.
If a reagent is for detecting the presence or absence of a DNA molecule having a DNA sequence corresponding to a part of a genomic location defined in Tables 1 to 4, the reagent is able to detect the presence of a DNA sequence comprising at least a 10 bp continuous sequence within a genomic location defined in Tables 1 to 4 and comprising at least one CpG locus. Preferably, if a reagent is for detecting the presence or absence of a DNA molecule having a DNA sequence corresponding to a part of a genomic location defined in Tables 1 to 4, the reagent is able to detect the presence of a DNA sequence comprising at least a 15 bp continuous sequence within a genomic location defined in Tables 1 to 4 and comprising at least one CpG locus, for example at least a 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 99 bp continuous sequence within a genomic location defined in Tables 1 to 4 and comprising at least one CpG locus. In certain preferred embodiments, if a reagent is for detecting the presence or absence of a DNA molecule having a DNA sequence corresponding to a part of a genomic location defined in Tables 1 to 4, the reagent is able to detect the presence of a DNA sequence comprising (or consisting of) a 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 99 bp continuous sequence within a genomic location defined in Tables 1 to 4 and comprising at least one CpG locus.
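Purely as an illustrative sketch, the criterion for a sequence corresponding to a part of a genomic location may be expressed as a simple check. The function name `is_qualifying_part` and the example sequence are hypothetical placeholders and are not taken from Tables 1 to 4. The sketch further assumes a forward-strand, unconverted sequence, treats any CG dinucleotide as a CpG locus, and tests the continuous sequence itself rather than a longer molecule that comprises it.

```python
def is_qualifying_part(location_seq, candidate_seq, min_length=10):
    """Return True if candidate_seq is a continuous stretch of location_seq
    of at least min_length bp that contains at least one CG dinucleotide.

    Assumptions: plain A/C/G/T strings on the same strand, no bisulfite
    conversion, and "CG" standing in for a CpG locus.
    """
    location = location_seq.upper()
    candidate = candidate_seq.upper()
    return (
        len(candidate) >= min_length
        and candidate in location   # continuous sequence within the location
        and "CG" in candidate       # comprises at least one CpG
    )


# Hypothetical 20 bp placeholder, not a sequence from the Tables.
example_location = "ATTGCACGTTAGGCTAACGT"
print(is_qualifying_part(example_location, "GCACGTTAGG"))
print(is_qualifying_part(example_location, "TTAGGCTAAC"))
```

Raising `min_length` to 15, 20 or another of the recited lengths corresponds to the preferred embodiments described above.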
In certain embodiments, the kit comprises one or more reagents for detecting the presence or absence of at least 15 DNA molecules. For example, the kit comprises one or more reagents for detecting the presence or absence of 15, 20, 30, 40, 50, 75, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900 or 1000 DNA molecules.
In certain embodiments, the kit comprises one or more reagents for detecting the presence or absence of at least 50 DNA molecules (for example, the kit comprises one or more reagents for detecting the presence or absence of 50, 75, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900 or 1000 DNA molecules), at least 75 DNA molecules (for example, the kit comprises one or more reagents for detecting the presence or absence of 75, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900 or 1000 DNA molecules), at least 100 DNA molecules (for example, the kit comprises one or more reagents for detecting the presence or absence of 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900 or 1000 DNA molecules), at least 150 DNA molecules (for example, the kit comprises one or more reagents for detecting the presence or absence of 150, 200, 250, 300, 400, 500, 600, 700, 800, 900 or 1000 DNA molecules), at least 250 DNA molecules (for example, the kit comprises one or more reagents for detecting the presence or absence of 250, 300, 400, 500, 600, 700, 800, 900 or 1000 DNA molecules), at least 500 DNA molecules (for example, the kit comprises one or more reagents for detecting the presence or absence of 500, 600, 700, 800, 900 or 1000 DNA molecules), at least 700 DNA molecules or at least 900 DNA molecules (for example, the kit comprises one or more reagents for detecting the presence or absence of 900 or 1000 DNA molecules).
In certain preferred embodiments, the genomic location is a location defined in Tables 1 and 2. In certain embodiments, the genomic location is a location defined in Tables 3 and 4. In certain embodiments, the genomic location is a location defined in Tables 1 and 3. In certain embodiments, the genomic location is a location defined in Tables 2 and 4.
In certain preferred embodiments, the genomic location is a location defined in Table 5. In certain preferred embodiments, the genomic location is a location defined in Table 6. In certain preferred embodiments, the genomic location is a location defined in Table 7.
In certain embodiments, the kit comprises oligonucleotides for specifically hybridizing to at least a section of the at least 10 DNA molecules having a DNA sequence corresponding to all or part of a genomic location comprising at least one CpG locus defined in Tables 1 to 4. An oligonucleotide for specifically hybridizing to at least a section of a DNA molecule may be for hybridizing to at least a 10 bp section, at least a 12 bp section, at least a 14 bp section, at least a 15 bp section, at least an 18 bp section, at least a 20 bp section, at least a 25 bp section, at least a 30 bp section or at least a 40 bp section of a DNA molecule. In certain embodiments, an oligonucleotide for specifically hybridizing to at least a section of a DNA molecule may be for hybridizing to a 10 bp section, 12 bp section, 14 bp section, 15 bp section, 18 bp section, 20 bp section, 25 bp section or 30 bp section.
An oligonucleotide for specifically hybridizing to at least a section of a DNA molecule may have a sequence of at least 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90 or 95 bp. An oligonucleotide for specifically hybridizing to at least a section of a DNA molecule may comprise not more than 100, 90, 80, or 70 bp. An oligonucleotide for specifically hybridizing to at least a section of a DNA molecule may have a sequence of 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90 or 95 bp. Preferably, an oligonucleotide for specifically hybridizing to at least a section of a DNA molecule may have a sequence of 15, 18, 19, 20, 21, 22, 23, 24, 25, 30, 40, 50, 60 or 70 bp. In certain embodiments, an oligonucleotide for specifically hybridizing to at least a section of a DNA molecule may have a sequence of 20 to 90 bp, for example 30 to 80 bp, 50 to 80 bp. In certain embodiments, an oligonucleotide for specifically hybridizing to at least a section of a DNA molecule may have a sequence of 55 to 95 bp. In certain embodiments, an oligonucleotide for specifically hybridizing to at least a section of a DNA molecule may have a sequence of 60 to 80 bp, for example a sequence of 70 bp.
In certain embodiments, the kit comprises oligonucleotides for specifically hybridizing to at least a section of at least 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 200, 250, 300, 400, 500, 600, 700, 800, or 900 DNA molecules corresponding to a genomic region having or comprising a genomic location defined in Tables 1 to 4. In certain embodiments, the kit comprises oligonucleotides for specifically hybridizing to at least a section of 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 200, 250, 300, 400, 500, 600, 700, 800, 900, or 1000 DNA molecules corresponding to a genomic region having a genomic location defined in Tables 1 to 4.
In the kits of the invention comprising oligonucleotides, preferably at least one of the oligonucleotides for specifically hybridizing to at least a section of the DNA molecules is an amplification primer. Even more preferably, each oligonucleotide for specifically hybridizing to at least a section of the DNA molecules is an amplification primer.
As the methods of the present invention are for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer, a method of the invention may be used in a method of treatment of a subject having prostate cancer and/or used with a therapeutic agent for use in the treatment of a subject having prostate cancer.
A therapeutic agent for the treatment of prostate cancer for use in the methods of treatment and uses of the present invention, as well as in the methods, kits, and other aspects of the present invention, is selected from the group consisting of a hormonal agent, a targeted agent, a biologic agent, an immunotherapy agent, a chemotherapy agent and a radionuclide agent.
A hormonal agent for the treatment of prostate cancer is selected from the group consisting of LHRH agonists (for example leuprolide, goserelin, triptorelin, or histrelin), LHRH antagonists (for example degarelix), androgen blockers (for example abiraterone or ketoconazole), anti-androgens (for example flutamide, bicalutamide, nilutamide, enzalutamide, apalutamide or darolutamide), estrogens and steroids (for example prednisone or dexamethasone).
A targeted agent for the treatment of prostate cancer is selected from the group consisting of poly(ADP-ribose) polymerase (PARP) inhibitors (for example olaparib, rucaparib, niraparib or talazoparib), epidermal growth factor receptor (EGFR) inhibitors (for example gefitinib, erlotinib, afatinib, brigatinib, icotinib, cetuximab, osimertinib, adavosertib, or lapatinib), and tyrosine kinase inhibitors (for example imatinib, gefitinib, erlotinib, or sunitinib).
A biologic agent for the treatment of prostate cancer is selected from the group consisting of monoclonal antibodies (for example pertuzumab, trastuzumab or solitomab), hormones (for example a hormonal agent selected from LHRH agonists (for example leuprolide, goserelin, triptorelin, or histrelin), LHRH antagonists (for example degarelix), androgen blockers (for example abiraterone or ketoconazole), anti-androgens (for example flutamide, bicalutamide, nilutamide, enzalutamide, apalutamide or darolutamide), and estrogens), interferons (for example interferons-α, -β, -γ), and interleukin-based products (for example interleukin-2).
An immunotherapy agent for the treatment of prostate cancer is selected from the group consisting of cancer vaccines (for example sipuleucel-T), T-cell therapies, monoclonal antibody therapies, immune checkpoint therapies (for example a PD-1 inhibitor (e.g. pembrolizumab, nivolumab, cemiplimab, or spartalizumab), a PD-L1 inhibitor (e.g. atezolizumab, avelumab or durvalumab), or a CTLA-4 inhibitor (e.g. ipilimumab)), and non-specific immunotherapies (for example interferons or interleukins).
A chemotherapy agent for the treatment of prostate cancer is selected from the group consisting of docetaxel, cabazitaxel, and c-Met inhibitors (for example cabozantinib).
A radionuclide agent for the treatment of prostate cancer is selected from Radium223 and PSMA-labelled radionuclide (for example 225Ac-Labeled PSMA-617 or 177Lu-Labeled PSMA-617).
A therapeutic agent for the treatment of prostate cancer may be administered in amounts indicated in the Physicians' Desk Reference (PDR) or as otherwise determined by one of ordinary skill in the art.
In certain preferred embodiments, a therapeutic agent for the treatment of prostate cancer for use in the methods of treatment and uses of the present invention, as well as in the methods, kits, and other aspects of the present invention, is a hormonal agent and optionally a chemotherapy agent and/or optionally a further hormonal agent and/or optionally a targeted agent and/or optionally a radionuclide agent and/or an immunotherapy agent. For example, a hormonal agent selected from a LHRH agonist (for example leuprolide, goserelin, triptorelin, or histrelin) and a LHRH antagonist (for example degarelix), and optionally docetaxel and/or optionally a PARP inhibitor (for example olaparib, rucaparib, niraparib or talazoparib). Or, for example, a LHRH agonist (for example leuprolide, goserelin, triptorelin, or histrelin) or a LHRH antagonist (for example degarelix), and optionally a chemotherapy agent (for example docetaxel, cabazitaxel, carboplatin) and/or optionally a further hormonal treatment (for example enzalutamide, abiraterone, darolutamide) and/or optionally a radionuclide agent (Radium223 or PSMA-labelled radionuclide) and/or optionally a PARP inhibitor (for example olaparib, rucaparib, niraparib or talazoparib) and/or an immunotherapy agent (for example nivolumab, pembrolizumab, ipilimumab, durvalumab).
In embodiments wherein the prostate cancer is castration sensitive prostate cancer, preferably the therapeutic agent is a LHRH agonist (for example leuprolide, goserelin, triptorelin, or histrelin) or a LHRH antagonist (for example degarelix) and optionally a chemotherapy agent (for example docetaxel, cabazitaxel, carboplatin) and/or optionally a further hormonal treatment (for example enzalutamide, abiraterone, darolutamide) and/or optionally a radionuclide agent (Radium223 or a PSMA-labelled radionuclide (for example 225Ac-Labeled PSMA-617 or 177Lu-Labeled PSMA-617)) and/or optionally a PARP inhibitor (for example olaparib, rucaparib, niraparib or talazoparib) and/or an immunotherapy agent (for example nivolumab, pembrolizumab, ipilimumab, durvalumab).
In embodiments wherein the prostate cancer is castration resistant prostate cancer, preferably the therapeutic agent for the treatment of prostate cancer is a LHRH agonist (for example leuprolide, goserelin, triptorelin, or histrelin) or a LHRH antagonist (for example degarelix), and optionally a chemotherapy agent (for example docetaxel, cabazitaxel, carboplatin) and/or optionally a further hormonal treatment (for example enzalutamide, abiraterone, darolutamide) and/or optionally a radionuclide agent (Radium223 or a PSMA-labelled radionuclide (for example 225Ac-Labeled PSMA-617 or 177Lu-Labeled PSMA-617)) and/or optionally a PARP inhibitor (for example olaparib, rucaparib, niraparib or talazoparib) and/or an immunotherapy agent (for example nivolumab, pembrolizumab, ipilimumab, durvalumab).
A non-therapeutic treatment for the treatment of prostate cancer is selected from surgery and radiotherapy. A surgical treatment of prostate cancer is selected from the group consisting of radical prostatectomy, a trans-urethral resection of the prostate, and an orchidectomy. A radiotherapy treatment of prostate cancer is selected from external beam localized radiotherapy of the prostate and external beam radiotherapy of metastatic sites.
In certain embodiments, methods of treatment of the present invention comprise treating the subject using a therapeutic agent for the treatment of prostate cancer, surgery, and/or radiotherapy. In certain embodiments, methods of treatment of the present invention comprise administering to the subject an effective amount of a therapeutic agent for the treatment of prostate cancer, and/or radiotherapy, and/or performing surgery. In certain embodiments, methods of treatment of the present invention comprise starting, ceasing or altering treatment with a therapeutic agent, or initiating a non-therapeutic treatment (e.g., surgery or radiation).
The present invention provides a method for treating prostate cancer in a subject comprising a method defined herein (for example, a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein) and further comprising treating the subject using a therapeutic agent for the treatment of prostate cancer, surgery, and/or radiotherapy.
The present invention also provides a method for treating prostate cancer in a subject comprising a method defined herein (for example, a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein) and further comprising administering to the subject an effective amount of a therapeutic agent for the treatment of prostate cancer, and/or radiotherapy, and/or performing surgery.
A method of treatment of the present invention is performed before and/or after a method of the invention defined herein (for example, a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein).
Preferably, a method for treating prostate cancer of the present invention comprises administering to the subject an effective amount of a therapeutic agent for the treatment of prostate cancer, and/or radiotherapy, and/or performing surgery after a method of the invention defined herein, for example after the subject has been determined to have a level of prostate cancer fraction, or determined to have cfDNA derived from a prostate cancer subtype, based on a method as described herein. In another preferred embodiment, a method for treating prostate cancer of the present invention comprises administering to the subject an effective amount of a therapeutic agent for the treatment of prostate cancer after a method of the invention defined herein, for example after the subject has been determined to have a level of prostate cancer fraction, or determined to have cfDNA derived from a prostate cancer subtype, based on a method as described herein.
In one embodiment, a method for treating prostate cancer of the present invention comprises administering a therapeutic agent for the treatment of prostate cancer to the subject for at least 1 week, 2 weeks, 3 weeks, 4 weeks, 1 month, 2 months, 3 months, 6 months, 9 months, 12 months, 24 months or 36 months. A therapeutic agent for the treatment of prostate cancer may be administered, for example, daily, every second day, twice per week, weekly or monthly.
In one embodiment, a method for treating prostate cancer of the present invention comprises treating a subject using a therapeutic agent for the treatment of prostate cancer for at least 1 week, 2 weeks, 3 weeks, 4 weeks, 1 month, 2 months, 3 months, 6 months, 9 months, 12 months, 24 months or 36 months.
A therapeutic agent for the treatment of prostate cancer may be administered in amounts and at frequencies indicated in the Physicians' Desk Reference (PDR) or as otherwise determined by one of ordinary skill in the art.
In one preferred embodiment, a method for treating prostate cancer of the present invention comprises performing the method of the invention (for example, a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein) before treating the subject, and subsequently repeating the method of the invention, for example at least 1 week, at least 2 weeks, at least 3 weeks, at least 4 weeks, at least 1 month, at least 2 months, at least 3 months, at least 6 months, at least 9 months, at least 12 months, at least 24 months or at least 36 months after starting or finishing the treatment, for example after administering to the subject an effective amount of a therapeutic agent for the treatment of prostate cancer, and/or radiotherapy, and/or performing surgery.
In another preferred embodiment, a method for treating prostate cancer of the present invention comprises performing the method (for example, a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein) before treating the subject, and subsequently repeating the method, for example at least 1 week, at least 2 weeks, at least 3 weeks, at least 4 weeks, at least 1 month, at least 2 months, at least 3 months, at least 6 months, at least 9 months, at least 12 months, at least 24 months or at least 36 months after performing the first method of the invention.
In embodiments comprising repeating the method (for example, repeating the method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein), the method may be repeated once, or it may be repeated multiple times, for example 2, 3, 4, 5, 6 or more times.
In embodiments comprising repeating the method (for example, repeating the method for detecting, screening, monitoring, staging, classification, selecting treatment, ascertaining whether treatment is working, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein), after the subsequent method(s) is performed, the method may further comprise continuing to treat the subject with the therapeutic agent for the treatment of prostate cancer if the level of prostate cancer tumour fraction is the same or substantially the same in the initial and subsequent method(s) or lower in the subsequent method(s) than in the initial method.
In embodiments comprising repeating the method (for example, repeating the method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein), after the subsequent method(s) is performed, the method may further comprise:
ceasing or altering (e.g. changing the dose or frequency of the dosing) treatment with the therapeutic agent for the treatment of prostate cancer; and/or
initiating treatment with a second or further therapeutic agent for the treatment of prostate cancer; and/or
initiating a non-therapeutic agent treatment (e.g., surgery or radiation),
if the level of prostate cancer tumour fraction is substantially the same in the initial and subsequent method or higher in the subsequent method than in the initial method; or
if the sample comprises cfDNA derived from a prostate cancer subtype and/or the sample comprises a level of cfDNA derived from a prostate cancer subtype that is substantially the same in the initial and subsequent method or higher in the subsequent method than in the initial method.
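The follow-up options set out above may be summarised, purely by way of a non-limiting sketch, as a mapping from the observed trend in prostate cancer tumour fraction to the actions that the text permits. The names `Trend` and `follow_up_options` are hypothetical, the sketch covers only the tumour fraction trend and not the separate subtype condition, and how a trend is judged to be "substantially the same" is left outside the sketch. The text allows both continuing and changing treatment when the level is substantially the same, so both sets of options are returned in that case.

```python
from enum import Enum


class Trend(Enum):
    LOWER = "lower"
    SAME = "substantially the same"
    HIGHER = "higher"


def follow_up_options(trend):
    """List the actions the description permits for a given trend between
    the initial and subsequent method(s)."""
    options = []
    # Continuing treatment: level the same, substantially the same, or lower.
    if trend in (Trend.LOWER, Trend.SAME):
        options.append("continue current therapeutic agent")
    # Changing treatment: level substantially the same or higher.
    if trend in (Trend.SAME, Trend.HIGHER):
        options.extend([
            "cease or alter current therapeutic agent",
            "initiate a second or further therapeutic agent",
            "initiate a non-therapeutic-agent treatment (e.g. surgery or radiation)",
        ])
    return options


for trend in Trend:
    print(trend.value, "->", follow_up_options(trend))
```

The selection among the returned options remains a clinical decision; the sketch only enumerates what the embodiments allow.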
The invention further provides a method of treating a subject in need of treatment with a therapeutic agent for the treatment of prostate cancer, comprising
i) performing a method of the invention (for example, a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein) to determine the level of prostate cancer tumour fraction in the subject;
ii) administering a therapeutic agent for the treatment of prostate cancer if the subject has a level of prostate cancer tumour fraction or if the sample comprises cfDNA derived from a prostate cancer subtype and/or if the sample comprises a level of cfDNA derived from a prostate cancer subtype (for example 0.01% or more, more preferably 0.02% or more, more preferably 0.03% or more, more preferably 0.04% or more, for example 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more cfDNA derived from a prostate cancer subtype).
In certain embodiments, the method of treating a subject comprises administering a therapeutic agent for the treatment of prostate cancer if the subject has a detectable level of prostate cancer tumour DNA, for example 0.01% or more, 0.02% or more, 0.03% or more, 0.04% or more, 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more, prostate cancer fraction.
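As a non-authoritative illustration of the exemplary thresholds recited above, the following sketch reports which of those thresholds a measured prostate cancer fraction meets. The names `EXAMPLE_THRESHOLDS_PERCENT` and `thresholds_met` are hypothetical, and the measured value is assumed to be expressed as a percentage of total cfDNA.

```python
from decimal import Decimal

# Exemplary thresholds from the description, in percent.
EXAMPLE_THRESHOLDS_PERCENT = ("0.01", "0.02", "0.03", "0.04", "0.05", "0.1", "0.5", "1")


def thresholds_met(fraction_percent):
    """Return the exemplary thresholds (as strings, in percent) that the
    measured fraction equals or exceeds."""
    value = Decimal(str(fraction_percent))
    return [t for t in EXAMPLE_THRESHOLDS_PERCENT if value >= Decimal(t)]


print(thresholds_met(0.04))
print(thresholds_met(0.005))
```

An empty result corresponds, under these assumptions, to a fraction below the lowest exemplary threshold of 0.01%.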
In certain embodiments the method further comprises administering a second therapeutic agent for the treatment of prostate cancer if the subject has a level of prostate cancer fraction (for example a detectable level of prostate cancer fraction, for example 0.01% or more, 0.02% or more, 0.03% or more, 0.04% or more, 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more, prostate cancer fraction). In one preferred embodiment, the method further comprises administering a second therapeutic agent for the treatment of prostate cancer if the subject has a level of prostate cancer fraction 0.01% or more, more preferably 0.02% or more, more preferably 0.03% or more, more preferably 0.04% or more, for example 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more, prostate cancer fraction.
In certain embodiments, the method of treating a subject comprises
(iii) at least 1 week, at least 2 weeks, at least 3 weeks, at least 4 weeks, at least 1 month, at least 2 months, at least 3 months, at least 6 months, at least 9 months, at least 12 months, at least 24 months, or at least 36 months, after the administration of the therapeutic agent, a further sample comprising cfDNA is obtained from the subject, and the method of the invention (for example, a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein) is performed to determine the level of prostate cancer fraction in the further sample.
The invention also provides a therapeutic agent for the treatment of prostate cancer, for use in the treatment of prostate cancer, wherein
i) a method of the invention (for example, a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein) is performed to determine the level of prostate cancer fraction in a subject;
ii) the therapeutic agent is administered if the subject has a level of prostate cancer.
In certain embodiments, the therapeutic agent for use in the treatment of prostate cancer is one for use in a treatment that comprises administering a therapeutic agent for the treatment of prostate cancer if the subject has a detectable level of prostate cancer tumour DNA, for example 0.01% or more, 0.02% or more, 0.03% or more, 0.04% or more, 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more, prostate cancer fraction.
In certain embodiments, the therapeutic agent for use in the treatment of prostate cancer is one for use in a treatment that comprises administering a second therapeutic agent for the treatment of prostate cancer if the subject has a level of prostate cancer fraction (for example a detectable level of prostate cancer fraction, for example 0.01% or more, 0.02% or more, 0.03% or more, 0.04% or more, 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more, prostate cancer fraction). In one preferred embodiment, the therapeutic agent for use in the treatment of prostate cancer is one for use in a treatment that comprises administering a second therapeutic agent for the treatment of prostate cancer if the subject has a level of prostate cancer fraction 0.01% or more, more preferably 0.02% or more, more preferably 0.03% or more, more preferably 0.04% or more, for example 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more, prostate cancer fraction.
In certain embodiments, the therapeutic agent for use in the treatment of prostate cancer is one for use in a treatment in which
(iii) at least 1 week, at least 2 weeks, at least 3 weeks, at least 4 weeks, at least 1 month, at least 2 months, at least 3 months, at least 6 months, at least 9 months, at least 12 months, at least 24 months, or at least 36 months, after the administration of the therapeutic agent, a further sample comprising cfDNA is obtained from the subject, and the method of the invention (for example, a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein) is performed to determine the level of prostate cancer fraction in the further sample.
The present invention also provides a method of determining one or more suitable therapeutic agents for the treatment of prostate cancer in a subject having prostate cancer comprising
In certain embodiments, no level of prostate cancer tumour is no detectable level of prostate cancer. In certain embodiments, a level of prostate cancer tumour is a detectable level of prostate cancer, for example 0.01% or more, 0.02% or more, 0.03% or more, 0.04% or more, 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more, prostate cancer fraction. In certain embodiments, a level of prostate cancer tumour is a detectable level of prostate cancer. In certain embodiments, a level of prostate cancer fraction is 0.01% or more, more preferably 0.02% or more, more preferably 0.03% or more, more preferably 0.04% or more, for example 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more, prostate cancer fraction.
In certain embodiments, the method of determining one or more suitable therapeutic agents for the treatment of prostate cancer for a subject having prostate cancer comprises
In certain embodiments, the method of determining one or more suitable therapeutic agents for the treatment of prostate cancer for a subject having prostate cancer comprises
The present invention also provides a method of determining a suitable treatment regimen for a subject having prostate cancer comprising:
In certain embodiments, no level of prostate cancer tumour is no detectable level of prostate cancer. In certain embodiments, a percentage level of prostate cancer tumour is a detectable level of prostate cancer, for example 0.01% or more, 0.02% or more, 0.03% or more, 0.04% or more, 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more, prostate cancer fraction. In certain embodiments, a level of prostate cancer tumour is a detectable level of prostate cancer. In certain embodiments, a percentage level of prostate cancer fraction is 0.01% or more, more preferably 0.02% or more, more preferably 0.03% or more, more preferably 0.04% or more, for example 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more, prostate cancer fraction.
In certain embodiments, a standard treatment is a treatment with a therapeutic agent for the treatment of prostate cancer, and a non-standard treatment is a treatment with two or more therapeutic agents for the treatment of prostate cancer.
In certain embodiments, a standard treatment is a treatment with a hormonal agent for the treatment of prostate cancer, and a non-standard treatment is a treatment with a hormonal agent for the treatment of prostate cancer and a chemotherapeutic agent for the treatment of prostate cancer and/or an immunotherapy treatment of prostate cancer and/or a targeted treatment of prostate cancer and/or a biologic agent treatment of prostate cancer and/or a radionuclide agent treatment.
In certain embodiments, the method of determining a suitable treatment regimen for a subject having prostate cancer comprises:
performing a method of the invention;
determining the treatment regimen by reference to the level of prostate cancer fraction, whereby a standard treatment is suitable for a subject having a percentage level of prostate cancer fraction of less than 1%, and a non-standard treatment is suitable for a subject with a percentage level of prostate cancer fraction of 1% or more.
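By way of illustration only, the threshold comparison described above can be sketched in Python as follows. The function name `select_regimen`, its parameters and the returned labels are hypothetical and are not part of the described method; the default 1% cut-off is taken from the embodiment above, and other cut-offs recited herein (for example 0.01% or 0.1%) may be passed instead.

```python
def select_regimen(fraction_percent: float, threshold_percent: float = 1.0) -> str:
    """Return a regimen label from the prostate cancer fraction (in percent).

    Hypothetical helper: 'standard' below the threshold, 'non-standard' at or
    above it, mirroring the 'less than 1%' / '1% or more' wording above.
    """
    if fraction_percent < 0 or threshold_percent <= 0:
        raise ValueError("fraction must be >= 0 and threshold must be > 0")
    if fraction_percent < threshold_percent:
        return "standard"
    return "non-standard"


if __name__ == "__main__":
    # Example values are illustrative, not measured data.
    for value in (0.0, 0.5, 1.0, 3.2):
        print(f"{value}% -> {select_regimen(value)}")
```

Keeping the cut-off as a parameter reflects that the embodiments recite several alternative thresholds rather than a single fixed value.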
The present invention also provides a method of determining one or more suitable therapeutic agents for the treatment of prostate cancer in a subject having prostate cancer, comprising:
performing a method of the invention (for example, a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein, to determine whether the sample comprises cfDNA derived from a prostate cancer subtype);
determining the one or more suitable therapeutic agents for the treatment of prostate cancer by reference to whether the sample comprises cfDNA derived from a prostate cancer subtype and/or the level of cfDNA in the sample that is derived from a prostate cancer subtype, whereby one therapeutic agent is suitable for a subject with a sample having no cfDNA derived from a prostate cancer subtype (for example an undetectable level of cfDNA derived from a prostate cancer subtype) or a level of cfDNA derived from a prostate cancer subtype of less than 0.01%, and two or more therapeutic agents are suitable for a subject with a level of cfDNA derived from a prostate cancer subtype (for example a percentage level of cfDNA derived from a prostate cancer subtype of at least 0.01%);
or whereby a therapeutic agent selected from a first list of therapeutic agents is suitable for a subject with a sample having no cfDNA derived from a prostate cancer subtype (for example an undetectable level of cfDNA derived from a prostate cancer subtype) or a level of cfDNA derived from a prostate cancer subtype of less than 0.01%, and a therapeutic agent from a second list of therapeutic agents, or two or more therapeutic agents from the first list, is suitable for a subject with a level of cfDNA derived from a prostate cancer subtype (for example a percentage level of cfDNA derived from a prostate cancer subtype of at least 0.01%).
In certain embodiments, no cfDNA derived from a prostate cancer subtype is no detectable cfDNA derived from a prostate cancer subtype. In certain embodiments, a percentage level of cfDNA derived from a prostate cancer subtype is a detectable level of cfDNA derived from a prostate cancer subtype, for example 0.01% or more, 0.02% or more, 0.03% or more, 0.04% or more, 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more, prostate cancer fraction. In certain embodiments, a level of cfDNA derived from a prostate cancer subtype is a detectable level of prostate cancer. In certain embodiments, a percentage level of cfDNA derived from a prostate cancer subtype is 0.01% or more, more preferably 0.02% or more, more preferably 0.03% or more, more preferably 0.04% or more, for example 0.05% or more, 0.1% or more, 0.5% or more, or 1% or more, prostate cancer fraction.
The present invention also provides a method of determining a suitable treatment regimen for a subject having prostate cancer comprising:
performing a method of the invention (for example, a method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA as defined herein, to determine whether the sample comprises cfDNA derived from a prostate cancer subtype);
determining the treatment regimen by reference to whether the sample comprises cfDNA derived from a prostate cancer subtype and/or the level of cfDNA in the sample that is derived from a prostate cancer subtype, whereby a standard treatment is suitable for a subject with a sample having no cfDNA derived from a prostate cancer subtype (for example an undetectable level of cfDNA derived from a prostate cancer subtype) or a percentage level of cfDNA derived from a prostate cancer subtype of less than 0.01%, and a non-standard treatment is suitable for a subject with a level of cfDNA derived from a prostate cancer subtype (for example a detectable level of prostate cancer fraction) or a percentage level of prostate cancer fraction of at least 0.01%.
In certain embodiments, the method of determining a suitable treatment regimen for a subject having prostate cancer comprises:
performing a method of the invention;
determining the treatment regimen by reference to whether the sample comprises cfDNA derived from a prostate cancer subtype and/or the level of cfDNA in the sample that is derived from a prostate cancer subtype, whereby a standard treatment is suitable for a subject with a sample having no cfDNA derived from a prostate cancer subtype (for example an undetectable level of cfDNA derived from a prostate cancer subtype) or a percentage level of cfDNA derived from a prostate cancer subtype of less than 0.1%, and a non-standard treatment is suitable for a subject with a level of cfDNA derived from a prostate cancer subtype (for example a detectable level of prostate cancer fraction) or a percentage level of prostate cancer fraction of at least 0.1%.
In certain embodiments, the method of determining a suitable treatment regimen for a subject having prostate cancer comprises:
performing a method of the invention;
determining the treatment regimen by reference to whether the sample comprises cfDNA derived from a prostate cancer subtype and/or the level of cfDNA in the sample that is derived from a prostate cancer subtype, whereby a standard treatment is suitable for a subject with a sample having no cfDNA derived from a prostate cancer subtype (for example an undetectable level of cfDNA derived from a prostate cancer subtype) or a percentage level of cfDNA derived from a prostate cancer subtype of less than 1%, and a non-standard treatment is suitable for a subject with a level of cfDNA derived from a prostate cancer subtype (for example a detectable level of prostate cancer fraction) or a percentage level of prostate cancer fraction of at least 1%.
In certain embodiments, a standard treatment is a treatment with a therapeutic agent for the treatment of prostate cancer, and a non-standard treatment is a treatment with two or more therapeutic agents for the treatment of prostate cancer.
In certain embodiments, a standard treatment is a treatment with a hormonal agent for the treatment of prostate cancer, and a non-standard treatment is a treatment with a hormonal agent for the treatment of prostate cancer and a chemotherapeutic agent for the treatment of prostate cancer and/or an immunotherapy treatment of prostate cancer and/or a targeted treatment of prostate cancer and/or a biologic agent treatment and/or a radionuclide agent treatment of prostate cancer.
The invention also provides a computerized (or computer implemented) method and/or computer-assisted method and/or a computer product and/or a computer implemented software for performing or implementing the method defined herein, for example the method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject described herein, the methods of treatment and therapeutic agents for use described herein, the methods of determining one or more suitable therapeutic agents for the treatment of prostate cancer described herein, the methods for determining a treatment regimen described herein, and the methods of determining a solid cancer cfDNA methylome signature. A kit of the invention may comprise a computerized method and/or computer-assisted method and/or a computer product and/or a computer implemented software of the present invention.
A computerized method and/or computer-assisted method and/or a computer product and/or a computer implemented software for performing or implementing a method defined herein comprises performing one or more steps of the relevant method, or in certain embodiments, comprises performing the relevant method. A computerized (or computer implemented) method and/or computer-assisted method and/or a computer implemented software can control a computer product to execute, perform or implement one or more steps of the relevant method, or in certain embodiments, comprises performing the relevant method.
In certain embodiments, the present invention provides a computer product. A computer product of the present invention has the means for performing or implementing one or more method described herein.
In some embodiments, a computer product of the present invention comprises at least one memory containing at least one computer program or software adapted to control the operation of the computer system to perform or implement a method that includes receiving and characterizing DNA methylation data (e.g., receiving and characterizing methylome sequences of a plurality of cfDNA molecules and determining the average methylation ratio at 10 or more genomic regions), and at least one processor for executing the computer program or software.
In some embodiments, a computer product of the present invention comprises a non-transitory computer readable medium storing a plurality of instructions that, when executed, control a computer system to perform one or more steps of a method described herein or comprises performing or implementing a method described herein.
In certain embodiments, a computer product is a product having a computer, where the computer comprises a computer-readable medium embodying software to operate the computer. In some cases, the computer system includes one or more general or special purpose processors and associated memory, including volatile and non-volatile memory devices. In some cases, the computer product memory stores software or computer programs for controlling the operation of the computer system to make a special purpose system according to the invention or to implement a system to perform the methods according to the invention. In some cases, the computer system includes a single or multi-core central processing unit (CPU), an ARM processor or similar computer processor for processing the data. In some cases, the CPU or microprocessor is any conventional general purpose single- or multi-chip microprocessor, a RISC or MIPS processor, a PowerPC processor, or an ALPHA processor. In some cases, the microprocessor is any conventional or special purpose microprocessor such as a digital signal processor or a graphics processor. The microprocessor typically has conventional address lines, conventional data lines, and one or more conventional control lines. The software or computer program may be executed on a dedicated system or on a general purpose computer having, for example, a Windows, Unix, Linux or other operating system. In some instances, the system includes non-volatile memory, such as disk memory and solid state memory, for storing computer programs, software and data, and volatile memory, such as high-speed RAM, for executing programs and software.
In certain embodiments, a computer product is a storage device used for storing data accessible by a computer, as well as any other means for providing access to data by a computer. Examples of a storage device-type computer-readable medium include: a magnetic hard disk; an optical disk, such as a CD-ROM and a DVD; a magnetic tape; a memory chip. Examples of a computer-readable physical storage media include any physical computer-readable storage medium, e.g., solid state memory (such as flash memory), magnetic and optical computer-readable storage media and devices, and memory that uses other persistent storage technologies. In certain embodiments, a computer product is computer readable media selected from the group consisting of RAM (random access memory), ROM (read only memory), EPROM (erasable programmable read only memory), EEPROM (electrically erasable programmable read only memory), flash memory or other memory technology, CD-ROM (compact disc read only memory), DVDs (digital versatile disks) or other optical storage media, magnetic cassettes, magnetic tape, and magnetic disk storage.
In one preferred embodiment, the present invention provides a computerized (or computer implemented) method and/or computer-assisted method and/or a computer product and/or computerized (or computer implemented) software for detection, screening, monitoring, staging, classification and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA, the method comprising:
receiving a data set in a computer comprising a processor and a computer readable medium, wherein the data set comprises the methylome sequence of a plurality of cfDNA molecules in the sample; and wherein the computer readable medium comprises instructions that, when executed by the processor, cause the computer to perform or implement a method of the invention.
For example, in one embodiment it causes the computer to perform or implement a method comprising the following steps:
characterize the methylome sequence of a plurality of cfDNA molecules in the sample, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule;
determine the average methylation ratio at 10 or more genomic regions, each genomic region being selected from the group consisting of:
a 100 to 200 bp region comprising or having a genomic location defined in Tables 1 to 4, and
a 2 to 99 bp region within a genomic location defined in Tables 1 to 4 and comprising at least one CpG locus,
and wherein each genomic region is covered by at least one sequence read of at least one characterized methylome sequence;
calculate a methylation score using the average methylation ratio for each genomic region;
analyse the methylation score to determine the level of prostate cancer fraction in the cfDNA sample.
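The steps above can be sketched, purely by way of example, in Python. Several points are assumptions made for the illustration and are not taken from the description: CpG calls are represented as `(chromosome, position, methylated)` tuples; the average methylation ratio of a region is taken as the fraction of methylated calls among all calls falling in the region; each region carries a hypothetical `hyper` flag indicating whether it is expected to gain methylation in tumour; and the methylation score is taken as the mean of the direction-adjusted ratios. How a score is converted into a prostate cancer fraction is not specified in this passage and is not modelled here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    hyper: bool  # hypothetical flag: True if methylation is expected to rise in tumour


# One CpG call from one sequence read: (chromosome, position, methylated?)
CpGCall = Tuple[str, int, bool]


def average_methylation_ratio(region: Region, calls: Iterable[CpGCall]) -> Optional[float]:
    """Fraction of methylated CpG calls inside the region, or None if uncovered."""
    methylated = 0
    total = 0
    for chrom, pos, is_methylated in calls:
        if chrom == region.chrom and region.start <= pos < region.end:
            total += 1
            methylated += 1 if is_methylated else 0
    return methylated / total if total else None


def methylation_score(regions: List[Region], calls: Iterable[CpGCall],
                      min_regions: int = 10) -> float:
    """Mean direction-adjusted ratio over covered regions (assumed scoring rule)."""
    calls = list(calls)  # allow several passes over the calls
    contributions = []
    for region in regions:
        ratio = average_methylation_ratio(region, calls)
        if ratio is None:
            continue  # region not covered by any read
        contributions.append(ratio if region.hyper else 1.0 - ratio)
    if len(contributions) < min_regions:
        raise ValueError(f"only {len(contributions)} covered regions; need {min_regions}")
    return sum(contributions) / len(contributions)
```

The `min_regions` default of 10 mirrors the requirement that the average methylation ratio be determined at 10 or more genomic regions, each covered by at least one sequence read.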
For example, in one embodiment it causes the computer to perform or implement a method comprising the following steps:
characterize the methylome sequence of a plurality of cfDNA molecules in the sample, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule;
determine the average methylation ratio at 10 or more genomic regions, each genomic region being selected from the group consisting of:
a 100 to 200 bp region comprising or having a genomic location defined in Table 8, and
a 2 to 99 bp region within a genomic location defined in Table 8 and comprising at least one CpG locus,
and wherein each of the genomic regions is covered by at least one sequence read of at least one characterized methylome sequence;
calculate a methylation score using the average methylation ratio for each of the genomic regions;
analyze the methylation score to determine whether the sample comprises cfDNA derived from a prostate cancer subtype.
In one preferred embodiment, the present invention provides a computerized (or computer implemented) method and/or computer-assisted method and/or a computer product and/or computerized (or computer implemented) software for classifying a prostate cancer patient into one or more of a plurality of treatment categories by determining the level of prostate cancer fraction in a sample obtained from a subject, wherein the sample comprises cfDNA, the method comprising:
receiving a data set in a computer comprising a processor and a computer readable medium, wherein the data set comprises the methylome sequence of a plurality of cfDNA molecules in a sample obtained from a subject, wherein the sample comprises cfDNA;
and wherein the computer readable medium comprises instructions that, when executed by the processor, cause the computer to perform or implement a method of the invention.
For example, in one embodiment it causes the computer to perform or implement a method comprising the following steps:
characterize the methylome sequence of a plurality of cfDNA molecules in the sample, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule;
determine the average methylation ratio at 10 or more genomic regions, each genomic region being selected from the group consisting of:
a 100 to 200 bp region comprising or having a genomic location defined in Tables 1 to 4, and
a 2 to 99 bp region within a genomic location defined in Tables 1 to 4 and comprising at least one CpG locus,
and wherein each genomic region is covered by at least one sequence read of at least one characterized methylome sequence;
calculate a methylation score using the average methylation ratio for each genomic region;
analyse the methylation score to determine the level of prostate cancer fraction in the cfDNA sample.
For example, in one embodiment it causes the computer to perform or implement a method comprising the following steps:
characterize the methylome sequence of a plurality of cfDNA molecules in the sample, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule;
determine the average methylation ratio at 10 or more genomic regions, each genomic region being selected from the group consisting of:
a 100 to 200 bp region comprising or having a genomic location defined in Table 8, and
a 2 to 99 bp region within a genomic location defined in Table 8 and comprising at least one CpG locus,
and wherein each of the genomic regions is covered by at least one sequence read of at least one characterized methylome sequence;
calculate a methylation score using the average methylation ratio for each of the genomic regions;
analyze the methylation score to determine whether the sample comprises cfDNA derived from a prostate cancer subtype.
In another preferred embodiment, the present invention provides a computerized (or computer implemented) method and/or computer-assisted method and/or a computer product and/or computerized (or computer implemented) software for classifying a prostate cancer patient into one or more of a plurality of treatment categories by determining the subtype of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA, the method comprising:
receiving a data set in a computer comprising a processor and a computer readable medium, wherein the data set comprises the methylome sequence of a plurality of cfDNA molecules in a sample obtained from a subject, wherein the sample comprises cfDNA;
and wherein the computer readable medium comprises instructions that, when executed by the processor, cause the computer to perform or implement a method of the invention.
For example, in one embodiment it causes the computer to perform or implement a method comprising the following steps:
characterize the methylome sequence of a plurality of cfDNA molecules in the sample, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule;
determine the average methylation ratio at 10 or more genomic regions, each genomic region being selected from the group consisting of:
a 100 to 200 bp region comprising or having a genomic location defined in Tables 1 to 4, and
a 2 to 99 bp region within a genomic location defined in Tables 1 to 4 and comprising at least one CpG locus,
and wherein each genomic region is covered by at least one sequence read of at least one characterized methylome sequence;
calculate a methylation score using the average methylation ratio for each genomic region;
analyse the methylation score to determine the level of prostate cancer fraction in the cfDNA sample.
For example, in one embodiment it causes the computer to perform or implement a method comprising the following steps:
characterize the methylome sequence of a plurality of cfDNA molecules in the sample, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule;
determine the average methylation ratio at 10 or more genomic regions, each genomic region being selected from the group consisting of:
a 100 to 200 bp region comprising or having a genomic location defined in Table 8, and
a 2 to 99 bp region within a genomic location defined in Table 8 and comprising at least one CpG locus,
and wherein each of the genomic regions is covered by at least one sequence read of at least one characterized methylome sequence;
calculate a methylation score using the average methylation ratio for each of the genomic regions;
analyze the methylation score to determine whether the sample comprises cfDNA derived from a prostate cancer subtype.
In one embodiment, the plurality of treatment categories are selected from a hormonal agent, a targeted agent, a biologic agent, an immunotherapy agent, and a chemotherapy agent.
In one embodiment, the plurality of treatment categories are selected from a treatment with a single agent (for example a hormonal agent, a targeted agent, a biologic agent, an immunotherapy agent, or a chemotherapy agent); and treatment with a combination of agents (for example, a combination of two or more agents selected from the group consisting of a hormonal agent, a targeted agent, a biologic agent, an immunotherapy agent, and a chemotherapy agent).
In one preferred embodiment, the plurality of treatment categories are selected from a treatment with a single agent (for example a hormonal agent, a targeted agent, a biologic agent, an immunotherapy agent, or a chemotherapy agent); and treatment with a combination of two, three, four or five agents (for example, a combination of two, three, four or five agents selected from the group consisting of a hormonal agent, a targeted agent, a biologic agent, an immunotherapy agent, and a chemotherapy agent).
For example, the plurality of treatment categories are selected from a hormonal agent; and a hormonal agent and a chemotherapeutic agent and/or a further hormonal agent.
In one preferred embodiment, the plurality of treatment categories are selected from a standard treatment (for example a treatment with a hormonal agent); and a non-standard treatment (for example a hormonal agent for the treatment of prostate cancer, and a chemotherapeutic agent for the treatment of prostate cancer and/or an immunotherapy treatment of prostate cancer and/or a targeted treatment of prostate cancer and/or a biologic agent treatment of prostate cancer).
In certain embodiments, a computerized (or computer implemented) method and/or computer-assisted method and/or a computer product and/or a computer implemented software described herein further comprises treating the subject for prostate cancer using a therapeutic agent for the treatment of prostate cancer;
or ceasing or altering treatment with a therapeutic agent for the treatment of prostate cancer; or initiating a non-therapeutic agent treatment for prostate cancer (for example initiation of treatment by surgery or radiation).
In another preferred embodiment, the present invention provides a computerized (or computer implemented) method and/or computer-assisted method and/or a computer product and/or computerized (or computer implemented) software for determining a solid cancer cfDNA methylome signature for use in the detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of the solid cancer, the method comprising:
receiving a data set in a computer comprising a processor and a computer readable medium, wherein the data set comprises the methylome sequence of a plurality of cfDNA molecules in a sample from a subject known to have the solid cancer;
and wherein the computer readable medium comprises instructions that, when executed by the processor, cause the computer to perform or implement a method comprising the following steps:
(i) characterize the methylome sequence of a plurality of cfDNA molecules in a first sample comprising cfDNA from a subject known to have the solid cancer, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule;
(ii) determine the respective number of characterised cfDNA molecules corresponding to a CpG locus or a genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) in the first sample by aligning the methylome sequences;
(iii) determine the methylation ratio of each CpG locus and/or average methylation ratio of each genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) in the first sample;
repeating steps (i) to (iii) for one or more further samples comprising cfDNA each from subjects known to have the solid cancer;
perform a variance analysis of all or a selection of the methylation ratios of the CpG loci and/or all or a selection of average methylation ratios of the genomic regions of the samples;
select a group of CpG loci and/or genomic regions associated with a feature of the samples; and
select CpG loci and/or genomic regions in the group to provide the cfDNA methylome signature.
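The selection steps above can be illustrated with a minimal Python sketch. The passage does not specify the type of variance analysis or how association with a sample feature is measured, so the sketch makes two labelled assumptions: regions are ranked by the population variance of their methylation ratios across samples, and association with a numeric sample feature (for example a known tumour fraction) is measured by the absolute Pearson correlation. The names `select_signature`, `n_variable` and `min_abs_corr` are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def select_signature(ratios: Dict[str, List[float]], feature: Sequence[float],
                     n_variable: int, min_abs_corr: float) -> List[str]:
    """Pick regions that are both highly variable and associated with a feature.

    ratios maps a region identifier to its methylation ratios, one per sample,
    in the same sample order as `feature`.
    """
    # Assumed variance analysis: rank regions by variance across samples.
    ranked = sorted(ratios, key=lambda region: pvariance(ratios[region]), reverse=True)
    most_variable = ranked[:n_variable]
    # Assumed association test: keep regions correlated with the sample feature.
    return [region for region in most_variable
            if abs(pearson(ratios[region], feature)) >= min_abs_corr]
```

In practice a multivariate variance analysis over many samples might be used instead; the per-region ranking here is only the simplest form consistent with the steps described.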
Prostate Cancer cfDNA Methylome Signatures
The invention also provides a cfDNA methylome signature comprising a set of genomic locations defining 10 or more genomic regions.
In one embodiment, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the group consisting of:
In such embodiments, preferably the signature comprises a set of genomic locations defining 12 or more genomic regions, for example a set of genomic locations defining 15 or more genomic regions, a set of genomic locations defining 20 or more genomic regions, a set of genomic locations defining 25 or more genomic regions, a set of genomic locations defining 30 or more genomic regions, a set of genomic locations defining 50 or more genomic regions, a set of genomic locations defining 75 or more genomic regions, a set of genomic locations defining 100 or more genomic regions, a set of genomic locations defining 125 or more genomic regions, a set of genomic locations defining 150 or more genomic regions, a set of genomic locations defining 200 or more genomic regions, a set of genomic locations defining 300 or more genomic regions, a set of genomic locations defining 400 or more genomic regions, a set of genomic locations defining 500 or more genomic regions, a set of genomic locations defining 600 or more genomic regions, a set of genomic locations defining 700 or more genomic regions, a set of genomic locations defining 800 or more genomic regions, a set of genomic locations defining 900 or more genomic regions, or a set of genomic locations defining 1000 genomic regions.
The signature is for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, prognostication and/or treatment of prostate cancer. The methylation state (for example the average methylation ratio) of the genomic regions defined by the set of genomic locations of the signature may be used to indicate one or more of the following: the presence of prostate cancer cfDNA in the cfDNA sample, the level of prostate cancer fraction in the cfDNA sample, a subtype of prostate cancer (for example a genomic subtype or molecular subtype, such as castration resistant prostate cancer), if the prostate cancer is metastatic, the aggression of the prostate cancer, the prognosis of the prostate cancer, the tumour response to a treatment, the relapse of the prostate cancer, and/or the residual disease following curative treatment. The methylation state of the genomic regions defined by the set of genomic locations of the signature may be used to indicate the presence of prostate cancer cfDNA in the cfDNA sample and/or the level of prostate cancer fraction in the cfDNA sample.
In one embodiment, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the group consisting of:
a 100 to 200 bp (for example a 100 to 150 bp or 100 to 120 bp) genomic location comprising or having a genomic location defined in Tables 1 and 3, and
a 2 to 99 bp (for example a 20 to 99 bp or 50 to 99 bp) genomic location within a genomic location defined in Tables 1 and 3 and comprising at least one CpG locus. For example, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the 100 bp genomic locations defined in Tables 1 and 3.
In such embodiments, preferably the signature comprises a set of genomic locations defining 12 or more genomic regions, for example a set of genomic locations defining 15 or more genomic regions, a set of genomic locations defining 20 or more genomic regions, a set of genomic locations defining 25 or more genomic regions, a set of genomic locations defining 30 or more genomic regions, a set of genomic locations defining 50 or more genomic regions, a set of genomic locations defining 75 or more genomic regions, a set of genomic locations defining 100 or more genomic regions, a set of genomic locations defining 125 or more genomic regions, a set of genomic locations defining 150 or more genomic regions, a set of genomic locations defining 200 or more genomic regions, a set of genomic locations defining 300 or more genomic regions, or a set of genomic locations defining 400 or more genomic regions.
In one embodiment, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the group consisting of:
a 100 to 200 bp (for example a 100 to 150 bp or 100 to 120 bp) genomic location comprising or having a genomic location defined in Tables 2 and 4, and
a 2 to 99 bp (for example a 20 to 99 bp or 50 to 99 bp) genomic location within a genomic location defined in Tables 2 and 4 and comprising at least one CpG locus. For example, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the 100 bp genomic locations defined in Tables 2 and 4.
In such embodiments, preferably the signature comprises a set of genomic locations defining 12 or more genomic regions, for example a set of genomic locations defining 15 or more genomic regions, a set of genomic locations defining 20 or more genomic regions, a set of genomic locations defining 25 or more genomic regions, a set of genomic locations defining 30 or more genomic regions, a set of genomic locations defining 50 or more genomic regions, a set of genomic locations defining 75 or more genomic regions, a set of genomic locations defining 100 or more genomic regions, a set of genomic locations defining 125 or more genomic regions, a set of genomic locations defining 150 or more genomic regions, a set of genomic locations defining 200 or more genomic regions, a set of genomic locations defining 300 or more genomic regions, or a set of genomic locations defining 400 or more genomic regions.
In one embodiment, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the group consisting of:
a 100 to 200 bp (for example a 100 to 150 bp or 100 to 120 bp) genomic location comprising or having a genomic location defined in Tables 1 and 2, and
a 2 to 99 bp (for example a 20 to 99 bp or 50 to 99 bp) genomic location within a genomic location defined in Tables 1 and 2 and comprising at least one CpG locus. For example, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the 100 bp genomic locations defined in Tables 1 and 2.
In such embodiments, preferably the signature comprises a set of genomic locations defining 12 or more genomic regions, for example a set of genomic locations defining 15 or more genomic regions, a set of genomic locations defining 20 or more genomic regions, a set of genomic locations defining 25 or more genomic regions, a set of genomic locations defining 30 or more genomic regions, a set of genomic locations defining 50 or more genomic regions, a set of genomic locations defining 75 or more genomic regions, a set of genomic locations defining 100 or more genomic regions, a set of genomic locations defining 125 or more genomic regions, a set of genomic locations defining 150 or more genomic regions, a set of genomic locations defining 200 or more genomic regions, a set of genomic locations defining 300 or more genomic regions, a set of genomic locations defining 400 or more genomic regions, or a set of genomic locations defining 500 or more genomic regions.
In one embodiment, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the group consisting of:
a 100 to 200 bp (for example a 100 to 150 bp or 100 to 120 bp) genomic location comprising or having a genomic location defined in Tables 3 and 4, and
a 2 to 99 bp (for example a 20 to 99 bp or 50 to 99 bp) genomic location within a genomic location defined in Tables 3 and 4 and comprising at least one CpG locus. For example, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the 100 bp genomic locations defined in Tables 3 and 4.
In such embodiments, preferably the signature comprises a set of genomic locations defining 12 or more genomic regions, for example a set of genomic locations defining 15 or more genomic regions, a set of genomic locations defining 20 or more genomic regions, a set of genomic locations defining 25 or more genomic regions, a set of genomic locations defining 30 or more genomic regions, a set of genomic locations defining 50 or more genomic regions, a set of genomic locations defining 75 or more genomic regions, a set of genomic locations defining 100 or more genomic regions, a set of genomic locations defining 125 or more genomic regions, a set of genomic locations defining 150 or more genomic regions, a set of genomic locations defining 200 or more genomic regions, a set of genomic locations defining 300 or more genomic regions, or a set of genomic locations defining 400 or more genomic regions.
In one embodiment, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the group consisting of:
a 100 to 200 bp (for example a 100 to 150 bp or 100 to 120 bp) genomic location comprising or having a genomic location defined in Table 5, and
a 2 to 99 bp (for example a 20 to 99 bp or 50 to 99 bp) genomic location within a genomic location defined in Table 5 and comprising at least one CpG locus. For example, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the 100 bp genomic locations defined in Table 5.
In such embodiments, preferably the signature comprises a set of genomic locations defining 12 or more genomic regions, for example a set of genomic locations defining 15 or more genomic regions, a set of genomic locations defining 20 or more genomic regions, a set of genomic locations defining 25 or more genomic regions, a set of genomic locations defining 30 or more genomic regions, a set of genomic locations defining 50 or more genomic regions, a set of genomic locations defining 75 or more genomic regions, a set of genomic locations defining 100 or more genomic regions, a set of genomic locations defining 125 or more genomic regions, or a set of genomic locations defining 150 or more genomic regions.
In one embodiment, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the group consisting of:
a 100 to 200 bp (for example a 100 to 150 bp or 100 to 120 bp) genomic location comprising or having a genomic location defined in Table 6, and
a 2 to 99 bp (for example a 20 to 99 bp or 50 to 99 bp) genomic location within a genomic location defined in Table 6 and comprising at least one CpG locus. For example, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the 100 bp genomic locations defined in Table 6.
In such embodiments, preferably the signature comprises a set of genomic locations defining 12 or more genomic regions, for example a set of genomic locations defining 15 or more genomic regions, a set of genomic locations defining 20 or more genomic regions, a set of genomic locations defining 25 or more genomic regions, a set of genomic locations defining 30 or more genomic regions, a set of genomic locations defining 50 or more genomic regions, a set of genomic locations defining 75 or more genomic regions, a set of genomic locations defining 100 or more genomic regions, a set of genomic locations defining 125 or more genomic regions, or a set of genomic locations defining 150 or more genomic regions.
In one embodiment, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the group consisting of:
a 100 to 200 bp (for example a 100 to 150 bp or 100 to 120 bp) genomic location comprising or having a genomic location defined in Table 7, and
a 2 to 99 bp (for example a 20 to 99 bp or 50 to 99 bp) genomic location within a genomic location defined in Table 7 and comprising at least one CpG locus. For example, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the 100 bp genomic locations defined in Table 7.
In such embodiments, preferably the signature comprises a set of genomic locations defining 12 or more genomic regions, for example a set of genomic locations defining 15 or more genomic regions, a set of genomic locations defining 20 or more genomic regions, a set of genomic locations defining 25 or more genomic regions, a set of genomic locations defining 30 or more genomic regions, a set of genomic locations defining 50 or more genomic regions, a set of genomic locations defining 75 or more genomic regions, or a set of genomic locations defining 100 genomic regions.
The invention also provides a cfDNA methylome signature comprising a set of genomic locations defining 10 or more genomic regions.
In such embodiments, preferably the signature comprises a set of genomic locations defining 12 or more genomic regions, for example a set of genomic locations defining 15 or more genomic regions, a set of genomic locations defining 20 or more genomic regions, a set of genomic locations defining 25 or more genomic regions, a set of genomic locations defining 30 or more genomic regions, a set of genomic locations defining 50 or more genomic regions, a set of genomic locations defining 75 or more genomic regions, a set of genomic locations defining 100 or more genomic regions, a set of genomic locations defining 125 or more genomic regions, a set of genomic locations defining 150 or more genomic regions, a set of genomic locations defining 200 or more genomic regions, a set of genomic locations defining 300 or more genomic regions, a set of genomic locations defining 400 or more genomic regions, a set of genomic locations defining 500 or more genomic regions, a set of genomic locations defining 600 or more genomic regions, a set of genomic locations defining 700 or more genomic regions, a set of genomic locations defining 800 or more genomic regions, a set of genomic locations defining 900 or more genomic regions, or a set of genomic locations defining 1000 genomic regions.
The signature is for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, prognostication and/or treatment of prostate cancer.
The methylation state (for example the average methylation ratio) of the genomic regions defined by the set of genomic locations of the signature may be used to indicate one or more of the following: the presence of prostate cancer cfDNA in the cfDNA sample, a subtype of prostate cancer (for example a genomic subtype or molecular subtype, such as one that has an aggressive clinical course and/or an AR copy number gain), if the prostate cancer is metastatic, the aggression of the prostate cancer, the prognosis of the prostate cancer, the tumour response to a treatment, the relapse of the prostate cancer, and/or the residual disease following curative treatment. Preferably, the methylation state of the genomic regions defined by the set of genomic locations of the signature may be used to indicate the presence of prostate cancer cfDNA in the cfDNA sample and/or a subtype of prostate cancer, such as one that has an aggressive clinical course and/or an AR copy number gain.
In one embodiment, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the group consisting of:
a 100 to 200 bp (for example a 100 to 150 bp or 100 to 120 bp) genomic location comprising or having a genomic location defined in Table 9, and
a 2 to 99 bp (for example a 20 to 99 bp or 50 to 99 bp) genomic location within a genomic location defined in Table 9 and comprising at least one CpG locus. For example, the set of genomic locations defining 10 or more genomic regions are genomic locations selected from the 100 bp genomic locations defined in Table 9.
In such embodiments, preferably the signature comprises a set of genomic locations defining 12 or more genomic regions, for example a set of genomic locations defining 15 or more genomic regions, a set of genomic locations defining 20 or more genomic regions, a set of genomic locations defining 25 or more genomic regions, a set of genomic locations defining 30 or more genomic regions, a set of genomic locations defining 50 or more genomic regions, a set of genomic locations defining 75 or more genomic regions, a set of genomic locations defining 100 or more genomic regions, a set of genomic locations defining 125 or more genomic regions, or a set of genomic locations defining 150 genomic regions.
Methods for Determining a Solid Cancer cfDNA Methylome Signature
The present invention also provides methods for determining a solid cancer cfDNA methylome signature. Suitably, such signatures are used, for example, in detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, prognostication and/or treatment of a solid cancer. They can also suitably be used with the methods and kits for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, prognostication and/or treatment of a solid cancer and in methods for treatment of solid cancer.
In one embodiment, the invention provides a method for determining a solid cancer cfDNA methylome signature for use in detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, prognostication and/or treatment of the solid cancer, the method comprising:
In certain embodiments the solid cancer is prostate cancer (for example acinar adenocarcinoma prostate cancer, ductal adenocarcinoma prostate cancer, transitional cell cancer of the prostate, squamous cell cancer of the prostate, or small cell prostate cancer, and particularly acinar adenocarcinoma prostate cancer or ductal adenocarcinoma prostate cancer). In certain embodiments, the solid cancer is a metastatic cancer. In certain embodiments, the solid cancer is a relapsed and/or refractory solid cancer. In certain embodiments, the solid cancer is a subtype of a solid cancer, for example a subtype of a prostate cancer, for example a prostate cancer with specific molecular characteristics and/or genetic characteristics of the cancer cells.
The first sample is a sample that comprises cfDNA. The sample may suitably be a blood sample, a plasma sample, or a urine sample. In certain embodiments, the sample is a blood sample or a plasma sample. In certain embodiments, the sample is a urine sample.
Each further sample is a sample that comprises cfDNA. Each further sample may suitably be a blood sample, a plasma sample, or a urine sample. In certain embodiments, one or more further sample(s) is/are blood sample(s) or plasma sample(s). In certain embodiments, one or more further sample(s) is/are urine sample(s). In certain embodiments, all of the further samples are of the same type, for example each further sample is a blood sample; or each further sample is a plasma sample; or each further sample is a urine sample. In certain embodiments, each further sample is a blood sample; or each further sample is a plasma sample.
In one preferred embodiment, the first sample and each further sample are all samples of the same type. For example, the first sample and each further sample are all blood samples; or the first sample and each further sample are all plasma samples; or the first sample and each further sample are all urine samples. In certain embodiments, the first sample and each further sample are all blood samples; or the first sample and each further sample are all plasma samples.
In one embodiment, the first sample comprising cfDNA is from a subject known to have or suspected of having metastatic solid cancer. For example, the sample comprising cfDNA is from a subject known to have metastatic solid cancer.
In one embodiment, the one or more further samples comprising cfDNA are each from subjects known to have or suspected of having metastatic solid cancer. For example, the one or more further samples comprising cfDNA are each from subjects known to have metastatic solid cancer.
In one embodiment, the first and each further sample comprising cfDNA are each from subjects known to have or suspected of having metastatic solid cancer. For example, the first and each further sample comprising cfDNA are each from subjects known to have metastatic solid cancer.
In one embodiment, the first sample and one or more of the further samples are from the same subject, for example the same subject but at different time points, for example before treatment, during a treatment, after a treatment, before progression, after progression, after relapse, and/or after change of the disease to metastatic cancer.
In one embodiment, the first sample and each of the further samples are from the same subject, for example the same subject but at different time points, for example before treatment, during a treatment, after a treatment, before progression, after progression, after relapse, and/or after change of the disease to metastatic cancer.
In one embodiment, the first sample and one or more of the further samples are from different subjects. The different subjects may all have the same type of the solid cancer, or may all have a different type of the solid cancer, or some may have the same and some may have a different type of the solid cancer. A type of solid cancer may be metastatic, and a different type may be non-metastatic cancer. Another type of solid cancer may be a solid cancer that responds to a certain treatment (e.g. a hormonal agent), and a different type may be a solid cancer that does not respond to that treatment (e.g. a hormonal agent). For prostate cancer, different types of that solid cancer include acinar adenocarcinoma prostate cancer, ductal adenocarcinoma prostate cancer, transitional cell cancer of the prostate, squamous cell cancer of the prostate, and small cell prostate cancer. For prostate cancer, different types of that solid cancer also include castration sensitive prostate cancer and castration resistant prostate cancer.
In one embodiment, the first sample and one or more of the further samples are from different subjects. The different subjects may all have the same subtype of the solid cancer, or may all have a different subtype of the solid cancer, or some may have the same and some may have a different subtype of the solid cancer. A subtype of solid cancer may be a subtype based on characteristics of the cancer cells, and in particular molecular and genetic characteristics of the cells. Examples of prostate cancer subtypes include androgen sensitive prostate cancer, androgen insensitive prostate cancer, prostate cancer with AR copy number gain, and prostate cancer with an aggressive clinical course.
In one embodiment, the first sample and one or more of the further samples have different levels of cancer fraction of cfDNA. In one embodiment, the first sample and one or more of the further samples have similar levels of cancer fraction of cfDNA. The level of cancer fraction in a cfDNA sample can be determined by, for example, using methods that estimate tumour fraction using genomic markers.
Each subject is preferably of the same species, for example each subject (i.e. the first subject and each of the one or more further subjects) is human.
In certain embodiments, the method comprises the additional step of obtaining a biological sample from the first subject and/or obtaining a biological sample from one or more further subjects, for example from each of the one or more further subjects.
The method for determining a solid cancer cfDNA methylome signature may further comprise isolating the cfDNA from the first sample, and isolating the cfDNA from the one or more further samples. Methods for isolating the cfDNA from the sample described elsewhere herein may be used in the method for determining a solid cancer cfDNA methylome signature.
The method comprises characterizing the methylome sequence of a plurality of cfDNA molecules in a first sample, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule. The method also comprises characterizing the methylome sequence of a plurality of cfDNA molecules in each of one or more further samples, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule. Methods for characterizing the methylome sequence of a plurality of cfDNA molecules described elsewhere herein may be used in the method for determining a solid cancer cfDNA methylome signature.
A plurality of cfDNA molecules may be, for example, at least 100, at least 1000, at least 10,000, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000 (10^6), at least 5,000,000 (5×10^6), at least 10,000,000 (10^7), at least 100,000,000 (10^8), or at least 1,000,000,000 (10^9). Preferably, a plurality of cfDNA molecules may be, for example, at least 10,000, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000 (10^6), at least 5,000,000 (5×10^6), at least 10,000,000 (10^7), at least 100,000,000 (10^8), or at least 1,000,000,000 (10^9). More preferably, a plurality of cfDNA molecules may be, for example, at least 100,000, at least 500,000, at least 1,000,000 (10^6), at least 5,000,000 (5×10^6), at least 10,000,000 (10^7), at least 100,000,000 (10^8), or at least 1,000,000,000 (10^9). The plurality of cfDNA molecules that are characterised for the first sample and for each of the one or more further samples may be the same or may be different.
The method comprises determining the respective number of characterised cfDNA molecules corresponding to a CpG locus or a genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) by aligning the methylome sequences in the first sample. The method also comprises determining the respective number of characterised cfDNA molecules corresponding to a CpG locus or a genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) by aligning the methylome sequences in each of the one or more further samples. Aligning the methylome sequences can, for example, be carried out using a variety of techniques known in the art. For example, a DNA sequence alignment tool (e.g., BSMAP (PMID: 19635165), Bismark (PMID: 21493656), gemBS (PMID: 30137223), Arioc (PMID: 29554207), BS-Seeker2 (PMID: 24206606), MethylCoder (PMID: 21724594) or BatMeth2 (PMID: 30669962)) can be used to align the reads. The reads may be aligned to a reference genome (for example hg38, hg19, hg18, hg17 or hg16).
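By way of illustration only, the counting of aligned cfDNA molecules per genomic region might be sketched as follows. The sketch assumes that alignment has already been carried out by one of the tools mentioned above and that each aligned molecule is available as a (chromosome, start, end) tuple with 0-based, end-exclusive coordinates. The function name `count_molecules_per_region`, the fixed 100 bp tiling and the rule that a molecule is counted for every region it overlaps are assumptions made for this example and are not prescribed by the method.

```python
from collections import Counter


def count_molecules_per_region(alignments, region_size=100):
    """Count aligned cfDNA molecules overlapping each fixed-size region.

    alignments: iterable of (chrom, start, end) tuples, 0-based and
    end-exclusive (an assumed input format for this sketch).
    Returns a Counter keyed by (chrom, region_start).
    """
    counts = Counter()
    for chrom, start, end in alignments:
        first_bin = start // region_size
        last_bin = (end - 1) // region_size
        # A molecule spanning a region boundary is counted in each
        # region it overlaps (an assumption of this sketch).
        for bin_index in range(first_bin, last_bin + 1):
            counts[(chrom, bin_index * region_size)] += 1
    return counts


# Hypothetical aligned molecules, for illustration only.
example_alignments = [("chr1", 10, 60), ("chr1", 90, 130), ("chr1", 150, 190)]
region_counts = count_molecules_per_region(example_alignments)
```

In this sketch the second molecule spans the boundary at position 100 and therefore contributes to both the region starting at 0 and the region starting at 100.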
In certain embodiments, the method comprises removing duplications of reads of the same DNA molecule (i.e. duplications of reads of the same cfDNA molecule). In this step, sequence reads having exactly the same sequence and start and end base pairs (i.e. the same unclipped alignment start and unclipped alignment end of the sequence) are removed, as they are likely to be duplicate sequence reads of the same sequence (i.e. duplicate reads of the same cfDNA molecule). For example, PCR duplications can be removed as part of the aligning step, such as using Picard tools v2.1.0 (http://broadinstitute.github.io/picard).
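The duplicate-removal criterion described above can be illustrated with a minimal sketch. It is not the Picard implementation; it assumes that each read is represented as a dictionary with hypothetical keys `chrom`, `start`, `end` (the unclipped alignment start and end) and `seq`, and it assumes that one representative read is retained for each group of duplicates.

```python
def remove_duplicate_reads(reads):
    """Keep one read per (chrom, start, end, seq) combination.

    reads: list of dicts with the hypothetical keys 'chrom', 'start',
    'end' and 'seq'. Reads sharing the same sequence and the same
    unclipped start and end are treated as duplicates of one molecule.
    """
    seen = set()
    unique_reads = []
    for read in reads:
        key = (read["chrom"], read["start"], read["end"], read["seq"])
        if key not in seen:
            seen.add(key)
            unique_reads.append(read)  # first occurrence is retained
    return unique_reads


# Hypothetical reads, for illustration only.
example_reads = [
    {"chrom": "chr1", "start": 100, "end": 104, "seq": "ACGT"},
    {"chrom": "chr1", "start": 100, "end": 104, "seq": "ACGT"},  # duplicate
    {"chrom": "chr1", "start": 100, "end": 104, "seq": "ATGT"},  # differs in sequence
]
deduplicated = remove_duplicate_reads(example_reads)
```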
Preferably, determining the respective number of characterised cfDNA molecules corresponding to a CpG locus or a genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) in the first sample comprises aligning the methylome sequences with a reference genome for the subject, for example for a human subject by aligning the methylome sequences with hg38, hg19, hg18, hg17 or hg16.
Preferably, determining the respective number of characterised cfDNA molecules corresponding to a CpG locus or a genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) in the one or more further samples comprises aligning the methylome sequences for each of the one or more further samples with a reference genome for the subject, for example for a human subject by aligning the methylome sequences with hg38, hg19, hg18, hg17 or hg16.
Preferably, the methylome sequences in the first sample and the methylome sequences in each of the one or more further samples are aligned to the same reference genome, for example the methylome sequences in the first sample and the methylome sequences in each of the one or more further samples are aligned to hg38; or the methylome sequences in the first sample and the methylome sequences in each of the one or more further samples are aligned to hg19; or the methylome sequences in the first sample and the methylome sequences in each of the one or more further samples are aligned to hg18; or the methylome sequences in the first sample and the methylome sequences in each of the one or more further samples are aligned to hg17; or the methylome sequences in the first sample and the methylome sequences in each of the one or more further samples are aligned to hg16.
In certain preferred embodiments, the cfDNA molecules in the first sample and the one or more further samples may correspond to a CpG locus or a genomic region of 2 to 5000 bp. More preferably, cfDNA molecules correspond to a CpG locus or a genomic region of 2 to 5000 bp, 2 to 4000 bp, 2 to 3000 bp, 2 to 2000 bp, 2 to 1000 bp, 2 to 800 bp, 2 to 600 bp, 2 to 500 bp, 2 to 400 bp, 2 to 300 bp, or 2 to 200 bp. In one very preferred embodiment, the cfDNA molecules correspond to a CpG locus or a genomic region of 2 to 200 bp, for example 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190 or 200 bp. In another preferred embodiment, the cfDNA molecules correspond to a CpG locus or a genomic region of 10 to 150 bp, 20 to 150 bp, 50 to 150 bp, 50 to 120 bp, 80 to 120 bp, or 90 to 110 bp. In one preferred embodiment, the cfDNA molecules correspond to a genomic region of 100 bp.
The method comprises determining the methylation ratio of each CpG locus or the average methylation ratio of each genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) in the first sample, and determining the methylation ratio of each CpG locus or the average methylation ratio of each genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) in each of the one or more further samples.
The average methylation ratio is the average of the methylation ratios of all the CpG loci within a given genomic region, and can be calculated by determining the sum of the methylation ratios of all CpG loci within a given genomic region and dividing the sum by the number of CpG loci within the given genomic region. If a genomic region has only 1 CpG locus, the average methylation ratio is the same as the methylation ratio for the single CpG locus in the genomic region.
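The calculation of the average methylation ratio can be illustrated with a short sketch. For the purposes of this example only, the methylation ratio of a CpG locus is assumed to be the number of characterised molecules showing the locus as methylated divided by the total number of characterised molecules covering the locus. The function names and the example counts are hypothetical.

```python
def methylation_ratio(methylated, unmethylated):
    """Methylation ratio of one CpG locus (assumed definition:
    methylated calls divided by all calls covering the locus)."""
    total = methylated + unmethylated
    if total == 0:
        raise ValueError("no molecules cover this CpG locus")
    return methylated / total


def average_methylation_ratio(cpg_counts):
    """Average of the per-CpG methylation ratios in one genomic region.

    cpg_counts: list of (methylated, unmethylated) count pairs, one
    pair per CpG locus within the region.
    """
    if not cpg_counts:
        raise ValueError("the region contains no CpG loci")
    ratios = [methylation_ratio(m, u) for m, u in cpg_counts]
    # Sum of the ratios divided by the number of CpG loci.
    return sum(ratios) / len(ratios)


# Hypothetical region with three CpG loci: ratios 0.8, 0.5 and 0.1.
region_average = average_methylation_ratio([(8, 2), (5, 5), (1, 9)])

# A region with a single CpG locus: the average equals that locus's ratio.
single_locus_average = average_methylation_ratio([(3, 1)])
```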
The method comprises repeating steps (i) to (iii) for one or more further samples comprising cfDNA each from subjects known to have the solid cancer. As such, the method comprises:
characterizing the methylome sequence of a plurality of cfDNA molecules in each of one or more further samples comprising cfDNA each from a subject known to have the solid cancer, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule;
determining the respective number of characterised cfDNA molecules corresponding to a CpG locus or a genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) in each of one or more further samples by aligning the methylome sequences;
determining the methylation ratio of each CpG locus or the average methylation ratio of each genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) in each of one or more further samples.
Thus, for the first sample and for each of the one or more further samples, the methylation ratio of each CpG locus or the average methylation ratio of each genomic region of 2 to 10,000 bp (preferably 2 to 200 bp) in the characterised cfDNA molecules are determined.
In certain embodiments, there is one further sample. In certain embodiments there is more than one further sample.
Preferably there are 2 or more further samples, 3 or more further samples, 4 or more further samples, 5 or more further samples, 6 or more further samples, 7 or more further samples, 8 or more further samples, 9 or more further samples, 10 or more further samples, 12 or more further samples, 15 or more further samples, 20 or more further samples, 25 or more further samples, 30 or more further samples, 40 or more further samples, 50 or more further samples, 60 or more further samples, 70 or more further samples, 80 or more further samples, 90 or more further samples, 100 or more further samples, 200 or more further samples, 300 or more further samples, 400 or more further samples, 500 or more further samples or 1000 or more further samples.
In one preferred embodiment there are 5 or more further samples, 10 or more further samples, 15 or more further samples, 20 or more further samples, 25 or more further samples, 50 or more further samples, 100 or more further samples, 200 or more further samples, 300 or more further samples, 400 or more further samples, 500 or more further samples or 1000 or more further samples.
In one preferred embodiment there are 10 or more further samples, 15 or more further samples, 20 or more further samples, 25 or more further samples, 50 or more further samples, 100 or more further samples, 200 or more further samples, 300 or more further samples, 400 or more further samples, 500 or more further samples or 1000 or more further samples.
The method comprises performing a variance analysis of all or a selection of the methylation ratios of the CpG loci and/or all or a selection of average methylation ratios of the genomic regions of the samples. A variance analysis results in groupings of CpG locus and/or genomic regions associated with features of the samples.
A cfDNA sample from a subject having a solid cancer is a heterogenous mixture of cfDNA from a primary source (for example, for a blood or plasma sample the primary source of cfDNA molecules are cfDNA from white blood cells, or in a urine sample the primary source of cfDNA molecules is a mixture of cfDNA from white blood cells, immune cell and urinary tract lining cells) and cfDNA from cancer cells. cfDNA in different samples (i.e. samples from different subjects and/or from the same subject at different time points) have differences in methylation levels. The inventors have surprisingly found that very useful methylome signatures can be found by performing a variance analysis of methylation ratios of CpG loci and/or average methylation ratios of genomic regions in multiple cfDNA samples from cancer patients. As not all DNA ends up as cfDNA, in view of the method of the invention determining variance in cfDNA samples, the signatures found using the method include CpG loci and/or genomic regions that are found in cfDNA samples. Additionally, the signatures found using this method can include both cancer-specific and tissue specific methylation. Thus, signatures found using the method of the invention will be especially useful and accurate when used in methods for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, prognostication and/or treatment of a solid cancer in a cfDNA sample, and especially in a sample of the same type as was used to find the signature.
A selection of the methylation ratios and/or a selection of average methylation ratios may be, for example, at least 95%, at least 90%, at least 80%, at least 70%, at least 60%, at least 50%, at least 40%, at least 30%, at least 20%, at least 10%, or at least 5% of the methylation ratios and/or average methylation ratios. A selection of the methylation ratios and/or a selection of average methylation ratios of the genomic regions of the samples may be, for example, less than 95%, less than 90%, less than 80%, less than 70%, less than 60%, less than 50%, less than 40%, less than 30%, less than 20%, less than 10%, or less than 5% of the methylation ratios and/or average methylation ratios.
A selection of the methylation ratios and/or a selection of average methylation ratios of the genomic regions of the samples may be a selection of the methylation ratios of the CpG loci and/or a selection of average methylation ratios of the genomic regions for one or more chromosomes. For example, the selection may be of the methylation ratios and/or average methylation ratios of the genomic regions for one or more of chromosome 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, X and/or Y.
A selection of the methylation ratios and/or a selection of average methylation ratios of the genomic regions of the samples may be a selection of the methylation ratios of the CpG loci and/or a selection of average methylation ratios of the genomic regions wherein all samples have at least 1 characterised cfDNA molecule covering each of the CpG loci and/or genomic regions. For example, each sample may have at least 10 (for example at least 15, 20, 25, 50, 100 or 1000) characterised cfDNA molecules covering each of the CpG loci and/or genomic regions.
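By way of illustration only, the coverage-based selection described above could be sketched as follows. The data layout (a per-sample dictionary of molecule counts), the function name `loci_covered_in_all_samples` and the example values are assumptions made for this sketch and are not part of the method as described.

```python
from typing import Dict, List

# Hypothetical layout: coverage[sample][locus] is the number of characterised
# cfDNA molecules covering that CpG locus (or genomic region) in that sample.
Coverage = Dict[str, Dict[str, int]]


def loci_covered_in_all_samples(coverage: Coverage, min_molecules: int = 10) -> List[str]:
    """Return the loci covered by at least `min_molecules` molecules in every sample."""
    samples = list(coverage.values())
    if not samples:
        return []
    # Only loci observed in every sample can pass the filter.
    shared = set(samples[0])
    for per_locus in samples[1:]:
        shared &= set(per_locus)
    return sorted(
        locus
        for locus in shared
        if all(per_locus[locus] >= min_molecules for per_locus in samples)
    )


# Illustrative (made-up) coverage counts for two samples and three loci.
coverage = {
    "sample_A": {"chr1:1000": 25, "chr1:2000": 4, "chr7:500": 12},
    "sample_B": {"chr1:1000": 18, "chr1:2000": 30, "chr7:500": 11},
}
kept = loci_covered_in_all_samples(coverage, min_molecules=10)
```

In this sketch the locus `chr1:2000` is excluded because one sample has only 4 covering molecules; lowering `min_molecules` to 1 would retain all three loci.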
In one preferred embodiment, the variance analysis performed is dimensionality reduction. For example, the variance analysis performed is a principal component analysis, a logistic regression analysis, a nearest neighbour analysis, a support vector machine, a neural network model, a NMF (non-negative matrix factorisation), an ICA (independent component analysis), FA (factor analysis), surrogate variable analysis (SVA), or independent surrogate variable analysis (ISVA).
In one preferred embodiment, the variance analysis performed is a principal component analysis.
In embodiments wherein the variance analysis performed is a principal component analysis, the CpG loci and/or genomic regions associated with features of the samples are the groupings of the different principal components, such as principal component 1, principal component 2, principal component 3, principal component 4, principal component 5, principal component 6, principal component 7, principal component 8 or a higher principal component.
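As a minimal sketch of a principal component analysis of methylation ratios, the following example computes principal component 1 by power iteration on the column-centred sample-by-locus matrix. The choice of power iteration, the function names and the example ratios are assumptions made for illustration; in practice an established statistical package would normally be used, and the method is not limited to this implementation.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]  # rows = samples, columns = CpG loci or genomic regions


def centre_columns(x: Matrix) -> Matrix:
    """Subtract each column's mean so that variance is measured around zero."""
    n = len(x)
    means = [sum(col) / n for col in zip(*x)]
    return [[v - m for v, m in zip(row, means)] for row in x]


def first_principal_component(x: Matrix, iterations: int = 200) -> Tuple[List[float], List[float]]:
    """Return (loading per locus, score per sample) for principal component 1.

    The overall sign of a principal component is arbitrary, so only relative
    signs and magnitudes are meaningful.
    """
    xc = centre_columns(x)
    n_loci = len(xc[0])
    # Deterministic, non-uniform starting vector (an arbitrary choice).
    v = [float(j + 1) for j in range(n_loci)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, v)) for row in xc]  # X v
        w = [sum(s * row[j] for s, row in zip(scores, xc)) for j in range(n_loci)]  # X^T (X v)
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:  # no variance in the data
            break
        v = [c / norm for c in w]
    scores = [sum(a * b for a, b in zip(row, v)) for row in xc]
    return v, scores


# Illustrative (made-up) methylation ratios: 4 samples x 3 loci.
ratios = [
    [0.10, 0.90, 0.50],
    [0.30, 0.70, 0.52],
    [0.60, 0.40, 0.49],
    [0.80, 0.20, 0.51],
]
loadings, scores = first_principal_component(ratios)
```

In this made-up example the first two loci vary strongly and in opposite directions across samples, so they receive large loadings of opposite sign on principal component 1, whereas the third locus, which barely varies, receives a loading close to zero.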
The variance analysis performed will group CpG loci and/or genomic regions associated with different features of the samples.
The variance analysis (for example the dimensionality reduction) is optionally followed by feature selection methods. An optional feature selection method can be implemented using the R or Python languages or an equivalent statistical application or software.
The method comprises selecting a group of CpG loci and/or genomic regions associated with a feature of the samples, i.e. selecting a group from all of the groups that the variance analysis results in. For example, in embodiments wherein the variance analysis performed is a principal component analysis, the selecting a group of CpG loci and/or genomic regions associated with a feature of the samples comprises selecting one of principal component 1, principal component 2, principal component 3, principal component 4, principal component 5, principal component 6, principal component 7, principal component 8 or a higher principal component.
A feature of the samples may be any feature of the samples, which are each from subjects known to have the solid cancer and which all comprise cfDNA. Examples of a feature of the samples that a group of CpG loci and/or genomic regions may be associated with include, but are not limited to, level of solid cancer fraction in the cfDNA, a type of solid cancer, a subtype of solid cancer, a prognosis, aggression of the solid cancer, and susceptibility of the solid cancer to a treatment.
In certain embodiments, the group selected is a group of CpG loci and/or genomic regions associated with a level of solid cancer fraction in the cfDNA.
In certain embodiments, the group selected is a group of CpG loci and/or genomic regions associated with a type of solid cancer, for example associated with metastatic cancer; associated with non-metastatic cancer; associated with a type of solid cancer that responds to a certain treatment (e.g. a hormonal agent); or associated with a solid cancer that does not respond to a certain treatment (e.g. a hormonal agent). For a solid cancer that is a prostate cancer, in certain embodiments the group selected is a group of CpG loci and/or genomic regions associated with a type of solid cancer, for example associated with castration resistant prostate cancer; associated with castration sensitive prostate cancer; associated with acinar adenocarcinoma prostate cancer; associated with ductal adenocarcinoma prostate cancer; associated with transitional cell cancer of the prostate; associated with squamous cell cancer of the prostate; or associated with small cell prostate cancer.
In certain embodiments, the group selected is a group of CpG loci and/or genomic regions associated with a subtype of solid cancer, for example associated with molecular characteristics of the cancer cells; and/or associated with genetic characteristics of the cancer cells. For a solid cancer that is a prostate cancer, in certain embodiments the group selected is a group of CpG loci and/or genomic regions associated with a subtype of the solid cancer, for example associated with AR copy number gain; and/or associated with an aggressive clinical course.
In certain embodiments, the group selected is a group of CpG loci and/or genomic regions associated with a prognosis, for example associated with a good prognosis (for example survival of the subject upon treatment is from at least 1 month to at least 90 years); or associated with a poor prognosis (for example survival of a subject that is expected to be from less than 5 years to less than 1 month).
In certain embodiments, the group selected is a group of CpG loci and/or genomic regions associated with aggression of the solid cancer.
In certain embodiments, the group selected is a group of CpG loci and/or genomic regions associated with susceptibility of the solid cancer to a treatment, for example associated with susceptibility of the solid cancer to a treatment with one or more of the following: a hormonal agent, a targeted agent, a biologic agent, an immunotherapy agent, a chemotherapy agent and a radionuclide treatment.
The method further comprises selecting CpG loci and/or genomic regions in the group to provide the cfDNA methylome signature. This may include selecting all of the CpG loci and/or genomic regions in the group or selecting a plurality of the CpG loci and/or genomic regions in the group, for example selecting at least 95%, at least 90%, at least 80%, at least 70%, at least 60%, at least 50%, at least 40%, at least 30%, at least 20%, at least 10%, or at least 5%, or for example selecting less than 95%, less than 90%, less than 80%, less than 70%, less than 60%, less than 50%, less than 40%, less than 30%, less than 20%, less than 10%, or less than 5%.
Selecting CpG loci and/or genomic regions in the group to provide the cfDNA methylome signature may comprise selecting at least 10,000 CpG loci and/or genomic regions, at least 8000 CpG loci and/or genomic regions, at least 5000 CpG loci and/or genomic regions, at least 4000 CpG loci and/or genomic regions, at least 3000 CpG loci and/or genomic regions, at least 2000 CpG loci and/or genomic regions, at least 1000 CpG loci and/or genomic regions, at least 800 CpG loci and/or genomic regions, at least 700 CpG loci and/or genomic regions, at least 600 CpG loci and/or genomic regions, at least 500 CpG loci and/or genomic regions, at least 400 CpG loci and/or genomic regions, at least 300 CpG loci and/or genomic regions, at least 250 CpG loci and/or genomic regions, at least 200 CpG loci and/or genomic regions, at least 150 CpG loci and/or genomic regions, at least 100 CpG loci and/or genomic regions, at least 50 CpG loci and/or genomic regions or at least 10 CpG loci and/or genomic regions.
Selecting CpG loci and/or genomic regions in the group to provide the cfDNA methylome signature may comprise selecting 10,000 or fewer CpG loci and/or genomic regions, 8000 or fewer CpG loci and/or genomic regions, 5000 or fewer CpG loci and/or genomic regions, 4000 or fewer CpG loci and/or genomic regions, 3000 or fewer CpG loci and/or genomic regions, 2000 or fewer CpG loci and/or genomic regions, 1000 or fewer CpG loci and/or genomic regions, 800 or fewer CpG loci and/or genomic regions, 700 or fewer CpG loci and/or genomic regions, 600 or fewer CpG loci and/or genomic regions, 500 or fewer CpG loci and/or genomic regions, 400 or fewer CpG loci and/or genomic regions, 300 or fewer CpG loci and/or genomic regions, 250 or fewer CpG loci and/or genomic regions, 200 or fewer CpG loci and/or genomic regions, 150 or fewer CpG loci and/or genomic regions, 100 or fewer CpG loci and/or genomic regions, 50 or fewer CpG loci and/or genomic regions or 10 or fewer CpG loci and/or genomic regions.
Selecting CpG loci and/or genomic regions in the group to provide the cfDNA methylome signature may comprise selecting 10,000 CpG loci and/or genomic regions, 8000 CpG loci and/or genomic regions, 5000 CpG loci and/or genomic regions, 4000 CpG loci and/or genomic regions, 3000 CpG loci and/or genomic regions, 2000 CpG loci and/or genomic regions, 1000 CpG loci and/or genomic regions, 800 CpG loci and/or genomic regions, 700 CpG loci and/or genomic regions, 600 CpG loci and/or genomic regions, 500 CpG loci and/or genomic regions, 400 CpG loci and/or genomic regions, 300 CpG loci and/or genomic regions, 250 CpG loci and/or genomic regions, 200 CpG loci and/or genomic regions, 150 CpG loci and/or genomic regions, 100 CpG loci and/or genomic regions, 50 CpG loci and/or genomic regions or 10 CpG loci and/or genomic regions.
Preferably, the method comprises selecting at least 5 CpG loci (for example at least 8, at least 10, at least 12, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 75, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000 or at least 10,000) and/or at least 5 genomic regions (for example at least 8, at least 10, at least 12, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 75, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000 or at least 10,000) in the group to provide a cfDNA methylome signature.
In one embodiment, the method comprises selecting at least 5 CpG loci in the group to provide a cfDNA methylome signature, for example at least 8, at least 10, at least 12, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 75, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000 or at least 10,000 CpG loci. In one preferred embodiment, the method comprises selecting at least 10 CpG loci, at least 100 CpG loci, at least 250 CpG loci, or at least 500 CpG loci in the group to provide a cfDNA methylome signature. For example, the method comprises selecting 10 CpG loci, 100 CpG loci, 250 CpG loci or 500 CpG loci in the group to provide a cfDNA methylome signature.
In another embodiment, the method comprises selecting at least 5 genomic regions in the group to provide a cfDNA methylome signature, for example at least 8, at least 10, at least 12, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 75, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000 or at least 10,000 genomic regions. In one preferred embodiment, the method comprises selecting at least 10 genomic regions, at least 100 genomic regions, at least 250 genomic regions, or at least 500 genomic regions in the group to provide a cfDNA methylome signature. For example, the method comprises selecting 10 genomic regions, 100 genomic regions, 250 genomic regions or 500 genomic regions in the group to provide a cfDNA methylome signature.
In one preferred embodiment, selecting the CpG loci and/or genomic regions in the group to provide the cfDNA methylome signature comprises selecting the CpG loci and/or genomic regions in the group that have strong (for example high) association with the feature. The CpG loci and/or genomic regions with strong (for example high) association with the feature may be CpG loci and/or genomic regions that are within the top 10,000 CpG loci and/or genomic regions most correlated with the feature in the group, for example within the top 8000, 6000, 5000, 4000, 3000, 2000, 1000, 800, 500, 400, 300, 250, 200, 150, 100, 50 or 10 CpG loci and/or genomic regions most correlated with the feature in the group.
In one preferred embodiment, CpG loci and/or genomic regions correlated with the feature in the group that have strong (for example high) association with the feature are CpG loci and/or genomic regions that are within the top 1000 CpG loci and/or genomic regions most correlated with the feature in the group. More preferably, CpG loci and/or genomic regions correlated with the feature in the group that have strong (for example high) association with the feature are CpG loci and/or genomic regions that are within the top 800 CpG loci and/or genomic regions most correlated with the feature in the group; or even more preferably CpG loci and/or genomic regions most correlated with the feature in the group that have strong (for example high) association with the feature may be CpG loci and/or genomic regions that are within the top 500, 400, 300, 250, 200, 150, 100, 50 or 10 CpG loci and/or genomic regions most correlated with the feature in the group.
In one embodiment wherein the level of methylation variance is determined using a principal component analysis, selecting the CpG loci and/or genomic regions in the group comprises selecting a plurality of CpG loci and/or genomic regions of principal component 1, 2, 3, 4, 5, 6, 7 or 8, for example selecting a plurality of CpG loci and/or genomic regions of principal component 1, 2, 3, 4, 5, 6, 7 or 8 that have strong (for example high) association with the feature of principal component 1, 2, 3, 4, 5, 6, 7 or 8, for example selecting CpG loci and/or genomic regions that are within the top 10,000, 5000, 4000, 3000, 2000, 1000, 500, 400, 300, 250, 200, 150, 100, 50 or 10 CpG loci and/or genomic regions of principal component 1, 2, 3, 4, 5, 6, 7 or 8 most correlated with the feature of principal component 1, 2, 3, 4, 5, 6, 7 or 8.
In one embodiment wherein the level of methylation variance is determined using a principal component analysis, selecting the CpG loci and/or genomic regions in the group that have strong (for example high) association with the feature comprises selecting a plurality of CpG loci and/or genomic regions of principal component 1 correlated with the feature of principal component 1, for example selecting CpG loci and/or genomic regions that are within the top 10,000, 5000, 4000, 3000, 2000, 1000, 500, 400, 300, 250, 200, 150, 100, 50 or 10 CpG loci and/or genomic regions of principal component 1 most correlated with the feature of principal component 1.
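The selection of the most correlated CpG loci and/or genomic regions might be sketched as below. Ranking by the absolute Pearson correlation between each locus's methylation ratios and the per-sample scores of a principal component is an assumption made for this example, as are the function names and the example values; other measures of association could equally be used.

```python
import math
from typing import Dict, List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def top_correlated_loci(
    ratios_by_locus: Dict[str, Sequence[float]],
    component_scores: Sequence[float],
    top_n: int,
) -> List[str]:
    """Return the `top_n` loci whose ratios correlate most strongly
    (in absolute value) with the per-sample component scores."""
    ranked = sorted(
        ratios_by_locus,
        key=lambda locus: abs(pearson(ratios_by_locus[locus], component_scores)),
        reverse=True,
    )
    return ranked[:top_n]


# Illustrative (made-up) ratios for three loci across four samples,
# and made-up principal component 1 scores for the same four samples.
ratios_by_locus = {
    "chr1:1000": [0.10, 0.30, 0.60, 0.80],
    "chr2:2000": [0.90, 0.70, 0.40, 0.20],
    "chr3:3000": [0.50, 0.52, 0.49, 0.51],
}
pc1_scores = [-0.5, -0.2, 0.2, 0.5]
signature = top_correlated_loci(ratios_by_locus, pc1_scores, top_n=2)
```

Here the two loci that track the component scores (one positively, one negatively) are retained, and the locus with essentially constant methylation is dropped.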
The method for determining a solid cancer cfDNA methylome signature may further comprise comparing the methylation state of each of the selected CpG loci and/or genomic regions in the first sample and in the one or more further samples with the methylation state of the same CpG locus and/or genomic region in one or more of the following:
A sample of non-cancerous tissue of origin of the solid cancer, sample of the solid cancer, cell-line of the solid cancer, and/or sample of white blood cells may come from the same subject as the first sample and/or the one or more further samples comprising cfDNA; and/or a sample of non-cancerous tissue of origin of the solid cancer, sample of the solid cancer, cell-line of the solid cancer, and/or sample of white blood cells may come from a different subject from the first sample and/or the one or more further samples comprising cfDNA; and/or a sample of non-cancerous tissue of origin of the solid cancer, sample of the solid cancer, cell-line of the solid cancer, and/or sample of white blood cells may come from a different subject from the first sample and each of the one or more further samples comprising cfDNA.
In embodiments where the sample is a sample of the solid cancer, a sample of non-cancerous tissue of origin of the solid cancer and/or a sample of white blood cells, preferably the sample is from the same subject as the subject of the first sample or a subject of the one or more further samples. Additionally, or alternatively, samples of the solid cancer, samples of non-cancerous tissue of origin of the solid cancer, and/or samples of white blood cells from one or more different subjects to the subject of the first sample and the subjects of the one or more further samples are compared.
If a sample is from a different subject to the subject of the first sample and/or the subjects of the one or more further samples, preferably the different sample is from a subject that is age-matched with the subject of the first sample and/or the subjects of the one or more further samples.
In one preferred embodiment, the method for determining a solid cancer cfDNA methylome signature further comprises comparing the methylation state of each of the selected CpG loci and/or genomic regions in the first sample and in the one or more further samples with the methylation state of the same CpG locus and/or genomic region in one or more of the following:
a sample of white blood cells from the subject; and/or
a sample of cfDNA from a healthy subject.
In one embodiment, the method for determining a solid cancer cfDNA methylome signature further comprises comparing the methylation state of each of the selected CpG loci and/or genomic regions in the first sample and in the one or more further samples with the methylation state of the same CpG locus and/or genomic region in one or more of the following:
a sample of white blood cells from the subject;
a sample of the solid cancer from the subject; and/or
a sample of non-cancerous tissue of origin of the solid cancer from the subject.
In one embodiment, the method for determining a solid cancer cfDNA methylome signature further comprises comparing the methylation state of each of the selected CpG loci and/or genomic regions in the first sample and in the one or more further samples with the methylation state of the same CpG locus and/or genomic region in one or more of the following:
a sample of cfDNA from a healthy subject (for example an age-matched healthy subject); and/or
a sample of non-cancerous tissue of origin of the solid cancer from a healthy subject (for example an age-matched healthy subject).
In one embodiment, the method for determining a solid cancer cfDNA methylome signature further comprises comparing the methylation state of each of the selected CpG loci and/or genomic regions in the first sample and in the one or more further samples with the methylation state of the same CpG locus and/or genomic region in one or more of the following:
a sample of the solid cancer from multiple different subjects and optionally a sample of the solid cancer from the subject;
a cell-line of the solid cancer from multiple different subjects; and/or
a sample of cfDNA from a subject known to have the solid cancer (for example an age-matched subject known to have the solid cancer, and for example wherein the level of cancer fraction in the cfDNA sample from the different subject is known and/or wherein the sample is known to comprise cfDNA derived from a prostate cancer subtype).
In one embodiment, the method for determining a solid cancer cfDNA methylome signature further comprises comparing the methylation state of each of the selected CpG loci and/or genomic regions in the first sample and in the one or more further samples with the methylation state of the same CpG locus and/or genomic region in one or more of the following:
a sample of the solid cancer from multiple different subjects and optionally a sample of the solid cancer from the subject;
cell-lines of the solid cancer from multiple different subjects;
a sample of white blood cells from the subject;
samples of white blood cells from multiple different subjects;
samples of non-cancerous tissue of origin of the solid cancer from multiple different subjects;
a sample of non-cancerous tissue of origin of the solid cancer from the subject; and/or
a sample of cfDNA from a subject known to have the solid cancer (for example an age-matched subject known to have the solid cancer, and for example wherein the level of cancer fraction in the cfDNA sample from the different subject is known and/or wherein the sample is known to comprise cfDNA derived from a prostate cancer subtype).
In one embodiment, the method for determining a solid cancer cfDNA methylome signature further comprises comparing the methylation state of each of the selected CpG loci and/or genomic regions in the first sample and in the one or more further samples with the methylation state of the same CpG loci and/or genomic regions in a sample of cfDNA from a subject known to have the solid cancer (for example an age-matched subject known to have the solid cancer, and for example wherein the level of cancer fraction in the cfDNA sample from the different subject is known and/or wherein the sample is known to comprise cfDNA derived from a prostate cancer subtype), and preferably multiple cfDNA samples (for example at least 2, 3, 4, 5, 10, 20, 40, 50, 100, 200 or 500 samples) each from a different subject known to have the solid cancer (for example each from a different age-matched subject known to have the solid cancer, and for example wherein the level of cancer fraction in each cfDNA sample from the different subjects is known and/or wherein each sample is known to comprise cfDNA derived from a prostate cancer subtype).
The method for determining a solid cancer cfDNA methylome signature may further comprise determining a reference value for each of the selected CpG loci and/or genomic regions. In certain embodiments, the reference value is based on the methylation level (e.g. the methylation ratio for a CpG locus or the average methylation ratio for a genomic region) of the same CpG locus and/or genomic region in a cfDNA sample from one or more healthy subjects. In certain embodiments, the reference value is based on the methylation level (e.g. the methylation ratio for a CpG locus or the average methylation ratio for a genomic region) of the same CpG locus and/or genomic region in one or more white blood cell samples. In certain embodiments, the reference value is based on the methylation level (e.g. the methylation ratio for a CpG locus or the average methylation ratio for a genomic region) of the same CpG locus and/or genomic region in a sample of tissue from one or more healthy subjects. In certain embodiments, the reference value is based on the methylation level (e.g. the methylation ratio for a CpG locus or the average methylation ratio for a genomic region) of the same CpG locus and/or genomic region in one or more samples of solid cancer tumour and/or one or more solid cancer cell lines. In certain embodiments, the reference value is based on the methylation level (e.g. the methylation ratio for a CpG locus or the average methylation ratio for a genomic region) of the same CpG locus and/or genomic region in a sample of cfDNA from a subject known to have the solid cancer (for example an age-matched subject known to have the solid cancer, and for example wherein the level of cancer fraction in the cfDNA sample from the different subject is known and/or wherein the sample is known to comprise cfDNA derived from a prostate cancer subtype).
In one embodiment, the reference value is based on the methylation level (e.g. the methylation ratio for a CpG locus or the average methylation ratio for a genomic region) of the same CpG locus and/or genomic region in a cfDNA sample from one or more healthy subjects. In another embodiment, the reference value is based on the methylation level (e.g. the methylation ratio for a CpG locus or the average methylation ratio for a genomic region) of the same CpG locus and/or genomic region in one or more white blood cell samples.
In certain embodiments, a reference value for each of the selected CpG loci and/or genomic regions is the average methylation ratio of the same CpG locus and/or genomic region in or covered by:
a cfDNA sample from a healthy subject, for example a healthy age-matched subject;
a tissue sample from a healthy subject, for example a prostate tissue sample from a healthy subject;
a cancer biopsy sample from a cancer patient, for example a prostate cancer biopsy sample from a prostate cancer patient;
a cancer cell line sample, for example a prostate cancer cell line sample from a prostate cancer cell line;
a sample of white blood cells from a subject, for example the subject or a healthy subject;
a characterized methylome sequence of a white blood cell;
a characterized methylome sequence of a prostate cancer cell line;
a characterized methylome sequence of a cancerous prostate cell;
a characterized methylome sequence of a non-cancerous prostate cell; or
a sample of cfDNA from a subject known to have the solid cancer (for example an age-matched subject known to have the solid cancer, and for example wherein the level of cancer fraction in the cfDNA sample from the different subject is known and/or wherein the sample is known to comprise cfDNA derived from a prostate cancer subtype).
The method for determining a solid cancer cfDNA methylome signature may further comprise determining two or more (for example 3 or more, 4 or more, 5 or more, 10 or more, 15 or more, or 20 or more) reference values for each of the selected CpG loci and/or genomic regions (for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 100, 200, 500 or 1000 reference values for each of the selected CpG loci and/or genomic regions). The two or more reference values may be selected from the average methylation ratio of the same CpG locus and/or genomic region in or covered by one or more of the following:
a cfDNA sample from a healthy subject, for example a healthy age-matched subject;
a tissue sample from a healthy subject, for example a prostate tissue sample from a healthy subject;
a cancer biopsy sample from a cancer patient, for example a prostate cancer biopsy sample from a prostate cancer patient;
a cancer cell line sample, for example a prostate cancer cell line sample from a prostate cancer cell line;
a sample of white blood cells from a subject, for example the subject or a healthy subject;
a characterized methylome sequence of a white blood cell;
a characterized methylome sequence of a prostate cancer cell line;
a characterized methylome sequence of a cancerous prostate cell;
a characterized methylome sequence of a non-cancerous prostate cell; or
a sample of cfDNA from a subject known to have the solid cancer (for example an age-matched subject known to have the solid cancer, and for example wherein the level of cancer fraction in the cfDNA sample from the different subject is known and/or wherein the sample is known to comprise cfDNA derived from a prostate cancer subtype).
The method for determining a solid cancer cfDNA methylome signature may further comprise establishing an algorithm for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, prognostication and/or treatment of the solid cancer using the cfDNA methylome signature.
The algorithm may be established using, for example, a random forest classifier, a regression analysis algorithm (for example a least absolute shrinkage and selection operator (LASSO) algorithm), a Naïve Bayes classifier, a support vector machine, a perceptron learning algorithm, a decision tree, a gradient boosting tree, a neural network or a k-nearest neighbour algorithm. The algorithm can be implemented by one of ordinary skill in the art using R, Python, or an equivalent statistical application or software (such as STATA).
In certain embodiments, the algorithm is for determining the presence of solid cancer in a further sample comprising DNA using the cfDNA methylome signature.
In certain embodiments, the algorithm is for determining the level of a solid cancer in a further sample comprising DNA using the cfDNA methylome signature, for example the level of solid cancer tumour fraction.
In certain embodiments, the algorithm is for determining a subtype of solid cancer in a further sample comprising DNA using the cfDNA methylome signature.
In preferred embodiments the algorithm comprises comparing the methylation status, the methylation ratio, or the average methylation ratio, for some or all of the selected CpG loci and/or genomic regions of the cfDNA methylome signature to the methylation status, the methylation ratio, or the average methylation ratio for some or all of the selected CpG loci and/or genomic regions in a further sample comprising DNA. Additionally, or alternatively, the algorithm comprises comparing the methylation status, the methylation ratio, or the average methylation ratio, for some or all of the selected CpG loci and/or genomic regions of the cfDNA methylome signature to a reference value for each CpG locus and/or genomic region.
The invention will now be illustrated in a non-limiting way by reference to the following Example.
Plasma samples were collected within 30 days of treatment initiation and at progression in two biomarker studies, separately approved by the Istituto Scientifico Romagnolo per lo Studio e la Cura dei Tumori (IRST), Meldola, Italy (REC 2192/2013) and Royal Marsden, London, UK (REC 04/Q0801/6) and in the PREMIERE trial (EudraCT: 2014-003192-28, NCT02288936) that was sponsored and conducted by the Spanish Genito-Urinary oncology Group (SOGUG) (
These cohorts were described in Romanel et al. (Romanel, A., et al. Sci Transl Med 7, 312re310 (2015)) and Conteduca et al. (Conteduca, V. et al., Ann Oncol 28, 1508-1516, (2017)). Briefly, patients needed to have histologically or biochemically confirmed prostate adenocarcinoma and be starting abiraterone or enzalutamide treatment for progressive mCRPC. Patients were required to receive abiraterone or enzalutamide until disease progression as defined by at least two of the following: a rise in PSA, worsening symptoms, or radiological progression defined as progression in soft-tissue lesions measured by computed tomography (CT) imaging according to modified Response Evaluation Criteria in Solid Tumors or progression on bone scanning according to criteria adapted from the Prostate Cancer Clinical Trials Working Group 2 guidelines. Patients with sufficient vials to allow both genome and methylome assessment were prioritised. Metastases were obtained at rapid warm autopsy in the Peter MacCallum warm autopsy program CASCADE (Cancer tissue Collection After Death) described by Alsop et al. (Alsop, K. et al. Nat Biotechnol 34, 1010-1014 (2016)) (HREC 15/98,
Circulating DNA (10-25 ng) was extracted from plasma using the QIAamp Circulating Nucleic Acid kit (Qiagen™) and quantified using the Quant-iT high-sensitivity Picogreen double-stranded DNA Assay Kit (Invitrogen by Thermo Fisher™). Germline DNA was extracted from white blood cells using the QIAamp DNA kit (Qiagen™). Genomic NGS was performed as described previously (Romanel, A. et al. Sci Transl Med 7, 312re310 (2015)). For methylation assessment, raw plasma DNA was bisulfite treated using the ZYMO™ Gold Kit as per the manufacturer's protocol. Swift Bioscience™ Methyl-Seq was used to generate libraries. CpGs were selected from prior data generated using Illumina Infinium HumanMethylation450k microarray (Roche Nimblegen™ targeted capture kit, Epi CpGiant). Probes were designed to hybridize to strands of fully methylated, partially methylated and fully unmethylated derivatives of the target as described below. Libraries were quantified by KAPA library quantification kit (Roche™) before pooling and sequencing on an Illumina™ HiSeq 2500 using paired-end 100-base pair reads. Sequencing matrices for targeted methylome and LP-WGBS are shown in
Data were processed using FastQC to assess quality, and read-through adapters were trimmed using Trimmomatic v0.36. Since DNA was bisulfite treated, reads were aligned based on three nucleotides (thymine (T), adenine (A), guanine (G)) to the human genome (hg)19 using BSMAP v2.90 (Xi, Y. & Li, W., BMC Bioinformatics 10, 232 (2009); Bolger, A. M., et al, Bioinformatics 30, 2114-2120 (2014)). The duplicated reads were removed with Picard tools v2.1.0 (http://broadinstitute.github.io/picard), and unaligned reads were clipped (hard-clipped) using bamUtil 1.0.13 (Jun, G et al, Genome Res 25, 918-925 (2015)).
The CpG methylation ratio of each locus was calculated using formula (I), which takes cytosine (C) and thymine (T) counts from all reads covering each CpG locus.
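By way of illustration only, the per-locus calculation may be sketched as follows. Formula (I) is not reproduced in this passage, so the sketch assumes the conventional bisulfite-sequencing definition in which the methylation ratio is C / (C + T). The function name `methylation_ratio` and the `min_coverage` parameter are hypothetical; the default of 10 reads mirrors the minimum coverage stated for the capture panel.

```python
from typing import Optional


def methylation_ratio(c_count: int, t_count: int,
                      min_coverage: int = 10) -> Optional[float]:
    """Return C / (C + T) for one CpG locus, or None if coverage is too low.

    Assumption: formula (I) is the usual ratio of reads reporting C
    (methylated) to all reads reporting C or T at the locus.
    """
    coverage = c_count + t_count
    if coverage < min_coverage:
        return None  # locus excluded from further analysis
    return c_count / coverage


print(methylation_ratio(18, 2))  # 0.9
print(methylation_ratio(3, 4))   # None (only 7 reads cover the locus)
```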
From all sites included in the predesigned capture panel (Roche Nimblegen SeqCap EpiGiant), only sites with a minimum coverage of 10 reads were considered for further analysis of CpG (
Adjacent CpG methylation levels are usually highly related, and previous studies have demonstrated high sensitivity of identifying tissue-specific methylation markers using sliding window approaches (Lehmann-Werman, R. et al. Proc Natl Acad Sci USA 113, E1826-1834 (2016); Guo, S. et al. Nat Genet 49, 635-642 (2017); Sun, K. et al. Proc Natl Acad Sci USA 112, E5503-5512 (2015)). Here adjacent CpG sites were combined into methylation segments of fixed length (the term “methylation segment” and the term “segment” as used in the examples section may also be referred to as a genomic region), and the average methylation ratio across all CpGs within the segment was calculated and used to represent the methylation ratio of the segment using methylKit R package v1.6.2 (Akalin, A. et al. Genome Biol 13, R87 (2012)). Initially, 100 bp segments with a sliding window of 50 bp were used, which generated >1.47 million windows across all CpGs in the target panel. Principal component analysis (PCA) was applied using the FactoMineR v1.41 package.
To eliminate potential biases due to the selection of segmentation length, segmentation length parameters were optimised. To do so, segments of 10 bp, 100 bp, 1000 bp and 10,000 bp were tested with sliding windows of 5 bp, 50 bp, 500 bp and 5000 bp, respectively. It was found that the smaller the window size, the more data had to be dropped when combining plasma samples due to variable inputs and sequencing coverage (
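The segmentation step may be sketched, purely for illustration, as a sliding window over per-CpG methylation ratios. This is not the methylKit implementation used in the Example; the function `segment_means`, its arguments and the half-open window convention are assumptions made for the sketch.

```python
from statistics import mean
from typing import Dict, List, Tuple


def segment_means(cpg_ratios: Dict[int, float], region_start: int,
                  region_end: int, window: int = 100,
                  step: int = 50) -> List[Tuple[int, int, float]]:
    """Average per-CpG ratios in overlapping windows [start, start + window)."""
    segments = []
    start = region_start
    while start < region_end:
        end = min(start + window, region_end)
        values = [r for pos, r in cpg_ratios.items() if start <= pos < end]
        if values:  # windows without any covered CpG are skipped
            segments.append((start, end, mean(values)))
        start += step
    return segments


# Three CpGs at hypothetical positions 10, 60 and 120 of a 200 bp region.
cpgs = {10: 1.0, 60: 0.5, 120: 0.0}
print(segment_means(cpgs, 0, 200))
# [(0, 100, 0.75), (50, 150, 0.25), (100, 200, 0.0)]
```

Because the step is half the window length, each CpG contributes to up to two segments, which is why neighbouring segments carry correlated values.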
Thus, to preserve more detailed methylation information, and to guarantee successful execution in a reasonable amount of time, the setting of 100 bp segments with 50 bp sliding window was applied for the rest of the analysis. However, other segment sizes and windows could have been used.
The methylation segments for which methylation ratios were available in all baseline samples (n=19) and for which the standard deviation values were in the upper two quartiles were subjected to principal component analysis (FactoMineR R package v1.41, as described in Lê, S., Josse, J. & Husson, F. FactoMineR: An R Package for Multivariate Analysis. J Stat Softw 25, 1-18, doi:10.18637/jss.v025.i01 (2008)).
More specifically, unscaled PCA using FactoMineR (http://factominer.free.fr) (Lê, S., Josse, J. & Husson, F. FactoMineR: An R Package for Multivariate Analysis. J Stat Softw 25, 1-18 (2008)) was applied. The PCA model yields the eigenvectors, the eigenvalues and a correlation matrix comprising the correlation coefficient of each segment with each principal component. The distribution of the top-K highly correlated segments was plotted based on the correlation matrix returned by PCA, and these segments were highly representative of each eigenvector (e.g., principal component 1, or PC1). To identify the optimal value K of highly correlated segments, multiple K values equal to 10, 100, 1,000, and 10,000 were tested and intra-sample variance calculated, and the correlation between the median of the average methylation ratios with genomically-determined tumour fraction was determined (
Significant principal components were determined using a permutation test as implemented in the jackstraw R package (v1.2) (https://CRAN.R-project.org/package=jackstraw). The projection of all the samples based on the PCA eigenvectors was based on the average methylation ratio of each segment (i.e. average methylation ratio of all the CpG loci within each region) used in the initial PCA for all the samples. Missing values were imputed based on the PCA method as implemented in the missMDA R package (v1.13), as described in Josse, J. & Husson, F. missMDA: A Package for Handling Missing Values in Multivariate Data Analysis. J Stat Softw 70, 1-31, doi:10.18637/jss.v070.i01 (2016).
Genomically-determined tumour fraction was determined from targeted next-generation sequencing (NGS) using CLONET as described in Romanel et al. (2015) and Prandi et al. (Prandi, D. et al. Genome Biol 15, 439 (2014)). On high-coverage targeted methylation NGS, PC1 values were calculated as described above, and the median of PC1 values extracted from healthy volunteers was set as 0%, while the median of PC1 values derived from LNCaP samples was set as 100% tumour purity. The tumour fractions of all the plasma samples were obtained with interpolation using PC1 projected values. For tumour fraction estimation based on low-pass whole genome sequencing (LP-WGS) on bisulfite-treated or non-treated plasma DNA, ichorCNA (Adalsteinsson, V. A. et al. Nat Commun 8, 1324 (2017)) was used as described below. For LP-WGBS PC1 projected values were used.
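The segment selection and an unscaled PCA may be sketched as follows. The sketch is not FactoMineR: it substitutes a plain power iteration to obtain only the first principal component, and it reads "upper two quartiles" as a standard deviation at or above the median. The function names, the toy matrix and the use of `None` for missing values are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, median, stdev


def select_segments(matrix):
    """Indices of segments observed in every sample with SD >= median SD."""
    n_segments = len(matrix[0])
    complete = [j for j in range(n_segments)
                if all(row[j] is not None for row in matrix)]
    sds = {j: stdev(row[j] for row in matrix) for j in complete}
    cutoff = median(sds.values())
    return [j for j in complete if sds[j] >= cutoff]


def first_component(matrix, columns, n_iter=100, seed=0):
    """Unscaled PCA (centred, not variance-scaled), first component only."""
    col_means = [mean(row[j] for row in matrix) for j in columns]
    centred = [[row[j] - m for j, m in zip(columns, col_means)]
               for row in matrix]
    rng = random.Random(seed)
    v = [rng.random() for _ in columns]  # arbitrary starting loadings
    for _ in range(n_iter):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centred]
        v = [sum(row[k] * s for row, s in zip(centred, scores))
             for k in range(len(columns))]
        norm = sqrt(sum(w * w for w in v))
        if norm == 0.0:
            raise ValueError("no variance left after centring")
        v = [w / norm for w in v]
    scores = [sum(x * w for x, w in zip(row, v)) for row in centred]
    return v, scores


# Rows are samples, columns are segments (toy values, not study data).
matrix = [
    [0.50, 0.9, 0.10, 0.90],
    [0.52, None, 0.30, 0.70],
    [0.49, 0.8, 0.60, 0.40],
    [0.51, 0.7, 0.90, 0.10],
]
selected = select_segments(matrix)   # segment 1 is incomplete, 0 is flat
loadings, pc1_scores = first_component(matrix, selected)
print(selected)  # [2, 3]
```

The sign of a principal component is arbitrary, so only the ordering of samples along PC1 and the magnitude of the loadings are meaningful.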
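The interpolation of tumour fraction from PC1 may be sketched as a linear rescaling between the two anchors named above. The function name `tumour_fraction` and the clipping of results to the range 0 to 1 are assumptions of the sketch and are not stated in the Example; the numerical values are invented.

```python
from statistics import median


def tumour_fraction(pc1, healthy_pc1, lncap_pc1, clip=True):
    """Linearly interpolate a PC1 value between the healthy and LNCaP medians."""
    zero = median(healthy_pc1)   # anchor for 0% tumour
    full = median(lncap_pc1)     # anchor for 100% tumour
    fraction = (pc1 - zero) / (full - zero)
    if clip:  # assumption: estimates outside [0, 1] are truncated
        fraction = min(1.0, max(0.0, fraction))
    return fraction


healthy = [-2.0, -1.0, 0.0]   # hypothetical PC1 values, healthy plasma
lncap = [9.0, 9.0, 9.0]       # hypothetical PC1 values, LNCaP replicates
print(tumour_fraction(4.0, healthy, lncap))  # 0.5
```

The same function works whichever direction PC1 runs, because the denominator carries the sign of the difference between the two anchors.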
Analysis of LP-WGS by ichorCNA
LP-WGS on both bisulfite-treated and untreated plasma DNA was performed with a target 1× coverage. For each sample, reads from LP-WGS on untreated plasma DNA were aligned to hg19 using BWA-MEM version 0.7.12-r1039 and de-duplicated using Picard tools v2.1.0. The human genome was then divided into non-overlapping bins of 1 million base pairs, and, for each sample, the de-duplicated reads were counted per bin using HMMcopy (http://compbio.bccrc.ca/software/hmmcopy/) (Ha, G. et al. Genome Res 22, 1995-2007 (2012)). Next, ichorCNA (https://github.com/broadinstitute/ichorCNA) was applied to estimate the tumour content of each sample (Adalsteinsson, V. A. et al. Nat Commun 8, 1324 (2017)). The algorithm first removed bins in the centromere regions with a flanking region of 100,000 base pairs. For all the remaining bins, read counts were corrected for GC content and mappability. The normalised read counts were then fed into the Hidden Markov model (HMM), which is a probabilistic model assigning each bin to one possible state: hemizygous deletion (HETD, 1 copy), copy neutral (NEUT, 2 copies), copy gain (GAIN, 3 copies), amplification (AMP, 4 copies), or high-level amplification (HLAMP, 5 or more copies). Based on the copy number profile, the model estimated a ploidy and tumour content for every sample. Finally, the algorithm was initiated with ploidy values of 2 and 3 and normal fractions (1 minus tumour fraction) of 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 0.95. The solution with maximum likelihood among all of these initial combinations was automatically assigned. The CNA status was estimated based on the log R values of each 1 Mbp region obtained by the ichorCNA analysis with a fixed threshold of 0.5 (GAIN: log R≥0.5, LOSS: log R≤−0.5).
Reads from LP-WGBS were processed as high coverage NGS. To calculate PC1 values derived from LP-WGBS, the default segmentation length of 100 bp was used and the average methylation ratio of each segment (i.e. average methylation ratio of all the CpG loci within each region) was calculated based on formula (I) to determine the methylation ratio of each locus, and then the mean of all CpG loci in a segment was calculated to arrive at the average methylation ratio for a segment. To maximize the information obtained from the data, missing methylation values were imputed from the higher coverage bisulfite data using the regularised iterative PCA algorithm (Josse, J. & Husson, F. missMDA: A Package for Handling Missing Values in Multivariate Data Analysis. J Stat Softw 70, 1-31 (2016)) (missMDA R package (v1.13)), and the samples were projected on the PCA model as described above. The regularisation process with random initialisation can also circumvent over-fitting, which might otherwise reduce the generalisability of the findings.
The microarray processed data were obtained from the Gene Expression Omnibus (Edgar, R., et al, Nucleic Acids Res 30, 207-210 (2002)) repository (GSE84043). From the dataset, probes overlapping with PC1 segments were selected. The average methylation ratio of each segment was obtained considering the median of the β values of the overlapping probes. The tumour fraction estimates by different methods were obtained from the published sample information (Fraser, M. et al, Nature 541, 359-364 (2017)).
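For illustration, the grid of initial states and the fixed-threshold CNA call may be sketched as below. This does not call ichorCNA itself. The sketch assumes the loss threshold is the symmetric negative value (log R ≤ −0.5), and it borrows the label NEUTRAL for bins between the two thresholds; the names `INITIAL_STATES` and `cna_status` are hypothetical.

```python
from itertools import product

PLOIDIES = (2, 3)
NORMAL_FRACTIONS = (0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95)

# Every (ploidy, normal fraction) pair used to initialise the model; the
# solution with the maximum likelihood among them is the one retained.
INITIAL_STATES = list(product(PLOIDIES, NORMAL_FRACTIONS))


def cna_status(log_r: float, threshold: float = 0.5) -> str:
    """Call one 1 Mbp bin from its log R value using a fixed threshold."""
    if log_r >= threshold:
        return "GAIN"
    if log_r <= -threshold:  # assumption: symmetric negative threshold
        return "LOSS"
    return "NEUTRAL"


print(len(INITIAL_STATES))                      # 22
print([cna_status(x) for x in (0.8, -0.7, 0.1)])
# ['GAIN', 'LOSS', 'NEUTRAL']
```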
Pearson correlation was used to measure the association between two parameters (principal component values versus genomically determined tumour fraction estimation, or different approaches of tumour fraction estimations). The association between copy number status of each region and principal components was estimated using the Kruskal-Wallis test. Mann-Whitney U test was used to test significance between two groups (AR gain versus AR non-gain—see
Correlation analyses of continuous measures were performed using the Pearson correlation method as implemented in the R v3.4.0 stats package. The association analysis between principal components and CNA of each region was performed by grouping the principal component values of each sample based on the CNA observed for the region (LOSS, NEUTRAL and GAIN). The differences in the distribution of principal component values among groups were then assessed using the Kruskal-Wallis test (one-way ANOVA on ranks) as implemented in the R v3.4.0 stats package.
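The grouping and rank test may be sketched as follows. The Example used the R stats package; this stand-in computes the tie-corrected Kruskal-Wallis H statistic directly and is restricted to exactly three groups, for which the chi-square approximation has two degrees of freedom and the p-value reduces to exp(−H/2). The function name and the example values are hypothetical.

```python
from math import exp


def kruskal_wallis_three_groups(groups):
    """Tie-corrected H statistic and chi-square p-value for three groups."""
    if len(groups) != 3 or any(len(v) == 0 for v in groups.values()):
        raise ValueError("exactly three non-empty groups are required")
    pooled = sorted(x for values in groups.values() for x in values)
    n = len(pooled)
    rank_of, tie_term, i = {}, 0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    rank_sum_term = sum(sum(rank_of[x] for x in values) ** 2 / len(values)
                        for values in groups.values())
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    return h, exp(-h / 2)   # survival function of chi-square with 2 df


# Hypothetical PC values of samples grouped by the CNA call of one region.
pc_by_cna = {"LOSS": [1.0, 2.0, 3.0],
             "NEUTRAL": [4.0, 5.0, 6.0],
             "GAIN": [7.0, 8.0, 9.0]}
h_stat, p_value = kruskal_wallis_three_groups(pc_by_cna)
print(round(h_stat, 3), round(p_value, 4))  # 7.2 0.0273
```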
Methylation Ratio Difference with Kruskal-Wallis and Dunn's Test
The samples were grouped based on tissue of origin and clinical status (white blood cells, plasma healthy volunteer, plasma baseline and plasma progression). Samples were grouped by ct-MethSig and AR-MethSig, and the average methylation ratio of each 100 bp segment was estimated in each group of samples. To keep the analysis consistent, only segments present in all samples (340,467 segments) were considered. All the selected segments were split into two groups based on the overlap with the promoter region of known genes (263,262 non-promoter segments, 77,205 promoter segments). The promoter region was defined as 1,000 base pairs upstream and downstream of the transcription start site (TSS). The significance of the differences among each group was calculated using the Kruskal-Wallis test (one-way ANOVA on ranks) as implemented in the R v3.4.0 (https://www.R-project.org (2018)) stats package. After defining the significance of the differences, the difference of the average methylation ratio across each group was assessed using Dunn's test as implemented in FSA R package v0.8.22 (https://github.com/droglenc/FSA).
Functional enrichment analysis (chemical and genetic perturbations, MSigDB) was executed using the enrich R package (v0.1) based on all the MSigDB main categories (MSigDB database v6.0) (Liberzon, A. et al. Cell Syst 1, 417-425 (2015)) with a significance threshold of 0.05 on Benjamini corrected p values.
Motif enrichment analysis was used to identify potential transcriptional regulators of methylation signatures (MethSig). The top 1000 correlated MethSig segments were submitted to find the possible motif binding sequences over-represented as compared to the default background set (Zambelli, F., et al, Nucleic Acids Res 41, W535-543 (2013)). The pipeline (Pscan-Chip) (Zambelli, F., et al, Nucleic Acids Res 41, W535-543 (2013)), originally designed for the analysis of chromatin immunoprecipitation followed by next generation sequencing technologies, was applied. The program automatically scanned 75 bp upstream and downstream of the ‘peak’ regions that were submitted with controlled background, and known transcription factor binding motifs obtained from JASPAR version 2018. Local enrichment p-value was two-tailed and denoted whether the motif was over-represented in the 150-bp region compared to the genomic regions flanking them. Global enrichment denoted whether the motif binding sequence was over-represented in the region with respect to a global background composed of pan-genome putative regulatory regions from various cell lines. The analysis was performed on the top 1000 segments highly correlated with PC1 (i.e. ct-MethSig) or PC3 (i.e. AR-MethSig) and on other randomly selected regions from the custom, targeted enrichment panel. The result of AR-MethSig was validated by an orthogonal pipeline (Heinz, S. et al. Mol Cell 38, 576-589 (2010)), and the finding was consistent with the original approach as described above.
Average methylation ratios of ct-MethSig segments derived from LNCaP cell lines, and healthy volunteer plasma were extracted. To estimate the probability density function (pdf), kernel density estimation (kde) was applied, assuming a mixture of two Gaussian distributions consistent with the input dataset of normal prostate epithelium (
Gaussian mixture model: $g_j(x) = \phi_{\theta_j}(x)$
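A mixture of two Gaussian components may be fitted, by way of a hedged sketch, with a short expectation-maximisation (EM) loop. The Example describes kernel density estimation under a two-Gaussian assumption and does not state how the components were fitted, so EM is a substitute chosen here for illustration. Each component density is taken to be a normal density with parameters θ_j = (μ_j, σ_j²), which is the standard reading of φ and an assumption with respect to the model line above. All names and values are hypothetical.

```python
from math import exp, pi, sqrt
from statistics import mean, pvariance


def normal_pdf(x, mu, var):
    return exp(-(x - mu) ** 2 / (2 * var)) / sqrt(2 * pi * var)


def fit_two_gaussians(xs, n_iter=100, var_floor=1e-6):
    """EM for a two-component 1-D Gaussian mixture: (weights, means, variances)."""
    ordered = sorted(xs)
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[half:]
    # Initialise each component from one half of the sorted data.
    mu = [mean(lower), mean(upper)]
    var = [max(pvariance(lower), var_floor), max(pvariance(upper), var_floor)]
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            p = [weight[j] * normal_pdf(x, mu[j], var[j]) for j in (0, 1)]
            total = p[0] + p[1]
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weights, means and variances.
        for j in (0, 1):
            nj = sum(r[j] for r in resp)
            weight[j] = nj / len(xs)
            mu[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var[j] = max(sum(r[j] * (x - mu[j]) ** 2
                             for r, x in zip(resp, xs)) / nj, var_floor)
    return weight, mu, var


# Toy average methylation ratios with a low and a high mode.
ratios = [0.05, 0.10, 0.15, 0.85, 0.90, 0.95]
weights, means, variances = fit_two_gaussians(ratios)
print([round(m, 3) for m in means])  # [0.1, 0.9]
```

The sketch is intended only for clearly bimodal inputs; it does not guard against a component collapsing onto a single point beyond the variance floor.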
The mCRPC plasma methylome and genome were concurrently characterized (
A separate aliquot of DNA was subjected to bisulfite treatment and target enrichment NGS for 5.5 million pan-genome CpG sites was performed (target coverage: ≥30×; key sequencing parameters in
Adjacent CpG methylation patterns are usually highly correlated (Guo, S. et al. Nat Genet 49, 635-642, (2017); Lehmann-Werman, R. et al. Proc Natl Acad Sci USA 113, E1826-1834 (2016)). A 100 base-pair sliding window was applied and the data divided into 1.47 million methylation segments as described above. In keeping with prior studies on tissues, the methylation ratio distribution across all methylation segments in plasma and white blood cell samples showed a density peak for hypermethylation and hypomethylation (
The analytical framework was applied to the baseline plasma methylome (n=19) to identify methylation features associated with genomically-determined tumour fraction. To use an unbiased approach to explore the complexity of pan-genome plasma methylation changes, principal component analysis (PCA) was performed. Different parameters were tested, and the robustness of the finding was confirmed on progression samples, healthy volunteer plasma methylome and LNCaP cell line methylome. To expand the applicability of the approach, segments highly correlated with principal components were extracted and tested on LP-WGBS plasma methylome, and external, well-defined tissue data sets using orthogonal approaches (
The first principal component (PC1) contributed 42% of the variance (
To evaluate the clinical applicability of the findings using LP-WGBS, scaled PC1 values were extracted from LP-WGBS. Applying Bland-Altman analysis, a good agreement was found between LP-WGBS derived tumour fraction estimation and estimates from high-coverage targeted NGS (95% limits of agreement: −0.25 to 0.15, bias: −0.05), introducing the opportunity for scalable and cost-efficient circulating tumour DNA detection and quantitation using LP-WGBS (
To test features identified by NGS in datasets with fewer data-points, such as methylation arrays, it was hypothesized that the median of the average methylation ratios of the segments that most strongly correlated to the component features could serve as a proxy of tumour fraction. A high correlation (|r|≥0.93, Pearson correlation) of the average methylation ratio of the segments with genomically-determined tumour fraction was consistently observed in both negatively (i.e. hypermethylated) and positively (i.e. hypomethylated) correlated groups when including 10 to 10,000 segments. Also, the intra-sample variance of average methylation ratios of segments in the top correlated segments gradually increased when more segments were included (
It was confirmed that the median of the average methylation ratios of the selected 1000 segments of the ct-MethSig showed a high correlation with tumour fraction (520 segments in negatively (i.e. hypermethylated) correlated regions, hyper-methylated group: r=0.95, P=8.4×10−19; 480 segments in positively (i.e. hypomethylated) correlated regions, hypo-methylated group: r=−0.93, P=3×10−16, Pearson correlation,
Additionally, the finding that the median of the average methylation ratios of all 1000 segments of the ct-MethSig can be used as a proxy for tumour fraction was tested in published tissue data sets and confirmed a high correlation with tumour fraction both in mCRPC (Beltran, H. et al. Nat Med 22, 298-305 (2016)) (hypermethylated group: r=0.92, P<1.5×10−6; hypomethylated group: r=−0.74, P<1.4×10−3, Pearson correlation,
To study the biological processes underlying PC1, gene set enrichment analysis (GSEA) was performed on genes overlapping with ct-MethSig segments (i.e. the DNA segments of the genomic locations shown in Tables 1 to 4 above). Significant enrichment (adjusted P<10−4) was observed for targets of the polycomb repressive complex 2 (Lee, T. I. et al. Cell 125, 301-313 (2006)) (PRC2 related category in the Molecular Signature Database or MSigDB,
It was postulated that ct-MethSig included components that were specific to either prostate malignant or non-malignant epithelium. The kernel density estimation of the ct-MethSig average methylation ratios in whole genome bisulfite sequencing data derived from the non-malignant prostate epithelium cell line (PrEC) (Pidsley, R. et al. Genome Res 28, 625-638, (2018)) was plotted and it was observed that there was a bimodal distribution (
Finally, methylation microarray data from 553 prostate cancers from TCGA and 12 CRPC adenocarcinoma from Beltran et al. (Beltran, H. et al, Nat Med 22, 298-305 (2016)) were used to show that the distribution of ct-MethSig segments in localized prostate cancer and CRPC tissue includes both cancer and normal components (
To build a classifier for detection of prostate cancer to accurately categorise prostate cancer subjects and healthy subjects, metastatic prostate cancer plasma samples (N=44) were used as described before (
The median of the average methylation ratios of all 1000 segments of ct-MethSig across all samples was used as input for a random forest classifier (RFC), a classic machine learning classification method. An RFC model fits a number of decision trees, each of which categorizes a subset of samples, to improve the prediction accuracy and control for overfitting. The RFC was run with 1000 iterations of cross-validation to ensure the stability of the model. Briefly, the samples were split into two groups: a training group (plasma DNA containing prostate tumour DNA) and a testing group (plasma DNA not containing prostate tumour DNA). The classification model was initially built on the training group and the classifier was tested on the testing group. The model was initially built with 10 trees in one forest, and the result showed 100% accuracy (STD=1%) on training and 95% on testing (STD=11%,
To investigate whether randomly selected 1, 10 or 100 segments, or all 1000 segments, of ct-MethSig could construct a reliable classifier, a fixed number of segments (1, 10, and 100) were randomly selected, and these segment(s) were used to build an RFC (n_estimators=100) with 1000 iterations. The results indicated that using only 1 randomly selected segment, the testing accuracy was 84% (STD=20%). The testing accuracy gradually improved when more segments were included (
In summary, the development of a methylation-based classifier was achievable, and the classifier was able to identify plasma samples containing circulating tumour DNA with high accuracy.
Next plasma DNA methylation changes that could potentially identify distinct methylation subtypes were investigated. The second principal component (PC2) was driven by a single patient (02) and was not investigated further. In the third principal component (PC3) a weak correlation with tumour fraction was found (r=0.01, P=0.96, Pearson correlation) (
Functional enrichment analysis on the top 1000 segments of PC3 (referred to herein as AR-MethSig and the segments shown in Table 8 above) showed enrichment in histone H3 tri-methylation markers (
AR-MethSig hypomethylation strongly associates with AR copy number gain
Next, genome-wide copy number profiles were extracted from LP-WGS, and high similarity was confirmed between results from the same sample with and without bisulfite treatment (
Given the association of PC3 values with AR copy number it was confirmed that patient plasma and tissue samples with AR gain had significantly lower average methylation ratios in the AR-MethSig segments (i.e. average methylation ratios in the AR-MethSig segments indicative of hypomethylation) than AR copy number normal samples (P<0.001 and P=0.023 respectively, Wilcoxon signed-rank test;
In Example 1, the present inventors performed next-generation sequencing (NGS) on plasma DNA with and without bisulfite treatment from mCRPC patients receiving either abiraterone or enzalutamide in the pre- or post-chemotherapy setting. Using principal component analysis on the mCRPC plasma methylome, the inventors surprisingly found that the main contributor to methylation variance (principal component one, or PC1) was strongly correlated with genomically-determined tumour fraction (r=−0.96; P<10−8). Further, the 1000 top correlated segments of the PC1, “ct-MethSig”, which are presented in Tables 1 to 4 above, revealed that these segments comprised methylation patterns specific to either prostate cancer or prostate normal epithelium.
The inventors used a custom target-capture approach to define the methylation status of pan-genome CpG islands. By using a 100 bp sliding window strategy, the inventors obtained close to 0.5 million methylation segments with 10× coverage in all of the 19 “baseline” plasma DNA samples and used them to construct a principal component analysis. Novel to the inventors' approach was the construction of their model using solely mCRPC plasma DNA that has a variable ratio of normal DNA, primarily arising from white blood cells (Moss, J. et al. Nat Commun 9, 5068, (2018)), and validating the model using tumour DNA that harbors methylation changes that are either prostate epithelium-specific or cancer-specific. The method resulted in the ct-MethSig signature, the segments of which are shown in Tables 1 to 4. These segments can be used as described herein to very accurately determine the level of prostate cancer fraction in a cfDNA sample as shown, for example, in
The inventors found that the ct-MethSig did not include genes whose methylation status has been previously reported as diagnostic of prostate cancer, such as GSTP1, APC, and RASSF1 (Massie, C. E, et al, J Steroid Biochem Mol Biol 166, 1-15 (2017)). Although not wishing to be bound by theory, the present inventors believe that this finding could be explained by highly variable methylation levels at the genomic segments of the signature in non-cancer plasma DNA compared to cancer plasma DNA.
As well as the signature of Tables 1 to 4 derived from the PC1 found by the present inventors that can be used to determine prostate cancer fraction from a sample, the inventors also surprisingly found a signature that can be used to extract information specific to an individual's cancer. That signature was derived from an orthogonal methylation signature (principal component three (PC3)), and the segments of this signature are defined in Table 8. The inventors surprisingly found that this signature can be used to identify a sub-group of cancers characterized by a more aggressive clinical course and that is enriched for AR copy number gain. In particular, this signature showed enrichment for androgen receptor binding sequences and hypomethylation at putative AR binding sites associated with AR copy number gain. Previous studies have reported worse outcome for patients with AR gain in plasma (Romanel, A. et al. Sci Transl Med 7, 312re310, (2015); Conteduca, V. et al., Ann Oncol 28, 1508-1516, (2017)) and given the high overlap between this genomic lesion and this signature, the inventors believe that this methylation signature identifies the same phenotype. Thus the inventors surprisingly found that a methylation signature can be used to detect a gene abnormality.
Thus, in summary, the present inventors' plasma methylome investigation using their innovative workflow has led to two novel signatures that can be used in methods, kits and uses as defined herein, to very accurately quantitate tumour fraction or identify distinct biologically-relevant subtypes of mCRPC with distinct biological mechanisms and differential clinical outcomes. As such, the signatures can be used for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises cfDNA.
§ 1. A method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises circulating free DNA (cfDNA), the method comprising:
§ 2. The method of clause 1, wherein each of the genomic regions is covered by at least one sequence read of at least two characterized methylome sequences, for example at least one sequence read of at least 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 50, 100, 200, 300, 400, 500, or 1000 characterized methylome sequences.
§ 3. The method of clause 1 or 2, wherein each of the genomic regions is covered by at least 10 sequence reads, for example at least 10, 12, 15, 20, 25, 50, 100, 200, 300, 400, 500, or 1000 sequence reads, and preferably wherein each sequence read or the majority of the sequence reads (for example at least 50%, 60%, 70%, 80% or 90% of the sequence reads) are from different characterized methylome sequences.
§ 4. The method of any one of clauses 1 to 3, wherein calculating a methylation score using the average methylation ratio for each genomic region comprises:
§ 5. The method of clause 4, wherein the first group of genomic regions are all of the hypermethylated genomic regions for which the average methylation ratio has been determined, and the second group of genomic regions are all of the hypomethylated genomic regions for which the average methylation ratio has been determined.
§ 6. The method of any one of clauses 1 to 5, wherein analyzing the methylation score to determine the level of prostate cancer fraction in the cfDNA sample comprises comparing the methylation score to one or more reference methylation scores, wherein a reference methylation score is a methylation score calculated for the same genomic regions (for example, calculated using the average methylation ratio for the same genomic regions) in one or more of the following: a cfDNA sample from a healthy subject, for example a healthy age-matched subject;
§ 7. The method of any one of clauses 1 to 6, wherein calculating a methylation score using the average methylation ratio for each genomic region comprises:
§ 8. The method of any one of clauses 1 to 6, wherein calculating a methylation score using the average methylation ratio for each genomic region comprises comparing the average methylation ratio at each genomic region to a reference methylation ratio for each genomic region to determine a methylation ratio score for each genomic region,
and wherein the reference methylation ratio is the average methylation ratio for the same genomic region in or covered by:
§ 9. The method of clause 8, wherein analyzing the methylation score to determine the level of prostate cancer DNA comprises determining the number of methylation ratio scores that are indicative of prostate cancer DNA.
§ 10. The method of any one of clauses 1 to 9, wherein the methylome sequence of a cfDNA molecule is determined by using methylation aware sequencing (for example bisulfite sequencing), methylation-sensitive restriction enzyme digestion, methylation-specific PCR, methylation-dependent DNA precipitation, methylated DNA binding proteins/peptides, or single molecule sequencing without sodium bisulfite treatment.
§ 11. The method of any one of clauses 1 to 10, wherein the methylome sequence of a cfDNA molecule is determined by performing methylation aware sequencing, for example wherein the methylation aware sequencing comprises treating the DNA molecule with sodium bisulfite and performing sequencing of the treated DNA molecule.
§ 12. The method of any one of clauses 1 to 11, comprising determining the average methylation ratio at 25 or more, 50 or more, 100 or more, 150 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, or 900 or more genomic regions (for example comprising determining the average methylation ratio at 25, 50, 100, 150, 200, 300, 400, 500, 600, 700, 800, 900 or 1000 genomic regions).
§ 13. The method of any one of clauses 1 to 12, wherein the genomic regions are selected from:
§ 14. The method of any one of clauses 1 to 13, wherein the genomic regions are selected from:
§ 15. The method of any one of clauses 1 to 14, wherein the genomic regions have a 100 bp genomic location defined in any one of Tables 1 to 4, Table 5, Table 6 or Table 7.
§ 16. The method of any one of clauses 1 to 15, comprising characterising the average methylation ratio at 50 or more (for example 50), 100 or more (for example 100), 200 or more (for example 200), 500 or more (for example 500), or 800 or more (for example 800 or 1000) genomic regions, wherein the genomic regions each have a genomic location defined in Tables 1 to 4; or
characterising the average methylation ratio at 10 or more (for example 10), 50 or more (for example 50) or 100 or more (for example 100) genomic regions, wherein each of the genomic regions has a genomic location defined in Table 5; or
characterising the average methylation ratio at 10 or more (for example 10), 50 or more (for example 50) or 100 or more (for example 100) genomic regions, wherein each of the genomic regions has a genomic location defined in Table 6; or
characterising the average methylation ratio at 10 or more (for example 10), 50 or more (for example 50) or 100 or more (for example 100) genomic regions, wherein each of the genomic regions has a genomic location defined in Table 7.
§ 17. The method of any one of clauses 1 to 16, wherein at least 25% of the genomic regions are prostate cancer specific genomic regions; or wherein at least 25% of the genomic regions are prostate tissue specific genomic regions.
§ 18. The method of any one of clauses 1 to 17, wherein at least 40% of the genomic regions are prostate cancer specific genomic regions, for example at least 50, 60, 70, 80, 90 or 95% (for example 95, 96, 97, 98, 99 and 100%) of the genomic regions are prostate cancer specific genomic regions; or wherein at least 40% of the genomic regions are prostate tissue specific genomic regions, for example at least 50, 60, 70, 80, 90 or 95% (for example 95, 96, 97, 98, 99 and 100%) of the genomic regions are prostate tissue specific genomic regions.
§ 19. The method of any one of clauses 1 to 18, wherein at least 40% of the genomic regions comprise, have or are within genomic locations defined in Tables 1 and/or 2, or Table 5 or Table 6 or Table 7, for example at least 50, 60, 70, 80, 90 or 95% (for example 95, 96, 97, 98, 99 and 100%) of the genomic regions comprise, have or are within a genomic location defined in Tables 1 and/or 2 or Table 5 or Table 6 or Table 7.
§ 20. The method of any one of clauses 1 to 19, wherein a plurality of cfDNA molecules is at least 10,000, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000, at least 5,000,000, at least 10,000,000, or at least 100,000,000 cfDNA molecules.
§ 21. The method of any one of clauses 1 to 20, wherein the prostate cancer is acinar adenocarcinoma prostate cancer, ductal adenocarcinoma prostate cancer, transitional cell cancer of the prostate, squamous cell cancer of the prostate, or small cell prostate cancer (for example wherein the prostate cancer is acinar adenocarcinoma prostate cancer or ductal adenocarcinoma prostate cancer).
§ 22. The method of any one of clauses 1 to 21, wherein the prostate cancer is castration resistant prostate cancer and/or is metastatic prostate cancer.
§ 23. The method of any one of clauses 1 to 22, wherein the sample comprising cfDNA is a blood or plasma sample.
§ 24. The method of any one of clauses 1 to 23, further comprising measuring the level of prostate-specific antigen (PSA) in a sample of blood from the subject, and determining if the subject has an abnormal level of PSA in the blood (for example a level of PSA in the blood of at least 4.0 ng/mL or, if the subject has had a previous PSA test, an increased level of PSA compared to the previous test).
§ 25. The method of clause 24, wherein the subject has an abnormal level of PSA in the blood (for example a level of PSA in the blood of at least 4.0 ng/mL or, if the subject has had a previous PSA test, an increased level of PSA compared to the previous test); or wherein the subject has a normal level of PSA in the blood (for example a level of PSA in the blood of 4.0 ng/mL or less).
§ 26. The method of any one of clauses 1 to 25, further comprising repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, and comparing the level of prostate cancer fraction in the two samples.
§ 27. The method of any one of clauses 1 to 26 for screening and/or prognostication of prostate cancer, wherein prostate cancer is predicted when a level of prostate cancer is determined, for example a detectable level of prostate cancer, for example a percentage level of prostate cancer fraction of at least 0.01%.
§ 28. The method of any one of clauses 1 to 27, for detecting, screening and/or prognostication of metastatic prostate cancer, wherein metastatic prostate cancer is predicted when a level of prostate cancer is determined, for example a detectable level of prostate cancer, for example a percentage level of prostate cancer fraction of at least 0.01%.
§ 29. The method of any one of clauses 1 to 28, for detecting, screening and/or prognostication of prostate cancer, wherein metastatic prostate cancer with a poor prognosis is predicted when a level of prostate cancer is determined, for example a detectable level of prostate cancer, for example a percentage level of prostate cancer fraction of at least 0.01%.
§ 30. An in-vitro diagnostic kit for use in the detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer, comprising one or more reagents for detecting the presence or absence of at least 10 DNA molecules having a DNA sequence corresponding to all or part of a genomic location comprising at least one CpG locus defined in Tables 1 to 4, or comprising at least one CpG locus defined in Table 5, or comprising at least one CpG locus defined in Table 6, or comprising at least one CpG locus defined in Table 7.
§ 31. The kit as defined in clause 30, wherein the kit comprises one or more reagents for detecting the presence or absence of at least 15, 20, 30, 40, 50, 75, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, or 900 DNA molecules (for example 15, 20, 30, 40, 50, 75, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900 or 1000 DNA molecules) having a DNA sequence corresponding to all or part of a genomic location comprising at least one CpG locus defined in Tables 1 to 4.
§ 32. The kit as defined in clause 30 or 31, wherein the kit comprises oligonucleotides for specifically hybridizing to at least a section of the at least 10 DNA molecules (for example, at least 15, 20, 30, 40, 50, 75, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, or 900 DNA molecules) having a DNA sequence corresponding to all or part of a genomic location defined in Tables 1 to 4.
§ 33. The kit of any one of clauses 30 to 32, wherein at least one of the oligonucleotides for specifically hybridizing to at least a section of a DNA molecule is an amplification primer, for example each of the oligonucleotides for specifically hybridizing to at least a section of a DNA molecule is an amplification primer.
§ 34. A computer product comprising a non-transitory computer readable medium storing a plurality of instructions that when executed control a computer system to perform the method of any one of clauses 1 to 29.
§ 35. A computer-executable software for performing the method of any one of clauses 1 to 29.
§ 36. The kit of any one of clauses 30 to 33, wherein the kit comprises instructions for use which define how to determine the level of prostate cancer fraction in a sample comprising cfDNA from a subject, and/or comprises a computer product as defined in clause 34, and/or a computer-executable software as defined in clause 35.
§ 37. A computer-implemented method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises circulating free DNA (cfDNA), the method comprising:
§ 38. A computer-implemented method for classifying a prostate cancer patient into one or more of a plurality of treatment categories, the method comprising determining the level of prostate cancer DNA in a sample obtained from a subject, wherein the sample comprises circulating free DNA (cfDNA), the method comprising:
§ 39. The method of any one of clauses 1 to 29, 37 or 38 further comprising treating the subject for prostate cancer using a therapeutic agent for the treatment of prostate cancer;
or ceasing or altering treatment with a therapeutic agent for the treatment of prostate cancer; or initiating a non-therapeutic agent treatment for prostate cancer (for example initiation of treatment by surgery or radiation).
§ 40. A method for treating prostate cancer in a subject comprising the method of any one of clauses 1 to 29, 37 or 38 and further comprising treating the subject using a therapeutic agent for the treatment of prostate cancer, surgery, and/or radiotherapy; or a method for treating prostate cancer in a subject, comprising administering to the subject an effective amount of a therapeutic agent for the treatment of prostate cancer after the subject has been determined to have prostate cancer based on a method as defined in any one of clauses 1 to 29, 37 or 38.
§ 41. The method of clause 40, wherein the method of any one of clauses 1 to 29, 37 or 38 is performed before and/or after treating the subject.
§ 42. The method of any one of clauses 39 to 41, comprising performing the method of any one of clauses 1 to 29, 37 or 38 before treating the subject, and subsequently repeating the method of any one of clauses 1 to 29, 37 or 38 after the treatment, for example at least 1 week, at least 2 weeks, at least 3 weeks, at least 4 weeks, at least 1 month, at least 2 months, at least 3 months, at least 6 months, at least 9 months, at least 12 months, at least 24 months or at least 36 months after treating the subject.
§ 43. The method of clause 42, wherein the method comprises continuing to treat the subject with the therapeutic agent for the treatment of prostate cancer if the level of prostate cancer fraction is substantially the same in the initial and subsequent method or lower in the subsequent method than in the initial method.
§ 44. The method of clause 42 or 43, wherein the method comprises
§ 45. A method of treating a subject in need of treatment with a therapeutic agent for the treatment of prostate cancer, comprising
§ 46. A therapeutic agent for the treatment of prostate cancer for use in the treatment of prostate cancer, whereby
§ 47. A method as defined in any one of clauses 40 to 45, or a therapeutic agent for the treatment of prostate cancer for use as defined in clause 46, wherein a second therapeutic agent for the treatment of prostate cancer is administered if the subject has a level of prostate cancer DNA (for example a detectable level of prostate cancer DNA, for example 0.01% or more prostate cancer DNA).
§ 48. The method of clause 45, or a therapeutic agent for the treatment of prostate cancer for use as defined in clause 46, wherein
§ 49. A method of determining one or more suitable therapeutic agents for the treatment of prostate cancer for a subject having prostate cancer comprising
§ 50. A method of determining a suitable treatment regimen for a subject having prostate cancer comprising
§ 51. The method as defined in clause 50, wherein the standard treatment is a treatment with a therapeutic agent for the treatment of prostate cancer, and a non-standard treatment is a treatment with two or more therapeutic agents for the treatment of prostate cancer;
or wherein the standard treatment is a treatment with a hormonal agent for the treatment of prostate cancer, and a non-standard treatment is a treatment with a hormonal agent for the treatment of prostate cancer, and a chemotherapeutic agent for the treatment of prostate cancer and/or an immunotherapy treatment of prostate cancer and/or a targeted treatment of prostate cancer and/or a biologic agent treatment of prostate cancer.
§ 52. A computerized method and/or computer-assisted method for determining one or more suitable therapeutic agents for the treatment of prostate cancer in a subject having prostate cancer, the method comprising performing the steps of clause 49; or a computerized method and/or computer-assisted method for determining a suitable treatment regimen for a subject having prostate cancer, the method comprising performing the steps of clause 50 or clause 51.
§ 53. A method or therapeutic agent as defined in any one of clauses 39 to 52, wherein the therapeutic agent for the treatment of prostate cancer is selected from the group consisting of a hormonal agent, a targeted agent, a biologic agent, an immunotherapy agent, and a chemotherapy agent;
for example:
a hormonal agent selected from LHRH agonists (for example leuprolide, goserelin, triptorelin, or histrelin), LHRH antagonists (for example degarelix), androgen blockers (for example abiraterone or ketoconazole), anti-androgens (for example flutamide, bicalutamide, nilutamide, enzalutamide, apalutamide or darolutamide), estrogens, and steroids (for example prednisone or dexamethasone);
a targeted agent selected from a poly(ADP-ribose) polymerase (PARP) inhibitor (for example olaparib, rucaparib, niraparib or talazoparib), an epidermal growth factor receptor (EGFR) inhibitor (for example gefitinib, erlotinib, afatinib, brigatinib, icotinib, cetuximab, osimertinib, adavosertib or lapatinib), and a tyrosine kinase inhibitor (for example imatinib, gefitinib, erlotinib or sunitinib);
a biologic agent selected from monoclonal antibodies (for example pertuzumab, trastuzumab and solitomab), hormones (for example a hormonal agent selected from LHRH agonists (for example leuprolide, goserelin, triptorelin, or histrelin), LHRH antagonists (for example degarelix), androgen blockers (for example abiraterone or ketoconazole), anti-androgens (for example flutamide, bicalutamide, nilutamide, enzalutamide, apalutamide or darolutamide), and estrogens), interferons (for example interferons-α, -β, -γ), and interleukin-based products (for example interleukin-2);
an immunotherapy agent selected from a cancer vaccine (for example sipuleucel-T), T-cell therapy, monoclonal antibody therapy, immune checkpoint therapy (for example a PD-1 inhibitor (e.g. pembrolizumab, nivolumab, cemiplimab or spartalizumab), a PD-L1 inhibitor (e.g. atezolizumab, avelumab or durvalumab), or a CTLA-4 inhibitor (e.g. ipilimumab)), and non-specific immunotherapies (for example interferons and interleukins); or
a chemotherapy agent selected from docetaxel, cabazitaxel, and c-Met inhibitors (for example cabozantinib).
§ 54. A method or therapeutic agent as defined in any one of clauses 39 to 52, wherein the therapeutic agent for the treatment of prostate cancer is a hormonal agent and optionally a chemotherapy agent and/or optionally a further hormonal agent and/or optionally a targeted agent and/or optionally a radionuclide agent and/or an immunotherapy agent (for example a LHRH agonist (for example leuprolide, goserelin, triptorelin, or histrelin) or a LHRH antagonist (for example degarelix), and optionally a chemotherapy agent (for example docetaxel, cabazitaxel or carboplatin) and/or optionally a further hormonal treatment (for example enzalutamide, abiraterone or darolutamide) and/or optionally a radionuclide agent (for example radium-223 or a PSMA-labelled radionuclide) and/or optionally a PARP inhibitor (for example olaparib, rucaparib, niraparib or talazoparib) and/or an immunotherapy agent (for example nivolumab, pembrolizumab, ipilimumab or durvalumab)).
§ 55. A method for determining a solid cancer circulating free DNA (cfDNA) methylome signature for use in detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, prognostication and/or treatment of the solid cancer, the method comprising:
§ 56. The method of clause 55, wherein the method further comprises aligning the methylome sequences for the first sample with a reference genome for the subject; and aligning the methylome sequences for each of the one or more further samples with the same reference genome.
§ 57. The method of clause 55 or 56, wherein the reference genome is selected from hg38, hg19, hg18, hg17 and hg16.
§ 58. The method of any one of clauses 55 to 57, comprising selecting at least 25 CpG loci (for example at least 50, at least 75, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000 or at least 10,000) and/or at least 25 genomic regions (for example at least 50, at least 75, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000 or at least 10,000) in the group to provide the cfDNA methylome signature.
§ 59. The method of any one of clauses 55 to 58, wherein the variance analysis performed is a dimensionality reduction.
§ 60. The method as defined in clause 59, wherein a principal component analysis, a logistic regression analysis, a nearest neighbor analysis, a support vector machine, a neural network model, an NMF (non-negative matrix factorisation), an ICA (independent component analysis) or an FA (factor analysis) is used as the dimensionality reduction to determine the level of methylation variance in the samples.
§ 61. The method as defined in clause 60, wherein the variance analysis performed is a principal component analysis.
§ 62. The method as defined in clause 61, wherein selecting a group of CpG loci and/or genomic regions associated with a feature of the samples comprises selecting one of principal component 1, principal component 2, principal component 3, principal component 4, principal component 5, principal component 6, principal component 7, principal component 8 or a higher principal component.
§ 63. The method of any one of clauses 55 to 62, wherein selecting the CpG loci and/or genomic regions in the group to provide the cfDNA methylome signature comprises selecting the CpG loci and/or genomic regions in the group that have strong association with the feature, for example selecting CpG loci and/or genomic regions that are within the top 10,000 CpG loci and/or genomic regions most correlated with the feature in the group (for example selecting CpG loci and/or genomic regions that are within the top 8000, 5000, 3000, 2000, 1000, 800, 500, 400, 300, 250, 200, 150, 100, 50 or 10 CpG loci and/or genomic regions most correlated with the feature in the group).
§ 64. The method of any one of clauses 55 to 63, wherein selecting CpG loci and/or genomic regions in the group to provide the cfDNA methylome signature comprises selecting at least 5 CpG loci (for example at least 8, at least 10, at least 12, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 75, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000 or at least 10,000) and/or at least 5 genomic regions (for example at least 8, at least 10, at least 12, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 75, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000 or at least 10,000) in the group to provide a cfDNA methylome signature.
§ 65. The method of clause 61 or 62, or clauses 63 and 64 when dependent on clauses 61 or 62, wherein selecting CpG loci and/or genomic regions in the group to provide the cfDNA methylome signature comprises selecting a plurality of CpG loci and/or genomic regions of principal component 1, 2, 3, 4, 5, 6, 7 or 8, for example selecting CpG loci and/or genomic regions that are within the top 10,000 CpG loci and/or genomic regions of principal component 1, 2, 3, 4, 5, 6, 7 or 8 most correlated with the feature of principal component 1, 2, 3, 4, 5, 6, 7 or 8; or selecting CpG loci and/or genomic regions that are within the top 5000 CpG loci and/or genomic regions of principal component 1, 2, 3, 4, 5, 6, 7 or 8 most correlated with the feature of principal component 1, 2, 3, 4, 5, 6, 7 or 8; selecting CpG loci and/or genomic regions that are within the top 4000 CpG loci and/or genomic regions of principal component 1, 2, 3, 4, 5, 6, 7 or 8 most correlated with the feature of principal component 1, 2, 3, 4, 5, 6, 7 or 8; selecting CpG loci and/or genomic regions that are within the top 3000 CpG loci and/or genomic regions of principal component 1, 2, 3, 4, 5, 6, 7 or 8 most correlated with the feature of principal component 1, 2, 3, 4, 5, 6, 7 or 8; selecting CpG loci and/or genomic regions that are within the top 2000 CpG loci and/or genomic regions of principal component 1, 2, 3, 4, 5, 6, 7 or 8 most correlated with the feature of principal component 1, 2, 3, 4, 5, 6, 7 or 8; selecting CpG loci and/or genomic regions that are within the top 1000 CpG loci and/or genomic regions of principal component 1, 2, 3, 4, 5, 6, 7 or 8 most correlated with the feature of principal component 1, 2, 3, 4, 5, 6, 7 or 8; or selecting CpG loci and/or genomic regions that are within the top 500, 400, 300, 250, 200, 150, 100, 50 or 10 CpG loci and/or genomic regions of principal component 1, 2, 3, 4, 5, 6, 7 or 8 most correlated with the feature of principal component 1, 2, 3, 4, 5, 6, 7 or 8.
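By way of non-limiting illustration only, the selection described in clauses 61 to 65 can be sketched in Python as follows. The sketch estimates principal component 1 of a samples-by-regions matrix of methylation ratios and keeps the regions with the largest absolute loadings on that component. Several points are assumptions made for this sketch and are not taken from the specification: the use of absolute loading as the measure of how strongly a region is associated with the component, the use of power iteration (chosen only so that the example needs no third-party library; a standard PCA routine would normally be used instead), the function names, the region identifiers and the example values.

```python
import math
import random


def first_principal_component(matrix, iterations=200, seed=0):
    """Estimate the loading vector of principal component 1.

    matrix: list of rows, one row per sample, one column per genomic region,
    each entry a methylation ratio. Power iteration is applied to X^T X of the
    column-centred matrix X (a stand-in for a library PCA routine).
    """
    n_samples = len(matrix)
    n_features = len(matrix[0])
    # Centre every region on its mean across samples.
    means = [sum(row[j] for row in matrix) / n_samples for j in range(n_features)]
    centred = [[row[j] - means[j] for j in range(n_features)] for row in matrix]

    rng = random.Random(seed)
    vec = [rng.random() for _ in range(n_features)]
    for _ in range(iterations):
        # Multiply by X^T X without forming the covariance matrix explicitly.
        proj = [sum(r[j] * vec[j] for j in range(n_features)) for r in centred]
        new = [sum(centred[i][j] * proj[i] for i in range(n_samples))
               for j in range(n_features)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0:
            break  # no variance in the data: nothing to rank
        vec = [v / norm for v in new]
    return vec


def top_regions(region_ids, loadings, n):
    """Return the n region identifiers with the largest absolute loadings."""
    ranked = sorted(zip(region_ids, loadings), key=lambda p: abs(p[1]), reverse=True)
    return [region_id for region_id, _ in ranked[:n]]


# Made-up example: four samples, four hypothetical regions.
region_ids = ["region_A", "region_B", "region_C", "region_D"]
samples = [
    [0.10, 0.90, 0.50, 0.30],
    [0.30, 0.70, 0.51, 0.30],
    [0.60, 0.40, 0.50, 0.31],
    [0.90, 0.10, 0.49, 0.30],
]

loadings = first_principal_component(samples)
signature = top_regions(region_ids, loadings, 2)
```

In this made-up data only `region_A` and `region_B` vary appreciably between samples, so they dominate principal component 1 and form the two-region signature. The sign of a loading vector is arbitrary, which is why the ranking uses absolute values.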
§ 66. The method of any one of clauses 55 to 65, wherein the first sample comprising cfDNA and each of the one or more further samples is a blood sample; or wherein the first sample comprising cfDNA and each of the one or more further samples is a plasma sample.
§ 67. The method of any one of clauses 55 to 66, wherein the cancer is prostate cancer.
§ 68. The method of any one of clauses 55 to 67 comprising repeating steps (i) to (iii) for 2 or more further samples, 3 or more further samples, 4 or more further samples, 5 or more further samples, 6 or more further samples, 7 or more further samples, 8 or more further samples, 9 or more further samples, 10 or more further samples, 12 or more further samples, 15 or more further samples, 20 or more further samples, 25 or more further samples, 30 or more further samples, 40 or more further samples, 50 or more further samples, 60 or more further samples, 70 or more further samples, 80 or more further samples, 90 or more further samples, 100 or more further samples, 200 or more further samples, 300 or more further samples, 400 or more further samples, 500 or more further samples or 1000 or more further samples comprising cfDNA each from subjects known to have the solid cancer.
§ 69. The method of any one of clauses 55 to 68, wherein the first sample and one or more of the further samples are from different subjects (for example wherein the first sample and each of the one or more of the further samples are from different subjects) and/or wherein the first sample and one or more of the further samples are from the same subject, for example the same subject but at different time points, for example before treatment, during a treatment, after a treatment, before progression, after progression, and/or after change of the disease to metastatic cancer.
§ 70. The method of any one of clauses 55 to 69, further comprising comparing the methylation state of each of the selected CpG loci and/or genomic regions in the first sample and in the one or more further samples with the methylation state of the same CpG locus and/or genomic region in one or more of the following:
§ 71. The method of any one of clauses 55 to 70, further comprising determining a reference value (for example one or more reference values, e.g. 3 or more, 4 or more, 5 or more, 10 or more, 15 or more, or 20 or more reference values) for each of the selected CpG loci and/or genomic regions, for example wherein a reference value for each of the selected CpG loci and/or genomic regions is the average methylation ratio of the same CpG locus and/or genomic region in or covered by:
§ 72. The method of any one of clauses 55 to 71, further comprising establishing an algorithm for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, prognostication and/or treatment of the solid cancer using the cfDNA methylome signature, for example wherein
§ 73. The method of clause 72, wherein the algorithm comprises comparing the methylation status, the methylation ratio, or the average methylation ratio, for some or all of the selected CpG loci and/or genomic regions of the cfDNA methylome signature to the methylation status, the methylation ratio, or the average methylation ratio for some or all of the selected CpG loci and/or genomic regions in a further sample comprising DNA; and/or wherein the algorithm comprises comparing the methylation status, the methylation ratio, or the average methylation ratio, for some or all of the selected CpG loci and/or genomic regions of the cfDNA methylome signature to a reference value for each CpG locus and/or genomic region.
§ 74. A computer implemented method for determining a solid cancer cfDNA methylome signature for use in the detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of the solid cancer, the method comprising performing the method of any one of clauses 55 to 73.
§ 75. A computer product comprising a non-transitory computer readable medium storing a plurality of instructions that when executed control a computer system to perform the method of any one of clauses 55 to 73.
§ 76. A computer-executable software for performing the method of any one of clauses 55 to 73.
§ 77. A computer-implemented software for determining a solid cancer cfDNA methylome signature for use in the detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of the solid cancer, the method comprising:
Further aspects of the invention are defined in the following numbered clauses:
§ 1. A method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises circulating free DNA (cfDNA), the method comprising:
§ 2. The method of § 1, wherein the method comprises determining the level of cfDNA in the sample that is derived from a prostate cancer subtype.
§ 3. The method of § 1 or § 2, wherein each of the genomic regions is covered by at least one sequence read of at least two characterized methylome sequences, for example at least one sequence read of at least 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 50, 100, 200, 300, 400, 500, or 1000 characterized methylome sequences.
§ 4. The method of any one of § 1 to § 3, wherein each of the genomic regions is covered by at least 10 sequence reads, for example at least 10, 12, 15, 20, 25, 50, 100, 200, 300, 400, 500, or 1000 sequence reads, and preferably wherein each sequence read or the majority of the sequence reads (for example at least 50%, 60%, 70%, 80% or 90% of the sequence reads) are from different characterized methylome sequences.
§ 5. The method of any one of § 1 to § 4, wherein calculating a methylation score using the average methylation ratio for each genomic region comprises:
determining the median (or the mean) of the average methylation ratios for all genomic regions for which the average methylation ratio has been determined; or
determining the median (or the mean) of the average methylation ratios for a first group of genomic regions to obtain a first methylation score and/or determining the median (or the mean) of the average methylation ratios for a second group of genomic regions to obtain a second methylation score; or
comparing the average methylation ratio at each genomic region to a reference methylation ratio for each genomic region to determine a methylation ratio score for each genomic region.
§ 6. The method of § 5, wherein the first group of genomic regions are all of the hypermethylated genomic regions for which the average methylation ratio has been determined, and the second group of genomic regions are all of the hypomethylated genomic regions for which the average methylation ratio has been determined.
§ 7. The method of any one of § 1 to § 6, wherein analyzing the methylation score to determine whether the sample comprises cfDNA derived from a prostate cancer subtype comprises comparing the methylation score to one or more reference methylation scores, wherein a reference methylation score is a methylation score calculated for the same genomic regions (for example, calculated using the average methylation ratio for the same genomic regions) in one or more of the following:
a cfDNA sample from a healthy subject, for example a healthy age-matched subject;
a tissue sample from a healthy subject, for example a prostate tissue sample from a healthy subject;
a cancer biopsy sample from a cancer patient, for example a prostate cancer biopsy sample from a prostate cancer patient;
a cancer cell line sample, for example a prostate cancer cell line sample from a prostate cancer cell line;
a sample of white blood cells from a subject, for example the subject or a healthy subject;
a cfDNA sample from a different subject having prostate cancer, wherein preferably the sample is known to comprise cfDNA derived from the prostate cancer subtype (more preferably multiple cfDNA samples (for example at least 2, 3, 4, 5, 10, 20, 40, 50, 100, 200, 300 or 500 samples) each from a different subject having prostate cancer, wherein preferably each sample is known to comprise cfDNA derived from the prostate cancer subtype, and more preferably wherein each cfDNA sample has a different level of cfDNA derived from the prostate cancer subtype);
a characterized methylome sequence of a white blood cell;
a characterized methylome sequence of a prostate cancer cell line;
a characterized methylome sequence of a cancerous prostate cell; and/or
a characterized methylome sequence of a non-cancerous prostate cell.
§ 8. The method of any one of § 1 to § 7, wherein calculating a methylation score using the average methylation ratio for each genomic region comprises:
determining the median (or the mean) of the average methylation ratios for all genomic regions for which the average methylation ratio has been determined, and
wherein calculating a reference methylation score using the average methylation ratio for each genomic region comprises:
determining the median (or the mean) of the average methylation ratios for all genomic regions for which the average methylation ratio has been determined; or
wherein calculating a methylation score using the average methylation ratio for each genomic region comprises:
determining the median (or the mean) of the average methylation ratios for a first group of genomic regions to obtain a first methylation score and/or determining the median (or the mean) of the average methylation ratios for a second group of genomic regions to obtain a second methylation score (for example wherein the first group of genomic regions are all of the hypermethylated genomic regions, and the second group of genomic regions are all of the hypomethylated genomic regions), and
wherein calculating a reference methylation score using the average methylation ratio for each genomic region comprises:
determining the median (or the mean) of the average methylation ratios for a first group of genomic regions to obtain a first methylation score and/or determining the median (or the mean) of the average methylation ratios for a second group of genomic regions to obtain a second methylation score (for example wherein the first group of genomic regions are all of the hypermethylated genomic regions, and the second group of genomic regions are all of the hypomethylated genomic regions).
§ 9. The method of any one of § 1 to § 8, wherein calculating a methylation score using the average methylation ratio for each genomic region comprises comparing the average methylation ratio at each genomic region to a reference methylation ratio for each genomic region to determine a methylation ratio score for each genomic region,
and wherein the reference methylation ratio is the average methylation ratio for the same genomic region in or covered by:
a cfDNA sample from a healthy subject, for example a healthy age-matched subject;
a tissue sample from a healthy subject, for example a prostate tissue sample from a healthy subject;
a cancer biopsy sample from a cancer patient, for example a prostate cancer biopsy sample from a prostate cancer patient;
a cancer cell line sample, for example a prostate cancer cell line sample from a prostate cancer cell line;
a sample of white blood cells from a subject, for example the subject or a healthy subject;
a cfDNA sample from a different subject having prostate cancer, wherein preferably the sample is known to comprise cfDNA derived from the prostate cancer subtype (preferably multiple cfDNA samples (for example at least 2, 3, 4, 5, 10, 20, 40, 50, 100, 200, 300 or 500 samples) each from a different subject having prostate cancer, wherein preferably each sample is known to comprise cfDNA derived from the prostate cancer subtype, and more preferably wherein each cfDNA sample has a different level of cfDNA derived from the prostate cancer subtype);
a characterized methylome sequence of a white blood cell;
a characterized methylome sequence of a prostate cancer cell line;
a characterized methylome sequence of a cancerous prostate cell; and/or
a characterized methylome sequence of a non-cancerous prostate cell.
§ 10. The method of § 9, wherein analyzing the methylation score to determine whether the sample comprises cfDNA derived from a prostate cancer subtype comprises determining the number of methylation ratio scores that are indicative of the prostate cancer subtype.
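The per-region comparison of § 9 and the counting step of § 10 can be sketched as below. This is a minimal sketch under stated assumptions: the methylation ratio score is taken here to be the simple difference between the sample ratio and the reference ratio, and a region is counted as indicative when that difference is at least `min_shift` in the direction expected for the region. The specification fixes neither the form of the score nor any threshold, so both choices, together with every name and value used, are hypothetical.

```python
from typing import Dict


def methylation_ratio_scores(
    sample: Dict[str, float], reference: Dict[str, float]
) -> Dict[str, float]:
    """Per-region score: sample ratio minus reference ratio (assumed form)."""
    return {r: sample[r] - reference[r] for r in sample if r in reference}


def count_indicative_regions(
    scores: Dict[str, float],
    expected_direction: Dict[str, str],
    min_shift: float = 0.2,  # hypothetical threshold, not from the specification
) -> int:
    """Count regions whose score shifts in the expected direction by >= min_shift."""
    count = 0
    for region, score in scores.items():
        direction = expected_direction.get(region)
        if direction == "hyper" and score >= min_shift:
            count += 1
        elif direction == "hypo" and score <= -min_shift:
            count += 1
    return count


# Hypothetical values for three regions (illustrative only).
sample_ratios = {"r1": 0.85, "r2": 0.15, "r3": 0.50}
reference_ratios = {"r1": 0.30, "r2": 0.70, "r3": 0.45}
directions = {"r1": "hyper", "r2": "hypo", "r3": "hyper"}

scores = methylation_ratio_scores(sample_ratios, reference_ratios)
n_indicative = count_indicative_regions(scores, directions)
```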
§ 11. The method of any one of § 1 to § 10, wherein the methylome sequence of a cfDNA molecule is determined by using methylation aware sequencing (for example bisulfite sequencing), methylation-sensitive restriction enzyme digestion, methylation-specific PCR, methylation-dependent DNA precipitation, methylated DNA binding proteins/peptides, or single molecule sequencing without sodium bisulfite treatment.
§ 12. The method of any one of § 1 to § 11, wherein the methylome sequence of a cfDNA molecule is determined by performing methylation aware sequencing, for example wherein the methylation aware sequencing comprises treating the DNA molecule with sodium bisulfite and performing sequencing of the treated DNA molecule.
§ 13. The method of any one of § 1 to § 12, wherein the genomic regions are selected from:
a 100 to 150 bp region comprising or having a genomic location defined in Table 8, and
a 10 to 99 bp region within a genomic location defined in Table 8 and comprising at least one CpG locus; or
a 100 to 150 bp region comprising or having a genomic location defined in Table 9, and
a 10 to 99 bp region within a genomic location defined in Table 9 and comprising at least one CpG locus.
§ 14. The method of any one of § 1 to § 13, wherein the genomic regions are selected from:
a 100 to 120 bp region comprising or having a genomic location defined in Table 8, and
a 50 to 99 bp region within a genomic location defined in Table 8 and comprising at least one CpG locus; or
a 100 to 120 bp region comprising or having a genomic location defined in Table 9, and
a 50 to 99 bp region within a genomic location defined in Table 9 and comprising at least one CpG locus.
§ 15. The method of any one of § 1 to § 14, wherein the genomic regions have a 100 bp genomic location defined in Table 8, or wherein the genomic regions have a 100 bp genomic location defined in Table 9.
§ 16. The method of any one of § 1 to § 15, comprising characterising the average methylation ratio at 25 or more, 50 or more, 100 or more, 150 or more, 200 or more, 300 or more, 400 or more, or 500 or more genomic regions (for example comprising determining the average methylation ratio at 25, 50, 100, 150, 200, 300, 400 or 500 genomic regions), wherein the genomic regions have a genomic location defined in Table 8.
§ 17. The method of any one of § 1 to § 16, comprising characterising the average methylation ratio at 25 or more, 50 or more, 100 or more, 125 or more, or 150 genomic regions (for example comprising determining the average methylation ratio at 25, 50, 100, 125, or 150 genomic regions), wherein the genomic regions have a genomic location defined in Table 9.
§ 18. The method of any one of § 1 to § 17, wherein at least 25% of the genomic regions are prostate tissue specific genomic regions; or wherein at least 25% of the regions are prostate cancer specific genomic regions.
§ 19. The method of any one of § 1 to § 18, wherein at least 40% of the genomic regions are prostate cancer specific genomic regions, for example at least 50, 60, 70, 80, 90 or 95% (for example 95, 96, 97, 98, 99 and 100%) of the genomic regions are prostate cancer specific genomic regions; or wherein at least 40% of the genomic regions are prostate tissue specific genomic regions, for example at least 50, 60, 70, 80, 90 or 95% (for example 95, 96, 97, 98, 99 and 100%) of the genomic regions are prostate tissue specific genomic regions.
§ 20. The method of any one of § 1 to § 19, wherein a plurality of cfDNA molecules is at least 10,000, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000, at least 5,000,000, at least 10,000,000, or at least 100,000,000 cfDNA molecules.
§ 21. The method of any one of § 1 to § 20, wherein the prostate cancer is acinar adenocarcinoma prostate cancer, ductal adenocarcinoma prostate cancer, transitional cell cancer of the prostate, squamous cell cancer of the prostate, or small cell prostate cancer.
§ 22. The method of any one of § 1 to § 21 wherein the prostate cancer is castration resistant prostate cancer and/or is metastatic prostate cancer.
§ 23. The method of any one of § 1 to § 22, wherein the prostate cancer subtype is one that has an aggressive clinical course and/or androgen receptor (AR) copy number gain, for example an androgen-insensitive prostate cancer subtype.
§ 24. The method of any one of § 1 to § 23, wherein the sample comprising cfDNA is a blood or plasma sample.
§ 25. The method of any one of § 1 to § 24, further comprising measuring the level of prostate-specific antigen (PSA) in a sample of blood from the subject, and determining if the subject has an abnormal level of PSA in the blood (for example a level of PSA in the blood of at least 4.0 ng/mL), or, if the subject has had a previous PSA test, an increased level of PSA compared to the previous test.
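The PSA check of § 25 reduces to two conditions, which can be sketched as follows. This is a minimal sketch, assuming PSA values are supplied in ng/mL; the function name and parameter names are hypothetical, and the 4.0 ng/mL figure is only the example threshold given in § 25.

```python
from typing import Optional

PSA_EXAMPLE_THRESHOLD_NG_ML = 4.0  # example level given in § 25


def psa_flag(
    psa_ng_ml: float,
    previous_ng_ml: Optional[float] = None,
    threshold_ng_ml: float = PSA_EXAMPLE_THRESHOLD_NG_ML,
) -> bool:
    """True if PSA is at or above the threshold, or has risen since a previous test."""
    if psa_ng_ml >= threshold_ng_ml:
        return True
    return previous_ng_ml is not None and psa_ng_ml > previous_ng_ml
```

For instance, `psa_flag(2.5, previous_ng_ml=1.8)` flags a rise from an earlier test even though the value is below the example threshold.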
§ 26. The method of any one of § 1 to § 25, further comprising repeating the method on a second sample obtained from the subject after the subject has undergone a treatment for prostate cancer, wherein the second sample comprises cfDNA, and comparing the detectable level of cfDNA derived from a prostate cancer subtype in each sample.
§ 27. The method of any one of § 1 to § 26, for screening and/or prognostication of prostate cancer, wherein prostate cancer with a poor prognosis is predicted when cfDNA derived from the prostate cancer subtype is identified in the sample, for example a detectable level of cfDNA derived from the prostate cancer subtype, for example a percentage level of cfDNA derived from the prostate cancer subtype of at least 0.01%.
§ 28. An in-vitro diagnostic kit for use in the detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer, comprising one or more reagents for detecting the presence or absence of at least 10 DNA molecules having a DNA sequence corresponding to all or part of a genomic location comprising at least one CpG locus defined in Table 8 or Table 9.
§ 29. The kit as described in § 28, wherein the kit comprises one or more reagents for detecting the presence or absence of at least 15, 20, 30, 40, 50, 75, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, or 900 DNA molecules (for example 15, 20, 30, 40, 50, 75, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900 or 1000 DNA molecules) having a DNA sequence corresponding to all or part of a genomic location comprising at least one CpG locus defined in Table 8 or Table 9.
§ 30. The kit as described in § 28 or § 29, wherein the kit comprises oligonucleotides for specifically hybridizing to at least a section of the at least 10 DNA molecules (for example, at least 15, 20, 30, 40, 50, 75, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, or 900 DNA molecules) having a DNA sequence corresponding to all or part of a genomic location defined in Table 8 or Table 9.
§ 31. The kit of any one of § 28 to § 30, wherein at least one of the oligonucleotides for specifically hybridizing to at least a section of a DNA molecule is an amplification primer, for example each of the oligonucleotides for specifically hybridizing to at least a section of a DNA molecule is an amplification primer.
§ 32. A computer product comprising a non-transitory computer readable medium storing a plurality of instructions that when executed control a computer system to perform the method of any one of § 1 to § 27.
§ 33. A computer-executable software for performing the method of any one of § 1 to § 27.
§ 34. The kit of any one of § 28 to § 31, wherein the kit comprises instructions for use which define how to determine whether a sample comprises cfDNA derived from a prostate cancer subtype and/or the level of cfDNA in the sample that is derived from a prostate cancer subtype, and/or comprises a computer product as defined in § 32, and/or a computer-executable software as defined in § 33.
§ 35. A computer-implemented method for detecting, screening, monitoring, staging, classification, selecting treatment for, ascertaining whether treatment is working in, and/or prognostication of prostate cancer in a sample obtained from a subject, wherein the sample comprises circulating free DNA (cfDNA), the method comprising:
receiving a data set in a computer comprising a processor and a computer readable medium, wherein the data set comprises the methylome sequence of a plurality of cfDNA molecules in the sample;
and wherein the computer readable medium comprises instructions that, when executed by the processor, cause the computer to perform a method of any one of § 1 to § 27, for example cause the computer to perform a method comprising the following steps:
characterizing the methylome sequence of a plurality of cfDNA molecules in the sample, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule;
determining the average methylation ratio at 10 or more genomic regions, each genomic region being selected from the group consisting of:
a 100 to 200 bp region comprising or having a genomic location defined in Table 8, and
a 2 to 99 bp region within a genomic location defined in Table 8 and comprising at least one CpG locus,
and wherein each of the genomic regions is covered by at least one sequence read of at least one characterized methylome sequence;
calculating a methylation score using the average methylation ratio for each of the genomic regions;
analyzing the methylation score to determine whether the sample comprises cfDNA derived from a prostate cancer subtype.
§ 36. A computer-implemented method for classifying a prostate cancer patient into one or more of a plurality of treatment categories, the method comprising determining the level of prostate cancer DNA in a sample obtained from a subject, wherein the sample comprises circulating free DNA (cfDNA), the method comprising:
receiving a data set in a computer comprising a processor and a computer readable medium, wherein the data set comprises the methylome sequence of a plurality of cfDNA molecules in a sample obtained from a subject, wherein the sample comprises cfDNA;
and wherein the computer readable medium comprises instructions that, when executed by the processors, causes the computer to perform a method of any one of § 1 to § 27, for example causes the computer to perform a method comprising the following steps:
characterizing the methylome sequence of a plurality of cfDNA molecules in the sample, wherein the methylome sequence of a cfDNA molecule is the DNA sequence and the methylation profile of the molecule;
determining the average methylation ratio at 10 or more genomic regions, each of the genomic regions being selected from the group consisting of:
a 100 to 200 bp region comprising or having a genomic location defined in Table 8, and
a 2 to 99 bp region within a genomic location defined in Table 8 and comprising at least one CpG locus,
and wherein each of the genomic regions is covered by at least one sequence read of at least one characterized methylome sequence;
calculating a methylation score using the average methylation ratio for each of the genomic regions;
analyzing the methylation score to determine whether the sample comprises cfDNA derived from a prostate cancer subtype.
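The sequence of steps recited above (average methylation ratio per region, methylation score, analysis of the score) can be sketched end to end as follows. This is a minimal sketch under several assumptions that are not stated in this section: each read is reduced to a tuple of region identifier, number of methylated CpG calls and total number of CpG calls; the average methylation ratio of a region is taken to be methylated calls divided by total calls pooled over its reads; the score is the median over regions; and a score above a reference score is treated as indicating tumour-derived cfDNA. All names and values are hypothetical.

```python
from statistics import median
from typing import Dict, List, Tuple

# (region identifier, methylated CpG calls, total CpG calls) for one read
Read = Tuple[str, int, int]


def average_methylation_ratios(reads: List[Read]) -> Dict[str, float]:
    """Pool CpG calls per region and return methylated / total (assumed definition)."""
    methylated: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for region, m, t in reads:
        methylated[region] = methylated.get(region, 0) + m
        total[region] = total.get(region, 0) + t
    return {r: methylated[r] / total[r] for r in total if total[r] > 0}


def score_sample(
    reads: List[Read], reference_score: float, min_regions: int = 10
) -> Tuple[float, bool]:
    """Return (methylation score, whether it exceeds the reference score)."""
    ratios = average_methylation_ratios(reads)
    if len(ratios) < min_regions:
        raise ValueError(f"need at least {min_regions} regions, got {len(ratios)}")
    score = median(ratios.values())
    # Assumed decision rule: a higher score than the reference is indicative.
    return score, score > reference_score


# Hypothetical reads over ten regions (illustrative values only).
reads: List[Read] = [(f"region_{i}", 8, 10) for i in range(10)]
reads.append(("region_0", 2, 10))
score, indicative = score_sample(reads, reference_score=0.3)
```

For a panel of hypomethylated regions the comparison would run in the opposite direction; the single inequality here is only one possible reading of the analysis step.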
§ 37. The method of any one of § 1 to § 27, § 35 or § 36 further comprising treating the subject for prostate cancer using a therapeutic agent for the treatment of prostate cancer; or ceasing or altering treatment with a therapeutic agent for the treatment of prostate cancer; or initiating a non-therapeutic agent treatment for prostate cancer (for example initiation of treatment by surgery or radiation).
§ 38. A method for treating prostate cancer in a subject comprising the method of any one of § 1 to § 27, § 35 or § 36 and further comprising treating the subject using a therapeutic agent for the treatment of prostate cancer, surgery, and/or radiotherapy; or a method for treating prostate cancer in a subject, comprising administering to the subject an effective amount of a therapeutic agent for the treatment of prostate cancer after the subject has been determined to have a prostate cancer subtype based on a method as defined in any one of § 1 to § 27, § 35, or § 36.
§ 39. The method of § 38, wherein the method of any one of § 1 to § 27, § 35, or § 36 is performed before and/or after treating the subject.
§ 40. The method of any one of § 37 to § 39, comprising performing the method of any one of § 1 to § 27, § 35, or § 36 before treating the subject, and subsequently repeating the method of any one of § 1 to § 27, § 35, or § 36 after the treatment, for example at least 1 week, at least 2 weeks, at least 3 weeks, at least 4 weeks, at least 1 month, at least 2 months, at least 3 months, at least 6 months, at least 9 months, at least 12 months, at least 24 months or at least 36 months after treating the subject.
§ 41. The method of § 40, wherein the method comprises continuing to treat the subject with the therapeutic agent for the treatment of prostate cancer if the cfDNA derived from a prostate cancer subtype is detected in the sample and/or the sample comprises a level of cfDNA derived from the prostate cancer subtype that is substantially the same in the initial and subsequent method or lower in the subsequent method than in the initial method.
§ 42. The method of § 40 or § 41, wherein the method comprises
ceasing or altering treatment with the therapeutic agent for the treatment of prostate cancer; and/or
initiating treatment with a second therapeutic agent for the treatment of prostate cancer; and/or
initiating a non-therapeutic agent treatment (e.g., surgery or radiation),
if the sample comprises cfDNA derived from a prostate cancer subtype and/or the sample comprises a level of cfDNA derived from a prostate cancer subtype that is substantially the same in the initial and subsequent method or higher in the subsequent method than in the initial method.
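The before-and-after comparison of § 40 to § 42 can be sketched as a trend classification followed by a lookup of the options each clause permits. This is a minimal sketch, assuming the level of tumour-derived cfDNA is available as a percentage for both the initial and the subsequent sample, and assuming "substantially the same" means agreement within a relative tolerance; the 10% tolerance, the function names and the option labels are hypothetical. The separate condition on mere detection of tumour-derived cfDNA is not modelled.

```python
import math
from typing import List


def compare_levels(initial: float, subsequent: float, rel_tol: float = 0.10) -> str:
    """Classify the subsequent level as 'lower', 'same' or 'higher' than the initial."""
    if math.isclose(initial, subsequent, rel_tol=rel_tol):
        return "same"
    return "lower" if subsequent < initial else "higher"


def treatment_options(trend: str) -> List[str]:
    """Options permitted by § 41 and § 42 for a given trend."""
    options: List[str] = []
    if trend in ("lower", "same"):
        options.append("continue current therapeutic agent (§ 41)")
    if trend in ("higher", "same"):
        options.append("cease or alter treatment, add a second agent, or surgery/radiation (§ 42)")
    return options
```

Because both § 41 and § 42 recite a level that is substantially the same, the sketch returns both options for the "same" trend rather than choosing between them.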
§ 43. The method of § 42, wherein the second therapeutic agent is a chemotherapeutic agent or a PARP inhibitor.
§ 44. A method of treating a subject in need of treatment with a therapeutic agent for the treatment of prostate cancer, comprising
i) performing the method of any one of § 1 to § 27, § 35, or § 36 to determine if the sample comprises cfDNA derived from a prostate cancer subtype and/or determine the level of cfDNA in the sample derived from a prostate cancer subtype;
ii) administering a therapeutic agent for the treatment of prostate cancer if the sample comprises cfDNA derived from a prostate cancer subtype and/or if the sample comprises a level of cfDNA derived from a prostate cancer subtype (for example 0.01% or more cfDNA derived from a prostate cancer subtype).
§ 45. A therapeutic agent for the treatment of prostate cancer for use in the treatment of prostate cancer, wherein
i) the method of any one of § 1 to § 27, § 35 or § 36 is performed to determine if a sample comprises cfDNA derived from a prostate cancer subtype in a subject and/or determine the level of cfDNA in the sample derived from a prostate cancer subtype in a subject;
ii) the therapeutic agent is administered if the sample comprises cfDNA derived from a prostate cancer subtype in the subject and/or if the sample comprises a level of cfDNA derived from a prostate cancer subtype (for example 0.01% or more cfDNA derived from a prostate cancer subtype).
§ 46. A method as described in any one of § 39 to § 44, or a therapeutic agent for the treatment of prostate cancer for use as described in § 45, wherein a second therapeutic agent for the treatment of prostate cancer is administered if a sample from the subject has cfDNA derived from a prostate cancer subtype and/or has a level of cfDNA derived from a prostate cancer subtype (for example a detectable level of prostate cancer DNA, for example 0.01% or more cfDNA derived from a prostate cancer subtype).
§ 47. The method as described in § 44, or a therapeutic agent for the treatment of prostate cancer for use as described in § 45, wherein
(iii) at least 1 week, at least 2 weeks, at least 3 weeks, at least 4 weeks, at least 1 month, at least 2 months, at least 3 months, at least 6 months, at least 9 months, at least 12 months, at least 24 months, or at least 36 months, after the administration of the therapeutic agent, a further sample comprising cfDNA is obtained from the subject, and the method of any one of § 1 to § 27, § 35, or § 36 is performed to determine if the further sample comprises cfDNA derived from a prostate cancer subtype in a subject and/or determine the level of cfDNA that is derived from a prostate cancer subtype.
§ 48. A method of determining one or more suitable therapeutic agents for the treatment of prostate cancer for a subject having prostate cancer comprising
performing the method of any one of § 1 to § 27, § 35 or § 36;
determining the one or more suitable therapeutic agents for the treatment of prostate cancer by reference to whether the sample comprises cfDNA derived from a prostate cancer subtype and/or the level of cfDNA in the sample that is derived from a prostate cancer subtype, whereby one therapeutic agent is suitable for a subject with a sample having no cfDNA derived from a prostate cancer subtype (for example an undetectable level of cfDNA derived from a prostate cancer subtype) or a level of cfDNA derived from a prostate cancer subtype of less than 0.01%, and two or more therapeutic agents are suitable for a subject with a level of cfDNA derived from a prostate cancer subtype (for example a percentage level of cfDNA derived from a prostate cancer subtype of at least 0.01%);
or whereby a therapeutic agent selected from a first list of therapeutic agents is suitable for a subject with a sample having no cfDNA derived from a prostate cancer subtype (for example an undetectable level of cfDNA derived from a prostate cancer subtype) or a level of cfDNA derived from a prostate cancer subtype of less than 0.01%, and a therapeutic agent from a second list of therapeutic agents, or two or more therapeutic agents from the first list, is suitable for a subject with a level of cfDNA derived from a prostate cancer subtype (for example a percentage level of cfDNA derived from a prostate cancer subtype of at least 0.01%).
§ 49. A method of determining a suitable treatment regimen for a subject having prostate cancer comprising
performing the method of any one of § 1 to § 27, § 35 or § 36;
determining the treatment regimen by reference to whether the sample comprises cfDNA derived from a prostate cancer subtype and/or the level of cfDNA in the sample that is derived from a prostate cancer subtype, whereby a standard treatment is suitable for a subject with a sample having no cfDNA derived from a prostate cancer subtype (for example an undetectable level of cfDNA derived from a prostate cancer subtype) or a percentage level of cfDNA derived from a prostate cancer subtype in the cfDNA sample of less than 0.01%, and a non-standard treatment is suitable for a subject when a level of cfDNA derived from a prostate cancer subtype (for example a detectable level of cfDNA derived from a prostate cancer subtype in the cfDNA sample) or a percentage level of cfDNA derived from a prostate cancer subtype in the cfDNA sample of at least 0.01% is determined.
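The threshold rule of § 49 can be sketched as a single comparison. This is a minimal sketch, assuming the percentage level of tumour-derived cfDNA has already been estimated and that an undetectable level is represented by `None`; the function name and the return labels are hypothetical, and 0.01% is the example threshold recited in § 49.

```python
from typing import Optional

THRESHOLD_PERCENT = 0.01  # example percentage level recited in § 49


def select_regimen(level_percent: Optional[float]) -> str:
    """Return 'standard' or 'non-standard' for a percentage level of tumour cfDNA.

    `None` stands for an undetectable level (an assumption of this sketch).
    """
    if level_percent is None or level_percent < THRESHOLD_PERCENT:
        return "standard"
    return "non-standard"
```

The same comparison underlies the choice between one and two or more therapeutic agents in § 48.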
§ 50. The method as described in § 49, wherein the standard treatment is a treatment with a therapeutic agent for the treatment of prostate cancer, and a non-standard treatment is a treatment with two or more therapeutic agents for the treatment of prostate cancer;
or wherein the standard treatment is a treatment with a hormonal agent for the treatment of prostate cancer, and a non-standard treatment is a treatment with a hormonal agent for the treatment of prostate cancer, and a chemotherapeutic agent for the treatment of prostate cancer and/or an immunotherapy treatment of prostate cancer and/or a targeted treatment of prostate cancer and/or a biologic agent treatment of prostate cancer.
§ 51. A computerized method and/or computer-assisted method for determining one or more suitable therapeutic agents for the treatment of prostate cancer for a subject having prostate cancer, the method comprising performing the steps of § 48; or for selecting a treatment regimen for a subject having prostate cancer, the method comprising the steps of § 49 or § 50.
§ 52. A method or therapeutic agent as described in any one of § 37 to § 51, wherein the therapeutic agent for the treatment of prostate cancer is selected from the group consisting of a hormonal agent, a targeted agent, a biologic agent, an immunotherapy agent, and a chemotherapy agent;
for example: a hormonal agent selected from LHRH agonists (for example leuprolide, goserelin, triptorelin, or histrelin), LHRH antagonists (for example degarelix), androgen blockers (for example abiraterone or ketoconazole), anti-androgens (for example flutamide, bicalutamide, nilutamide, enzalutamide, apalutamide or darolutamide), estrogens, and steroids (for example prednisone or dexamethasone);
a targeted agent selected from a poly(ADP-ribose) polymerase (PARP) inhibitor (for example olaparib, rucaparib, niraparib or talazoparib), an epidermal growth factor receptor (EGFR) inhibitor (for example gefitinib, erlotinib, afatinib, brigatinib, icotinib, cetuximab, osimertinib, adavosertib, or lapatinib), and a tyrosine kinase inhibitor (for example imatinib, gefitinib, erlotinib, or sunitinib);
a biologic agent selected from monoclonal antibodies (for example pertuzumab, trastuzumab and solitomab), hormones (for example a hormonal agent selected from LHRH agonists (for example leuprolide, goserelin, triptorelin, or histrelin), LHRH antagonists (for example degarelix), androgen blockers (for example abiraterone or ketoconazole), anti-androgens (for example flutamide, bicalutamide, nilutamide, enzalutamide, apalutamide or darolutamide), and estrogens), interferons (for example interferon-α, -β, or -γ), and interleukin-based products (for example interleukin-2);
an immunotherapy agent selected from a cancer vaccine (for example sipuleucel-T), T-cell therapy, monoclonal antibody therapy, immune checkpoint therapy (for example a PD-1 inhibitor (e.g. pembrolizumab, nivolumab, cemiplimab or spartalizumab), a PD-L1 inhibitor (e.g. atezolizumab, avelumab or durvalumab), or a CTLA-4 inhibitor (e.g. ipilimumab)), and non-specific immunotherapies (for example interferons and interleukins);
a chemotherapy agent selected from docetaxel, cabazitaxel, and c-Met inhibitors (for example cabozantinib).
Number | Date | Country | Kind |
---|---|---|---|
1915469.9 | Oct 2019 | GB | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/GB2020/052706 | 10/23/2020 | WO |