The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Jun. 9, 2021, is named ‘Sequence listing as filed 17 Jun. 2021 N417235WO MGW JAS.txt’ and is 3,022,158 bytes in size.
The present invention relates to assays for predicting the presence, absence or development of grade 3 cervical intra-epithelial neoplasia (CIN3) and/or cancer in an individual, particularly cervical or endometrial cancer, most preferably cervical cancer, by determining the methylation status of certain CpGs in a population of DNA molecules in a sample which has been taken from the individual, deriving an index value based on the methylation status of the certain CpGs, and predicting the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in the individual based on the cancer index value. The invention further relates to a method of treating and/or preventing CIN3 and/or cervical cancer in an individual, the method comprising assessing the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual by performing the assays of the invention, followed by administering one or more therapeutic or preventative treatments or measures to the individual based on the assessment. The invention further provides a method of monitoring the CIN3 and/or cancer status of an individual according to changes in the individual's cancer index value over the course of time. The invention further relates to arrays which are suitable for performing the assays of the invention.
The project leading to this application has been funded by the European Commission's Horizon 2020 Research and Innovation Action, H2020 FORECEE under Grant Agreement No. 634570, the European Commission's Horizon 2020 European Research Council Executive Agency, H2020 BRCA-ERC under Grant Agreement No. 742432 as well as the charity, The Eve Appeal.
Cervical cancer screening has been the most successful personalised cancer prevention strategy to date; the screening aims to identify women with a pre-invasive lesion, which is then surgically excised.
At this point in time, the majority of countries are changing screening from cytology to HPV testing as the primary screen and utilising cytology to triage HPV+ve women for colposcopic assessment. However, several challenges remain for HPV-based screening:
To date, the best performing triage algorithm for HPV+ve women with respect to diagnosing CIN3+ is dual immunostaining for p16/Ki-67; both cytology and dual staining have a specificity of 75% but dual staining is more sensitive (74.9%) when compared with conventional cytology (51.9%).
The present inventors, along with other investigators, have previously shown the feasibility of utilising DNA methylation markers in order to identify women with pre-invasive or invasive cancers. The clinical use of DNA methylation markers to identify women at high risk for CIN3+ has been hindered by several factors:
Using a cohort-based nested case/control setting, the inventors have developed and validated a DNA methylation signature (called Women's cancer risk IDentification CIN3 index, WID-CIN3-index) in cervical smear samples which is capable of both diagnosing and predicting the future risk of CIN3+ in an individual as well as cancer, particularly cervical or endometrial cancer, most preferably cervical cancer.
The current inventors set out to understand whether DNAme (DNA methylation) profiles may be used to detect the presence or absence of grade 3 cervical intra-epithelial neoplasia (CIN3) and/or cervical cancer. The inventors also set out to understand whether said DNAme profiles may be associated with the development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, and therefore whether such profiles may be capable of functioning as surrogate markers for individual stratification purposes in connection with CIN3 and/or cervical cancer.
In this regard the inventors have succeeded in developing assays involving “cancer index values” which are derived from and associated with DNAme profiles established from samples comprising epithelial cells, and which values can be used to stratify the individual in connection with cancer. The sample may particularly be derived from the cervix, the vagina, the buccal area, blood and/or urine. The sample is preferably a cervical liquid-based cytology sample, and more preferably a cervical smear sample. A preferred sample for use in any of the assays described and defined herein is a cervical tissue sample. A particularly preferred sample for use in any of the assays described and defined herein is a cervical smear sample.
The cancer index value is determined from data relating to the methylation status of one or more CpGs in a panel of CpGs as further defined and described herein. CpGs of the panel are methylation sites in DNA from cells derived from/obtained from samples from tissue in which the native tissue structure is preserved e.g. a biopsy, or a sample comprising exfoliated cells from a tissue surface. The samples may comprise epithelial cells. The sample may particularly be derived from the cervix, the vagina, the buccal area, blood and/or urine. The sample is preferably a cervical smear sample and more preferably a cervical liquid-based cytology sample which can be collected by a health care professional or by a woman herself (self-collection).
For the purposes of the present invention, the cancer index value may be used interchangeably herein with “WID-CIN-Index”, “WID-Index”, “cancer index”, “index” or “index value” (WID=women's risk identification). Furthermore, any reference to a cancer index value in the context of the present invention, may be equally used for the assessment of the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual.
Based on studies with patients known to be CIN3-negative and/or free of cervical cancer, the inventors have established cancer index values, using specific panels of CpGs, which have been determined to be associated with/characteristic of cervical tissue which is CIN3-negative and/or negative for cervical cancer. Based on studies with patients known to possess CIN3 and/or cervical cancer, the inventors have established cancer index values which have been determined to be associated with/characteristic of cervical tissue which is positive for CIN3 and/or cervical cancer. Based on studies with patients known to be CIN3-negative and/or free of cervical cancer, wherein the same patients when assayed between one and four years later are subsequently determined to be CIN3-positive and/or positive for cervical cancer, the inventors have established cancer index values which have been determined to be associated with/characteristic of cervical tissue which is positive for CIN3 and/or cervical cancer.
Thus, the inventors have been able to establish cancer index values, using specific panels of CpGs, which can characterize an individual as having CIN3 and/or cancer or not having CIN3 and/or cancer, or having a high risk of CIN3 and/or cancer development. The cancer is preferably cervical or endometrial cancer, most preferably cervical cancer.
By determining the methylation profile-based cancer index value from a sample derived from the individual, the individual may be seen to possess a cancer index value which correlates with those possessed by individuals who are known, via the inventors' studies described herein, to be CIN3 positive or negative and/or cervical cancer positive or negative, or to become CIN3 positive or negative and/or cervical cancer positive or negative. Such correlations have been determined with a high degree of statistical accuracy, particularly with respect to parameters relevant to biological assays such as receiver operating characteristics (ROC) sensitivity and specificity, as well as area under the curve (AUC). Accordingly, by determining the cancer index value from a sample from a given individual, the individual may be determined to possess cervical tissue that is positive for CIN3 and/or cancer, i.e. the individual is diagnosed as having CIN3 and/or cervical cancer. Conversely, by determining the cancer index value from a sample from a given individual, the individual may be determined to possess cervical tissue which is negative for CIN3 and/or cancer, i.e. the individual is diagnosed as not having CIN3 and/or cervical cancer.
Assessment of CIN3 in accordance with assays of the invention may identify individuals likely to develop CIN3 in the future, particularly within about four years from the date of the first assessment of the individual with one or more of the assays described herein.
Assessment of the development of cancer in accordance with the assays of the invention may refer to assessing progression or worsening of a pre-existing cancer lesion in an individual. Assessment of the development of cancer in accordance with the assays of the invention may refer to predicting the likelihood of recurrence of cancer.
The observations described herein establish that the cancer index value, as further described and defined herein, is dynamic and can change over the course of time. The cancer index value may therefore be used to monitor an individual's cancer status and risk of cancer development. Moreover, the cancer index value may be used to monitor the efficacy of cancer treatments being administered to an individual, including therapeutic treatments and preventative treatments.
Accordingly, in the context of the present invention, by determining the cancer index value from a sample from a given individual it is possible to assess the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual, or in other words to stratify the individual for cancer. In the context of the present invention, stratification for cancer is the process of categorizing the individual as being a member of a group of individuals who possess a phenotype in connection with cancer, including the presence or absence of cancer in the individual, or the development of cancer, i.e. by having epithelial cells, particularly derived from the cervix, the vagina, the buccal area, blood and/or urine, more preferably a cervical smear sample and even more preferably a cervical liquid-based cytology sample.
As explained herein, the assay methods of the invention are based on a cancer index value derived from a methylation profile from DNA originating from cells, particularly derived from the cervix, the vagina, the buccal area, blood and/or urine, more preferably a cervical liquid-based cytology sample, and even more preferably a cervical smear sample. Accordingly, the assays provide means for correlating an epithelial cell, or most preferably a cervical smear sample-derived DNA methylation profile with a status connected with cervical or endometrial cancer ranging from the individual being cancer negative, to the individual being cancer positive, with high statistical accuracy. Because the assays of the invention provide a correlation between the methylation profile and the disease status, the skilled person will appreciate that as part of the stratification process and outcome, disease status is assigned on the basis of a likelihood. As such, the methods of the invention provide assays which are predictive of an individual's status with respect to cancer. The assays of the invention accordingly provide means for predicting the presence or absence of cancer in an individual. The assays of the invention accordingly also provide means for predicting the development of cancer in an individual. The assays of the invention can provide means for predicting the development of cancer in an individual since the inventors have demonstrated that specific cancer index values can define cervical and endometrial tissue which is cancer negative, whilst others can define cervical and endometrial tissue which is cancer positive, and since the specific cancer index values may be dynamic and thereby increased in association with tumour stage and further increased cancer risk factors such as the woman being post-menopausal, the values may be subject to change along a scale of cancer risk.
Whilst disease status may be assigned on the basis of a likelihood, the inventors have demonstrated herein that correlations between DNA methylation profile and cancer status using cancer index values can be achieved with a very high degree of statistical accuracy using parameters relevant to biological assays, as described further herein. As such, the assays of the invention provide means for predicting the presence or absence of cancer in an individual and for predicting the development of cancer in an individual, and for stratifying an individual for cancer, and wherein the prediction/stratification can be defined to be statistically highly reliable and robust. This in turn means that the prediction/stratification can be made with a high level of confidence. The assays of the invention can be defined to be statistically accurate by means known in the art, as further described and defined herein. The assays of the invention can be defined according to parameters relating to their statistical specificity and sensitivity. These parameters define the likelihood of false positive and false negative test results. The lower the proportion of false positive and false negative test results the more statistically accurate the assay becomes. In this regard the inventors have established CpG panels, as described and defined further herein, wherein the methylation status of CpGs in the panel can be used to establish cancer index values such that the assays produce statistically accurate predictions of cancer status. Accordingly, the inventors have determined that the assays described herein may be defined according to statistical parameters such as percentage specificity and sensitivity and also by receiver operating characteristics (ROC) area under the curve (AUC). All such means are known in the art and are known to be defined measures of statistical accuracy for biological assays such as those described and defined herein.
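By way of non-limiting illustration only, the statistical parameters referred to above (sensitivity, specificity and ROC AUC) may be computed from a set of cancer index values and known case/control labels as sketched below. The function names, the convention that an index value at or above the threshold is called positive, and the pairwise (rank-based) AUC calculation are assumptions made for this sketch and are not taken from the Examples.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    index_values: Sequence[float], is_case: Sequence[bool], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity); index >= threshold is called positive."""
    tp = fn = tn = fp = 0
    for value, case in zip(index_values, is_case):
        positive = value >= threshold
        if case and positive:
            tp += 1          # case correctly called positive
        elif case:
            fn += 1          # case missed (false negative)
        elif positive:
            fp += 1          # control wrongly called positive
        else:
            tn += 1          # control correctly called negative
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(index_values: Sequence[float], is_case: Sequence[bool]) -> float:
    """AUC as the probability that a random case scores above a random control.

    Ties between a case and a control count as one half.
    """
    cases = [v for v, c in zip(index_values, is_case) if c]
    controls = [v for v, c in zip(index_values, is_case) if not c]
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

A lower proportion of false positives raises the specificity returned by the first function, and a lower proportion of false negatives raises the sensitivity, which corresponds to the relationship described above.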
Thus the methods of the invention provide assays which can be used, with a high degree of statistical accuracy, to predict the presence, absence or development of CIN3 and/or cancer, particularly cervical and endometrial cancer, most preferably cervical cancer. The methods of the invention provide assays which can be used, with a high degree of statistical accuracy, to stratify an individual with respect to cancer status. Accordingly, the methods of the invention provide useful information to individuals and their physicians concerning patient cancer status. This information may help inform actual therapeutic treatment measures if the presence of cancer is identified in the individual. The information may help to monitor the progress of therapeutic treatment measures in the individual by monitoring changes in the cancer index value over the course of a period of time. The information may help to monitor the progress of prophylactic or preventative treatment measures in the individual by monitoring changes in the cancer index value over the course of a period of time. As such the methods of the invention offer significant advantages in the personalized prevention and early detection as well as treatment and management of cancer in individuals.
Accordingly, the invention provides an assay for assessing the presence, absence or development of grade 3 cervical intra-epithelial neoplasia (CIN3) and/or cancer in an individual, the assay comprising:
The assay of the invention may be performed as above and additionally wherein the panel of one or more CpGs comprises at least 500 CpGs selected from the CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, preferably wherein the assay is characterised as having an AUC of at least 0.80.
The assay of the invention may be performed as above and additionally wherein the panel of one or more CpGs comprises at least the CpGs identified in SEQ ID NOs 1 to 500 and identified at nucleotide positions 61 to 62, preferably wherein the assay is characterised as having an AUC of at least 0.92.
The assay of the invention may be performed as above and additionally wherein the panel of one or more CpGs comprises at least 1000 CpGs selected from the CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, preferably wherein the assay is characterised as having an AUC of at least 0.80.
The assay of the invention may be performed as above and additionally wherein the panel of one or more CpGs comprises at least the CpGs identified in SEQ ID NOs 1 to 1000 and identified at nucleotide positions 61 to 62, preferably wherein the assay is characterised as having an AUC of at least 0.92.
The assay of the invention may be performed as above and additionally wherein the panel of one or more CpGs comprises at least 1500 CpGs selected from the CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, preferably wherein the assay is characterised as having an AUC of at least 0.82.
The assay of the invention may be performed as above and additionally wherein the panel of one or more CpGs comprises at least the CpGs identified in SEQ ID NOs 1 to 1500 and identified at nucleotide positions 61 to 62, preferably wherein the assay is characterised as having an AUC of at least 0.92.
The assay of the invention may be performed as above and additionally wherein the panel of one or more CpGs comprises at least 2000 CpGs selected from the CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, preferably wherein the assay is characterised as having an AUC of at least 0.81.
The assay of the invention may be performed as above and additionally wherein the panel of one or more CpGs comprises at least the CpGs identified in SEQ ID NOs 1 to 2000 and identified at nucleotide positions 61 to 62, preferably wherein the assay is characterised as having an AUC of at least 0.92.
The assay of the invention may be performed as above and additionally wherein the panel of one or more CpGs comprises at least the 5000 CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, and further wherein the assay is characterised as having an AUC of at least 0.92.
The assay of the invention may be performed as above and additionally wherein the step of determining in the population of DNA molecules in the sample the methylation status of the one or more CpGs in the panel comprises determining a β value of each CpG.
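By way of non-limiting illustration only, a β value for a single CpG is commonly computed on array platforms as the methylated signal intensity divided by the sum of the methylated and unmethylated signal intensities plus a small stabilising offset. The sketch below assumes that convention; the function name, the clamping of negative intensities to zero and the default offset of 100 are assumptions made for this sketch and are not specified in this passage.

```python
def beta_value(methylated: float, unmethylated: float, offset: float = 100.0) -> float:
    """Return a beta value between 0 (unmethylated) and 1 (fully methylated).

    `methylated` and `unmethylated` are hypothetical signal intensities for
    one CpG; `offset` is an assumed constant that stabilises the ratio when
    both intensities are low.
    """
    m = max(methylated, 0.0)    # negative background-corrected signals are clamped
    u = max(unmethylated, 0.0)
    return m / (m + u + offset)
```

The offset keeps the ratio defined when both intensities are zero, at the cost of slightly compressing β values away from 1.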
The assay of the invention may be performed as above and additionally wherein the step of deriving the cancer index value based on the methylation status of the one or more CpGs in the panel comprises:
The assay of the invention may be performed as above and additionally wherein the cancer index value is a WID-CIN-Index cancer index value, and wherein the mathematical model which is applied to the methylation β-value data set to generate the cancer index is an algorithm according to the following formula:
wherein:
The assay of the invention may be performed as above and additionally wherein when the cancer index value for the individual is about −0.331 or more, the individual is assessed as having CIN3 and/or cancer or as having a high risk of CIN3 and/or cancer development, or wherein when the cancer index value for the individual is less than about −0.331, the individual is assessed as not having CIN3 and/or cancer or as having a low risk of CIN3 and/or cancer development, preferably wherein:
The assay of the invention may be performed as above and additionally wherein when the cancer index value for the individual is about −0.167 or more, the individual is assessed as having CIN3 and/or cancer, or as having a high risk of CIN3 and/or cancer development, or wherein when the cancer index value for the individual is less than about −0.167, the individual is assessed as not having CIN3 and/or cancer or as having a low risk of CIN3 and/or cancer development, preferably wherein:
The assay of the invention may be performed as above and additionally wherein the step of determining the methylation status of a panel of one or more CpGs comprises determining the methylation status of one or more CpGs denoted by CG identified in a panel of one or more DMRs defined by SEQ ID NOs 5001 to 5418, optionally wherein the panel of one or more CpGs comprises two or more CpGs denoted by CG identified in the panel of DMR(s), three or more CpGs denoted by CG identified in the panel of DMR(s), four or more CpGs denoted by CG identified in the panel of DMR(s), or all CpGs denoted by CG identified in the DMR(s) defined by SEQ ID NOs 5001 to 5418.
The assay of the invention may be performed as above and additionally wherein the step of determining the methylation status of a panel of the one or more CpGs comprises determining the methylation status of five or more, six or more, seven or more, eight or more, or nine or more, or all of the CpGs denoted by CG within any one or more of the DMRs defined by SEQ ID NOs 5001 to 5418.
The assay of the invention may be performed as above and additionally wherein the step of determining the methylation status of a panel of one or more CpGs comprises determining the methylation status of two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, or nine or more, or all of the CpGs denoted by CG within:
The assay of the invention may be performed as above and additionally wherein the step of determining the methylation status of a panel of one or more CpGs comprises determining the methylation status of one or more CpGs within any one or more DMRs selected from the group of DMRs consisting of DMRs 1 to 418 as defined by SEQ ID NOs 5001 to 5418, including:
The assay of the invention may be performed as above and additionally wherein:
The assay of the invention may be performed as above and additionally wherein:
The assay of the invention may be performed as above and additionally wherein the step of determining the methylation status of the one or more CpGs in the panel further comprises or additionally comprises determining the methylation status of each CpG within one or more of the sequences identified by SEQ ID NOs 5703 to 5786.
The assay of the invention may be performed as above and additionally wherein the step of determining the methylation status of the one or more CpGs in the panel comprises determining each CpG within:
The assay of the invention may be performed as above and additionally wherein the step of determining the methylation status of the one or more CpGs in the panel further comprises or additionally comprises determining the methylation status of each CpG within one or more of the sequences identified by SEQ ID NOs 5703, 5731, 5759, 5723, 5751, 5779, 5713, 5741, 5769, 5706, 5734, 5762, 5705, 5733, and 5761, or even more preferably wherein the panel of one or more CpGs further comprises or additionally comprises determining the methylation status of each CpG within one or more of the sequences identified by SEQ ID NOs 5703, 5731, 5759, 5713, 5741, 5769, 5706, 5734, and 5762.
The assay of the invention may be performed as above and additionally wherein the step of determining in the population of DNA molecules in the sample the methylation status of each CpG in the panel of one or more CpGs comprises:
The assay of the invention may be performed as above and additionally wherein the step of determining the methylation status of each CpG in the panel of one or more CpGs comprises:
The invention also provides a method of treating or preventing CIN3 and/or cervical cancer in an individual, the method comprising:
The method of the invention may be performed as above and additionally wherein the individual is assessed as not having CIN3 and/or cancer or as having a low risk of CIN3 and/or cancer development, and wherein the cancer index value is about −0.530 or more and is less than about −0.330, and preferably wherein the assay comprises determining methylation β-values for each CpG in the panel of one or more CpGs, the individual is subjected to one or more treatments according to their cancer index value, the one or more treatments comprise a repeat assay according to any one of the assays of the invention, preferably wherein the repeat assay is performed about one year after the previous assay.
The assay of the invention may be performed as above and additionally wherein the individual is assessed as having a moderate risk of having CIN3 and/or cancer or as having a moderate risk of CIN3 and/or cancer development, and wherein the cancer index value is about −0.330 or more and is less than about −0.170, and preferably wherein the assay comprises determining methylation β-values for each CpG in the panel of one or more CpGs, the individual is subjected to one or more treatments according to their cancer index value, the one or more treatments comprise a test for human papilloma virus (HPV) status and wherein:
The assay of the invention may be performed as above and additionally wherein the individual is assessed as having CIN3 and/or cancer or as having a high risk of CIN3 and/or cancer development, and wherein the cancer index value is about −0.170 or more, and preferably wherein the assay comprises determining methylation β-values for each CpG in the panel of one or more CpGs, the individual is subjected to one or more treatments according to their cancer index value, the one or more treatments comprise a colposcopy and, wherein the colposcopy is negative, an endometrial biopsy and hysteroscopy.
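By way of non-limiting illustration only, the three cancer index value ranges set out above may be mapped to the corresponding follow-up measures as sketched below. The function and constant names are assumptions made for this sketch, the cut-offs are the "about" values recited above treated as exact, and index values below about −0.530 are reported as unspecified because no measure is recited for them in this passage.

```python
# Approximate cut-offs recited above, treated as exact for this sketch.
LOW_FLOOR = -0.530       # lower bound of the low-risk range
MODERATE_FLOOR = -0.330  # lower bound of the moderate-risk range
HIGH_FLOOR = -0.170      # lower bound of the high-risk range


def triage(index_value: float) -> str:
    """Map a cancer index value to the follow-up measure recited for its range."""
    if index_value >= HIGH_FLOOR:
        return "high risk: colposcopy"
    if index_value >= MODERATE_FLOOR:
        return "moderate risk: HPV test"
    if index_value >= LOW_FLOOR:
        return "low risk: repeat assay in about one year"
    # No measure is recited in this passage for values below about -0.530.
    return "below the ranges given in this passage"
```

Checking the ranges from highest to lowest means each value falls into exactly one tier, with the lower bound of each range inclusive as recited ("or more").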
The assay of the invention may be performed as above and additionally wherein the individual is assessed as having CIN3 and/or cancer or as having a high risk of CIN3 and/or cancer development, and wherein the cancer index value is about:
The assay of the invention may be performed as above and additionally wherein the one or more treatments that the individual is subjected to are repeated on a monthly, three monthly, six monthly, yearly or two yearly basis following an initial administration.
The invention also provides a method of monitoring the CIN3 and/or cancer status of an individual according to the individual's cancer index value, the method comprising: (a) assessing the presence, absence or development of CIN3 and/or cancer in an individual by performing the assay according to any one of the assays of the invention at a first time point; (b) assessing the presence, absence or development of CIN3 and/or cancer in the individual by performing the assay according to any one of the assays of the invention at one or more further time points; and (c) monitoring any change in cancer index value and/or the CIN3 and/or cancer status of the individual between time points.
The method of the invention may be performed as above and additionally wherein the further time points are on a monthly, three monthly, six monthly, yearly or two yearly basis following an initial assessment.
The method of the invention may be performed as above and additionally wherein depending on the cancer status of the individual, one or more treatments are administered to the individual according to any one of the methods of the invention, or when the cancer index value of the individual is:
The method of the invention may be performed as above and additionally wherein an increase in the cancer index value indicates a negative response to the one or more treatments.
The method of the invention may be performed as above and additionally wherein changes are made to the one or more treatments if a negative response is identified.
The method of the invention may be performed as above and additionally wherein a decrease in the cancer index value indicates a positive response to the one or more treatments.
The method of the invention may be performed as above and additionally wherein changes are made to the one or more treatments if a positive response is identified.
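By way of non-limiting illustration only, the monitoring of changes in cancer index value between time points, and the interpretation of an increase as a negative response and a decrease as a positive response, may be sketched as follows. The function names are assumptions made for this sketch, and no tolerance for measurement noise is applied, which a practical implementation would likely require.

```python
from typing import List, Sequence


def index_changes(index_values: Sequence[float]) -> List[float]:
    """Return the change in cancer index value between consecutive time points."""
    return [later - earlier for earlier, later in zip(index_values, index_values[1:])]


def treatment_response(before: float, after: float) -> str:
    """Interpret the change in index value across a treatment interval."""
    if after > before:
        return "negative response"   # index increased
    if after < before:
        return "positive response"   # index decreased
    return "no change"
```

For example, index values of −0.10, −0.25 and −0.40 at three successive time points give two decreases, each of which would be read as a positive response under this sketch.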
The assay of the invention may be performed as above and additionally wherein the sample is obtained from a tissue comprising epithelial cells, preferably wherein the sample is not obtained from ovarian or endometrial tissue.
The assay of the invention may be performed as above and additionally wherein the sample is obtained from:
The assay of the invention may be performed as above and additionally wherein the assay is for assessing the presence, absence or development of:
The invention also provides an array capable of discriminating between methylated and non-methylated forms of CpGs; the array comprising oligonucleotide probes specific for a methylated form of each CpG in a CpG panel and oligonucleotide probes specific for a non-methylated form of each CpG in the panel; wherein the panel consists of at least 500 CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5000 and identified at nucleotide positions 61 to 62, and identified in SEQ ID NOs 5001 to 5418 and denoted by CG.
The array of the invention may be performed as above and additionally provided that the array is not an Infinium MethylationEPIC BeadChip array or an Infinium HumanMethylation450, and/or provided that the number of CpG-specific oligonucleotide probes of the array is 482,000 or less, 480,000 or less, 450,000 or less, 440,000 or less, 430,000 or less, 420,000 or less, 410,000 or less, or 400,000 or less.
The array of the invention may be performed as above and additionally wherein the panel comprises any panel of CpGs defined in the assays of any one of the assays of the invention.
The array of the invention may be performed as above and additionally further comprising one or more oligonucleotides comprising any set of CpGs defined in the assays of any one of the assays of the invention, wherein the one or more oligonucleotides are hybridized to corresponding oligonucleotide probes of the array.
The invention also provides a hybridized array, wherein the array is obtainable by hybridizing to an array according to any one of the arrays of the invention a group of oligonucleotides comprising any panel of CpGs defined in the assays of any one of the assays of the invention.
The invention also provides a process for making the hybridized array according to the hybridised array of the invention, comprising contacting an array according to any one of the arrays of the invention with a group of oligonucleotides comprising any panel of CpGs defined in any one of the assays of the invention.
The present inventors sought to identify CpG methylation-based assays capable of assessing the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual. Any of the assays described herein for assessing the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual are capable of being utilised for assessing the presence, absence or development of cervical cancer and/or endometrial cancer, particularly cervical cancer. The present inventors compared CpG methylation levels in non-cancerous epithelial cells, particularly derived from the cervix, the vagina, the buccal area, blood and/or urine, preferably derived from a cervical liquid-based cytology sample, and more preferably a cervical smear sample across groups of women that were either known to be both cervical and endometrial cancer negative, or known to be cervical and/or endometrial cancer positive. This led to the surprising establishment of a “cancer index”, used interchangeably herein with “index”, “index value”, “WID-CIN-Index” or “WID-Index” (WID=women's risk identification).
A CpG as defined herein refers to the CG dinucleotide motif identified in relation to each SEQ ID NO., wherein the CG dinucleotide of interest is denoted by CG and by [[CG]]. Thus by determining the methylation status of any panel of one or more CpGs defined by or identified in a given SEQ ID NO, it is meant that a determination is made as to the methylation status of the cytosine of the CG dinucleotide motif identified in square brackets in the panel of one or more CpGs in each sequence shown below, accepting that variations in the sequence upstream and downstream of any given CpG may exist due to sequencing errors or variation between individuals.
As set out in more detail in the Examples, the methylation status of sub-selections of the 5000 CpGs, as identified in SEQ ID NOs 1 to 5000, may be determined in order to assess an individual for the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, with high sensitivity and specificity. A panel of one or more of the CpGs identified in SEQ ID NOs 1 to 5000 may be utilised to derive a cancer index for an individual in accordance with the invention described herein.
The methylation status of a panel of one or more CpGs of the 5000 CpGs defined according to SEQ ID NOs: 1 to 5000 may be assessed by any suitable technique. As explained in more detail in the Examples below, one particular exemplary technique which the inventors have used is an array-based analysis technique coupled with beta value analysis. SEQ ID NOs 1 to 5000 correspond to the sequences of commercial probes utilised in said array.
The inventors further identified 418 differentially methylated regions (DMRs) with relevance to CIN3 and cancer, particularly cervical or endometrial cancer. The nucleotide sequences of the 418 DMRs are defined respectively by the nucleotide sequences of SEQ ID NO: 5001 to 5418 as set out in Table 1 below, accepting that variation in the nucleotide sequence of any given DMR may exist due to sequencing errors and/or variation between individuals. In each of the sequences corresponding to SEQ ID NO: 5001 to 5418, the cytosine of the CG dinucleotide motif identified in square brackets or double square brackets is a cytosine of a CpG which may be included in a panel of CpGs when performing the assays of the invention.
The inventors further defined 28 regions within a select number of the 418 DMRs with particular relevance to CIN3 and cancer, particularly cervical or endometrial cancer. The nucleotide sequences of the 28 regions are defined respectively by the nucleotide sequences of SEQ ID NOs: 5703 to 5786 as set out in Table 2 below, accepting that variation in the nucleotide sequence of any given DMR may exist due to sequencing errors and/or variation between individuals. When any one or more of the 28 regions are included in a panel of CpGs when performing the assays of the invention, the methylation status of every cytosine within a CG dinucleotide in the region is determined. The amplicon sequences generated by the 28 primer and probe reactions as set out in Table 2 are described and defined by SEQ ID NOs 5787 to 5814 and in Table 12. In any of the assays described herein, the step of determining the methylation status of a panel of one or more CpGs may comprise determining the methylation status of one or more CpGs within any one or more of the amplicons defined by SEQ ID NOs 5787 to 5814 and denoted by CG. More preferably, in any of the assays described herein, the step of determining the methylation status of a panel of one or more CpGs may comprise determining the methylation status of one or more CpGs within any one or more of the amplicons defined by SEQ ID NOs 5787, 5790, 5797, 5807 and 5789, although more preferably 5787, 5790, 5797, and denoted by CG. Yet more preferably, in any of the assays described herein, the step of determining the methylation status of a panel of one or more CpGs may comprise determining the methylation status of all of the CpGs denoted by CG in the amplicons defined by SEQ ID NOs 5787, 5790, 5797, 5807 and 5789, although more preferably 5787, 5790, 5797.
Cancer-related CpGs for analysis
In any of the assays described herein, in a sample which has been taken from an individual, the sample comprises a population of DNA molecules.
The assay of the invention further comprises determining in the population of DNA molecules in the sample the methylation status of a panel of:
A cancer index value is then derived based on the methylation status of the one or more CpGs in the panel, which is used to assess the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in the individual based on the cancer index value.
In any of the assays described herein, in DNA derived from cells in the sample the methylation status of each CpG in a panel of
In any of the assays described herein, in DNA derived from cells in the sample the methylation status of each CpG in a panel of one or more CpGs from a panel of CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, is determined.
In any of the assays described herein, the panel of one or more CpGs may comprise at least 500 CpGs selected from the CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, preferably wherein the assay is characterised as having a receiver operating characteristics (ROC) area under the curve (AUC) of at least 0.80. The panel of one or more CpGs may comprise at least the CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, preferably wherein the assay is characterised as having an ROC AUC of at least 0.92.
In any of the assays described herein, the panel of one or more CpGs may comprise at least 1000 CpGs selected from the CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, preferably wherein the assay is characterised as having a ROC AUC of at least 0.80. The panel of one or more CpGs may comprise at least the CpGs identified in SEQ ID NOs 1 to 1000 and identified at nucleotide positions 61 to 62, preferably wherein the assay is characterised as having a ROC AUC of at least 0.92.
In any of the assays described herein, the panel of one or more CpGs may comprise at least 1500 CpGs selected from the CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, preferably wherein the assay is characterised as having a ROC AUC of at least 0.82. The panel of one or more CpGs may comprise at least the CpGs identified in SEQ ID NOs 1 to 1500 and identified at nucleotide positions 61 to 62, preferably wherein the assay is characterised as having a ROC AUC of at least 0.92.
In any of the assays described herein, the panel of one or more CpGs may comprise at least 2000 CpGs selected from the CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, preferably wherein the assay is characterised as having an AUC of at least 0.81. The panel of one or more CpGs may comprise at least the CpGs identified in SEQ ID NOs 1 to 200 and identified at nucleotide positions 61 to 62, preferably wherein the assay is characterised as having a ROC AUC of at least 0.92.
In any of the assays described herein, the panel of one or more CpGs may comprise at least the 5000 CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, and further wherein the assay is characterised as having a ROC AUC of at least 0.92.
In any of the above-described assays, the assay may be characterised as having a ROC AUC of 0.60 or more, 0.61 or more, 0.62 or more, 0.63 or more, 0.64 or more, 0.65 or more, 0.66 or more, 0.67 or more, 0.68 or more, 0.69 or more, 0.70 or more, 0.71 or more, 0.72 or more, 0.73 or more, 0.74 or more, 0.75 or more, 0.76 or more, 0.77 or more, 0.78 or more, 0.79 or more, 0.80 or more, 0.81 or more, 0.82 or more, 0.83 or more, 0.84 or more, 0.85 or more, 0.86 or more, 0.87 or more, 0.88 or more, 0.89 or more or 0.90 or more.
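By way of illustration only, the ROC AUC figures recited above can be understood as the probability that a randomly chosen positive sample has a higher index value than a randomly chosen negative sample. The following minimal Python sketch computes that quantity by pairwise comparison. The function name and the index values are hypothetical and are not data from this disclosure.

```python
def roc_auc(positive_scores, negative_scores):
    """Return the ROC AUC as the fraction of (positive, negative) pairs in
    which the positive sample scores higher; ties count as one half."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical index values, for illustration only.
cases = [0.9, 0.4, 0.7, 0.2]
controls = [-0.5, 0.1, 0.3, -0.2]

auc = roc_auc(cases, controls)
meets_floor = auc >= 0.92  # compare against a recited minimum such as 0.92
print(f"ROC AUC = {auc:.4f}, at least 0.92: {meets_floor}")
```

The pairwise form is quadratic in the number of samples and is chosen here for readability rather than efficiency.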
In any of the described assays, the methylation status of the one or more CpGs in the panel is preferably determined by a β-value analysis, and the assay is for assessing the presence, absence or development of CIN3 and/or cancer, preferably the cancer is cervical cancer or endometrial cancer. Most preferably, the cancer is cervical cancer.
In any of the assays described herein, the panel of one or more CpGs may comprise at least 500 CpGs selected from the CpGs identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000, optionally wherein:
In any of the described assays, the methylation status of the one or more CpGs in the panel is preferably determined by a β-value analysis, and the assay is for assessing the presence, absence or development of CIN3 and/or cancer, preferably the cancer is cervical cancer or endometrial cancer. Most preferably, the cancer is cervical cancer.
In any of the above-described assays, the assay may be characterised as having a ROC AUC of 0.60 or more, 0.61 or more, 0.62 or more, 0.63 or more, 0.64 or more, 0.65 or more, 0.66 or more, 0.67 or more, 0.68 or more, 0.69 or more, 0.70 or more, 0.71 or more, 0.72 or more, 0.73 or more, 0.74 or more, 0.75 or more, 0.76 or more, 0.77 or more, 0.78 or more, 0.79 or more, 0.80 or more, 0.81 or more, 0.82 or more, 0.83 or more, 0.84 or more, 0.85 or more, 0.86 or more, 0.87 or more, 0.88 or more, 0.89 or more or 0.90 or more.
In any of the assays described herein, the panel of one or more CpGs may comprise:
In any of the described assays, the methylation status of the one or more CpGs in the panel is preferably determined by a β-value analysis, and the assay is for assessing the presence, absence or development of CIN3 and/or cancer, preferably the cancer is cervical cancer or endometrial cancer. Most preferably, the cancer is cervical cancer.
In any of the above-described assays, the assay may be characterised as having a ROC AUC of 0.60 or more, 0.61 or more, 0.62 or more, 0.63 or more, 0.64 or more, 0.65 or more, 0.66 or more, 0.67 or more, 0.68 or more, 0.69 or more, 0.70 or more, 0.71 or more, 0.72 or more, 0.73 or more, 0.74 or more, 0.75 or more, 0.76 or more, 0.77 or more, 0.78 or more, 0.79 or more, 0.80 or more, 0.81 or more, 0.82 or more, 0.83 or more, 0.84 or more, 0.85 or more, 0.86 or more, 0.87 or more, 0.88 or more, 0.89 or more or 0.90 or more.
In any of the assays described herein, the panel of one or more CpGs may comprise:
In any of the described assays, the methylation status of the one or more CpGs in the panel is preferably determined by a β-value analysis, and the assay is for assessing the presence, absence or development of CIN3 and/or cancer, preferably the cancer is cervical cancer or endometrial cancer. Most preferably, the cancer is cervical cancer.
In any of the above-described assays, the assay may be characterised as having a ROC AUC of 0.60 or more, 0.61 or more, 0.62 or more, 0.63 or more, 0.64 or more, 0.65 or more, 0.66 or more, 0.67 or more, 0.68 or more, 0.69 or more, 0.70 or more, 0.71 or more, 0.72 or more, 0.73 or more, 0.74 or more, 0.75 or more, 0.76 or more, 0.77 or more, 0.78 or more, 0.79 or more, 0.80 or more, 0.81 or more, 0.82 or more, 0.83 or more, 0.84 or more, 0.85 or more, 0.86 or more, 0.87 or more, 0.88 or more, 0.89 or more or 0.90 or more.
In any of the assays described herein, the step of determining the methylation status of the one or more CpGs in the panel may comprise determining the methylation status of one or more CpGs selected from within a panel of one or more Differentially Methylated Regions (DMRs) defined by SEQ ID NOs 5001 to 5418, wherein selected CpGs in each DMR are denoted by CG. The nucleotide sequences of the 418 DMRs are defined respectively by the nucleotide sequences of SEQ ID NO: 5001 to 5418 as set out in Table 1, accepting that variation in the nucleotide sequence of any given DMR may exist due to sequencing errors and/or variation between individuals. In each sequence shown below the cytosine of the CG dinucleotide motifs identified in square brackets and double square brackets is a cytosine of a CpG which may be included in a panel of CpGs when performing the assays of the invention.
In any of the assays described herein, the step of determining the methylation status of the one or more CpGs in the panel may comprise determining the methylation status of one or more CpGs denoted by CG within any one or more DMRs or within any combination of two or more DMRs defined by SEQ ID NOs 5001 to 5418, wherein selected CpGs in each DMR are denoted by CG. The DMRs are selected from the group consisting of DMRs 1 to 418 (SEQ ID NOs 5001 to 5418; as set out in Table 1).
The step of determining the methylation status of a panel of one or more CpGs may comprise determining a cancer index value of one or more of the CpGs denoted by CG within any one of the DMRs 1 to 418, or within any combination of two or more DMRs of 1 to 418.
The step of determining the methylation status of a panel of one or more CpGs may comprise determining a cancer index value of two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, or nine or more of the CpGs denoted by CG within any one of the DMRs 1 to 418 (defined by SEQ ID NOs 5001 to 5418), optionally within any combination of two or more DMRs of 1 to 418.
The panel of one or more CpGs may comprise two or more CpGs of the DMR(s), three or more CpGs of the DMR(s), four or more CpGs of the DMR(s) or all CpGs of the DMR(s).
The step of determining the methylation status of a panel of one or more CpGs may comprise determining a cancer index value of at least two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, or nine or more of the CpGs denoted by CG within any one of the DMRs 1 to 418, or within:
In any of the assays described herein, the step of determining the methylation status of a panel of one or more CpGs may comprise determining the methylation status of one or more CpGs within any one or more DMRs selected from the group of DMRs consisting of DMRs 1 to 418 as defined by SEQ ID NOs 5001 to 5418, including:
In any one of the assays described herein, the step of determining the methylation status of the one or more CpGs in the panel may comprise or may additionally comprise determining the methylation status of each CpG within one or more of the sequences identified by SEQ ID NOs 5703 to 5786.
In any one of the assays described herein, the step of determining the methylation status of the one or more CpGs in the panel may preferably comprise or may preferably additionally comprise determining the methylation status of each CpG within one or more of the sequences identified by SEQ ID NOs 5703, 5731, 5759, 5723, 5751, 5779, 5713, 5741, 5769, 5706, 5734, 5762, 5705, 5733 and 5761. More preferably, in any one of the assays described herein, the step of determining the methylation status of the one or more CpGs in the panel may comprise or may additionally comprise determining the methylation status of each CpG within all of the sequences identified by SEQ ID NOs 5703, 5731, 5759, 5723, 5751, 5779, 5713, 5741, 5769, 5706, 5734, 5762, 5705, 5733 and 5761.
In any one of the assays described herein, the step of determining the methylation status of the one or more CpGs in the panel may preferably comprise or may preferably additionally comprise determining the methylation status of each CpG within one or more of the sequences identified by SEQ ID NOs 5703, 5731, 5759, 5713, 5741, 5769, 5706, 5734, and 5762. More preferably, in any one of the assays described herein, the step of determining the methylation status of the one or more CpGs in the panel may comprise or may additionally comprise determining the methylation status of each CpG within all of the sequences identified by SEQ ID NOs 5703, 5731, 5759, 5713, 5741, 5769, 5706, 5734, and 5762.
The invention also provides a variety of assays, each comprising any 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more (or any range derivable therein) of a variety of steps and in no particular order, including methods of the following: measuring in a sample; analyzing a sample; assessing a sample; evaluating a sample; measuring nucleic acids in a sample; assessing nucleic acids in a sample; detecting nucleic acids in a sample; measuring methylation in nucleic acids in a sample; analyzing nucleic acids in a sample; assessing nucleic acids in a sample; measuring methylation at one or more CpG dinucleotides in a sample; detecting methylation at one or more CpG dinucleotides in a sample; assaying methylation at one or more CpG dinucleotides in a sample; assessing methylation at one or more CpG dinucleotides in a sample; measuring a methylation status in a sample; assaying a methylation status in a sample; detecting methylation status in a sample; determining methylation status in a sample; identifying methylation status in a sample; measuring one or more DNA methylation markers in a sample; assessing one or more DNA methylation markers in a sample; detecting one or more DNA methylation markers in a sample; measuring the presence of methylation at one or more markers in a sample; detecting the presence of methylation at one or more markers in a sample; assessing the presence of methylation at one or more markers in a sample; assaying the presence of one or more markers in a sample; measuring one or more DNA methylation markers in a sample but excluding the measuring of one or more other DNA methylation markers in the sample; assessing one or more DNA methylation markers in a sample but excluding the assessing of one or more other DNA methylation markers in the sample; analyzing one or more DNA methylation markers in a sample but excluding the analyzing of one or more other DNA methylation markers in the sample; detecting one or more DNA methylation markers in a sample but 
excluding the detecting of one or more other DNA methylation markers in the sample; measuring methylation status in nucleic acids from a sample from tissue from an individual other than tissue from the individual suspected of, or at risk for, being cancerous; detecting methylation status in nucleic acids from a sample from tissue from an individual other than tissue from the individual suspected of, or at risk for, being cancerous; analyzing methylation status in nucleic acids from a sample from tissue from an individual other than tissue from the individual suspected of, or at risk for, being cancerous; assessing methylation status in nucleic acids from a sample from tissue from an individual other than tissue from the individual suspected of, or at risk for, being cancerous; measuring methylation at one or more CpG dinucleotides in a sample but excluding the measuring of methylation at one or more CpG dinucleotides in the sample; assessing methylation at one or more CpG dinucleotides in a sample but excluding the assessing of methylation at one or more CpG dinucleotides in the sample; analyzing methylation at one or more CpG dinucleotides in a sample but excluding the analyzing of methylation at one or more CpG dinucleotides in the sample; detecting methylation at one or more CpG dinucleotides in a sample but excluding the detecting of methylation at one or more CpG dinucleotides in the sample; measuring methylation at one or more CpG dinucleotides in nucleic acids from a sample from tissue from an individual other than tissue from the individual suspected of, or at risk for, being cancerous; detecting methylation at one or more CpG dinucleotides in nucleic acids from a sample from tissue from an individual other than tissue from the individual suspected of, or at risk for, being cancerous; analyzing methylation at one or more CpG dinucleotides in nucleic acids from a sample from tissue from an individual other than tissue from the individual suspected of, or at 
risk for, being cancerous; assessing methylation at one or more CpG dinucleotides in nucleic acids from a sample from tissue from an individual other than tissue from the individual suspected of, or at risk for, being cancerous; treating an individual for cancer when the individual has been determined to have a methylation status at one or more methylation markers; treating an individual for cancer when the individual has been determined to have methylation at one or more CpG dinucleotides;
Moreover, in some aspects of the invention, an individual who is administered a therapy or treatment has been subjected to any of the methods and steps described herein.
Described herein are assays that utilise a statistically robust panel of one or more CpGs whose methylation status can be determined to provide a reliable prediction of the presence or development of CIN3 and/or cancer in an individual. By determining the methylation status of each CpG within the panel of one or more CpGs, a cancer index value may be derived, thus enabling stratification of individuals according to their risk of developing CIN3 and/or cancer or of having cancer, particularly cervical and/or endometrial cancer, with statistically robust sensitivity and specificity. The skilled person would understand that the methylation status of each CpG within a panel of one or more CpGs can be determined by any suitable means in order to thereby derive the cancer index value. Any one method, or a combination of methods, may be used to determine the methylation status of each CpG within a panel of one or more CpGs.
Various exemplary methods for determining the methylation status of each CpG within a panel of one or more CpGs are described herein. For example, in one method a percent methylated reference (PMR) value of a CpG may be determined. In another method the methylation β-values of a CpG may be determined. Different mechanisms may be employed to determine specific values depending on the circumstances, such as PCR-based mechanisms or array-based mechanisms.
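By way of illustration only, the two quantities named above may be sketched in Python as follows. The sketch assumes the β-value definition commonly used with array data (methylated signal divided by total signal plus a small stabilising offset) and the PMR definition commonly used with PCR-based data (the target-to-reference ratio in the sample, expressed as a percentage of the same ratio in a fully methylated control). The function names, the default offset of 100 and all input values are assumptions made for this example and are not taken from this disclosure.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Assumed array-style definition: beta = M / (M + U + offset).
    The offset (100 here, an assumed default) stabilises low-signal probes."""
    if methylated < 0 or unmethylated < 0:
        raise ValueError("signal intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)


def percent_methylated_reference(target_sample, reference_sample,
                                 target_control, reference_control):
    """Assumed PCR-style definition:
    PMR = 100 * (target/reference in sample) / (target/reference in control),
    where the control is fully methylated DNA."""
    sample_ratio = target_sample / reference_sample
    control_ratio = target_control / reference_control
    return 100.0 * sample_ratio / control_ratio


# Hypothetical intensities and quantities, for illustration only.
print(beta_value(300.0, 600.0))                            # 300 / 1000
print(percent_methylated_reference(2.0, 10.0, 8.0, 10.0))  # roughly 25
```

A β-value lies between 0 (unmethylated) and just under 1 (fully methylated), whereas a PMR is a percentage relative to the fully methylated control and can exceed 100.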
In any of the assays described herein, the assessment of the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual is based on the cancer index value of the individual at the time of testing.
As explained herein, using panels of the specific CpGs disclosed herein, cancer index values can be established which correspond with CIN3 and/or cancer negative samples, because they are based on values derived from individuals known to be CIN3 and/or cancer negative. Similarly, using panels of the specific CpGs disclosed herein, cancer index values can be established which correspond with CIN3 and/or cancer positive samples from individuals known to be CIN3 and/or cancer positive. A user can then apply these cancer index values to assess the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in any test individual whose CIN3 and/or cancer status is required to be tested. As also explained herein, the assays of the invention are capable of being performed with a high degree of statistical accuracy.
As explained herein, the described assays particularly relate to the assessment of the presence, absence or development of cervical cancer and/or endometrial cancer, particularly cervical cancer.
A skilled person will readily appreciate that a cancer index value provides a value that indicates a “likelihood” or “risk” or “prediction” of any of the assays of the invention correctly assessing the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual. This is because the assessment is based upon a correlation between DNA methylation profiles of tissue samples and individual disease status. Nevertheless, as demonstrated by data set out in the Examples and elsewhere herein, the assays of the invention provide such correlations with high statistical accuracy, thus providing the skilled person with a high degree of confidence that the cancer index value which is determined for any test individual whose cancer status is required to be tested will provide an accurate correlation with actual disease status for the individual.
In the context of the present invention, “likelihood”, “risk” and “prediction” may be used synonymously with each other.
Any references herein to sequences, genomic sequences and/or genomic coordinates are derived based upon Homo sapiens (human) genome assembly GRCh37 (hg19). The skilled person would understand variations in the nucleotide sequences of any given sequence, particularly DMRs 1 to 418, may exist due to sequencing errors and/or variation between individuals.
The assay of the invention represents a ‘prediction’ because any cancer index value (WID-CIN-Index) derived in accordance with the invention is unlikely to be capable of diagnosing every individual as having or not having cancer with 100% specificity and 100% sensitivity. Rather, depending on the cancer index cutpoint threshold applied by the user for positively predicting the presence of cancer in an individual, the false positive and false negative rate will vary. In other words, the inventors have discovered that the assays of the invention can achieve variable levels of sensitivity and specificity for predicting the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, as defined by receiver operating characteristics, depending on the cancer index cutpoint threshold chosen and applied by the user. Such sensitivity and specificity can be seen from the data disclosed herein to be achievable at high proportions, demonstrating accurate and statistically-significant discriminatory capability.
Similarly, cancer index values which have been pre-determined to correlate with specific cancer phenotypes, such as the presence or absence of cancer, have been defined with a high level of statistical accuracy as explained further herein.
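By way of illustration only, the dependence of sensitivity and specificity on the chosen cutpoint threshold may be sketched in Python as follows. The sketch treats an index value at or above the cutpoint as a positive prediction. The function name, the index values and the cutpoints are hypothetical and are not data from this disclosure.

```python
def sensitivity_specificity(case_values, control_values, cutpoint):
    """Return (sensitivity, specificity) when an index value at or above the
    cutpoint is called positive and a value below it is called negative."""
    if not case_values or not control_values:
        raise ValueError("both groups must contain at least one value")
    true_positives = sum(1 for v in case_values if v >= cutpoint)
    true_negatives = sum(1 for v in control_values if v < cutpoint)
    return (true_positives / len(case_values),
            true_negatives / len(control_values))


# Hypothetical index values, for illustration only.
cases = [0.9, 0.4, 0.7, 0.2]
controls = [-0.5, 0.1, 0.3, -0.2]

for cutpoint in (0.0, 0.25, 0.5):
    sens, spec = sensitivity_specificity(cases, controls, cutpoint)
    print(f"cutpoint {cutpoint:+.2f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this toy data set, raising the cutpoint lowers the false positive rate at the cost of a higher false negative rate, which is the trade-off described above.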
Assessing the ‘development’ of cancer in the context of the invention may refer to assessing whether an individual is likely or unlikely to develop cancer. Cells sampled from these tissues/anatomical sites can act as a surrogate for cervical and/or endometrial cells that may transform to cancer. Assessing the development of cancer in accordance with the assays of the invention may refer to assessing an increased or decreased likelihood of CIN3 and/or cancer development, particularly cervical cancer and endometrial cancer, preferably cervical cancer. Assessing the development of cancer in accordance with the assays of the invention may refer to assessing progression or worsening of a pre-existing cancer lesion in an individual. Assessment of the development of cancer in accordance with the assays of the invention may refer to predicting the likelihood of recurrence of cancer.
In any of the assays described herein, the step of assessing the presence or development of CIN3 and/or cancer in an individual based on a cancer index value may involve the application of a threshold value. Threshold values can provide a risk-based indication of an individual's CIN3 and/or cancer status, whether that is CIN3 and/or cancer positive, or CIN3 and/or cancer negative. Threshold values can also provide a means for identifying whether the cancer index value is intermediate between a CIN3 and/or cancer positive value and a CIN3 and/or cancer negative value. As explained herein, the cancer index value may be dynamic and subject to change depending upon genetic and/or environmental factors. Accordingly, the cancer index value may provide a means for assessing and monitoring cancer development. Cancer index values may therefore indicate at least a low risk or a high risk that the individual has a CIN3 and/or cancer positive status or has a status that is indicative of the development of CIN3 and/or cancer. If the cancer index value of an individual is determined by the assays of the invention at two or more time points, an increase or decrease in the individual's cancer index value may indicate an increased or decreased risk of the individual having or developing CIN3 and/or cancer, particularly cervical and/or endometrial cancer, most preferably cervical cancer.
Throughout the disclosure herein the terms “threshold value”, “cutpoint”, and “cutpoint threshold” are to be considered synonymous and interchangeable.
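By way of illustration only, the use of threshold values to assign a low, intermediate or high risk status, and the comparison of index values determined at two or more time points, may be sketched in Python as follows. The two-cutpoint scheme, the function names and all numerical values are hypothetical and are chosen only to make the logic concrete.

```python
def classify(index_value, low_cutpoint, high_cutpoint):
    """Assign a risk status from two hypothetical cutpoints: values at or
    above the high cutpoint are 'high risk', values below the low cutpoint
    are 'low risk', and anything in between is 'intermediate'."""
    if low_cutpoint > high_cutpoint:
        raise ValueError("low_cutpoint must not exceed high_cutpoint")
    if index_value >= high_cutpoint:
        return "high risk"
    if index_value < low_cutpoint:
        return "low risk"
    return "intermediate"


def trend(index_values):
    """Compare the first and last of two or more time-ordered index values."""
    if len(index_values) < 2:
        raise ValueError("at least two time points are required")
    change = index_values[-1] - index_values[0]
    if change > 0:
        return "increased"
    if change < 0:
        return "decreased"
    return "unchanged"


# Hypothetical values, for illustration only.
print(classify(0.10, low_cutpoint=-0.20, high_cutpoint=0.30))  # intermediate
print(trend([0.10, 0.25, 0.40]))                               # increased
```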
As explained further herein any assay of the invention is an assay for assessing the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual. The types of cancer are set out further herein. As explained further herein, the assays of the invention provide means for assessing whether an individual is at risk of having or developing CIN3 and/or cancer based on specific cutpoint thresholds. Such risk assessments can be provided with a high degree of confidence based on the statistical parameters which characterise the assay. Thus in any of the assays described herein involving cancer index cutpoint thresholds, the cutpoint threshold may be used for risk assessment purposes. Equally, in any of the assays described herein involving cancer index cutpoint thresholds, the cutpoint threshold value may be used to specify whether or not an individual has CIN3 and/or cancer as a pure diagnostic test. Again, such diagnostic tests can be provided with a high degree of confidence based on the statistical parameters which characterise the assay. Accordingly, in any assay described herein which specifies that a cancer index value for the individual is a specific value or more, or is “about” a specific value or more, the individual may be assessed as having cancer. In any assay described herein which specifies that a cancer index value for the individual is less than a specific value, or is less than “about” a specific value, the individual may be assessed as not having cancer. The term “about” is to be understood as providing a range of +/−5% of the value.
Accordingly, any assay of the invention is an assay for assessing the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual, the assay comprising:
Any of the assays of the invention are particularly for assessing the presence or absence of CIN3 and/or cancer in an individual.
Such an assay may be performed in accordance with any of the methods disclosed and defined herein.
As explained further herein, any assay of the invention for assessing the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual may alternatively be referred to as an assay for stratifying an individual in accordance with their CIN3 and/or cancer status.
Accordingly, any assay of the invention is an assay for stratifying an individual for the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual, the assay comprising:
Such an assay may be performed in accordance with any of the methods disclosed and defined herein.
Accordingly, any assay of the invention is an assay for stratifying an individual for CIN3 and/or cancer, the assay comprising:
Such an assay may be performed in accordance with any of the methods disclosed and defined herein.
The cancer index value may be derived by any suitable means. Preferably, the cancer index value may be derived by assessing the methylation status of the panel of:
in a sample provided from an individual. The methylation status of the CpGs may be determined by any suitable means. For example, in any of the assays described herein the step of determining the methylation status of each CpG in the panel of one or more CpGs may comprise:
In any of the assays described herein the step of determining the methylation status of each CpG in the panel of one or more CpGs may comprise bisulphite converting the DNA.
The step of determining in the population of DNA molecules in the sample the methylation status of a panel of one or more CpGs may comprise determining a β value of each CpG. Deriving the cancer index value may involve providing a methylation β-value data set comprising the methylation β-values for each CpG in the panel of one or more CpGs. Additionally, or alternatively, the step of determining in the population of DNA molecules in the sample the methylation status of a panel of one or more CpGs may comprise determining a percent methylated reference value for each of the panel of one or more CpGs. Optionally deriving the cancer index value may also involve estimating the fraction of contaminating DNA within the DNA provided from a sample.
Contaminating DNA may be DNA originating from a particular source organism, tissue or cell type. Preferably the contaminating DNA originates from one or more different cell types to one or more cell types of interest. A cell type of interest may particularly be an epithelial cell. In some aspects of the invention, it may be preferable to estimate the fraction of contaminating DNA after the step of providing a sample which has been taken from an individual. The assays described herein may optionally involve estimating a contaminating DNA fraction within DNA in the sample by any suitable means. Preferably, the contaminating DNA fraction for the sample is estimated via any suitable bioinformatics analysis tool. A bioinformatics analysis tool that may be used to estimate a contaminating DNA fraction may be EpiDISH. As described herein, it may be desirable to estimate the fraction of contaminating DNA from the one or more cell types that are different to the one or more cell types of interest because the cancer index value used for predicting the presence or development of cancer in an individual may, in some instances, only be reliably derived from determining the methylation status of a set of CpGs from DNA of a particular cell type of interest. Particularly, methylation status beta-values may differ in the one or more cell types of interest within a sample relative to methylation status beta-values in contaminating DNA from different cell types within the same sample. Thus, the derived cancer index value may in some instances have a decreased predictive power without estimating and controlling for the contaminating DNA fraction within the DNA provided from the sample. In assays of the invention that involve estimating the fraction of contaminating DNA and accordingly controlling for said contaminating DNA, it is preferable to estimate an immune cell DNA fraction within the DNA provided from the sample.
In particular assays of the invention, wherein the individual has an immune cell contamination of over 50% (i.e. wherein more than 50% of the DNA in the sample is deemed to be derived from immune cells), the assay may preferably involve controlling for the immune cell contamination by deriving the cancer index, in accordance with the invention, solely from the DNA molecules derived from epithelial cells.
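The reasoning above can be illustrated with a deliberately simplified two-component mixture model. This sketch is not the EpiDISH algorithm; it assumes that each observed β-value is a weighted average of a hypothetical immune reference profile and a hypothetical epithelial reference profile, and it estimates the immune fraction by ordinary least squares. All names and numbers are illustrative assumptions.

```python
def estimate_immune_fraction(observed, epithelial_ref, immune_ref):
    """Least-squares estimate of f in: observed = f*immune + (1 - f)*epithelial."""
    numerator = 0.0
    denominator = 0.0
    for obs, epi, imm in zip(observed, epithelial_ref, immune_ref):
        diff = imm - epi
        numerator += (obs - epi) * diff
        denominator += diff * diff
    if denominator == 0.0:
        raise ValueError("reference profiles are identical; fraction not identifiable")
    # Clamp to the physically meaningful range [0, 1].
    return min(1.0, max(0.0, numerator / denominator))


def exceeds_immune_limit(immune_fraction, limit=0.5):
    """True when more than 50% of the DNA is deemed to be immune-cell derived."""
    return immune_fraction > limit


# Hypothetical reference beta-values for three CpGs and one mixed sample.
epithelial = [0.1, 0.8, 0.5]
immune = [0.9, 0.2, 0.5]
sample = [0.58, 0.44, 0.5]  # constructed as 60% immune, 40% epithelial

fraction = estimate_immune_fraction(sample, epithelial, immune)
print(round(fraction, 3), exceeds_immune_limit(fraction))
```

CpGs whose reference values coincide in both cell types (the third CpG here) carry no information about the fraction, which is why reference-based tools rely on cell-type-specific loci.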
Any of the assays described herein comprising a step of deriving a cancer index value based on the methylation status of the one or more CpGs in the panel may further comprise applying an algorithm to the methylation beta-value dataset to obtain the cancer index value. Preferably, in any of the assays described herein, the step of deriving the cancer index value based on the methylation status of the panel of CpGs comprises providing a methylation beta-value data set comprising the methylation beta-values for each CpG in the panel and applying an algorithm to the methylation beta-value data set to obtain the cancer index value.
In any of the assays described herein, the step of deriving the cancer index value based on the methylation status of the one or more CpGs in the panel may comprise:
In any of the assays described herein, the cancer index value may be calculated by any suitable mathematical model such as an algorithm or formula. Preferably, the cancer index value is termed Women's risk Identification for Cervical Cancer Index (WID-CIN-index) and wherein the mathematical model which is applied to the methylation β-value data set to generate the cancer index is calculated by an algorithm according to the following formula:
In any of the assays described herein, the WID-CIN-index algorithm applies real valued coefficients inferred by initially training on a dataset (this dataset in the exemplary embodiments of the invention described in the Examples consisted of 165 CIN3+ cases and 202 human papillomavirus positive (HPV+) controls) to fit a ridge classifier using the R package glmnet with a mixing parameter value of alpha=0 (ridge penalty) and binomial response type. Ten-fold cross-validation was used internally by the cv.glmnet function in order to determine the optimal value of the regularisation parameter lambda. The beta values from n CpGs for individual ν, β1ν, . . . , βnν, are used as inputs to the ridge classifier. The coefficients w1, . . . , wn are obtained from the fitted model. The following quantity was computed for each individual ν in the training set:
Any suitable real valued coefficients may be applied to the WID-CIN-Index in any of the assays described herein.
The value of the parameters μ and σ are given by the mean and standard deviation of xν in the training dataset respectively.
Thus, any suitable μ and σ real valued parameters may be applied to the WID-CIN-index in any of the assays described herein. Any suitable training data set may be applied to the assays described herein in order to infer real valued parameters and coefficients that can subsequently be applied to the WID-CIN-index formula according to the present invention. Exemplary ways of utilising a training dataset in accordance with the present invention are further described in the ‘Statistical analyses for classifier development’ section of the Materials and Methods section of the Examples.
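The relationship between the coefficients w1, . . . , wn and the parameters μ and σ can be sketched as follows. The formulae themselves are not reproduced in this text, so the code assumes a form consistent with the surrounding description: a weighted sum xν of the β-values, standardised by the mean and standard deviation of xν in the training data. Whether the sample or population standard deviation was used, and whether an intercept is included, are not stated; the sample standard deviation and no intercept are assumed here. The coefficients and β-values are invented.

```python
from statistics import mean, stdev


def linear_score(betas, weights):
    """Weighted sum of beta-values for one individual (assumed form of x)."""
    if len(betas) != len(weights):
        raise ValueError("one coefficient is required per CpG")
    return sum(w * b for w, b in zip(weights, betas))


def fit_scaling(training_betas, weights):
    """Return (mu, sigma): mean and sample SD of the scores in the training set."""
    scores = [linear_score(betas, weights) for betas in training_betas]
    return mean(scores), stdev(scores)


def index_value(betas, weights, mu, sigma):
    """Standardised index (assumed form): (x - mu) / sigma."""
    return (linear_score(betas, weights) - mu) / sigma


# Hypothetical two-CpG panel with invented coefficients and training data.
weights = [1.0, -2.0]
training = [[0.2, 0.1], [0.4, 0.1], [0.6, 0.1]]
mu, sigma = fit_scaling(training, weights)
print(round(index_value([0.8, 0.1], weights, mu, sigma), 3))
```

Standardising against the training set places the index on a scale where 0 corresponds to the average training score, which makes cutpoints comparable between panels of different sizes.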
Exemplary μ and σ real valued parameters are provided in Table 2 for CpG subsets identified in SEQ ID NOs 1 to 5000. These real valued parameters may be applied to any of the assays described herein wherein the real parameters correspond to any one of the sets of CpGs identified in SEQ ID NOs 1 to 5000 and set out in the left hand column of Table 3.
Exemplary w1, . . . , wn real value coefficients are provided below for the CpGs identified at positions 61 to 62 in SEQ ID NOs 1 to 5000. These real value coefficients may be applied to any of the assays described herein wherein the real parameters correspond to any one of the sets of CpGs identified in SEQ ID NOs 1 to 5000 wherein the 5000 real value coefficients below in turn correspond to the CpGs in turn identified at nucleotide positions 61 to 62 in SEQ ID NOs 1 to 5000. Accordingly, the listed coefficients are presented below in numerical order corresponding respectively to the CpGs identified in SEQ ID NOs 1 to 5000. Thus the first number below corresponds to SEQ ID NO 1, the second number corresponds to SEQ ID NO 2 etc. The exemplary real value coefficients are as follows:
Predicting the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual may particularly involve applying a threshold cancer index value in order to assess or stratify an individual as having or not having cancer, or as having a high or low risk of CIN3 and/or cancer development.
The assays of the invention may involve a threshold index being applied in order to assess the presence or absence of CIN3 and/or cancer in an individual. The assessment may be characterised by receiver operating characteristics, particularly an area under the curve (AUC), sensitivity, and specificity, indicative of the reliability of the threshold being applied in order to assess the presence or absence of CIN3 and/or cancer in an individual.
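The receiver operating characteristics mentioned above can be computed from index values of known cases and controls, for example as in the following sketch. The scores are invented, the function names are hypothetical, and the AUC is calculated as the probability that a randomly chosen case scores higher than a randomly chosen control (ties counted as one half).

```python
def sensitivity_specificity(case_scores, control_scores, cutpoint):
    """Sensitivity and specificity when scores >= cutpoint are called positive."""
    true_pos = sum(1 for s in case_scores if s >= cutpoint)
    true_neg = sum(1 for s in control_scores if s < cutpoint)
    return true_pos / len(case_scores), true_neg / len(control_scores)


def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Invented index values for three cases and four controls.
cases = [0.9, 0.4, -0.1]
controls = [-0.8, -0.5, 0.0, -0.4]
print(sensitivity_specificity(cases, controls, -0.331))
print(round(auc(cases, controls), 3))
```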
In any of the assays described herein, wherein when the cancer index value for the individual is about −0.331 or more, the individual is assessed as having cancer and/or CIN3, or as having a risk of cancer and/or CIN3 development, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual is based on the WID-CIN-Index. The panel of one or more CpGs used to derive the cancer index value may comprise:
In any of the assays described herein, wherein when the cancer index value for the individual is about −0.331 or more, the individual is assessed as having cancer and/or CIN3 or as having a high risk of cancer and/or CIN3 development, or wherein when the cancer index value for the individual is less than about −0.331 the individual is assessed as not having cancer and/or CIN3 or as having a low risk of cancer development and/or CIN3, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual is based on the WID-CIN-Index. The panel of one or more CpGs used to derive the cancer index value may comprise:
In any of the assays described herein, wherein when the cancer index value for the individual is about −0.311 or more, the individual is assessed as having cancer and/or CIN3 or as having a high risk of cancer and/or CIN3 development, or wherein when the cancer index value for the individual is less than about −0.311, the individual is assessed as not having cancer and/or CIN3 or as having a low risk of cancer development and/or CIN3, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual is based on the WID-CIN-Index. The panel of one or more CpGs used to derive the cancer index value may comprise:
In any of the described assays, the methylation status of the one or more CpGs in the panel is preferably determined by a β-value analysis, and the assay is for assessing the presence, absence or development of CIN3 and/or cancer, preferably the cancer is cervical cancer or endometrial cancer. Most preferably, the cancer is cervical cancer.
In any of the assays described herein, wherein when the cancer index value for the individual is about −0.167 or more, the individual is assessed as having cancer and/or CIN3, or as having a risk of cancer and/or CIN3 development, or wherein when the cancer index value for the individual is less than about −0.167, the individual is assessed as not having cancer and/or CIN3 or as having a low risk of cancer development and/or CIN3, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual is based on the WID-CIN-Index. The panel of one or more CpGs used to derive the cancer index value may comprise:
In any of the assays described herein, wherein when the cancer index value for the individual is about −0.167 or more, the individual is assessed as having cancer and/or CIN3 or as having a high risk of cancer and/or CIN3 development, or wherein when the cancer index value for the individual is less than about −0.167, the individual is assessed as not having cancer and/or CIN3 or as having a low risk of cancer development and/or CIN3, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual is based on the WID-CIN-Index. The panel of one or more CpGs used to derive the cancer index value may comprise:
In any of the assays described herein, wherein when the cancer index value for the individual is about −0.167 or more, the individual is assessed as having cancer and/or CIN3, or as having a risk of cancer and/or CIN3 development, or wherein when the cancer index value for the individual is less than about −0.167, the individual is assessed as not having cancer and/or CIN3 or as having a low risk of cancer development and/or CIN3, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual is based on the WID-CIN-Index. The panel of one or more CpGs used to derive the cancer index value may comprise:
In any of the described assays, the methylation status of the one or more CpGs in the panel is preferably determined by a β-value analysis, and the assay is for assessing the presence, absence or development of CIN3 and/or cancer, preferably the cancer is cervical cancer or endometrial cancer. Most preferably, the cancer is cervical cancer.
The ROC data set out in Tables 4, 5 and 6 corresponding to each specified panel of SEQ ID NOs: 1 to 5000 are derived by determining a cancer index value from said panel.
The predicting of the presence, absence, or development of cancer in an individual may particularly involve determining the mean β-value across any panel of one or more CpGs defined herein. A threshold mean β-value may be applied in order to stratify an individual as having or not having cancer, or of having a high or low risk of CIN3 and/or cancer development, preferably wherein the cancer is cervical or endometrial cancer, more preferably wherein the cancer is cervical cancer.
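A mean β-value threshold of the kind described above might be applied as in the following sketch. The threshold value, the panel values and the direction of the comparison (whether a higher or a lower mean β-value indicates disease) are hypothetical assumptions, since they depend on the panel used.

```python
from statistics import mean


def mean_beta(panel_betas):
    """Mean beta-value across a panel; each beta-value must lie in [0, 1]."""
    for beta in panel_betas:
        if not 0.0 <= beta <= 1.0:
            raise ValueError("beta-values must lie between 0 and 1")
    return mean(panel_betas)


def stratify_by_mean_beta(panel_betas, threshold, higher_is_positive=True):
    """Call "positive" or "negative" by comparing the panel mean to a threshold."""
    value = mean_beta(panel_betas)
    positive = value >= threshold if higher_is_positive else value < threshold
    return "positive" if positive else "negative"


# Invented three-CpG panel and an invented threshold of 0.3.
print(stratify_by_mean_beta([0.2, 0.4, 0.6], 0.3))
```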
In any of the assays described herein, wherein:
In any of the assays described herein, wherein:
In any of the assays described herein, wherein:
In any of the assays described herein, wherein:
In any of the assays described herein, wherein:
In any of the assays described herein, wherein:
In any of the assays described herein, wherein:
In any of the assays described herein, wherein:
In any of the assays described herein, wherein:
In any of the assays described herein, wherein:
In any of the assays described herein, wherein:
In any of the assays described herein:
In any of the described assays, the methylation status of the one or more CpGs in the panel is preferably determined by a β-value analysis, and the assay is for assessing the presence, absence or development of CIN3 and/or cancer, preferably the cancer is cervical cancer or endometrial cancer. Most preferably, the cancer is cervical cancer.
The ROC data set out in Table 7 corresponding to each of SEQ ID NOs: 5001 to 5418 are derived by determining a cancer index value from a panel of CpGs in each instance whereby the panel comprises the CpGs denoted by [[CG]].
The predicting of the presence, absence, or development of cancer in an individual may particularly involve determining percent methylated reference for the panel of one or more CpGs. A threshold percent methylated reference value may be applied in order to stratify an individual as having or not having cancer, or of having a high or low risk of CIN3 and/or cancer development, preferably wherein the cancer is cervical or endometrial cancer, more preferably wherein the cancer is cervical cancer.
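For orientation, a percent methylated reference value is conventionally calculated in MethyLight-type assays by normalising the target signal in the sample to a control reaction, and expressing that ratio as a percentage of the same ratio measured in fully methylated reference DNA. The description does not spell the calculation out, so the following is a sketch under that conventional definition, with hypothetical names and invented quantities.

```python
def percent_methylated_reference(sample_target, sample_control,
                                 reference_target, reference_control):
    """PMR = 100 * (target/control in sample) / (target/control in reference DNA)."""
    quantities = (sample_target, sample_control, reference_target, reference_control)
    if sample_target < 0 or min(quantities[1:]) <= 0:
        raise ValueError("control and reference quantities must be positive")
    sample_ratio = sample_target / sample_control
    reference_ratio = reference_target / reference_control
    return 100.0 * sample_ratio / reference_ratio


# Invented quantities (arbitrary units) for one CpG-containing amplicon.
print(round(percent_methylated_reference(50.0, 1000.0, 400.0, 2000.0), 1))
```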
In any of the assays described herein, the step of determining the methylation status of the one or more CpGs in the panel may comprise determining each CpG within:
In any of the described assays, the methylation status of the one or more CpGs in the panel is preferably determined by a percent methylated reference analysis, and the assay is for assessing the presence, absence or development of CIN3 and/or cancer, preferably the cancer is cervical cancer or endometrial cancer. Most preferably, the cancer is cervical cancer.
The ROC data set out in Tables 8, 10, 11 and 13 corresponding to each of SEQ ID NOs: 5703 to 5786 are derived by determining a cancer index value from a panel of CpGs, wherein the panel in each instance comprises all of the CpGs in the sequence(s) defined by the SEQ ID NO. The ROC data set out in Tables 8 and 13 relate to the performance of the assay of the invention in stratifying an individual as having or not having CIN3, or as having a high or low risk of developing CIN3. The ROC data set out in Table 11 relates to the performance of the assay of the invention in stratifying an individual as having or not having endometrial cancer, or as having a high or low risk of developing endometrial cancer.
In view of the observations described herein (see Examples), the inventors derived a cancer index based on an analysis of methylation status (DNAme; as described above) for use in assays for assessing the presence or development of cancer in an individual.
As explained herein, the described assays particularly relate to the assessment of the presence, absence or development of cervical cancer and/or endometrial cancer, particularly cervical cancer.
Any of the assays described herein involve deriving a cancer index value based on the methylation status of a panel of one or more CpGs assayed in a sample provided from an individual, as described and defined herein.
The cancer index value may be derived by any suitable means.
The inventors have identified specific CpGs, as described and defined herein, which may be used to form a panel of CpGs whose methylation status is determined in order to establish cancer index values in accordance with the assays described and defined herein. Using these panels the inventors have demonstrated that it is possible to derive a cancer index value which correlates with and is indicative of normal tissue, i.e. tissue which is CIN3 and/or cancer negative, in particular cervical and/or endometrial tissue which is cancer negative. Accordingly, cancer can be assessed to be absent in the individual. Using these panels the inventors have demonstrated that it is possible to derive a cancer index value which correlates with and is indicative of CIN3 and/or cancer tissue, i.e. tissue which is CIN3 and/or cancer positive, in particular cervical and/or endometrial tissue which is cancer positive. Accordingly, CIN3 and/or cancer can be assessed to be present in the individual. As explained herein, the inventors have shown, using panels of the CpGs that have been identified, that the DNA methylation profile of normal cells, particularly from epithelial cells, particularly derived from the cervix, vagina, the buccal area, blood or urine, or from a cervical liquid-based cytology sample, and more preferably from a cervical smear sample, as indicated by the cancer index value, is dynamic and subject to change on a continuum from indicating CIN3 and/or cancer negative to CIN3 and/or cancer positive tissue. In particular, the cancer index value described herein acts as a surrogate for indicating whether the cervical and/or endometrial tissue of an individual is cancer negative or cancer positive to a high degree of statistical accuracy.
As such, using panels of the CpGs that have been identified it is possible to establish a cancer index value scale that can be used to assess the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual.
As described herein, the inventors have used certain methods for determining the methylation status of specific CpGs in the population of DNA molecules in the sample. For example, in one method a percent methylated reference (PMR) value of a CpG may be determined. In another method the methylation β-values of a CpG may be determined. Different mechanisms may be employed to determine specific values depending on the circumstances, such as PCR-based mechanisms or array-based mechanisms.
As will be apparent to a skilled person, in the assays of the invention the steps of determining the methylation status of specific CpGs in the population of DNA molecules in the sample are not limited to any one specific methodology. As the skilled person will appreciate, because the cancer index value is based on the methylation status of CpGs, and since the methylation status of CpGs can be represented by values which may be specific to a specific methodology, e.g. percent methylated reference (PMR) value or methylation β-value, then the range of cancer index values which define cancer negative and cancer positive samples may be dependent upon the methodology used to determine the methylation status of CpGs. Nevertheless, a user may readily reproduce and implement the assays of the invention using any suitable methodology for determining the methylation status of CpGs, provided that the same methodology is used consistently. Moreover, the user can readily establish, de novo, cancer index values which define cancer negative and cancer positive samples by determining the methylation status of CpGs in panels constituting the specific CpGs disclosed herein from known cancer negative and cancer positive patient samples. Once such cancer index values are established using the CpGs identified herein, a user may use these values as a basis for assessing the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in any test individual whose cancer status is to be determined. Accordingly, cancer index values according to the present invention are not limited to specific methods of determination of methylation status of CpGs. On the contrary, the skilled person will appreciate that cancer index values can be established which reflect the intrinsic capabilities of the CpGs identified herein to correlate methylation status with CIN3 and/or cancer disease status.
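One way a user might establish a cutpoint de novo from known positive and negative samples is sketched below. The selection criterion used here, Youden's J statistic (sensitivity + specificity − 1), is an assumption made for illustration and is not stated in this description; the scores are invented and must all have been obtained with the same methylation methodology.

```python
def youden_cutpoint(positive_scores, negative_scores):
    """Return (cutpoint, J) maximising sensitivity + specificity - 1.

    Scores at or above the cutpoint are called positive. Every observed
    score is tried as a candidate; the lowest best candidate is kept.
    """
    best_cut, best_j = None, -1.0
    for cut in sorted(set(positive_scores) | set(negative_scores)):
        sens = sum(1 for s in positive_scores if s >= cut) / len(positive_scores)
        spec = sum(1 for s in negative_scores if s < cut) / len(negative_scores)
        j = sens + spec - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Invented index values from known cancer positive and negative samples.
positives = [0.2, 0.5, 0.9]
negatives = [-0.7, -0.3, 0.3, -0.5]
print(youden_cutpoint(positives, negatives))
```

Other criteria, such as fixing a minimum specificity and maximising sensitivity, could equally be used; the appropriate choice depends on the intended clinical use.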
Accordingly, the cancer index value may be derived by assessing the methylation status of the one or more CpGs in the panel in a sample provided from an individual by any suitable means.
The step of determining the methylation status of each CpG in the panel of one or more CpGs may be achieved by determining a percent methylated reference (PMR) value of each one of the one or more CpGs. The step of determining the methylation status of each CpG in the panel of one or more CpGs may be achieved by determining the methylation β-value of each one of the one or more CpGs.
In any of the assays described herein, the methylation status of the CpGs may be determined by any suitable means. For example, in any of the assays described herein the step of determining the methylation status of each CpG in the panel of one or more CpGs may comprise:
The step of determining the methylation status of each CpG in the panel of one or more CpGs may comprise a conversion step in order to distinguish methylated CpG dinucleotides relative to non-methylated CpG dinucleotides. The conversion step may comprise e.g. bisulphite conversion or TAPS (TET-assisted pyridine borane sequencing) conversion of the DNA in a sample that is to be applied to any one or more of a. to c. above. TAPS may particularly involve the steps of oxidising 5-methylcytosine bases (5mC) to 5-carboxylcytosine bases (5caC), preferably by ten-eleven translocation (TET), and/or oxidising 5-hydroxymethylcytosine bases (5hmC) to 5-carboxylcytosine bases (5caC), preferably by ten-eleven translocation (TET); followed by reducing 5-carboxylcytosine bases (5caC) to dihydrouracil bases (DHU), optionally with pyridine borane.
The step of determining the methylation status of each CpG in the panel of one or more CpGs may additionally, or alternatively, comprise the use of TempO-seq (templated oligo-sequencing). The oligonucleotides in the context of TempO-seq may or may not be designed such that they hybridise with methylated CpG dinucleotides following a prior conversion as described herein.
The step of determining the methylation status of each CpG in the panel of one or more CpGs may comprise contacting the DNA in the sample with one or more methylation sensitive restriction endonucleases that cleave methylated and/or unmethylated forms of their restriction sites, and preferably the contacting of the DNA is prior to performing any one of a. to c. above. In assays of the invention wherein methylation sensitive restriction enzymes are used, one or more control reactions are performed. Preferably, the one or more control reactions involve interrogation of known loci that contain (i) no restriction endonuclease sites; (ii) a restriction site that is methylated; (iii) a restriction site that is unmethylated.
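The two conversion chemistries produce opposite read-outs, which the following sketch illustrates in silico for a single strand. It is a simplification made for illustration: after bisulphite conversion and amplification an unmethylated cytosine is read as T while a methylated cytosine is still read as C, whereas after TAPS a methylated cytosine (converted to DHU) is read as T while an unmethylated cytosine is still read as C. The sequence and the methylated position are invented.

```python
def convert_read(sequence, methylated_positions, chemistry):
    """Return the bases read after conversion and amplification (one strand only).

    methylated_positions holds 0-based indices of methylated cytosines.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(sequence.upper()):
        if base != "C":
            converted.append(base)
        elif chemistry == "bisulphite":
            # Unmethylated C is deaminated to U (read as T); methylated C is kept.
            converted.append("C" if i in methylated else "T")
        elif chemistry == "TAPS":
            # Methylated C becomes DHU (read as T); unmethylated C is untouched.
            converted.append("T" if i in methylated else "C")
        else:
            raise ValueError("chemistry must be 'bisulphite' or 'TAPS'")
    return "".join(converted)


# Invented sequence with one methylated cytosine at index 1.
print(convert_read("ACGTCGCA", [1], "bisulphite"))
print(convert_read("ACGTCGCA", [1], "TAPS"))
```

Because TAPS changes only the modified cytosines, most of the sequence complexity of the original DNA is retained, whereas bisulphite conversion changes every unmethylated cytosine.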
Using any of the methods for determining the methylation status of each CpG in the panel of one or more CpGs, the proportion of methylated and unmethylated CpGs at any given locus may be determined, thereby enabling generation of a cancer index value.
Preferably, the step of determining in the population of DNA molecules in the sample the methylation status of a panel of one or more CpGs further comprises determining a β value of each CpG. Deriving the cancer index value may involve providing a methylation β-value data set comprising the methylation β-values for each CpG in the panel of one or more CpGs.
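For array-based measurements, a β-value is conventionally calculated from the methylated and unmethylated signal intensities at a CpG as M / (M + U + offset), where a small offset (commonly 100) stabilises the value at low intensities. The description does not recite this formula, so the sketch below reflects that convention as an assumption; the intensities are invented.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Beta-value from methylated (M) and unmethylated (U) signal intensities."""
    m = max(methylated, 0.0)  # negative background-corrected signals are clipped
    u = max(unmethylated, 0.0)
    if m + u + offset == 0.0:
        raise ValueError("no signal and no offset; beta-value is undefined")
    return m / (m + u + offset)


def beta_dataset(intensities):
    """Build a {CpG identifier: beta-value} data set from {id: (M, U)} pairs."""
    return {cpg: beta_value(m, u) for cpg, (m, u) in intensities.items()}


# Invented intensities for two hypothetical CpG identifiers.
print(beta_dataset({"cpg_a": (3000.0, 1000.0), "cpg_b": (200.0, 3800.0)}))
```

With the offset set to zero the same function returns the plain proportion of methylated signal, which is the quantity obtained from sequencing read counts.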
Methylation of DNA is a recognised form of epigenetic modification which has the capability of altering the expression of genes and other elements such as microRNAs. In cancer development and progression, methylation may have the effect of e.g. silencing tumor suppressor genes and/or increasing the expression of oncogenes. Other forms of dysregulation may occur as a result of methylation. Methylation of DNA occurs at discrete loci which are predominately dinucleotides consisting of a CpG motif, but may also occur at CHH motifs (where H is A, C, or T). During methylation, a methyl group is added to the fifth carbon of cytosine bases to create methylcytosine.
Methylation can occur throughout the genome and is not limited to regions with respect to an expressed sequence such as a gene. Methylation typically, but not always, occurs in a promoter or other regulatory region of an expressed sequence, such as an enhancer element. Most typically, methylated CpGs are clustered in CpG islands, for example CpG islands present in the regulatory regions of genes, especially in their promoter regions.
Typically, an assessment of DNA methylation status involves analysing the presence or absence of methyl groups in DNA, for example methyl groups on the 5 position of one or more cytosine nucleotides. Preferably, the methylation status of one or more cytosine nucleotides present as a CpG dinucleotide (where C stands for Cytosine, G for Guanine and p for the phosphate group linking the two) is assessed.
A variety of techniques are available for the identification and assessment of CpG methylation status, as will be outlined briefly below. The assays described herein encompass any suitable technique for the determination of CpG methylation status.
Methyl groups are lost from a starting DNA molecule during conventional in vitro handling steps such as PCR. To avoid this, techniques for the detection of methyl groups commonly involve the preliminary treatment of DNA prior to subsequent processing, in a way that preserves the methylation status information of the original DNA molecule. Such preliminary techniques involve three main categories of processing, i.e. bisulphite modification, restriction enzyme digestion and affinity-based analysis. Products of these techniques can then be coupled with sequencing or array-based platforms for subsequent identification or qualitative assessment of CpG methylation status.
Techniques involving bisulphite modification of DNA have become the most common assays for detection and assessment of methylation status of CpG dinucleotides. Treatment of DNA with bisulphite, e.g. sodium bisulphite, converts cytosine bases to uracil bases, but has no effect on 5-methylcytosines. Thus, the presence of a cytosine in bisulphite-treated DNA is indicative of the presence of a cytosine base which was previously methylated in the starting DNA molecule. Such cytosine bases can be detected by a variety of techniques. For example, primers specific for unmethylated versus methylated DNA can be generated and used for PCR-based identification of methylated CpG dinucleotides. DNA may be amplified, either before or after bisulphite conversion. A separation/capture step may be performed, e.g. using binding molecules such as complementary oligonucleotide sequences. Standard and next-generation DNA sequencing protocols can also be used.
In other approaches, methylation-sensitive enzymes can be employed which digest or cut only in the presence of methylated DNA. Analysis of resulting fragments is commonly carried out using microarrays.
Affinity-based techniques exploit binding interactions to capture fragments of methylated DNA for the purposes of enrichment. Binding molecules such as anti-5-methylcytosine antibodies are commonly employed prior to subsequent processing steps such as PCR and sequencing.
Olkhov-Mitsel and Bapat (2012) provide a comprehensive review of techniques available for the identification and assessment of biomarkers involving methylcytosine.
For the purposes of assessing the methylation status of the CpG-based biomarkers characterised and described herein, any suitable assay can be employed.
Assays described herein may comprise determining methylation status of CpGs by bisulphite converting the DNA. Preferred assays involve bisulphite treatment of DNA, including amplification of the identified CpG loci for methylation specific PCR and/or sequencing and/or assessment of the methylation status of target loci using methylation-discriminatory microarrays.
Amplification of CpG loci can be achieved by a variety of approaches. Preferably, CpG loci are amplified using PCR. A variety of PCR-based approaches may be used. For example, methylation-specific primers may be hybridized to DNA containing the CpG sequence of interest. Such primers may be designed to anneal to a sequence derived from either a methylated or non-methylated CpG locus. Following annealing, a PCR reaction is performed and the presence of a subsequent PCR product indicates the presence of an annealed CpG of identifiable sequence. In such assays, DNA is bisulphite converted prior to amplification. Such techniques are commonly referred to as methylation specific PCR (MSP).
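The basis for methylation-specific primer design can be illustrated by generating the two bisulphite-converted templates against which such primers would be designed. The sketch assumes the idealised cases in which every CpG cytosine is either methylated or unmethylated, considers the top strand only, and uses an invented sequence; the function names are hypothetical.

```python
def bisulphite_templates(sequence):
    """Return (methylated_template, unmethylated_template) after conversion.

    In the methylated template, cytosines in a CpG context stay C and all
    other cytosines become T. In the unmethylated template every C becomes T.
    """
    seq = sequence.upper()
    methylated, unmethylated = [], []
    for i, base in enumerate(seq):
        if base == "C":
            in_cpg = i + 1 < len(seq) and seq[i + 1] == "G"
            methylated.append("C" if in_cpg else "T")
            unmethylated.append("T")
        else:
            methylated.append(base)
            unmethylated.append(base)
    return "".join(methylated), "".join(unmethylated)


def discriminating_positions(sequence):
    """0-based positions at which the two converted templates differ."""
    meth, unmeth = bisulphite_templates(sequence)
    return [i for i, (a, b) in enumerate(zip(meth, unmeth)) if a != b]


# Invented sequence containing two CpG dinucleotides.
print(bisulphite_templates("TCGACCGTA"))
print(discriminating_positions("TCGACCGTA"))
```

Primers that cover one or more of the discriminating positions, ideally near the 3' end, will preferentially anneal to only one of the two templates.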
In other techniques, PCR primers may anneal to the CpG sequence of interest independently of the methylation status, and further processing steps may be used to determine the status of the CpG. Assays are designed so that the CpG site(s) are located between primer annealing sites. This assay scheme is used in techniques such as bisulphite genomic sequencing, COBRA and Ms-SNuPE. In such assays, DNA can be bisulphite converted before or after amplification.
Small-scale PCR approaches may be used. Such approaches commonly involve mass partitioning of samples (e.g. digital PCR). These techniques offer robust accuracy and sensitivity in the context of a highly miniaturised system (pico-liter sized droplets), ideal for the subsequent handling of small quantities of DNA obtainable from the potentially small volume of cellular material present in biological samples, particularly urine samples. A variety of such small-scale PCR techniques are widely available. For example, microdroplet-based PCR instruments are available from a variety of suppliers, including RainDance Technologies, Inc. (Billerica, MA; http://raindancetech.com/) and Bio-Rad, Inc. (http://www.bio-rad.com/). Microarray platforms may also be used to carry out small-scale PCR. Such platforms may include microfluidic network-based arrays e.g. available from Fluidigm Corp. (www.fluidigm.com).
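In partition-based approaches such as digital PCR, the number of template copies is commonly inferred from the fraction of positive partitions using a Poisson correction, because a positive partition may contain more than one copy. The description does not recite this calculation, so the sketch below is an illustration of that standard correction; the two-channel layout (one channel for methylated and one for unmethylated templates) and the counts are hypothetical.

```python
import math


def copies_per_partition(positive, total):
    """Mean copies per partition from the positive fraction: -ln(1 - p)."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total partitions")
    return -math.log(1.0 - positive / total)


def methylated_fraction(positive_methylated, positive_unmethylated, total):
    """Fraction of methylated copies from two detection channels."""
    meth = copies_per_partition(positive_methylated, total)
    unmeth = copies_per_partition(positive_unmethylated, total)
    if meth + unmeth == 0.0:
        raise ValueError("no positive partitions in either channel")
    return meth / (meth + unmeth)


# Invented counts: 10,000 partitions, 5,000 positive for the methylated assay.
print(round(copies_per_partition(5000, 10000), 4))
print(round(methylated_fraction(5000, 2000, 10000), 4))
```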
Following amplification of CpG loci, amplified PCR products may be coupled to subsequent analytical platforms in order to determine the methylation status of the CpGs of interest. For example, the PCR products may be directly sequenced to determine the presence or absence of a methylcytosine at the target CpG or analysed by array-based techniques.
Any suitable sequencing techniques may be employed to determine the sequence of target DNA. In the assays of the present invention, high-throughput, so-called “second generation”, “third generation” and “next generation” techniques can be used to sequence bisulphite-treated DNA.
In second generation techniques, large numbers of DNA molecules are sequenced in parallel. Typically, tens of thousands of molecules are anchored to a given location at high density and sequences are determined in a process dependent upon DNA synthesis. Reactions generally consist of successive reagent delivery and washing steps, e.g. to allow the incorporation of reversible labelled terminator bases, and scanning steps to determine the order of base incorporation. Array-based systems of this type are available commercially e.g. from Illumina, Inc. (San Diego, CA; http://www.illumina.com/).
Third generation techniques are typically defined by the absence of a requirement to halt the sequencing process between detection steps and can therefore be viewed as real-time systems. For example, the base-specific release of hydrogen ions, which occurs during the incorporation process, can be detected in the context of microwell systems (e.g. see the Ion Torrent system available from Life Technologies; http://www.lifetechnologies.com/). Similarly, in pyrosequencing the base-specific release of pyrophosphate (PPi) is detected and analysed. In nanopore technologies, DNA molecules are passed through or positioned next to nanopores, and the identities of individual bases are determined following movement of the DNA molecule relative to the nanopore. Systems of this type are available commercially e.g. from Oxford Nanopore (https://www.nanoporetech.com/). In an alternative assay, a DNA polymerase enzyme is confined in a “zero-mode waveguide” and the identities of incorporated bases are determined by fluorescence detection of gamma-labeled phosphonucleotides (see e.g. Pacific Biosciences; http://www.pacificbiosciences.com/).
In other assays sequencing steps may be omitted. For example, amplified PCR products may be applied directly to hybridization arrays based on the principle of the annealing of two complementary nucleic acid strands to form a double-stranded molecule. Hybridization arrays may be designed to include probes which are able to hybridize to amplification products of a CpG and allow discrimination between methylated and non-methylated loci. For example, probes may be designed which are able to selectively hybridize to a CpG locus containing thymine, indicating the generation of uracil following bisulphite conversion of an unmethylated cytosine in the starting template DNA. Conversely, probes may be designed which are able to selectively hybridize to a CpG locus containing cytosine, indicating the absence of uracil conversion following bisulphite treatment. This corresponds with a methylated CpG locus in the starting template DNA.
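The discrimination principle described above can be illustrated in silico. The following is a minimal sketch only: the toy template sequence, the function names and the use of a simple substring match in place of strand-complementary hybridization are all assumptions made for illustration and are not part of any assay described herein.

```python
def bisulphite_convert(seq, methylated_positions=frozenset()):
    """Return the sequence as read after bisulphite conversion and PCR.

    Unmethylated cytosines are deaminated to uracil and amplified as
    thymine; cytosines at the 0-based indices in `methylated_positions`
    are protected and remain cytosine.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def matching_probe(amplicon, probe_methylated, probe_unmethylated):
    """Report which of two hypothetical probes is found in the amplicon.

    A substring match stands in for hybridization (an assumption; real
    probes are complementary to the target strand).
    """
    if probe_methylated in amplicon:
        return "methylated"
    if probe_unmethylated in amplicon:
        return "unmethylated"
    return "no match"


# Hypothetical template: one non-CpG cytosine (index 1) and one CpG (index 3).
template = "ACTCGAT"
probe_m = bisulphite_convert(template, {3})  # CpG cytosine retained
probe_u = bisulphite_convert(template)       # CpG cytosine read as thymine

print(probe_m, probe_u)
print(matching_probe(bisulphite_convert(template, {3}), probe_m, probe_u))
```

In this sketch the non-CpG cytosine is converted in both cases, so the two probe sequences differ only at the CpG position, which is what allows the methylated and unmethylated forms to be told apart.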
Following the application of a suitable detection system to the array, computer-based analytical techniques can be used to determine the methylation status of a CpG. Detection systems may include, e.g. the addition of fluorescent molecules following a methylation status-specific probe extension reaction. Such techniques allow CpG status determination without the specific need for the sequencing of CpG amplification products. Such array-based discriminatory probes may be termed methylation-specific probes.
Any suitable methylation-discriminatory microarrays may be employed to assess the methylation status of the CpGs described herein. One particular methylation-discriminatory microarray system is provided by Illumina, Inc. (San Diego, CA; http://www.illumina.com/). In particular, the Infinium MethylationEPIC BeadChip array and the Infinium HumanMethylation450 BeadChip array systems may be used to assess the methylation status of CpGs for predicting cancer development as described herein. Such a system exploits the chemical modifications made to DNA following bisulphite treatment of the starting DNA molecule. Briefly, the array comprises beads to which are coupled oligonucleotide probes specific for DNA sequences corresponding to the unmethylated form of a CpG, as well as separate beads to which are coupled oligonucleotide probes specific for DNA sequences corresponding to the methylated form of a CpG. Candidate DNA molecules are applied to the array and selectively hybridize, under appropriate conditions, to the oligonucleotide probe corresponding to the relevant epigenetic form. Thus, a DNA molecule derived from a CpG which was methylated in the corresponding genomic DNA will selectively attach to the bead comprising the methylation-specific oligonucleotide probe, but will fail to attach to the bead comprising the non-methylation-specific oligonucleotide probe. Single-base extension of only the hybridized probes incorporates a labeled ddNTP, which is subsequently stained with a fluorescence reagent and imaged. The methylation status of the CpG is determined by calculating the ratio of the fluorescent signals derived from the methylated and unmethylated sites.
Infinium HumanMethylation450 BeadChip array systems can be used to interrogate CpGs in the assays described herein. Alternative or customised arrays could, however, be employed to interrogate the cancer-specific CpG biomarkers defined herein, provided that they comprise means for interrogating all CpGs for a given assay, as defined herein.
Techniques involving combinations of the above-described assays may also be used. For example, DNA containing CpG sequences of interest may be hybridized to microarrays and then subjected to DNA sequencing to determine the status of the CpG as described above.
In the assays described above, sequences corresponding to CpG loci may also be subjected to an enrichment process if desired. DNA containing CpG sequences of interest may be captured by binding molecules such as oligonucleotide probes complementary to the CpG target sequence of interest. Sequences corresponding to CpG loci may be captured before or after bisulphite conversion or before or after amplification. Probes may be designed to be complementary to bisulphite converted DNA. Captured DNA may then be subjected to further processing steps to determine the status of the CpG, such as DNA sequencing steps.
Capture/separation steps may be custom designed. Alternatively a variety of such techniques are available commercially, e.g. the SureSelect target enrichment system available from Agilent Technologies (http://www.agilent.com/home). In this system biotinylated “bait” or “probe” sequences (e.g. RNA) complementary to the DNA containing CpG sequences of interest are hybridized to sample nucleic acids. Streptavidin-coated magnetic beads are then used to capture sequences of interest hybridized to bait sequences. Unbound fractions are discarded. Bait sequences are then removed (e.g. by digestion of RNA) thus providing an enriched pool of CpG target sequences separated from non-CpG sequences. Template DNA may be subjected to bisulphite conversion and target loci amplified by small-scale PCR such as microdroplet PCR using primers which are independent of the methylation status of the CpG. Following amplification, samples may be subjected to a capture step to enrich for PCR products containing the target CpG, e.g. captured and purified using magnetic beads, as described above. Following capture, a standard PCR reaction is carried out to incorporate DNA sequencing barcodes into CpG-containing amplicons. PCR products are again purified and then subjected to DNA sequencing and analysis to determine the presence or absence of a methylcytosine at the target genomic CpG.
CpG biomarker loci defined herein by SEQ ID NOs 1 to 5000 correspond to Illumina® identifiers (IlmnID) known in the art. These CpG loci identifiers refer to individual CpG sites used in the commercially available Illumina® Infinium Methylation EPIC BeadChip kit and Illumina® Infinium Human Methylation450 BeadChip kit. The identity of each CpG site represented by each CpG loci identifier is publicly available from the Illumina, Inc. website under reference to the CpG sites used in the Infinium Methylation EPIC BeadChip kit and the Infinium Human Methylation450 BeadChip kit.
To complement evolving public databases to provide accurate CpG loci identifiers and strand orientation, Illumina® has developed a method to consistently designate CpG loci based on the actual or contextual sequence of each individual CpG locus. To unambiguously refer to CpG loci in any species, Illumina® has developed a consistent and deterministic CpG loci database to ensure uniformity in the reporting of methylation data. The Illumina® method takes advantage of sequences flanking a CpG locus to generate a unique CpG locus cluster ID. This number is based on sequence information only and is unaffected by genome version. Illumina's standardized nomenclature also parallels the TOP/BOT strand nomenclature (which indicates the strand orientation) commonly used for single nucleotide polymorphism (SNP) designation.
Illumina® Identifiers for the Infinium MethylationEPIC BeadChip and Infinium Human Methylation450 BeadChip system are also available from public repositories such as Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/).
By assessing the methylation status of a CpG it is meant that a determination is made as to whether a given CpG is methylated or unmethylated. In addition, it is meant that a determination is made as to the degree to which a given CpG site is methylated across a population of CpG loci in a sample.
CpG methylation status may be measured indirectly using a detection system such as fluorescence. A methylation-discriminatory microarray may be used. When calculating the degree of methylation of a given CpG, the Illumina® definition of beta-values may be used. The Illumina® methylation beta-value of a specific CpG site is calculated from the intensity of the methylated (M) and unmethylated (U) alleles, as the ratio of fluorescent signals β=Max(M,0)/[Max(M,0)+Max(U,0)+100]. On this scale, 0≤β<1, β values close to 1 indicate 100% methylation whereas values of 0 or close to 0 indicate 0% methylation.
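The beta-value formula above can be written as a short function. This is a minimal sketch: the function and parameter names are hypothetical, and the intensities in the usage lines are invented values chosen only to show the arithmetic.

```python
def illumina_beta(methylated, unmethylated, offset=100.0):
    """Compute beta = max(M,0) / (max(M,0) + max(U,0) + offset).

    `methylated` (M) and `unmethylated` (U) are fluorescence intensities.
    Negative intensities, which can arise after background subtraction,
    are clipped to zero. The constant offset of 100 is the one given in
    the formula above and keeps the denominator positive.
    """
    m = max(float(methylated), 0.0)
    u = max(float(unmethylated), 0.0)
    return m / (m + u + offset)


# Invented intensities, for illustration only.
print(illumina_beta(900, 0))      # mostly methylated signal
print(illumina_beta(-50, 400))    # negative M is clipped to zero
print(illumina_beta(4000, 4000))  # equal signal from both probe types
```

Because of the offset in the denominator, the value returned is always strictly below 1, and is 0 whenever the methylated intensity is zero or negative.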
The methylation status of any one or more CpGs of the CpGs defined by SEQ ID NOs: 1 to 5000 or identified in SEQ ID NOs: 5001 to 5418 may be assessed by any suitable technique. As explained in more detail in the Examples below, one particular exemplary technique which the inventors have used is a methylation discriminatory array, such as an Illumina Infinium MethylationEPIC BeadChip. These assays utilise probes directed to methylated and unmethylated CpGs at a given locus.
Another exemplary technique which the inventors have used to determine the methylation status of any one or more CpGs is a fluorescence-based PCR technique referred to as MethyLight. These assays utilise forward and reverse PCR primers specific for sequences encompassing any one or more of the 5000 CpGs defined according to SEQ ID NOs: 1 to 5000, or identified in SEQ ID NOs: 5001 to 5418 and 5703 to 5786. The methylation status of one or more of the CpGs defined by SEQ ID NOs: 1 to 5000, or identified in SEQ ID NOs: 5001 to 5418 and 5703 to 5786 may therefore be determined by MethyLight analysis. The detectable probes are typically designed such that they hybridise only to methylated forms of the one or more CpGs to be assayed in view of the bisulfite conversion step within a typical MethyLight protocol.
Software programs which aid in the in silico analysis of bisulphite converted DNA sequences and in primer design for the purposes of methylation-specific analyses are generally available and have been described previously.
In risk models for predicting cancer, a receiver-operating-characteristic (ROC) curve analysis is often used, in which the area under the curve (AUC) is assessed. Each point on the ROC curve shows the effect of a rule for turning a risk/likelihood estimate into a prediction of the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual. The AUC measures how well the model discriminates between case subjects and control subjects. An ROC curve that corresponds to a random classification of case subjects and control subjects is a straight line with an AUC of 50%. An ROC curve that corresponds to perfect classification has an AUC of 100%.
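By way of illustration only, the AUC can be computed as the proportion of case/control pairs in which the case subject receives the higher score, with ties counted as one half. The sketch below uses this pairwise formulation; the function name and the index values in the usage lines are hypothetical and are not data from the Examples.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve from two lists of scores.

    Computed as the fraction of (case, control) pairs in which the case
    has the higher score, counting ties as 0.5. This is equivalent to
    the area under the empirical ROC curve.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical index values for three case and three control subjects.
cases = [-0.10, -0.20, -0.40]
controls = [-0.50, -0.35, -0.45]
print(roc_auc(cases, controls))
```

A value of 0.5 from this function corresponds to the random classification described above, and a value of 1.0 to perfect classification.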
In any of the methods described herein, the 95% confidence interval for the ROC AUC may be between 0.60 and 1.
In any of the methods described herein, the interval may be defined as a range having as an upper limit any number between 0.60 and 1. The upper limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99 or 1.00.
In any of the methods described herein, the interval may be defined as a range having as a lower limit any number between 0.60 and 1. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99 or 1.00.
In any of the methods described herein, the interval range may comprise any of the above lower limit numbers combined with any of the above upper limit numbers as appropriate.
Preferably, the upper limit number is 1. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 1 and as a lower limit any number between 0.60 and 1. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99 or 1.00.
The upper limit number may be 0.99. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.99 and as a lower limit any number between 0.60 and 0.99. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98 or 0.99.
The upper limit number may be 0.98. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.98 and as a lower limit any number between 0.60 and 0.98. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97 or 0.98.
The upper limit number may be 0.97. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.97 and as a lower limit any number between 0.60 and 0.97. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96 or 0.97.
The upper limit number may be 0.96. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.96 and as a lower limit any number between 0.60 and 0.96. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95 or 0.96.
The upper limit number may be 0.95. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.95 and as a lower limit any number between 0.60 and 0.95. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94 or 0.95.
The upper limit number may be 0.94. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.94 and as a lower limit any number between 0.60 and 0.94. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93 or 0.94.
The upper limit number may be 0.93. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.93 and as a lower limit any number between 0.60 and 0.93. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92 or 0.93.
The upper limit number may be 0.92. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.92 and as a lower limit any number between 0.60 and 0.92. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91 or 0.92.
The upper limit number may be 0.91. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.91 and as a lower limit any number between 0.60 and 0.91. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90 or 0.91.
The upper limit number may be 0.90. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.90 and as a lower limit any number between 0.60 and 0.90. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89 or 0.90.
The upper limit number may be 0.89. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.89 and as a lower limit any number between 0.60 and 0.89. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88 or 0.89.
The upper limit number may be 0.88. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.88 and as a lower limit any number between 0.60 and 0.88. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87 or 0.88.
The upper limit number may be 0.87. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.87 and as a lower limit any number between 0.60 and 0.87. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86 or 0.87.
The upper limit number may be 0.86. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.86 and as a lower limit any number between 0.60 and 0.86. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85 or 0.86.
The upper limit number may be 0.85. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.85 and as a lower limit any number between 0.60 and 0.85. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84 or 0.85.
The upper limit number may be 0.84. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.84 and as a lower limit any number between 0.60 and 0.84. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83 or 0.84.
The upper limit number may be 0.83. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.83 and as a lower limit any number between 0.60 and 0.83. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82 or 0.83.
The upper limit number may be 0.82. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.82 and as a lower limit any number between 0.60 and 0.82. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81 or 0.82.
The upper limit number may be 0.81. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.81 and as a lower limit any number between 0.60 and 0.81. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80 or 0.81.
The upper limit number may be 0.80. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.80 and as a lower limit any number between 0.60 and 0.80. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79 or 0.80.
The upper limit number may be 0.79. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.79 and as a lower limit any number between 0.60 and 0.79. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78 or 0.79.
The upper limit number may be 0.78. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.78 and as a lower limit any number between 0.60 and 0.78. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77 or 0.78.
The upper limit number may be 0.77. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.77 and as a lower limit any number between 0.60 and 0.77. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76 or 0.77.
The upper limit number may be 0.76. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.76 and as a lower limit any number between 0.60 and 0.76. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75 or 0.76.
The upper limit number may be 0.75. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.75 and as a lower limit any number between 0.60 and 0.75. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74 or 0.75.
The upper limit number may be 0.74. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.74 and as a lower limit any number between 0.60 and 0.74. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73 or 0.74.
The upper limit number may be 0.73. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.73 and as a lower limit any number between 0.60 and 0.73. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72 or 0.73.
The upper limit number may be 0.72. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.72 and as a lower limit any number between 0.60 and 0.72. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71 or 0.72.
The upper limit number may be 0.71. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.71 and as a lower limit any number between 0.60 and 0.71. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70 or 0.71.
The upper limit number may be 0.70. Thus, the 95% confidence ROC AUC interval may be defined as a range having an upper limit of 0.70 and as a lower limit any number between 0.60 and 0.70. The lower limit number may be 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69 or 0.70.
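The manner in which a 95% confidence interval for the ROC AUC is estimated is not specified in the passages above. One common option, shown here purely as an assumption, is a percentile bootstrap in which case and control scores are resampled with replacement. The function names, the number of resamples and the example scores are hypothetical.

```python
import random


def roc_auc(case_scores, control_scores):
    """Pairwise AUC: fraction of case/control pairs ranked correctly."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def bootstrap_auc_ci(cases, controls, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC.

    Assumption: cases and controls are resampled separately, with
    replacement, and the interval is read from the sorted bootstrap AUCs.
    """
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_boot):
        boot_cases = [rng.choice(cases) for _ in cases]
        boot_controls = [rng.choice(controls) for _ in controls]
        aucs.append(roc_auc(boot_cases, boot_controls))
    aucs.sort()
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return aucs[lo_idx], aucs[hi_idx]


def interval_within(ci, lower_limit, upper_limit):
    """True if the interval lies inside [lower_limit, upper_limit]."""
    return lower_limit <= ci[0] and ci[1] <= upper_limit


# Invented scores for illustration only.
cases = [0.9, 0.8, 0.7, 0.6, 0.75]
controls = [0.1, 0.2, 0.3, 0.65, 0.4]
ci = bootstrap_auc_ci(cases, controls, n_boot=500)
print(ci, interval_within(ci, 0.60, 1.00))
```

The helper `interval_within` reflects one reading of the ranges listed above, namely that the estimated interval falls between a chosen lower limit and a chosen upper limit.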
The term “treatment” as used herein is intended to refer to any intervention or procedure performed on an individual, including a surgical intervention or a pharmacological intervention such as the administration of a compound or drug. Any such treatment may be performed for therapeutic purposes or for preventative or prophylactic purposes.
The invention also encompasses the performance of one or more treatment steps following a positive classification of CIN3 and/or cancer, particularly cervical and/or endometrial cancer, based on any of the methods described herein. Said treatments may be considered “therapeutic” treatments.
The invention also encompasses the performance of one or more treatment steps following a negative classification of CIN3 and/or cancer or prediction of an individual being at risk of cancer development, particularly cervical and/or endometrial cancer, based on any of the methods described herein. Said treatments may be considered “risk prevention”, “preventative” or “prophylactic” treatments.
The invention also encompasses the performance of one or more treatment steps following a negative classification of CIN3 and/or cancer or prediction of an individual being at risk of CIN3 and/or cancer development based on any of the methods described herein, in an individual that harbours one or more mutations that predispose the individual to an increased risk of developing CIN3 and/or cancer.
The invention thus encompasses a method of treating a cancer and/or CIN3 patient comprising administering chemotherapy, radiation, immunotherapy or any therapy described herein to the patient determined to have a cancer index value which indicates that the patient is positive for cancer and/or CIN3 based on any of the assays described herein, preferably wherein the cancer is cervical cancer.
The invention thus encompasses a method of treating and/or preventing CIN3 and/or cancer in an individual, the method comprising:
The invention thus encompasses a method of treating and/or preventing cancer in an individual, the method comprising:
In any of the methods of treatment encompassed by the invention, the step of predicting the presence or development of CIN3 and/or cancer, preferably wherein the cancer is cervical and/or endometrial cancer, in an individual may involve deriving a cancer index value.
In any of the methods of treatment encompassed by the invention, the step of predicting the presence or development of CIN3 and/or cancer in an individual may involve the use of any one of the arrays described herein.
In any of the methods of treatment encompassed by the invention, the step of stratifying the individual may involve applying any one of the thresholds according to any one of the assays of the invention described herein.
The step of administering one or more treatments may comprise different treatment steps depending on the stratification of an individual on the basis of their CIN3 and/or cancer status or their risk of having CIN3 and/or cancer or on the basis of risk of CIN3 and/or cancer development, particularly cervical and/or endometrial cancer, most preferably cervical cancer. Particularly, the amount and invasiveness of the treatments administered may vary dependent on the stratification of an individual on the basis of their CIN3 and/or cancer status or their risk of having CIN3 and/or cancer or on the basis of their risk of CIN3 and/or cancer development. The treatments administered to the individual may comprise any treatments considered suitable by a person skilled in the art.
For example, when the individual is assessed as not having CIN3 and/or cancer, as having a low risk of having CIN3 and/or cancer or as having a low risk of CIN3 and/or cancer development, and wherein the cancer index value is about −0.530 or more and less than −0.330, and preferably wherein the assay comprises determining methylation β-values for each CpG in the panel of one or more CpGs, the individual is subjected to one or more treatments according to cancer index value, and the one or more treatments may comprise a repeat assay according to the assays of the invention described herein, preferably wherein the repeat assay is performed about one year after the previous assay, preferably wherein the cancer is cervical and/or endometrial cancer, most preferably wherein the cancer is cervical cancer.
For example, when the individual is assessed as having a moderate risk of having CIN3 and/or cancer or as having a moderate risk of CIN3 and/or cancer development, and wherein the cancer index value is about −0.330 or more and less than −0.170, and preferably wherein the assay comprises determining methylation β-values for each CpG in the panel of one or more CpGs, the individual is subjected to one or more treatments according to cancer index value, and the one or more treatments may comprise a test for human papilloma virus (HPV) status and wherein:
For example, when the individual is assessed as having CIN3 and/or cancer or as having a high risk of CIN3 and/or cancer development, and wherein the cancer index value is about −0.170 or more, and preferably wherein the assay comprises determining methylation β-values for each CpG in the panel of one or more CpGs, the individual is subjected to one or more treatments according to cancer index value, and the one or more treatments may comprise a colposcopy and, where the colposcopy is negative, an endometrial biopsy and hysteroscopy.
For example, when the individual is assessed as having CIN3 and/or cancer or as having a high risk of CIN3 and/or cancer development, and wherein the cancer index value is about:
In any of the assays described herein, the one or more treatments that the individual is subjected to may be repeated at any suitable interval as would be understood by a person skilled in the art. For example, any one of the one or more treatments that the individual is subjected to may be repeated on a monthly, three monthly, six monthly, yearly or two yearly basis following an initial administration.
Other exemplary treatments comprise one or more surgical procedures, one or more chemotherapeutic agents, one or more cytotoxic chemotherapeutic agents, one or more radiotherapeutic agents, one or more immunotherapeutic agents, one or more biological therapeutics, one or more anti-hormonal treatments or any combination of the above following a positive diagnosis of cancer.
In any of the methods of treatment described herein, the individual may particularly be administered treatments recited in Table 9. Four sub-groups defined by ranges of cancer index values are specified in Table 9 as corresponding to preferred clinical actions, comprising intensified screening, administration of therapeutics and surgery.
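The stratification by cancer index value described in the preceding paragraphs can be sketched as a simple threshold lookup. The sketch uses only the three ranges stated in the text above (about −0.530 to less than −0.330, about −0.330 to less than −0.170, and about −0.170 or more). The function name, the group labels and the handling of values below −0.530, for which no range is stated in this passage, are assumptions, and the contents of Table 9 are not reproduced here.

```python
# Cut-offs taken from the ranges stated in the text; the text gives them
# as approximate ("about"), so exact boundary handling is an assumption.
LOW_CUTOFF = -0.530
MODERATE_CUTOFF = -0.330
HIGH_CUTOFF = -0.170

# Follow-up actions named in the text for each group.
FOLLOW_UP = {
    "low": "repeat assay about one year after the previous assay",
    "moderate": "test for HPV status",
    "high": "colposcopy; if negative, endometrial biopsy and hysteroscopy",
}


def stratify(cancer_index):
    """Map a cancer index value to a risk group label.

    Values below the lowest stated range are returned as
    'below_stated_ranges' (hypothetical label, not from the text).
    """
    if cancer_index >= HIGH_CUTOFF:
        return "high"
    if cancer_index >= MODERATE_CUTOFF:
        return "moderate"
    if cancer_index >= LOW_CUTOFF:
        return "low"
    return "below_stated_ranges"


# Invented index values for illustration only.
for value in (-0.60, -0.40, -0.20, -0.10):
    group = stratify(value)
    print(value, group, FOLLOW_UP.get(group, "not specified in this passage"))
```

Each lower bound is treated as inclusive and each upper bound as exclusive, following the wording "or more and less than" used above.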
Cancer treatments may be administered to an individual harbouring cancer or at risk of cancer development, in an amount sufficient to prevent, treat, cure, alleviate or partially arrest cancer or one or more of its symptoms. Such treatments may result in a decrease in the severity of cancer symptoms and/or a decreased cancer index value, or an increase in the frequency or duration of symptom-free periods. A treatment amount adequate to accomplish this is defined as a “therapeutically effective amount”. Effective amounts for a given purpose will depend on the severity of cancer and/or the individual's cancer index value as well as the weight and general state of the individual. As used herein, the term “individual” includes any human, preferably wherein the human is a woman. As used herein, “treatment” is to be considered synonymous with “therapeutic agent”.
The following therapeutic agents may be administered to an individual based on their cancer risk alone or in combination with any other treatment described herein. The therapeutic agent may be directly attached, for example by chemical conjugation, to an antibody. Methods of conjugating agents or labels to an antibody are known in the art. For example, carbodiimide conjugation (Bauminger & Wilchek (1980) Methods Enzymol. 70, 151-159) may be used to conjugate a variety of agents, including doxorubicin, to antibodies or peptides. The water-soluble carbodiimide, 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide (EDC), is particularly useful for conjugating a functional moiety to a binding moiety. Other methods for conjugating a moiety to antibodies can also be used. For example, sodium periodate oxidation followed by reductive alkylation of appropriate reactants can be used, as can glutaraldehyde cross-linking. However, it is recognised that, regardless of which method of producing a conjugate of the invention is selected, a determination must be made that the antibody maintains its targeting ability and that the functional moiety maintains its relevant function.
A cytotoxic moiety may be directly and/or indirectly cytotoxic. By “directly cytotoxic” it is meant that the moiety is one which on its own is cytotoxic. By “indirectly cytotoxic” it is meant that the moiety is one which, although it is not itself cytotoxic, can induce cytotoxicity, for example by its action on a further molecule or by further action on it. The cytotoxic moiety may be cytotoxic only when intracellular and is preferably not cytotoxic when extracellular.
Cytotoxic chemotherapeutic agents are well known in the art. Cytotoxic chemotherapeutic agents, such as anticancer agents, include: alkylating agents including nitrogen mustards such as mechlorethamine (HN2), cyclophosphamide, ifosfamide, melphalan (L-sarcolysin) and chlorambucil; ethylenimines and methylmelamines such as hexamethylmelamine, thiotepa; alkyl sulphonates such as busulfan; nitrosoureas such as carmustine (BCNU), lomustine (CCNU), semustine (methyl-CCNU) and streptozocin (streptozotocin); and triazenes such as dacarbazine (DTIC; dimethyltriazenoimidazole-carboxamide); antimetabolites including folic acid analogues such as methotrexate (amethopterin); pyrimidine analogues such as fluorouracil (5-fluorouracil; 5-FU), floxuridine (fluorodeoxyuridine; FUdR) and cytarabine (cytosine arabinoside); and purine analogues and related inhibitors such as mercaptopurine (6-mercaptopurine; 6-MP), thioguanine (6-thioguanine; TG) and pentostatin (2′-deoxycoformycin). Natural products including vinca alkaloids such as vinblastine (VLB) and vincristine; epipodophyllotoxins such as etoposide and teniposide; antibiotics such as dactinomycin (actinomycin D), daunorubicin (daunomycin; rubidomycin), doxorubicin, bleomycin, plicamycin (mithramycin) and mitomycin (mitomycin C); enzymes such as L-asparaginase; and biological response modifiers such as interferon alphenomes. Miscellaneous agents including platinum coordination complexes such as cisplatin (cis-DDP) and carboplatin; anthracenedione such as mitoxantrone and anthracycline; substituted urea such as hydroxyurea; methyl hydrazine derivative such as procarbazine (N-methylhydrazine, MIH); and adrenocortical suppressant such as mitotane (o,p′-DDD) and aminoglutethimide; taxol and analogues/derivatives; and hormone agonists/antagonists such as flutamide and tamoxifen.
A cytotoxic chemotherapeutic agent may be a cytotoxic peptide or polypeptide moiety which leads to cell death. Cytotoxic peptide and polypeptide moieties are well known in the art and include, for example, ricin, abrin, Pseudomonas exotoxin, tissue factor and the like. Methods for linking them to targeting moieties such as antibodies are also known in the art. Other ribosome inactivating proteins are described as cytotoxic agents in WO 96/06641. Pseudomonas exotoxin may also be used as the cytotoxic polypeptide. Certain cytokines, such as TNFα and IL-2, may also be useful as cytotoxic agents.
Certain radioactive atoms may also be cytotoxic if delivered in sufficient doses. Radiotherapeutic agents may comprise a radioactive atom which, in use, delivers a sufficient quantity of radioactivity to the target site so as to be cytotoxic. Suitable radioactive atoms include phosphorus-32, iodine-125, iodine-131, indium-111, rhenium-186, rhenium-188 or yttrium-90, or any other isotope which emits enough energy to destroy neighbouring cells, organelles or nucleic acid. Preferably, the isotopes and density of radioactive atoms in the agents of the invention are such that a dose of more than 4000 cGy (preferably at least 6000, 8000 or 10000 cGy) is delivered to the target site and, preferably, to the cells at the target site and their organelles, particularly the nucleus.
The radioactive atom may be attached to an antibody, antigen-binding fragment, variant, fusion or derivative thereof in known ways. For example, EDTA or another chelating agent may be attached to the binding moiety and used to attach 111In or 90Y. Tyrosine residues may be directly labelled with 125I or 131I.
A cytotoxic chemotherapeutic agent may be a suitable indirectly-cytotoxic polypeptide. In a particularly preferred embodiment, the indirectly cytotoxic polypeptide is a polypeptide which has enzymatic activity and can convert a non-toxic and/or relatively non-toxic prodrug into a cytotoxic drug. With antibodies, this type of system is often referred to as ADEPT (Antibody-Directed Enzyme Prodrug Therapy). The system requires that the antibody locates the enzymatic portion to the desired site in the body of the patient and after allowing time for the enzyme to localise at the site, administering a prodrug which is a substrate for the enzyme, the end product of the catalysis being a cytotoxic compound. The object of the approach is to maximise the concentration of drug at the desired site and to minimise the concentration of drug in normal tissues. In a preferred embodiment, the cytotoxic moiety is capable of converting a non-cytotoxic prodrug into a cytotoxic drug.
In any of the methods of treatment described herein, the one or more treatments that the individual is subjected to may be repeated on one or more occasions. The one or more treatments may be repeated at regular intervals. The repetitive nature of the treatment administration may depend on the particular treatment being administered. Some treatments may require repetitive administration at greater frequency than others. The skilled person would be aware of the frequency of administration required for therapies known in the art. The one or more treatments may be repeated weekly, two weekly, three weekly, four weekly, monthly, three monthly, six monthly, yearly, two yearly, three yearly, four yearly, or five yearly.
In any of the methods described herein, when the individual is assessed as having a cancer index value of less than about −0.530, and preferably wherein the assay comprises determining methylation β-values for each CpG in the panel of one or more CpGs, no treatment is administered to the individual.
In any of the methods described herein, when the individual is assessed as having a cancer index value of:
The invention also provides methods of monitoring the presence, or risk of the presence or development of CIN3 and/or cancer in an individual.
“Monitoring” in the context of the present invention may refer to longitudinal assessment of an individual's CIN3 and/or cancer status, risk of harbouring CIN3 and/or cancer or risk of CIN3 and/or cancer development. This longitudinal assessment may be carried out according to any of the assays of the invention described herein. This longitudinal assessment may involve performance of any of the assays of the invention described herein to predict the presence or development of CIN3 and/or cancer in an individual at more than one time point over the course of an undetermined time window. The time window may be any period of time whilst the individual is still living. The time window may persist for the lifetime of the individual. The time window may persist until the individual's CIN3 and/or cancer status, risk of harbouring CIN3 and/or cancer or risk of CIN3 and/or cancer development falls below a certain level. The level may be a particular cancer index value.
The invention thus encompasses a method of monitoring for the presence, absence or development of CIN3 and/or cancer, particularly cervical and/or endometrial cancer, most preferably cervical cancer, in an individual, the method comprising:
The invention also encompasses a method of monitoring for the presence, absence or development of CIN3 and/or cancer, particularly cervical and/or endometrial cancer, most preferably cervical cancer, in an individual, the method comprising:
In any of the methods of monitoring described herein, the steps of assessing the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual based on a cancer index value may involve the application of threshold values. Threshold values can provide an indication of an individual's CIN3 and/or cancer status, risk of having CIN3 and/or cancer or an individual's risk of CIN3 and/or cancer development. For example, cancer index values may indicate the presence or absence of CIN3 and/or cancer, or a high or low risk of having or developing CIN3 and/or cancer. In any of the methods of monitoring encompassed by the invention, the step of predicting the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual involves deriving a cancer index value.
The invention further encompasses a method of measuring methylation in a patient at multiple time points comprising (a) assessing the presence, absence or development of CIN3 and/or cancer in an individual by performing any one of the assays of the invention described herein at a first time point; (b) assessing the presence, absence or development of CIN3 and/or cancer in the individual by performing any one of the assays of the invention described herein at one or more further time points, and (c) detecting differential methylation status between (a) and (b).
In any of the methods of monitoring described herein, depending on the risk of the presence or development of CIN3 and/or cancer in the individual, one or more treatments are administered to the individual according to any one of the methods of treatment encompassed by the invention and described herein, or wherein the cancer index value of the individual is less than about −0.530, and preferably wherein the assay comprises determining methylation β-values for each CpG in the panel of one or more CpGs, no treatment is administered to the individual. Different treatments may be administered depending on the stratification of an individual on the basis of their CIN3 and/or cancer status, risk of harbouring CIN3 and/or cancer or on the basis of their risk of CIN3 and/or cancer development. The method may further comprise administration of one or more treatments according to the methods of treatment described herein.
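The stratification by cancer index value referred to above can be illustrated with a minimal sketch. It uses only the cut-offs quoted in this description (−0.530, −0.330 and −0.170) and treats them as exact boundaries, whereas the description qualifies each with "about". The function name and the band labels are illustrative assumptions; in particular, the band between −0.530 and −0.330 is not characterised in this passage and is labelled accordingly.

```python
from bisect import bisect_right

# Cut-offs quoted in the description; treated here as exact boundaries.
CUTOFFS = [-0.530, -0.330, -0.170]

# Illustrative labels only (assumed wording, one label per half-open interval).
BANDS = [
    "below -0.530: no treatment administered",
    "-0.530 to below -0.330: band not characterised in this passage",
    "-0.330 to below -0.170: moderate risk",
    "-0.170 or more: CIN3 and/or cancer, or high risk",
]


def stratify(cancer_index: float) -> str:
    """Return the band whose interval contains the cancer index value."""
    # bisect_right places a value equal to a cut-off in the higher band,
    # matching the "X or more and less than Y" wording of the description.
    return BANDS[bisect_right(CUTOFFS, cancer_index)]
```

Calling `stratify` on the index value obtained at each monitoring time point would show whether an individual has moved between bands over time.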
The cancer index value may change between any two or more time points. For this reason, longitudinal monitoring of an individual's cancer index value could be of particular benefit to the assessment of, for example, cancer progression, prevention of recurrence of cancer, cancer treatment efficacy, or cancer efficacy.
In any of the methods of monitoring described herein, the one or more further time points may be any suitable time point. Preferably the one or more further time points may be of suitable distance apart for sufficiently frequent screening in order to predict any particularly early onset cases of presence or development of cancer in an individual. Preferably the one or more further time points may be of suitable distance apart for assessing the efficacy of one or more treatments. Preferably the one or more further time points may be of suitable distance apart for predicting whether an individual remains free of cancer after a successful course of treatment. The one or more further time points may be about monthly, about two monthly, about three monthly, about four monthly, about five monthly, about six monthly, about seven monthly, about eight monthly, about nine monthly, about ten monthly, about eleven monthly, about yearly, about two yearly, or more than two yearly.
In any of the methods of monitoring described herein, changes may be made to the one or more treatments where positive or negative responses to the one or more treatments are observed. Treatments may be changed in accordance with the methods of treatment described herein. Treatments may particularly be changed if the individual's cancer status or risk stratification, based on their cancer index value, changes.
In any of the methods of monitoring encompassed by the invention, the step of predicting the presence or development of CIN3 and/or cancer in an individual may involve the use of any one of the arrays described herein.
The assays described herein are preferably performed on DNA from cells derived from/obtained from samples from tissue in which the native tissue structure is preserved e.g. a biopsy, or a sample comprising exfoliated cells from a tissue surface. The samples may comprise epithelial cells. The sample may particularly be derived from the cervix, the vagina, the buccal area, blood and/or urine. The sample is preferably a cervical liquid-based cytology sample, and more preferably a cervical smear sample.
Preferably, any one of the assays described herein for assessing the presence, absence or development of CIN3 and/or cancer, particularly cervical or endometrial cancer, most preferably cervical cancer, in an individual comprises providing a sample which has been taken from the individual. Preferably the individual is a woman.
In any of the assays described herein, the assay may or may not encompass the step of obtaining the sample from the individual. In assays which do not encompass the step of obtaining the sample from the individual, a sample which has previously been obtained from the individual is provided.
The sample may be provided directly from the individual for analysis or may be derived from stored material, e.g. frozen, preserved, fixed or cryopreserved material.
In any of the assays described herein, the sample may be self-collected or collected by any suitable medical professional.
In any of the assays described herein, the sample may comprise cells. The sample may comprise genetic material such as DNA and/or RNA.
Any of the assays described herein may involve providing a biological sample from the patient as the source of patient DNA for methylation analysis.
Any of the assays described herein may involve obtaining patient DNA from a biological sample which has previously been obtained from the patient.
Any of the assays described herein may involve obtaining a biological sample from the patient as the source of patient DNA for methylation analysis. The sample may be self-collected or collected by any suitable medical professional. Procedures for obtaining a biological sample include biopsy.
Methods for sample isolation and for the subsequent extraction and isolation of DNA from such cell or tissue samples in preparation for assessing DNA methylation, are well known to those skilled in the art. In the context of the assays or methods described herein, the entirety of a sample may be used, or alternatively cells may be concentrated or cell types may be fractionated in order to only apply subsets of one or more cell types to the present assays or methods. Any suitable methods of concentration or fractionation may be used.
The methods described herein may be applied to any cancer. Preferably, the methods described herein may be applied to cervical cancer and/or endometrial cancer, particularly cervical cancer. The methods described herein are most preferably applied to cervical cancer.
The cancer may be a primary cancer lesion. The cancer may be a secondary cancer lesion. The cancer may be a metastatic lesion.
In assays described herein, wherein the assay is for assessing the presence, absence or development of grade 3 cervical intra-epithelial neoplasia (CIN3) and/or cervical cancer, the cervical cancer may preferably be a squamous cell cancer, an adenocarcinoma or an adenosquamous carcinoma. Any of the assays described herein may additionally, or alternatively, be for assessing the presence, absence or development of endometrial cancer.
In assays described herein, wherein the assay is for assessing the presence, absence or development of endometrial cancer, the endometrial cancer may preferably be endometrioid cancer, uterine carcinosarcoma, squamous cell carcinoma, small cell carcinoma, transitional carcinoma, serous carcinoma, clear-cell carcinoma, mucinous adenocarcinoma, undifferentiated carcinoma, dedifferentiated carcinoma or serous adenocarcinoma.
The invention also encompasses arrays capable of discriminating between methylated and non-methylated forms of CpGs as defined herein; the arrays may comprise oligonucleotide probes specific for methylated forms of CpGs as defined herein and oligonucleotide probes specific for non-methylated forms of CpGs as defined herein.
In any of the arrays described herein, the array may comprise oligonucleotide probes specific for a methylated form of each CpG in a CpG panel and oligonucleotide probes specific for a non-methylated form of each CpG in the panel; wherein the panel consists of at least 500 CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5418.
The panel may consist of at least 500 CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5418, preferably wherein the CpGs comprise the CpGs identified in SEQ ID NOs 1 to 500.
The panel may consist of at least 1000 CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5418, preferably wherein the CpGs comprise the CpGs identified in SEQ ID NOs 1 to 1000.
The panel may consist of at least 1500 CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5418, preferably wherein the CpGs comprise the CpGs identified in SEQ ID NOs 1 to 1500.
The panel may consist of at least 2000 CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5418, preferably wherein the CpGs comprise the CpGs identified in SEQ ID NOs 1 to 2000.
The panel may consist of at least 2500 CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5418, preferably wherein the CpGs comprise the CpGs identified in SEQ ID NOs 1 to 2500.
The panel may consist of at least 3000 CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5418, preferably wherein the CpGs comprise the CpGs identified in SEQ ID NOs 1 to 3000.
The panel may consist of at least 3500 CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5418, preferably wherein the CpGs comprise the CpGs identified in SEQ ID NOs 1 to 3500.
The panel may consist of at least 4000 CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5418, preferably wherein the CpGs comprise the CpGs identified in SEQ ID NOs 1 to 4000.
The panel may consist of at least 4500 CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5418, preferably wherein the CpGs comprise the CpGs identified in SEQ ID NOs 1 to 4500.
The panel may consist of at least 5000 CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5418, preferably wherein the CpGs are the CpGs identified in SEQ ID NOs 1 to 5000.
The panel may consist of all CpGs identified in SEQ ID NOs 1 to 5000 and identified at nucleotide positions 61 to 62, and identified in SEQ ID NOs 5001 to 5418 and denoted by CG.
In some embodiments the array is not an Infinium MethylationEPIC BeadChip array or an Illumina Infinium HumanMethylation450 BeadChip array.
Separately or additionally, in some embodiments the number of CpG-specific oligonucleotide probes of the array is 482,000 or less, 480,000 or less, 450,000 or less, 440,000 or less, 430,000 or less, 420,000 or less, 410,000 or less, 400,000 or less, 375,000 or less, 350,000 or less, 325,000 or less, 300,000 or less, 275,000 or less, 250,000 or less, 225,000 or less, 200,000 or less, 175,000 or less, 150,000 or less, 125,000 or less, 100,000 or less, 75,000 or less, 50,000 or less, 45,000 or less, 40,000 or less, 35,000 or less, 30,000 or less, 25,000 or less, 20,000 or less, 15,000 or less, 10,000 or less, 5,000 or less, 4,000 or less, 3,000 or less or 2,000 or less.
The CpG panel may comprise any set of CpGs defined in the assays of the invention described herein.
The arrays of the invention may comprise one or more oligonucleotides comprising any set of CpGs defined in the assays of the invention, wherein the one or more oligonucleotides are hybridized to corresponding oligonucleotide probes of the array.
The invention also encompasses a process for making a hybridized array described herein, comprising contacting an array according to the present invention with a group of oligonucleotides comprising any set of CpGs defined in the assays of the invention.
Any of the arrays as defined herein may be comprised in a kit. The kit may comprise any array as defined herein together with instructions for use.
The invention further encompasses the use of any of the arrays as defined herein in any of the assays for determining the methylation status of CpGs for the purposes of predicting the presence or development of cancer in an individual.
The following Examples serve to illustrate but not to limit the invention.
In the Examples described herein, WID-CIN-Index is a cancer index value wherein the index value has been determined by assaying in a population of DNA molecules derived from a given sample from an individual the methylation status of a panel of CpGs selected from the CpGs identified in SEQ ID NOs 1 to 5000 and/or within Differentially Methylated Regions defined by SEQ ID NOs 5001 to 5418.
In some instances within the Examples, all CpGs defined by SEQ ID NOs: 1 to 5000 have been included in the panel which has been assayed to obtain a cancer index value. In addition, specific sub-selections of CpGs from among the 500 CpGs defined by SEQ ID NOs: 1 to 500 have been included in the panel which has been assayed to obtain a cancer index value. In these instances, the cancer index value's ability to discriminate between CIN3 and/or cancer positive and CIN3 and/or cancer negative women is described, wherein the discriminatory ability of the index is characterised by AUC and receiver operating characteristics.
Cervical screening is currently transitioning from cytology to HPV-testing. This will lead to increased sensitivity but decreased specificity in mass screening. An objective, automatable test that can accurately triage HPV+ve women independently of sample heterogeneity and age, while also capable of detecting other epithelial uterine cancers is urgently required. The inventors, along with other scientists, have shown the feasibility of utilising DNA methylation markers to identify women with pre-invasive or invasive cancers. The clinical use of DNA methylation markers to identify women at high risk for CIN3+ has been hindered by several factors:
Using a cohort-based nested case-control setting, the inventors aimed to develop and validate a DNA methylation signature (called Women's cancer risk IDentification CIN3 index, WID-CIN3-index) in cervical liquid-based cytology samples. The cancer index value should be capable of both diagnosing and predicting the future risk of CIN3+.
All cervical liquid-based cytology samples processed in the capital region of Stockholm in Sweden are biobanked through a state-of-the-art platform at the Karolinska University Laboratory, Karolinska University Hospital, as previously described (Perskvist et al, 2013). Since the year 2013, virtually 100% of the ˜150 000 LBCs per year have been compacted and stored in a 600 microliter, 96 well plate format at −27° C. This allows for preservation of intact cells, and analyses of DNA, RNA and protein content, among others. The biobank is linked to the Swedish health register infrastructure through the individually unique personal identification number (PIN) (Ludvigsson et al, 2016).
The inventors defined a cohort of all women participating in cervical screening, or clinically indicated testing, during the years 2013-2015 and linked this to the National Cancer Register at the Swedish National Board of Health and Welfare, to identify all cases of CIN3 or invasive cervical cancer (CIN3+) occurring in the sample collection during these years (n=samples from n women). The inventors utilized this cohort to identify a) all women with prevalent CIN3+, b) all women with a normal sample which was succeeded by a later diagnosis of CIN3+ within 1-4 years, c) all women with low-grade lesions of the cervix, so-called CIN1-2 and d) an age- and calendar-year of sample frequency-matched control group of healthy women who had no record of cervical lesions in the National Cancer Register ever. During the years 2013-2015, some groups of the population had been randomized to HPV screening and some to primary cytology. The inventors carefully balanced each sample set to reflect this fact. All samples which did not have HPV results on record were put through high-performance HPV-testing on the cobas48000 assay (ref), which had been the publicly tendered HPV testing platform during this entire period. The inventors further linked all samples to comprehensive, harmonized records of their cytology diagnosis held in the National Cervical Screening Register (NKCx.se). This enabled stratification of all samples by high-risk HPV positivity, and cytology status, respectively.
To maximize DNA content, the inventors, blinded to case-control status, visually screened all eligible vials of biobanked samples to ensure that a visible cell pellet was present. Approximately ⅓ of samples had such a pellet and this was independent of case-control or CIN3/ICC status. The inventors then aliquoted 100 microliter from each sample for UCL to perform methylation analyses.
650 ul of PBS was added to each 100 ul cervical liquid-based cytology sample received from the Karolinska University Laboratory biobank and centrifuged for 15 mins at 4,600 rpm. The supernatant was carefully removed and the pellet was washed with a further 750 ul PBS. The samples were then vortexed and centrifuged again for 15 mins at 4,600 rpm. After careful removal of the second PBS wash, the samples were resuspended in lysis buffer from the Nucleo-Mag Blood 200 ul kit (Macherey Nagel, cat #744501.4), which was used in conjunction with the Hamilton Star liquid handling platform for high throughput DNA extraction. DNA concentration and quality absorbance ratios were measured using Nanodrop-8000, Thermoscientific Inc. Extracted DNA was stored at −80° C. until further analysis.
Cervical DNA was normalised to 10-25 ng/ul and 200-500 ng total DNA was bisulfite modified using the EZ-96 DNA Methylation-Lightning kit (Zymo Research Corp, cat #D5047) on the Hamilton Star Liquid handling platform. 8 ul of modified DNA was subjected to methylation analysis on the Illumina Infinium MethylationEPIC BeadChip (Illumina, CA, USA) at UCL Genomics according to the manufacturer's standard protocol.
All methylation microarray data were processed through the same standardised pipeline. Raw data was loaded using the R package minfi. Any samples with median methylated and unmethylated intensities <9.5 were removed. Any probes with a detection p-value >0.01 were regarded as failed. Any samples with >10% failed probes, and any probes with >10% failure rate were removed from the dataset. Beta values from failed probes (approximately 0.001% of the dataset) were imputed using the impute.knn function as part of the impute R package.
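The detection p-value filtering described above can be sketched as follows. This is an illustration in plain Python rather than the R pipeline actually used; the function name and the probes-by-samples matrix layout are assumptions, the median intensity filter and the imputation step are omitted, and the order of operations (samples removed first, probe failure rates then computed over the remaining samples) is assumed because the text does not state it.

```python
def qc_filter(det_p, p_cut=0.01, max_fail=0.10):
    """Filter a detection p-value matrix laid out as det_p[probe][sample].

    Returns (kept_probe_indices, kept_sample_indices).
    """
    n_probes, n_samples = len(det_p), len(det_p[0])
    # A probe measurement is regarded as failed when its p-value exceeds p_cut.
    failed = [[p > p_cut for p in row] for row in det_p]

    # Drop samples in which more than max_fail of the probes failed.
    keep_samples = [
        s for s in range(n_samples)
        if sum(failed[r][s] for r in range(n_probes)) / n_probes <= max_fail
    ]
    if not keep_samples:
        return [], []

    # Drop probes that failed in more than max_fail of the remaining samples.
    keep_probes = [
        r for r in range(n_probes)
        if sum(failed[r][s] for s in keep_samples) / len(keep_samples) <= max_fail
    ]
    return keep_probes, keep_samples
```

Failed measurements in the retained probes and samples would then be passed to an imputation step, which the description performs with `impute.knn`.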
Non-CpG probes (2,932), SNP-related probes as identified by Zhou et al. (82,108), and chrY probes were removed from the dataset. An additional 6,102 previously identified probes that followed a trimodal methylation pattern characteristic of an underlying SNP were removed. Background intensity correction and dye bias correction were performed using the minfi single sample preprocessNoob function. Probe bias correction was performed using the beta mixture quantile normalisation (BMIQ) algorithm.
The fraction of immune cell contamination, and the relative proportions of different immune cell subtypes in each sample, were estimated using the EpiDISH algorithm using the epithelial, fibroblast and immune cell reference dataset. The top 1,000 most variable probes (ranked by standard deviation) were used in a principal component analysis. Statistical tests were performed in order to identify any anomalous associations between plate, sentrix position, date of array processing, date of DNA creation, study centre, immune contamination fraction, age, type (case versus control) and the top ten principal components. No anomalous associations were found.
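As a small illustration of the probe selection step that precedes the principal component analysis, the sketch below ranks probes by the standard deviation of their beta values across samples and keeps the top k. The function name, the dictionary layout and the probe identifiers in the usage line are hypothetical.

```python
from statistics import pstdev


def top_variable_probes(beta, k=1000):
    """beta maps probe id -> list of beta values across samples.

    Returns the k probe ids with the largest standard deviation.
    """
    ranked = sorted(beta, key=lambda pid: pstdev(beta[pid]), reverse=True)
    return ranked[:k]


# Hypothetical toy input: three probes measured in three samples.
toy = {"cg_a": [0.1, 0.9, 0.5], "cg_b": [0.5, 0.5, 0.5], "cg_c": [0.4, 0.6, 0.5]}
selected = top_variable_probes(toy, k=2)
```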
Two ranked lists of CpGs were generated. The first was ranked according to the epithelial delta-beta estimates (the estimated difference in methylation between cases and controls in cervical epithelial cells). The second was ranked according to p-values (derived from a linear model comparing cases to controls after adjustment for immune cell proportion and age). For each CpG the inventors identified any contiguous CpGs within +/−500 bp. The inventors computed and plotted the mean methylation in cases and controls across all CpGs within this 1000 bp region. Upon visual inspection of the top 50 CpGs (in both ranked lists) the inventors identified a number of candidate regions according to the following criteria:
Contamination by immune cells presented a challenge with respect to the identification of differentially methylated positions (DMPs) as differential methylation that occurred solely in epithelial cells was diminished in samples with high IC and vice versa. In order to overcome this, the inventors linearly regressed the beta values on IC for each CpG site, the linear models being fitted to cases and controls separately. The intercept points at IC=0 were used as estimates of mean beta values in cases and controls in a pure epithelial cell population. The difference between these intercept points provided a delta-beta estimate in epithelial cells. The difference between intercept points at IC=1 provided immune cell delta-beta estimates.
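The intercept-based estimates described above can be sketched for a single CpG site as follows. This is a minimal illustration using ordinary least squares in plain Python; the function names are assumptions, and the fitted value at IC=1 is taken as intercept plus slope.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def delta_betas(ic_cases, beta_cases, ic_controls, beta_controls):
    """Return (epithelial, immune) delta-beta estimates for one CpG.

    Cases and controls are fitted separately, as in the description.
    """
    a_case, b_case = fit_line(ic_cases, beta_cases)
    a_ctrl, b_ctrl = fit_line(ic_controls, beta_controls)
    epithelial = a_case - a_ctrl                      # fitted values at IC = 0
    immune = (a_case + b_case) - (a_ctrl + b_ctrl)    # fitted values at IC = 1
    return epithelial, immune
```

Because both estimates are extrapolations to IC=0 and IC=1, they are most reliable when the samples span a reasonable range of immune cell fractions.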
The R package glmnet was used to train classifiers with a mixing parameter value of alpha=0 (ridge penalty) and alpha=1 (lasso penalty) with binomial response type. Data from the discovery dataset were used to fit the classifiers. A ranked list of CpGs was generated by taking the CpG with the largest epithelial delta-beta, followed by the CpG with the largest immune delta-beta, followed by the next largest epithelial delta-beta and so forth (any duplicates were removed). The top n CpGs from the list of ranked CpGs were used as inputs to the classifier. Ten-fold cross-validation was used inside the training set by the cv.glmnet function in order to determine the optimal value of the regularisation parameter lambda. The AUC was used as a metric of classifier performance. Out-of-bag AUC estimates (based on the cross-validation folds which were not used for training the classifier) were evaluated as a function of n, the number of CpGs used as inputs during training. The maximum value of n was 10,000.
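The alternating ranking of CpGs described above can be sketched as follows. The function name and the CpG identifiers are hypothetical, and ranking by absolute delta-beta is an assumption, since the text says only "largest".

```python
def interleave_ranked(epithelial_delta, immune_delta):
    """Both arguments map CpG id -> delta-beta estimate.

    Returns CpG ids alternating between the epithelial and immune rankings,
    with any CpG that has already been listed skipped.
    """
    by_epithelial = sorted(
        epithelial_delta, key=lambda c: abs(epithelial_delta[c]), reverse=True
    )
    by_immune = sorted(
        immune_delta, key=lambda c: abs(immune_delta[c]), reverse=True
    )
    ranked, seen = [], set()
    for epi_cpg, imm_cpg in zip(by_epithelial, by_immune):
        for cpg in (epi_cpg, imm_cpg):
            if cpg not in seen:
                seen.add(cpg)
                ranked.append(cpg)
    return ranked


# Hypothetical toy input with three CpGs.
epi = {"cg1": 0.30, "cg2": -0.25, "cg3": 0.05}
imm = {"cg1": 0.02, "cg2": 0.10, "cg3": -0.40}
order = interleave_ranked(epi, imm)
```

The first n entries of the returned list would then be the inputs to the classifier for a given n.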
The optimal classifier was selected based on the highest out-of-bag AUC obtained on the discovery set. Once the classifier was finalised it was then applied to the validation datasets.
Denoting the top n CpGs as $\beta_1, \ldots, \beta_n$ and the regression coefficients from the trained classifier as $w_1, \ldots, w_n$, then WID-CIN-index $= \left(\sum_{i=1}^{n} w_i \beta_i - \mu\right)/\sigma$, where $\mu$ and $\sigma$ are defined as the mean and standard deviation of the quantity $\sum_{i=1}^{n} w_i \beta_i$ in the discovery set (that is, the index is scaled to have zero mean and unit standard deviation in the discovery set). The island (open sea) subcomponent was obtained by restricting the above sum to CpGs located in CpG islands (open seas) as defined in the Illumina manifest version B4.
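As a minimal sketch of the scaling described above, the following Python code computes the weighted sum for each sample, derives the mean and standard deviation from a discovery set, and standardises the index. The function names, weights and beta values are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, pstdev

def raw_score(weights, betas):
    """Weighted sum of beta values: sum_i w_i * beta_i."""
    return sum(w * b for w, b in zip(weights, betas))

def fit_scaling(weights, discovery_betas):
    """Mean and standard deviation of the raw score in the discovery set.

    Assumption: population standard deviation; a sample standard deviation
    (statistics.stdev) would be an equally plausible reading of the text.
    """
    scores = [raw_score(weights, sample) for sample in discovery_betas]
    return mean(scores), pstdev(scores)

def index_value(weights, betas, mu, sigma):
    """Standardised index: (sum_i w_i * beta_i - mu) / sigma."""
    return (raw_score(weights, betas) - mu) / sigma

# Toy example with n = 2 CpGs and three discovery samples (illustrative only).
weights = [2.0, -1.0]
discovery = [[0.5, 0.2], [0.7, 0.4], [0.9, 0.6]]

mu, sigma = fit_scaling(weights, discovery)
discovery_index = [index_value(weights, s, mu, sigma) for s in discovery]
print([round(v, 3) for v in discovery_index])
```

By construction the index has zero mean and unit standard deviation in the discovery set; the same `mu` and `sigma` would then be reused unchanged when scoring samples from other datasets.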
For the Discovery Set (
Previously the inventors found that methylation differences may vary due to immune cell type composition in cases compared to controls. Hence, the inventors assessed the level of cell type heterogeneity in each cervical cytology sample using EpiDISH, an algorithm that infers the relative proportion of epithelial cells, fibroblasts, and seven subtypes of immune cells in each sample. The cell-type distributions were broadly similar between CIN3+ cases and controls with an increase in immune cells in CIN2 and CIN3+ cases (
Assessing 850,000 CpG sites after false discovery rate adjustment the inventors found 158,434 CpGs to be significantly differentially methylated between CIN3+ cases and controls (
In order to derive a diagnostic methylation signature to detect CIN3 or invasive cervical cancer, termed the WID-CIN-index, the inventors used ridge and lasso classifiers in the Discovery Set to classify individuals as CIN3+ cases or controls. The area under the receiver operating characteristic curve (AUC) was used as a measure of predictive performance. CpGs were ranked according to the delta-beta between CIN3+ cases and controls.
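For illustration, the AUC can be computed as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The following Python sketch uses this pairwise formulation; the function name `auc` and the scores are hypothetical and do not reproduce the inventors' implementation.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0      # case correctly ranked above control
            elif case == control:
                wins += 0.5      # ties count as one half
    return wins / (len(case_scores) * len(control_scores))

# Toy classifier outputs (illustrative values only).
case_scores = [0.9, 0.8, 0.4]
control_scores = [0.7, 0.3, 0.2]

result = auc(case_scores, control_scores)
print(round(result, 3))
```

In this toy example eight of the nine case/control pairs are ranked correctly, giving an AUC of 8/9. The pairwise loop is quadratic in the number of samples, which is adequate for a sketch but would typically be replaced by a rank-based calculation on larger datasets.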
Predictive performance was estimated based on 10-fold cross-validation within the Discovery Set and evaluated as a function of the number of CpGs used to train the classifier. The optimal Discovery Set AUC of 0.98 was achieved using 5,000 CpGs with ridge regression (
The inventors found that the index was enriched for Open Sea regions and depleted for CpG islands (
In order to assess the diagnostic capacity of the WID-test the inventors analysed an independent dataset of cervical cytology samples consisting of 87 women who were diagnosed with CIN3+ and 111 HPV+ controls (see
The fact that almost all CIN3+ cases would have been correctly classified (irrespective of the age) at a specificity of 50% (
In order to optimise the efficacy of the screening, not only is it important to identify women with current CIN3+ but also those women with the highest risk of developing CIN3+ in the future. In order to address this, the inventors analysed 428 HPV+ve/Cytology-ve women of whom 210 were diagnosed with CIN3+ 1 to 4 years after they provided their sample and 218 remained disease-free within the same period (
Although island CpGs were under-represented (i.e. only 702 of the 5,000 CpGs were island CpGs), these CpGs carried the highest weight in the WID-CIN-index. Hence, the inventors decomposed the WID-CIN-index into a subcomponent based only on the 702 island CpGs and a subcomponent based on the 3,411 open sea CpGs. The island subcomponent provided an extremely strong signal that corresponded to an AUC of 0.87 in the diagnostic validation set (
Both the cervical and endometrial epithelium form part of the Müllerian Duct system. Hence, in order to assess whether a woman with a WID-CIN-index positive result, who on colposcopic assessment has no abnormality on her cervix, has an underlying endometrial cancer, the inventors assessed the WID-CIN-index in cervical cytology samples from 217 women with endometrial cancer and 869 healthy controls. The endometrial cancers had a cell-type composition that was broadly similar to the Discovery Set (
Four sub-groups defined by ranges of cancer index values are specified in Table 9 as corresponding to preferred clinical actions, comprising intensified screening, administration of therapeutics and surgery. The subgroups are based on control samples from the internal validation set. That is, these values of the index split the control samples into four equally sized groups. Odds ratio values are calculated by comparing the number of cases and controls in a given quartile to the first quartile (which is taken as a reference). Odds ratio values are determined for CIN3 risk and endometrial cancer risk. For example, a woman in the fourth quartile is roughly 104 times more likely to have CIN3 than a woman in the first quartile, and approximately 40 times more likely to have endometrial cancer than a woman in the first quartile.
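The quartile-based odds ratios described above can be sketched as follows. This Python example derives the cut-offs from the control samples, counts cases and controls in each quartile, and compares the odds in each quartile with the first quartile. The function names and index values are hypothetical, and the quantile interpolation method (the default of `statistics.quantiles`) is an assumption; the example does not reproduce the values of Table 9.

```python
from statistics import quantiles

def assign_quartile(value, cutoffs):
    """Return 0..3 according to how many cut-offs the value exceeds."""
    return sum(value > c for c in cutoffs)

def quartile_odds_ratios(case_index, control_index):
    """Odds ratios per quartile, relative to the first quartile.

    Cut-offs are the quartiles of the control samples, so that the controls
    are split into four (approximately) equally sized groups.
    Assumes at least one case and one control fall in every quartile.
    """
    cutoffs = quantiles(control_index, n=4)   # three cut points
    cases, controls = [0] * 4, [0] * 4
    for value in case_index:
        cases[assign_quartile(value, cutoffs)] += 1
    for value in control_index:
        controls[assign_quartile(value, cutoffs)] += 1
    reference_odds = cases[0] / controls[0]
    return [(cases[q] / controls[q]) / reference_odds for q in range(4)]

# Toy index values (illustrative only).
control_index = [-1.5, -1.0, -0.5, -0.2, 0.0, 0.2, 0.5, 1.0]
case_index = [-1.2, -0.3, 0.1, 0.3, 0.6, 0.9, 1.4, 2.0]

ratios = quartile_odds_ratios(case_index, control_index)
print(ratios)
```

With these toy values the fourth quartile contains four cases and two controls against one case and two controls in the first quartile, giving an odds ratio of 4 for the fourth quartile. A real analysis would also need to handle quartiles that contain no cases or no controls, which this sketch does not.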
In general, cervical cancer screening is one of the foremost success stories in medicine, and in oncology in particular. Here the inventors have provided extensive evidence that an objective DNA methylation signature, the WID-CIN-test, outperforms cytology as a tool to triage HPV+ve women. The inventors have demonstrated that the WID-CIN-test (i) reduces the number of false positive HPV+ve women by 50%, (ii) does not identify women with CIN1 and identifies only half of women with CIN2 (i.e. likely those who would progress to CIN3+ if not treated), (iii) is able, despite the cytology being negative, to identify HPV+ve women who present with CIN3+ up to four years later, and (iv) is also able to identify the majority of women with endometrial cancer.
Whereas a plethora of DNA methylation markers have been identified and assessed in cervical liquid-based cytology samples and deemed to be promising, only a small number of studies have assessed the clinical validity of these markers in a screening setting. Using DNA methylation levels of a combination of two genes (i.e. MAL and miR-124-2), Verhoef et al. demonstrated in a prospective clinical trial that triaging HPV+ve women with DNA methylation gave a somewhat lower sensitivity (67.5%) than cytology-based triage (74.8%) and required almost twice as many colposcopy referrals. As this study was performed on women aged 33 years or older, the performance of these methylation markers would have been worse in younger women. Although the inventors also observed this age-dependent performance in the WID-test, in young women the sensitivity was 80% at a specificity of 75%.
To date, the considerable heterogeneity of cervical liquid-based cytology samples has been entirely underappreciated (specifically at the level of DNA, which includes DNA from cell debris not visible at the microscopic level when assessing cytology). The proportions of epithelial and immune cells are highly variable, ranging from samples with almost no immune cells to samples which consist almost exclusively of immune cells. The inventors have thoroughly assessed this heterogeneity and concluded that the WID-test performance is independent of sample heterogeneity and hence is likely to perform equally well in self-collected samples.
The inventors' observation that the WID-test is able to identify HPV+ve women who show no abnormal cells in their cervical liquid-based cytology sample but develop CIN3+ up to four years later might suggest that the WID-test is not only reflective of an epigenetic cancer program, but may in fact be reflective of an individual predisposition to progress to a cervical (pre-)cancer upon infection with HPV. In order to test this hypothesis, samples from women prior to HPV infection will need to be analysed in order to assess whether the WID-test would have predicted the disease development even before the presence of the carcinogen.
Here the inventors have demonstrated the unprecedented performance of a DNA methylation classifier, the WID-CIN-test, in identifying HPV+ve women with or at risk of CIN3+. The fact that the test identifies not only women with CIN3+ (equally well when compared with HPV+ve or HPV-ve women) but also women with endometrial cancer strongly suggests that the WID-CIN-test should be rapidly introduced and implemented in the clinical arena.
Number | Date | Country | Kind |
---|---|---|---|
2009225.0 | Jun 2020 | GB | national |
2107421.6 | May 2021 | GB | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/GB2021/051537 | 6/17/2021 | WO | |