The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Jan. 14, 2019, is named 30435_0341WOU1_SL.txt and is 201,768 bytes in size.
The invention relates to methods and materials for examining biological aging in individuals.
One of the major goals of geroscience research is to define ‘biomarkers of aging’1,2, which are individual-level measures of aging that can account for differences in the timing of disease onset, functional decline, and death over the life course. While chronological age is arguably the strongest risk factor for aging-related death and disease, it is important to distinguish chronological time from biological aging. Individuals of the same chronological age may exhibit greatly different susceptibilities to age-related diseases and death, which is likely reflective of differences in their underlying biological aging processes. Such biomarkers of aging will be crucial to enable instantaneous evaluation of interventions aimed at slowing the aging process, by providing a measurable outcome other than incidence of death and/or disease, which require extremely long follow-up observation.
One potential biomarker that has gained significant interest in recent years is DNA methylation (DNAm), given that chronological time has been shown to elicit predictable hypo- and hyper-methylation changes at many regions across the genome 3-7. As a result, the first generation of DNAm based biomarkers of aging were developed to predict chronological age8-10. The blood-based algorithm by Hannum9 and the multi-tissue algorithm by Horvath10 produced age estimates (DNAm age) that correlate with chronological age well above r=0.90 for full age range samples. Nevertheless, while the current epigenetic age estimators exhibit statistically significant associations with many age-related diseases and conditions11-17, the effect sizes are typically small to moderate. Further, using chronological age as the reference, by definition, may exclude CpGs whose methylation patterns don't display strong time-dependent changes, but instead signal the departure of biological age from chronological age.
Previous work by us and others has shown that “phenotypic aging measures”, derived from clinical biomarkers18-22, strongly predict differences in the risk of all-cause mortality, cause-specific mortality, physical functioning, cognitive performance measures, and facial aging among same-aged individuals. What's more, in representative population data, some of these measures have been shown to be better indicators of remaining life expectancy than chronological age18, suggesting that they are approximating individual-level differences in biological aging rates.
Accordingly, there is a need for improved methods of observing phenotypic aging, which is predictive of an earlier age of death (all-cause mortality) that is independent of chronological age and traditional risk factors of mortality.
This invention provides methods and materials useful to examine one or more clinical variables and DNA methylation biomarkers. As discussed in detail below, typically these biomarkers are based on variables that lend themselves to predicting life expectancy and risk for age-related diseases. For example, a first biomarker, referred to as “phenotypic age estimator”, is based on clinical variables such as measurements of factors such as Albumin, Creatinine, Glucose, C-reactive Protein, Lymphocyte Percentage, Mean Cell Volume, Red Blood Cell Distribution Width, Alkaline Phosphatase, White Blood Cell Count, and age at the time of assessment. A second biomarker, referred to as “DNA methylation PhenoAge”, is based on DNA methylation measurements at 513 locations across the human DNA molecule. As discussed below, by examining such biomarkers in an individual, it is possible to obtain information that is highly predictive of multiple morbidity and mortality outcomes in that individual.
The idea of using DNA methylation (DNAm) to estimate biological age has recently gained interest following the discovery that many CpGs throughout the genome display hyper- or hypo-methylation patterns as a function of chronological age. While most of the first-generation epigenetic biomarkers of aging capitalized on these age associations to identify CpGs from which to build composite scores, we hypothesized that a more powerful epigenetic biomarker of aging could be generated from DNA methylation data by replacing chronological age with a surrogate measure of “phenotypic aging” that, in and of itself, differentiates morbidity and mortality risk among same-age individuals. Using multiple large epidemiological studies, we demonstrate that our new epigenetic biomarker that examines the above-noted combination of factors, DNAm PhenoAge, is highly predictive of multiple morbidity and mortality outcomes, including but not limited to life expectancy, heart disease, cancer, and age related dementia. Further, it produces reliable age estimates and risk predictions when measured in various tissues. This shows that our single DNAm based biomarker (DNAm PhenoAge) is capable of capturing risk for an array of diverse diseases and conditions across multiple tissues and cells. As such, DNAm PhenoAge will be useful for assessing personalized risk, improving our understanding of the biological aging process, and evaluating promising interventions aimed at slowing aging and preventing disease.
The invention disclosed herein has a number of embodiments. Embodiments of the invention include a method of obtaining information on a phenotypic age of an individual, the method comprising observing methylation of genomic DNA obtained from the individual, wherein methylation is observed in at least 10 CpG methylation markers in polynucleotides having SEQ ID NO: 1-SEQ ID NO: 513 so that information on the phenotypic age of the individual is obtained. Typically in these methods, observing methylation of genomic DNA comprises hybridizing genomic DNA from the individual to a methylation array comprising the polynucleotides having sequences of SEQ ID NO: 1-SEQ ID NO: 513 coupled to a matrix; and/or comprises performing a bisulfite conversion process on the genomic DNA so that cytosine residues in the genomic DNA are transformed to uracil, while 5-methylcytosine residues in the genomic DNA are not transformed to uracil. In such embodiments, the method can comprise observing a clinical variable in the individual comprising at least one of: concentrations of albumin in the individual, concentrations of creatinine in the individual, concentrations of glucose in the individual, concentrations of c-reactive protein in the individual, concentrations of alkaline phosphatase in the individual, lymphocyte percentage in the individual, mean cell volume in the individual, red blood cell distribution width in the individual, white blood cell count in the individual, and age of the individual at the time of assessment. In certain embodiments of the invention, at least 3, 4, 5, 6, 7 or 8 clinical variables are observed.
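The bisulfite conversion process described above can be illustrated in silico. The following is a minimal sketch only, not a laboratory protocol or the method of the invention: the function name bisulfite_convert and the convention of passing 0-based indices of methylated cytosines are assumptions introduced for illustration. It shows the one property the text relies on, namely that unmethylated cytosines are read out as uracil while 5-methylcytosines remain cytosine.

```python
def bisulfite_convert(sequence, methylated_positions=frozenset()):
    """Simulate bisulfite conversion of a single DNA strand.

    sequence: string over A, C, G, T (case-insensitive).
    methylated_positions: 0-based indices of cytosines assumed to be
        5-methylcytosine (a hypothetical input convention).
    Unmethylated C becomes U; methylated C and all other bases are unchanged.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("U")  # unmethylated cytosine is converted to uracil
        else:
            converted.append(base)  # 5-methylcytosine and other bases are kept
    return "".join(converted)


# The C at index 1 is marked methylated and survives; the C at index 4 is converted.
print(bisulfite_convert("ACGTCG", methylated_positions={1}))
```

Comparing the converted read against the reference sequence at each CpG position is what allows the methylation status of that position to be inferred.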
Embodiments of the invention can include additional steps such as comparing the chronological age of the individual at the time of assessment and the phenotypic age so as to obtain information on life expectancy of the individual. Embodiments of the invention include using information on the phenotypic age obtained by the method to predict an age at which the individual may suffer from one or more age related diseases or conditions. Embodiments of the invention include those that compare the CG locus methylation profile observed in the individual to the CG locus methylation profile of genomic DNA having SEQ ID NO: 1-SEQ ID NO: 513 present in white blood cells or epithelial cells derived from a group of individuals of known ages; and then correlating the CG locus methylation observed in the individual with the CG locus methylation and known ages in the group of individuals. In typical embodiments of the invention, methylation is observed by a process comprising hybridizing genomic DNA obtained from the individual with at least 100, 200, 300, 400 or 500 polynucleotides comprising SEQ ID NO: 1-SEQ ID NO: 513 disposed in an array. In embodiments of the invention, the phenotypic age of the individual can be estimated using a weighted average of methylation markers within the set of 513 methylation markers. Optionally, methylation marker data is further analyzed, for example by a regression analysis. Optionally in these methods, methylation is observed in genomic DNA obtained from leukocytes or epithelial cells obtained from the individual.
A specific embodiment of the invention is a method of observing a phenotypic age of an individual, the method comprising observing methylation of genomic DNA obtained from the individual, wherein methylation is observed in 513 CpG methylation markers in polynucleotides having SEQ ID NO: 1-SEQ ID NO: 513; and the method comprises hybridizing genomic DNA from the individual to a methylation array comprising the polynucleotides of SEQ ID NO: 1-SEQ ID NO: 513 coupled to a matrix, so that the phenotypic age of the individual is observed.
In certain embodiments of the invention, methods include observing a clinical variable in the individual comprising at least one of: concentrations of albumin in the individual, concentrations of creatinine in the individual, concentrations of glucose in the individual, concentrations of c-reactive protein in the individual, concentrations of alkaline phosphatase in the individual, lymphocyte percentage in the individual, mean cell volume in the individual, red blood cell distribution width in the individual, white blood cell count in the individual, and age of the individual at the time of assessment. In some embodiments of the invention, the method further comprises observing at least one factor selected from individual diet history, individual smoking history and individual exercise history. Optionally, the observed phenotypic age is then used to assess a risk of a cancer mortality in the individual (e.g. to assess a risk of breast cancer, lung cancer or the like, or to assess a risk of dementia or diabetes mortality in the individual).
A related embodiment of the invention is a tangible computer-readable medium comprising computer-readable code that, when executed by a computer, causes the computer to perform operations including: receiving information corresponding to methylation levels of a set of methylation markers in a biological sample, wherein the set of methylation markers comprises 513 methylation markers that are identified in Table 5; determining an epigenetic age by applying a statistical prediction algorithm to methylation data obtained from the set of methylation markers; and then determining an epigenetic age using a weighted average of the methylation levels of the 513 methylation markers. Optionally in this embodiment, the tangible computer-readable medium comprising computer-readable code, when executed by a computer, further causes the computer to perform operations including: receiving information corresponding to methylation levels of a set of clinical variables in a biological sample, information that is then used for determining an epigenetic age.
Both phenotypic age, and in particular DNAm PhenoAge, are useful biomarkers for human anti-aging studies given that these are highly robust, blood based biomarkers that capture organismal age and the functional state of many organ systems and tissues, thus allowing efficacy of interventions to be evaluated based on real-time measures of aging, rather than relying on long-term outcomes, such as morbidity and mortality. Finally, this measure may be another component of the personalized medicine paradigm, as it allows for evaluation of risk based on an individual's personalized DNAm profile.
Other objects, features and advantages of the present invention will become apparent to those skilled in the art from the following detailed description. It is to be understood, however, that the detailed description and specific examples, while indicating some embodiments of the present invention, are given by way of illustration and not limitation. Many changes and modifications within the scope of the present invention may be made without departing from the spirit thereof, and the invention includes all such modifications.
In the description of embodiments, reference may be made to the accompanying figures which form a part hereof, and in which is shown by way of illustration a specific embodiment in which the invention may be practiced. It is to be understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the present invention. Many of the techniques and procedures described or referenced herein are well understood and commonly employed by those skilled in the art. Unless otherwise defined, all terms of art, notations and other scientific terms or terminology used herein are intended to have the meanings commonly understood by those of skill in the art to which this invention pertains. In some cases, terms with commonly understood meanings are defined herein for clarity and/or for ready reference, and the inclusion of such definitions herein should not necessarily be construed to represent a substantial difference over what is generally understood in the art.
All publications mentioned herein are incorporated herein by reference to disclose and describe aspects, methods and/or materials in connection with the cited publications. For example, Levine et al., Aging, 2018 Apr. 18; 10(4):573-591; U.S. Patent Publication 20150259742, U.S. patent application Ser. No. 15/025,185, titled “METHOD TO ESTIMATE THE AGE OF TISSUES AND CELL TYPES BASED ON EPIGENETIC MARKERS”, filed by Stefan Horvath; U.S. patent application Ser. No. 14/119,145, titled “METHOD TO ESTIMATE AGE OF INDIVIDUAL BASED ON EPIGENETIC MARKERS IN BIOLOGICAL SAMPLE”, filed by Eric Villain et al.; and Hannum et al. “Genome-Wide Methylation Profiles Reveal Quantitative Views Of Human Aging Rates.” Molecular Cell. 2013; 49(2):359-367 and patent US2015/0259742, are incorporated by reference in their entirety herein.
DNA methylation refers to chemical modifications of the DNA molecule. Technological platforms such as the Illumina Infinium microarray or DNA sequencing based methods have been found to lead to highly robust and reproducible measurements of the DNA methylation levels of a person. There are more than 28 million CpG loci in the human genome. Consequently, certain loci are given unique identifiers such as those found in the Illumina CpG loci database (see, e.g. Technical Note: Epigenetics, CpG Loci Identification ILLUMINA Inc. 2010). These CG locus designation identifiers are used herein. In this context, one embodiment of the invention is a method of obtaining information useful to observe biomarkers associated with a phenotypic age of an individual by observing the methylation status of one or more of the 513 methylation marker specific CG loci that are identified in Table 5.
The term “epigenetic” as used herein means relating to, being, or involving a chemical modification of the DNA molecule. Epigenetic factors include the addition or removal of a methyl group which results in changes of the DNA methylation levels. Novel molecular biomarkers of aging that observe methylation patterns in genomic DNA, such as those termed “DNA methylation PhenoAge” or “phenotypic age”, allow one to prognosticate mortality and are of interest to gerontologists (aging researchers), epidemiologists, medical professionals, and medical underwriters for life insurance. Exclusively clinical biomarkers such as lipid levels, body mass index, and blood pressure have a long and successful history in the life insurance industry. By contrast, molecular biomarkers of aging have rarely been used.
The profitability of a life insurance product directly depends on the accurate assessment of mortality risk because the costs of life insurance (to the insurance company) are directly proportional to the number of deaths in a given category. Thus, any improvement in assessing mortality risk and in improving the basic classification will directly translate into cost savings. For the reasons noted above, DNA methylation (DNAm) based biomarkers of aging are useful for predicting mortality. Consequently, they are useful to the life insurance industry due to their ability to increase the accuracy of medical underwriting. DNAm measurements can provide a host of complementary information that can inform the medical underwriting process. In this context, the DNAm based biomarkers and associated method disclosed herein can be used both to molecularly estimate complete blood counts and to estimate biological age, as well as to directly predict/prognosticate mortality. Using embodiments of the invention disclosed herein, upon completing a medical exam, an insurer can, for example, look at a combination of the clinical biomarker and DNA methylation test results as well as other factors such as family health history and lifestyle choices to classify the applicant into useful classification categories such as: 1) preferred plus/super preferred/preferred select/preferred elite, 2) preferred, 3) standard plus, 4) standard, 5) preferred smoker, 6) standard smoker, 7) table rate A, 8) table rate B, etc. Each of these categories has a distinct mortality risk and usually directly relates to the pricing of the insurance product. The basic classification is largely determined by well established risk factors of mortality such as sex, smoking status, family history of death, prior history of disease (e.g. diabetes status, cancer), and a host of clinical biomarkers (blood pressure, body mass index, cholesterol, glucose levels, hemoglobin A1C).
The term “nucleic acids” as used herein may include any polymer or oligomer of pyrimidine and purine bases, preferably cytosine, thymine, and uracil, and adenine and guanine, respectively. The present invention contemplates any deoxyribonucleotide, ribonucleotide or peptide nucleic acid component, and any chemical variants thereof, such as methylated, hydroxymethylated or glucosylated forms of these bases, and the like. The polymers or oligomers may be heterogeneous or homogeneous in composition, and may be isolated from naturally-occurring sources or may be artificially or synthetically produced. In addition, the nucleic acids may be DNA or RNA, or a mixture thereof, and may exist permanently or transitionally in single-stranded or double-stranded form, including homoduplex, heteroduplex, and hybrid states.
The term “methylation marker” as used herein refers to a CpG position that is potentially methylated. Methylation typically occurs in a CpG containing nucleic acid. The CpG containing nucleic acid may be present in, e.g., a CpG island, a CpG doublet, a promoter, an intron, or an exon of a gene. For instance, in the genetic regions provided herein the potential methylation sites encompass the promoter/enhancer regions of the indicated genes. Thus, the regions can begin upstream of a gene promoter and extend downstream into the transcribed region.
The phrase “selectively measuring” as used herein refers to methods wherein only a finite number of methylation markers or genes (comprising methylation markers) are measured rather than assaying essentially all potential methylation markers (or genes) in a genome. For example, in some aspects, “selectively measuring” methylation markers or genes comprising such markers can refer to measuring more than (or not more than) 500, 200, 100, 75, 50, 25, 10 or 5 different methylation markers or genes comprising methylation markers.
The invention described herein provides novel and powerful predictors of life expectancy, mortality, and morbidity based on DNA methylation levels. In this context, it is critical to distinguish clinical from molecular biomarkers of aging. Clinical biomarkers such as lipid levels, blood pressure, blood cell counts have a long and successful history in clinical practice. By contrast, molecular biomarkers of aging are rarely used. However, this is likely to change due to recent breakthroughs in DNA methylation based biomarkers of aging. Since their inception, DNA methylation (DNAm) based biomarkers of aging promise to greatly enhance biomedical research, clinical applications, patient care, and even medical underwriting when it comes to life insurance policies and other financial products. They will also be more useful for clinical trials and intervention assessment that target aging, since they are more proximal to the biological changes that characterize the aging process compared to upstream clinical read outs of health and disease status.
The disclosure presented herein surrounding the prediction of mortality and morbidity shows that these combinations of clinical and DNAm based biomarkers are highly robust and informative for a range of applications. DNAm PhenoAge not only can be used to directly predict/prognosticate mortality but also relates to a host of age related conditions such as heart disease risk, cancer risk, dementia status, cardiovascular disease and various measures of frailty.
The invention disclosed herein has a number of embodiments. One embodiment of the invention is a method of observing biomarkers that are associated with a phenotypic age of an individual. In such embodiments, the method comprises observing a biomarker comprising the state of a clinical variable in the individual comprising at least one of: concentrations of albumin in the individual, concentrations of creatinine in the individual, concentrations of glucose in the individual, concentrations of c-reactive protein in the individual, concentrations of alkaline phosphatase in the individual, lymphocyte percentage in the individual, mean cell volume in the individual, red blood cell distribution width in the individual, white blood cell count in the individual, and age of the individual at the time of assessment; and, in addition, further observing another biomarker comprising the individual's methylation status at at least 10 of the 513 CpG methylation markers that are identified in Table 5 such that biomarkers associated with the phenotypic age of the individual are observed. In some embodiments, methylation is observed by a process comprising hybridizing genomic DNA obtained from the individual with 513 complementary sequences disposed in an array on a substrate. Optionally, methylation is observed by a process comprising treatment of genomic DNA from the population of cells from the individual with bisulfite to transform unmethylated cytosines of CpG dinucleotides in the genomic DNA to uracil.
In typical embodiments of the invention, at least 3, 4, 5, 6, 7 or 8 clinical variables are observed. In some embodiments of the invention, the second DNA methylation biomarker is observed in a population of leukocytes or epithelial cells obtained from the individual. Optionally the method comprises assessing one or more of the biomarkers in a regression analysis. In certain embodiments, the phenotypic age of the individual is estimated using a weighted average of methylation markers within the set of 513 methylation markers. Embodiments of the invention can further comprise examining at least one factor selected from the diet of the individual, whether the individual smokes and the levels that the individual exercises. Embodiments of the invention can compare the age of the individual at the time of assessment and the phenotypic age so as to obtain information on life expectancy of the individual. In certain embodiments of the invention, the method includes using the phenotypic age to predict the age at which the individual may suffer from one or more age related diseases or conditions. Further embodiments and aspects of the invention are discussed below.
Previous work has shown that “phenotypic aging measures”, derived from clinical biomarkers (see, e.g. Levine M E., The Journals of Gerontology Series A: Biological Sciences and Medical Sciences. 2013; 68(6):667-674; Li S et al., Twin Res Hum Genet. 2015; 18(6):720-726; Sebastiani et al., Aging Cell. 2017; and Ferrucci L et al., Public Health Reviews. 2010; 32(2):475-488), strongly predict differences in the risk of all-cause mortality, cause-specific mortality, physical functioning, cognitive performance measures, and facial aging among same-aged individuals. What's more, in representative population data, some of these measures have been shown to be better indicators of remaining life expectancy than chronological age (Levine M E., The Journals of Gerontology Series A: Biological Sciences and Medical Sciences. 2013; 68(6):667-674), suggesting that they are approximating individual-level differences in biological aging rates. We developed a new phenotypic age predictor based on 10 variables total (9 clinical biomarkers and chronological age at the time of the assessment). These variables were selected out of a possible 42 biomarkers, using an elastic net proportional hazards model, and are aggregated into a composite score by forming a weighted average:
WeightedAverage = −Albumin*0.0336 + log(Creatinine)*0.0095 + Glucose*0.1953 + CReactiveProtein*0.0954 − LymphocytePerc*0.0120 + MeanCellVolume*0.0268 + RedBloodCellDistributionWidth*0.3306 + AlkalinePhosphatase*0.0019 + WhiteBloodCellCount*0.0554 + age*0.0804 − 19.9067.
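The weighted average above can be sketched in code as follows. This is an illustrative transcription of the stated coefficients only. The function and argument names are hypothetical, log is assumed to be the natural logarithm, and the measurement units of each clinical variable are not stated in this passage, so callers must supply values in whatever units the coefficients were fitted to. The subsequent transformation to a phenotypic age in years is not specified here and is therefore not implemented.

```python
import math

# Coefficients transcribed from the weighted-average formula in the text.
COEFFICIENTS = {
    "albumin": -0.0336,
    "log_creatinine": 0.0095,
    "glucose": 0.1953,
    "c_reactive_protein": 0.0954,
    "lymphocyte_percent": -0.0120,
    "mean_cell_volume": 0.0268,
    "red_cell_distribution_width": 0.3306,
    "alkaline_phosphatase": 0.0019,
    "white_blood_cell_count": 0.0554,
    "age": 0.0804,
}
INTERCEPT = -19.9067


def weighted_average(albumin, creatinine, glucose, c_reactive_protein,
                     lymphocyte_percent, mean_cell_volume,
                     red_cell_distribution_width, alkaline_phosphatase,
                     white_blood_cell_count, age):
    """Return the weighted average of the nine clinical variables and age."""
    if creatinine <= 0:
        raise ValueError("creatinine must be positive to take its logarithm")
    terms = {
        "albumin": albumin,
        # Assumption: log() in the formula denotes the natural logarithm.
        "log_creatinine": math.log(creatinine),
        "glucose": glucose,
        "c_reactive_protein": c_reactive_protein,
        "lymphocyte_percent": lymphocyte_percent,
        "mean_cell_volume": mean_cell_volume,
        "red_cell_distribution_width": red_cell_distribution_width,
        "alkaline_phosphatase": alkaline_phosphatase,
        "white_blood_cell_count": white_blood_cell_count,
        "age": age,
    }
    return INTERCEPT + sum(COEFFICIENTS[name] * value
                           for name, value in terms.items())
```

Keeping the coefficients in a single table makes it straightforward to audit them against the formula term by term.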
Next the weighted average is transformed using a monotonically increasing function to arrive at a phenotypic age estimate (in units of years). Validation data for phenotypic age came from the fourth National Health and Nutrition Examination Survey (NHANES IV), and included up to 17 years of mortality follow-up for n=6,209 nationally representative US adults. Mortality results show that a one year increase in phenotypic age is associated with a 9% increase in the hazard of all-cause mortality (hazard ratio, HR=1.09, p-value=3.8E-49), a 9% increase in the risk of aging-related mortality (HR=1.09, p=4.5E-34), a 10% increase in the risk of CVD mortality (HR=1.10, p=5.1E-17), a 7% increase in the risk of cancer mortality (HR=1.07, p=7.9E-10), a 20% increase in the risk of diabetes mortality (HR=1.20, p=1.9E-11), and a 9% increase in the risk of lung disease mortality (HR=1.09, p=6.3E-4). Finally, in the proportional hazards model, phenotypic age completely accounted for the effect of chronological age, such that chronological age no longer exhibited a significant positive association with mortality.
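To make the per-year hazard ratios above concrete, the following sketch shows how a per-year hazard ratio would compound over a multi-year difference in phenotypic age. It assumes, as is standard for a proportional hazards model with a linear age term but is not stated explicitly in the text, that the effect is multiplicative per year. The function name is hypothetical.

```python
def hazard_ratio_for_gap(hr_per_year, years):
    """Hazard ratio implied by a difference of `years` in phenotypic age.

    Assumes a log-linear (proportional hazards) effect, so the per-year
    hazard ratio compounds multiplicatively: HR_total = HR_per_year ** years.
    """
    if hr_per_year <= 0:
        raise ValueError("a hazard ratio must be positive")
    return hr_per_year ** years


# Under this assumption, a 5-year difference at HR=1.09 per year
# corresponds to a hazard ratio of roughly 1.54.
print(round(hazard_ratio_for_gap(1.09, 5), 2))
```

This is an interpretation aid only; it extrapolates the reported per-year estimate and does not reproduce any analysis from the validation data.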
Finally, we tested the association between phenotypic age and 1) the number of coexisting morbidities a participant had been diagnosed with, and 2) levels of physical functioning problems. Results showed that after adjusting for chronological age, persons with more coexisting morbidities also display higher phenotypic ages on average (p=3.9E-21). Similarly, those with worse physical functioning tended to have higher phenotypic ages (p=2.1E-10).
Data from the Invecchiare in Chianti (InCHIANTI) study was used to relate blood DNAm levels to phenotypic age. Elastic net regression produced a model in which phenotypic age is predicted by DNAm levels at 513 CpGs. The linear combination of the weighted 513 CpGs yields a DNAm based estimator of phenotypic age, that we refer to as ‘DNAm PhenoAge’.
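The linear combination described above can be sketched as follows. The actual 513 CpG identifiers and their weights are those of Table 5 and are not reproduced here; the identifiers and numbers in the usage example are hypothetical placeholders, and the presence of an intercept term is an assumption (elastic net models commonly include one).

```python
def dnam_phenoage(methylation_levels, weights, intercept=0.0):
    """Weighted linear combination of CpG methylation levels.

    methylation_levels: dict mapping CpG identifier -> methylation level.
    weights: dict mapping CpG identifier -> regression weight
        (in practice, the 513 markers and weights of Table 5).
    intercept: model intercept (assumed; not given in this passage).
    Extra CpGs in methylation_levels are ignored; missing ones are an error.
    """
    missing = [cpg for cpg in weights if cpg not in methylation_levels]
    if missing:
        raise KeyError(f"missing methylation levels for: {missing}")
    return intercept + sum(weight * methylation_levels[cpg]
                           for cpg, weight in weights.items())


# Toy example with made-up identifiers and weights (NOT values from Table 5).
toy_weights = {"cg_example_a": 2.0, "cg_example_b": -1.0}
toy_levels = {"cg_example_a": 0.5, "cg_example_b": 0.25, "cg_unused": 0.9}
print(dnam_phenoage(toy_levels, toy_weights, intercept=10.0))
```

Raising an error on missing markers, rather than silently treating them as zero, avoids producing an estimate from an incomplete marker set.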
To demonstrate the utility of DNAm PhenoAge, we used four independent large-scale samples: two samples from the Women's Health Initiative (WHI) (n=2,016 and n=2,191), the Framingham Heart Study (FHS) (n=2,553), and the Normative Aging Study (n=657). In these studies, DNAm PhenoAge correlated with chronological age at r=0.67 in WHI (Sample 1), r=0.69 in WHI (Sample 2), r=0.78 in FHS, and r=0.62 in the Normative Aging Study. The four validation samples were then used to assess the effects of DNAm PhenoAge on mortality. DNAm PhenoAge was significantly associated with subsequent mortality risk in all studies (independent of chronological age), such that a one year increase in DNAm PhenoAge is associated with a 4% increase in the risk of all-cause mortality (Meta(FE)=1.042, Meta p=1.1E-36). We also observe strong associations between DNAm PhenoAge and a variety of other aging outcomes. For instance, independent of chronological age, higher DNAm PhenoAge is associated with an increase in a person's number of coexisting morbidities (Meta P-value=4.56E-15), a decrease in likelihood of being disease-free (Meta P-value=1.06E-7), an increase in physical functioning problems (Meta P-value=2.05E-13), an increase in the risk of coronary heart disease (CHD) (Meta P-value=2.43E-10), and an earlier age at menopause (Meta P-value=8.22E-4), suggesting that women were epigenetically older if they had entered menopause earlier.
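One common way to express an epigenetic age measure “independent of chronological age” is as an age acceleration residual, i.e. the part of DNAm PhenoAge not explained by a linear regression on chronological age. The text does not state that this specific construction was used in the analyses above, so the following is an illustrative sketch under that assumption, with a hypothetical function name and made-up input values.

```python
def age_acceleration(chronological_ages, dnam_ages):
    """Residuals of DNAm age regressed on chronological age (ordinary least squares).

    A positive residual means the individual appears epigenetically older
    than expected for their chronological age within this sample.
    """
    n = len(chronological_ages)
    if n != len(dnam_ages) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 entries")
    mean_x = sum(chronological_ages) / n
    mean_y = sum(dnam_ages) / n
    sxx = sum((x - mean_x) ** 2 for x in chronological_ages)
    if sxx == 0:
        raise ValueError("chronological ages must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(chronological_ages, dnam_ages))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed DNAm age minus the value predicted from chronological age.
    return [y - (intercept + slope * x)
            for x, y in zip(chronological_ages, dnam_ages)]


# Made-up values for three individuals, for illustration only.
print(age_acceleration([40, 50, 60], [42, 50, 64]))
```

By construction the residuals are uncorrelated with chronological age in the sample, which is why such a quantity can be related to outcomes without chronological age acting as a confounder.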
Additional replication data was used to test for associations with other aging outcomes. For instance, we find that among the 527 women who were cancer free at age 50, accelerated DNAm PhenoAge predicts incident breast cancer (p=0.033, OR: 1.037). We also find a marginally significant reduction of approximately 2.4 years for the DNAm PhenoAge of semi-super centenarian offspring, relative to controls (p=0.065). Using blood methylation data, we evaluated whether DNAm PhenoAge relates to clinically diagnosed dementia in living individuals. Results suggest that those with presumed Alzheimer's disease (AD, n=154) and/or frontotemporal dementia (FTD, n=116) have significantly higher DNAm PhenoAge compared to non-demented (n=334) individuals (P=2.2E-2), and the strength of the association is further increased (P=9.4E-3) when limiting the sample to those ages 75 and older. We also find that DNAm PhenoAge relates to Down syndrome in two separate blood methylation datasets (p=0.0046 and p=4.0E-11), and similarly relates to HIV infection in two blood datasets (p=6E-6 and p=8.6E-6). We observe a suggestive relationship between DNAm PhenoAge in blood and Parkinson's disease status (p=0.028) for individuals of European ancestry.
We examined the association between DNAm PhenoAge and smoking and found that DNAm PhenoAge also significantly differs by smoking status (p=0.0033). Next, we re-evaluated the morbidity and mortality associations (fully-adjusted) in our four samples, stratifying by smoking status (smokers vs. non-smokers). We find that DNAm PhenoAge is associated with mortality both among smokers (adjusted for pack-years) (Meta(FE)=1.041, Meta p=2.6E-14), and among persons who have never smoked (Meta(FE)=1.027, Meta p=7.9E-7). Moreover, among never smokers, DNAm PhenoAge relates to the number of coexisting morbidities (Meta P-value=7.83E-6), physical functioning status (Meta P-value=2.63E-3), disease free status (Meta P-value=4.38E-4), and CHD (Meta P-value=1.80E-4), while among current smokers, it relates to the number of coexisting morbidities (Meta P-value=4.61E-5), physical functioning status (Meta P-value=1.01E-4), and disease free status (Meta P-value=0.0048), but only exhibits a suggestive association with CHD (Meta P-value=0.084).
We studied whether DNAm PhenoAge of blood predicts lung cancer risk in the first WHI sample. After adjusting for chronological age, race/ethnicity, pack-years, and smoking status, results showed that a one year increase in DNAm PhenoAge is associated with a 5% increase in lung cancer risk (HR=1.05, p=0.031), and when restricting the model to current smokers only, we find that the effect of DNAm PhenoAge on lung cancer mortality is even stronger (HR=1.10, p=0.014).
We also find evidence of social gradients in DNAm PhenoAge, such that those with higher education (p=6E-9) and higher income (p=9E-5) appear younger. DNAm PhenoAge relates to exercise and dietary habits, such that increased exercise (p=7E-5) and markers of fruit/vegetable consumption (such as carotenoids, p=5E-22) are associated with lower DNAm PhenoAge.
We also evaluated DNAm PhenoAge in non-blood tissues. Although DNAm PhenoAge was developed from DNAm levels assessed in whole blood, our empirical results show that it strongly correlates with chronological age in a host of different tissues. For instance, when examining all tissues concurrently, the correlation between DNAm PhenoAge and chronological age was 0.71. Age correlations in brain tissue ranged from 0.54 to 0.92. Consistent age correlations were also found in breast (r=0.47), buccal cells (r=0.88), dermal fibroblasts (r=0.87), epidermis (r=0.84), colon (r=0.88), heart (r=0.66), kidney (r=0.64), liver (r=0.80), lung (r=0.55), and saliva (r=0.81).
DNA methylation (DNAm) data have given rise to highly accurate age estimation methods known as “epigenetic clocks”. These recently developed DNA methylation-based biomarkers allow one to estimate the epigenetic age of an individual (see, e.g. Levine M E., The Journals of Gerontology Series A: Biological Sciences and Medical Sciences. 2013; 68(6):667-674; Li S et al., Twin Res Hum Genet. 2015; 18(6):720-726; Sebastiani et al., Aging Cell. 2017; and Ferrucci L et al., Public Health Reviews. 2010; 32(2):475-488). For example, the “epigenetic clock”, developed by Horvath, which is based on methylation levels of 353 CpGs, can be used to estimate the age of most human cell types, tissues, and organs (Sebastiani et al., Aging Cell. 2017). The first generation of DNAm based biomarkers of aging were developed using chronological age as a surrogate measure for biological age. While the current epigenetic age estimators exhibit statistically significant associations with many age-related diseases and conditions, the effect sizes are typically small to moderate. While chronological age is arguably the strongest risk factor for aging-related death and disease, it is important to distinguish chronological time from biological aging. Individuals of the same chronological age may exhibit greatly different susceptibilities to age-related diseases and death, which is likely reflective of differences in their underlying biological aging processes (Ferrucci L et al., Public Health Reviews. 2010; 32(2):475-488). Using chronological age as the reference in developing epigenetic biomarkers of aging, by definition, may exclude CpGs whose methylation patterns don't display strong time-dependent changes, but instead signal the departure of biological age from chronological age.
Thus, we hypothesized that a more powerful epigenetic biomarker of aging could be generated from DNAm by replacing chronological age with a surrogate measure of “phenotypic age” that, in and of itself, differentiates morbidity and mortality risk among same-age individuals.
Using a novel two-step method, we were successful in developing a DNAm based biomarker of aging that is highly predictive of nearly every morbidity and mortality outcome we tested. Our study demonstrates that DNAm PhenoAge greatly outperforms the first generation of DNAm based biomarkers of aging from Hannum (Hannum et al., Mol Cell. 2013; 49) and Horvath (Horvath S., Genome Biol. 2013; 14(R115)), in terms of both its predictive accuracy for time to death and its associations with various other aging measures, including disease incidence/prevalence and physical functioning. Most surprisingly, DNAm PhenoAge is associated with age-related conditions in samples other than whole blood, for instance obesity in liver.
Our applications demonstrate that the combination of advanced machine learning methods, relevant functional genomic data (DNA methylation), and large sample sizes resulted in an epigenetic biomarker that outperforms existing molecular biomarkers of aging in terms of its strong relationship with a host of age-related conditions. The new DNAm PhenoAge measure performs better than any other molecular biomarker of human aging when it comes to predicting healthspan and lifespan.
Our results also demonstrate the utility of a novel method for building DNAm based biomarkers of aging. Our development of the new epigenetic biomarker of aging proceeded along two main steps. In step 1, a novel measure of phenotypic age was developed using clinical data. A Cox penalized regression model—where the hazard of aging-related mortality was regressed on clinical markers and chronological age—was used to select variables for inclusion in our phenotypic age score. In step 2, phenotypic age is regressed on DNA methylation data from the same individuals. The regression produced a model in which phenotypic age is predicted by DNAm levels. The linear combination of the weighted CpGs yields a DNAm based estimator of phenotypic age that we refer to as ‘DNAm PhenoAge’ in contrast to the previously published measures of ‘DNAm Age’.
To use the epigenetic biomarker one needs to extract DNA from cells or fluids, e.g. human blood cells, saliva, liver, brain tissue. Next, one needs to measure DNA methylation levels in the underlying signature of 513 CpGs (epigenetic markers) that are being used in the mathematical algorithm. The algorithm leads to a “phenotypic age” (the apparent age of an individual resulting from the interaction of its genotype with the environment) for each sample or human subject. The higher the value, the higher the risk of death and disease.
As noted above, embodiments of the present invention relate to methods for estimating the biological age of an individual human tissue or cell type sample based on measuring DNA Cytosine-phosphate-Guanine (CpG) methylation markers that are attached to DNA. In a general embodiment of the invention, a method is disclosed comprising a first step of choosing a source of DNA such as specific biological cells (e.g. T cells in blood) or tissue sample (e.g. blood) or fluid (e.g. saliva). In a second step, genomic DNA is extracted from the collected source of DNA of the individual for whom a biological age estimate is desired. In a third step, the methylation levels of the methylation markers near the specific clock CpGs are measured. In a fourth step, a statistical prediction algorithm is applied to the methylation levels to predict the age. One basic approach is to form a weighted average of the CpGs, which is then transformed to DNA methylation (DNAm) age using a calibration function. As used herein, “weighted average” is a linear combination calculated by giving values in a data set more influence according to some attribute of the data. It is a number in which each quantity included in the linear combination is assigned a weight (or coefficient), and these weightings determine the relative importance of each quantity in the linear combination.
DNA methylation of the methylation markers (or markers close to them) can be measured using various approaches, which range from commercial array platforms (e.g. from Illumina™) to sequencing approaches of individual genes. This includes standard lab techniques or array platforms. A variety of methods for detecting methylation status or patterns have been described in, for example, U.S. Pat. Nos. 6,214,556, 5,786,146, 6,017,704, 6,265,171, 6,200,756, 6,251,594, 5,912,147, 6,331,393, 6,605,432, and 6,300,071 and US Patent Application Publication Nos. 20030148327, 20030148326, 20030143606, 20030082609 and 20050009059, each of which is incorporated herein by reference. Other array-based methods of methylation analysis are disclosed in U.S. patent application Ser. No. 11/058,566. For a review of some methylation detection methods, see Oakeley, E. J., Pharmacology & Therapeutics 84:389-400 (1999). Available methods include, but are not limited to: reverse-phase HPLC, thin-layer chromatography, SssI methyltransferases with incorporation of labeled methyl groups, the chloroacetaldehyde reaction, differentially sensitive restriction enzymes, hydrazine or permanganate treatment (m5C is cleaved by permanganate treatment but not by hydrazine treatment), sodium bisulfite, combined bisulfite-restriction analysis, and methylation sensitive single nucleotide primer extension.
The methylation levels of a subset of the DNA methylation markers disclosed herein are assayed (e.g. using an Illumina™ DNA methylation array, or using a PCR protocol involving relevant primers). To quantify the methylation level, one can follow the standard protocol described by Illumina™ to calculate the beta value of methylation, which equals the fraction of methylated cytosines in that location. The invention can also be applied to any other approach for quantifying DNA methylation at locations near the genes as disclosed herein. DNA methylation can be quantified using many currently available assays which include, for example:
a) Molecular break light assay for DNA adenine methyltransferase activity is an assay that is based on the specificity of the restriction enzyme DpnI for fully methylated (adenine methylation) GATC sites in an oligonucleotide labeled with a fluorophore and quencher. The adenine methyltransferase methylates the oligonucleotide making it a substrate for DpnI. Cutting of the oligonucleotide by DpnI gives rise to a fluorescence increase.
b) Methylation-Specific Polymerase Chain Reaction (PCR) is based on a chemical reaction of sodium bisulfite with DNA that converts unmethylated cytosines of CpG dinucleotides to uracil or UpG, followed by traditional PCR. However, methylated cytosines will not be converted in this process, and thus primers are designed to overlap the CpG site of interest, which allows one to determine methylation status as methylated or unmethylated. The beta value can be calculated as the proportion of methylation.
c) Whole genome bisulfite sequencing, also known as BS-Seq, is a genome-wide analysis of DNA methylation. It is based on the sodium bisulfite conversion of genomic DNA, which is then sequenced on a Next-Generation Sequencing (NGS) platform. The sequences obtained are then re-aligned to the reference genome to determine methylation states of CpG dinucleotides based on mismatches resulting from the conversion of unmethylated cytosines into uracil.
d) The HpaII tiny fragment Enrichment by Ligation-mediated PCR (HELP) assay is based on restriction enzymes' differential ability to recognize and cleave methylated and unmethylated CpG DNA sites.
e) Methyl Sensitive Southern Blotting is similar to the HELP assay but uses Southern blotting techniques to probe gene-specific differences in methylation using restriction digests. This technique is used to evaluate local methylation near the binding site for the probe.
f) ChIP-on-chip assay is based on the ability of commercially prepared antibodies to bind to DNA methylation-associated proteins like MeCP2.
g) Restriction landmark genomic scanning is a complicated and now rarely used assay based upon restriction enzymes' differential recognition of methylated and unmethylated CpG sites. This assay is similar in concept to the HELP assay.
h) Methylated DNA immunoprecipitation (MeDIP) is analogous to chromatin immunoprecipitation. Immunoprecipitation is used to isolate methylated DNA fragments for input into DNA detection methods such as DNA microarrays (MeDIP-chip) or DNA sequencing (MeDIP-seq).
i) Pyrosequencing of bisulfite treated DNA is a sequencing of an amplicon made by a normal forward primer but a biotinylated reverse primer to PCR the gene of choice. The Pyrosequencer then analyses the sample by denaturing the DNA and adding one nucleotide at a time to the mix according to a sequence given by the user. If there is a mismatch, it is recorded and the percentage of DNA for which the mismatch is present is noted. This gives the user a percentage methylation per CpG island.
In certain embodiments of the invention, the genomic DNA is hybridized to a complementary sequence (e.g. a synthetic polynucleotide sequence) that is coupled to a matrix (e.g. one disposed within a microarray such as on a DNA chip). Optionally, the genomic DNA is transformed from its natural state via amplification by a polymerase chain reaction process. For example, prior to or concurrent with hybridization to an array, the sample may be amplified by a variety of mechanisms, some of which may employ PCR. See, for example, PCR Technology: Principles and Applications for DNA Amplification (Ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (Eds. Innis, et al., Academic Press, San Diego, Calif., 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (Eds. McPherson et al., IRL Press, Oxford); and U.S. Pat. Nos. 4,683,202, 4,683,195, 4,800,159, 4,965,188, and 5,333,675. The sample may be amplified on the array. See, for example, U.S. Pat. No. 6,300,070, which is incorporated herein by reference.
In addition to using art accepted modeling techniques (e.g. regression analyses), embodiments of the invention can include a variety of art accepted technical processes. For example, in certain embodiments of the invention, a bisulfite conversion process is performed so that cytosine residues in the genomic DNA are transformed to uracil, while 5-methylcytosine residues in the genomic DNA are not transformed to uracil. Kits for DNA bisulfite modification are commercially available from, for example, MethylEasy™ (Human Genetic Signatures™) and CpGenome™ Modification Kit (Chemicon™). See also, WO04096825A1, which describes bisulfite modification methods and Olek et al. Nuc. Acids Res. 24:5064-6 (1994), which discloses methods of performing bisulfite treatment and subsequent amplification. Bisulfite treatment allows the methylation status of cytosines to be detected by a variety of methods. For example, any method that may be used to detect a SNP may be used (see, for example, Syvanen, Nature Rev. Gen. 2:930-942 (2001)). Methods such as single base extension (SBE) may be used, as may hybridization of sequence specific probes similar to allele specific hybridization methods. In another aspect the Molecular Inversion Probe (MIP) assay may be used.
The 513 CpG sites discussed herein are found in Table 5 that is included with this application. The Illumina method takes advantage of sequences flanking a CpG locus to generate a unique CpG locus cluster ID with a similar strategy as NCBI's refSNP IDs (rs#) in dbSNP (see, e.g. Technical Note: Epigenetics, CpG Loci Identification ILLUMINA Inc. 2010). Further information on the present invention can be found in Levine et al., Aging, 2018 Apr. 18; 10(4):573-591 which is incorporated herein by reference.
Estimating Phenotypic Age from Clinical Biomarkers
Our development of the new epigenetic biomarker of aging proceeded along three main steps (
Validation data for phenotypic age came from the fourth National Health and Nutrition Examination Survey (NHANES IV), and included up to 17 years of mortality follow-up for n=6,209 nationally representative US adults. Mortality results show (Table 1) that a one year increase in phenotypic age is associated with a 9% increase in the risk of all-cause mortality (HR=1.09, p=3.8E-49), a 9% increase in the risk of aging-related mortality (HR=1.09, p=4.5E-34), a 10% increase in the risk of CVD mortality (HR=1.10, p=5.1E-17), a 7% increase in the risk of cancer mortality (HR=1.07, p=7.9E-10), a 20% increase in the risk of diabetes mortality (HR=1.20, p=1.9E-11), and a 9% increase in the risk of lung disease mortality (HR=1.09, p=6.3E-4). Finally, in the proportional hazard model, phenotypic age completely accounted for the effect of chronological age, such that chronological age no longer exhibited a significant positive association with mortality.
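Because the hazard ratios above are reported per one-year increase in phenotypic age, it can help to see what they imply for larger differences. The following is a minimal sketch, assuming the usual log-linear proportional hazards form in which a k-year difference corresponds to the per-year hazard ratio raised to the power k. The function name is illustrative and is not part of the original analysis.

```python
def hazard_ratio_for_difference(hr_per_year, years):
    """Hazard ratio implied by a difference of `years` in phenotypic age.

    Assumes a proportional hazards model with a linear term for phenotypic
    age, so that per-year hazard ratios compound multiplicatively.
    """
    return hr_per_year ** years


# Per-year hazard ratios quoted in the text.
all_cause_10y = hazard_ratio_for_difference(1.09, 10)
diabetes_5y = hazard_ratio_for_difference(1.20, 5)
print(f"10-year difference, all-cause mortality: HR = {all_cause_10y:.2f}")
print(f"5-year difference, diabetes mortality:   HR = {diabetes_5y:.2f}")
```

Under this assumption, two same-age individuals whose phenotypic ages differ by ten years would differ in all-cause mortality hazard by a factor of roughly 2.4.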
We further tested whether the phenotypic age associations held up when examining mortality among three age strata: young and middle-aged adults (20-64 years at baseline), older adults (65-79 years at baseline), and the oldest-old (80+ years at baseline). Results showed consistent findings for all-cause, aging-related, CVD, cancer, diabetes, and lung disease within all age strata (Table 1). Finally, to ensure that phenotypic age didn't simply represent an end-of-life marker, we removed participants who died within five years of baseline, and then re-examined mortality associations. Again, we find significant associations for all mortality outcomes, except Alzheimer's disease (Table 1).
Finally, as shown in
In step 2 (
While our new clock was trained on cross-sectional data in InCHIANTI, we capitalized on the repeated time-points to test whether changes in DNAm PhenoAge are related to changes in phenotypic age. As expected, between 1998 and 2007, mean change in DNAm PhenoAge was 8.51 years, whereas mean change in phenotypic age was 8.88 years. Moreover, participants' phenotypic age (adjusting for chronological age) at the two time-points was correlated at r=0.50, whereas participants' DNAm PhenoAge (adjusting for chronological age) at the two time-points was correlated at r=0.68 (
In step 3 (
We also observe strong associations between DNAm PhenoAge and a variety of other aging outcomes (Table 2). For instance, independent of chronological age, higher DNAm PhenoAge is associated with an increase in a person's number of coexisting morbidities (Meta P-value=4.56E-15), a decrease in likelihood of being disease-free (Meta P-value=1.06E-7), an increase in physical functioning problems (Meta P-value=2.05E-13), an increase in CHD risk (Meta P-value=2.43E-10), and an earlier age at menopause (Meta P-value=8.22E-4), suggesting that women were epigenetically older if they had entered menopause earlier.
Additional replication data were used to test for associations with other aging outcomes, which have previously been shown to relate to the first generation of epigenetic biomarkers14,15,23-26. For instance, we find that among the 527 women who were cancer free at age 50, accelerated DNAm PhenoAge predicts incident breast cancer (p=0.033, OR: 1.037). We also find a marginally significant reduction of approximately 2.4 years for the DNAm PhenoAge of semi-super centenarian offspring, relative to controls (β=−2.40, p=0.065). Using blood methylation data, we evaluated whether DNAm PhenoAge relates to clinically diagnosed dementia in living individuals. Results suggest that those with presumed Alzheimer's disease (AD, n=154) and/or frontotemporal dementia (FTD, n=116) have significantly higher DNAm PhenoAge compared to non-demented (n=334) individuals (P=2.2E-2), and the strength of the association is further increased (P=9.4E-3) when limiting the sample to those ages 75 and older. We also find that DNAm PhenoAge relates to Down syndrome in two separate blood methylation datasets (p=0.0046 and p=4.0E-11), and similarly relates to HIV infection in two blood datasets (p=6E-6 and p=8.6E-6). We observe a suggestive relationship between DNAm PhenoAge in blood and Parkinson's disease status (p=0.028) for individuals of European ancestry.
Given the recent study in which Zhang and colleagues27 developed an epigenetic mortality predictor that turned out to be an estimate of smoking habits, we examined the association between DNAm PhenoAge and smoking. As shown in
In evaluating the relationship between DNAm PhenoAge and social, behavioral, and demographic characteristics we observe significant differences between racial/ethnic groups (p=5.1E-5), with non-Hispanic blacks having the highest DNAm PhenoAge on average, and non-Hispanic whites having the lowest (
Although DNAm PhenoAge was developed from DNAm levels assessed in whole blood, our empirical results show that it strongly correlates with chronological age in a host of different tissues (
Using the Horvath DNAm age measure, we previously found that body mass index is correlated with epigenetic age acceleration in two independent human liver samples (r=0.42 and r=0.42 in liver data sets 1 and 2, respectively)29. Using the same data, we replicated this finding using the new measure of PhenoAge acceleration (r=0.32, p=0.011 and r=0.48, p=7.7E-6 in liver data sets 1 and 2, respectively). Interestingly, we also find a significant correlation between BMI and DNAm PhenoAge acceleration in the first adipose data set (r=0.43, p=1.2E-23 using n=648 adipose samples from the Twins UK study) but not in a second smaller adipose data set (n=32 samples).
To test the hypothesis that DNAm phenotypic age acceleration captures aspects of the age-related decline of the immune system, we correlated DNAm PhenoAge acceleration with estimated blood cell count (
In our functional enrichment analysis of the chromosomal locations of the 513 CpGs, we found that 149 CpGs whose age correlation exceeded 0.2 tended to be located in CpG islands (p=0.0045,
Our heritability analysis of DNAm PhenoAge acceleration used the SOLAR polygenic model to estimate the proportion of phenotypic variance explained by family relationship in the Framingham Heart Study pedigrees. The model assumes additive genetic heritability in a polygenic model, adjusting for chronological age and sex. The heritability estimated by the SOLAR polygenic model was h2=0.33 among persons of European ancestry. Similarly, a heritability estimate from SNP data was calculated from WHI data using GCTA-GREML analysis. In this model, we find that heritability is estimated at h2=0.51 for participants of European ancestry.
Using a novel two-step method, we were successful in developing a DNAm based biomarker of aging that is highly predictive of nearly every morbidity and mortality outcome we tested. Our study demonstrates that DNAm PhenoAge greatly outperforms the first generation of DNAm based biomarkers of aging from Hannum9 and Horvath10, in terms of both its predictive accuracy for time to death and its associations with various other aging measures, including disease incidence/prevalence and physical functioning. Most surprisingly, DNAm PhenoAge is associated with age-related conditions in samples other than whole blood, for instance obesity in liver.
Our applications demonstrate that the combination of advanced machine learning methods, relevant functional genomic data (DNA methylation), and large sample sizes resulted in an epigenetic biomarker that outperforms existing molecular biomarkers. However, the unbiased, data-driven approach used in its construction means that it is challenging to understand the molecular causes and consequences of DNAm PhenoAge. To partially address this challenge, we employed three approaches: i) a study of the relationship between phenotypic aging and changes in blood cell counts, ii) functional enrichment studies of the underlying CpGs, and iii) a heritability analysis. Although DNAm PhenoAge captures some aspects of the age-related decline in the immune system, these changes in cell composition do not explain the strong association between DNAm PhenoAge and mortality/morbidity outcomes. Our functional enrichment study demonstrates that age-related DNA methylation changes in polycomb group protein targets must play a role, which echoes results from previous epigenome wide studies of aging effects4,31,32. Our heritability analysis suggests that there is a genetic basis for differences in DNAm PhenoAge, after adjusting for chronological age. Our results also suggest DNAm PhenoAge may respond to modifiable lifestyle factors. In moving forward, it will be important to establish causative pathways to test whether DNAm PhenoAge mediates the links between these precipitating factors and aging-related outcomes (i.e. social, behavioral, environmental conditions→DNAm PhenoAge→morbidity/mortality).
Overall, we expect that DNAm PhenoAge will become a useful molecular biomarker for human anti-aging studies because it is a highly robust, blood based biomarker that captures organismal age and the functional state of many organ systems and tissues.
Using the NHANES training data, we applied a Cox penalized regression model, in which the hazard of aging-related mortality (mortality from diseases of the heart, malignant neoplasms, chronic lower respiratory disease, cerebrovascular disease, Alzheimer's disease, diabetes mellitus, nephritis, nephrotic syndrome, and nephrosis) was regressed on forty-two clinical markers and chronological age, to select variables for inclusion in our phenotypic age score. Ten-fold cross-validation was employed to select the parameter value, lambda, for the penalized regression. In order to develop a sparse phenotypic age estimator (the fewest biomarker variables needed to produce robust results) we selected a lambda of 0.0192, which represented a one standard deviation increase over the lambda with minimum mean-squared error during cross-validation (
These nine biomarkers and chronological age were then included in a parametric proportional hazards model based on the Gompertz distribution. Based on this model, we estimated the 10-year (120 months) mortality risk of the j-th individual. Next, the mortality score was converted into units of years. The resulting phenotypic age estimate was regressed on DNA methylation data using an elastic net regression analysis. The penalization parameter was chosen to minimize the cross validated mean square error rate (
As noted above, these nine biomarkers and chronological age were then included in a parametric proportional hazards model based on the Gompertz distribution. Using this model, we estimated the 10-year (120 months) mortality risk of the j-th individual from the cumulative distribution function
$$\mathrm{MortalityScore}_j=\mathrm{CDF}(120,x_j)=1-\exp\left(-\exp(x_j b)\,\frac{\exp(120\gamma)-1}{\gamma}\right)$$
where $\gamma$ is the ancillary parameter of the Gompertz model and $x_j b$ represents the linear combination of biomarkers from the fitted model (Table 4):
$$
\begin{aligned}
\mathrm{WeightedAverage}_j = x_j b ={}& -0.0336\cdot\mathrm{Albumin} + 0.0095\cdot\log(\mathrm{Creatinine}) + 0.1953\cdot\mathrm{Glucose} \\
&+ 0.0954\cdot\mathrm{CReactiveProtein} - 0.0120\cdot\mathrm{LymphocytePerc} + 0.0268\cdot\mathrm{MeanCellVolume} \\
&+ 0.3306\cdot\mathrm{RedBloodCellDistributionWidth} + 0.0019\cdot\mathrm{AlkalinePhosphatase} \\
&+ 0.0554\cdot\mathrm{WhiteBloodCellCount} + 0.0804\cdot\mathrm{age} - 19.9067
\end{aligned}
$$
Next, the mortality score was converted into units of years using the following equation
$$\mathrm{PhenotypicAge}_j=141.50225+\frac{\ln\left(-0.00553\cdot\ln(1-\mathrm{MortalityScore}_j)\right)}{0.090165}$$
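The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study. It copies the coefficients as printed in the linear combination above (reading the sign on the mean cell volume term as positive) and takes γ=0.0076927 from the Gompertz fit reported later in the text. The dictionary keys and the example input values are hypothetical, and the inputs must be in the same units as the data used to fit the model, which are not restated in this section.

```python
import math

GAMMA = 0.0076927      # Gompertz ancillary parameter reported in the text
INTERCEPT = -19.9067

# Coefficients as printed in the linear combination; key names are illustrative.
COEFFS = {
    "albumin": -0.0336,
    "log_creatinine": 0.0095,   # the printed formula applies log to creatinine
    "glucose": 0.1953,
    "crp": 0.0954,
    "lymphocyte_pct": -0.0120,
    "mcv": 0.0268,              # sign read as positive (assumption)
    "rdw": 0.3306,
    "alp": 0.0019,
    "wbc": 0.0554,
    "age": 0.0804,
}


def linear_predictor(m):
    """Return xb for one individual from a dict of raw measurements."""
    features = dict(m)
    features["log_creatinine"] = math.log(features.pop("creatinine"))
    return INTERCEPT + sum(COEFFS[k] * features[k] for k in COEFFS)


def mortality_score(xb, months=120, gamma=GAMMA):
    """Gompertz CDF evaluated at `months`: estimated probability of death."""
    return 1.0 - math.exp(-math.exp(xb) * (math.exp(gamma * months) - 1.0) / gamma)


def phenotypic_age(score):
    """Convert a 120-month mortality score into units of years."""
    return 141.50225 + math.log(-0.00553 * math.log(1.0 - score)) / 0.090165


# Hypothetical measurements, for illustration only.
person = {"albumin": 42, "creatinine": 80, "glucose": 5.0, "crp": 0.2,
          "lymphocyte_pct": 30, "mcv": 90, "rdw": 13, "alp": 70,
          "wbc": 6, "age": 50}
score = mortality_score(linear_predictor(person))
print(f"mortality score: {score:.4f}, phenotypic age: {phenotypic_age(score):.1f}")
```

Because the mortality score depends on the measurements only through xb, and the conversion to years is monotone in the score, a higher linear predictor always yields a higher phenotypic age.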
Statistical Details on the Gompertz Proportional Hazards Model for Phenotypic Age Estimation
The Gompertz regression is parameterized only as a proportional hazards model. This model has been used extensively for modeling mortality data. The Gompertz distribution implemented is the two-parameter function as described in Lee and Wang (2003)1, with the following hazard and survivor functions:
$$h(t)=\lambda\exp(\gamma t)$$
$$S(t)=\exp\{-\lambda\gamma^{-1}(e^{\gamma t}-1)\}$$
The covariates of the j-th individual are included in the model using the following parametrization: $\lambda_j=\exp(x_j\beta)$, which implies that the baseline hazard is given by $h_0(t)=\exp(\gamma t)$, where $\gamma$ is an ancillary parameter to be estimated from the data.
The cumulative distribution function of the Gompertz model is given by
$$\mathrm{CDF}(t,x)=1-\exp\left(-\exp(xb)\,(\exp(\gamma t)-1)/\gamma\right)$$
where t denotes time (here in units of months) and $xb=\sum_{u=1}^{p}x_u b_u+b_0$.
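The relationship among the hazard, survivor, and cumulative distribution functions can be checked numerically. The sketch below is illustrative only: the function names and the linear predictor value are hypothetical, and γ is the estimate reported in this description. It confirms that the CDF is one minus the survivor function and that the hazard equals the negative derivative of the log survivor function.

```python
import math

GAMMA = 0.0076927  # ancillary parameter reported in this description


def gompertz_hazard(t, xb, gamma=GAMMA):
    """h(t) = lambda * exp(gamma * t), with lambda = exp(xb)."""
    return math.exp(xb) * math.exp(gamma * t)


def gompertz_survivor(t, xb, gamma=GAMMA):
    """S(t) = exp(-lambda * (exp(gamma * t) - 1) / gamma)."""
    return math.exp(-math.exp(xb) * (math.exp(gamma * t) - 1.0) / gamma)


def gompertz_cdf(t, xb, gamma=GAMMA):
    """CDF(t, x) = 1 - S(t): probability of death by time t (in months)."""
    return 1.0 - gompertz_survivor(t, xb, gamma)


xb = -9.0   # hypothetical linear predictor
t = 60.0    # months
dt = 1e-3

# Central-difference estimate of -d/dt log S(t), which should match h(t).
numeric_hazard = -(math.log(gompertz_survivor(t + dt, xb))
                   - math.log(gompertz_survivor(t - dt, xb))) / (2 * dt)
print(gompertz_hazard(t, xb), numeric_hazard)
print(gompertz_cdf(120.0, xb))
```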
We used the STATA software (StataCorp. 2001. Statistical Software: Release 7.0) to carry out the Gompertz regression analysis.
In step 1, we fit a parametric proportional hazards model analysis with Gompertz distribution using the STATA commands
stset person_months [pweight=wt], failure(mortstat==1)
streg var1 var2 var3 . . . vark,dist(gomp)
The Gompertz regression analysis resulted in coefficient values and parameter values (Table 1) and γ=0.0076927.
In step 2, we used the cumulative distribution function of the Gompertz model to estimate the 120-month mortality risk of each individual. Thus, CDF(t=120,xj) denotes the probability that the j-th individual will die within the next 120 months. In step 3, we carried out another parametric proportional hazards model analysis with Gompertz distribution, but including only chronological age as an independent variable (IV). We will refer to this analysis as the univariate Gompertz regression model since it only involved one covariate (age). The resulting estimate of the cumulative distribution function CDF·univariate(t,age) allowed us to estimate the probability that the j-th individual will die within 120 months as follows: CDF·univariate(120,agej), where agej is the age of the j-th individual.
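In step 4, we solved the equation CDF(120,xj)=CDF·univariate(120,agej) for the variable agej. The resulting solution for the j-th individual, referred to as PhenotypicAge, is given by
$$\mathrm{PhenotypicAge}_j=141.50225+\frac{\ln\left(-0.00553\cdot\ln(1-\mathrm{CDF}(120,x_j))\right)}{0.090165}$$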
Participants ages 20 and over in NHANES III (1988-94) were used as the training sample to develop a new and improved measure of phenotypic aging (n=9,926), while participants ages 20 and over in NHANES IV (1999-2014) were used to validate the association between phenotypic aging and age-related morbidity and mortality (n=6,209). Overall, NHANES III had available mortality follow-up for up to 23 years (n=deaths) and NHANES IV had available mortality follow-up for up to 17 years (n=deaths). InCHIANTI included longitudinal (two time-points, 1998 and 2007) phenotypic and DNAm data on n=456 male and female participants, ages 21-91 in 1998, and 30-100 in 2007. Participants from WHI included 2,107 post-menopausal women, who were ages 50-80 at baseline and were followed up for just over 20 years.
Steps for Measuring the DNA Methylation PhenoAge of a Tissue Sample and Estimating DNA Methylation-Based Predictors of Mortality
Step 1: Obtain Cells from Blood, Saliva, or Other Sources of DNA from an Individual.
There are several options.
Blood tubes collected by venipuncture: Blood tubes collected by venipuncture will result in a large amount of high quality DNA from a relevant tissue. The invention applies to DNA from whole blood, or peripheral blood mononuclear cells or even sorted blood cell types.
Saliva spit kit:
Dried blood spots can be easily collected by a finger prick method. The resulting blood droplet can be put on a blood card, e.g. http://www.lipidx.com/dbs-kits/.
This step will be carried out by the lab that collects the samples.
Step 2a: Extract the genomic DNA from the cells
Step 2b: Measure cytosine DNA methylation levels.
Several approaches can be used for measuring DNA methylation, including sequencing, bisulfite sequencing, arrays, pyrosequencing, and liquid chromatography coupled with tandem mass spectrometry.
Our invention applies to any platform used for measuring DNA methylation data. In particular, it can be used in conjunction with the latest Illumina methylation array platform the EPIC array or the older platforms (Infinium 450K array or 27K array). Our coefficient values used pertain to the “beta values” whose values lie between 0 and 1 but it could be easily adapted to other metrics of assessing DNA methylation, e.g. “M values”.
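For readers less familiar with the two metrics, the following sketch shows one common way to compute a beta value from methylated and unmethylated signal intensities and to convert between beta values and M values. It is not taken from this description: the function names are illustrative, the offset of 100 follows the convention commonly cited for Illumina arrays and should be treated as an assumption, and the M value is taken to be the base-2 logit of the beta value.

```python
import math


def beta_from_intensities(meth, unmeth, offset=100.0):
    """Beta value: methylated signal as a fraction of total signal.

    Negative intensities are clipped to zero. The offset stabilizes the
    ratio when both signals are low (assumed default of 100).
    """
    meth = max(meth, 0.0)
    unmeth = max(unmeth, 0.0)
    return meth / (meth + unmeth + offset)


def beta_to_m(beta, eps=1e-6):
    """M value as the log2 ratio beta / (1 - beta); beta is clipped away from 0 and 1."""
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m):
    """Inverse transform from an M value back to a beta value in (0, 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)


beta = beta_from_intensities(meth=4000.0, unmeth=900.0)
m = beta_to_m(beta)
print(f"beta = {beta:.3f}, M = {m:.3f}, back to beta = {m_to_beta(m):.3f}")
```

Coefficients fitted on beta values cannot be applied directly to M values. A model adapted to M values would either need refitting or the inputs would need converting back to beta values first, as `m_to_beta` does here.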
The DNAm PhenoAge estimate can be calculated as a weighted linear combination of the 513 CpGs in Table 5. This table also includes the probe designation/identifier used in the Illumina Infinium 450K array.
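A weighted linear combination of this kind can be sketched as follows. The probe identifiers, weights, and intercept below are hypothetical placeholders and are not values from Table 5. In practice, the weights dictionary would hold one entry per CpG in that table, and whether an intercept term is used is an assumption of this sketch.

```python
def dnam_phenoage(betas, weights, intercept=0.0):
    """Weighted linear combination of CpG beta values.

    betas:     mapping of probe identifier -> beta value (0 to 1)
    weights:   mapping of probe identifier -> coefficient
    intercept: constant term (assumed; hypothetical in this sketch)
    """
    missing = [cpg for cpg in weights if cpg not in betas]
    if missing:
        raise KeyError(f"missing beta values for probes: {missing}")
    return intercept + sum(w * betas[cpg] for cpg, w in weights.items())


# Hypothetical placeholders standing in for Table 5 entries.
example_weights = {"cg_example_1": 12.5, "cg_example_2": -8.0, "cg_example_3": 3.25}
example_betas = {"cg_example_1": 0.40, "cg_example_2": 0.75, "cg_example_3": 0.10}

estimate = dnam_phenoage(example_betas, example_weights, intercept=60.0)
print(f"DNAm PhenoAge (illustrative): {estimate:.3f}")
```

Raising an error on missing probes, rather than silently skipping them, keeps a partially measured sample from producing an estimate that looks valid but omits part of the signature.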
All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited (e.g. U.S. Patent Publication 20150259742). Publications cited herein are cited for their disclosure prior to the filing date of the present application. Nothing here is to be construed as an admission that the inventors are not entitled to antedate the publications by virtue of an earlier priority date or prior date of invention. Further, the actual publication dates may be different from those shown and require independent verification.
This concludes the description of the preferred embodiment of the present invention. The foregoing description of one or more embodiments of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching.
This application claims priority under Section 119(e) from U.S. Provisional Application Ser. No. 62/618,422, filed Jan. 17, 2018, entitled “PHENOTYPIC AGE AND DNA METHYLATION BASED BIOMARKERS FOR LIFE EXPECTANCY AND MORBIDITY”, the contents of which are incorporated herein by reference.
This invention was made with Government support under Grant Numbers AG051425 and AG052604, awarded by the National Institutes of Health. The Government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2019/014053 | 1/17/2019 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62618422 | Jan 2018 | US |