EPIGENETIC METHOD TO ESTIMATE THE INTRINSIC AGE OF SKIN

Information

  • Patent Application
  • Publication Number
    20210207201
  • Date Filed
    June 15, 2019
  • Date Published
    July 08, 2021
Abstract
The invention provides a method for obtaining information useful to determine the intrinsic age of skin of an individual, the method comprising the steps of: (a) obtaining genomic DNA from skin cells derived from the individual; and (b) observing cytosine methylation of >30 CpG loci in the genomic DNA selected from the group consisting of: cg19381811 cg19670290 cg15393490 cg01465824 cg04999352 cg09046979 cg10426318 cg09077126 cg24374161 cg14896948 cg14412967 cg16937583 cg17508941 cg24757926 cg03936449 cg17953764 cg00442430 cg06621744 cg08076830 cg06882058 cg25351606 cg02662828 cg20897936 cg07878486 cg21992250 cg06335867 cg15171839 cg09017434 cg04044664 cg20442599 cg15488596 cg10384245 cg23368787 cg07960624 cg08622677 cg13848598 cg02273797 cg22593953 cg23213887 cg20234007 cg26492368 cg06470727 cg13612317 cg09432376 cg12530994 cg05457221 cg04766371 cg03614721 cg22624391 cg27369542 cg18322569 cg27284120 cg21303763 cg05238606 cg16300030 cg00085493 cg11239720 cg23942526 cg10568624 cg07217499 cg03405983 cg25590826 cg10292855 cg15440941 cg15084543 cg05036656 cg00167670 cg18396984 cg00642460 cg07922606 cg23676577 cg17241310 cg00991848 cg03738025 cg09287864 cg12060499 cg14912644 cg11084334 cg22589169 cg17885226 cg15568145 cg07779387 cg02571816 cg17861230 cg26993102 cg02898293 cg00346208 cg17062829 cg15895690, so that information useful to determine the intrinsic age of the skin of the individual is obtained.
Description
FIELD OF INVENTION

This invention relates to methods of detecting and analysing patterns of cytosine methylation in genomic DNA. More specifically, it relates to detecting and analysing patterns of cytosine methylation in specific sites in genomic DNA in order to determine the intrinsic age and health of skin.


BACKGROUND TO INVENTION

It is well known that ageing is a multifactorial process predominantly driven by the age of the individual. Skin ageing is an especially multifactorial phenomenon driven by both intrinsic and extrinsic factors. In terms of intrinsic factors, the chronological age of an individual is the most well-known, but other intrinsic factors such as an individual's metabolism, diet, stress and underlying health also contribute to the age of the skin. In addition to these intrinsic factors, the skin is exposed to external challenges such as UV radiation, pollution, drying conditions and extremes of temperature. These extrinsic factors therefore also contribute to the age of an individual's skin.


It is therefore clear that there are two distinct forms of skin age: Extrinsic age, which is dominated by the accumulation of ageing caused by extrinsic factors (i.e. originating from outside the exterior surface of the stratum corneum and that then penetrate into the skin through the stratum corneum), especially sun exposure (photo-ageing); and Intrinsic age, which is the degree of ageing in skin due to factors that originate endogenously; in other words ageing not due to extrinsic factors. For the sake of understanding, it is helpful to consider two different types of skin of an individual: one from a site normally protected by clothing (such as the buttock area or upper inner arm area), and another from a sun-exposed site (such as the face or back of the hand). The protected site will have far less exposure to extrinsic aging factors and therefore any aging will be due to intrinsic factors. The exposed site will have been fully exposed to extrinsic aging factors and therefore the age of this area will be due to a combination of the inherent intrinsic age caused by the intrinsic factors and the aging due to the extrinsic factors.


The present invention is directed towards the development of an epigenetic method to estimate the intrinsic age of an individual's skin.


DNA methylation is an epigenetic determinant of gene expression. Patterns of CpG methylation are heritable, tissue specific, and correlate with gene expression. The consequence of methylation, particularly if located in a gene promoter, is usually gene silencing. DNA methylation also correlates with other cellular processes including embryonic development, chromatin structure, genomic imprinting, somatic X-chromosome inactivation in females, inhibition of transcription and transposition of foreign DNA and timing of DNA replication. When a gene is highly methylated it is less likely to be expressed. Thus, the identification of sites in the genome containing 5-meC is important in understanding cell-type specific programs of gene expression and how gene expression profiles are altered during both normal development, ageing and diseases such as cancer. Mapping of DNA methylation patterns is important for understanding diverse biological processes such as the regulation of imprinted genes, X chromosome inactivation, and tumor suppressor gene silencing in human cancers.


Horvath S. et al. “DNA methylation age of human tissues and cell types” (Genome Biology 14 (2013) R115) reports the use of a transformed version of chronological age that was regressed on CpGs using a penalized regression model (elastic net). The elastic net regression model selected 353 CpGs, which were referred to as epigenetic clock CpGs since their weighted average (formed by the regression coefficients) was said to amount to an epigenetic clock. This study is referred to as the “Horvath Study” in this patent.


However, we have now found that for sun-exposed skin sites the predicted ages based on these 353 loci were approximately 9 years younger than the actual (“chronological”) age, indicating that they do not detect sun-induced damage in skin. Additionally, sun-protected skin samples were found to have an age 4 years younger than the chronological age, which is an underestimation: the age of sun-protected skin would be expected to be approximately the same as the chronological age of the subject from whom the sample was taken. These 353 loci therefore fail to recognize the difference between photo-damaged and photo-protected skin types, underestimate the age of sun-protected skin, and predict photo-damaged skin as younger than photo-protected skin. It can therefore be appreciated that this model is not capable of assessing the different forms of ageing, namely extrinsic and intrinsic ageing.


The present invention therefore aims to address the poor performance of this prior art ageing model and to provide an improved method for evaluating the intrinsic age of skin.


SUMMARY OF INVENTION

We have surprisingly found that a different, specific set of methylation sites provides enhanced accuracy for the prediction of intrinsic skin age. In particular, the sites are capable of predicting the age of protected skin and are also capable of giving an intrinsic age for exposed skin that is surprisingly not influenced by extrinsic factors.


Accordingly, in a first aspect the invention provides a method for obtaining information useful to determine the intrinsic age of skin of an individual, the method comprising the steps of:


(a) obtaining genomic DNA from skin cells derived from the individual; and


(b) observing cytosine methylation of >30 CpG loci in the genomic DNA selected from the group consisting of:














cg19381811 cg19670290 cg15393490 cg01465824 cg04999352 cg09046979 cg10426318


cg09077126 cg24374161 cg14896948 cg14412967 cg16937583 cg17508941 cg24757926


cg03936449 cg17953764 cg00442430 cg06621744 cg08076830 cg06882058 cg25351606


cg02662828 cg20897936 cg07878486 cg21992250 cg06335867 cg15171839 cg09017434


cg04044664 cg20442599 cg15488596 cg10384245 cg23368787 cg07960624 cg08622677


cg13848598 cg02273797 cg22593953 cg23213887 cg20234007 cg26492368 cg06470727


cg13612317 cg09432376 cg12530994 cg05457221 cg04766371 cg03614721 cg22624391


cg27369542 cg18322569 cg27284120 cg21303763 cg05238606 cg16300030 cg00085493


cg11239720 cg23942526 cg10568624 cg07217499 cg03405983 cg25590826 cg10292855


cg15440941 cg15084543 cg05036656 cg00167670 cg18396984 cg00642460 cg07922606


cg23676577 cg17241310 cg00991848 cg03738025 cg09287864 cg12060499 cg14912644


cg11084334 cg22589169 cg17885226 cg15568145 cg07779387 cg02571816 cg17861230


cg26993102 cg02898293 cg00346208 cg17062829 cg15895690,










so that information useful to determine the intrinsic age of the skin of the individual is obtained.


The genomic DNA is obtained from skin cells derived from the individual. The skin sample preferably comprises the epidermis, either alone or in combination with the dermis.


Preferably >40 sites from this group are used, more preferably >45, >50, >55, >60, >65, >70, >75, >80, >85, most preferably all 89 sites of this group are used.


Preferably the loci that are observed are:














cg02273797 cg22593953 cg23213887 cg20234007 cg26492368 cg06470727 cg13612317


cg09432376 cg12530994 cg05457221 cg04766371 cg03614721 cg22624391 cg27369542


cg18322569 cg27284120 cg21303763 cg05238606 cg16300030 cg00085493 cg11239720


cg23942526 cg10568624 cg07217499 cg03405983 cg25590826 cg10292855 cg15440941


cg15084543 cg05036656 cg00167670 cg18396984 cg00642460 cg07922606 cg23676577


cg17241310 cg00991848 cg03738025 cg09287864 cg12060499 cg14912644 cg11084334


cg22589169 cg17885226 cg15568145 cg07779387 cg02571816 cg17861230 cg26993102


cg02898293 cg00346208 cg17062829 cg15895690.









More preferably the loci that are observed are:














cg19381811 cg19670290 cg15393490 cg01465824 cg04999352 cg09046979 cg10426318


cg09077126 cg24374161 cg14896948 cg14412967 cg16937583 cg17508941 cg24757926


cg03936449 cg17953764 cg00442430 cg06621744 cg08076830 cg06882058 cg25351606


cg02662828 cg20897936 cg07878486 cg21992250 cg06335867 cg15171839 cg09017434


cg04044664 cg20442599 cg15488596 cg10384245 cg23368787 cg07960624 cg08622677


cg13848598.









In an alternative embodiment, the cytosine methylation in the genomic DNA is assessed wherein the genomic DNA is within 20 kBp of the CpG locus designation listed above, preferably within 15 kBp, more preferably within 10 kBp, yet more preferably within 5 kBp, even more preferably within 1 kBp, most preferably within 0.5 kBp.


In a second aspect, the invention provides a kit for obtaining information useful to determine the intrinsic age of the skin of an individual, the kit comprising:

    • primers or probes specific for >30 genomic DNA sequences in a biological sample, wherein the genomic DNA sequences comprise CpG loci in the genomic DNA selected from the group consisting only of the following CpG locus designations:














cg19381811 cg19670290 cg15393490 cg01465824 cg04999352 cg09046979 cg10426318


cg09077126 cg24374161 cg14896948 cg14412967 cg16937583 cg17508941 cg24757926


cg03936449 cg17953764 cg00442430 cg06621744 cg08076830 cg06882058 cg25351606


cg02662828 cg20897936 cg07878486 cg21992250 cg06335867 cg15171839 cg09017434


cg04044664 cg20442599 cg15488596 cg10384245 cg23368787 cg07960624 cg08622677


cg13848598 cg02273797 cg22593953 cg23213887 cg20234007 cg26492368 cg06470727


cg13612317 cg09432376 cg12530994 cg05457221 cg04766371 cg03614721 cg22624391


cg27369542 cg18322569 cg27284120 cg21303763 cg05238606 cg16300030 cg00085493


cg11239720 cg23942526 cg10568624 cg07217499 cg03405983 cg25590826 cg10292855


cg15440941 cg15084543 cg05036656 cg00167670 cg18396984 cg00642460 cg07922606


cg23676577 cg17241310 cg00991848 cg03738025 cg09287864 cg12060499 cg14912644


cg11084334 cg22589169 cg17885226 cg15568145 cg07779387 cg02571816 cg17861230


cg26993102 cg02898293 cg00346208 cg17062829 cg15895690;










and
    • a reagent used in:
        • a genomic DNA polymerization process;
        • a genomic DNA hybridization process;
        • a genomic DNA direct sequencing process;
        • a genomic DNA bisulphite conversion process; or
        • a genomic DNA pyrosequencing process.


Preferably the primers or probes are specific for >40 of the genomic DNA sequences in a biological sample, more preferably >45, >50, >55, >60, >65, >70, >75, >80, >85, most preferably the primers or probes are specific for all 89 sites of this group.


Preferably primers or probes are specific for genomic DNA sequences in a skin sample, most preferably a skin sample comprising the epidermis, either alone or in combination with the dermis.


Preferably the primers or probes are specific for the following CpG locus designations:














cg02273797 cg22593953 cg23213887 cg20234007 cg26492368 cg06470727 cg13612317


cg09432376 cg12530994 cg05457221 cg04766371 cg03614721 cg22624391 cg27369542


cg18322569 cg27284120 cg21303763 cg05238606 cg16300030 cg00085493 cg11239720


cg23942526 cg10568624 cg07217499 cg03405983 cg25590826 cg10292855 cg15440941


cg15084543 cg05036656 cg00167670 cg18396984 cg00642460 cg07922606 cg23676577


cg17241310 cg00991848 cg03738025 cg09287864 cg12060499 cg14912644 cg11084334


cg22589169 cg17885226 cg15568145 cg07779387 cg02571816 cg17861230 cg26993102


cg02898293 cg00346208 cg17062829 cg15895690.









More preferably the primers or probes are specific for the following CpG locus designations:














cg19381811 cg19670290 cg15393490 cg01465824 cg04999352 cg09046979 cg10426318


cg09077126 cg24374161 cg14896948 cg14412967 cg16937583 cg17508941 cg24757926


cg03936449 cg17953764 cg00442430 cg06621744 cg08076830 cg06882058 cg25351606


cg02662828 cg20897936 cg07878486 cg21992250 cg06335867 cg15171839 cg09017434


cg04044664 cg20442599 cg15488596 cg10384245 cg23368787 cg07960624 cg08622677


cg13848598.









In an alternative embodiment, the cytosine methylation in the genomic DNA is assessed wherein the genomic DNA is within 20 kBp of the CpG locus designation listed above, preferably within 15 kBp, more preferably within 10 kBp, yet more preferably within 5 kBp, even more preferably within 1 kBp, most preferably within 0.5 kBp.


Preferably the kit comprises a methylation microarray.


Preferably the kit comprises a DNA sequencing method.







DETAILED DESCRIPTION OF INVENTION AND EXAMPLES

As discussed, the aging process in skin is a highly multifactorial phenomenon that also varies across the body. For example, protected skin is exposed to far fewer insults than exposed skin and it is therefore apparent that different areas of skin from the same individual will have different levels of damage and therefore different “ages”.


In the present invention we consider two forms of skin age: Intrinsic age; and Extrinsic age.


In terms of intrinsic age, the chronological age of an individual is predominant but other endogenous factors such as an individual's metabolism, diet, stress and underlying health also contribute to the age of the skin. Therefore, in the context of the present invention, intrinsic age means the age of the skin caused by endogenous factors.


In terms of extrinsic age, the inherent age will still be a fundamental component but in addition, exogenous factors such as UV radiation, pollution, drying conditions and extremes of temperature will also contribute. Therefore, in the context of the present invention, extrinsic age means the age of the skin caused predominantly by exogenous factors.


For the sake of clarity: Extrinsic age is dominated by the accumulation of ageing caused by extrinsic factors (i.e. originating from outside the exterior surface of the stratum corneum and that then penetrate into the skin through the stratum corneum), especially sun exposure (photo-ageing); whereas Intrinsic age is the degree of ageing in skin due to factors that originate endogenously; in other words ageing not due to extrinsic factors.


The present invention is directed towards the development of an epigenetic method to estimate the intrinsic age of an individual's skin.


Datasets

This application utilised three epigenetic datasets.

    • Identification: A first dataset was used to identify methylation sites associated with protected and exposed sites in skin.
    • Training: A second dataset was used to train mathematical models in which the methylation sites identified from the Identification dataset were assessed, those best able to predict the age of the skin were determined, and a predictive model was built.
    • Testing: Finally, a third test dataset was used to assess the accuracy of these methylation sites in determining the age of the skin samples and whether the use of these methylation sites was more accurate than those identified in the Horvath Study.


The first dataset (Identification) came from a single-centre, cross-sectional biopsy study involving 24 Chinese and 24 Caucasian female participants, of whom 24 were young and 24 were old. Samples of skin were collected from two different areas of each subject: an exposed area of the skin and a protected area of the skin. Sites designated as exposed were located on the lower outer arm. Protected sites were located on the upper inner arm, typically halfway between the elbow and axilla area.


The second dataset (Training) was a publicly available dataset (Bormann F. et al.: Reduced DNA methylation patterning and transcriptional connectivity define human skin aging. Aging Cell (2016) 1-9. ArrayExpress ID: EMTAB-4385). The dataset comprised a total of 108 epidermis samples. Of these, 48 samples had been isolated from punch biopsies obtained from the outer forearm of 24 young (18-27 years) and 24 old (61-78 years) volunteers, and 60 samples had been obtained as suction blister roofs from the outer forearm of 60 volunteers aged 20-79 years. All volunteers were female, Caucasian, and disease-free.


The final test dataset (Testing) was a publicly available dataset (Vandiver A. R. et al.: Age and sun exposure-related widespread genomic blocks of hypomethylation in nonmalignant skin. Genome Biology (2015) 16:80; Gene Expression Omnibus accession number: GSE51954). The dataset contained epidermal samples (N=38) from 20 Caucasian subjects. Paired punch biopsy samples, 4 mm in diameter, had been collected under local anaesthesia from the outer forearm or lateral epicanthus (exposed area) and upper inner arm (protected area).


Choice of Training and Test Datasets

The choice of datasets was guided by the following criteria. First, the training and test data needed to be from epidermal skin, either skin biopsy or epidermis only. The chosen Training data (Bormann et al.) was from skin biopsy and suction blister of the outer forearm and epidermis samples were available for the Testing (Vandiver et al.) dataset. Second, the Training data needed to be on continuous ages and the Testing data needed to have both exposed and protected samples across both young and old age groups. Third, the mean age in the Training dataset (47 years, standard deviation=21) needed to be, and was, comparable to that of the Testing dataset (51 years, standard deviation=25).


Methylation Data Quality Checks

All three datasets used bisulphite-converted DNA hybridized to the Infinium 450k human methylation BeadChip.


The methylation data from all DNA samples in the Identification dataset passed quality checks based on three array quality metrics (MAplot, Boxplot, Heatmap). Beta-values were calculated as B=R/(R+G) and M-values were calculated as M=log2(R/G), where R represents methylated signals and G unmethylated signals. An offset of 60 was added to the denominator. M-values were used to create the expression matrix. Raw data were normalized using quantile normalization. Beta-values were used for subsequent modelling and filtering the statistical results.
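The beta- and M-value calculations described above can be sketched as follows. This is a minimal illustration only: the text does not say which denominator the offset of 60 applies to, so applying it to the beta-value denominator is an assumption, and the intensity values are hypothetical.

```python
import math

OFFSET = 60  # offset stated in the text; its placement in the beta denominator is an assumption


def beta_value(methylated, unmethylated, offset=OFFSET):
    """Beta = R / (R + G + offset); lies between 0 and 1."""
    return methylated / (methylated + unmethylated + offset)


def m_value(methylated, unmethylated):
    """M = log2(R / G); both intensities must be positive."""
    return math.log2(methylated / unmethylated)


# Hypothetical probe intensities (R = methylated signal, G = unmethylated signal)
r, g = 3000.0, 1000.0
print(f"beta = {beta_value(r, g):.3f}, M = {m_value(r, g):.3f}")
```

Beta-values are bounded and easy to read as a methylated fraction, whereas M-values are unbounded and better suited to linear modelling, which is consistent with the split use of the two described above.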


Quality control and pre-processing of the Training dataset was done from raw .idat files in ‘minfi’ R package. Raw data was normalized using Subset-quantile Within Array Normalization (SWAN).


For the Testing dataset, the raw .idat files that are necessary for performing SWAN were unavailable. Therefore, the Illumina pre-processed beta values that were provided were used for subsequent analysis. The quality control and pre-processing applied on the data was also done using ‘minfi’ R package.


Technical Influences on the Data

Exploratory analysis using principal component analysis (PCA) was carried out on the Identification dataset. It was found that the between-array replicates did not cluster together, likely due to a batch effect linked to array number. Clustering analysis of the Testing dataset revealed a similar array batch effect. No technical batch effect was seen in the Training dataset.


Batch-Effect Corrected Data

The array batch effects observed in the Identification and Testing datasets were adjusted using the ComBat method (Johnson W. E. et al.: Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8(1) (2007) 118-127) following quality control, normalization and averaging of within-array replicates. The resulting datasets after batch correction showed no clustering on array. The remaining biological effects were still present and tended to be the main effects in the data.


CpG Loci Identification

As used herein, CpG loci refer to the unique identifiers found in the Illumina CpG loci database (as described in Technical Note: Epigenetics, CpG Loci Identification ILLUMINA Inc. 2010, https://www.illumina.com/documents/products/technotes/technote_cpg_loci_identification.pdf). These CpG site identifiers therefore provide a consistent and deterministic CpG loci database to ensure uniformity in the reporting of methylation data.


Performance of Horvath's Epigenetic Clock in Predicting Age of Sun-Protected Skin

The age predictor from the Horvath Study (which uses the 353 CpG sites discussed above) was run against the exposed (Se) and protected (Sp) samples of the Testing dataset. The performance of the Horvath model was assessed using linear regression, from which an R2 (“rho” or “ρ”) was obtained. Median error (predicted vs. actual age) was also calculated. The results are provided in Table 1.
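The two performance measures named above can be sketched as follows. This is an illustrative reading rather than the exact procedure used: R2 is taken as the coefficient of determination of a least-squares line through predicted versus actual age, and the median error is taken as the median absolute difference, which is an assumption because the text does not say whether the error is signed. The ages in the example are hypothetical.

```python
from statistics import mean, median


def r_squared(actual, predicted):
    """Coefficient of determination of a least-squares line predicted ~ actual."""
    mx, my = mean(actual), mean(predicted)
    sxy = sum((x - mx) * (y - my) for x, y in zip(actual, predicted))
    sxx = sum((x - mx) ** 2 for x in actual)
    syy = sum((y - my) ** 2 for y in predicted)
    return (sxy * sxy) / (sxx * syy)


def median_error(actual, predicted):
    """Median absolute difference between predicted and actual age (assumed unsigned)."""
    return median(abs(p - a) for a, p in zip(actual, predicted))


# Hypothetical ages, for illustration only
actual_age = [20, 40, 60, 80]
predicted_age = [25, 41, 57, 70]
print(r_squared(actual_age, predicted_age), median_error(actual_age, predicted_age))
```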









TABLE 1

Predicted ages of exposed and protected skin samples using the age predictor from the Horvath Study.

                              Predicted age
Actual age                    Exposed (Se)    Protected (Sp)
20                            21.32           22.66
21                            31.26           26.20
22                            25.04           35.02
25                            28.63           30.63
27                            40.24           38.89
28                            25.55           30.63
29                            31.61           38.17
30                            36.08           36.95
34/30*                        34.77           33.73
65                            47.49           53.96
65                            55.71           48.37
67                            54.71           51.31
69                            50.26           63.49
70                            54.79           58.56
72                            56.99           65.79
74                            59.80           62.51
83                            47.39           66.47
84                            47.76           66.82
90                            55.96           68.15
Average age: 51.32/51.11      42.39           47.28

*Se and Sp samples unpaired. Age of the exposed subject is 34, the age of the protected subject is 30.






It can be seen that for 14 out of the 19 subjects the Horvath model calculated exposed samples as being younger than protected samples. This is not correct, because samples subjected to exposure such as UV radiation are expected to be older than those protected from UV damage.


Average age acceleration on the predicted age reveals the sun-exposed skin samples to have an age 9 years younger than the chronological age, which goes against the known physiology that exposure, especially sun exposure, causes premature ageing of skin. In addition, a model that is related to intrinsic ageing only would be expected to give approximately similar ages for both the protected and exposed samples.


Additionally, the protected skin samples were found to have an age 4 years younger than the chronological age, which is an underestimation: the age of protected skin would be expected to be approximately the same as the chronological age of the person from whom the sample was taken.
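The average offsets of about 9 and 4 years quoted above can be reproduced from the Table 1 values. The sketch below is a simple arithmetic check, not the original analysis code; the variable names are illustrative.

```python
from statistics import mean

# (actual age of exposed subject, actual age of protected subject,
#  predicted age Se, predicted age Sp), transcribed from Table 1.
# The ninth row is the unpaired pair: exposed subject 34, protected subject 30.
rows = [
    (20, 20, 21.32, 22.66), (21, 21, 31.26, 26.20), (22, 22, 25.04, 35.02),
    (25, 25, 28.63, 30.63), (27, 27, 40.24, 38.89), (28, 28, 25.55, 30.63),
    (29, 29, 31.61, 38.17), (30, 30, 36.08, 36.95), (34, 30, 34.77, 33.73),
    (65, 65, 47.49, 53.96), (65, 65, 55.71, 48.37), (67, 67, 54.71, 51.31),
    (69, 69, 50.26, 63.49), (70, 70, 54.79, 58.56), (72, 72, 56.99, 65.79),
    (74, 74, 59.80, 62.51), (83, 83, 47.39, 66.47), (84, 84, 47.76, 66.82),
    (90, 90, 55.96, 68.15),
]

mean_actual_se = mean(r[0] for r in rows)
mean_actual_sp = mean(r[1] for r in rows)
mean_pred_se = mean(r[2] for r in rows)
mean_pred_sp = mean(r[3] for r in rows)

# Positive gap = predicted age is younger than chronological age on average.
gap_exposed = mean_actual_se - mean_pred_se
gap_protected = mean_actual_sp - mean_pred_sp
print(f"exposed: {gap_exposed:.1f} years, protected: {gap_protected:.1f} years")
```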


It can therefore be concluded that the 353 CpG sites from the Horvath Study are not able to recognize the difference between exposed and protected skin types, nor intrinsic ageing effects in exposed skin, incorrectly predict sun-damaged skin as younger than sun-protected, and underestimate the age of the protected samples.


It was also found that the 353 CpG sites identified by the Horvath Study performed poorly in terms of the accuracy score for protected samples.


The accuracy score for protected samples was:





ρ=0.93 (error=16.6 years).


It can therefore be appreciated that an improved epigenetic method for determining the intrinsic age of skin is required.


Identification of Methylation Sites Associated with Protected Sites (from the Identification Dataset)


A total of 5 comparisons, using different linear models, were performed on the normalized, batch-corrected data for the purpose of generating extrinsic and intrinsic age lists (Table 2). A statistical cut-off of multiple-testing-corrected P-value (adjusted P-value, adjP, Benjamini-Hochberg) <0.05 together with a delta-beta >=0.05 was applied.
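The cut-off described above can be sketched as a Benjamini-Hochberg adjustment followed by a joint filter on adjusted P-value and delta-beta. This is a minimal sketch: the probe names and values are hypothetical, and treating delta-beta as an absolute difference is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def select_sites(results, alpha=0.05, min_delta_beta=0.05):
    """results: list of (probe_id, p_value, delta_beta) tuples."""
    adj = benjamini_hochberg([p for _, p, _ in results])
    return [pid for (pid, _, db), q in zip(results, adj)
            if q < alpha and abs(db) >= min_delta_beta]


# Hypothetical probes: only probe_a passes both thresholds.
example = [("probe_a", 0.001, 0.10), ("probe_b", 0.001, 0.01), ("probe_c", 0.5, 0.20)]
print(select_sites(example))
```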


A high number of differentially methylated CpG sites were detected for the comparison of young versus old in exposed sites (Comparison 1: n=10,649). Relatively fewer differentially methylated CpG sites were identified for the comparison of age group versus site interaction (Comparison 5: n=233).









TABLE 2

Statistical results. Number of differentially methylated sites for each of the 5 comparisons with adjusted p-value cut-off of 0.05.

No.   Comparison                        Number of differentially methylated CpG sites detected
1     Young vs. Old exposed sites       10,649
2     Young vs. Old protected sites     3,545
3     Protected vs. exposed (Young)     3,714
4     Protected vs. exposed (Old)       7,053
5     Age group: Site interaction       233









Intrinsic Site List

To identify CpG sites that capture intrinsic ageing only, Comparison 2 (Young vs. Old protected sites) results were filtered to remove probes changing by site in young or old (Comparisons 3 & 4), to remove any aging changes in protected skin that might be additionally influenced by extrinsic factors.


The resulting list contained 1,575 CpG sites. PCA on these 1,575 sites allowed identification of the sites contributing the maximum variance when classifying protected sites into young and old groups across both ethnicities. PCA loadings were used to select these variable probes: a cut-off of 0.030 applied to the loadings of the first component resulted in 322 probes capturing the maximum variability between the age groups.
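The two filtering steps above, removing site-dependent probes and then thresholding first-component loadings, can be sketched as follows. The probe names and loading values are hypothetical, and comparing the magnitude of the loading against 0.030 is an assumption, since the text does not say how negative loadings were treated.

```python
def intrinsic_candidates(age_protected, site_young, site_old):
    """Comparison 2 probes minus any probe that also differs by site (Comparisons 3 and 4)."""
    return set(age_protected) - set(site_young) - set(site_old)


def select_by_loading(loadings, cutoff=0.030):
    """Keep probes whose first-component loading magnitude reaches the cut-off."""
    return sorted(p for p, w in loadings.items() if abs(w) >= cutoff)


# Hypothetical probe lists standing in for the results of Comparisons 2, 3 and 4
comparison_2 = ["p1", "p2", "p3", "p4"]
comparison_3 = ["p2"]
comparison_4 = ["p4", "p9"]
candidates = intrinsic_candidates(comparison_2, comparison_3, comparison_4)

# Hypothetical first-component loadings for the surviving probes
pc1_loadings = {"p1": 0.045, "p3": -0.012}
selected = select_by_loading({p: pc1_loadings[p] for p in candidates})
print(sorted(candidates), selected)
```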


Intrinsic Age Predictor from Protected Sites


The 322 CpG sites identified from the Identification dataset as capturing intrinsic age changes were used to build an intrinsic age model. The same elastic net as that used in the Horvath Study was applied to the Training dataset using 10-fold cross-validation, i.e. 10 sets of size n/10 (train on 9 sets and test on 1). This was repeated 10 times and a mean “accuracy” for each iteration was obtained, giving a model for calculating age and a coefficient for each probe.
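The resampling scheme described above (10 sets of size n/10, train on 9 and test on 1, repeated 10 times) can be sketched as below. Only the splitting and averaging are shown; `fit_and_score` is a hypothetical callback standing in for fitting the elastic net on the training indices and scoring it on the held-out indices, and the shuffling seed is an illustrative choice.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_cv(n_samples, fit_and_score, k=10, repeats=10):
    """Return the mean score of each repeat of k-fold cross-validation."""
    means = []
    for rep in range(repeats):
        folds = k_fold_indices(n_samples, k, seed=rep)
        scores = []
        for held_out in folds:
            test = set(held_out)
            train = [i for i in range(n_samples) if i not in test]
            scores.append(fit_and_score(train, held_out))
        means.append(sum(scores) / k)
    return means


# 108 samples as in the Training dataset; the callback here just reports the
# held-out fold size and is a placeholder for a real model fit.
print(repeated_cv(108, lambda train, test: len(test)))
```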


Lists of predictors were arrived at by running several iterations of the model. The first iteration identified the best set of predictors. For each subsequent iteration, the predictors identified in the previous iteration were excluded from the training set in order to identify the next-best set of predictors. The iterations were repeated until the predictive accuracy, measured in terms of rho and error margin, was found to be lower than that of the Horvath model as described above.
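The iterative exclusion procedure above can be sketched as a simple loop. This is a schematic illustration under stated simplifications: accuracy is collapsed into a single score where higher is better (the text uses rho together with an error margin), and `toy_fit` is a hypothetical stand-in for the elastic net fit, not the method actually used.

```python
def iterative_predictor_sets(candidates, fit, baseline_score, max_iter=10):
    """Fit repeatedly on the remaining candidates, recording each selected set.

    fit(pool) must return (selected_sites, score). The loop stops after the
    first iteration whose score falls below baseline_score; that iteration is
    still recorded so it can be reported.
    """
    pool = set(candidates)
    history = []
    for _ in range(max_iter):
        if not pool:
            break
        selected, score = fit(pool)
        history.append((sorted(selected), score))
        if score < baseline_score or not selected:
            break
        pool -= set(selected)  # exclude this iteration's predictors from the next fit
    return history


def toy_fit(pool):
    """Hypothetical model: picks two sites and scores by how many candidates remain."""
    return sorted(pool)[:2], len(pool) / 10


history = iterative_predictor_sets(["p1", "p2", "p3", "p4", "p5", "p6"], toy_fit, 0.35)
for number, (sites, score) in enumerate(history, start=1):
    print(f"iteration {number}: {sites} score={score}")
```

In this toy run the third iteration falls below the baseline and ends the loop, mirroring the way the third predictor set is reported but not retained.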


For the intrinsic sites, 3 iterations were performed. The first identified 36 sites, the second identified 53 sites, the third identified 25 as shown in Table 3.


Resultant models where the sites from each of these 3 iterations were removed from the final intrinsic age list of 322 CpG sites were used to estimate the age of the protected samples from the Testing dataset. The results are shown in Table 4. In addition, the average ages for both sun-protected and sun exposed samples were calculated for the resultant models. The results are shown in Table 5. The accuracy of the model using 353 sites from Horvath study for predicting sun-protected age is also shown in Tables 4 and 5 (in italics) for reference.









TABLE 3

Predictor sets for Intrinsic age scores.

Iteration 1     Iteration 2     Iteration 3
(36 sites)      (53 sites)      (25 sites)
cg19381811      cg02273797      cg15110296
cg19670290      cg22593953      cg13072214
cg15393490      cg23213887      cg21322248
cg01465824      cg20234007      cg22112832
cg04999352      cg26492368      cg07833951
cg09046979      cg06470727      cg13454226
cg10426318      cg13612317      cg03183540
cg09077126      cg09432376      cg07690127
cg24374161      cg12530994      cg06048750
cg14896948      cg05457221      cg02318784
cg14412967      cg04766371      cg06448705
cg16937583      cg03614721      cg18104919
cg17508941      cg22624391      cg01141812
cg24757926      cg27369542      cg02940165
cg03936449      cg18322569      cg23500537
cg17953764      cg27284120      cg23119628
cg00442430      cg21303763      cg23479922
cg06621744      cg05238606      cg11846112
cg08076830      cg16300030      cg12052661
cg06882058      cg00085493      cg05406635
cg25351606      cg11239720      cg05991454
cg02662828      cg23942526      cg10806820
cg20897936      cg10568624      cg25984671
cg07878486      cg07217499      cg08377398
cg21992250      cg03405983      cg06699519
cg06335867      cg25590826
cg15171839      cg10292855
cg09017434      cg15440941
cg04044664      cg15084543
cg20442599      cg05036656
cg15488596      cg00167670
cg10384245      cg18396984
cg23368787      cg00642460
cg07960624      cg07922606
cg08622677      cg23676577
cg13848598      cg17241310
                cg00991848
                cg03738025
                cg09287864
                cg12060499
                cg14912644
                cg11084334
                cg22589169
                cg17885226
                cg15568145
                cg07779387
                cg02571816
                cg17861230
                cg26993102
                cg02898293
                cg00346208
                cg17062829
                cg15895690

















TABLE 4

Accuracy of models

Model                                                         R2 values (sun exposed sites)
Model using 322 sites (final intrinsic age list)              0.96
Model using 286 sites (36 sites from iteration 1 removed)     0.94
Model using 233 sites (53 sites from iteration 2 removed)     0.89
Model using 353 sites from Horvath study                      0.93









According to the accuracy measures shown in Table 4, the models of intrinsic age that included the sites identified in iterations 1 and 2 performed with higher or equivalent accuracy (R2=0.96, error=5.7 years and R2=0.94, error=12.6 years) and lower error than the model using the 353 Horvath sites (R2=0.93, error=16.6 years). The remaining 208 sites (which included the 25 sites from iteration 3) performed with lower accuracy than the 353 Horvath sites. Therefore, the 89 sites of iterations 1 and 2 were better at predicting intrinsic age than the Horvath model.









TABLE 5

Average age for models

                                                              Average age
Model                                                         Sun-exposed    Sun-protected    Difference
Model using 322 sites (final intrinsic age list)              50.14          50.34            −0.20
Model using 286 sites (36 sites from iteration 1 removed)     51.02          48.84            2.19
Model using 233 sites (53 sites from iteration 2 removed)     49.65          47.82            1.83
Model using 353 sites from Horvath study                      42.39          47.28            −4.89









It is expected that the intrinsic age of samples from sun-exposed sites will be similar to that of samples from sun-protected sites. As can be seen from Table 5, the models from this study have a smaller difference between average age for sun-exposed and sun-protected sites than the Horvath model. This demonstrates that the models described herein are better than the Horvath model in predicting intrinsic age.
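The comparison drawn from Table 5 can be checked with a few lines of arithmetic. The sketch below recomputes the gap between the sun-exposed and sun-protected averages for each model from the tabulated values; the dictionary keys are shorthand labels introduced here for illustration.

```python
# (sun-exposed average, sun-protected average), transcribed from Table 5
averages = {
    "322 sites": (50.14, 50.34),
    "286 sites": (51.02, 48.84),
    "233 sites": (49.65, 47.82),
    "Horvath 353 sites": (42.39, 47.28),
}

# Absolute gap between exposed and protected averages for each model
gaps = {name: abs(exposed - protected) for name, (exposed, protected) in averages.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:.2f} years")

# Models whose gap is smaller than that of the Horvath model
smaller = [name for name in gaps
           if name != "Horvath 353 sites" and gaps[name] < gaps["Horvath 353 sites"]]
print(smaller)
```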


It can therefore be seen that the use of CpG sites selected from those of iterations 1 and 2 as shown in Table 3 delivers better accuracy when determining the intrinsic age of skin. Therefore, the present invention provides >30 of these 89 sites for use in predicting the intrinsic age of skin. The invention also provides the 53 sites of iteration 2 as a preferred group. The invention further provides the 36 sites of iteration 1 as the most preferred group.


As an alternative within the invention, any of the foregoing CpG sites may be replaced by its closest gene, which is then used instead.


Table 6 provides annotations of the 89 sites identified in Iterations 1 & 2, including the closest gene names. The annotations are as described in Price et al., “Additional annotation enhances potential for biologically-relevant analysis of the Illumina Infinium HumanMethylation450 BeadChip array”, Epigenetics & Chromatin 2013, 6:4, using Human Genome version HG19.









TABLE 6
Annotations of CpG sites identified in Iterations 1 & 2

CpG Site ID   Chromosome No.   Position of Methylation on Chr   Closest Gene Name
cg19381811    chr3             49851713                         UBA7
cg19670290    chr15            91477210                         UNC45A
cg15393490    chr1             207996459                        mir-29b-2
cg01465824    chr6             1930446                          AK024936
cg04999352    chr11            63304614                         RARRES3
cg09046979    chr16            28333134                         SBK1
cg10426318    chr6             30647588                         DHX16
cg09077126    chr10            72015695                         PPA1
cg24374161    chr11            46582058                         AMBRA1
cg14896948    chr7             51096885                         COBL
cg14412967    chr7             51096971                         COBL
cg16937583    chr3             9997177                          PRRT3
cg17508941    chr7             19183280                         BC043576
cg24757926    chr3             66643528                         LRIG1
cg03936449    chr16            56696967                         MT1G
cg17953764    chr4             48492845                         ZAR1
cg00442430    chr19            49841458                         AK097351
cg06621744    chr14            37052470                         NKX2-8
cg08076830    chr18            44774923                         SKOR2
cg06882058    chr1             243646787                        Mir_584
cg25351606    chr6             100917427                        SIM1
cg02662828    chr4             48492848                         ZAR1
cg20897936    chr6             28554741                         SCAND3
cg07878486    chr19            58951885                         ZNF132
cg21992250    chr11            60718709                         SLC15A3
cg06335867    chr7             8482325                          NXPH1
cg15171839    chr5             92924603                         NR2F1
cg09017434    chr5             16179660                         MARCH11
cg04044664    chr5             132150117                        SOWAHA
cg20442599    chr6             108479500                        NR2E1
cg15488596    chr1             185253495                        IVNS1ABP
cg10384245    chr18            909086                           ADCYAP1
cg23368787    chr19            36049342                         ATP4A
cg07960624    chr8             119208486                        EXT1
cg08622677    chr12            3601306                          AK125333
cg13848598    chr10            115804578                        ADRB1
cg02273797    chr1             33191153                         KIAA1522
cg22593953    chr3             186781783                        ST6GAL1
cg23213887    chr5             127872767                        FBN2
cg20234007    chr18            74535840                         ZNF236
cg26492368    chr10            22634733                         SPAG6
cg06470727    chr2             26723062                         OTOF
cg13612317    chr10            32345864                         Y_RNA
cg09432376    chr22            36044226                         APOL6
cg12530994    chr10            5136782                          AKR1C3
cg05457221    chr10            134272437                        C10orf91
cg04766371    chr10            43857641                         FXYD4
cg03614721    chr5             8700943                          BC032891
cg22624391    chr11            1937457                          TNNT3
cg27369542    chr17            14207890                         MGC12916
cg18322569    chr1             91182777                         BARHL2
cg27284120    chr19            52302551                         FPR3
cg21303763    chr12            3309708                          AK056228
cg05238606    chr16            68907950                         TMCO7
cg16300030    chr6             32908980                         HLA-DMB
cg00085493    chr1             208040203                        AK123177
cg11239720    chr4             152967415                        BC040914
cg23942526    chr7             27882598                         TAX1BP1
cg10568624    chr9             100619991                        FOXE1
cg07217499    chr12            2416339                          CACNA1C
cg03405983    chr8             143858548                        LYNX1
cg25590826    chr15            74557537                         CCDC33
cg10292855    chr16            8807018                          ABAT
cg15440941    chr10            527681                           DIP2C
cg15084543    chr1             79472408                         ELTD1
cg05036656    chr4             41875470                         BC025350
cg00167670    chr4             13922944                         LOC152742
cg18396984    chr14            37049893                         NKX2-8
cg00642460    chr5             176827697                        PFN3
cg07922606    chr6             26225389                         HIST1H3E
cg23676577    chr14            37049565                         NKX2-8
cg17241310    chr1             91182856                         BARHL2
cg00991848    chr16            2014270                          RPS2
cg03738025    chr6             105388694                        LIN28B
cg09287864    chr7             17274056                         AHR
cg12060499    chr14            102172296                        LINC00239
cg14912644    chr2             157176601                        NR4A2
cg11084334    chr3             9594264                          LHFPL4
cg22589169    chr13            111227891                        RAB20
cg17885226    chr6             105388731                        LIN28B
cg15568145    chr1             14113203                         AK124197
cg07779387    chr2             95873465                         Mir_720
cg02571816    chr19            38747378                         PPP1R14A
cg17861230    chr19            18343901                         PDE4C
cg26993102    chr6             30228245                         HLA-L
cg02898293    chr20            25061762                         VSX1
cg00346208    chr1             20669905                         VWA5B1
cg17062829    chr4             147558089                        POU4F2
cg15895690    chr14            60982635                         SIX6
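Where the closest gene is used in place of a CpG site, the annotations of Table 6 can serve as a lookup. The following is a minimal sketch using a handful of rows copied from Table 6; the dictionary name `CLOSEST_GENE` and the function `group_by_gene` are hypothetical and only illustrate the mapping.

```python
from collections import defaultdict

# A few rows copied from Table 6: CpG site ID -> closest gene name.
CLOSEST_GENE = {
    "cg19381811": "UBA7",
    "cg14896948": "COBL",
    "cg14412967": "COBL",
    "cg17953764": "ZAR1",
    "cg02662828": "ZAR1",
}


def group_by_gene(cpg_ids, annotation=CLOSEST_GENE):
    """Group CpG site IDs under their closest gene, preserving input order."""
    genes = defaultdict(list)
    for cpg in cpg_ids:
        genes[annotation[cpg]].append(cpg)
    return dict(genes)


print(group_by_gene(["cg14896948", "cg14412967", "cg19381811"]))
```

Several CpG sites can share one closest gene (for example COBL and ZAR1 above), so a gene-level panel may contain fewer entries than the site-level panel it replaces.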








Claims
  • 1. A method for obtaining information useful to determine the intrinsic age of skin of an individual, the method comprising the steps of: (a) obtaining genomic DNA from skin cells derived from the individual; and (b) observing cytosine methylation of >30 CpG loci in the genomic DNA selected from the group consisting of:
  • 2. A method according to claim 1 wherein >40 sites from the group are used, more preferably >45, >50, >55, >60, >65, >70, >75, >80, >85, most preferably all 89 sites.
  • 3. A method according to claim 1 wherein the loci that are observed are the following CpG loci:
  • 4. A method according to claim 1 wherein the loci that are observed are the following CpG loci:
  • 5. A kit for obtaining information useful to determine the intrinsic age of skin of an individual, the kit comprising: primers or probes specific for >30 genomic DNA sequences in a biological sample, wherein the genomic DNA sequences comprise CpG loci in the genomic DNA selected from the group consisting only of the following CpG Locus designations:
  • 6. A kit according to claim 5 wherein the primers or probes are specific for >40 of the genomic DNA sequences in a biological sample, more preferably >45, >50, >55, >60, >65, >70, >75, >80, >85, most preferably all 69.
  • 7. A kit according to claim 5 wherein primers or probes are specific for genomic DNA sequences in a skin sample.
  • 8. A kit according to claim 5 wherein the primers or probes are specific for the following CpG locus designations:
  • 9. A kit according to claim 5 wherein the primers or probes are specific for the following CpG locus designations:
Priority Claims (1)
Number Date Country Kind
18177976.0 Jun 2018 EP regional
PCT Information
Filing Document Filing Date Country Kind
PCT/EP2019/065709 6/15/2019 WO 00