The invention relates generally to the use of gene expression marker gene sets that are correlated with Alzheimer's disease progression and to methods of using the same.
During normal aging, the brain undergoes many changes resulting in a gradual but detectable cognitive decline that is associated with limited neuronal loss and glial proliferation in the cortex and gross weight decrease of 2-3% per decade (Drachman, D. A., 2006, Neurology, 67: 1340-1352; Yankner, B. A., et al., 2008, Annu. Rev. Pathol., 3:41-66). On the molecular level, the mechanisms driving aging of the brain are not yet understood, but likely include mitochondrial DNA damage (Lu, T., et al., 2004, Nature 429:883-891) and chronic oxidative stress (Lin, M. T., et al., 2006, Nature 443:787-795). This slow decline in cognitive ability does not interfere with normal function through at least 100 years of life. In contrast, Alzheimer's disease (AD) is a debilitating neurodegenerative disorder associated with a rapid cognitive decline with an average survival of 5-10 years after the diagnosis (Blennow, K., et al., 2006, Lancet, 368:387-403; Cummings, J. L., 2004, N. Engl. J. Med., 351:56-67; Jakob-Roetne, R. and Jacobsen, H., 2009, Angew. Chem. Int. Ed. Engl., 48:3030-3059). Age is the main AD risk factor, with almost half of the population over age 85 affected. However, AD clearly differs from normal aging in that it causes dramatic loss of synapses, neurons and brain activity in specific anatomical regions, and results in massive atrophy and gliosis (Drachman, D. A., 2006; Herrup, K., 2010, J. Neurosci., 30:16755-16762).
The factors that cause some individuals to depart from the relatively benign process of normal brain aging and instead undergo the pathological cascade that leads to AD are unknown. A number of genetic risk factors for AD have been proposed (Waring, S. C. and Rosenberg, R. N., 2008, Arch. Neurol., 65:329-334; Bertram, L. and Tanzi, R. E., 2008, Nat. Rev. Neurosci., 9:768-778; Harold, D., et al., 2009, Nat. Genet., 41:1088-1093; Lambert, J. C., et al., 2009, Nat. Genet., 41:1094-1099), however, only the apolipoprotein E (APOE) ε4-allele, which lowers the age of onset and accelerates the cognitive decline, has a large effect (Kleiman, T., et al., 2006, Dement. Geriatr. Cogn. Disord., 22:73-82; Stone, D. J., et al., 2010, Pharmacogenomics J., 10:161-164). Pathologically, AD is characterized by the presence of two insoluble protein aggregates, senile plaques formed from the peptide β-amyloid (Aβ) and neurofibrillary tangles composed of hyperphosphorylated tau protein (Goedert, M. and Spillantini, M. G., 2006, Science, 314:777-781). In rare familial AD, the cause of disease is autosomal dominant mutations in Aβ precursor protein (APP) or the Aβ-producing enzymes presenilins (PSEN1 or PSEN2), which are all thought to lead to increased levels of aggregated Aβ (Waring, S. C. and Rosenberg, R. N., 2008; Bertram, L. and Tanzi, R. E., 2008; Hardy, J. and Selkoe, D. J., 2002, Science, 297:353-356). Likewise, mutations in tau (MAPT) that predispose it to aggregation can cause specific diseases that involve profound neurodegeneration and dementia (Ballatore, C., et al., 2007, Nat. Rev. Neurosci., 8:663-672; Wolfe, M. S., 2009, J. Biol. Chem., 284: 6021-6025). Thus, like in other neurodegenerative diseases such as Huntington's disease (HD) and Parkinson's disease, the formation of toxic insoluble aggregates seems to be a key pathogenic step. 
It is not known why these Aβ and tau aggregates accumulate in AD patients, nor how they contribute to neuronal dysfunction, particularly as to Aβ deposits, which can often be found in the brains of elderly non-demented subjects (Schmitt, F. A., et al., 2000, Neurology, 55:370-376).
An important goal of AD research is to identify interventions that maintain brain function, potentially by inhibiting the formation or improving the clearance of neurotoxic aggregates, or by promoting resistance to or recovery from damage. A number of biological processes have been associated with AD including cholesterol metabolism, inflammation, and response to misfolded proteins, such as increased expression of heat shock proteins. The link with lipid metabolism is supported, for example, by the essential role of APOE in lipid transport in the brain (Kleiman, T., et al., 2006; Stone, D. J., et al., 2010). These processes have not been unequivocally ordered into a pathogenic cascade and the molecular mediators and correlates of each are largely unknown.
Microarray gene expression profiling provides an opportunity to observe processes that are common for normal aging, AD, and other neurodegenerative diseases, as well as to detect the differences between these conditions and disentangle their relationships. Towards that end, Applicants profiled post-mortem samples from non-demented and AD subjects and used gene co-expression network analysis to distinguish several major processes involved in brain aging and disease and to define the corresponding signature scores quantitatively. The invention herein is directed to biomarkers correlated to the underlying pathology, signature scores that can be used to monitor disease progression and to develop animal models for the study of disease pathology and the evaluation of therapeutics for the treatment of AD.
In one aspect, the invention comprises four transcriptional biomarkers, BioAge (biological age), Alz (Alzheimer), Inflame (inflammation), and NdStress (neurodegenerative stress), that define gene expression variation in Alzheimer's disease (AD). BioAge captures the first principal component of variation and includes genes statistically associated with neuronal loss, glial activation, and lipid metabolism. BioAge typically increases with chronological age, but in AD it is prematurely expressed, as if the subjects were 140 years old. A component of BioAge, Lipa, contains the AD risk factor APOE and reflects an apparent early disturbance in lipid metabolism. The rate of biological aging in AD patients, which was not explained by BioAge, was instead associated with NdStress, which included genes related to protein folding and metabolism. Inflame, comprised of inflammatory cytokines and microglial genes, was broadly activated and appeared early in the disease process. In contrast, the disease-specific Alz biomarker was selectively present only in the affected areas of the AD brain, appeared later in pathogenesis, and was enriched in genes associated with the signaling and cell adhesion changes that occur during the epithelial-to-mesenchymal transition (EMT).
In another aspect of the invention, the biomarkers can be used to calculate a biomarker score, or signature score, that can be used to diagnose Alzheimer's disease (AD) and monitor disease progression.
In still another aspect of the invention, the signature scores can be used to select animal models for the disease that can be used for the development and evaluation of therapeutics to treat Alzheimer's disease.
Microarray gene expression profiling provides an opportunity to observe the processes that are common for normal aging, Alzheimer's disease (AD), and other neurodegenerative diseases, as well as to detect the differences between these conditions and disentangle their relationships. Applicants profiled several hundred post-mortem samples assembled in the Harvard Brain Tissue Resource Center (HBTRC, McLean Hospital, Belmont, Mass.) and used gene co-expression network analysis (Zhang, B. and Horvath, S., 2005, Stat. Appl. Genet. Mol. Biol., 4: Article 17; Tamayo, P. et al., 2007, Proc. Natl. Acad. Sci. USA, 104:5959-64; Carvalho, C. et al., 2008, J. Amer. Stat. Assn. 103:1438-1456; Oldham, M. C. et al., 2008, Nat. Neurosci., 11:1271-82; Miller, J. A., et al., 2008, J. Neurosci., 28:1410-20) to distinguish several major processes involved in brain aging and disease and to define, qualitatively and quantitatively, a set of biomarkers and their corresponding signature scores. The correlation analysis of the signature scores between three profiled brain regions revealed systemic effects of the same disease processes on different brain regions. Applicants herein also provide a model of Alzheimer's disease progression that specifies the complex sequence of molecular pathological events associated with the disease. The inventive biomarkers and methods, i.e., signature scores, described herein can also be used to select animal models for the development and evaluation of therapeutics for the treatment of Alzheimer's disease (AD).
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood to one of ordinary skill in the art to which this invention belongs. The following definitions are provided in order to provide clarity with respect to terms as they are used in the specification and claims to describe various embodiments of the present invention.
As used herein, the term “Alzheimer's disease” or “AD” refers to any disease characterized by the accumulation of amyloid deposits in which the pathology results in some form of dementia or cognitive impairment. Amyloid deposits comprise a peptide, referred to as amyloid beta peptide, that aggregates to form an insoluble mass. Diseases characterized by amyloid deposits include, but are not limited to, Alzheimer's disease (AD), mild cognitive impairment, and other forms of memory loss or dementia.
As used herein, the term “normal” or “non-demented” refers to a subject who has not been previously diagnosed or who has not previously exhibited any clinical pathology related to Alzheimer's disease or any other form of cognitive impairment.
As used herein, the term “biomarker” refers to a list of genes known to be associated or correlated for which the gene expression in a particular tissue can be measured. The gene expression values for the correlated genes making up the biomarker can be used to calculate the signature score (Score) for the biomarker.
As used herein, the term “gene signature” or “signature score” or “Score” refers to a set of one or more differentially expressed genes that are statistically significant and characteristic of the biological differences between two or more cell samples, e.g., normal, non-demented and AD cells, cell samples from different cell types or tissue, or cells exposed to an agent or not. A signature may be expressed as a number of individual unique probes complementary to signature genes whose expression is detected when a cRNA product is used in microarray analysis or in a PCR reaction. A signature may be exemplified by a particular set of genes making up a biomarker. One means to calculate a signature or Score is provided in Example 4, in which the Score is equivalent to the average gene expression of the up-regulated genes minus the average gene expression of the down-regulated genes.
As used herein, the term “measuring expression levels,” or “obtaining expression level,” “detecting an expression level” and the like refers to methods that quantify a gene expression level of, for example, a transcript of a gene or a protein encoded by a gene, as well as methods that determine whether a gene of interest is expressed at all. Thus, an assay which provides a “yes” or “no” result without necessarily providing quantification of an amount of expression is an assay that “measures expression” as that term is used herein. Alternatively, a measured or obtained expression level may be expressed as any quantitative value, for example, a fold-change in expression, up or down, relative to a control gene or relative to the same gene in another sample, or a log ratio of expression, or any visual representation thereof, such as, for example, a “heatmap” where a color intensity is representative of the amount of gene expression detected. Exemplary methods for detecting the level of expression of a gene include, but are not limited to, Northern blotting, dot or slot blots, reporter gene matrix (see, e.g., U.S. Pat. No. 5,569,588), nuclease protection, RT-PCR, microarray profiling, differential display, 2D gel electrophoresis, SELDI-TOF, ICAT, enzyme assay, antibody assay, and the like.
As used herein, the term “average gene expression” refers to the arithmetic average of logarithm-transformed values of gene expression levels as measured on any applicable platform, as listed above.
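By way of illustration only, the Score described above (the average gene expression of the up-regulated genes minus the average gene expression of the down-regulated genes) can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure of Example 4: the function name `signature_score`, the gene labels, and the choice of a base-10 logarithm applied to raw intensities are all hypothetical choices made for the example.

```python
import math
from statistics import mean

def signature_score(expression, up_genes, down_genes):
    """Mean log10 expression of up-regulated genes minus that of down-regulated genes.

    `expression` maps a gene label to a positive expression level for one sample.
    The log10 transform is an assumption; the definition only says "logarithm-transformed".
    """
    def average_log_expression(genes):
        values = [math.log10(expression[g]) for g in genes if g in expression]
        if not values:
            raise ValueError("none of the listed genes were measured in this sample")
        return mean(values)

    return average_log_expression(up_genes) - average_log_expression(down_genes)

# Hypothetical sample with made-up gene labels and intensities.
sample = {"GENE_A": 100.0, "GENE_B": 1000.0, "GENE_C": 10.0, "GENE_D": 1.0}
score = signature_score(sample, ["GENE_A", "GENE_B"], ["GENE_C", "GENE_D"])
print(score)
```

If the expression values are already log ratios, the logarithm step would be omitted and the values averaged directly.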
As used herein, the term “classifier” refers to a property of a biomarker to distinguish groups of subjects, as shown by a significant p-value in parametric (ANOVA) or non-parametric (Kruskal-Wallis) testing. For example, the classifier can be applied to samples collected from (1) subjects with AD and control subjects, or (2) different neurodegenerative disease animal models.
As used herein, the term “sample” refers to a tissue specimen collected from human subjects or animal models.
As used herein, the term “subject” refers to an organism, such as a mammal, or to a cell sample, tissue sample or organ sample derived therefrom, including, for example, cultured cell lines, a biopsy, a blood sample, or a fluid sample containing a cell or a plurality of cells. In some instances, the subject or sample derived therefrom comprises a plurality of cell types. The organism may be an animal, including, but not limited to, an animal such as a mouse, rat, or dog, and is usually a mammal, such as a human.
To identify gene expression changes corresponding to AD, we analyzed RNA specimens from more than 600 individuals with pathologically confirmed diagnoses of AD, Huntington's disease (HD), or age-matched controls (average post-mortem interval of 18 hours) using microarrays with over 40,000 unique probes. The brain regions profiled included dorsolateral prefrontal cortex (PFC), visual cortex (VC), and cerebellum (CR). These regions were chosen in part because, in AD, the PFC is impacted by the pathology while the latter two regions remain largely intact throughout most of the disease (Braak, H. and Braak, E., 1991, Acta. Neuropathol., 82: 239-259). The data were then analyzed by principal component analysis to assess the major patterns of gene expression variability. Genes that were highly correlated with the principal components were used to build signatures and biologically annotate the major sources of variance.
Analysis of differential gene expression in prefrontal cortex between non-demented individuals and AD patients revealed massive changes, with more than 18,000 transcripts significantly regulated (ANOVA p<10−6) by more than 28% (
Tables 1-7 that follow show representative correlated genes that make up each biomarker and the average expression of which was used to calculate the biomarker score, i.e. the signature score. Tables 2 and 3 show the representative genes that were most up- (+BioAge) and down-regulated (-BioAge) with the biomarker, BioAge, and that were selected based on the strongest absolute correlations with PC 1.
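The selection of genes by their absolute correlation with the first principal component can be illustrated with a small sketch. It is offered only as an assumption-laden example: the power-iteration routine stands in for whatever principal component software was actually used, and the function names and gene labels are hypothetical. The sign of a principal component is arbitrary, so which arm is called "up" must be fixed by convention (for example, by correlation with chronological age).

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pc1_sample_scores(profiles, iterations=100):
    """Approximate PC1 sample scores (up to sign and scale) by power iteration.

    `profiles` maps a gene label to its expression values across samples.
    """
    rows = []
    for values in profiles.values():
        gene_mean = sum(values) / len(values)
        rows.append([v - gene_mean for v in values])  # center each gene
    n = len(rows[0])
    v = [float(i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(iterations):
        # Multiply by the sample-by-sample Gram matrix without forming it.
        w = [0.0] * n
        for row in rows:
            dot = sum(a * b for a, b in zip(row, v))
            for j in range(n):
                w[j] += dot * row[j]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def rank_genes_by_pc1(profiles):
    """Return (gene, correlation with PC1) pairs, strongest absolute correlation first."""
    pc1 = pc1_sample_scores(profiles)
    corr = {gene: pearson(values, pc1) for gene, values in profiles.items()}
    return sorted(corr.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical expression profiles across five samples.
profiles = {
    "GENE_UP": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_DOWN": [5.0, 4.0, 3.0, 2.0, 1.0],
    "GENE_FLAT": [3.0, 3.1, 2.9, 3.0, 3.05],
}
ranked = rank_genes_by_pc1(profiles)
for gene, r in ranked:
    print(gene, round(r, 3))
```

Genes at the top of the ranking with opposite correlation signs would populate the two arms of the signature in such a scheme.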
It is useful to ascribe a signature score based on the average expression levels for all included genes as a composite measure of the signature. Applicants refer to the PC1 signature score herein as BioAge (biological age). Without wishing to be bound by any theory, Applicants believe that the BioAge signature score (herein the “Score”) of each brain tissue sample is a more precise and objective measure of its aging level than chronological age. Most of the AD subjects attained much larger values for BioAge than normal subjects (AUROC=0.92). Comparison of the Score for BioAge for AD and non-demented individuals at different chronological age groups revealed a very significant difference at younger ages, which decreased in chronologically older age groups. While the Score for BioAge of non-demented individuals gradually increased with age, AD patients showed consistently higher Scores for BioAge regardless of chronological age (
As an independent test of the power of BioAge, that is, the average gene expression or Score for this biomarker, to predict normal chronological age, Applicants applied this biomarker to a cohort of prefrontal cortex samples from non-demented individuals (Gene Expression Omnibus dataset, GSE1572) that were used to qualitatively describe aging in an earlier study (Lu, T., et al., 2004, Nature, 429: 883-891). The BioAge Score in these samples strongly and significantly correlated with the chronological age of the subjects (ρ=0.75, p=8E-7,
The massive gene expression changes associated with aging that Applicants detected involved a constellation of biological processes. Gene set annotation analysis revealed that the genes down-regulated with increasing BioAge showed significant enrichment for neuronal and synaptic processes, possibly reflecting neuronal depletion or loss of plasticity (data not shown). The up-regulated processes include lipid metabolism, FAK signaling and axon guidance, as well as the glial marker, GFAP (Table 2). In agreement with an earlier analysis of aging signatures observed in normal brains (Yankner, B. A., et al., 2004, Nature, 429:883-891; Lu, T., et al., 2004), the up-regulated genes contain several oncogenes (for example, TP53, PI3K, PTEN), shown to be strongly correlated with BioAge in
Applicants also found that the up-regulated portion of the BioAge biomarker could be further dissected using a metagene discovery approach in which genes that are significantly associated with a disease trait and that have a very strong Pearson correlation with each other are treated as a single unit (Tamayo, P. et al., 2007, Proc. Natl. Acad. Sci. U.S.A., 104:5959-5964; Carvalho, C., et al., 2008, J. Am. Statistical Assoc., 103:1438-1456; Oldham, M. C. et al., 2008, Nat. Neurosci., 11: 1271-1282; Miller, J. A. et al., 2008, J. Neurosci., 28: 1410-1420). Applicants selected samples with relatively low BioAge (BioAge <0) and found a large metagene with exceptionally high mutual correlation between the genes. The range of expression values for the genes comprising the metagene in these samples corresponded to an average three-fold up-regulation early in the aging process. This metagene was much more coherent in normal samples than in AD samples. Applicants named this metagene “Lipa” (Table 1) because it included APOE, PPARA, γ-protocadherins, and other genes involved in lipid metabolism, amino acid metabolism and cell adhesion. Other notable Lipa genes included HES1, TGFB2, NTRK2, and WIF1.
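The idea of treating strongly inter-correlated genes as a single unit can be illustrated with a simple greedy grouping. This sketch is not the metagene discovery method of the cited references; it is a simplified stand-in, and the function name `greedy_metagene`, the 0.8 threshold, and the gene labels are assumptions introduced for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def greedy_metagene(profiles, seed, threshold=0.8):
    """Grow a gene group in which every pair has Pearson correlation above `threshold`.

    `profiles` maps a gene label to its expression values across samples;
    `seed` is the gene from which the group is grown. Hypothetical helper.
    """
    members = [seed]
    for gene, values in profiles.items():
        if gene == seed:
            continue
        # Admit a gene only if it correlates strongly with all current members.
        if all(pearson(values, profiles[m]) > threshold for m in members):
            members.append(gene)
    return members

# Hypothetical profiles across five samples.
profiles = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.0, 4.0, 6.0, 8.0, 10.5],
    "C": [5.0, 3.0, 4.0, 1.0, 2.0],
    "D": [1.1, 2.0, 2.9, 4.2, 5.0],
}
members = greedy_metagene(profiles, seed="A")
print(members)
```

A greedy pass depends on the seed and on gene order, so a real analysis would more likely rely on hierarchical clustering or a network-based method such as those cited above.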
The higher BioAge score of AD patients explained more than 50% of the differential expression between normal (non-demented) and AD cohorts. In the range of BioAge scores in which AD and normal individuals overlap, there was a significant residual differential expression, composed of several distinct sub-patterns that explain a large fraction of the normal-to-AD variance. Applicants focused on 88 AD and 43 normal brain samples with matched moderate levels of BioAge between −0.1 and 0.3. Applicants identified 4,500 genes that are differentially expressed between the two cohorts (ANOVA p<0.005, absolute fold change >10%, FDR <0.1).
The first and the largest group of about 2,000 genes, herein defined as “NdStress,” was associated with various metabolic disruptions. This signature contained some genes that were up-regulated (+NdStress, Table 5) and others that were down-regulated (−NdStress, Table 6) in AD subjects. The expression of these genes was maintained in a relatively stable narrow range in normal brains with low BioAge with relatively low coherence (
The second metagene, herein defined as “Alz,” consisted of about 200 genes up-regulated in AD (
#Bonferroni corrected Hypergeometric p-value < 0.5
Finally, a small, but exceptionally tightly correlated, metagene herein defined as “Inflame” (Table 4) contained about 250 genes upregulated with AD including many inflammation markers, such as IL1B, IL10, IL16, IL18, and HLA genes, as well as markers of macrophages, such as VSIG4, SLC11A1, and apoptosis, such as CASP1/4, TNFRSF1B (p75 death receptor) (
A unique feature of this dataset is the availability of samples from different brain regions belonging to the same individual. All biomarkers determined from prefrontal cortex (PFC) samples were tested for coherence in visual cortex (VC) and cerebellum (CR) samples. Applicants confirmed that BioAge and the disease-specific signatures were still expressed coherently and differentially between normal and AD subjects. Applicants then performed direct correlation analysis between the signature scores in different regions (
Furthermore, the disease biomarkers were fully validated in a hold-out set of samples (Phase 2), which in addition contained some Huntington disease (HD) subjects. As shown in
Comparison with Brain Transcriptome
Consistent patterns of gene expression were recently observed by coexpression analyses in several large cohorts of brain samples from non-demented individuals (Oldham, et al., 2008, Nat. Neurosci., 11: 1271-1282). Applicants discovered several reproducible metagenes, defined herein as “brain transcriptome modules,” some of which have been associated with genes expressed in specific brain cell types. In particular, the most reproducible modules, M4/5, M9, M15, and M16 (data not shown), were associated with microglia, oligodendrocytes, astrocytes, and neurons, respectively, in the cited work (Oldham, et al., 2008, Nat. Neurosci., 11: 1271-1282). Applicants validated the coherence of these modules in the Harvard Brain Tissue Resource Center (HBTRC) (McLean Hospital, Belmont, Mass.) dataset by metagene analysis and found that more than 90% of the genes comprising these modules strongly correlated with each other (ρ>0.7) within normal subjects. This analysis supports the finding that the latent structure of gene expression in cortex was preserved in the dataset used herein.
In addition, we compared the gene expression profiling captured by the brain transcriptome modules with the biomarker, BioAge, and the disease-specific patterns discovered herein. Applicants found a strong correlation between M4/5 associated with microglia and the Inflame biomarker (ρ=0.92). In addition, “astrocytic” M15 correlates with BioAge (ρ=0.83) and “neuronal” M16 negatively correlates with BioAge (ρ=−0.93). Applicants also found that none of the major brain transcriptome modules strongly correlated with either the neurodegenerative NdStress or the AD specific Alz biomarkers. This confirms that these expression patterns are novel patterns that can only be detected in brains of those individuals affected by the disease.
This genome-wide gene expression profiling study of a large cohort of AD and normal aging brains revealed large groups of genes that vary as a function of age and disease status. When the hundreds of gene expression values contained in each of these sets are converted into a single quantitative trait, new molecular biomarkers of biological aging and disease progression emerge. The transcriptional profiles of AD brains were profoundly different from those in non-demented individuals, with thousands of genes differing in their levels of expression between the two cohorts. To reduce the complexity of the observed changes, Applicants focused on key gene expression patterns that explained the most variability across the cohorts. Applicants have found that the most significant pattern in terms of variance explained, both within and between the AD and non-demented cohorts, was BioAge, a biomarker of the level of biological aging in the brain. BioAge captured the extent of gradual molecular changes in the normal aging brain by averaging the gene expression changes associated with a multitude of synchronous physiological events. BioAge can be accurately and reliably assigned to each sample in the dataset and used to describe the molecular state of the brain in the same way as other clinical and physiological measurements are used by one of ordinary skill in the art.
Genes up-regulated with BioAge are associated with activation of cell cycle regulation pathways, lipid metabolism and axon guidance pathways (Table 2). Misexpression of cell cycle genes in post-mitotic neurons has been observed in aging and in AD subjects and has been suggested to be an important mechanism of neurodegeneration (Woods, et al., 2007, Biochim. Biophys. Acta, 1772: 503-508; Bonda, et al., 2010, Neuropathol. Appl. Neurobiol., 36: 157-163). The enrichment for oncogenes within this set is consistent with biological responses to genotoxic stress activated during aging in an increasingly larger population of brain cells. Genes down-regulated with BioAge were associated with a decrease in neuronal activity. Most of these genes maintained a strong correlation (connectivity) with BioAge throughout the entire range of the biomarker. This implies that the core of biological aging is one gradual change rather than several distinct transitions.
Contrary to most aging patterns, a significant loss of connectivity with aging was observed for the Lipa metagene (Table 1) that included APOE, HES1, and TGFB2 (
Applicants have also found three other distinct disease-specific patterns. The biomarker, NdStress, which included both up- (+NdStress, Table 5) and down-regulated (−NdStress, Table 6) genes, dominated differential expression between AD and non-demented brains matched for BioAge score. The up-regulated genes contained multiple heat shock and proteasome proteins. Activation of these pathways may reflect the response to disease-related stress. Another set of genes in this module are cell cycle genes indicative of cell cycle arrest or apoptosis. The down-regulated (−NdStress, Table 6) arm of NdStress was enriched in one-carbon/folate metabolism genes and could underlie the perturbations in folic acid and one-carbon metabolism that are among the earliest biomarkers associated with neurodegenerative disorders including AD (Kronenberg, et al., 2009, Curr. Mol. Med., 9: 315-23; Van Dam, F. and Van Gool, W. A., 2009, Arch. Gerontol. Geriatr., 48: 425-30; McCampbell, A. et al., 2011, J. Neurochem., 116, 82-92).
The second largest disease-specific pattern, Alz (Table 7), contained genes associated with cell adhesion, migration, and morphogenesis. This biomarker prominently featured genes characteristic of epithelial-to-mesenchymal transition (EMT), such as VIM, TWIST1, and FN1 (Kalluri, R. and Weinberg, R. A., 2009, J. Clin. Invest., 119: 1420-8) (
Further, BioAge and Inflame are consistent with published analysis of healthy brain transcriptome and associated with neuronal, astrocytic, and microglial modules (Oldham, et al., 2008, Nat. Neurosci., 11:1271-1282). Importantly, Applicants found that NdStress and Inflame have virtually identical scores in different regions from the same individual. This suggests they measure systemic changes in brain tissue that happen across multiple cell types and layers and are independent of the diverse morphology and makeup of different brain regions. Alz scores, on the other hand, are not the same across all brain regions and had the highest levels in prefrontal cortex, indicating a local rather than systemic nature of EMT.
Applicants' analysis of gene expression changes in the brains of AD patients confirms that AD is both similar to and distinct from the process of normal aging. Although each brain was captured only in a particular (postmortem) state and was not studied longitudinally, Applicants can assemble these data as a function of time to propose a few generalized aging trajectories (
For AD patients, the studies herein are missing early stages of the aging trajectory and can only observe late stages with terminal high BioAge. Unlike the normal cohort, which can be represented by a single trajectory, the AD cohort covers a family of trajectories with different rates of biological aging. Patients with a fast rate of biological aging would succumb to disease at younger ages and generally would have higher levels of BioAge relative to their chronological age in the early phases of disease. However, since the studies herein did not include longitudinal specimens from subjects before they developed the disease, a second biomarker was required to explain disease progression rates after BioAge is maximal. The expression profile of NdStress fits the properties expected of this progression rate biomarker, as it was at its highest level in chronologically young AD patients and it significantly correlates with (+) BioAge and (−) chronological age. Alz, on the other hand, is highest in chronologically older patients and does not correlate with BioAge. Thus, patients with high NdStress likely have more accelerated aging trajectories than patients with high Alz. The older chronological age of Alz onset may suggest that the acceleration of BioAge due to Alz does not occur until the level of BioAge of the brain reaches a certain threshold. The quantitative assessment of the brain biological age in terms of BioAge and the rate of its disease-related acceleration in terms of NdStress are two critical hypotheses proposed in this work.
Another way to look at the aging trajectory is to model it as a set of molecular transitions that lead to changes in BioAge. Examination of biomarker scores for BioAge-low brains in
This proposed model is most consistent with an age-based hypothesis of Alzheimer's disease that postulates three fundamental steps: 1) an initial injury aggravated by aging, 2) chronic neuroinflammation, and 3) a transition of most brain cells to a new state (Herrup, K. 2010, J. Neurosci., 30: 16755-16762). These key stages of the disease were independently observed and associated with transcriptional changes in Applicants' analysis of brain transcriptome. Applicants herein also identified a striking resemblance between the biological processes behind the disease progression biomarkers and epithelial-to-mesenchymal transition (EMT) (Kalluri, R. and Weinberg, R. A., 2009, J. Clin. Invest., 119:1420-1428). The AD processes are most similar to EMT type 2, which is dependent on inflammation-inducing injuries for initiation and continued occurrence. Associated with tissue regeneration and organ fibrosis in kidney, lung, and liver, EMT type 2 generates mesenchymal cells that produce excessive amounts of extracellular matrix (ECM). Similarly, a transition of AD brain into a tissue enriched with mesenchymal cells produces a large amount of ECM containing β-amyloid. This model of the disease implies that multiple independent genetic factors, as well as infections and/or injuries, may accelerate consecutive transitions leading to disease. This also suggests that different therapeutic strategies may be appropriate for early and late disease stages. Therapies targeting lipid metabolism and inflammation may be more effective in the early stages. In the late stages, when the brain becomes enriched in mesenchymal-like signaling and adhesion processes, novel approaches that support the survival of the new state of the brain tissue should be considered.
Projection of Human Aging into Animal Models
As shown in
The following abbreviations are used herein: AD: Alzheimer's disease; ANOVA: analysis of variance; AUROC: area under the receiver operating characteristic curve; PFC1: prefrontal cortex from phase 1; PFC2: prefrontal cortex from phase 2; VC1: visual cortex from phase 1; VC2: visual cortex from phase 2; CR1: cerebellum from phase 1; CR2: cerebellum from phase 2; HD: Huntington disease.
The dataset comprises gene expression data from brain tissue samples that were posthumously collected from more than 600 individuals diagnosed with Alzheimer's disease (AD) or Huntington disease (HD), or with normal, non-demented brains. All brains were obtained from individuals for whom both the donor and the next of kin had completed the Harvard Brain Tissue Resource Center Informed Consent Form (HBTRC, McLean Hospital, Belmont, Mass.). All tissue samples were handled and the research conducted according to the HBTRC Guidelines, including those relating to Human Tissue Handling Risks and Safety Precautions, and in compliance with the Human Tissue Single User Agreement and the HBTRC Acknowledgment Agreement. Table 10 summarizes the composition of the HBTRC gene expression dataset by experimental phase, brain region, gender, and diagnosis at the time of death.
The brain regions profiled included dorsolateral prefrontal cortex (PFC, Brodmann area 9), visual cortex (VC, Brodmann area 17), and cerebellum (CR). These regions were chosen because, in AD, the PFC is impacted by the pathology, while the VC and CR regions remain largely intact throughout most of the disease (Braak, 1991). The samples were flash frozen in liquid nitrogen vapor with an average post-mortem interval of about 18 hours. Clinical information for each sample included age at the time of death (Mean Age and Age Range), gender, Braak stage of AD (Braak, 1991), and brain tissue pH, as summarized in Table 10. Braak stage and atrophy were assessed by pathologists at McLean Hospital (Belmont, Mass.). Only neuropathologically confirmed AD subjects with Braak scores >3 were included in this profiling experiment.
A total of 1 μg mRNA from each sample was extracted, amplified to fluorescently labeled tRNA, and profiled by the Rosetta Gene Expression Laboratory in two phases using Rosetta/Merck 44k 1.1 microarray (GPL4372) (Agilent Technologies, Santa Clara, Calif.) (Hughes, 2001, Nat. Biotechnol., 19:342-347). The average RNA integrity number of 6.81 was sufficiently high for the microarray experiment monitoring 40,638 transcripts representing more than 31,000 unique genes. The expression levels were processed and normalized to the average of all samples in the batch from the same region using Rosetta Resolver (Rosetta Biosoftware, Seattle, Wash.).
Applicants refer to each batch of samples hybridized to the microarrays profiled at the same time by use of the abbreviation for the brain region and the phase of the experiment (e.g., PFC2 refers to prefrontal cortex samples profiled in phase 2). Table 10 summarizes the number of samples in each category. All microarray data generated in this study are available through the National Brain Databank at the Harvard Brain Tissue Resource Center (McLean Hospital, Belmont, Mass.).
As the primary measure of gene expression, Applicants used the log10 ratio of each individual microarray intensity to the average intensity of all samples from the same brain region profiled in the same phase. Quality control of gene expression data was performed by principal component analysis using MATLAB R2007a (MathWorks Inc., Natick, Mass.). Outlier samples (less than 2%) were removed from the data set based on extreme standardized values of the first, second, or third principal component, i.e., absolute z-scores greater than 3.
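The normalization and outlier filter described above can be illustrated with a minimal sketch. It assumes NumPy rather than the MATLAB environment named in the text, and the function names `log10_ratios` and `pca_outliers` are hypothetical; the exact preprocessing applied by Applicants is not specified here beyond the thresholds quoted.

```python
import numpy as np

def log10_ratios(intensities):
    """Rows are samples, columns are probes. Returns log10(I / I0), where I0
    is the mean intensity of each probe across the batch (an assumption that
    mirrors the 'average of all samples' reference in the text)."""
    intensities = np.asarray(intensities, dtype=float)
    reference = intensities.mean(axis=0)
    return np.log10(intensities / reference)

def pca_outliers(expr, n_components=3, z_cutoff=3.0):
    """Flag samples whose standardized score on any of the leading principal
    components exceeds z_cutoff in absolute value."""
    centered = expr - expr.mean(axis=0)
    # SVD of the centered matrix: u * s gives the principal component scores.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    return np.any(np.abs(z) > z_cutoff, axis=1)
```

Samples flagged `True` would be dropped before any downstream analysis; the cutoff of 3 and the use of the first three components follow the description above.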
The first principal component (PC1) was used to assess the major pattern of gene expression variability in the dataset. Genes that were highly correlated with PC1 were used to build a surrogate biomarker. Throughout this work Applicants used Pearson correlation coefficients, ρ, and assessed their significance, p, assuming a normal distribution for the Fisher z-transformed values, atanh ρ (Rosner, 2010, Fundamentals of Biostatistics, Duxbury Press, Boston Mass.). Significant differential expression for each gene was evaluated using t-test p-values (Rosner, 2010, Fundamentals of Biostatistics). Multiple testing correction of p-values was done according to the Benjamini-Hochberg procedure to obtain false-discovery rates (FDR) (Benjamini and Hochberg, 1995, J. R. Stat. Soc. B, 57:289-300). These analyses were performed using the Statistics Toolbox of MATLAB R2007a (MathWorks Inc., Natick, Mass.).
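As an illustration of the statistics named above, the following sketch computes a Pearson correlation, a two-sided p-value from the Fisher z-transform, and Benjamini-Hochberg adjusted p-values. It uses only the Python standard library instead of MATLAB, the function names are hypothetical, and the standard error 1/sqrt(n - 3) is the usual textbook choice, assumed here because the text does not state it.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z_pvalue(r, n):
    """Two-sided p-value for H0: rho = 0, treating atanh(r) as normally
    distributed with standard error 1 / sqrt(n - 3). Requires |r| < 1."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values (FDR estimates) in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then be called significant when its adjusted p-value falls below the chosen FDR threshold.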
Gene expression changes associated with aging and disease were characterized by metagenes, i.e., sets of genes that are significantly associated with a disease trait and very strongly Pearson-correlated with each other. Applicants utilized a procedure for exploring the covariance structure of the gene expression data that was similar to metagene identification (Tamayo, 2007, Proc. Natl. Acad. Sci. U.S.A., 104: 5959-5964), factor analysis of gene expression (Carvalho, 2008, J. Amer. Stat. Assoc., 103: 1438-1456), and supervised gene module discovery (Oldham, 2008, Nat. Neurosci., 11: 1271-1282; Miller, 2008, J. Neurosci., 28: 1410-1420). Instead of a genome-wide search for metagenes followed by analysis of associations between metagenes and disease traits, Applicants used a supervised approach. After selecting genes significantly associated with the disease, Applicants agglomeratively clustered them using Pearson correlation as a distance measure. Especially tight and large clusters in the dendrogram were then assigned to biomarkers, i.e., the dendrogram was cut so that several hundred genes in a branch qualified for a biomarker and the average of their correlations to the branch mean was not weaker than 0.75. Applicants recognized that some signatures could have two anti-correlated arms representing opposite trends in gene expression (e.g., genes that are up- and down-regulated with the end point).
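The clustering step above can be sketched on a toy scale as follows. This is an illustrative, unoptimized example and not Applicants' implementation: it assumes average linkage and a distance of 1 minus the Pearson correlation (the text does not name the linkage rule), it ignores anti-correlated arms, and the names `cluster_by_correlation` and `coherence` are hypothetical.

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cluster_by_correlation(profiles, max_distance=0.25):
    """Average-linkage agglomerative clustering with distance 1 - r.
    profiles maps gene -> list of expression values across samples.
    Merging stops once the closest clusters are farther than max_distance,
    which plays the role of cutting the dendrogram at that height."""
    genes = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in genes for b in genes if a != b}
    clusters = [[g] for g in genes]

    def linkage(c1, c2):
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > max_distance:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def coherence(profiles, genes):
    """Average Pearson correlation between each gene and the cluster mean."""
    n = len(profiles[genes[0]])
    mean_profile = [sum(profiles[g][k] for g in genes) / len(genes)
                    for k in range(n)]
    return sum(pearson(profiles[g], mean_profile) for g in genes) / len(genes)
```

A branch would qualify as a biomarker in this sketch when it is large enough and `coherence` returns at least 0.75, matching the threshold quoted above. For data at the scale of several hundred genes, a library routine for hierarchical clustering would be the practical choice.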
Throughout the experiments herein, Applicants utilize the term “biomarker” to refer to a metagene together with its associated score that quantifies it in each brain tissue sample. The biomarker score for each sample was calculated as the mean expression level of the constituent genes or, when both arms were specified, as the arithmetic difference between the means in the positive and negative arms of the signature. See, for example, Tables 1-7 that show representative genes making up the biomarkers of the invention herein. Thus, the “Score” was calculated as follows:
$$\mathrm{Score} = \operatorname{mean}_{\text{positive arm}}\left[\log_{10}(I/I_0)\right] - \operatorname{mean}_{\text{negative arm}}\left[\log_{10}(I/I_0)\right]$$
where I/I0 was the normalized intensity of the signature probes. To produce a robust score, all samples have to be normalized to the same reference. The reference intensity I0 for each gene corresponded to the average intensity in the cohort. The overall coherence of a biomarker was evaluated as the average correlation between its individual genes and the average score. Applicants found that averaging coherent genes (coherence >0.75) that correlate with each other produced a measure that was more accurate than measures based on individual genes. For all biomarkers identified in this work, the Score represented a continuous measure of progression for a particular aspect of disease in each sample. To evaluate the performance of the signature score, i.e., the Score, as a classifier between diseased and normal samples, Applicants used the area under the curve for the receiver operating characteristic (AUROC) (Hanley, J. A. and McNeil, B. J., 1982, Radiology, 143: 29-36). AUROC is equal to the probability that, for two tissue samples drawn at random, one from each group, the relative values of the classifier will assign both to the correct group.
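The score and its evaluation can be illustrated with the short sketch below. It assumes that each probe value is already a log10 ratio to the cohort reference, computes AUROC by direct pairwise comparison (with ties counted as one half, a common convention that the text does not specify), and uses hypothetical function names.

```python
def biomarker_score(log_ratios, positive_arm, negative_arm=()):
    """log_ratios maps probe -> log10(I / I0) for one sample.
    Returns the mean over the positive arm, minus the mean over the
    negative arm when a negative arm is given."""
    up = sum(log_ratios[g] for g in positive_arm) / len(positive_arm)
    if not negative_arm:
        return up
    down = sum(log_ratios[g] for g in negative_arm) / len(negative_arm)
    return up - down

def auroc(case_scores, control_scores):
    """Probability that a randomly chosen case scores higher than a
    randomly chosen control; ties contribute one half."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

An AUROC of 0.5 corresponds to a score with no discriminating power, and a value of 1.0 to perfect separation of the two groups.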
To validate the biomarkers identified in this work, Applicants tested their coherence (mutual correlation between genes) and predictive power (correlation with clinical end points) in the context of an independent gene expression dataset, GSE1572 (Lu, 2004, Nature, 429:883-891). This data set contained gene expression data from PFC samples of 30 non-demented subjects, aged 26-106. These samples were profiled on the Human Genome U95 Version 2 Array (GPL8300) (Affymetrix Inc., Santa Clara, Calif.). To select the microarray probes and calculate the biomarker score, Applicants matched the biomarker gene symbols to those represented on the HG-U95Av2 array.
An additional set of public gene expression data used to validate the coherence and predictive power of the biomarkers was obtained from hippocampus samples from elderly control and AD subjects, GSE1297 (Blalock, 2004, Proc. Natl. Acad. Sci. U.S.A., 101:2173-2178; Gomez Ravetti, 2010, PLoS ONE, 5:e10153). These 31 samples were profiled using the Affymetrix Human Genome U133A Array (HG-U133A). To select the probes and calculate the biomarker score, Applicants matched the biomarker gene symbols to those represented on the array and averaged the gene expression values according to the equation in the previous subsection.
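Transferring a signature to a different array platform, as described for the two validation datasets, amounts to matching gene symbols and averaging the matched probes. The sketch below is one possible way to do this; the probe identifiers in any usage are placeholders, the case-insensitive match is an assumption, and how Applicants handled multiple probes per gene is not stated in the text.

```python
def match_probes(signature_genes, probe_to_symbol):
    """Return the probes on the target array whose annotated gene symbol
    appears in the signature. Probes without an annotation are skipped."""
    wanted = {g.upper() for g in signature_genes}
    return sorted(p for p, s in probe_to_symbol.items()
                  if s and s.upper() in wanted)

def score_on_platform(expression, probes):
    """Mean expression over the matched probes for one sample.
    expression maps probe -> normalized expression value."""
    return sum(expression[p] for p in probes) / len(probes)
```

In this sketch every matched probe contributes equally, so a gene represented by several probes on the target array carries more weight than a gene with a single probe.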
The human BioAge (
For the detection of a human brain gene signature in a peripheral tissue sample, such as blood, Applicants obtained a total of 29 human samples (six normal controls, seven early stage Alzheimer's disease (AD), nine late stage AD, and seven multiple sclerosis (MS)) from PrecisionMed (Solana Beach, Calif.). All subjects were age and gender matched. Alzheimer's disease samples were chosen to have a comparable number of ApoE ε4 carriers and non-carriers. Samples were amplified using a standard amplification kit (NuGEN Technologies, Inc., San Carlos, Calif.) and profiled using a standard microarray (Affymetrix, Santa Clara, Calif.) according to the manufacturer's protocols.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US12/62218 | 10/26/2012 | WO | 00 | 4/28/2014 |
Number | Date | Country |
---|---|---|
61553400 | Oct 2011 | US |