Type 2 diabetes (DM2) affects an estimated 110 million people worldwide and is a major contributor to atherosclerotic vascular disease, blindness, amputation, and kidney failure. Defects in insulin secretion are observed early in patients with MODY, a monogenic form of type 2 diabetes; insulin resistance at tissues such as skeletal muscle is a cardinal feature of patients with fully developed DM2. Many molecular pathways have been implicated in the disease process: beta-cell development, insulin receptor signaling, carbohydrate production and utilization, mitochondrial metabolism, fatty acid oxidation, cytokine signaling, adipogenesis, adrenergic signaling, and others. It remains unclear, however, which of these or other pathways are disturbed in, and might be responsible for, DM2 in its common form.
Therefore, a need remains to identify the molecular pathways implicated in the disease process and to develop new tools and assays to identify therapeutics for the treatment of diabetes.
One aspect of the invention provides a method of modulating a biological response in a cell, the method comprising contacting the cell with at least one agent that modulates the expression or activity of Errα or Gabp, wherein the biological response is (a) expression of at least one OXPHOS gene; (b) mitochondrial biogenesis; (c) expression of Nuclear Respiratory Factor 1 (NRF-1); (d) β-oxidation of fatty acids; (e) total mitochondrial respiration; (f) uncoupled respiration; (g) mitochondrial DNA replication; (h) expression of mitochondrial enzymes; or (i) skeletal muscle fiber-type switching.
Another aspect of the invention provides a method of determining if an agent is a potential agent for the treatment of a disorder that is characterized by glucose intolerance, insulin resistance or reduced mitochondrial function, the method comprising determining if the agent increases: (i) the expression or activity of Errα or Gabp in a cell; or (ii) the formation of a complex between a PGC-1 polypeptide and (1) an Errα polypeptide; or (2) a Gabp polypeptide; wherein an agent that increases (i) or (ii) is a potential agent for the treatment of the disorder.
The invention also provides a method of identifying an agent that modulates a biological response, the method comprising (a) contacting, in the presence of the agent, a PGC-1 polypeptide and an (i) Errα polypeptide, or (ii) a Gabp polypeptide, under conditions which allow the formation of a complex between the PGC-1 polypeptide and (i) the Errα polypeptide, or (ii) the Gabp polypeptide; and (b) detecting the presence of the complex; wherein an agent that modulates the biological response is identified if the agent increases or decreases the formation of the complex, and wherein the biological response is (a) expression of at least one OXPHOS gene; (b) mitochondrial biogenesis; (c) expression of Nuclear Respiratory Factor 1 (NRF-1); (d) β-oxidation of fatty acids; (e) total mitochondrial respiration; (f) uncoupled respiration; (g) mitochondrial DNA replication; (h) expression of mitochondrial enzymes; or (i) skeletal muscle fiber-type switching.
Additionally, the invention provides a method of treating or preventing a disorder characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance in a subject, the method comprising administering to the subject a therapeutically effective amount of an agent which (i) increases the expression or activity of Errα or Gabp or both; or (ii) increases the formation of a complex between a PGC-1 polypeptide and (a) an Errα polypeptide; (b) a Gabp polypeptide; or both; or (iii) binds to an (a) Errα binding site, or to a (b) Gabpa binding site, and which increases transcription of at least one gene in the subject, said gene having an Errα binding site, a Gabpa binding site, or both.
Yet another aspect of the invention provides a method of treating or preventing a disorder characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance in a subject, the method comprising administering to the subject a therapeutically effective amount of an agent which increases the expression or activity of a gene, wherein the gene has an Errα binding site or a Gabpa binding site.
The invention also provides a method of reducing the metabolic rate of a subject in need thereof, the method comprising administering to the subject a therapeutically effective amount of an agent which decreases the expression or activity of at least one of the following: (i) Errα; (ii) Gabpa; (iii) a gene having an Errα binding site, a Gabpa binding site, or both; or (iv) a transcriptional activator which binds to an Errα binding site or to a Gabpa binding site; thereby reducing the metabolic rate of the patient.
The invention further provides a method of identifying a susceptibility locus for a disorder that is characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance in a subject, the method comprising (i) identifying at least one polymorphism in a gene, or linked to a gene, wherein the gene (a) has an Errα binding site, a Gabpa binding site, or both; or (b) is Errα, Gabpa, or Gabpb; and (ii) determining if at least one polymorphism is associated with the incidence of the disorder, wherein if a polymorphism is associated with the incidence of the disorder then the gene having the polymorphism, or the gene to which the polymorphism is linked, is a susceptibility locus.
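By way of non-limiting illustration, the association step (ii) might be sketched as a comparison of allele counts between affected and unaffected subjects. The sketch below assumes a Pearson chi-square test on a 2x2 table with one degree of freedom; the method does not prescribe this test, and the function name and the counts used are hypothetical.

```python
import math

def allele_association(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are affected (case) and unaffected (control) subjects; columns are
    counts of the variant (alt) and reference (ref) alleles.
    Returns (chi_square, p_value).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi_square = n * (a * d - b * c) ** 2 / denominator
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value

# Hypothetical counts: the variant allele is more frequent among affected subjects.
chi_square, p_value = allele_association(60, 40, 35, 65)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```

A small p-value would suggest, but not establish, that the polymorphism is associated with the disorder; in practice a correction for multiple testing across polymorphisms would also be needed.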
A related aspect of the invention provides a method of determining if a subject is at risk of developing a disorder which is characterized by reduced mitochondrial function, the method comprising determining if a gene from the subject contains a mutation which reduces the function of the gene, wherein the gene has an Errα binding site, a Gabpa binding site, or both, wherein if a gene from the subject contains a mutation then the subject is at risk of developing the disorder.
Yet another aspect of the invention provides a method of identifying a transcriptional regulator having differential activity between an experimental cell and a control cell, the method comprising (i) determining the level of gene expression of at least two genes in the experimental cell and in the control cell; (ii) ranking genes according to a difference metric of their expression level in the experimental cell compared to the control cell; (iii) identifying a subset of genes, wherein each gene in the subset contains the same DNA sequence motif; (iv) testing using a nonparametric statistic if the subset of genes are enriched at either the top or the bottom of the ranking; (v) optionally reiterating steps (ii)-(iii) for additional motifs; (vi) for a subset of genes that is enriched, identifying a transcriptional regulator which binds to a DNA sequence motif that is contained in the subset of genes; thereby identifying a transcriptional regulator having differential activity between two cells.
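By way of non-limiting illustration, the ranking and enrichment test of steps (ii) through (iv) might be sketched as follows. The sketch assumes a difference-of-means metric and a Mann-Whitney rank-sum statistic under a normal approximation as the nonparametric statistic; neither choice is prescribed by the method, and the gene names, expression values and function names are hypothetical.

```python
import math
from statistics import mean

def rank_genes(expr_experimental, expr_control):
    """Rank genes by mean expression difference (experimental minus control), largest first."""
    diffs = {g: mean(expr_experimental[g]) - mean(expr_control[g])
             for g in expr_experimental}
    return sorted(diffs, key=diffs.get, reverse=True)

def motif_rank_sum_z(ranked_genes, motif_genes):
    """Mann-Whitney z-score (normal approximation, no tie correction).

    Positive values mean the motif-containing genes sit toward the top of the
    ranking; negative values mean they sit toward the bottom.
    """
    n = len(ranked_genes)
    ranks = [i + 1 for i, g in enumerate(ranked_genes) if g in motif_genes]
    n1 = len(ranks)
    n2 = n - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("need genes both with and without the motif")
    # U counts (motif gene, other gene) pairs in which the other gene ranks higher.
    u = sum(ranks) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n + 1) / 12)
    return (mu - u) / sigma

# Hypothetical expression values for four genes, two replicates per cell type.
experimental = {"geneA": [5, 6], "geneB": [1, 1], "geneC": [9, 9], "geneD": [2, 2]}
control = {"geneA": [1, 1], "geneB": [1, 1], "geneC": [1, 1], "geneD": [3, 3]}
ranking = rank_genes(experimental, control)
print(ranking, motif_rank_sum_z(ranking, {"geneA", "geneC"}))
```

Step (v) would correspond to repeating `motif_rank_sum_z` for each additional motif-defined gene subset against the same ranking.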
An additional aspect of the invention provides a method of treating impaired glucose tolerance in an individual in need thereof, the method comprising administering to the individual a therapeutically effective amount of an agent which increases the expression level of at least two OXPHOS-CR genes, thereby treating impaired glucose tolerance in the individual. A related aspect provides a method of treating obesity in an individual, comprising administering to the individual a therapeutically effective amount of an agent which increases the expression level of at least two OXPHOS-CR genes, thereby treating obesity in the individual.
One aspect of the invention provides a method of detecting statistically-significant differences in the expression level of at least one biomarker belonging to a biomarker set, between the members of a first and of a second experimental group, comprising: (a) obtaining a biomarker sample from members of the first and the second experimental groups; (b) determining, for each biomarker sample, the expression levels of at least one biomarker belonging to the biomarker set and of at least one biomarker not belonging to the set; (c) generating a rank order of each biomarker according to a difference metric of its expression level in the first experimental group compared to the second experimental group; (d) calculating an experimental enrichment score for the biomarker set by applying a nonparametric statistic; and (e) comparing the experimental enrichment score with a distribution of randomized enrichment scores to calculate the fraction of randomized enrichment scores greater than the experimental enrichment score, wherein a low fraction indicates a statistically-significant difference in the expression level of the biomarker set, between the members of a first and of a second experimental group. In one embodiment, the distribution of randomized enrichment scores is generated by (i) randomly permuting the assignment of each biomarker sample to the first or to the second experimental group; (ii) generating a rank order of each biomarker according to the absolute value of a difference metric of its expression level in the first experimental group compared to the second experimental group; (iii) calculating a randomized enrichment score for the biomarker set by applying a nonparametric statistic to the rank order; and (iv) repeating steps (i), (ii) and (iii) a number of times sufficient to generate the distribution of randomized enrichment scores.
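By way of non-limiting illustration, the enrichment score and its permutation-based significance might be sketched as follows. The sketch assumes a Kolmogorov-Smirnov-style running sum as the nonparametric statistic and a signed difference of group means as the difference metric (rather than its absolute value); both are assumptions chosen for brevity, and all function names and data are hypothetical.

```python
import math
import random
from statistics import mean

def enrichment_score(ranked, marker_set):
    """Maximum of a running sum that rises on set members and falls otherwise."""
    n = len(ranked)
    g = sum(1 for b in ranked if b in marker_set)
    if g == 0 or g == n:
        raise ValueError("ranking must contain biomarkers inside and outside the set")
    hit = math.sqrt((n - g) / g)    # step up for a biomarker in the set
    miss = math.sqrt(g / (n - g))   # step down for a biomarker outside the set
    running = best = 0.0
    for b in ranked:
        running += hit if b in marker_set else -miss
        best = max(best, running)
    return best

def rank_by_difference(data, labels):
    """Rank biomarkers by mean(group 0) - mean(group 1), largest first.

    data maps biomarker -> expression values, one per sample;
    labels holds the 0/1 group assignment of each sample.
    """
    def diff(values):
        first = [v for v, lab in zip(values, labels) if lab == 0]
        second = [v for v, lab in zip(values, labels) if lab == 1]
        return mean(first) - mean(second)
    return sorted(data, key=lambda b: diff(data[b]), reverse=True)

def permutation_fraction(data, labels, marker_set, n_perm=1000, seed=0):
    """Return (experimental score, fraction of randomized scores >= that score)."""
    rng = random.Random(seed)
    observed = enrichment_score(rank_by_difference(data, labels), marker_set)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomly reassign samples to the two groups
        score = enrichment_score(rank_by_difference(data, shuffled), marker_set)
        if score >= observed:  # ties are counted, which is the conservative choice
            exceed += 1
    return observed, exceed / n_perm
```

With realistic sample sizes, a low returned fraction would indicate that the biomarker set as a whole is shifted between the groups even when no single biomarker changes much.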
In addition, the invention provides a method of identifying an agent that regulates expression of OXPHOS-CR genes, the method comprising (a) contacting (i) an agent to be assessed for its ability to regulate expression of OXPHOS-CR genes with (ii) a test cell; and (b) determining whether the expression of at least two OXPHOS-CR gene products shows a coordinate change in the test cell compared to an appropriate control, wherein a coordinate change in the expression of the OXPHOS-CR gene products indicates that the agent regulates the expression levels of OXPHOS-CR genes. In one embodiment, the OXPHOS-CR genes are selected from the group consisting of NDUFB3, SDHA, NDUFA8, COX7A1, UQCRC1, NDUFC1, NDUFS2, ATP5O, NDUFS3, SDHB, NDUFS5, NDUFB6, COX5B, CYC1, NDUFA7, UQCRB, COX7B, ATP5L, COX7C, NDUFA5, GRIM19, ATP5J, COX6A2, NDUFB5, CYCS, NDUFA2 and HSPC051.
I. Overview
The invention broadly relates to novel therapeutics for regulating metabolism and mitochondrial function, and for treating disorders, including obesity and type 2 diabetes, and to related methods. The invention stems, in part, from the discovery by applicants of a new group of coordinately-regulated genes, termed OXPHOS-CR, which are involved in oxidative phosphorylation. OXPHOS-CR genes have the following key characteristics: (a) they are members of the oxidative phosphorylation pathway; (b) they are transcriptionally co-regulated and highly expressed at the major sites of insulin-mediated glucose uptake (brown fat, heart, skeletal muscle); (c) they are targets of the transcriptional co-activator PPARGC1 (PGC-1α); (d) they show a subtle but extremely consistent expression decrease in diabetic and pre-diabetic muscle; and (e) their expression predicts total body aerobic capacity in humans.
Applicants have discovered that OXPHOS genes are downregulated in subjects afflicted with type 2 diabetes or with glucose intolerance and that Peroxisome Proliferator-Activated Receptor γ-Coactivator-1α (PGC-1α) transcriptionally regulates the OXPHOS genes. Applicants have also discovered that PGC-1α acts through Errα and Gabp to regulate OXPHOS gene expression. Such discoveries provide the basis for novel assays and methods of treatment relating to the genes and disorders.
The invention provides, in part, methods of modulating mitochondrial function, expression of the OXPHOS genes, mitochondrial biogenesis, expression of Nuclear Respiratory Factor 1 (NRF-1), β-oxidation of fatty acids, total mitochondrial respiration, uncoupled respiration, mitochondrial DNA replication, or expression of mitochondrial enzymes, by modulating the expression or activity of Errα, Gabpa, Gabpb or of genes containing Errα binding sites, Gabpa binding sites, or both. Modulation of these biological activities may be carried out in a cell, such as contacting a cell with an agent, or in a subject in need thereof. The invention further provides agents for treating these disorders and for modulating Errα, Gabp and PGC-1 function.
A related aspect of the invention provides a method of identifying agents useful for treating disorders related to altered glucose homeostasis, insulin resistance or reduced mitochondrial function. Furthermore, the invention provides methods of diagnosing such disorders or of identifying subjects at risk of developing the disorders.
The invention also provides cell-based methods of identifying agents which modulate the expression of OXPHOS genes. Since applicants have discovered that PGC-1α, Errα and Gabp regulate the expression level of OXPHOS genes, such methods are useful in identifying agents which regulate the expression or activity of PGC-1α, Errα and Gabp. Furthermore, expression of OXPHOS genes may be used to predict total body aerobic capacity in humans and other mammals.
Another aspect of the invention provides a method of detecting statistically-significant differences in the expression level of at least one biomarker belonging to a biomarker set, between the members of a first and of a second experimental group. Such a method may be applied, for example, to identify biomarker sets which are differentially expressed in an experimental group afflicted with a disorder, even when the changes in expression between the two groups are very subtle. Biomarker sets identified using the methods described herein may be used in the development of diagnostic tools and treatments for the disorder with which they are associated. A related aspect of the invention provides methods of identifying transcriptional regulators which display differential activity between two sets of conditions. Such methods may be applied to the biomarkers identified using the related methods provided herein, and may be useful in identifying disease genes and targets for novel therapeutics to treat or prevent disease.
II. Definitions
For convenience, certain terms employed in the specification, examples, and appended claims, are collected here. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The term “expression vector” and equivalent terms are used herein to mean a vector which is capable of inducing the expression of DNA that has been cloned into it after transformation into a host cell. The cloned DNA is usually placed under the control of (i.e., operably linked to) certain regulatory sequences such as promoters or enhancers. Promoter sequences may be constitutive, inducible or repressible.
The term “operably linked” is used herein to mean molecular elements that are positioned in such a manner that enables them to carry out their normal functions. For example, a gene is operably linked to a promoter when its transcription is under the control of the promoter and, if the gene encodes a protein, such transcription produces the protein normally encoded by the gene. For example, a binding site for a transcriptional regulator is said to be operably linked to a promoter when transcription from the promoter is regulated by protein(s) binding to the binding site. Likewise, two protein domains are said to be operably linked in a protein when both domains are able to perform their normal functions.
The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element.
The term “including” is used herein to mean, and is used interchangeably with, the phrase “including but not limited to”.
The term “or” is used herein to mean, and is used interchangeably with, the term “and/or,” unless context clearly indicates otherwise.
The term “such as” is used herein to mean, and is used interchangeably with, the phrase “such as but not limited to”.
A “patient” or “subject” to be treated by the method of the invention can mean either a human or non-human animal, preferably a mammal.
The term “encoding” comprises an RNA product resulting from transcription of a DNA molecule, a protein resulting from the translation of an RNA molecule, or a protein resulting from the transcription of a DNA molecule and the subsequent translation of the RNA product.
The term “promoter” is used herein to mean a DNA sequence that initiates the transcription of a gene. Promoters are typically found 5′ to the gene and located proximal to the start codon. If a promoter is of the inducible type, then the rate of transcription increases in response to an inducer. Promoters may be operably linked to DNA binding elements that serve as binding sites for transcriptional regulators. The term “mammalian promoter” is used herein to mean promoters that are active in mammalian cells. Similarly, “prokaryotic promoter” refers to promoters active in prokaryotic cells.
The term “expression” is used herein to mean the process by which a polypeptide is produced from DNA. The process involves the transcription of the gene into mRNA and the translation of this mRNA into a polypeptide. Depending on the context in which used, “expression” may refer to the production of RNA, protein or both.
The term “recombinant” is used herein to mean any nucleic acid comprising sequences which are not adjacent in nature. A recombinant nucleic acid may be generated in vitro, for example by using the methods of molecular biology, or in vivo, for example by insertion of a nucleic acid at a novel chromosomal location by homologous or non-homologous recombination.
The term “transcriptional regulator” refers to a biochemical element that acts to prevent or inhibit the transcription of a promoter-driven DNA sequence under certain environmental conditions (e.g., a repressor or nuclear inhibitory protein), or to permit or stimulate the transcription of the promoter-driven DNA sequence under certain environmental conditions (e.g., an inducer or an enhancer).
The term “microarray” refers to an array of distinct polynucleotides or oligonucleotides synthesized on a substrate, such as paper, nylon or other type of membrane, filter, chip, glass slide, or any other suitable solid support.
The terms “disorders” and “diseases” are used inclusively and refer to any deviation from the normal structure or function of any part, organ or system of the body (or any combination thereof). A specific disease is manifested by characteristic symptoms and signs, including biological, chemical and physical changes, and is often associated with a variety of other factors including, but not limited to, demographic, environmental, employment, genetic and medically historical factors. Certain characteristic signs, symptoms, and related factors can be quantitated through a variety of methods to yield important diagnostic information.
The terms “level of expression of a gene in a cell” or “gene expression level” refer to the level of mRNA, as well as pre-mRNA nascent transcript(s), transcript processing intermediates, mature mRNA(s) and degradation products, encoded by the gene in the cell.
The term “modulation” refers to upregulation (i.e., activation or stimulation), downregulation (i.e., inhibition or suppression) of a response, or the two in combination or apart. A “modulator” is a compound or molecule that modulates, and may be, e.g., an agonist, antagonist, activator, stimulator, suppressor, or inhibitor.
The term “prophylactic” or “therapeutic” treatment refers to administration to the subject of one or more of the subject compositions. If it is administered prior to clinical manifestation of the unwanted condition (e.g., disease or other unwanted state of the host animal) then the treatment is prophylactic, i.e., it protects the host against developing the unwanted condition, whereas if administered after manifestation of the unwanted condition, the treatment is therapeutic (i.e., it is intended to diminish, ameliorate or maintain the existing unwanted condition or side effects therefrom).
The term “therapeutic effect” refers to a local or systemic effect in animals, particularly mammals, and more particularly humans caused by a pharmacologically active substance. The term thus means any substance intended for use in the diagnosis, cure, mitigation, treatment or prevention of disease or in the enhancement of desirable physical or mental development and conditions in an animal or human. The phrase “therapeutically-effective amount” means that amount of such a substance that produces some desired local or systemic effect at a reasonable benefit/risk ratio applicable to any treatment. In certain embodiments, a therapeutically-effective amount of a compound will depend on its therapeutic index, solubility, and the like. For example, certain compounds discovered by the methods of the present invention may be administered in a sufficient amount to produce a reasonable benefit/risk ratio applicable to such treatment.
The term “improving mitochondrial function” may refer to (a) substantially (e.g., in a statistically significant manner, and preferably in a manner that promotes a statistically significant improvement of a clinical parameter such as prognosis, clinical score or outcome) restoring to a normal level at least one indicator of glucose responsiveness in cells having reduced glucose responsiveness and reduced mitochondrial mass and/or impaired mitochondrial function; or (b) substantially (e.g., in a statistically significant manner, and preferably in a manner that promotes a statistically significant improvement of a clinical parameter such as prognosis, clinical score or outcome) restoring to a normal level, or increasing to a level above and beyond normal levels, at least one indicator of mitochondrial function in cells having impaired mitochondrial function or in cells having normal mitochondrial function, respectively. Improved or altered mitochondrial function may result from changes in extra-mitochondrial structures or events, as well as from mitochondrial structures or events, in direct interactions between mitochondrial and extra-mitochondrial genes and/or their gene products, or in structural or functional changes that occur as the result of interactions between intermediates that may be formed as the result of such interactions, including metabolites, catabolites, substrates, precursors, cofactors and the like.
The term “effective amount” refers to the amount of a therapeutic reagent that when administered to a subject by an appropriate dose and regime produces the desired result.
The term “subject in need of treatment for a disorder” is a subject diagnosed with that disorder or suspected of having that disorder.
The term “metabolic disorder” refers to a disorder, disease or condition which is caused or characterized by an abnormal metabolism (i.e., the chemical changes in living cells by which energy is provided for vital processes and activities) in a subject. Metabolic disorders include diseases, disorders, or conditions associated with aberrant thermogenesis or aberrant adipose cell (e.g., brown or white adipose cell) content or function. Metabolic disorders can detrimentally affect cellular functions such as cellular proliferation, growth, differentiation, or migration, cellular regulation of homeostasis, or inter- or intra-cellular communication; tissue function, such as liver function, muscle function, or adipocyte function; and systemic responses in an organism, such as hormonal responses (e.g., insulin response). Examples of metabolic disorders include obesity, diabetes, hyperphagia, hypophagia, endocrine abnormalities, triglyceride storage disease, Bardet-Biedl syndrome, Laurence-Moon syndrome, Prader-Labhart-Willi syndrome, Kearns-Sayre syndrome, anorexia, medium chain acyl-CoA dehydrogenase deficiency, and cachexia. Obesity is defined as a body mass index (BMI) of 30 kg/m² or more (National Institutes of Health, Clinical Guidelines on the Identification, Evaluation, and Treatment of Overweight and Obesity in Adults (1998)). However, the present invention is also intended to include a disease, disorder, or condition that is characterized by a body mass index (BMI) of 25 kg/m² or more, 26 kg/m² or more, 27 kg/m² or more, 28 kg/m² or more, 29 kg/m² or more, 29.5 kg/m² or more, or 29.9 kg/m² or more, all of which are typically referred to as overweight (National Institutes of Health, Clinical Guidelines on the Identification, Evaluation, and Treatment of Overweight and Obesity in Adults (1998)).
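By way of non-limiting illustration, body mass index is conventionally computed as body mass in kilograms divided by the square of height in meters. The sketch below applies the thresholds of 25 and 30 recited above; the function names and category labels are hypothetical and are not part of the definition.

```python
def body_mass_index(mass_kg, height_m):
    """BMI = mass in kilograms divided by the square of height in meters."""
    if mass_kg <= 0 or height_m <= 0:
        raise ValueError("mass and height must be positive")
    return mass_kg / height_m ** 2

def classify_bmi(bmi):
    """Apply the thresholds recited above: 30 or more is obese, 25 or more is overweight."""
    if bmi >= 30:
        return "obese"
    if bmi >= 25:
        return "overweight"
    return "below overweight threshold"

# Hypothetical subject: 90 kg and 1.5 m tall.
bmi = body_mass_index(90, 1.5)
print(bmi, classify_bmi(bmi))
```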
A “susceptibility locus” for a particular disease is a sequence or gene locus implicated in the initiation or progression of the disease. The susceptibility locus can be, for example, a gene or a microsatellite repeat, as identified by a microsatellite marker, or can be identified by a defined single nucleotide polymorphism. Generally, susceptibility genes implicated in specific diseases and their loci can be found in scientific publications, but may also be determined experimentally.
The term “Gabp polypeptide” comprises Gabpa and Gabpb polypeptides. In preferred embodiments of the methods described herein, the Gabpa and Gabpb polypeptides are mammalian polypeptides, preferably human. The amino acid sequences of human Gabpa and Gabpb are deposited as Genbank Accession Nos. NP_002031 and NP_852092, respectively. Gabpa is also known as E4TF1-53 in the art, while Gabpb is also known as E4TF1-60. Additional assays to those described herein for assaying the transcriptional activity of Gabpa and Gabpb, and additional isoforms of these subunits, may be found in the art (Sawa, C., et al. Nucleic Acids Res. 24 (24), 4954-4961 (1996); Watanabe, et al. Mol. Cell. Biol. 13 (3), 1385-1391 (1993); Sawada, J., et al. J. Biol. Chem. 274 (50), 35475-35482 (1999); Suzuki, F., et al. J. Biol. Chem. 273 (45), 29302-29308 (1998); Gugneja, S., et al. Mol. Cell. Biol. 15 (1), 102-111 (1995); de la Brousse, F. C., et al. Genes Dev. 8 (15), 1853-1865 (1994); Virbasius, J. V., et al. Genes Dev. 7 (3), 380-392 (1993)), the teachings of which are incorporated by reference herein.
The term “PGC-1 polypeptide” comprises PGC-1a and PGC-1b polypeptides. In preferred embodiments of the methods described herein, the PGC-1a and PGC-1b polypeptides are mammalian polypeptides, preferably human. The amino acid sequences of human PGC-1a and PGC-1b are deposited as Genbank Accession Nos. NP_573570 and AF453324, respectively. Additional assays to those described herein for assaying the transcriptional activity of PGC-1a and PGC-1b, and additional isoforms of these polypeptides, may be found in the art (Huss, J. M., et al. J. Biol. Chem. 277 (43), 40265-40274 (2002); Kressler, D., et al. J. Biol. Chem. 277 (16), 13918-13925 (2002); Lin, J., et al. J. Biol. Chem. 277 (3), 1645-1648 (2002)), the teachings of which are incorporated by reference herein.
The term “Errα polypeptide” includes Errα polypeptides from any species. In some preferred embodiments of the methods described herein, an Errα polypeptide is a mammalian polypeptide, preferably a human polypeptide. The sequence of human Errα corresponds to Genbank Accession No. NP_004442. Additional isoforms of Errα and methods for assaying Errα activity are known in the art, e.g., Schreiber, S. N., et al. J. Biol. Chem. 278 (11), 9013-9018 (2003); Igarashi, M., et al. J. Gen. Virol. 84 (Pt 2), 319-327 (2003); Kraus, R. J., et al. J. Biol. Chem. 277 (27), 24826-24834 (2002); Vanacker, J. M., Oncogene 17 (19), 2429-2435 (1998); Sladek, R., et al. Genomics 45 (2), 320-326 (1997); Sladek, R., et al. Mol. Cell. Biol. 17 (9), 5400-5409 (1997); Shi, H., et al. Genomics 44 (1), 52-60 (1997); Yang, N., et al. J. Biol. Chem. 271 (10), 5795-5804 (1996); Giguere, V., et al. Nature 331 (6151), 91-94 (1988); Eiler, S., et al. Protein Expr. Purif. 22 (2), 165-173 (2001), the teachings of which are incorporated by reference herein.
The term “nuclear hormone receptors” comprises a large, well-defined family of ligand-activated transcription factors which modify the expression of target genes by binding to specific cis-acting sequences (Laudet et al., 1992, EMBO J, Vol, 1003-1013; Lopes da Silva et al., 1995, TINS 18, 542-548; Mangelsdorf et al., 1995, Cell 83, 835-839; Mangelsdorf et al., 1995, Cell 83, 841-850). Family members include both orphan receptors and receptors for a wide variety of clinically significant ligands including steroids, vitamin D, thyroid hormones, retinoic acid, etc. Additional receptors may be found in the literature (see, for example, The Nuclear Receptor FactsBook; Vincent Laudet (Editor); Elsevier Science & Technology, 2001).
The term “antibody” as used herein is intended to include whole antibodies, e.g., of any isotype (IgG, IgA, IgM, IgE, etc), and includes fragments thereof which are also specifically reactive with a vertebrate, e.g., mammalian, protein. Antibodies can be fragmented using conventional techniques and the fragments screened for utility and/or interaction with a specific epitope of interest. Thus, the term includes segments of proteolytically-cleaved or recombinantly-prepared portions of an antibody molecule that are capable of selectively reacting with a certain protein. Non-limiting examples of such proteolytic and/or recombinant fragments include Fab, F(ab′)2, Fab′, Fv, and single chain antibodies (scFv) containing a V[L] and/or V[H] domain joined by a peptide linker. The scFv's may be covalently or non-covalently linked to form antibodies having two or more binding sites. The term antibody also includes polyclonal, monoclonal, or other purified preparations of antibodies and recombinant antibodies.
The term “recombinant” as used in reference to a nucleic acid indicates any nucleic acid that is positioned adjacent to one or more nucleic acid sequences that it is not found adjacent to in nature. A recombinant nucleic acid may be generated in vitro, for example by using the methods of molecular biology, or in vivo, for example by insertion of a nucleic acid at a novel chromosomal location by homologous or non-homologous recombination. The term “recombinant” as used in reference to a polypeptide indicates any polypeptide that is produced by expression and translation of a recombinant nucleic acid.
The following terms are used to describe the sequence relationships between two or more polynucleotides: “reference sequence,” “comparison window,” “sequence identity,” “percentage of sequence identity,” and “substantial identity.” A reference sequence is a defined sequence used as a basis for a sequence comparison; a reference sequence can be a subset of a larger sequence, for example, a segment of a full-length cDNA or gene sequence given in a sequence listing, or may comprise a complete cDNA or gene sequence. Generally, a reference sequence is at least 20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50 nucleotides in length. Since two polynucleotides can each (1) comprise a sequence (for example a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) further comprise a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a “comparison window” to identify and compare local regions of sequence similarity. A comparison window, as used herein, refers to a conceptual segment of at least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a reference sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window can comprise additions and deletions (for example, gaps) of 20 percent or less as compared to the reference sequence (which would not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window can be conducted by the local identity algorithm (Smith and Waterman, Adv. Appl. Math., 2:482 (1981)), by the identity alignment algorithm (Needleman and Wunsch, J. Mol. Biol., 48:443 (1970)), by the search for similarity method (Pearson and Lipman, Proc. Natl. Acad. Sci. U.S.A. 85:2444 (1988)), by the computerized implementations of these algorithms such as GAP, BESTFIT, FASTA and TFASTA (Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, Madison, Wis.), or by inspection. Preferably, the best alignment (for example, the result having the highest percentage of identity over the comparison window) generated by the various methods is selected.
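By way of illustration only, the following Python sketch shows one way the percentage of sequence identity over a comparison window might be computed once two sequences have already been aligned. The function names, the use of “-” as the gap character, and the example sequences are assumptions made for this sketch and are not taken from any of the cited programs; the choice of the number of alignment columns as the denominator is one common convention among several.

```python
def percent_identity(aligned_query: str, aligned_ref: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same base."""
    if len(aligned_query) != len(aligned_ref) or not aligned_ref:
        raise ValueError("sequences must be aligned to the same non-zero length")
    pairs = zip(aligned_query.upper(), aligned_ref.upper())
    # A column counts as a match only if the bases agree and neither is a gap.
    matches = sum(1 for q, r in pairs if q == r and q != "-")
    return 100.0 * matches / len(aligned_ref)


def is_valid_comparison_window(aligned_query: str, aligned_ref: str,
                               min_length: int = 20,
                               max_gap_percent: float = 20.0) -> bool:
    """Check the window-length (>= 20) and gap (<= 20 percent) limits of the definition."""
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("sequences must be aligned to the same length")
    ref_length = len(aligned_ref.replace("-", ""))
    if ref_length < min_length:
        return False
    # Assumption: additions and deletions are counted as gapped columns
    # relative to the ungapped length of the reference sequence.
    gap_columns = sum(1 for q, r in zip(aligned_query, aligned_ref)
                      if q == "-" or r == "-")
    return 100.0 * gap_columns / ref_length <= max_gap_percent


# Hypothetical 20-nucleotide example sequences.
REFERENCE = "ACGTACGTACGTACGTACGT"
QUERY = "ACGAACGTACGTACGTTCGT"          # two substitutions
GAPPED_QUERY = "ACGT-----CGTACGTACGT"   # five deleted positions (25 percent)

print(percent_identity(QUERY, REFERENCE))
print(is_valid_comparison_window(GAPPED_QUERY, REFERENCE))
```

In practice the alignment itself would be produced by one of the algorithms cited above; the sketch only scores an alignment that is already available.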
The term “diagnostic” refers to assays that provide results which can be used by one skilled in the art, typically in combination with results from other assays, to determine if an individual is suffering from a disease or disorder of interest such as diabetes, including type I and type II, whereas the term “prognostic” refers to the use of such assays to evaluate the response of an individual having such a disease or disorder to therapeutic or prophylactic treatment. The term “pharmacogenetic” refers to the use of assays to predict which individual patients in a group will best respond to a particular therapeutic or prophylactic composition or treatment.
Other technical terms used herein have their ordinary meaning in the art in which they are used, as exemplified by a variety of technical dictionaries, such as the McGraw-Hill Dictionary of Chemical Terms and Stedman's Medical Dictionary.
III. Methods of Modulating Biological Responses in a Cell
In one aspect, the invention provides methods of modulating biological responses in a cell. One specific aspect of the invention provides a method of modulating a biological response in a cell, the method comprising contacting the cell with at least one agent that modulates the expression or activity of Errα or Gabp, wherein the biological response is (a) expression of at least one OXPHOS gene; (b) mitochondrial biogenesis; (c) expression of Nuclear Respiratory Factor 1 (NRF-1); (d) β-oxidation of fatty acids; (e) total mitochondrial respiration; (f) uncoupled respiration; (g) mitochondrial DNA replication; (h) expression of mitochondrial enzymes; or (i) skeletal muscle fiber-type switching.
In one embodiment of the methods described herein, the biological response that is modulated is the expression of at least one OXPHOS gene. OXPHOS genes have been described in Mootha et al., Nat. Genet. 2003; 34(3):267-73, hereby incorporated by reference in its entirety. In one embodiment, the OXPHOS gene is NDUFB3, SDHA, NDUFA8, COX7A1, UQCRC1, NDUFC1, NDUFS2, ATP5O, NDUFS3, SDHB, NDUFS5, NDUFB6, COX5B, CYC1, NDUFA7, UQCRB, COX7B, ATP5L, COX7C, NDUFA5, GRIM19, ATP5J, COX6A2, NDUFB5, CYCS, NDUFA2 or HSPC051.
In another embodiment of the methods described herein, the biological response that is modulated is mitochondrial biogenesis. U.S. Patent Publication No. 2002/0049176 describes assays for determining mitochondrial mass, volume or number, and is hereby incorporated by reference in its entirety.
In another embodiment of the methods described herein, the biological response that is modulated is expression of Nuclear Respiratory Factor 1 (NRF-1). NRF-1 is a transcription factor occurring as a homodimer of a 54 kDa polypeptide encoded by the nuclear gene nrf-1 (Evans and Scarpulla, Genes & Development 4:1023-1034 (1990), Scarpulla, J. Bioenergetics and Biomembranes 29:109-119 (1997), Moyes et al., J. Exper. Biol. 201:299-307 (1998)). NRF-1 binds to the upstream promoters of nuclear genes that encode respiratory components associated with mitochondrial transcription and replication. NRF-1 can be any NRF-1, such as rat, mouse or human. NRF-1 nucleotide and polypeptide sequences are described in U.S. Patent Publication No. 2002/0049176, hereby incorporated by reference in its entirety.
In another embodiment of the methods described herein, the biological response that is modulated is β-oxidation of fatty acids. In another embodiment of the methods described herein, the biological response that is modulated is total mitochondrial respiration. In another embodiment of the methods described herein, the biological response that is modulated is uncoupled respiration. Uncoupled respiration occurs when electron transport is uncoupled from ATP synthesis.
In another embodiment of the methods described herein, the biological response that is modulated is mitochondrial DNA replication. Quantification of mitochondrial DNA (mtDNA) content may be accomplished by one with routine skill in the art using any of a variety of established techniques that are useful for this purpose, including but not limited to, oligonucleotide probe hybridization or polymerase chain reaction (PCR) using oligonucleotide primers specific for mitochondrial DNA sequences (see, e.g., Miller et al., 1996 J. Neurochem. 67:1897; Fahy et al., 1997 Nucl. Ac. Res. 25:3102; U.S. patent application Ser. No. 09/098,079; Lee et al., 1998 Diabetes Res. Clin. Practice 42:161; Lee et al., 1997 Diabetes 46(suppl. 1): 175A). A particularly useful method is the primer extension assay disclosed by Fahy et al. (Nucl. Acids Res. 25:3102, 1997) and by Ghosh et al. (Am. J. Hum. Genet. 58:325, 1996). Suitable hybridization conditions may be found in the cited references or may be varied according to the particular nucleic acid target and oligonucleotide probe selected, using methodologies well known to those having ordinary skill in the art (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987; Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, 1989).
In another embodiment of the methods described herein, the biological response that is modulated is expression of mitochondrial enzymes. In one embodiment, mitochondrial enzymes are Electron Transport Chain (ETC) enzymes. An ETC enzyme refers to any mitochondrial molecular component that is a mitochondrial enzyme component of the mitochondrial electron transport chain (ETC) complex associated with the inner mitochondrial membrane and mitochondrial matrix. An ETC enzyme may include any of the multiple ETC subunit polypeptides encoded by mitochondrial and nuclear genes. The ETC is typically described as comprising complex I (NADH:ubiquinone reductase), complex II (succinate dehydrogenase), complex III (ubiquinone: cytochrome c oxidoreductase), complex IV (cytochrome c oxidase) and complex V (mitochondrial ATP synthetase), where each complex includes multiple polypeptides and cofactors (for review see, e.g., Walker et al., 1995 Meths. Enzymol. 260:14; Ernster et al., 1981 J. Cell Biol. 91:227s-255s, and references cited therein). A mitochondrial enzyme of the present invention may also comprise a Krebs cycle enzyme, which includes mitochondrial molecular components that mediate the series of biochemical/bioenergetic reactions also known as the citric acid cycle or the tricarboxylic acid cycle (see, e.g., Lehninger, Biochemistry, 1975 Worth Publishers, NY; Voet and Voet, Biochemistry, 1990 John Wiley & Sons, NY; Mathews and van Holde, Biochemistry, 1990 Benjamin Cummings, Menlo Park, Calif.). Krebs cycle enzymes include subunits and cofactors of citrate synthase, aconitase, isocitrate dehydrogenase, the α-ketoglutarate dehydrogenase complex, succinyl CoA synthetase, succinate dehydrogenase, fumarase and malate dehydrogenase. 
Krebs cycle enzymes further include enzymes and cofactors that are functionally linked to the reactions of the Krebs cycle, such as, for example, nicotinamide adenine dinucleotide, coenzyme A, thiamine pyrophosphate, lipoamide, guanosine diphosphate, flavin adenine dinucleotide and nucleoside diphosphokinase.
In another embodiment of the methods described herein, the biological response that is modulated is skeletal muscle fiber-type switching, that is, a shift towards type I oxidative skeletal muscle fibers. International PCT Application WO 03/068944 describes skeletal muscle fiber-type switching. In some embodiments, the agent increases at least one of the biological responses. In alternate embodiments, the agent decreases at least one of the biological responses.
The methods described herein for modulating a biological activity in a cell may be applied to any type of cell. In specific embodiments, the cell is a skeletal muscle cell, a smooth muscle cell, a cardiac muscle cell, a hepatocyte, an adipocyte, a neuronal cell, or a pancreatic cell. The cell may be a primary cell, a cell derived from a cell line, or a cell which has differentiated in vitro, such as a differentiated cell obtained through manipulation of a stem cell. In some embodiments, the cell is in an organism, while in other embodiments the cell is manipulated ex vivo, such as in cell or tissue culture. The methods described herein also apply to groups of cells, such as to whole tissues or organs. In some embodiments, the organism is a mammal, such as a mouse, a rat, an ungulate, a horse, a dog or a human.
In some embodiments, the human is afflicted with, at risk of developing, or suspected of being afflicted with, a disorder. In some embodiments, the disorder comprises a metabolic disorder, a disorder characterized by altered mitochondrial activity, a disorder characterized by sugar intolerance, or a combination thereof. In specific embodiments of the methods described herein, the disorder is diabetes, obesity, cardiac myopathy, aging, coronary atherosclerotic heart disease, diabetes mellitus, Alzheimer's Disease, Parkinson's Disease, Huntington's disease, dystonia, Leber's hereditary optic neuropathy (LHON), schizophrenia, myodegenerative disorders such as “mitochondrial encephalopathy, lactic acidosis, and stroke” (MELAS) and “myoclonic epilepsy ragged red fiber syndrome” (MERRF), NARP (Neuropathy; Ataxia; Retinitis Pigmentosa), MNGIE (Myopathy and external ophthalmoplegia; Neuropathy; Gastro-Intestinal; Encephalopathy), Kearns-Sayre disease, Pearson's Syndrome, PEO (Progressive External Ophthalmoplegia), congenital muscular dystrophy with mitochondrial structural abnormalities, Wolfram syndrome, Diabetes Insipidus, Diabetes Mellitus, Optic Atrophy Deafness, Leigh's Syndrome, fatal infantile myopathy with severe mitochondrial DNA (mtDNA) depletion, benign “later-onset” myopathy with moderate reduction in mtDNA, dystonia, medium chain acyl-CoA dehydrogenase deficiency, arthritis, mitochondrial diabetes and deafness (MIDD), or mitochondrial DNA depletion syndrome.
In one embodiment of the methods for modulating biological responses in a cell described herein, the agent modulates the formation of a complex between a PGC-1 polypeptide and (i) an Errα polypeptide; or (ii) a Gabp polypeptide. The agent may be an agent which increases formation of the complex in the cell, or it may be an agent that reduces formation of the complex in the cell. In embodiments where the agent increases a biological activity of the cell, the agent increases complex formation, whereas in embodiments where a biological activity is to be decreased, complex formation is decreased. One skilled in the art would recognize that complex formation, as used herein, refers to the normal association between the polypeptides which results in the transcriptional activation of target genes by the complex. Therefore, an agent which resulted in an aberrant aggregation of PGC-1α and Errα polypeptides, wherein the resulting complex has reduced transcriptional activating activity, would not result in increased biological activity but instead in less. Likewise, an agent which increased complex formation, but where the resulting complex was degraded in the cell, would result in less biological activity in the cell. Accordingly, in some specific embodiments for reducing biological activity, the agent results in increased complex formation, wherein the complex has reduced transcriptional activity or stability.
In one embodiment of the methods for modulating biological responses in a cell described herein, the agent modulates the expression level or the transcriptional activity of an Errα polypeptide, a Gabp polypeptide, or of both. The agent may comprise a polypeptide, a nucleic acid, or a chemical compound. In one embodiment of the methods for modulating biological responses in a cell described herein, the agent is itself an Errα polypeptide or a fragment thereof, or a Gabp polypeptide or a fragment thereof, or a nucleic acid encoding such polypeptides or fragments thereof.
In some embodiments of the methods for increasing biological responses in a cell described herein, the agent increases complex formation between a PGC-1 polypeptide and an Errα polypeptide. In preferred embodiments, the agent is specific for the complex formation between a PGC-1 polypeptide and an Errα polypeptide. In a preferred embodiment, the agent increases Errα activity by preferentially promoting complex formation between a PGC-1 polypeptide and an Errα polypeptide over complex formation between a PGC-1 polypeptide and at least one other polypeptide to which PGC-1 normally binds in an organism. Polypeptides to which PGC-1 normally binds in an organism include the following: nearly all nuclear receptors (e.g., PPAR-gamma, PPAR-alpha, thyroid hormone receptor, HNF4α, etc.) as well as other transcription factors, such as NRF1, NFAT, etc. (see Puigserver and Spiegelman, Endocr Rev. 2003; 24(1):78-90).
In another preferred embodiment, the agent increases Errα activity by preferentially promoting complex formation between a PGC-1 polypeptide and an Errα polypeptide over complex formation between a PGC-1 polypeptide and another nuclear receptor. In some embodiments, an agent which increases complex formation between a PGC-1 polypeptide and Errα does so at least 2, 5, 10, 20, 40, 50, 100, 200, 500, 1000, 5000, 10,000, 50,000 or 100,000-fold more potently than it increases complex formation between the same PGC-1 polypeptide and (i) at least one other polypeptide to which PGC-1 normally binds in an organism; or (ii) a nuclear receptor; or (iii) both. The fold-level of potency may be determined by measuring the association constant, the dissociation constant, or more preferably the Kd of the agent for the various complexes.
In parallel embodiments of the methods for inhibiting a biological response in a cell described herein, the agent preferentially inhibits complex formation between a PGC-1 polypeptide and an Errα polypeptide over complex formation between a PGC-1 polypeptide and another nuclear receptor. In some embodiments, an agent which decreases complex formation between a PGC-1 polypeptide and an Errα polypeptide does so at least 2, 5, 10, 20, 40, 50, 100, 200, 500, 1000, 5000, 10,000, 50,000 or 100,000-fold more potently than it decreases complex formation between the same PGC-1 polypeptide and (i) at least one other polypeptide to which PGC-1 normally binds in an organism; or (ii) a nuclear receptor; or (iii) both. In other embodiments, the IC50 for disrupting the interaction between a PGC-1 polypeptide and an Errα polypeptide is 2, 5, 10, 20, 40, 50, 100, 200, 500, 1000, 5000, 10,000, 50,000 or 100,000-fold lower than that for disrupting the interaction between a PGC-1 polypeptide and (i) at least one other polypeptide to which PGC-1 normally binds in an organism; or (ii) a nuclear hormone receptor.
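By way of illustration only, the fold-preference recited above may be expressed as a ratio of two potency measurements (for example, two Kd values or two IC50 values), where a lower value indicates greater potency. The following Python sketch computes that ratio and reports the highest recited threshold that it meets. The function names and the example values (stated in nM) are hypothetical and are not measurements from this disclosure.

```python
# Fold thresholds recited in the text.
FOLD_THRESHOLDS = (2, 5, 10, 20, 40, 50, 100, 200, 500,
                   1000, 5000, 10_000, 50_000, 100_000)


def fold_selectivity(target_value: float, off_target_value: float) -> float:
    """Ratio of off-target to target Kd (or IC50); lower values mean higher potency."""
    if target_value <= 0 or off_target_value <= 0:
        raise ValueError("Kd and IC50 values must be positive")
    return off_target_value / target_value


def highest_threshold_met(fold: float):
    """Largest recited fold threshold that the measured selectivity reaches, or None."""
    met = [t for t in FOLD_THRESHOLDS if fold >= t]
    return met[-1] if met else None


# Hypothetical IC50 values in nM: PGC-1/Errα complex versus PGC-1/other receptor.
fold = fold_selectivity(target_value=2.0, off_target_value=500.0)
print(fold, highest_threshold_met(fold))
```

Both values must be expressed in the same units and measured under comparable assay conditions for the ratio to be meaningful.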
In other embodiments of the methods described herein for modulating biological responses in a cell, a Gabp polypeptide may replace the Errα polypeptide. For example, instead of using an agent that modulates the interaction between a PGC-1 polypeptide and an Errα polypeptide, an agent is used that modulates the interaction between a PGC-1 polypeptide and a Gabp polypeptide. Thus all variations of the methods described herein for modulating biological responses in a cell using an Errα polypeptide may be applied to a Gabp polypeptide, such as a Gabpa polypeptide.
In another embodiment of the methods described herein for modulating biological responses in a cell, the cell is contacted with two agents, wherein one agent modulates the expression or activity of Errα and the other agent modulates the expression or activity of a Gabp polypeptide, such as a Gabpa polypeptide. In another embodiment, the cell is contacted with one agent which modulates the expression or activity of both Errα and a Gabp polypeptide.
IV. Methods of Preventing/Treating Disease
Some aspects of the invention provide methods of treating or preventing a disorder. Some aspects provide methods of preventing disorders which are associated with glucose intolerance, excess glucose production, insulin resistance, aberrant metabolism or abnormal mitochondrial function.
The invention further provides agents for the manufacture of medicaments to treat any of the disorders described herein. Any methods disclosed herein for treating or preventing a disorder by administering an agent to a subject may be applied to the use of the agent in the manufacture of a medicament to treat that disorder. For example, in one specific embodiment, an Errα agonist may be used in the manufacture of a medicament for the treatment of a disorder characterized by low mitochondrial function or by sugar intolerance, such as diabetes.
One aspect of the invention provides a method of treating or preventing a disorder characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance in a subject, the method comprising administering to the subject a therapeutically effective amount of an agent which (i) increases the expression or activity of Errα or Gabp or both; or (ii) increases the formation of a complex between a PGC-1 polypeptide and (a) an Errα polypeptide; (b) a Gabp polypeptide; or (c) both; or (iii) binds to (a) an Errα binding site, or to (b) a Gabpa binding site, and which increases transcription of at least one gene in the subject, said gene having an Errα binding site, a Gabpa binding site, or both.
In one embodiment, the agent which binds to (a) an Errα binding site, or to (b) a Gabp binding site, comprises at least one DNA binding domain. In a further embodiment, the DNA binding domain comprises at least one zinc-finger. In some embodiments, such agents comprise a DNA binding domain and a transactivation domain. Methods are known in the art for designing transcriptional activators or repressors which bind to specific DNA sequences, including those disclosed in U.S. Pat. Nos. 6,607,882, 6,453,242 and 6,511,808.
In one embodiment, the disorder is type 2 diabetes mellitus. In one embodiment of any of the methods described herein, a disorder characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance is diabetes, obesity, cardiac myopathy, aging, coronary atherosclerotic heart disease, diabetes mellitus, Alzheimer's Disease, Parkinson's Disease, Huntington's disease, dystonia, Leber's hereditary optic neuropathy (LHON), schizophrenia, myodegenerative disorders such as “mitochondrial encephalopathy, lactic acidosis, and stroke” (MELAS) and “myoclonic epilepsy ragged red fiber syndrome” (MERRF), NARP (Neuropathy; Ataxia; Retinitis Pigmentosa), MNGIE (Myopathy and external ophthalmoplegia; Neuropathy; Gastro-Intestinal; Encephalopathy), Kearns-Sayre disease, Pearson's Syndrome, PEO (Progressive External Ophthalmoplegia), congenital muscular dystrophy with mitochondrial structural abnormalities, Wolfram syndrome, Diabetes Insipidus, Diabetes Mellitus, Optic Atrophy Deafness, Leigh's Syndrome, fatal infantile myopathy with severe mitochondrial DNA (mtDNA) depletion, benign “later-onset” myopathy with moderate reduction in mtDNA, dystonia, medium chain acyl-CoA dehydrogenase deficiency, arthritis, mitochondrial diabetes and deafness (MIDD), or mitochondrial DNA depletion syndrome.
The invention further provides a method of treating or preventing a disorder characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance in a subject, the method comprising administering to the subject a therapeutically effective amount of an agent which increases the expression or activity of a gene, wherein the gene has an Errα binding site or a Gabpa binding site.
In one preferred embodiment of this method, the gene has both an Errα binding site and a Gabpa binding site. In one embodiment, the Errα binding site comprises the sequence 5′-TGACCTTG-3′ or the sequence 5′-CAAGGTCA-3′. In one embodiment, the Gabpa binding site comprises the sequence 5′-CTTCCG-3′ or 5′-CGGAAG-3′. It is well known by one of routine skill in the art that transcription factors may have binding sites other than their optimal binding sites to which they may bind in vivo or in vitro with substantially the same binding affinity as to their optimal binding sites. Accordingly, in some embodiments, an Errα binding site comprises any sequence that, when operably bound to a promoter, allows transcriptional control of the promoter by Errα. In another embodiment, an Errα binding site comprises any sequence that may be bound by an Errα polypeptide with high affinity, such as with a Kd that is less than about 10⁻⁵ M, about 10⁻⁶ M, about 10⁻⁷ M, about 10⁻⁸ M, about 10⁻⁹ M, about 10⁻¹⁰ M, about 10⁻¹¹ M, or about 10⁻¹² M. Likewise, in some embodiments, a Gabpa binding site comprises any sequence that, when operably bound to a promoter, allows transcriptional control of the promoter by Gabpa. In another embodiment, a Gabpa binding site comprises any sequence that may be bound by a Gabpa polypeptide with high affinity, such as with a Kd that is less than about 10⁻⁵ M, about 10⁻⁶ M, about 10⁻⁷ M, about 10⁻⁸ M, about 10⁻⁹ M, about 10⁻¹⁰ M, about 10⁻¹¹ M, or about 10⁻¹² M. In some embodiments, an Errα binding site comprises a sequence which is about 50%, 62.5%, 75%, or 87.5% identical to either 5′-TGACCTTG-3′ or to 5′-CAAGGTCA-3′. In some embodiments, a Gabpa binding site comprises a sequence which is about 50%, 66.6%, or 83.3% identical to either 5′-CTTCCG-3′ or 5′-CGGAAG-3′.
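By way of illustration only, the following Python sketch scans a nucleotide sequence for the binding-site sequences recited above, in both the recited orientation and its reverse complement, while tolerating a chosen number of mismatches. For the 8-nucleotide Errα site, one, two, three or four mismatches correspond to 87.5%, 75%, 62.5% and 50% identity; for the 6-nucleotide Gabpa site, one, two or three mismatches correspond to about 83.3%, 66.6% and 50% identity. The function names and the example promoter fragment are hypothetical.

```python
ERRA_SITE = "TGACCTTG"   # 5'-TGACCTTG-3'
GABPA_SITE = "CTTCCG"    # 5'-CTTCCG-3'

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(sequence: str, motif: str, max_mismatches: int = 0):
    """Return (position, strand, matched_window, mismatches) for every hit.

    Strand "+" means the window matches the motif as written; "-" means it
    matches the motif's reverse complement. Positions are 0-based.
    """
    sequence = sequence.upper()
    forms = (("+", motif.upper()), ("-", reverse_complement(motif)))
    k = len(motif)
    hits = []
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        for strand, form in forms:
            mismatches = sum(1 for a, b in zip(window, form) if a != b)
            if mismatches <= max_mismatches:
                hits.append((i, strand, window, mismatches))
    return hits


# Hypothetical promoter fragment containing one site of each kind.
EXAMPLE_PROMOTER = "AATGACCTTGCCGGAAGTT"

print(find_sites(EXAMPLE_PROMOTER, ERRA_SITE))
print(find_sites(EXAMPLE_PROMOTER, GABPA_SITE))
```

Counting mismatches as integers rather than comparing fractional identities avoids rounding problems at thresholds such as 83.3%.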
In another embodiment of any of the methods described herein, a gene which has an Errα binding site is any one of the genes listed on Table 10, a gene which has a Gabpa binding site is any one of the genes on Table 11, and a gene having both an Errα and a Gabpa binding site is any one of the genes listed on Table 12.
In yet another embodiment of this method, the binding sites are located within the promoter region of the gene. In one embodiment, the promoter region comprises from at least 0.5, 1, 1.5, 2, 2.5, 3, 4, 5 or 10 kb upstream of the transcriptional start site of the gene to at least either (i) 0.5, 1, 1.5, 2, 2.5, 3, 4, 5 or 10 kb downstream of the transcriptional start site of the gene; or (ii) 0.5, 1, 1.5, 2, 2.5, 3, 4, 5 or 10 kb downstream of the stop codon of the gene. In yet another embodiment of this method, the promoter region comprises a masked promoter region. A masked promoter region comprises the regions of promoters that are conserved between two organisms. For example, a masked promoter region may comprise the promoter sequences which are conserved between human and another mammal, such as a mouse. By sequences that are conserved, it is meant sequences which share at least 70% sequence identity between the two species across a window size of at least 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, or 50 nucleotides, or more preferably a window of 10 nucleotides.
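By way of illustration only, a masked promoter region of the kind described above might be derived from a pair of pre-aligned promoter sequences by sliding a window along the alignment and retaining the positions covered by at least one window that meets the identity threshold. The Python sketch below uses the preferred window of 10 nucleotides and a 70% threshold. The function name, the replacement of non-conserved positions with “N”, the rule that a position is retained if any qualifying window covers it, and the example sequences are all assumptions made for this sketch.

```python
def mask_conserved(aligned_a: str, aligned_b: str,
                   window: int = 10, min_identity_percent: int = 70) -> str:
    """Return aligned_a with positions outside conserved windows replaced by 'N'."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    a, b = aligned_a.upper(), aligned_b.upper()
    # A column is a match when the bases agree and neither is a gap.
    matches = [x == y and x != "-" for x, y in zip(a, b)]
    keep = [False] * len(a)
    for start in range(len(a) - window + 1):
        n_match = sum(matches[start:start + window])
        # Integer comparison avoids floating-point rounding at the threshold.
        if 100 * n_match >= min_identity_percent * window:
            for i in range(start, start + window):
                keep[i] = True
    return "".join(base if k else "N" for base, k in zip(a, keep))


# Hypothetical aligned fragments: identical over the first 10 positions only.
HUMAN = "ACGTACGTACTTTTTTTTTT"
MOUSE = "ACGTACGTACGGGGGGGGGG"

print(mask_conserved(HUMAN, MOUSE))
```

Because windows overlap, a few non-identical positions adjacent to a conserved block can be retained; a stricter variant could retain only the matching columns.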
In another embodiment, the binding sites are located within the promoter region, the coding region, the exons, the introns, or the untranslated region of the gene, or a combination thereof.
In yet another specific embodiment of the method, the gene having an Errα binding site or a Gabpa binding site is not Errα, while in another embodiment, the gene is not Gabpa. The agent which increases the activity or expression of a specific gene may be selected by one skilled in the art according to the type of protein that is encoded. For example, if the gene encodes an enzyme, then enzyme activators are expected to increase the activity of the enzyme. Likewise, if the gene encodes a receptor, a receptor agonist may be administered. Such agonists may comprise small organic molecules, such as those having a mass of less than 1 kDa, or may comprise an antibody that binds to the gene product and increases its activity. For any gene, an agent which increases the activity of the gene may comprise a polypeptide of the gene itself, or a nucleic acid containing the gene or an active fragment thereof.
In one embodiment of the methods described herein, reduced mitochondrial function comprises reduced total mitochondrial respiration, reduced uncoupled respiration, reduced expression of mitochondrial enzymes, reduced mitochondrial biogenesis or a combination thereof. In some embodiments of the methods for preventing or treating a disorder in a subject, at least one of the agents increases the expression or activity of Errα, of a Gabp polypeptide, or of both. In another embodiment, the agent promotes the expression or activity of a binding partner of PGC-1α or of PGC-1β. In yet another embodiment, the agent promotes the binding of PGC-1α to a transcriptional regulator. In some embodiments, the transcriptional regulator is Errα or Gabpa. In one preferred embodiment, the agent induces mitochondrial activity in skeletal muscle.
Another aspect of the invention provides a method of treating impaired glucose tolerance in an individual, comprising administering to the individual a therapeutically effective amount of an agent which increases the expression level of at least two OXPHOS-CR genes, thereby treating impaired glucose tolerance in the individual. Another aspect of the invention provides a method of treating obesity in an individual, comprising administering to the individual a therapeutically effective amount of an agent which increases the expression level of at least two OXPHOS-CR genes, thereby treating obesity in the individual. In preferred embodiments, the expression level of the OXPHOS-CR genes is increased in the skeletal muscle cells of the subject by at least 10%, 20%, 30%, 40%, 50% or 75%.
Another aspect of the invention provides methods of treating or preventing disorders characterized by an elevated metabolic rate in a subject and methods of lowering a metabolic rate in a subject. The invention provides a method of reducing the metabolic rate of a subject in need thereof, the method comprising administering to the subject a therapeutically effective amount of an agent which decreases the expression or activity of at least one of the following: (i) Errα; (ii) Gabpa; (iii) a gene having an Errα binding site, a Gabpa binding site, or both; or (iv) a transcriptional activator which binds to an Errα binding site or to a Gabpa binding site; thereby reducing the metabolic rate of the patient.
In some embodiments of the methods provided for reducing the metabolic rate of a subject in need thereof, the subject is afflicted with an infection, such as a viral infection. In one specific embodiment, the viral infection is a human immunodeficiency virus infection.
In another embodiment of the methods for reducing metabolic rates, the subject is afflicted with cancer or with cachexia. Cachexia is a metabolic condition characterized by weight loss and muscle wasting. It is associated with a wide range of conditions including inflammation, heart failure and malignancies, and is well known and described in the clinical literature (see, e.g., J. Natl. Cancer Inst. 89(23):1763-1773 (1997)). The mechanistic derangements underlying cachexia are not known, but it is clear that a negative energy balance obtains in the face of severe weight loss. In specific embodiments, the subject is afflicted with cancer cachexia, pulmonary cachexia, Russell's Diencephalic Cachexia, cardiac cachexia or chronic renal insufficiency.
In some embodiments of the methods provided for reducing the metabolic rate of a subject in need thereof, the agent decreases the formation of a complex between a PGC-1 polypeptide and (i) an Errα polypeptide; or (ii) a Gabp polypeptide. In preferred embodiments, the PGC-1 polypeptide is a PGC-1α polypeptide. In another embodiment, the agent decreases the expression level or the transcriptional activity of an Errα polypeptide, a Gabp polypeptide, or of both, while in additional embodiments the agent inhibits the expression or activity of a gene which has an Errα binding site, a Gabpa binding site, or both. In some embodiments, the agents comprise double stranded RNA reagents, dominant negative polypeptides or nucleic acids encoding them, or antibodies directed to Errα, Gabpa, Gabpb, or to genes (or their gene products) which have an Errα binding site, a Gabpa binding site, or both, such as binding sites in their promoter regions.
U.S. Pat. No. 5,602,009 describes a method of generating inhibitory nuclear hormone receptors. Such methods may be applied to Errα or to Gabp to generate polypeptides or nucleic acids which encode them, which may be used as agents in the methods described herein for reducing the metabolic rate of a subject.
V. Methods of Diagnosing/Identifying Disease Genes
One aspect of the invention provides methods of identifying susceptibility loci for a disorder characterized by reduced mitochondrial function or reduced metabolism. The identification of these loci allows for the diagnosis of the disorders and for the design or screening of agents for the treatment of these disorders.
The invention provides a method of identifying a susceptibility locus for a disorder that is characterized by reduced mitochondrial function, glucose intolerance, or insulin intolerance in a subject, the method comprising (i) identifying at least one polymorphism in a gene, or linked to a gene, wherein the gene (a) has an Errα binding site, a Gabpa binding site, or both; or (b) is Errα, Gabpa, or Gabpb; and (ii) determining if at least one polymorphism is associated with the incidence of the disorder, wherein if a polymorphism is associated with the incidence of the disorder then the gene having the polymorphism, or the gene to which the polymorphism is linked, is a susceptibility locus.
In one embodiment of the methods described herein for identifying a susceptibility locus for a disorder, the gene is any one of the genes listed in Tables 10-12.
As used herein, the term “polymorphism” refers to the co-existence, within a population, of more than one form of a gene or portion thereof (e.g. allelic variant), at a frequency too high to be explained by recurrent mutation alone. A portion of a gene of which there are at least two different forms, i.e. two different nucleotide sequences, is referred to as a “polymorphic region of a gene”. A specific genetic sequence at a polymorphic region of a gene is an allele.
A polymorphic region can be a single nucleotide or more than one nucleotide, the identity of which differs in different alleles. A polymorphic region can be a restriction fragment length polymorphism (RFLP). An RFLP refers to a variation in DNA sequence that alters the length of a restriction fragment as described in Botstein et al., Am. J. Hum. Genet. 32:314-331 (1980). The RFLP may create or delete a restriction site, thus changing the length of the restriction fragment. RFLPs have been widely used in human and animal genetic analyses (see WO 90/13668; WO 90/11369; Donis-Keller, Cell 51:319-337 (1987); Lander et al. Genetics 121, 85-99 (1989)). When a heritable trait can be linked to a particular RFLP, the presence of the RFLP in an individual can be used to predict the likelihood that the individual will also exhibit the trait.
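By way of illustration only, the following Python sketch shows how a single nucleotide change that deletes a restriction site alters the pattern of fragment lengths obtained from a linear DNA molecule. The EcoRI recognition sequence GAATTC, cut after the first G, is used as the example enzyme; the function name and the two allele sequences are hypothetical.

```python
def fragment_lengths(sequence: str, site: str = "GAATTC", cut_offset: int = 1):
    """Lengths of fragments from a complete digest of a linear molecule.

    cut_offset is the number of bases of the recognition site that remain on
    the upstream fragment (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical alleles: allele B carries an A-to-G change in the second site.
ALLELE_A = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
ALLELE_B = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTTT"

print(fragment_lengths(ALLELE_A))
print(fragment_lengths(ALLELE_B))
```

In this example the loss of the second site merges two fragments into one longer fragment, which is the length difference that size analysis would detect.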
Other polymorphisms take the form of short tandem repeats (STRs) that include tandem di-, tri- and tetranucleotide repeated motifs. These tandem repeats are also referred to as variable number tandem repeat (VNTR) polymorphisms. VNTRs have been used in identity and paternity analysis (U.S. Pat. No. 5,075,217; Armour et al., FEBS Lett. 307, 113-115 (1992); Horn et al. WO 91/14003; Jeffreys, EP 370,719), and in a large number of genetic mapping studies.
Other polymorphisms take the form of single nucleotide variations between individuals of the same species. Such single nucleotide variations may arise due to substitution of one nucleotide for another at the polymorphic site or from a deletion of a nucleotide or an insertion of a nucleotide relative to a referenced allele. These single nucleotide variations are referred to herein as single nucleotide polymorphisms (SNPs). Such SNPs are far more frequent than RFLPs, STRs and VNTRs. Some SNPs may occur in protein-coding sequences, in which case, one of the polymorphic forms may give rise to the expression of a defective protein and, potentially, a genetic disease. Other SNPs may occur in noncoding regions. Some of these polymorphisms may also result in defective protein expression (e.g. as a result of defective splicing). Other SNPs may have no phenotypic effects.
Techniques for determining the presence of particular alleles would be those known to persons skilled in the art and include, but are not limited to, nucleic acid techniques based on size or sequence, such as restriction fragment length polymorphism (RFLP), nucleic acid sequencing, or nucleic acid hybridization. The nucleic acid tested may be RNA or DNA. These techniques may also comprise the step of amplifying the nucleic acid before analysis. Amplification techniques are known to those of skill in the art and include, but are not limited to, cloning, polymerase chain reaction (PCR), polymerase chain reaction of specific alleles (PASA), polymerase chain ligation, nested polymerase chain reaction, and the like. Amplification products may be assayed in a variety of ways, including size analysis, restriction digestion followed by size analysis, detecting specific tagged oligonucleotide primers in the reaction products, allele-specific oligonucleotide (ASO) hybridization, allele specific exonuclease detection, sequencing, hybridization and the like. Polymorphic variations leading to altered protein sequences or structures may also be detected by analysis of the protein itself. Additional methods for the detection of polymorphisms are described in U.S. Pat. No. 6,453,244 and in International PCT publications No. WO 04/011668, WO 03/048384, WO 01/20031 and WO 03/038125, the teachings of which are hereby incorporated by reference.
General methods are available to one skilled in the art for determining if a particular allele is associated with the incidence of the disorder, such as those described in Analysis of Human Genetic Linkage, by Jurg Ott: Johns Hopkins University Press, 1999; and Statistical Genomics: Linkage, Mapping, and QTL Analysis by Ben Hui Liu: CRC Press, 1997.
The invention also provides a related method for determining if a subject is at risk of developing a disorder which is characterized by reduced mitochondrial function, the method comprising determining if a gene from the subject contains a mutation which reduces the function of the gene, wherein the gene has an Errα binding site, a Gabpa binding site, or both, wherein if a gene from the subject contains a mutation then the subject is at risk of developing the disorder.
In one embodiment of this method, the mutation reduces the function of the gene. In another embodiment, the disorder is diabetes, obesity, premature aging, cardiomyopathy, a neurodegenerative disease, or retinal degeneration. In further embodiments, the gene is any one of the genes on Tables 10-12.
The proposed role of the candidate genes or proteins can be validated by traditional overexpression or knockout approaches to ascertain the effects of such manipulations on mitochondrial biogenesis in the engineered cell lines. This approach ultimately identifies additional molecules whose expression or activity can be modulated to enhance mitochondrial function. For example, cultured skeletal muscle cells may be used with electrical stimulation or thyroid hormone as the stimulus for mitochondrial biogenesis. Alternatively, a fat cell culture such as 3T3-L1 cells may be used, with norepinephrine providing the stimulus for mitochondrial biogenesis. Alternatively, cultured cells such as HeLa or HEK293 that express PGC-1 and/or NRF-1 under a tetracycline inducible system may be used, wherein induced expression of PGC-1 and/or NRF-1 stimulates mitochondrial biogenesis. After sufficient time with the appropriate stimulus to allow induction (1-2 days), the cells are incubated with 32P-orthophosphate for 4 hrs. Cells are then harvested and subjected to SDS-PAGE to resolve the labeled proteins. Using these systems, the function of a candidate disease gene may be altered, such as through overexpression, expression of dominant negative forms of the proteins, inhibitory RNAi reagents, antibodies, and the like, and the effects on mitochondrial biogenesis or function determined.
VI. Methods of Identifying Therapeutic Agents
One aspect of the invention provides methods of identifying agents which modulate biological responses in a cell, which modulate expression of the OXPHOS-CR genes or which prevent or treat a disorder.
One aspect of the invention provides a method of determining if an agent is a potential agent for the treatment of a disorder that is characterized by glucose intolerance, insulin resistance or reduced mitochondrial function, the method comprising determining if the agent increases: (i) the expression or activity of Errα or Gabp in a cell; or (ii) the formation of a complex between a PGC-1 polypeptide and (1) an Errα polypeptide; or (2) a Gabp polypeptide; wherein an agent that increases (i) or (ii) is a potential target for the treatment of the disorder.
In some embodiments of the methods described herein for determining if an agent is a potential agent for the treatment of a disorder, the disorder is diabetes, obesity, cardiac myopathy, aging, coronary atherosclerotic heart disease, diabetes mellitus, Alzheimer's Disease, Parkinson's Disease, Huntington's disease, dystonia, Leber's hereditary optic neuropathy (LHON), schizophrenia, myodegenerative disorders such as “mitochondrial encephalopathy, lactic acidosis, and stroke” (MELAS) and “myoclonic epilepsy ragged red fiber syndrome” (MERRF), NARP (Neuropathy; Ataxia; Retinitis Pigmentosa), MNGIE (Myopathy and external ophthalmoplegia, neuropathy; gastrointestinal encephalopathy), Kearns-Sayre disease, Pearson's Syndrome, PEO (Progressive External Ophthalmoplegia), congenital muscular dystrophy with mitochondrial structural abnormalities, Wolfram syndrome, Diabetes Insipidus, Diabetes Mellitus, Optic Atrophy Deafness; Leigh's Syndrome, fatal infantile myopathy with severe mitochondrial DNA (mtDNA) depletion, benign “later-onset” myopathy with moderate reduction in mtDNA, medium chain acyl-CoA dehydrogenase deficiency, dystonia, arthritis, and mitochondrial diabetes and deafness (MIDD) or mitochondrial DNA depletion.
Any general method known to one skilled in the art may be applied to determine if an agent increases the expression or activity of Errα or Gabp. In one specific embodiment for determining if an agent increases the expression of Errα or Gabp, a cell is contacted with an agent, and an indicator of gene expression, such as mRNA level or protein level, is determined. Levels of mRNA may be determined, for example, using such techniques as Northern Blots, reverse-transcriptase polymerase chain reaction (RT-PCR), RNA protection assays or a DNA microarray comprising probes capable of detecting Errα or Gabp mRNA or cDNA molecules. Likewise, protein levels may be quantitated using techniques well-known in the art, such as western blotting, immuno-sandwich assays, ELISA assays, or any other immunological technique. Techniques for quantitating nucleic acids and proteins may be found, for example, in Molecular Cloning: A Laboratory Manual, 3rd Ed., ed. by Sambrook and Russell (Cold Spring Harbor Laboratory Press: 2001); and in Current Protocols in Cell Biology, ed. by Bonifacino, Dasso, Lippincott-Schwartz, Harford, and Yamada, John Wiley and Sons, Inc., New York, 1999, hereby incorporated by reference in their entirety.
In one example, an RC cell culture system can be used to identify compounds which activate production of ERRα or, once ERRα production has been activated in the cells, can be used to identify compounds which lead to suppression or switching off of ERRα production. Alternatively, such a cell culture system can be used to identify compounds or binding partners of ERRα which increase its expression. Compounds thus identified are useful as therapeutics in conditions where ERRα production is deficient or excessive. Similar experiments may be carried out with Gabpa or Gabpb or both.
Likewise, any general method known to one skilled in the art may be applied to determining if an agent increases the activity of Errα or Gabp. Activities of Errα or Gabp include their ability to bind to DNA, their ability to bind to other transcriptional regulators or their ability to promote transcription of target genes. In one embodiment, candidate agents are tested for their ability to modulate ERRα activity by (a) providing a system for measuring a biological activity of ERRα; and (b) measuring the biological activity of ERRα in the presence or absence of the candidate compound, wherein a change in ERRα activity in the presence of the compound relative to ERRα activity in the absence of the compound indicates an ability to modulate ERRα activity. In specific embodiments, the biological activity is the ability of Errα to bind the promoter of a target gene, such as the promoter of medium chain acyl-CoA dehydrogenase (MCAD), which may be determined using chromatin immunoprecipitation and analysis of the DNA bound to the Errα polypeptide. In another embodiment, the biological activity is the ability of Errα to complex with PGC-1α or PGC-1β, which may be measured by immunoprecipitation of either Errα or a PGC-1 polypeptide and determining the presence of the other protein by western blotting. In another embodiment, the biological activity is promoting transcription of a target gene. An indicator of gene expression for a target gene whose transcription is regulated by Errα or by Gabp can be compared between cells which have or have not been contacted with the agent. In specific embodiments, PGC-1α or PGC-1β is also present when testing whether an agent modulates the transcriptional activating activity of Errα or Gabp polypeptides. Target genes which may be used include those which contain either an Errα or a Gabp binding site, such as OXPHOS genes or those provided by the invention.
Because Gabpa and Gabpb form a complex, in some preferred embodiments both proteins, or nucleic acids encoding them, are present in the assay systems described herein.
One particular embodiment for identifying agents which modulate activity of Errα employs two genetic constructs. One is typically a plasmid that continuously expresses the transcriptional regulator of interest when transfected into an appropriate cell line. The second is a plasmid which expresses a reporter, e.g., luciferase under control of the transcriptional regulator. For example, if a compound which acts as a ligand for Errα is to be evaluated, one of the plasmids would be a construct that results in expression of the Errα in the cell line. The second would possess a promoter linked to the luciferase gene in which an Errα response element is inserted. If the compound to be tested is an agonist for the Errα receptor, the ligand will complex with the receptor and the resulting complex binds the response element and initiates transcription of the luciferase gene. In time the cells are lysed and a substrate for luciferase added. The resulting chemiluminescence is measured photometrically. Dose response curves are obtained and can be compared to the activity of known ligands. Other reporters than luciferase can be used including CAT and other enzymes. In one specific embodiment of this approach, the cells further express PGC-1α or PGC-1β, either endogenously or by introduction of a third plasmid encoding said polypeptides. The presence of PGC-1 polypeptides in the cell further allows for the identification of agents which increase or decrease the binding interaction between a PGC-1 polypeptide and Errα. This approach may also be modified to express both Gabpa and Gabpb to identify agents which modulate their transcriptional activity. Alternatively, a cell may be used which endogenously expresses any combination of polypeptides, such that only a plasmid encoding a reporter gene is introduced into the cell.
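By way of non-limiting illustration, a dose response curve obtained from such a reporter assay may be summarized by a half-maximal effective concentration. The Python sketch below estimates that value by interpolating on the logarithm of the dose between the two tested doses that bracket the half-maximal signal. The function name, the interpolation approach and the example luminescence readings are assumptions made for illustration only and are not taken from any experiment described herein; a fitted sigmoidal model could be substituted.

```python
import math


def estimate_ec50(doses, responses):
    """Estimate the dose giving a half-maximal response.

    Assumes `doses` are positive and ascending and that `responses`
    (e.g. luciferase light units) increase with dose.
    """
    lowest, highest = min(responses), max(responses)
    half = lowest + (highest - lowest) / 2.0
    points = list(zip(doses, responses))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 <= half <= r1 and r1 != r0:
            frac = (half - r0) / (r1 - r0)
            log_dose = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    raise ValueError("responses do not cross the half-maximal level")


# Hypothetical readings: molar doses and relative light units.
doses = [1e-9, 1e-8, 1e-7, 1e-6, 1e-5]
light_units = [100.0, 200.0, 500.0, 800.0, 900.0]
print(estimate_ec50(doses, light_units))
```

The estimate obtained for a test compound in this manner may then be compared with the value obtained for a known ligand under the same conditions.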
Viral constructs can be used to introduce the gene for Errα, Gabp or PGC-1 and the reporter into a cell. A usual viral vector is an adenovirus. For further details concerning this preferred assay, see U.S. Pat. No. 4,981,784 issued Jan. 1, 1991 hereby incorporated by reference, and Evans et al., WO88/03168 published on 5 May 1988, also incorporated by reference.
Errα antagonists can be identified using this same basic “agonist” assay. A fixed amount of an agonist is added to the cells with varying amounts of test compound to generate a dose response curve. If the compound is an antagonist, expression of luciferase is suppressed.
Additional methods for the isolation of agonists and antagonists of transcriptional regulators are described in U.S. Pat. Nos. 6,187,533, 5,620,887, 5,804,374, and 5,298,429, and U.S. Patent Publication Nos. 2004/003394, 2003/0077664, 2003/0215829 and 2003/0039980. Any of the methods described herein may be easily adapted to identify agonists or antagonists of any one of the Errα or Gabp polypeptides.
U.S. Pat. No. 6,555,326 (PCT Pub No. WO 99/27365) describes a fluorescent polarization assay for identifying agents which regulate the activity of nuclear hormone receptors, by using a nuclear hormone receptor, a peptide sensor and a candidate agent. Table 1 of this patent also lists exemplary nuclear hormone receptors. Such a method may easily be modified by one skilled in the art to identify agents which regulate the activity of Errα or Gabp.
The invention also provides a method for screening a candidate compound for its ability to modulate Errα activity, the method comprising measuring Errα activity in a suitable system in the presence or absence of the candidate compound. A change in Errα activity in the presence of the compound relative to ERRα activity in the absence of the compound indicates that the compound modulates ERRα activity. If ERRα activity is increased relative to the control in the presence of the compound, the compound is an ERRα agonist. Conversely, if ERRα activity is decreased in the presence of the compound, the compound is an ERRα antagonist.
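By way of non-limiting illustration, the comparison between activity measured in the presence and in the absence of a candidate compound may be expressed as in the following Python sketch. The function name and the 10% tolerance used to decide whether a change is meaningful are assumptions introduced for illustration; no particular threshold is prescribed herein, and in practice the threshold would reflect the variability of the assay employed.

```python
def classify_modulator(activity_with, activity_without, tolerance=0.10):
    """Classify a compound from activity measured with and without it.

    `tolerance` is a hypothetical fractional change below which the
    compound is treated as having no detectable effect.
    """
    if activity_without <= 0:
        raise ValueError("control activity must be positive")
    ratio = activity_with / activity_without
    if ratio > 1 + tolerance:
        return "agonist"
    if ratio < 1 - tolerance:
        return "antagonist"
    return "no detectable modulation"


print(classify_modulator(150.0, 100.0))
print(classify_modulator(40.0, 100.0))
print(classify_modulator(105.0, 100.0))
```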
Another way of determining if an agent increases the activity of Errα or Gabp may also be based on binding of the agent to an ERRα or to a Gabp polypeptide or fragment thereof. Such competitive binding assays are well known to those skilled in the art.
For example, the invention provides screening methods for compounds able to bind to ERRα which are therefore candidates for modifying the activity of ERRα. Various suitable screening methods are known to those in the art, including immobilization of ERRα on a substrate and exposure of the bound ERRα to candidate compounds, followed by elution of compounds which have bound to the ERRα. Additional methods and assays for identifying agents which modulate Errα activity, for generating Errα knock out animals and cells, and for generating Errα reagents, such as anti-Errα antibodies are described in International PCT publication No. WO 00/122988, hereby incorporated by reference in its entirety.
Another aspect of the invention provides a method of identifying an agent that modulates a biological response, the method comprising (a) contacting, in the presence of the agent, a PGC-1 polypeptide and (i) an Errα polypeptide, or (ii) a Gabp polypeptide, under conditions which allow the formation of a complex between the PGC-1 polypeptide and (i) the Errα polypeptide, or (ii) the Gabp polypeptide; and (b) detecting the presence of the complex; wherein an agent that modulates the biological response is identified if the agent increases or decreases the formation of the complex, and wherein the biological response is (a) expression of the OXPHOS genes; (b) mitochondrial biogenesis; (c) expression of Nuclear Respiratory Factor 1 (NRF-1); (d) β-oxidation of fatty acids; (e) total mitochondrial respiration; (f) uncoupled respiration; (g) mitochondrial DNA replication; or (h) expression of mitochondrial enzymes.
In some embodiments of the methods for identifying an agent that modulates a biological response, the agent increases the formation of the complex and increases the biological response. In alternate embodiments, the agent decreases the formation of the complex and decreases the biological response. In some embodiments, the conditions which allow the formation of a complex between the PGC-1 polypeptide and an Errα polypeptide or a Gabpa polypeptide comprise in vitro conditions, while in other embodiments they comprise in vivo conditions such as expression in a cell or in an organism.
The following embodiments of methods for identifying a compound that modulates a biological response, although directed at Errα and PGC-1α, are equally applicable to Gabp polypeptides, such as Gabpa polypeptides, or to PGC-1β polypeptides.
One embodiment of the methods for identifying a compound that modulates a biological response comprises: 1) combining: an Errα polypeptide or fragment thereof, a PGC-1α polypeptide or fragment thereof, and an agent, under conditions wherein the Errα and PGC-1α polypeptides physically interact in the absence of the agent, 2) determining if the agent interferes with the interaction, and 3) for an agent that interferes with the interaction, further assessing its ability to promote any of the biological responses of the cell, such as expression of the OXPHOS genes, mitochondrial biogenesis, expression of Nuclear Respiratory Factor 1 (NRF-1), β-oxidation of fatty acids, total mitochondrial respiration, uncoupled respiration, mitochondrial DNA replication or expression of mitochondrial enzymes.
A variety of assay formats will suffice and, in light of the present disclosure, those not expressly described herein will nevertheless be comprehended by one of ordinary skill in the art. Assay formats which approximate such conditions as formation of protein complexes or enzymatic activity may be generated in many different forms, and include assays based on cell-free systems, e.g. purified proteins or cell lysates, as well as cell-based assays which utilize intact cells. Simple binding assays can also be used to detect agents which bind to Errα or PGC-1α. Such binding assays may also identify agents that act by disrupting the interaction between an Errα polypeptide and PGC-1α. Agents to be tested can be produced, for example, by bacteria, yeast or other organisms (e.g. natural products), produced chemically (e.g. small molecules, including peptidomimetics), or produced recombinantly. Because Errα and PGC-1α polypeptides contain multiple domains, specific embodiments of the assays and methods described to identify agents which modulate complex formation between Errα and PGC-1α employ fragments of Errα rather than full-length polypeptides, such as those lacking the DNA binding domains. Fragments of PGC-1α may also be used in some embodiments, in particular fragments which retain the ability to complex with Errα.
In many drug screening programs which test libraries of compounds and natural extracts, high throughput assays are desirable in order to maximize the number of compounds surveyed in a given period of time. Assays of the present invention which are performed in cell-free systems, which may be developed with purified or semi-purified proteins or with lysates, are often preferred as “primary” screens in that they can be generated to permit rapid development and relatively easy detection of an alteration in a molecular target which is mediated by a test compound. Moreover, the effects of cellular toxicity and/or bioavailability of the test agent can be generally ignored in the in vitro system, the assay instead being focused primarily on the effect of the drug on the molecular target as may be manifest in an alteration of binding affinity with other proteins or changes in enzymatic properties of the molecular target.
In preferred in vitro embodiments of the present assay, a reconstituted Errα/PGC-1α complex comprises a reconstituted mixture of at least semi-purified proteins. By semi-purified, it is meant that the proteins utilized in the reconstituted mixture have been previously separated from other cellular or viral proteins. For instance, in contrast to cell lysates, the proteins involved in Errα/PGC-1α complex formation are present in the mixture to at least 50% purity relative to all other proteins in the mixture, and more preferably are present at 90-95% purity. In certain embodiments of the subject method, the reconstituted protein mixture is derived by mixing highly purified proteins such that the reconstituted mixture substantially lacks other proteins (such as of cellular or viral origin) which might interfere with or otherwise alter the ability to measure Errα/PGC-1α complex assembly and/or disassembly.
Assaying Errα/PGC-1α complexes, in the presence and absence of a candidate agent, can be accomplished in any vessel suitable for containing the reactants. Examples include microtiter plates, test tubes, and micro-centrifuge tubes. In a screening assay, the effect of a test agent may be assessed by, for example, determining the effect of the test agent on kinetics, steady-state and/or endpoint of the reaction.
In one embodiment of the present invention, drug screening assays can be generated which detect inhibitory agents on the basis of their ability to interfere with assembly or stability of the Errα/PGC-1α complex. In an exemplary binding assay, the compound of interest is contacted with a mixture comprising an Errα/PGC-1α complex. Detection and quantification of Errα/PGC-1α complexes provides a means for determining the compound's efficacy at inhibiting (or potentiating) interaction between the two polypeptides. The efficacy of the compound can be assessed by generating dose response curves from data obtained using various concentrations of the test compound. Moreover, a control assay can also be performed to provide a baseline for comparison. In the control assay, the formation of complexes is quantitated in the absence of the test compound.
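By way of non-limiting illustration, the comparison of complex formation at various concentrations of a test compound against the control assay may be sketched in Python as follows. The function names, the optional background subtraction and the example signal values are assumptions introduced for illustration only; the signal may be any quantitative readout of complex formation.

```python
def percent_of_control(signal, control_signal, background=0.0):
    """Express complex formation as a percentage of the no-compound control."""
    corrected_control = control_signal - background
    if corrected_control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (signal - background) / corrected_control


def dose_response(concentrations, signals, control_signal, background=0.0):
    """Pair each tested concentration with its percent-of-control value."""
    return [
        (conc, percent_of_control(sig, control_signal, background))
        for conc, sig in zip(concentrations, signals)
    ]


# Hypothetical readout (arbitrary units) at three compound concentrations.
curve = dose_response(
    concentrations=[0.1, 1.0, 10.0],
    signals=[1000.0, 550.0, 190.0],
    control_signal=1000.0,
    background=100.0,
)
print(curve)
```

Values below 100 percent of control in such a sketch would be consistent with inhibition of complex formation, and values above 100 percent with potentiation.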
Complex formation may be detected by a variety of techniques. For instance, modulation in the formation of complexes can be quantitated using, for example, detectably labeled proteins (e.g. radiolabeled, fluorescently labeled, or enzymatically labeled), by immunoassay, or by chromatographic detection. Surface plasmon resonance systems, such as those available from Biacore International AB (Uppsala, Sweden), may also be used to detect protein-protein interaction.
The proteins and peptides described herein may be immobilized. Often, it will be desirable to immobilize the peptides and polypeptides to facilitate separation of complexes from uncomplexed forms of one of the proteins, as well as to accommodate automation of the assay. The peptides and polypeptides can be immobilized on any solid matrix, such as a plate, a bead or a filter. The peptide or polypeptide can be immobilized on a matrix which contains reactive groups that bind to the polypeptide. Alternatively or in combination, reactive groups such as cysteines in the protein can react and bind to the matrix. In another embodiment, the polypeptide may be expressed as a fusion protein with another polypeptide which has a high binding affinity to the matrix, such as a fusion protein to streptavidin which binds biotin with high affinity.
In an illustrative embodiment, a fusion protein can be provided which adds a domain that permits the protein to be bound to an insoluble matrix. For example, a GST-ERRα fusion protein can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtitre plates, which are then combined with a PGC-1a polypeptide, e.g. an 35S-labeled polypeptide, and the test compound and incubated under conditions conducive to complex formation. Following incubation, the beads are washed to remove any unbound interacting protein, and the matrix bead-bound radiolabel determined directly (e.g. beads placed in scintillant), or in the supernatant after the complexes are dissociated, e.g. when microtitre plate is used. Alternatively, after washing away unbound protein, the complexes can be dissociated from the matrix, separated by SDS-PAGE gel, and the level of interacting polypeptide found in the matrix-bound fraction quantitated from the gel using standard electrophoretic techniques.
In yet another embodiment, the Errα and PGC-1α polypeptides can be used to generate an interaction trap assay (see also, U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J Biol Chem 268:12046-12054; Bartel et al. (1993) Biotechniques 14: 920-924; and Iwabuchi et al. (1993) Oncogene 8:1693-1696), for subsequently detecting agents which disrupt binding of the proteins to one another.
In still further embodiments of the present assay, the Errα/PGC-1α complex is generated in whole cells, taking advantage of cell culture techniques to support the subject assay. For example, as described below, the Errα/PGC-1α complex can be constituted in a eukaryotic cell culture system, such as a mammalian cell or a yeast cell. Other cells known to one skilled in the art may be used. Advantages to generating the subject assay in a whole cell include the ability to detect inhibitors which are functional in an environment more closely approximating that which therapeutic use of the inhibitor would require, including the ability of the agent to gain entry into the cell. Furthermore, certain of the in vivo embodiments of the assay, such as examples given below, are amenable to high-throughput analysis of candidate agents.
The components of the Errα/PGC-1a complex can be endogenous to the cell selected to support the assay. Alternatively, some or all of the components can be derived from exogenous sources. For instance, fusion proteins can be introduced into the cell by recombinant techniques (such as through the use of an expression vector), as well as by microinjecting the fusion protein itself or mRNA encoding the fusion protein.
In still further embodiments of the present assay, the Errα/PGC-1α complex is generated in whole cells and the level of interaction is determined by measuring the level of gene expression of (i) an endogenous gene or (ii) a transgene, whose expression is dependent on the formation of a complex. Genes which are responsive to the Errα/PGC-1α complex are provided by the invention and some may be found in the literature.
In specific embodiments, the cells used in the methods described herein for identifying agents are cells in culture or from a subject, such as a tissue, fluid or organ or a portion of any of the foregoing. For example, cells can preferably be from tissues that are involved in glucose metabolism, such as pancreatic cells, islets of Langerhans, pancreatic beta cells, muscle cells, liver cells or other appropriate cells. Preferably, cells are provided in culture and can be a primary cell line or a continuous cell line and can be provided as a clonal population of cells or a mixed population of cells.
VII. Methods of Identifying Agents which Modulate OXPHOS-CR Expression
Applicants have identified a core set of genes (OXPHOS-CR) that help unify previous observations from clinical investigation, exercise physiology, pharmacology, and genetics. Drugs that modulate OXPHOS-CR activity may be promising candidates for the prevention and/or treatment of type 2 diabetes. Applicants' discovery of OXPHOS-CR properties and previous observations support the hypothesis that drugs that increase OXPHOS-CR activity in muscle and fat will improve insulin resistance, while agents that reduce it will worsen insulin resistance. These drugs may have benefit in other processes characterized by aberrant oxidative capacity in these tissues, including obesity and aging.
The methods described in this section for identifying agents which regulate the expression level of one or more OXPHOS-CR genes may also identify agents which modulate PGC-1α, Gabp or Errα expression or activity, or agents which mimic or functionally substitute for these genes, since applicants have demonstrated that these three transcriptional regulators regulate the expression of OXPHOS-CR genes. Likewise, these methods also identify therapeutic agents which modulate metabolism or mitochondrial function in a subject in need thereof, such as a subject afflicted with diabetes.
Accordingly, the invention further provides cell based methods for identifying agents which regulate the expression of OXPHOS-CR genes. One aspect provides a method of identifying an agent that regulates expression of OXPHOS-CR genes, the method comprising (a) contacting (i) an agent to be assessed for its ability to regulate expression of OXPHOS-CR genes with (ii) a test cell; and (b) determining whether the expression levels of at least two OXPHOS-CR gene products show a coordinate change in the test cell compared to an appropriate control, wherein a coordinate change in the expression of the OXPHOS-CR gene products relative to the appropriate control indicates that the agent regulates the expression of OXPHOS-CR genes.
A related aspect of the invention provides a method of identifying an agent that regulates expression of a gene, wherein the gene is an OXPHOS-CR gene, the method comprising (a) contacting (i) an agent to be assessed for its ability to regulate expression of the gene with (ii) a test cell; and (b) determining whether the expression levels of two or more OXPHOS-CR gene products show a coordinate change in the test cell compared to an appropriate control, wherein the gene does not encode the two or more OXPHOS-CR gene products, and wherein a coordinate change in the expression of the OXPHOS-CR gene products relative to the appropriate control indicates that the agent regulates the expression level of the gene.
In some embodiments, the OXPHOS-CR gene products comprise an mRNA or a polypeptide. The gene products of the two genes need not be of the same type. For instance, in one specific embodiment, the mRNA levels of a first OXPHOS-CR gene, the polypeptide levels of a second OXPHOS-CR gene, and the enzymatic activity of a third OXPHOS-CR gene are determined. In a preferred embodiment, all the gene products comprise mRNAs.
In additional embodiments, determining whether the expression levels of at least two OXPHOS-CR gene products show a coordinate change in the test cell comprises detecting, either qualitatively, semiquantitatively, or more preferably quantitatively, the levels of the OXPHOS-CR gene products. In one embodiment, the coordinate change comprises an increase or a decrease in expression in all the genes tested. In another embodiment, a coordinate change comprises an increase or a decrease in at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% of the genes tested.
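By way of non-limiting illustration, the determination of a coordinate change may be sketched in Python as follows. The function name, the gene labels and the expression values are hypothetical placeholders introduced for illustration; the 60% default corresponds to the lowest of the fractions recited above, and any of the other recited fractions may be substituted.

```python
def coordinate_change(test_levels, control_levels, min_fraction=0.60):
    """Report whether the gene products measured in both the test cell and
    the control change in the same direction in at least `min_fraction`
    of the genes tested. Returns (direction or None, fraction)."""
    genes = sorted(set(test_levels) & set(control_levels))
    if len(genes) < 2:
        raise ValueError("at least two gene products are required")
    n = len(genes)
    up = sum(1 for g in genes if test_levels[g] > control_levels[g])
    down = sum(1 for g in genes if test_levels[g] < control_levels[g])
    if up >= down and up / n >= min_fraction:
        return "increase", up / n
    if down / n >= min_fraction:
        return "decrease", down / n
    return None, max(up, down) / n


# Hypothetical relative expression levels for five gene products.
control = {"gene_A": 1.0, "gene_B": 1.0, "gene_C": 1.0, "gene_D": 1.0, "gene_E": 1.0}
treated = {"gene_A": 2.0, "gene_B": 1.5, "gene_C": 1.8, "gene_D": 0.9, "gene_E": 1.3}
print(coordinate_change(treated, control))
```

In this sketch four of the five hypothetical gene products increase, so the agent would be scored as producing a coordinate increase at the 60% or 80% level but not at the 90% level.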
In a variation of this method, more than one cell is contacted with the agent. In yet another variation, multiple cells or cell populations are contacted with the agent, such that each cell or cell population provides a measure of expression for each of the OXPHOS-CR gene products. For example, if the expression level of four OXPHOS-CR genes is to be determined, then four cell populations, such as one in each well of a 96-well plate, are contacted with the agent, and from each well the expression level of one of the OXPHOS-CR genes is determined. Alternatively, two cell populations could be used and the expression level of two gene products could be determined from each of the two cell populations. In another embodiment, the cell or cell population is contacted with more than one agent.
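By way of non-limiting illustration, the distribution of gene product measurements across cell populations may be sketched in Python as follows. The function name and the gene labels are hypothetical, and the round-robin assignment is merely one possible scheme.

```python
def assign_genes_to_populations(genes, n_populations):
    """Distribute the gene products to be measured across cell populations
    (e.g. wells of a plate) in round-robin order."""
    if n_populations < 1 or n_populations > len(genes):
        raise ValueError("need between 1 and len(genes) populations")
    assignment = {i: [] for i in range(n_populations)}
    for idx, gene in enumerate(genes):
        assignment[idx % n_populations].append(gene)
    return assignment


genes = ["gene_1", "gene_2", "gene_3", "gene_4"]
print(assign_genes_to_populations(genes, 4))  # one gene product per population
print(assign_genes_to_populations(genes, 2))  # two gene products per population
```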
The expression level of the OXPHOS-CR gene products may be determined using techniques known in the art. Gene products which comprise an mRNA may be detected, for example, using reverse transcriptase mediated polymerase chain reaction (RT-PCR), Northern blot analysis, in situ hybridization, microarray analysis, etc. (Schena et al., Science 270:467-470 (1995); Lockhart et al., Nature Biotech. 14: 1675-1680 (1996), and U.S. Pat. Nos. 5,770,151, 5,807,522, 5,837,832, 5,952,180, 6,040,138 and 6,045,996). Polypeptide products may be detected using, for example, standard immunoassay methods known in the art. Such immunoassays include but are not limited to, competitive and non-competitive assay systems using techniques such as radioimmunoassays, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using colloidal gold, enzymatic, or radioisotope labels, for example), Western blots, 2-dimensional gel analysis, precipitation reactions, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays.
When the gene product comprises an enzyme, the level of gene product may be determined using a measure of enzymatic activity. Products of enzyme catalytic activity may be detected by suitable methods that will depend on the quantity and physicochemical properties of the particular product. Thus, detection may be, by way of illustration and not limitation, by radiometric, colorimetric, spectrophotometric, fluorimetric, immunometric or mass spectrometric procedures, or by other suitable means that will be readily apparent to a person having ordinary skill in the art. In certain embodiments of the invention, detection of a product of enzyme catalytic activity may be accomplished directly, and in certain other embodiments detection of a product may be accomplished by introduction of a detectable reporter moiety or label into a substrate or reactant such as a marker enzyme, dye, radionuclide, luminescent group, fluorescent group or biotin, or the like. The amount of such a label that is present as unreacted substrate and/or as reaction product, following a reaction to assay enzyme catalytic activity, is then determined using a method appropriate for the specific detectable reporter moiety or label. For radioactive groups, radionuclide decay monitoring, scintillation counting, scintillation proximity assays (SPA) or autoradiographic methods are generally appropriate. For immunometric measurements, suitably labeled antibodies may be prepared including, for example, those labeled with radionuclides, with fluorophores, with affinity tags, with biotin or biotin mimetic sequences or those prepared as antibody-enzyme conjugates (see, e.g., Weir, D. M., Handbook of Experimental Immunology, 1986, Blackwell Scientific, Boston; Scouten, W. H., Methods in Enzymology 135:30-65, 1987; Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988; Haugland, 1996 Handbook of Fluorescent Probes and Research Chemicals—Sixth Ed., Molecular Probes, Eugene, Oreg.; Scopes, R. K., Protein Purification: Principles and Practice, 1987, Springer-Verlag, NY; Hermanson, G. T. et al., Immobilized Affinity Ligand Techniques, 1992, Academic Press, Inc., NY; Luo et al., 1998 J. Biotechnol. 65:225 and references cited therein). Spectroscopic methods may be used to detect dyes (including, for example, colorimetric products of enzyme reactions), luminescent groups and fluorescent groups. Biotin may be detected using avidin or streptavidin, coupled to a different reporter group (commonly a radioactive or fluorescent group or an enzyme). Enzyme reporter groups may generally be detected by the addition of substrate (generally for a specific period of time), followed by spectroscopic, spectrophotometric or other analysis of the reaction products. Standards and standard additions may be used to determine the level of enzyme catalytic activity in a sample, using well known techniques.
In one embodiment, the promoter regions for two or more OXPHOS-CR genes (or larger portions of such genes) may be operatively linked to a reporter gene and used in a reporter gene-based assay to detect agents that enhance or diminish OXPHOS-CR gene expression. In such embodiments, the OXPHOS-CR gene product is the mRNA or polypeptide encoded by the reporter gene. In a specific embodiment, the reporter gene encodes a recombinant fluorescent polypeptide comprising a polypeptide selected from the group consisting of the green fluorescent protein (GFP), DsRed, zFP538, mRRFP1, BFP, CFP, YFP, mutants thereof, or functionally-active fragments thereof. GFP is described in U.S. Pat. No. 5,491,084, while zFP538 is described in Zagranichny et al. Biochemistry. 2004; 43(16):4764-72.
In another specific embodiment, the appropriate control comprises the expression level of the two or more OXPHOS-CR gene products in cells that (a) have not been contacted with the agent; (b) have been contacted with a different dosage of the agent; (c) have been contacted with a second agent; or (d) a combination thereof. Alternatively, an appropriate control may be a measure of the gene product in the cell prior to contacting with the agent. In another embodiment, the level of gene expression of the OXPHOS-CR gene product in the cell can be compared with a standard (e.g., presence or absence of an OXPHOS-CR gene product) or numerical value determined (e.g. from analysis of other samples) to correlate with a normal or expected level of expression.
In some embodiments, the identification of agents which regulate the expression of OXPHOS-CR genes is carried out in a high-throughput fashion. When screening agents in a high-throughput manner, such as when test compounds are screened for their effects on the cellular phenotype, arrays of cells may be prepared for parallel handling of cells and reagents. Standard 96 well microtiter plates which are 86 mm by 129 mm, with 6 mm diameter wells on a 9 mm pitch, may be used for compatibility with current automated loading and robotic handling systems. The microplate is typically 20 mm by 30 mm, with cell locations that are 100-200 microns in dimension on a pitch of about 500 microns. Methods for making microplates are described in U.S. Pat. No. 6,103,479, incorporated by reference herein in its entirety. Microplates may consist of coplanar layers of materials to which cells adhere, patterned with materials to which cells will not adhere, or etched 3-dimensional surfaces of similarly patterned materials. For the purpose of the following discussion, the terms ‘well’ and ‘microwell’ refer to a location in an array of any construction to which cells adhere and within which the cells are imaged. Microplates may also include fluid delivery channels in the spaces between the wells. The smaller format of a microplate increases the overall efficiency of the system by minimizing the quantities of the reagents, storage and handling during preparation and the overall movement required for the scanning operation. In addition, the whole area of the microplate can be imaged more efficiently.
In specific embodiments, the test cell that is contacted with the agent may be a primary cell, a cell within a tissue, or a cell line. In a preferred embodiment, the test cell is a liver cell; a skeletal muscle cell, such as a C2C12 myoblast; or a fat cell, such as a 3T3-L1 preadipocyte.
In one embodiment, the method for identifying an agent that regulates expression of OXPHOS-CR genes comprises determining whether the expression of at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26 or 27 OXPHOS-CR gene products shows a coordinate change. In a preferred embodiment, the expression level of five or fewer OXPHOS-CR gene products is determined. In a specific embodiment, the OXPHOS-CR gene products are selected from the group consisting of NDUFB3, SDHA, NDUFA8, COX7A1, UQCRC1, NDUFC1, NDUFS2, ATP5O, NDUFS3, SDHB, NDUFS5, NDUFB6, COX5B, CYC1, NDUFA7, UQCRB, COX7B, ATP5L, COX7C, NDUFA5, GRIM19, ATP5J, COX6A2, NDUFB5, CYCS, NDUFA2 and HSPC051. In a specific embodiment, one of the OXPHOS-CR genes is ubiquinol cytochrome c reductase binding protein (UQCRB). In a preferred embodiment, the OXPHOS-CR gene products are human OXPHOS-CR products. The OXPHOS-CR genes whose expression level is determined may be encoded by (i) mitochondrial DNA (mtDNA); (ii) nuclear DNA; or (iii) a combination thereof.
In one embodiment of the methods described herein for identifying agents which regulate the expression of OXPHOS-CR genes, the method further comprises determining if the agent regulates the expression of at least one gene which is not an OXPHOS-CR gene. In some embodiments, the method further comprises determining if the agent regulates the expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or 50 genes which are not OXPHOS-CR genes. Such genes may be mitochondrial genes or, in preferred embodiments, non-mitochondrial genes, such as actin genes. The expression level of another gene which is not an OXPHOS-CR gene may serve as an internal control, such that agents which specifically modulate the expression of an OXPHOS-CR gene may be identified.
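By way of illustration only, the use of a non-OXPHOS-CR gene as an internal control may be sketched as a simple normalization. The function name, the choice of ACTB as the actin reference gene and the numerical values are assumptions made for this example.

```python
from typing import Dict, Mapping


def normalized_fold_changes(
    control: Mapping[str, float],
    treated: Mapping[str, float],
    reference_gene: str,
) -> Dict[str, float]:
    """Fold change of each gene (treated/control) divided by the fold change
    of the internal control gene, so that a value near 1.0 means the gene
    merely tracked the reference and was not specifically modulated."""
    ref_ratio = treated[reference_gene] / control[reference_gene]
    return {
        gene: (treated[gene] / control[gene]) / ref_ratio
        for gene in control
        if gene != reference_gene
    }


# Hypothetical signal intensities; ACTB stands in for an actin internal control.
control_levels = {"ACTB": 100.0, "UQCRB": 10.0, "SDHA": 20.0}
treated_levels = {"ACTB": 200.0, "UQCRB": 40.0, "SDHA": 40.0}
fold = normalized_fold_changes(control_levels, treated_levels, "ACTB")
```

In this hypothetical case SDHA doubles only because the overall signal doubles, whereas UQCRB retains a two-fold change after normalization.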
In other embodiments, a secondary screening step is performed on the agent. In a specific embodiment, the agent is tested in additional assays for its effects on mitochondrial number or a mitochondrial function, such as coupled oxygen consumption. Such assays may comprise contacting a cell with the agent, measuring mitochondrial number or function, and comparing it to an appropriate control. U.S. Patent Publication No. 2002/0049176 describes assays for determining mitochondrial mass, volume or number, and U.S. Patent Publication No. 2002/0127536 describes assays for determining coupled oxygen consumption. Accordingly, in one embodiment, the agent being tested in the assays described herein additionally (a) increases the number of mitochondria in the test cell; (b) increases coupled oxygen consumption in the cell; (c) increases mtDNA copy number in the test cell; or (d) a combination thereof.
Agents identified using the methods of the present invention may also be tested in model systems for their efficacy in inducing the desired biological response or in treating disorders. One example is high-fat diet induced obesity and insulin resistance. In another example, agents may also be tested for their efficacy in treating diabetes by using a non-obese diabetic (NOD) mouse. The successful use of this animal model in diabetic drug discovery is reported in the literature (Yang et al., J. Autoimmun. 10:257-260 (1997), Akashi et al., Int. Immunol. 9:1159-1164 (1997), Suri and Katz, Immunol. Rev. 169:55-65 (1999), Pak et al., Autoimmunity 20:19-24 (1995), Toyoda and Formby, Bioessays 20:750-757 (1998), Cohen, Res. Immunol. 148:286-291 (1997), Baxter and Cooke, Diabetes Metab. Rev. 11:315-335 (1995), McDuffie, Curr. Opin. Immunol. 10:704-709 (1998), Shieh et al. Autoimmunity 15:123-135 (1993), Anderson et al., Autoimmunity 15:113-122 (1993)).
It is well understood by one skilled in the art that many of the methods described herein may be carried out using variants of the polypeptides described. Variants include truncated polypeptides, mutant polypeptides, such as those carrying point mutations, and fusions between domains of the subject polypeptides and other polypeptides. In some embodiments, the subject polypeptides, or their domains, may be fused to reporter proteins, such as to GFP or to enzymes. In some embodiments of any of the methods described herein, the polypeptides used are 50, 60, 70, 80, 90, 95, 98 or 99% identical to the sequences referenced in the various GenBank Accession numbers.
In the methods described herein for identifying an agent, the agent may comprise a recombinant polypeptide, a synthetic molecule, or a purified or partially purified naturally occurring molecule. In a specific embodiment, the agent comprises a virus or a phage. In another embodiment, the agent is a nuclear hormone, such as estrogen, thyroid hormone, cortisol, testosterone, and others. Additional agents include nucleic acids encoding nuclear hormone receptors.
In another embodiment, the agent comprises a set of environmental conditions. The condition may be a physical condition of the environment in which the cell resides, a chemical condition of the environment, and/or a biological condition of the site. Exposure may be for any suitable time. The exposure may be continuous, transient, periodic, sporadic, etc. Physical conditions include any physical state of the examination site. The physical state may be the temperature or pressure of the sample, or an amount or quality of light (electromagnetic radiation) at the site. Alternatively, or in addition, the physical state may relate to an electric field, magnetic field, and/or particle radiation at the site, among others. Chemical conditions include any chemical aspect of the fluid in which the sample populations are disposed. The chemical aspect may relate to presence or concentration of a test compound or material, pH, ionic strength, and/or fluid composition, among others.
Biological conditions include any biological aspect of the shared fluid volume in which cell populations are disposed. The biological aspects may include the presence, absence, concentration, activity, or type of cells, viruses, vesicles, organelles, biological extracts, and/or biological mixtures, among others. The assays described herein may screen a library of conditions to test the activity of each library member on a set of cell populations. A library generally comprises a collection of two or more different members. These members may be chemical modulators (or candidate modulators) in the form of molecules, ligands, compounds, transfection materials, receptors, antibodies, and/or cells (phages, viruses, whole cells, tissues, and/or cell extracts), among others, related by any suitable or desired common characteristic. This common characteristic may be “type.” Thus, the library may comprise a collection of two or more compounds, two or more different cells, two or more different antibodies, two or more different nucleic acids, two or more different ligands, two or more different receptors, or two or more different phages or whole cell populations distinguished by expressing different proteins, among others. This common characteristic also may be “function.” Thus, the library may comprise a collection of two or more binding partners (e.g., ligands and/or receptors), agonists, or antagonists, among others, independent of type.
Library members may be produced and/or otherwise generated or collected by any suitable mechanism, including chemical synthesis in vitro, enzymatic synthesis in vitro, and/or biosynthesis in a cell or organism. Chemically and/or enzymatically synthesized libraries may include libraries of compounds, such as synthetic oligonucleotides (DNA, RNA, peptide nucleic acids, and/or mixtures or modified derivatives thereof), small molecules (about 100 Da to 10 kDa), peptides, carbohydrates, lipids, and so on. Such chemically and/or enzymatically synthesized libraries may be formed by directed synthesis of individual library members, combinatorial synthesis of sets of library members, and/or random synthetic approaches. Library members produced by biosynthesis may include libraries of plasmids, complementary DNAs, genomic DNAs, RNAs, viruses, phages, cells, proteins, peptides, carbohydrates, lipids, extracellular matrices, cell lysates, cell mixtures, and/or materials secreted from cells, among others. Library members may contact arrays of cell populations singly or as groups/pools of two or more members.
VIII. Methods of Identifying Transcriptional Regulators
Another aspect of the invention provides methods of identifying transcriptional regulators. In some aspects, the invention provides methods of identifying transcriptional regulators which display differential activity between two cells.
The invention provides a method of identifying a transcriptional regulator having differential activity between an experimental cell and a control cell, the method comprising (i) determining the level of gene expression of at least two genes in the experimental cell and in the control cell; (ii) ranking genes according to a difference metric of their expression level in the experimental cell compared to the control cell; (iii) identifying a subset of genes, wherein each gene in the subset contains the same DNA sequence motif; (iv) testing via a nonparametric statistic if the subset of genes are enriched at either the top or the bottom of the ranking; (v) optionally reiterating steps (ii)-(iii) for additional motifs; (vi) for a subset of genes that is enriched, identifying a transcriptional regulator which binds to a DNA sequence motif that is contained in the subset of genes; thereby identifying a transcriptional regulator having differential activity between two cells.
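By way of illustration only, steps (ii) through (iv) of the foregoing method may be sketched as follows. The sketch assumes that a difference metric has already been computed for each gene, that a promoter sequence is available for each gene, and that the nonparametric statistic is a Mann-Whitney rank sum evaluated with a normal approximation without tie correction. All function names and data are hypothetical.

```python
import math
import re
from typing import Dict, List, Set


def rank_genes(scores: Dict[str, float]) -> List[str]:
    """Step (ii): order genes from highest to lowest difference metric."""
    return sorted(scores, key=scores.get, reverse=True)


def genes_with_motif(promoters: Dict[str, str], motif_regex: str) -> Set[str]:
    """Step (iii): genes whose promoter sequence contains the motif."""
    pattern = re.compile(motif_regex)
    return {gene for gene, seq in promoters.items() if pattern.search(seq)}


def mann_whitney_z(ranked: List[str], subset: Set[str]) -> float:
    """Step (iv): z-score of the rank sum of the subset (rank 1 = top).
    A negative z means the subset sits near the top of the ranking,
    a positive z means it sits near the bottom."""
    n1 = sum(1 for g in ranked if g in subset)
    n2 = len(ranked) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("subset must be a non-empty proper subset of the ranking")
    rank_sum = sum(i for i, g in enumerate(ranked, start=1) if g in subset)
    u = rank_sum - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


def two_sided_p(z: float) -> float:
    """Two-sided p-value under the standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical data: ten genes, three of which carry the motif and rank highest.
scores = {f"g{i}": 10.0 - i for i in range(10)}
promoters = {f"g{i}": "AACCCCCCCAA" for i in range(10)}
for name in ("g0", "g1", "g2"):
    promoters[name] = "AAGATGATCAA"

ranked = rank_genes(scores)
subset = genes_with_motif(promoters, "GAT[GT]ATC")
z = mann_whitney_z(ranked, subset)
p = two_sided_p(z)
```

Step (v) would correspond to repeating the subset selection and test for each additional motif, and step (vi) to looking up regulators for any motif whose subset passes the chosen significance threshold.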
The methods provided by the invention for identifying transcriptional regulators with differential activity are not limited to any type of cell or to any type of difference between the two cells. The cells may be eukaryotic, prokaryotic, yeast, nematode, insect, mammalian or human cells. The cells may be primary cells, or cell lines. The cells may be in an organism. In one specific embodiment, the cells are isolated from a subject.
The control and the experimental cell may be the same type of cell or they may be different types of cells. In one embodiment, the experimental cell and the control cell are both cells derived from the same cell line or from the same tissue types. In some embodiments, the experimental cell and the control cell are from different organisms, such as from two different subjects. In some specific embodiments in which the cells are derived from the same organism, one cell is a normal cell and another cell is a diseased cell. For instance, one cell may be a cancer cell and one may be a non-cancer cell, or one cell may be a virus infected cell and one may be a non-infected cell. In some embodiments, both cells may be diseased cells, but differ in their disease states. For instance, the two cells may be hyperplastic cells but at different stages of cancer progression e.g. one cell may be a tumor cell and the other a metastatic cell derived from that tumor. Furthermore, the two cells may differ genetically or they may be clonal cells with essentially identical genotypes. One or both of the cells may be experimentally manipulated, such as by contacting one of the cells with an agent, or contacting both cells with an agent but at different concentrations.
In some embodiments of the method, the subject from which one or both of the cells are derived is afflicted with a disorder. The method is not limited to any particular disorder. In some specific embodiments, the disorder is a metabolic disorder or a hyperplastic condition. Hyperplastic conditions include renal cell cancer, Kaposi's sarcoma, chronic leukemia, prostate cancer, breast cancer, sarcoma, pancreatic cancer, leukemia, ovarian carcinoma, rectal cancer, throat cancer, melanoma, colon cancer, bladder cancer, lymphoma, mastocytoma, lung cancer, mammary adenocarcinoma, pharyngeal squamous cell carcinoma, testicular cancer, gastrointestinal cancer, or stomach cancer, or a combination thereof. Additional disorders to which this method may be applied may be found, for example, in Braunwald, E. et al. eds. Harrison's Principles of Internal Medicine, 15th Edition (McGraw-Hill Book Company, New York, 2001).
In some embodiments, a transgene is introduced into the experimental cell. The transgene may encode any protein, such as transcriptional regulators or proteins that regulate the activity of transcriptional regulators, such as kinases and phosphatases. The transgene may also encode an inhibitory RNA, such as a hairpin RNA, so that the function of the gene to which the hairpin RNA is directed may be knocked down, allowing a comparison of gene expression between the two cells. In some embodiments, the transgene is a transgene associated with a disease state. For example, a gene whose overexpression leads to cancer may be overexpressed to identify transcriptional regulators having differential activity between the two cells. These transcriptional regulators may then be used as therapeutic targets for the treatment of cancer. In some embodiments, the transgene is a mutant transgene, such as a mutant transgene associated with a disease state.
In some embodiments, the DNA sequence motif is at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 25 nucleotides in length, preferably at least 5. The DNA sequence motif may be any combination of nucleotides, and it may represent a known binding site or a novel binding site. In some embodiments, the DNA sequence motif comprises undefined nucleotide positions which may contain more than one base. For instance, a DNA sequence motif may comprise the sequence GATNNATC, wherein the 4th and 5th positions would include any of the four bases. Similarly, a DNA sequence motif comprising the sequence GAT(G/T)ATC would have a G or a T in the fourth position. In some embodiments, the DNA sequence motif comprises at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 defined positions.
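By way of illustration only, motifs written in the notation used above (N for any base, and parenthesized alternatives such as (G/T)) may be translated into regular expressions and located in a sequence as sketched below. The function names are hypothetical, and the reported positions are 0-based offsets.

```python
import re
from typing import List


def motif_to_regex(motif: str) -> str:
    """Translate a motif such as GATNNATC or GAT(G/T)ATC into a regex string."""
    parts = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "(":
            # Parenthesized alternatives, e.g. (G/T), become a character class.
            j = motif.index(")", i)
            parts.append("[" + "".join(motif[i + 1:j].split("/")) + "]")
            i = j + 1
        elif ch == "N":
            parts.append("[ACGT]")
            i += 1
        elif ch in "ACGT":
            parts.append(ch)
            i += 1
        else:
            raise ValueError(f"unsupported motif symbol: {ch!r}")
    return "".join(parts)


def find_motif(sequence: str, motif: str) -> List[int]:
    """Return 0-based start offsets of all (possibly overlapping) matches."""
    # A lookahead lets overlapping occurrences be reported.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]
```

For example, `find_motif("ccGATTATCgg", "GAT(G/T)ATC")` reports a single match at offset 2, whereas the sequence GATCATC does not match because C is not an allowed base at the degenerate position.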
The method can be applied to any number of motifs. In one embodiment, all permutations of DNA sequence motifs of at least 6, 7, 8 and 9 bases in length are tested. The number selected may depend on the number of genes in the subset, the computational capabilities available, and the size of the window in each gene in which the DNA sequence motif is searched.
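By way of illustration only, the size of an exhaustive motif search may be estimated as sketched below. The sketch assumes that "all permutations" means every possible sequence of fully defined bases of the stated lengths; the function names are hypothetical.

```python
from itertools import product
from typing import Iterable, Iterator


def all_motifs(k: int, alphabet: str = "ACGT") -> Iterator[str]:
    """Lazily yield every fully defined motif of length k."""
    return ("".join(bases) for bases in product(alphabet, repeat=k))


def motif_count(lengths: Iterable[int], alphabet_size: int = 4) -> int:
    """Number of candidate motifs across the given lengths."""
    return sum(alphabet_size ** k for k in lengths)


total = motif_count([6, 7, 8, 9])  # 4**6 + 4**7 + 4**8 + 4**9
```

Under this assumption the search over lengths 6 through 9 covers 348,160 candidate motifs, which indicates why the computational capabilities available bear on the number of motifs selected.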
The method is not limited to any particular method of measuring gene expression. In some embodiments, determining the level of expression of a gene in a cell comprises determining the levels of mRNA for the gene in the cell. Any method known in the art may be used to determine mRNA levels. In one embodiment, mRNA is isolated from the cell, and the levels of mRNA for each gene in the subset are determined by hybridizing the mRNA, or cDNA derived from the mRNA, to a DNA microarray.
In some embodiments of the methods described herein, identifying the transcriptional regulator which binds to a DNA sequence motif comprises searching a database comprising transcriptional regulators and DNA sequence motifs to which they bind. For example, the TRANSFAC transcription factor database, maintained at the GBF Braunschweig, Germany, defines sequence specific binding site patterns, or motifs, for transcription factors. In another embodiment, the transcriptional regulator is identified by comparing the sequences identified to those found in the literature. It is understood by one skilled in the art that more than one transcriptional regulator may bind to a given DNA sequence motif, and therefore multiple transcriptional regulators may be identified.
In some embodiments of the method described herein, identifying a transcriptional regulator which binds to a DNA sequence motif comprises experimentally identifying a transcriptional regulator which binds to the DNA sequence motif. In one embodiment, this is achieved by identifying, from a library of genes, a transcriptional regulator capable of driving the expression of a selectable marker, wherein the expression of the selectable marker is dependent on binding of the transcriptional regulator to the DNA sequence motif. In a specific embodiment, a reporter gene is introduced into a cell, such as a mammalian cell or a yeast cell, wherein the promoter of the reporter gene is operably linked to the DNA sequence motif. A plasmid library which comprises candidate transcriptional regulator genes is introduced into the cells such that the transcriptional regulators are expressed in the cell. If a transcriptional regulator is able to bind to the DNA sequence motif, it will increase or decrease expression of the reporter gene, allowing identification of the cell expressing said regulator and thus allowing its identification. In a specific embodiment, a yeast one-hybrid approach, or other approaches well known to one skilled in the art, is used to identify a transcriptional regulator which binds to the DNA sequence motif (Vidal M et al. Nucleic Acids Res. 1999; 27(4):919-29; Kadonaga et al., (1986) Proc. Natl. Acad. Sci. USA, 83, 5889-5893; Singh et al. (1988) Cell, 52, 415-423; Chong, J. A. et al. (1997) In Bartel, P. L. and Fields, S. (eds), The Yeast Two-Hybrid System. Oxford University Press, New York, N.Y., pp. 289-297). Transcriptional regulators may also be identified based on their binding affinity for the DNA sequence motif, such as by standard affinity chromatography.
In some embodiments, the non-parametric statistic is a non-parametric rank sum statistic. In specific embodiments, the non-parametric statistic is selected from the group consisting of a Kolmogorov-Smirnov, a Mann-Whitney and a Wald-Wolfowitz statistic. Non-parametric statistics are well-known in the art (David J. Sheskin, Handbook of Parametric and Nonparametric Statistical Procedures, CRC Press, 2003; Myles Hollander, Douglas A. Wolfe, Nonparametric Statistical Methods, Wiley, John & Sons, Inc., 1998; Larry Wasserman, All of Statistics, Springer-Verlag New York, Incorporated, 2003). In some embodiments, the difference metric is a difference in arithmetic means, t-test scores, or signal to noise ratios. In some embodiments, a gene set is said to be enriched if the probability that the gene set would be enriched by chance, or when compared to an appropriate null hypothesis, is less than 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.0001, 0.00005 or 0.00001.
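By way of illustration only, the three difference metrics named above may be computed for a single gene as sketched below. The text does not fix the exact formulas; the sketch assumes a Welch (unequal variance) t-score and a signal-to-noise ratio defined as the difference in means divided by the sum of the sample standard deviations. The function names and values are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def mean_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Difference in arithmetic means between the two groups."""
    return mean(a) - mean(b)


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """t-score with unequal variances (an assumed choice of t-test)."""
    return (mean(a) - mean(b)) / math.sqrt(
        stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)
    )


def signal_to_noise(a: Sequence[float], b: Sequence[float]) -> float:
    """(mean_a - mean_b) / (sd_a + sd_b), an assumed definition."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


# Hypothetical expression values for one gene in the experimental and control cells.
experimental = [5.0, 6.0, 7.0]
control = [1.0, 2.0, 3.0]
```

Each metric requires at least two measurements per group where a standard deviation is used, and any of the three may be used to produce the ranking on which the non-parametric statistic operates.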
In some embodiments where the experimental cell expresses a recombinant transgene, such as a recombinant transcriptional regulator, the recombinant transcriptional regulator may itself be found to have differential activity. In other embodiments where the experimental cell expresses a recombinant transgene, the method may yield transcriptional regulators whose activity or expression is itself regulated by the recombinant transcriptional regulator. If a recombinant transcriptional regulator whose activity is related to a disease state is used, identification of transcriptional regulators having differential activity between the two cells may yield therapeutic targets to treat the disorder.
IX. Biomarker Set Enrichment Analysis (BSEA)
One aspect of the invention provides methods of detecting statistically-significant differences in the expression level of at least one biomarker belonging to a biomarker set, between the members of a first and of a second experimental group. Applicants have named this new analytical technique Biomarker Set Enrichment Analysis (BSEA), or Gene Set Enrichment Analysis (GSEA) when the biomarker is a gene or a gene product.
GSEA may be valuable in efforts to relate genomic variation to disease and measures of total body physiology. Single-gene methods are powerful only where the individual gene effect is dramatic and the variance small, which may not be the case in many disease states. Methods like GSEA are complementary, and provide a framework with which to examine changes operating at a higher level of biological organization. This may be needed if common, complex disorders typically result from modest variation in the expression or activity of multiple members of a pathway, e.g., gene (biomarker) sets. As gene sets are systematically assembled using functional and genomic approaches, methods such as GSEA will likely be valuable in detecting coordinated but subtle variation in gene function that contributes to common human diseases. Accordingly, in a preferred embodiment, the methods detect statistically-significant differences in the expression level of more than one biomarker.
One aspect of the invention provides a method of detecting statistically-significant differences in the expression level of at least one biomarker belonging to a biomarker set, between the members of a first and of a second experimental group, comprising: (a) obtaining a biomarker sample from members of the first and the second experimental groups; (b) determining, for each biomarker sample, the expression levels of at least one biomarker belonging to the biomarker set and of at least one biomarker not belonging to the set; (c) generating a rank order of each biomarker according to a difference metric of its expression level in the first experimental group compared to the second experimental group; (d) calculating an experimental enrichment score for the biomarker set by applying a non parametric statistic; and (e) comparing the experimental enrichment score with a distribution of randomized enrichment scores to calculate the fraction of randomized enrichment scores greater than the experimental enrichment score, wherein a low fraction indicates a statistically-significant difference in the expression level of the biomarker set between the members of the first and of the second experimental group.
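By way of illustration only, step (d) may be sketched with a Kolmogorov-Smirnov-style running sum. The sketch assumes an unweighted, one-sided score in which the running sum rises by 1/(number of set members) at each set member and falls by 1/(number of non-members) otherwise, and the enrichment score is the maximum of the running sum. The function name and data are hypothetical.

```python
from typing import List, Set


def enrichment_score(ranked: List[str], biomarker_set: Set[str]) -> float:
    """Maximum of a running sum over a ranked list of biomarkers.
    A score near 1.0 means the set members are concentrated at the top."""
    n_hit = sum(1 for b in ranked if b in biomarker_set)
    n_miss = len(ranked) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("the ranking must contain both set members and non-members")
    running = 0.0
    best = 0.0
    for biomarker in ranked:
        if biomarker in biomarker_set:
            running += 1.0 / n_hit
        else:
            running -= 1.0 / n_miss
        best = max(best, running)
    return best


# Hypothetical rank order, highest difference metric first.
ranked = list("abcdefghij")
top_score = enrichment_score(ranked, {"a", "b", "c"})
bottom_score = enrichment_score(ranked, {"h", "i", "j"})
```

A set whose members all sit at the top of the rank order scores about 1.0, while a set confined to the bottom scores about 0 under this one-sided definition.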
In one embodiment of the foregoing methods, the distribution of randomized enrichment scores is generated by (i) randomly permuting the assignment of each biomarker sample to the first or to the second experimental group; (ii) generating a rank order of each biomarker according to the absolute value of a difference metric of its expression level in the first experimental group compared to the second experimental group; (iii) calculating a randomized enrichment score for the biomarker set by applying a non-parametric statistic to the rank order; and (iv) repeating steps (i), (ii) and (iii) a number of times sufficient to generate the distribution of randomized enrichment scores. In a specific embodiment, the number of times sufficient to generate a distribution is at least 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200 or 500 times. In another specific embodiment, the low fraction is less than 0.05, while in other embodiments it is less than 0.04, 0.03, 0.02, 0.01, 0.005 or 0.001.
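By way of illustration only, the permutation procedure may be sketched as follows. For simplicity the sketch ranks biomarkers by a signed difference in arithmetic means for both the experimental and the randomized scores, and uses an unweighted running-sum statistic; these choices, the function names, the fixed random seed and the data are assumptions made for this example.

```python
import random
from statistics import mean
from typing import Dict, List, Sequence, Set, Tuple


def mean_diff_scores(expr: Dict[str, Sequence[float]], labels: Sequence[int]) -> Dict[str, float]:
    """Difference in means (group 1 minus group 2) for every biomarker."""
    scores = {}
    for name, values in expr.items():
        g1 = [v for v, lab in zip(values, labels) if lab == 1]
        g2 = [v for v, lab in zip(values, labels) if lab == 2]
        scores[name] = mean(g1) - mean(g2)
    return scores


def enrichment_score(scores: Dict[str, float], biomarker_set: Set[str]) -> float:
    """Maximum of an unweighted running sum over the rank order."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_hit = sum(1 for b in ranked if b in biomarker_set)
    n_miss = len(ranked) - n_hit
    running = best = 0.0
    for b in ranked:
        running += 1.0 / n_hit if b in biomarker_set else -1.0 / n_miss
        best = max(best, running)
    return best


def permutation_fraction(
    expr: Dict[str, Sequence[float]],
    labels: Sequence[int],
    biomarker_set: Set[str],
    n_perm: int = 200,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (experimental score, fraction of randomized scores exceeding it)."""
    rng = random.Random(seed)
    observed = enrichment_score(mean_diff_scores(expr, labels), biomarker_set)
    shuffled: List[int] = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # step (i): permute the group assignments
        randomized = enrichment_score(mean_diff_scores(expr, shuffled), biomarker_set)
        if randomized > observed:
            exceed += 1
    return observed, exceed / n_perm


# Hypothetical data: three set members elevated in group 1, seven unchanged biomarkers.
labels = [1, 1, 1, 1, 2, 2, 2, 2]
expr = {
    "s1": [9, 9, 9, 9, 1, 1, 1, 1],
    "s2": [8, 8, 8, 8, 1, 1, 1, 1],
    "s3": [7, 7, 7, 7, 1, 1, 1, 1],
}
expr.update({f"o{i}": [1, 2, 1, 2, 1, 2, 1, 2] for i in range(1, 8)})
observed, fraction = permutation_fraction(expr, labels, {"s1", "s2", "s3"})
```

The returned fraction plays the role of the "low fraction" referred to above and would be compared with the chosen threshold, such as 0.05.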
In one embodiment of the foregoing methods, the distribution of randomized enrichment scores is a normal distribution. The difference metric may be any difference metric, such as a difference in arithmetic means, a difference in t-test scores, or a difference in signal-to-noise ratio. Similarly, the non-parametric statistic may be any non-parametric statistic, such as Mann-Whitney, Wald-Wolfowitz or more preferably Kolmogorov-Smirnov.
The biomarker set typically comprises elements of a pathway, such as a metabolic pathway, a biochemical pathway, a signaling pathway, or any set of genes which share a common biological function or which are coordinately regulated. In a preferred embodiment, the biomarker is selected from the group consisting of a nucleic acid, a polypeptide, a metabolite and a genotype. For example, when the biomarker set comprises genes encoding enzymes of a metabolic pathway, such as glycolytic enzymes, the biomarkers may comprise the genotype of the glycolytic genes. In the embodiment where the biomarker is a genotype, the genotype of all or a subset of the glycolytic genes may be determined by DNA sequencing, and the expression level of the genotype would correspond to the amount of polymorphic DNA i.e. 0, 1 or 2 copies of a wild-type copy of the gene for a diploid cell or organism. Alternatively, the number of mutant copies, or of a specific mutation, can be used in determining the expression level of the genotype.
In other embodiments where the biomarker is the mRNA of each of, or of a subset of, the glycolytic enzymes, the expression level of the mRNA may be determined, or the expression level of a particular splice isoform, using methods well known in the art, such as by northern blots or microarray analysis. In other embodiments where the biomarker is the protein of each of, or of a subset of, the glycolytic enzymes, the level of expression may comprise total protein levels or levels of a particular modified form of the protein, such as the level of phosphorylated or glycosylated protein, both of which may be determined using immunological techniques. Finally, when the biomarker is a metabolite, such as the product whose formation is catalyzed by the glycolytic enzyme, the expression level of the metabolite is its concentration in the biomarker sample, such as its cellular concentration. Metabolite levels may be determined using chromatographic means or other means well known in the art. The reference to the glycolytic pathway in the examples above is meant to be illustrative and non-limiting, and the same principles may apply to any other pathway or biomarker set.
In one embodiment, experimental groups comprise organisms, such as mammals, or more preferably humans. In such embodiments, the biomarker sample comprises a sample of cells from the organism, or a sample of bodily fluid, such as serum, saliva, tears, sweat or semen. The difference between the first and second experimental groups may be a disease state. For example, the first experimental group may be afflicted with a disease or disorder, while the second group is not. In a specific embodiment, the disorder is characterized by defective glucose metabolism, such as type II diabetes. In another embodiment where the experimental groups comprise organisms, the first and second experimental groups may differ by any measurable characteristic. For example, the groups may differ by a physical characteristic, such as weight, age, sex, sexual preference, eyesight, percent body fat, percent lean muscle mass, height, right vs. left handedness or race. The groups may also differ by a psychological characteristic, such as intelligence, verbal skills, emotional intelligence and even personality types, such as those determined by the Myers-Briggs Type Indicator. The groups may also differ by emotional state, such as relaxed vs. emotionally stressed subjects, or cheerful vs. gloomy subjects. The subjects may also differ by the presence or absence of one or more mutations, such as subjects having mutations in an oncogene. In another embodiment, the two experimental groups differ in that one group has been treated with at least one agent, such as a drug.
In another embodiment, experimental groups comprise cells. The cells may comprise primary cells, cell lines, or come in the form of tissue samples. As described above for organisms, the cells in the two experimental groups may differ by a physical characteristic or differ genetically. In a preferred embodiment, the two experimental groups differ in that the cells in one of the experimental groups have been treated with an agent, such as with a compound or drug. In such embodiments, the methods described herein may be used to detect subtle changes that the agent may have on the biomarker set, such as a biochemical or signaling pathway.
X. Nucleic Acid and Polypeptide Agents
In some embodiments of the methods described herein, an agent which reduces the expression of Errα, Gabpa, Gabpb, or any other gene, or an agent used in any of the methods of screening agents described herein, comprises a double stranded RNAi molecule, a ribozyme, or an antisense nucleic acid directed at said gene.
Certain embodiments of the invention make use of materials and methods for effecting knockdown of one form of a gene, by means of RNA interference (RNAi). RNAi is a process of sequence-specific post-transcriptional gene repression which can occur in eukaryotic cells. In general, this process involves degradation of an mRNA of a particular sequence induced by double-stranded RNA (dsRNA) that is homologous to that sequence. For example, the expression of a long dsRNA corresponding to the sequence of a particular single-stranded mRNA (ss mRNA) will labilize that message, thereby “interfering” with expression of the corresponding gene. Accordingly, any selected gene may be repressed by introducing a dsRNA which corresponds to all or a substantial part of the mRNA for that gene. It appears that when a long dsRNA is expressed, it is initially processed by a ribonuclease III into shorter dsRNA oligonucleotides of in some instances as few as 21 to 22 base pairs in length. Furthermore, RNAi may be effected by introduction or expression of relatively short homologous dsRNAs. Indeed the use of relatively short homologous dsRNAs may have certain advantages as discussed below.
Mammalian cells have at least two pathways that are affected by double-stranded RNA (dsRNA). In the RNAi (sequence-specific) pathway, the initiating dsRNA is first broken into short interfering (si) RNAs, as described above. The siRNAs have sense and antisense strands of about 21 nucleotides that form approximately 19-nucleotide siRNAs with overhangs of two nucleotides at each 3′ end. Short interfering RNAs are thought to provide the sequence information that allows a specific messenger RNA to be targeted for degradation. In contrast, the nonspecific pathway is triggered by dsRNA of any sequence, as long as it is at least about 30 base pairs in length. The nonspecific effects occur because dsRNA activates two enzymes: PKR, which in its active form phosphorylates the translation initiation factor eIF2 to shut down all protein synthesis, and 2′,5′ oligoadenylate synthetase (2′,5′-AS), which synthesizes a molecule that activates RNAse L, a nonspecific enzyme that targets all mRNAs. The nonspecific pathway may represent a host response to stress or viral infection, and, in general, the effects of the nonspecific pathway are preferably minimized under preferred methods of the present invention. Significantly, longer dsRNAs appear to be required to induce the nonspecific pathway and, accordingly, dsRNAs shorter than about 30 base pairs are preferred to effect gene repression by RNAi (see Hunter et al. (1975) J Biol Chem 250: 409-17; Manche et al. (1992) Mol Cell Biol 12: 5239-48; Minks et al. (1979) J Biol Chem 254: 10180-3; and Elbashir et al. (2001) Nature 411: 494-8).
RNAi has been shown to be effective in reducing or eliminating the expression of a gene in a number of different organisms including Caenorhabditis elegans (see e.g. Fire et al. (1998) Nature 391: 806-11), mouse eggs and embryos (Wianny et al. (2000) Nature Cell Biol 2: 70-5; Svoboda et al. (2000) Development 127: 4147-56), and cultured RAT-1 fibroblasts (Bahramian et al. (1999) Mol Cell Biol 19: 274-83), and appears to be an anciently evolved pathway available in eukaryotic plants and animals (Sharp (2001) Genes Dev. 15: 485-90). RNAi has proven to be an effective means of decreasing gene expression in a variety of cell types including HeLa cells, NIH/3T3 cells, COS cells, 293 cells and BHK-21 cells, and typically decreases expression of a gene to lower levels than that achieved using antisense techniques and, indeed, frequently eliminates expression entirely (see Bass (2001) Nature 411: 428-9). In mammalian cells, siRNAs are effective at concentrations that are several orders of magnitude below the concentrations typically used in antisense experiments (Elbashir et al. (2001) Nature 411: 494-8).
The double stranded oligonucleotides used to effect RNAi are preferably less than 30 base pairs in length and, more preferably, comprise about 25, 24, 23, 22, 21, 20, 19, 18 or 17 base pairs of ribonucleic acid. Optionally the dsRNA oligonucleotides of the invention may include 3′ overhang ends. Exemplary 2-nucleotide 3′ overhangs may be composed of ribonucleotide residues of any type and may even be composed of 2′-deoxythymidine residues, which lowers the cost of RNA synthesis and may enhance nuclease resistance of siRNAs in the cell culture medium and within transfected cells (see Elbashir et al. (2001) Nature 411: 494-8). Longer dsRNAs of 50, 75, 100 or even 500 base pairs or more may also be utilized in certain embodiments of the invention. Exemplary concentrations of dsRNAs for effecting RNAi are about 0.05 nM, 0.1 nM, 0.5 nM, 1.0 nM, 1.5 nM, 25 nM or 100 nM, although other concentrations may be utilized depending upon the nature of the cells treated, the gene target and other factors readily discernable to the skilled artisan. Exemplary dsRNAs may be synthesized chemically or produced in vitro or in vivo using appropriate expression vectors. Exemplary synthetic RNAs include 21 nucleotide RNAs chemically synthesized using methods known in the art (e.g. Expedite RNA phosphoramidites and thymidine phosphoramidite (Proligo, Germany)). Synthetic oligonucleotides are preferably deprotected and gel-purified using methods known in the art (see e.g. Elbashir et al. (2001) Genes Dev. 15: 188-200). Longer RNAs may be transcribed from promoters, such as T7 RNA polymerase promoters, known in the art. A single RNA target, placed in both possible orientations downstream of an in vitro promoter, will transcribe both strands of the target to create a dsRNA oligonucleotide of the desired target sequence. For example, if Errα is the target of the double stranded RNA, any of the above RNA species will be designed to include a portion of nucleic acid sequence of the Errα gene.
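By way of illustration only, the geometry of the short dsRNAs described above (a duplex of about 19 base pairs with 2-nucleotide 3′ overhangs, here 2′-deoxythymidine written as "TT") can be sketched in a few lines of Python. The function names and the example sequence are hypothetical and are not taken from any Errα or Gabp sequence; the sketch simply enumerates every window of a message and makes no attempt to rank candidates.

```python
# Minimal sketch: enumerate candidate siRNA duplexes along an mRNA.
# All names and the example sequence are hypothetical illustrations.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def sirna_candidates(mrna, duplex_len=19, overhang="TT"):
    """Return (start, sense, antisense) for every window of duplex_len bases.

    Each strand is the duplex core plus a 2-nucleotide 3' overhang, so with
    the defaults every strand is 21 nucleotides long.
    """
    mrna = mrna.upper().replace("T", "U")  # accept cDNA-style input
    candidates = []
    for start in range(len(mrna) - duplex_len + 1):
        core = mrna[start:start + duplex_len]
        sense = core + overhang
        antisense = reverse_complement(core) + overhang
        candidates.append((start, sense, antisense))
    return candidates


if __name__ == "__main__":
    example_mrna = "AUGGCCAAGUUCGGAUCCGAA"  # hypothetical 21-nt message
    for start, sense, antisense in sirna_candidates(example_mrna):
        print(start, sense, antisense)
```

In practice the candidate list would be filtered further, for example by the structure-based criteria discussed in this section.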
The specific sequence utilized in design of the oligonucleotides may be any contiguous sequence of nucleotides contained within the expressed gene message of the target. Programs and algorithms, known in the art, may be used to select appropriate target sequences. In addition, optimal sequences may be selected utilizing programs designed to predict the secondary structure of a specified single stranded nucleic acid sequence and allowing selection of those sequences likely to occur in exposed single stranded regions of a folded mRNA. Methods and compositions for designing appropriate oligonucleotides may be found, for example, in U.S. Pat. No. 6,251,588, the contents of which are incorporated herein by reference. Messenger RNA (mRNA) is generally thought of as a linear molecule which contains the information for directing protein synthesis within the sequence of ribonucleotides; however, studies have revealed a number of secondary and tertiary structures that exist in most mRNAs. Secondary structure elements in RNA are formed largely by Watson-Crick type interactions between different regions of the same RNA molecule. Important secondary structural elements include intramolecular double stranded regions, hairpin loops, bulges in duplex RNA and internal loops. Tertiary structural elements are formed when secondary structural elements come in contact with each other or with single stranded regions to produce a more complex three dimensional structure. A number of researchers have measured the binding energies of a large number of RNA duplex structures and have derived a set of rules which can be used to predict the secondary structure of RNA (see e.g. Jaeger et al. (1989) Proc. Natl. Acad. Sci. USA 86:7706; and Turner et al. (1988) Annu. Rev. Biophys. Biophys. Chem. 17:167).
The rules are useful in identification of RNA structural elements and, in particular, for identifying single stranded RNA regions which may represent preferred segments of the mRNA to target for silencing RNAi, ribozyme or antisense technologies. Accordingly, preferred segments of the mRNA target can be identified for design of the RNAi mediating dsRNA oligonucleotides as well as for design of appropriate ribozyme and hammerhead ribozyme compositions of the invention.
The dsRNA oligonucleotides may be introduced into the cell by transfection with a heterologous target gene using carrier compositions such as liposomes, which are known in the art, e.g. Lipofectamine 2000 (Life Technologies) as described by the manufacturer for adherent cell lines. Transfection of dsRNA oligonucleotides for targeting endogenous genes may be carried out using Oligofectamine (Life Technologies). Transfection efficiency may be checked using fluorescence microscopy for mammalian cell lines after co-transfection of hGFP-encoding pAD3 (Kehlenbach et al. (1998) J Cell Biol 141: 863-74). The effectiveness of the RNAi may be assessed by any of a number of assays following introduction of the dsRNAs. Further compositions, methods and applications of RNAi technology are provided in U.S. Pat. Nos. 6,278,039, 5,723,750 and 5,244,805, which are incorporated herein by reference.
Ribozyme molecules designed to catalytically cleave Errα or Gabpa mRNA transcripts can also be used to prevent translation of Errα or Gabpa (see, e.g., PCT International Publication WO90/11364, published Oct. 4, 1990; Sarver et al. (1990) Science 247:1222-1225 and U.S. Pat. No. 5,093,246). Ribozymes are enzymatic RNA molecules capable of catalyzing the specific cleavage of RNA. (For a review, see Rossi (1994) Current Biology 4: 469-471). The mechanism of ribozyme action involves sequence specific hybridization of the ribozyme molecule to complementary target RNA, followed by an endonucleolytic cleavage event. The composition of ribozyme molecules preferably includes one or more sequences complementary to the gene whose activity is to be reduced.
While ribozymes that cleave mRNA at site specific recognition sequences can be used to destroy target mRNAs, the use of hammerhead ribozymes is preferred. Hammerhead ribozymes cleave mRNAs at locations dictated by flanking regions that form complementary base pairs with the target mRNA. Preferably, the target mRNA has the following sequence of two bases: 5′-UG-3′. The construction and production of hammerhead ribozymes is well known in the art and is described more fully in Haseloff and Gerlach (1988) Nature 334:585-591 (see also PCT Appln. No. WO89/05852), the contents of which are incorporated herein by reference. Hammerhead ribozyme sequences can be embedded in a stable RNA such as a transfer RNA (tRNA) to increase cleavage efficiency in vivo (Perriman et al. (1995) Proc. Natl. Acad. Sci. USA, 92: 6175-79; de Feyter and Gaudron, Methods in Molecular Biology, Vol. 74, Chapter 43, “Expressing Ribozymes in Plants”, Edited by Turner, P. C., Humana Press Inc., Totowa, N.J.). In particular, RNA polymerase III-mediated expression of tRNA fusion ribozymes is well known in the art (see Kawasaki et al. (1998) Nature 393: 284-9; Kuwabara et al. (1998) Nature Biotechnol. 16: 961-5; and Kuwabara et al. (1998) Mol. Cell. 2: 617-27; Koseki et al. (1999) J Virol 73: 1868-77; Kuwabara et al. (1999) Proc Natl Acad Sci USA 96: 1886-91; Tanabe et al. (2000) Nature 406: 473-4). There are typically a number of potential hammerhead ribozyme cleavage sites within a given target cDNA sequence. Preferably the ribozyme is engineered so that the cleavage recognition site is located near the 5′ end of the target mRNA to increase efficiency and minimize the intracellular accumulation of non-functional mRNA transcripts.
Furthermore, the use of any cleavage recognition site located in the target sequence encoding different portions of the C-terminal amino acid domains of, for example, long and short forms of target would allow the selective targeting of one or the other form of the target, and thus, have a selective effect on one form of the target gene product.
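As a non-limiting illustration, the search for candidate cleavage sites described above may be sketched as a simple scan for the two-base sequence 5′-UG-3′ stated in the text, with sites reported in order of their distance from the 5′ end. The function name and the short example message are hypothetical, and positions are 0-based offsets from the 5′ end.

```python
# Minimal sketch: list candidate hammerhead recognition sites in a target.
# The function name and example sequence are hypothetical illustrations.

def hammerhead_sites(mrna, motif="UG"):
    """Return 0-based positions of every occurrence of motif, 5' end first."""
    mrna = mrna.upper().replace("T", "U")  # accept cDNA-style input
    sites = []
    position = mrna.find(motif)
    while position != -1:
        sites.append(position)
        # Resume one base later so overlapping occurrences are not skipped.
        position = mrna.find(motif, position + 1)
    return sites


if __name__ == "__main__":
    print(hammerhead_sites("AUGGCUGAUG"))
```

Because the list is ordered from the 5′ end, its first entries correspond to the sites the text describes as preferred.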
In addition, ribozymes possess highly specific endoribonuclease activity, which autocatalytically cleaves the target sense mRNA. The present invention extends to ribozymes which hybridize to a sense mRNA encoding an Errα or Gabpa or any other gene of interest described herein and cleave it, such that it is no longer capable of being translated to synthesize a functional polypeptide product.
The ribozymes of the present invention also include RNA endoribonucleases (hereinafter “Cech-type ribozymes”) such as the one which occurs naturally in Tetrahymena thermophila (known as the IVS, or L-19 IVS RNA) and which has been extensively described by Thomas Cech and collaborators (Zaug, et al. (1984) Science 224:574-578; Zaug, et al. (1986) Science 231:470-475; Zaug, et al. (1986) Nature 324:429-433; published International patent application No. WO88/04300 by University Patents Inc.; Been, et al. (1986) Cell 47:207-216). The Cech-type ribozymes have an eight base pair active site which hybridizes to a target RNA sequence whereafter cleavage of the target RNA takes place. The invention encompasses those Cech-type ribozymes which target eight base-pair active site sequences that are present in a target gene or nucleic acid sequence.
Ribozymes can be composed of modified oligonucleotides (e.g., for improved stability, targeting, etc.) and should be delivered to cells which express the target gene in vivo. A preferred method of delivery involves using a DNA construct “encoding” the ribozyme under the control of a strong constitutive pol III or pol II promoter, so that transfected cells will produce sufficient quantities of the ribozyme to destroy endogenous target messages and inhibit translation. Because ribozymes, unlike antisense molecules, are catalytic, a lower intracellular concentration is required for efficiency.
In a long target RNA chain, significant numbers of target sites are not accessible to the ribozyme because they are hidden within secondary or tertiary structures (Birikh et al. (1997) Eur J Biochem 245: 1-16). To overcome the problem of target RNA accessibility, computer generated predictions of secondary structure are typically used to identify targets that are most likely to be single-stranded or have an “open” configuration (see Jaeger et al. (1989) Methods Enzymol 183: 281-306). Other approaches utilize a systematic approach to predicting secondary structure which involves assessing a huge number of candidate hybridizing oligonucleotide molecules (see Milner et al. (1997) Nat Biotechnol 15: 537-41; and Patzel and Sczakiel (1998) Nat Biotechnol 16: 64-8). Additionally, U.S. Pat. No. 6,251,588, the contents of which are hereby incorporated herein, describes methods for evaluating oligonucleotide probe sequences so as to predict the potential for hybridization to a target nucleic acid sequence. The method of the invention provides for the use of such methods to select preferred segments of a target mRNA sequence that are predicted to be single-stranded and, further, for the opportunistic utilization of the same or substantially identical target mRNA sequence, preferably comprising about 10-20 consecutive nucleotides of the target mRNA, in the design of both the RNAi oligonucleotides and ribozymes of the invention.
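For illustration, once a secondary structure prediction is available, the selection of segments predicted to be single-stranded may be sketched as follows. The sketch assumes, as a convention not stated in this description, that the prediction is supplied in dot-bracket notation, in which "." marks an unpaired base and "(" or ")" marks a paired base. The function name and the example structure are hypothetical.

```python
# Minimal sketch: find runs of unpaired bases in a predicted structure.
# Assumes dot-bracket notation; names and example are hypothetical.
import re


def open_segments(structure, min_len=10):
    """Return (start, end) pairs, end exclusive, for unpaired runs >= min_len."""
    segments = []
    for match in re.finditer(r"\.+", structure):
        if match.end() - match.start() >= min_len:
            segments.append((match.start(), match.end()))
    return segments


if __name__ == "__main__":
    # Hairpin, then a 12-base unpaired stretch, then a second small hairpin.
    predicted = "((((....))))" + "." * 12 + "((..))"
    print(open_segments(predicted))
```

A default minimum of 10 reflects the lower end of the roughly 10-20 consecutive nucleotides mentioned above; any segment returned could then be used as a common target region for both an RNAi oligonucleotide and a ribozyme.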
In other embodiments of methods described herein, an agent which modulates the activity of Errα, Gabpa, Gabpb, or any other gene, comprises an antibody or fragment thereof. An antibody may increase or decrease the activity of any of the subject polypeptides, and it may increase or decrease the binding of two proteins into a complex, such as an Errα/PGC-1α complex.
Chickens or mammals, such as a mouse, a hamster, a goat, a guinea pig or a rabbit, can be immunized with an immunogenic form of the Errα, Gabpa, Gabpb, or any polypeptide provided by the invention, or with peptide variants thereof (e.g., an antigenic fragment which is capable of eliciting an antibody response). Techniques for conferring immunogenicity on a protein or peptide include conjugation to carriers or other techniques well known in the art. For instance, a peptidyl portion of one of the subject proteins can be administered in the presence of adjuvant. The progress of immunization can be monitored by detection of antibody titers in plasma or serum. Standard ELISA or other immunoassays can be used with the immunogen as antigen to assess the levels of antibodies.
Following immunization, antisera can be obtained and, if desired, polyclonal antibodies against the target protein can be further isolated from the serum. To produce monoclonal antibodies, antibody producing cells (lymphocytes) can be harvested from an immunized animal and fused by standard somatic cell fusion procedures with immortalizing cells such as myeloma cells to yield hybridoma cells. Such techniques are well known in the art, and include, for example, the hybridoma technique (originally developed by Kohler and Milstein, Nature, 256: 495-497, 1975), as well as the human B cell hybridoma technique (Kozbor et al., Immunology Today, 4: 72, 1983), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc. pp. 77-96, 1985). Hybridoma cells can be screened immunochemically for production of antibodies specifically reactive to the peptide immunogen and the monoclonal antibodies isolated. Accordingly, another aspect of the invention provides hybridoma cell lines which produce the antibodies described herein. The antibodies can then be tested for their effects on the activity and expression of the protein to which they are directed.
The term antibody as used herein is intended to include fragments which are also specifically reactive with a protein described herein or a complex comprising such protein. Antibodies can be fragmented using conventional techniques and the fragments screened in the same manner as described above for whole antibodies. For example, F(ab′)2 fragments can be generated by treating antibody with pepsin. The resulting F(ab′)2 fragment can be treated to reduce disulfide bridges to produce Fab′ fragments. The antibody of the present invention is further intended to include bispecific and chimeric molecules, as well as single chain (scFv) antibodies.
The subject antibodies include trimeric antibodies and humanized antibodies, which can be prepared as described, e.g., in U.S. Pat. No. 5,585,089. Also within the scope of the invention are single chain antibodies. All of these modified forms of antibodies as well as fragments of antibodies are intended to be included in the term “antibody”.
In yet another embodiment of the methods described herein, the agent is a polypeptide, such as an Errα polypeptide or a Gabp polypeptide, or a fragment thereof which retains a biological activity or which antagonizes a biological activity of the wild-type polypeptide. For example, an Errα stimulatory agent comprises an active Errα protein, or a nucleic acid molecule encoding Errα that has been introduced into the cell. In another embodiment, the agent is a mutant polypeptide which inhibits Errα protein activity. Examples of such inhibitory agents include a nucleic acid molecule encoding a dominant negative Errα protein, such as a fragment of Errα which may compete with wildtype Errα protein for DNA binding or complex formation with PGC-1.
XI. Therapeutics
In one aspect, the invention provides methods of treating disorders in a subject comprising the administration of an agent or of a composition comprising an agent, such as a therapeutic agent. “Therapeutic agent” or “therapeutic” refers to an agent capable of having a desired biological effect on a host. Chemotherapeutic and genotoxic agents are examples of therapeutic agents that are generally known to be chemical in origin, as opposed to biological, or cause a therapeutic effect by a particular mechanism of action, respectively. Examples of therapeutic agents of biological origin include growth factors, hormones, and cytokines. A variety of therapeutic agents are known in the art and may be identified by their effects. Certain therapeutic agents are capable of regulating cell proliferation and differentiation. Examples include chemotherapeutic nucleotides, drugs, hormones, non-specific (non-antibody) proteins, oligonucleotides (e.g., antisense oligonucleotides that bind to a target nucleic acid sequence (e.g., mRNA sequence)), peptides, and peptidomimetics.
In one embodiment, the compositions are pharmaceutical compositions. Pharmaceutical compositions for use in accordance with the present invention may be formulated in conventional manner using one or more physiologically acceptable carriers or excipients. Thus, the compounds and their physiologically acceptable salts and solvates may be formulated for administration by, for example, an aerosol, intravenous, oral or topical route. The administration may comprise intralesional, intraperitoneal, subcutaneous, intramuscular or intravenous injection; infusion; liposome-mediated delivery; topical, intrathecal, gingival pocket, per rectum, intrabronchial, nasal, transmucosal, intestinal, oral, ocular or otic delivery.
An exemplary composition of the invention comprises a compound capable of modulating the expression or activity of a transcriptional regulator, such as a PGC-1, Gabp or Errα polypeptide, with a delivery system, such as a liposome system, and optionally including an acceptable excipient. In a preferred embodiment, the composition is formulated for injection.
Techniques and formulations generally may be found in Remington's Pharmaceutical Sciences, Mack Publishing Co., Easton, Pa. For systemic administration, injection is preferred, including intramuscular, intravenous, intraperitoneal, and subcutaneous. For injection, the compounds of the invention can be formulated in liquid solutions, preferably in physiologically compatible buffers such as Hank's solution or Ringer's solution. In addition, the compounds may be formulated in solid form and redissolved or suspended immediately prior to use. Lyophilized forms are also included.
For oral administration, the pharmaceutical compositions may take the form of, for example, tablets or capsules prepared by conventional means with pharmaceutically acceptable excipients such as binding agents (e.g., pregelatinised maize starch, polyvinylpyrrolidone or hydroxypropyl methylcellulose); fillers (e.g., lactose, microcrystalline cellulose or calcium hydrogen phosphate); lubricants (e.g., magnesium stearate, talc or silica); disintegrants (e.g., potato starch or sodium starch glycolate); or wetting agents (e.g., sodium lauryl sulphate). The tablets may be coated by methods well known in the art. Liquid preparations for oral administration may take the form of, for example, solutions, syrups or suspensions, or they may be presented as a dry product for constitution with water or other suitable vehicle before use. Such liquid preparations may be prepared by conventional means with pharmaceutically acceptable additives such as suspending agents (e.g., sorbitol syrup, cellulose derivatives or hydrogenated edible fats); emulsifying agents (e.g., lecithin or acacia); non-aqueous vehicles (e.g., almond oil, oily esters, ethyl alcohol or fractionated vegetable oils); and preservatives (e.g., methyl or propyl-p-hydroxybenzoates or sorbic acid). The preparations may also contain buffer salts, flavoring, coloring and sweetening agents as appropriate.
Preparations for oral administration may be suitably formulated to give controlled release of the active compound. For buccal administration the compositions may take the form of tablets or lozenges formulated in conventional manner. For administration by inhalation, the compounds for use according to the present invention are conveniently delivered in the form of an aerosol spray presentation from pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by providing a valve to deliver a metered amount. Capsules and cartridges of e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder mix of the compound and a suitable powder base such as lactose or starch.
The compounds may be formulated for parenteral administration by injection, e.g., by bolus injection or continuous infusion. Formulations for injection may be presented in unit dosage form, e.g., in ampoules or in multi-dose containers, with an added preservative. The compositions may take such forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing and/or dispersing agents. Alternatively, the active ingredient may be in powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before use.
The compounds may also be formulated in rectal compositions such as suppositories or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other glycerides.
In addition to the formulations described previously, the compounds may also be formulated as a depot preparation. Such long acting formulations may be administered by implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.
Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration bile salts and fusidic acid derivatives. In addition, detergents may be used to facilitate permeation. Transmucosal administration may be through nasal sprays or using suppositories. For topical administration, the oligomers of the invention are formulated into ointments, salves, gels, or creams as generally known in the art. A wash solution can be used locally to treat an injury or inflammation to accelerate healing.
The compositions may, if desired, be presented in a pack or dispenser device which may contain one or more unit dosage forms containing the active ingredient. The pack may for example comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may be accompanied by instructions for administration.
For therapies involving the administration of nucleic acids, the oligomers of the invention can be formulated for a variety of modes of administration, including systemic and topical or localized administration. Techniques and formulations generally may be found in Remington's Pharmaceutical Sciences, Mack Publishing Co., Easton, Pa. For systemic administration, injection is preferred, including intramuscular, intravenous, intraperitoneal, intranodal, and subcutaneous. For injection, the oligomers of the invention can be formulated in liquid solutions, preferably in physiologically compatible buffers such as Hank's solution or Ringer's solution. In addition, the oligomers may be formulated in solid form and redissolved or suspended immediately prior to use. Lyophilized forms are also included.
Systemic administration can also be by transmucosal or transdermal means, or the compounds can be administered orally. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration bile salts and fusidic acid derivatives. In addition, detergents may be used to facilitate permeation. Transmucosal administration may be through nasal sprays or using suppositories. For oral administration, the oligomers are formulated into conventional oral administration forms such as capsules, tablets, and tonics. For topical administration, oligomers may be formulated into ointments, salves, gels, or creams as generally known in the art.
Toxicity and therapeutic efficacy of the agents and compositions of the present invention can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit large therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to unaffected cells and, thereby, reduce side effects.
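The therapeutic index defined above is a simple ratio, which the following sketch computes. The function name and the numerical values in the usage example are hypothetical and do not represent measured data for any agent of the invention; both doses must be expressed in the same units.

```python
# Minimal sketch: therapeutic index as the ratio LD50/ED50.
# Function name and example values are hypothetical illustrations.

def therapeutic_index(ld50, ed50):
    """Return LD50 / ED50; both doses must share the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical compound: LD50 of 500 mg/kg and ED50 of 25 mg/kg.
    print(therapeutic_index(500.0, 25.0))
```

A larger value indicates a wider margin between the effective and the lethal dose, consistent with the stated preference for compounds with large therapeutic indices.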
The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
In one embodiment of the methods described herein, the effective amount of the agent is between about 1 mg and about 50 mg per kg body weight of the subject. In one embodiment, the effective amount of the agent is between about 2 mg and about 40 mg per kg body weight of the subject. In one embodiment, the effective amount of the agent is between about 3 mg and about 30 mg per kg body weight of the subject. In one embodiment, the effective amount of the agent is between about 4 mg and about 20 mg per kg body weight of the subject. In one embodiment, the effective amount of the agent is between about 5 mg and about 10 mg per kg body weight of the subject.
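For illustration, a weight-based range such as those recited above can be converted into an absolute dose range for a given subject as sketched below. The function name and the 70 kg body weight are hypothetical and are used only to show the arithmetic.

```python
# Minimal sketch: convert a mg/kg range into absolute doses for one subject.
# Function name and the example body weight are hypothetical illustrations.

def dose_range_mg(weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Return (lowest, highest) absolute dose in mg for a body weight in kg."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("lower bound exceeds upper bound")
    return (weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg)


if __name__ == "__main__":
    # Broadest and narrowest ranges recited above, for a 70 kg subject.
    print(dose_range_mg(70, 1, 50))
    print(dose_range_mg(70, 5, 10))
```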
In one embodiment of the methods described herein, the agent is administered at least once per day. In one embodiment, the agent is administered daily. In one embodiment, the agent is administered every other day. In one embodiment, the agent is administered every 6 to 8 days. In one embodiment, the agent is administered weekly.
As for the amount of the compound and/or agent for administration to the subject, one skilled in the art would know how to determine the appropriate amount. As used herein, a dose or amount would be one in sufficient quantities to either inhibit the disorder, treat the disorder, treat the subject or prevent the subject from becoming afflicted with the disorder. This amount may be considered an effective amount. A person of ordinary skill in the art can perform simple titration experiments to determine what amount is required to treat the subject. The dose of the composition of the invention will vary depending on the subject and upon the particular route of administration used. In one embodiment, the dosage can range from about 0.1 to about 100,000 ug/kg body weight of the subject. Based upon the composition, the dose can be delivered continuously, such as by continuous pump, or at periodic intervals, for example, on one or more separate occasions. Desired time intervals of multiple doses of a particular composition can be determined without undue experimentation by one skilled in the art.
The effective amount may be based upon, among other things, the size of the compound, the biodegradability of the compound, the bioactivity of the compound and the bioavailability of the compound. If the compound does not degrade quickly, is bioavailable and highly active, a smaller amount will be required to be effective. The effective amount will be known to one of skill in the art; it will also be dependent upon the form of the compound, the size of the compound and the bioactivity of the compound. One of skill in the art could routinely perform empirical activity tests for a compound to determine the bioactivity in bioassays and thus determine the effective amount. In one embodiment of the above methods, the effective amount of the compound comprises from about 1.0 ng/kg to about 100 mg/kg body weight of the subject. In another embodiment of the above methods, the effective amount of the compound comprises from about 100 ng/kg to about 50 mg/kg body weight of the subject. In another embodiment of the above methods, the effective amount of the compound comprises from about 1 ug/kg to about 10 mg/kg body weight of the subject. In another embodiment of the above methods, the effective amount of the compound comprises from about 100 ug/kg to about 1 mg/kg body weight of the subject.
As for when the compound, compositions and/or agent is to be administered, one skilled in the art can determine when to administer such compound and/or agent. The administration may be constant for a certain period of time or periodic and at specific intervals. The compound may be delivered hourly, daily, weekly, monthly, yearly (e.g. in a time release form) or as a one time delivery. The delivery may be continuous delivery for a period of time, e.g. intravenous delivery. In one embodiment of the methods described herein, the agent is administered at least once per day. In one embodiment of the methods described herein, the agent is administered daily. In one embodiment of the methods described herein, the agent is administered every other day. In one embodiment of the methods described herein, the agent is administered every 6 to 8 days. In one embodiment of the methods described herein, the agent is administered weekly.
The invention now being generally described, it will be more readily understood by reference to the following examples, which are included merely for purposes of illustration of certain aspects and embodiments of the present invention, and are not intended to limit the invention, as one skilled in the art would recognize from the teachings hereinabove and the following examples, that other DNA microarrays, cell types, agents, constructs, or data analysis methods, all without limitation, can be employed, without departing from the scope of the invention as claimed.
The contents of any patents, patent applications, patent publications, or scientific articles referenced anywhere in this application are herein incorporated in their entirety.
The practice of the present invention will employ, where appropriate and unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, virology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are described in the literature. See, for example, Molecular Cloning: A Laboratory Manual, 3rd Ed., ed. by Sambrook and Russell (Cold Spring Harbor Laboratory Press: 2001); the treatise, Methods In Enzymology (Academic Press, Inc., N.Y.); Using Antibodies, Second Edition by Harlow and Lane, Cold Spring Harbor Press, New York, 1999; Current Protocols in Cell Biology, ed. by Bonifacino, Dasso, Lippincott-Schwartz, Harford, and Yamada, John Wiley and Sons, Inc., New York, 1999; and PCR Protocols, ed. by Bartlett et al., Humana Press, 2003.
The tables for all the Experimental genes are listed at the end of the third experimental series.
First Experimental Series
Described herein are results of RNA expression profiling of 43 individuals with varying levels of insulin resistance, carried out to systematically identify pathways and processes operative in diabetes. The 43 individuals were: 17 with normal glucose tolerance (NGT), 8 with impaired glucose tolerance (IGT), and 18 with type 2 diabetes (DM2). No single gene showed statistically significant expression differences between the diagnostic classes. Therefore, Applicants developed a new analytical technique, called Gene Set Enrichment Analysis (GSEA), that seeks to determine whether members of gene sets (e.g., pathways) are consistently different, even if only modestly, in one diagnostic class versus another. Application of GSEA to the microarray data demonstrated that the oxidative phosphorylation pathway (OXPHOS) was significantly different. Of the approximately 106 members in this pathway, 94 are diminished in DM2 versus NGT. The effect is subtle, with each gene showing only a 15-20% decrease.
Also described herein are results of work carried out to define mechanisms underlying this coordinated decrease in expression of OXPHOS genes. Analysis of the expression of these OXPHOS genes in a public atlas of mouse gene expression showed that ⅔ of all OXPHOS genes are tightly co-regulated across all 47 tissues examined, and that they are highly expressed at the major sites of insulin mediated glucose uptake (brown fat, heart, and skeletal muscle). This group of genes is referred to herein as “OXPHOS-CR,” for “OXPHOS Co-Regulated.” Applicants hypothesized that the transcriptional co-activator PPARGC1 (also known as PGC-1α) was responsible for this transcriptional co-regulation. To test this, Applicants infected mouse muscle cell lines with an adenovirus expressing PPARGC1 and demonstrated that the OXPHOS-CR genes are specifically induced in a time-dependent manner over a three day period. As described in detail below, GSEA was re-applied to the diabetes data, this time testing whether OXPHOS-CR is specifically differentially expressed between the patient classes. Results showed that OXPHOS-CR accounts for the bulk of the signal detected in the comparison between NGT and DM2 and, moreover, appears to be very different between NGT and IGT as well, suggesting that derangement of this group of genes is an early event. Previous studies have suggested that total body aerobic capacity (VO2max) is predictive of future insulin resistance and diabetes. Interestingly, Applicants found a striking relationship between the mean expression of the OXPHOS-CR genes and total body oxygen consumption.
The following experimental procedures were followed in the first experimental series:
Methods
Human Subjects and Clinical Measurements. Applicants selected 54 men of similar age but with varying degree of glucose tolerance who had been participating in The Malmö Prevention Study in southern Sweden for more than 12 years (Eriksson et al. Diabetologia 33, 526-31. (1990)). The investigation was approved by the Ethics Committee at Lund University, and informed consent was obtained from each of the volunteers. All subjects were Northern Europeans, and their glucose tolerance status was assessed using standardized 75-gram OGTT and by applying WHO85 criteria (Eriksson et al. Diabetologia 33, 526-31. (1990)). At the initial OGTT performed 10 years earlier, none of the men had DM2 (Eriksson et al. Diabetologia 33, 526-31. (1990)). An OGTT performed at the time of the biopsy showed that 20 of the subjects had developed manifest type 2 diabetes (DM2), 8 fulfilled the criteria for IGT and 26 had normal glucose tolerance (NGT). As diabetes was diagnosed at the time of the repeat OGTT, none of the subjects were on medication for hyperglycemia or diabetes-related conditions.
Anthropometric and insulin sensitivity measures were performed as previously described (Groop, L. et al. Diabetes 45, 1585-93. (1996)). Height, weight, waist to hip ratio (WHR) and fat free mass were measured on the day of the euglycemic clamp. Maximal oxygen uptake (VO2max) was measured using an incremental work-conducted upright exercise test with a bicycle ergometer (Monark Varberg, Sweden) combined with continuous analysis of expiratory gases and minute ventilation. Exercise was started at a workload varying between 30-100 W depending on the previous history of endurance training or exercise habits and then increased by 20-50 W every 3 min, until a perceived exhaustion or a respiratory quotient of 1.0 was reached. Maximal aerobic capacity was defined as the VO2 during the last 30 s of exercise and is expressed per lean body mass. Insulin sensitivity was determined with a standard 2 hour-euglycemic hyperinsulinemic clamp combined with infusion of tritiated glucose to estimate endogenous glucose production and indirect calorimetry (Deltatrac, Datex Instrumentarium, Finland) to estimate substrate oxidation (Groop, L. et al. Diabetes 45, 1585-93. (1996)). The rate of glucose uptake (also referred to as the M-value) was calculated from the infusion rate of glucose and the residual rate of endogenous glucose production measured by the tritiated glucose tracer during the clamp.
Percutaneous muscle biopsies (20-50 mg) were taken from the vastus lateralis muscle under local anesthesia (1% lidocaine) after the 2-h euglycemic hyperinsulinemic clamp using a Bergström needle (Eriksson et al. Diabetes 43, 805-8. (1994)). Fiber-type composition and glycogen concentration were determined as previously described (Schalin et al. Eur J Clin Invest 25, 693-8. (1995)). Quantification and calculation of the fibers was performed using the COMFAS image analysis system (Scan Beam, Hadsun, Denmark).
Cell Culture and Adenoviral Infection. Mouse myoblasts (C2C12 cells) were cultured and differentiated into myotubes as previously described (Wu, Z. et al. Cell 98, 115-24. (1999)). After 3 days of differentiation, they were infected with an adenovirus containing either green fluorescent protein (GFP) or PGC-1α as previously described (Lin, J. et al. Nature 418, 797-801. (2002)).
mRNA Isolation, Target Preparation, and Hybridization. Targets were prepared from human biopsy or mouse cell lines as previously described (Golub, T. R. et al. Science 286, 531-7. (1999)) and hybridized to the Affymetrix HG-U133A or MG-U74Av2 chip, respectively. Only scans with 10% Present calls and a GAPDH 3′/GAPDH 5′ expression ratio<1.33 were selected. Applicants obtained gene expression data for 54 human samples, but only 43 met these selection criteria; the analysis in this paper is limited to these 43 individuals.
Data Scaling and Filtering. Human microarray data were subjected to global scaling to correct for intensity related biases. For each scan applicants binned all genes according to their expression intensity in a designated reference scan, and recorded the median intensity of that bin to serve as a calibration curve for that scan. Applicants then scaled the expression to the calibration curve of one NGT scan (patient mm12) which applicants visually inspected and deemed high quality using a linear interpolation between the calibration points. Applicants then filtered the 22,283 genes on the HG-U133A chip to eliminate genes that had extremely low expression. A previous study suggested that an Affymetrix average difference level of 100 corresponds to an extremely low level (“not expressed”) (Su, A. I. et al. Proc Natl Acad Sci USA 99, 4465-70. (2002)). Therefore, applicants only considered genes for which there was at least a single measure (average difference) greater than 100. Of the 22,283 genes on the HG-U133A chip, 10,983 genes met this filtering criterion.
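The filtering step described above can be sketched as follows. This is a minimal illustration, assuming the expression data are held as a mapping from gene (probe set) identifier to a list of average difference values, one per scan; the function name, the data layout, and the toy values are hypothetical.

```python
from typing import Dict, List


def filter_expressed(expr: Dict[str, List[float]],
                     threshold: float = 100.0) -> Dict[str, List[float]]:
    """Keep genes with at least one measurement strictly greater than the threshold."""
    return {
        gene: values
        for gene, values in expr.items()
        if any(value > threshold for value in values)
    }


# Toy data (hypothetical): only "probe_b" has a measurement above 100.
toy_expression = {
    "probe_a": [50.0, 80.0, 99.0],
    "probe_b": [90.0, 150.0, 70.0],
    "probe_c": [100.0, 100.0, 100.0],
}
kept = filter_expressed(toy_expression)
print(sorted(kept))
```

The strict inequality follows the phrase "greater than 100" in the text; a gene whose highest value is exactly 100 is dropped.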
Single Gene Microarray Analysis. Microarray analysis to identify individual genes that are significantly different between diagnostic classes was performed using two software packages. First, marker analysis was performed as previously described using GeneCluster. Significance of individual genes was tested by permutation of class labels (5000 iterations), as previously described (Golub, T. R. et al. Science 286, 531-7. (1999)). Applicants used both the t-test and signal to noise difference metrics in these analyses, both yielding comparable results. Second, applicants used the software package SAM, using Δ=0.5, to search for gene expression values significantly different between classes (Tusher et al. Proc Natl Acad Sci USA 98, 5116-21. (2001)).
Compilation of Gene Sets. Applicants analyzed 149 gene sets consisting of manually curated pathways and clusters defined by public expression compendia. First, applicants used two different sets of metabolic pathway annotations. Applicants manually curated genes belonging to the following pathways: free fatty acid metabolism, gluconeogenesis, glycolysis, glycogen metabolism, insulin signaling, ketogenesis, pyruvate metabolism, reactive oxygen species (ROS) homeostasis, Krebs cycle, oxidative phosphorylation (OXPHOS), and mitochondria, using standard textbooks, literature reviews, and LocusLink. Applicants also downloaded NetAFFX (Liu, G. et al. Nucleic Acids Res 31, 82-6. (2003)) annotations (October 2002) corresponding to GenMAPP metabolic pathways. To identify sets of co-regulated genes, applicants used self-organizing maps to group the GNF mouse expression atlas into 36 clusters (Su, A. I. et al. Proc Natl Acad Sci USA 99, 4465-70. (2002), Tamayo et al. Proc Natl Acad Sci USA 96, 2907-12. (1999)). Genes in these 36 groups were converted to Affymetrix HG-U133A probe sets using the ortholog tables available at the NetAFFX website (October 2002).
Rationale for Grouped Gene Analysis. Consider a microarray dataset with the samples in two categories, A, B. For the sake of simplicity, let the size of A and B each be n. Consider a gene set S for which the expression levels differ between samples of A and B. Model the dataset so that the entry Dij for gene i and sample j is normally distributed with mean μij and standard deviation σ, where
Then the signal to noise for an individual gene in S is proportionate to
Suppose on the other hand applicants know S and add the expression levels for all genes in S. Then the signal to noise is proportionate to
where M is the number of genes in S. This increases the mean of our statistic (which is standard normal for the null hypothesis of no gene set association) by a factor of √M. If the noise is in fact correlated for genes of S, this reduces the benefit, but applicants can still expect a large gain. In practice applicants will not be able to select a gene set containing fully concordant expression levels, but as long as an appreciable fraction of our gene set exhibits this property, applicants can expect a benefit from the grouped gene approach.
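As an illustration of the gain from grouping, the following worked sketch uses an assumed mean structure: a shift μ for genes of S in samples of A and zero elsewhere, with independent noise. The symbols μ, z_gene and z_set and the zero baseline are assumptions introduced here for illustration; other mean structures with the same separation give the same ratio.

```latex
% Assumed model (illustrative only):
%   \mu_{ij} = \mu  for  i \in S,\ j \in A;   \mu_{ij} = 0  otherwise,
% with independent noise of standard deviation \sigma and n samples per class.
\begin{align*}
\bar{D}_{iA} - \bar{D}_{iB}
  &\sim \mathcal{N}\!\left(\mu,\ \tfrac{2\sigma^{2}}{n}\right)
  &&\Rightarrow\quad
  z_{\mathrm{gene}} = \frac{\mu}{\sigma}\sqrt{\frac{n}{2}} \\[4pt]
\sum_{i \in S}\left(\bar{D}_{iA} - \bar{D}_{iB}\right)
  &\sim \mathcal{N}\!\left(M\mu,\ \tfrac{2M\sigma^{2}}{n}\right)
  &&\Rightarrow\quad
  z_{\mathrm{set}} = \sqrt{M}\,\frac{\mu}{\sigma}\sqrt{\frac{n}{2}}
\end{align*}
```

Under these assumptions the ratio of the two standardized means is √M, so a set of 100 concordant genes would raise the expected statistic tenfold relative to any single member.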
Gene Set Enrichment Analysis (GSEA). GSEA determines if the members of a given gene set are enriched amongst the most differentially expressed genes between two classes. First, the genes are rank ordered on the basis of a difference metric. The results presented in the current experimental series use the signal to noise (SNR) difference metric, which is simply the difference in means of the two classes divided by the sum of the standard deviations of the two diagnostic classes. In general other difference metrics can also be used.
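The signal to noise metric and the resulting rank ordering can be sketched as below. This is a minimal illustration: the use of the sample (n−1) standard deviation, the function names, and the toy values are assumptions, since the text does not specify them.

```python
from statistics import mean, stdev


def signal_to_noise(class_a, class_b):
    """Difference of class means divided by the sum of class standard deviations."""
    denominator = stdev(class_a) + stdev(class_b)  # sample standard deviation (assumed)
    if denominator == 0:
        raise ValueError("both classes have zero variance")
    return (mean(class_a) - mean(class_b)) / denominator


def rank_genes(expr_a, expr_b):
    """Order genes from most elevated in class A to most elevated in class B."""
    scores = {gene: signal_to_noise(expr_a[gene], expr_b[gene]) for gene in expr_a}
    return sorted(scores, key=scores.get, reverse=True)


# Toy expression values (hypothetical), three samples per class.
toy_a = {"g1": [2.0, 4.0, 6.0], "g2": [1.0, 2.0, 3.0]}
toy_b = {"g1": [1.0, 2.0, 3.0], "g2": [4.0, 6.0, 8.0]}
print(rank_genes(toy_a, toy_b))
```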
For each gene set, applicants then make an enrichment measure, called the enrichment score (ES), which is a normalized Kolmogorov-Smirnov statistic. Consider the genes R1, . . . , RN that are rank ordered on the basis of the difference metric between the two classes, and a gene set S containing G members. Applicants define
$X_i = -\sqrt{G/(N-G)}$
if Ri is not a member of S, and
$X_i = \sqrt{(N-G)/G}$
if Ri is a member of S. Applicants then compute a running sum across all N genes. The enrichment score (ES) is defined as
$ES = \max_{1 \le j \le N} \sum_{i=1}^{j} X_i$
or the maximum observed positive deviation of the running sum. ES is measured for every gene set considered. To determine whether any of the given gene sets shows association with the class phenotype distinction, applicants permute the class labels 1000 times, each time recording the maximum ES over all gene sets. Note that in this regard, applicants are testing a single hypothesis. The null hypothesis is that no gene set is associated with the class distinction.
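The running-sum score and the label-permutation procedure can be sketched as follows. This is a minimal illustration, not the applicants' software. It assumes steps of +√((N−G)/G) for members of S and −√(G/(N−G)) for non-members, so that the running sum returns to zero at the end of the list. The function names, the `rank_genes` callback (which maps a list of class labels to a ranked gene list), and the counting of ties are assumptions.

```python
import math
import random


def enrichment_score(ranked_genes, gene_set):
    """Maximum of the running sum over a ranked gene list."""
    n = len(ranked_genes)
    members = set(gene_set) & set(ranked_genes)
    g = len(members)
    if g == 0 or g == n:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit = math.sqrt((n - g) / g)      # assumed step for a member of S
    miss = -math.sqrt(g / (n - g))    # assumed step for a non-member
    running, best = 0.0, float("-inf")
    for gene in ranked_genes:
        running += hit if gene in members else miss
        best = max(best, running)
    return best


def max_es(ranked_genes, gene_sets):
    """Largest enrichment score over all gene sets (name -> set of genes)."""
    return max(enrichment_score(ranked_genes, s) for s in gene_sets.values())


def mes_p_value(rank_genes, labels, gene_sets, n_perm=1000, seed=0):
    """Fraction of label permutations whose maximum ES reaches the observed one."""
    rng = random.Random(seed)
    observed = max_es(rank_genes(labels), gene_sets)
    shuffled = list(labels)
    reached = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Ties are counted, which is a conservative choice made for this sketch.
        if max_es(rank_genes(shuffled), gene_sets) >= observed:
            reached += 1
    return observed, reached / n_perm
```

Because only the maximum over all gene sets is compared in each permutation, a single hypothesis is tested regardless of how many gene sets are supplied.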
In this experimental series, after identifying OXPHOS-CR as a subset of co-regulated OXPHOS genes, applicants tested it (a single gene set) for association with clinical status using GSEA. Because OXPHOS-CR is not independent of the OXPHOS set interrogated in the initial analysis, this cannot be viewed as an independent hypothesis. For this reason, these P-values are explicitly marked as nominal P-values.
Gene set enrichment analysis (GSEA) has been implemented as a software tool for use with microarray data and will be presented in fuller detail, including a discussion of different varieties of multiple hypothesis testing and applications to other biomedical problems, in a companion paper (Subramanian et al., in preparation).
Evaluating OXPHOS Coregulation in Mouse Expression Datasets. Applicants used the NetAFFX to identify probe sets on the mouse expression chips corresponding to human OXPHOS probe sets. Applicants identified a total of 114 (106 of which passed our filtering criterion) probe-sets corresponding to the human oxidative phosphorylation genes. Using the October 2002 ortholog tables at NetAFFX, applicants were able to identify 61 mouse orthologs on the Affymetrix MG-U74Av2 chip. Of these 61 probe-sets, 52 were represented in the GNF mouse expression atlas (Su, A. I. et al. Proc Natl Acad Sci USA 99, 4465-70. (2002)). These expression data were normalized to a mean of 0 and a variance of 1. Data were hierarchically clustered and visualized using the Cluster and TreeView software packages (Eisen et al. Proc Natl Acad Sci USA 95, 14863-8. (1998)).
Applicants parsed these 52 genes into 32 co-regulated probe-sets and 20 probe-sets that are not co-regulated, based on the dendrogram in
Linear Regression Analysis. Applicants generated linear regression models using SAS (SAS Institute, USA). Clinical variables were used as dependent variables, and OXPHOS-CR gene expression levels or other clinical/biochemical measures were used as the independent (explanatory or predictor) variables. To compute the mean centroid of OXPHOS-CR, the expression levels of the 34 OXPHOS-CR genes were each normalized to a mean of 0 and a variance of 1 across all 43 patients. The OXPHOS-CR mean centroid vector is simply the mean of these 34 expression vectors. In some regression analyses, applicants introduced dummy variables to represent diabetes status. For the regressions performed, applicants have reported the adjusted squared correlation coefficient (R2adj), which corrects for the degrees of freedom.
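The mean centroid computation can be sketched as follows. This is a minimal illustration, assuming each gene is a row of expression values with one value per patient and that the population variance is the one set to 1; the function names and toy values are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale one gene's values to mean 0 and (population) variance 1."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        raise ValueError("gene has zero variance across patients")
    return [(value - center) / spread for value in values]


def mean_centroid(gene_rows):
    """Average the standardized gene vectors, giving one value per patient."""
    standardized = [standardize(row) for row in gene_rows]
    return [mean(column) for column in zip(*standardized)]


# Two toy genes measured in three patients (hypothetical values).
toy_rows = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
print(mean_centroid(toy_rows))
```

Standardizing each gene before averaging keeps highly expressed genes from dominating the centroid.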
DNA microarrays were used to profile expression of over 22,000 genes in skeletal muscle biopsies from 43 age-matched males (Table 1): 17 with Normal Glucose Tolerance (NGT), 8 with Impaired Glucose Tolerance (IGT), and 18 with Type 2 Diabetes Mellitus (DM2). Biopsies were obtained at the time of diagnosis (before treatment with hypoglycemic medication) and under the controlled conditions of a hyperinsulinemic euglycemic clamp (see Methods). When assessed with either of two different analytical techniques (Golub, T. R. et al. Science 286, 531-7. (1999), Tusher et al. Proc Natl Acad Sci USA 98, 5116-21. (2001)) that take into account the multiple comparisons implicit in microarray analysis, no single gene exhibited a significant difference in expression between the diagnostic categories. This result is consistent with smaller studies (Sreekumar et al. Diabetes 51, 1913-20. (2002), Yang et al. Diabetologia 45, 1584-93. (2002)) which failed to identify any individual gene whose expression difference was significant when corrected for the large number of hypotheses tested (Kropf et al. Biometrical J. 44, 789-800 (2002), Storey et al. J. R. Statist. Soc. B 64, 479-498 (2002)).
To test for sets of related genes that might be systematically altered in diabetic muscle, Applicants devised a simple approach called Gene Set Enrichment Analysis (GSEA), which is introduced here (see
For a given pairwise comparison (e.g., high in NGT vs DM2), all genes are ranked based on the difference in expression (using an appropriate metric such as signal to noise). The null hypothesis of GSEA is that the rank ordering of the genes in a given comparison is random with regard to the diagnostic categorization of the samples. The alternative hypothesis is that the rank ordering of the pathway members is associated with the specific diagnostic criteria used to categorize the patient groups.
The extent of association is then measured by a non-parametric, running sum statistic termed the enrichment score (ES), and the maximum ES (MES) over all gene sets in the actual patient data is recorded (
Applicants applied GSEA to the microarray data described above, using 149 gene sets that applicants compiled (Table 2). Of these gene sets, 113 are based on involvement in metabolic pathways (based on public or local curation (Liu, G. et al. Nucleic Acids Res 31, 82-6. (2003))) and 36 consist of gene clusters that exhibit co-regulation in a mouse expression atlas of 46 tissues (Su, A. I. et al. Proc Natl Acad Sci USA 99, 4465-70. (2002)) (see Methods). The gene sets were selected without regard to the results of the microarray data from our patients. The top gene set in GSEA analysis yielded a Maximal Enrichment Score (MES=346) that was significant at P=0.029 over the 1,000 permutations of the 149 pathways. That is, in only 29 of 1,000 permutations did the top pathway (of the 149) exceed the score achieved by the top pathway using the actual diagnostic labels.
The maximal ES score was obtained for an internally curated set consisting of genes involved in oxidative phosphorylation (applicants refer to this gene set as OXPHOS). Interestingly, the four gene sets with the next highest ES scores overlap with this OXPHOS gene set, and their enrichment is almost entirely explained by the overlap: a locally curated set of genes involved in mitochondrial function, a set of genes identified with the keyword ‘mitochondria,’ a cluster (referred to here as c20) of co-regulated genes derived from the comparison of publicly available mouse data, and a set of genes related to oxidative phosphorylation defined at the Affymetrix website (Liu, G. et al. Nucleic Acids Res 31, 82-6. (2003)).
Examination of the individual expression values for the 106 OXPHOS genes reveals the source of this signal (
One of the overlapping gene sets identified by GSEA is cluster c20, defined as a set of genes that are tightly co-regulated across many tissues (see Methods). The partial overlap of OXPHOS with the coregulated cluster led us to ask whether all OXPHOS genes are coordinately regulated, or just a subset. Applicants examined transcriptional co-regulation of mouse homologs of OXPHOS genes across a mouse tissue expression atlas (Su, A. I. et al. Proc Natl Acad Sci USA 99, 4465-70. (2002)). This revealed a previously unrecognized subset of the OXPHOS biochemical pathway, corresponding to about two-thirds of the OXPHOS genes, that exhibit strong correlation across mouse tissues (r=0.67) (
Applicants next asked whether the downregulation of OXPHOS observed in DM2 was a general property of all OXPHOS genes or was specific to OXPHOS-CR. Interestingly, the bulk of the statistical signal applicants observe in GSEA is accounted for by OXPHOS-CR (
The strong correlation in expression of the OXPHOS-CR genes and their coordinated downregulation in diabetic muscle led us to explore mechanisms that might mediate this tight control. Applicants reasoned that peroxisome proliferator-activated receptor γ coactivator 1 (PGC-1α), a cold-inducible regulator of mitochondrial biogenesis, thermogenesis, and skeletal muscle fiber type switching (Puigserver, P. et al. Cell 92, 829-39. (1998), Wu, Z. et al. Cell 98, 115-24. (1999), Lin, J. et al. Nature 418, 797-801. (2002)), was a prime candidate for mediating these effects. Consistent with this hypothesis, applicants observed that mean levels of PGC-1α transcript were similarly decreased (−20%) in the diabetic muscle, and noted that the promoters of several of the OXPHOS-CR genes have been reported to contain binding sites for nuclear respiratory factor 1, a transcription factor co-activated by PGC-1α (Scarpulla, R. C. Biochim Biophys Acta 1576, 1-14. (2002)).
To test directly whether OXPHOS-CR genes might be transcriptional targets of PGC-1α, applicants expressed PGC-1α in a mouse skeletal muscle cell line using an adenoviral expression vector (Lin, J. et al. Nature 418, 797-801. (2002)) and used DNA microarrays to profile expression of the OXPHOS genes over a 3 day period (see Methods). Applicants found that a subset of OXPHOS genes were strongly upregulated in a time-dependent manner in response to PGC-1α, and that this subset corresponds almost precisely to OXPHOS-CR (
Metabolic control theory suggests that small increases in many sequential steps of a metabolic pathway can lead to a dramatic change in the total flux through the pathway, whereas large changes in a single enzyme might have no measurable effects (Brown et al. Biochem J 284, 1-13. (1992)). To test the hypothesis that subtle differences in OXPHOS-CR gene expression in diabetic patients might be related to changes in total body metabolism, applicants examined the relationships between diabetes status, expression of OXPHOS-CR genes, and VO2max as measured in our patients (
It is important to note that these results do not seem secondary to other known predictors of oxidative capacity. Applicants found no relationship between BMI or WHR and OXPHOS-CR gene expression (R2adj<0.01 in both cases). In addition, there was no significant relationship between quantitative measures of fiber types and OXPHOS-CR expression. Thus, a subtle decrease in expression of OXPHOS-CR genes in muscle appears to be associated with changes in total body aerobic capacity, even beyond its correlation with diabetes status, body habitus, or muscle fiber type.
Second Experimental Series
The following experimental procedures were followed in the second experimental series:
Organelle Purification and Sample Preparation. 6-8 week old male mice were subjected to an 8 hour fast and then euthanized. Brain, heart, kidney, and livers were harvested immediately and placed in ice cold saline. Mitochondria were isolated using differential centrifugation as previously described and purified with a Percoll gradient (Mootha et al. (2003). Proc Natl Acad Sci USA 100, 605-10). The proteins were then solubilized, size separated, and digested as previously described (Mootha et al. (2003). Proc Natl Acad Sci USA 100, 605-10).
Tandem Mass Spectrometry. Liquid chromatography tandem mass spectrometry (LC-MS/MS) was performed on QSTAR pulsar quadrupole time of flight mass spectrometers (AB/MDS Sciex, Toronto) as described previously (Mootha et al. (2003). Proc Natl Acad Sci USA 100, 605-10). Tandem mass spectra were searched against the NCBInr database (February 2002) with tryptic constraints and initial mass tolerances<0.13 Da in the search software Mascot (Matrix Sciences, London). Only peptides achieving a Mascot score above 25 and containing a sequence tag of at least three consecutive amino acids were accepted.
Curation of Previously Annotated Mitochondrial Proteins. Two key sources were used to identify previously annotated proteins. First, Applicant downloaded the 308 human and 117 mouse protein sequences at the MITOchondria Project (MITOP) (Scharfe et al. (2000). Nucleic Acids Res 28, 155-8). Applicant also downloaded the 199 human and 290 mouse protein sequences annotated at LocusLink (http://www.ncbi.nlm.nih.gov/LocusLink) as having a mitochondrial subcellular localization based on gene ontology terminology (GO:0005739) (Lewis et al. (2000). Curr Opin Struct Biol 10, 349-54) (January 2003). Also included in the master list are the 13 mtDNA-encoded proteins, based on LocusLink annotation.
A Nonredundant List of Mitochondrial Proteins. FASTA sequences corresponding to the previously annotated mitochondrial proteins, newly identified mitochondrial proteins, and the mouse Reference Sequences (Maglott et al. (2000). Nucleic Acids Res 28, 126-8) were merged. These were then collapsed into distinct protein clusters using a downloaded version of blastclust (http://www.ncbi.nlm.nih.gov/BLAST/). Applicants required that members of a cluster demonstrate 70% sequence identity over 50% of the total length, not requiring a reciprocal relationship to exist. Clusters containing multiple Reference Sequences were then broken using a higher stringency blastclust, in which applicants required 90% identity over 50% of the length. Clusters containing hemoglobin, trypsin, and albumin were eliminated as obvious contaminants. When possible the Reference Sequence was selected as the exemplar from the cluster, otherwise another sequence was manually selected. Hence, each cluster is annotated by an exemplar sequence, the protein accessions (and tissues) in which the proteins were found in the proteomics experiments, and the protein accessions corresponding to annotation sources. Applicant obtained a total of 612 distinct protein clusters (Table 2). The GenPept descriptions of 37 of these exemplars suggested that they are mitochondrial, but simply missed by the automated annotation procedure using the MITOP and LocusLink databases. These exemplars were therefore manually annotated as previously known mitochondrial proteins, to provide a more conservative estimate of our sensitivity measure and newly discovered proteins.
Statistical Analysis. Cluster enrichment was determined using a cumulative hypergeometric distribution. To determine whether two empirical cumulative distributions arise from the same underlying distribution, Applicant used the Kolmogorov-Smirnov test statistic, D. Tail values were obtained using Matlab (Mathworks).
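The two statistics named above can be sketched with the standard library as follows. This is a minimal illustration: treating enrichment as an upper-tail probability, the function names, and the argument order are assumptions, and the sketch returns only the statistic D, not its tail value.

```python
from bisect import bisect_right
from math import comb


def hypergeom_upper_tail(k, population, marked, draws):
    """P(X >= k) when drawing `draws` items without replacement from a
    population of which `marked` items carry the annotation of interest."""
    highest = min(marked, draws)
    favorable = sum(
        comb(marked, x) * comb(population - marked, draws - x)
        for x in range(max(k, 0), highest + 1)
    )
    return favorable / comb(population, draws)


def ks_statistic(sample_x, sample_y):
    """Two-sample Kolmogorov-Smirnov statistic D: the largest absolute gap
    between the two empirical cumulative distribution functions."""
    xs, ys = sorted(sample_x), sorted(sample_y)
    largest_gap = 0.0
    for point in sorted(set(xs) | set(ys)):
        cdf_x = bisect_right(xs, point) / len(xs)
        cdf_y = bisect_right(ys, point) / len(ys)
        largest_gap = max(largest_gap, abs(cdf_x - cdf_y))
    return largest_gap
```

For example, `hypergeom_upper_tail(5, 10, 5, 5)` is the chance that all five draws from ten items land on the five marked items.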
RNA/Protein Concordance Test. The RNA/protein concordance test was developed to determine whether there is significant concordance between protein detection in a proteomics experiment and mRNA abundance in a microarray experiment. Consider a pair of tissues, i, j, where i, j ∈ {brain, heart, kidney, liver}. For a given gene, G, let M(G,k) represent the gene expression level of gene G in tissue k. Let P(G,k) be an indicator variable that is 0 if the protein product of gene G is not found in tissue k, and 1 if the protein product is found in tissue k. The mRNA and protein expression levels of gene G are concordant in tissues i and j if M(G,i)>M(G,j) when P(G,i)>P(G,j). For a given gene, G, compute the total number of observed concordances (cG) between all pairs of tissues as well as the expected variance in concordance (vG) for that gene. The test statistic is simply
which has mean 0 and variance 1 and is approximately normal in the null case where there is no concordance between RNA abundance and protein detection.
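The concordance count for a single gene can be illustrated as follows. This is a minimal sketch that only counts concordant tissue pairs according to the definition above; it does not compute the expected concordance, the variance, or the normalized test statistic. The function name, the dictionary layout, and the example values are hypothetical.

```python
from itertools import permutations


def count_concordances(mrna, detected):
    """Return (concordant, informative) tissue-pair counts for one gene.

    mrna:     tissue -> mRNA expression level, M(G, k)
    detected: tissue -> 0/1 protein detection indicator, P(G, k)
    A pair (i, j) is informative when P(G, i) > P(G, j), and concordant
    when, in addition, M(G, i) > M(G, j).
    """
    concordant = informative = 0
    for i, j in permutations(mrna, 2):
        if detected[i] > detected[j]:  # protein found in i but not in j
            informative += 1
            if mrna[i] > mrna[j]:
                concordant += 1
    return concordant, informative


# Hypothetical gene: protein detected in heart and kidney only.
example_mrna = {"brain": 50, "heart": 900, "kidney": 400, "liver": 500}
example_detected = {"brain": 0, "heart": 1, "kidney": 1, "liver": 0}
example_counts = count_concordances(example_mrna, example_detected)
```

In this example three of the four informative pairs are concordant; the kidney/liver pair is not, because the protein was detected in kidney while the mRNA level is higher in liver.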
Compositional Diversity Across Tissues. Mitochondrial gene products show distinct patterns of expression based on protein and RNA expression (Table 5). These patterns of distribution can be used to develop a simple model that describes core mitochondrial proteins versus those that are specialized to any set of cell types.
Consider a set of i+1 tissues, S_{i+1}, as well as a distinct subset S_i, i.e., S_i ⊂ S_{i+1}, where i>0. Applicants are interested in the probability that a given gene product is found in S_{i+1} conditional on its being found in S_i, or simply T(S_{i+1}, S_i) = P(gene product is found in S_{i+1} | gene product is found in S_i). Define P_i as the average T(S_{i+1}, S_i) over all selections of S_i ⊂ S_{i+1}. When Applicant assessed compositional diversity using RNA expression levels, Applicant interpreted an RNA expression level greater than 200 as present (Su et al. (2002). Proc Natl Acad Sci USA 99, 4465-70), and an expression level below this as not present. These average conditional probabilities P_i can also be modeled. Imagine that a fraction f of all mitochondrial proteins are ubiquitous (i.e., expressed in all cell types with probability 1) and that a fraction 1−f are not ubiquitous, but rather, appear in a given tissue with probability p. Then P_{i+1} = (f + (1−f)p^{i+1}) / (f + (1−f)p^i).
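The two-parameter model can be evaluated numerically as sketched below. The sketch assumes that, under the model, a gene product is found in every one of n given tissues with probability f + (1−f)p^n, so the conditional probability is a ratio of two such terms. The function name and the parameter values used in the example are hypothetical and are not estimates from the data.

```python
def conditional_presence(i, f, p):
    """Model probability that a gene product already found in i tissues
    is also found in one further tissue.

    f: fraction of mitochondrial proteins that are ubiquitous
    p: per-tissue probability of presence for non-ubiquitous proteins
    """
    def found_in_all(n):
        # Ubiquitous proteins are always present; the remainder must be
        # present independently in each of the n tissues.
        return f + (1.0 - f) * p ** n

    return found_in_all(i + 1) / found_in_all(i)


# Hypothetical parameters: half of the proteins ubiquitous, p = 0.5.
example_curve = [conditional_presence(i, f=0.5, p=0.5) for i in (1, 2, 3)]
```

For 0 < f < 1 and 0 < p < 1 the modeled probability rises toward 1 as i grows, reflecting the increasing share of ubiquitous proteins among those already seen in many tissues.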
DNA Microarray Analysis. To identify Affymetrix probe-sets corresponding to each protein cluster, Applicant mapped the exemplar sequence to the Unigene cluster, and then identified the corresponding Affymetrix MG-U74Av2 probe set. The NetAffx website (http://www.affymetrix.com) and its tables were used to perform these mappings (January 2003). The GNF mouse expression atlas (Su et al. (2002). Proc Natl Acad Sci USA 99, 4465-70) was downloaded from its website (http://www.gnf.org). In comparisons of protein detection and mRNA abundance, Applicant used the mRNA expression level for a given tissue averaged over the replicates, since the GNF mouse expression atlas includes duplicates for each tissue. Because the proteomic survey was performed on whole brain, Applicant simply compared to the average expression of all brain samples in the GNF mouse atlas. Hierarchical clustering was performed using DCHIP (Schadt et al. (2001). J Cell Biochem Suppl Suppl, 120-5).
Identification of Ancestral Mitochondrial Genes. The consensus FASTA sequences for the genes represented on the Affymetrix MG-U74Av2 oligonucleotide array were downloaded from the NetAFFX (Liu et al. (2003). Nucleic Acids Res 31, 82-6) website (http://www.affymetrix.com). A blastx comparison of these sequences was performed against the Rickettsia prowazekii protein sequences, downloaded from the NCBI, and then a tblastn comparison of the bacterial protein sequences was performed against the consensus FASTA sequences. An ancestral gene was defined as one achieving a BLASTX E<0.01 and having a reciprocal best match in the BLAST analysis.
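The reciprocal-best-match criterion can be sketched as follows, assuming the blastx and tblastn results have already been parsed into dictionaries mapping each query to a list of (subject, E-value) pairs. The data layout, function names, and toy identifiers are hypothetical; parsing of actual BLAST output is not shown.

```python
def best_hit(hits):
    """Return the (subject, e_value) pair with the lowest E-value."""
    return min(hits, key=lambda hit: hit[1])


def ancestral_genes(blastx_hits, tblastn_hits, max_evalue=0.01):
    """Genes whose best bacterial protein hit has E < max_evalue and whose
    best bacterial protein, searched back, returns the same gene."""
    reverse_best = {
        protein: best_hit(hits)[0]
        for protein, hits in tblastn_hits.items() if hits
    }
    result = []
    for gene, hits in blastx_hits.items():
        if not hits:
            continue
        protein, e_value = best_hit(hits)
        if e_value < max_evalue and reverse_best.get(protein) == gene:
            result.append(gene)
    return sorted(result)


# Toy input with made-up identifiers.
toy_blastx = {
    "geneA": [("RP001", 1e-30), ("RP002", 1e-5)],
    "geneB": [("RP001", 1e-3)],
    "geneC": [("RP003", 0.5)],
}
toy_tblastn = {
    "RP001": [("geneA", 1e-28), ("geneB", 1e-2)],
    "RP003": [("geneC", 0.4)],
}
toy_ancestral = ancestral_genes(toy_blastx, toy_tblastn)
```

Here geneB is excluded because its best bacterial match points back to geneA, and geneC is excluded because its E-value exceeds the threshold.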
Applicants carried out a systematic survey of mitochondrial proteins from brain, heart, kidney, and liver of C57BL6/S mice (see Methods). Each of these tissues provides a rich source of mitochondria. The isolation consisted of density centrifugation followed by Percoll purification. Preparations were tested for purity and for contamination using immunoblotting directed against organelle markers, enzymatic assays to ensure that the mitochondria were intact, and electron microscopy. The liver, heart, and kidney mitochondria were extremely pure. The brain mitochondria tended to show persistent contamination by synaptosomes, which themselves are a rich source of neuronal mitochondria (see Fernandez-Vizarra (2002). Methods 26, 292-7).
Mitochondrial proteins from each tissue were solubilized and size separated by gel filtration chromatography into approximately 20 fractions (see Methods). These proteins were then digested and analyzed by liquid chromatography mass spectrometry/mass spectrometry (LC-MS/MS). More than 100 LC-MS/MS experiments were performed (see Methods).
The acquired tandem mass spectra were then searched against the NCBI nonredundant database consisting of mammalian proteins using a probability-based method (Perkins et al. (1999). Electrophoresis 20, 3551-67). Stringent criteria were used for accepting a database hit. Specifically, only peptides corresponding to complete tryptic cleavage specificity with scores greater than 25 were considered (see Methods). Furthermore, only fragmentation spectra which also exhibited a correct, corresponding peptide sequence tag (Mann et al. (1994). Anal Chem 66, 4390-9) consisting of at least three amino acids were considered.
Using these criteria, ˜2100 database hits were identified. This list contains a high degree of redundancy, because a protein may have been found in adjacent fractions of the gel and in different tissues. The ˜2100 hits collapse to a distinct set of 422 mouse proteins (see Table 4,
A list of previously annotated mouse and human mitochondrial proteins was created by pooling all the mouse and human proteins from MITOchondria Project (MITOP, http://mips.gsf.de/proj/medgen/mitop/), a public database of curated mitochondrial proteins, as well as all proteins annotated as mitochondrial in NCBI's LocusLink database (http://www.ncbi.nlm.nih.gov/LocusLink/) (see Methods). After elimination of redundancy, the list contains 452 distinct mouse proteins that are either directly annotated as mitochondrial or whose human homolog is annotated as mitochondrial (
The set of 422 proteins identified in Applicant's proteomic survey includes 262 of the 452 proteins previously annotated to be mitochondrial (58%) and 160 proteins not previously annotated as associated with the mitochondria (
The 422 proteins identified in the proteomic survey span a wide range of isoelectric points and molecular weights (
The 160 proteins not previously annotated as mitochondrial potentially represent new mitochondrial proteins, either in the conventional sense of being present within the organelle or in a broader sense of being tethered to the mitochondrial outer membrane (e.g., tubulin (Heggeness et al. (1978). Proc Natl Acad Sci USA 75, 3863-6)).
To test this notion, Applicants sought independent evidence that these 160 proteins are actually mitochondrial. First, the list was compared to proteins identified in a recent survey of human heart mitochondria (Taylor et al. (2003). Nat Biotechnol 18, 18). Human homologs of 64 of the 160 proteins were identified in this recently published study. Of the remaining 96 proteins, 24 have strong mitochondrial targeting sequences based on bioinformatic analysis of protein targeting sequences (Table 4 and Methods) (Nakai et al. (1999). Trends Biochem Sci 24, 34-6), a proportion similar to the known mitochondrial proteins. For example polymerase delta interacting protein 38 (encoded by Pdip38-pending), which was detected only in liver mitochondria, and the gene product of Rnaseh1, which was found only in the kidney, have strong mitochondrial targeting scores. A recent study confirmed that Rnaseh1 can be localized to the mitochondrion, where it plays a critical role in mtDNA homeostasis (Cerritelli et al. (2003). Mol Cell 11, 807-15).
Applicant also investigated co-regulation of the 612 mito-P genes across different tissues. For 388 of the 612 mito-P genes, mRNA expression levels were available in a mouse gene expression compendium containing data across 47 tissues (Su et al. (2002). Proc Natl Acad Sci USA 99, 4465-70).
Applicant calculated pairwise correlation and performed hierarchical clustering of these 388 gene expression profiles (
Some of these gene modules have no obvious functional relationships, though two appear to be enriched in certain tissues (modules 1,2). Each of these gene modules is characterized by tightly correlated gene expression across the tissue compendium. Members of each module likely share transcriptional regulatory mechanisms as well as cellular functions. Many of the newly identified mitochondrial genes (black bar in annotation bar of
The mitochondria gene modules provide an initial step towards the characterization of some of the newly identified mitochondrial genes, since functionally related genes tend to have correlated gene expression. Of the 104 newly identified mitochondrial proteins that are represented in this microarray dataset, 38 fall within these 7 modules, providing them with a preliminary functional context.
A striking gene module (module 6) consists of genes related to oxidative phosphorylation (OXPHOS) and β-oxidation and expressed at high levels in brown fat, skeletal muscle, and heart (
Applicant also sought to systematically identify all genes that exhibit correlated expression with the mito-P genes. This was done using the neighborhood index (N100), a previously described statistic that measures a given gene's expression similarity to a target gene set (Mootha et al. (2003). Proc Natl Acad Sci USA 100, 605-10). For a given gene, the mitochondria neighborhood index is defined as the number of mito-P genes among its nearest 100 expression neighbors. Applicant computed the N100 statistic for all genes in the mouse expression atlas (
The 10,043 genes in the mouse expression atlas include 388 of the 612 mito-P genes. If these 388 genes were a random subset, an N100 value greater than 10 would be expected to occur by chance 1 in 1000 times, and an N100 greater than 50 would be exceedingly rare (P=1.5×10^−14).
A total of 806 genes have N100>10. This is defined herein as the expression neighborhood of the mito-P set, and Applicant interprets these genes as being co-regulated with mitochondrial genes (see the entire rank ordered list, Table 7). This group corresponds to only 8% of all the genes studied, but it contains 52% of the mito-P genes (6.5-fold enrichment, P=1.49×10^−11). The list includes 59 that are newly mitochondrial, based on the proteomic survey described herein, and 25 that were previously known to be mitochondrial but not detected by that proteomic survey.
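The neighborhood index lends itself to a short illustration. The sketch below assumes that expression similarity is measured by Pearson correlation across tissues, which is an assumption made here for concreteness; the function names and the toy profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def neighborhood_index(gene, profiles, target_set, n_neighbors=100):
    """Number of target-set genes among the n_neighbors genes whose
    profiles are most similar to that of `gene` (the gene itself excluded)."""
    others = [g for g in profiles if g != gene]
    others.sort(key=lambda g: pearson(profiles[gene], profiles[g]), reverse=True)
    return sum(1 for g in others[:n_neighbors] if g in target_set)


# Toy atlas with four genes measured in four tissues.
toy_profiles = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8.5],
    "c": [4, 3, 2, 1],
    "d": [1, 3, 2, 4],
}
toy_targets = {"b", "c"}
```

With `n_neighbors=100` and the mito-P genes as `target_set`, a gene would be placed in the expression neighborhood when the returned value exceeds 10.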
Importantly, the expression neighborhood includes 605 genes not present in the mito-P set itself. These genes may encode proteins that are physically present in mitochondria but were missed in the proteomic survey or that are functionally related to mitochondria but not physically associated. They provide a catalog of genes that are likely functionally relevant to mitochondrial biology, and are complementary to the proteomic approach that identified proteins resident in this organelle.
Applicant found several genes involved in DNA replication within the mitochondria neighborhood (Table 1). Esrra, Pparg, and Ppara encode nuclear receptors that are tightly co-regulated with the mitochondrial genes. This is intriguing since previous studies have suggested that these nuclear receptors are important partners of the coactivator PGC-1, a key molecule in mitochondrial biogenesis (Puigserver et al. (2003). Endocr Rev 24, 78-90). While nuclear receptors are critical to mitochondrial biogenesis (Scarpulla, R. C. (2002). Biochim Biophys Acta 1576, 1-14), to our knowledge, none has previously been reported to be co-regulated with the mitochondrial genes themselves. Interestingly, a recent report demonstrated that PGC-1α co-activates Esrra gene expression (Schreiber et al. (2003). J Biol Chem 278, 9013-8). Applicant's results raise the hypothesis that this may be a general phenomenon, in which PGC-1α is co-activating a number of its own transcriptional partners.
A number of other transcriptional regulators also have expression patterns very tightly regulated with the mitochondrial genes, including Mdfi, Nfix, Thx6, and Crsp2. These are excellent candidate transcription factors that may be targets of PGC-1α, or perhaps are involved in other mechanisms leading to the biogenesis of this organelle.
Surprisingly, the nutrient sensor Sir2 is also found within the mitochondrial expression neighborhood. Sir2 encodes an NAD(+)-dependent histone deacetylase which is homologous to the yeast silent information regulator 2 (ySir2). Sir2 is involved in gene silencing, chromosomal stability, and aging. Chromatin remodeling enzymes rely on coenzymes derived from metabolic pathways, including those generated by the mitochondrion. These observations suggest that Sir2 and mitochondrial gene expression are cooperatively regulated, perhaps linking the mitochondrion to the nutrient sensing activities of Sir2.
Third Experimental Series
The following experimental procedures were followed in the third experimental series:
Data Scaling, Visualization, and Annotation Enrichment. Microarray data were acquired and subjected to linear scaling using the median scan as a reference. Data were visualized using the dChip software package (10) and enrichment by ontology terms was determined with the GoSurfer tool, using a P-value of 0.01 (11). Mitochondrial genes were defined based on a recent proteomic survey of the organelle in mouse (12).
Promoter Databases. Applicants used the Reference Sequence annotations of mm3 build of the mouse genome (http://genome.ucsc.edu) and the annotation tables for the Affymetrix MG-U74Av2 chip (http://www.affymetrix.com) to compile a list of 5034 mouse genes for which there is a 1:1 mapping between Affymetrix probe-set and Reference Sequences. The ‘mouse promoter database’ consists of 2000 bp of genomic sequence centered on the annotated transcription start site of these genes.
Applicants also performed analyses on a ‘masked promoter database’, consisting of the regions within these 2000 bp that are aligned and conserved between mouse and human. Applicants used the mouse/human BLASTZ alignments (mouse mm3 vs. human hg15) (13) and only considered the 5008 promoters for which the alignment contained at least 100 bp. Applicants masked the aligned promoters to retain mouse sequence exhibiting at least 70% identity to human across windows of size 10. The median promoter length in the masked database is ˜1200 bp.
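The masking step can be illustrated with a short sketch. It assumes a gapped pairwise alignment supplied as two equal-length strings and assumes overlapping (sliding) windows of 10 alignment columns; the source does not state whether windows overlap, so this is only one possible reading. Positions outside any qualifying window are replaced by "N". The function name and the toy alignment are hypothetical.

```python
def mask_promoter(mouse_aln, human_aln, window=10, min_identity_pct=70):
    """Keep mouse bases lying in any window of `window` alignment columns
    with at least `min_identity_pct` percent identity to human; mask the
    remaining bases with 'N'. Gap columns in the mouse row are dropped."""
    if len(mouse_aln) != len(human_aln):
        raise ValueError("aligned sequences must have equal length")
    mouse, human = mouse_aln.upper(), human_aln.upper()
    match = [m == h and m != "-" for m, h in zip(mouse, human)]
    keep = [False] * len(mouse)
    for start in range(len(mouse) - window + 1):
        # Integer arithmetic avoids floating-point threshold issues.
        if sum(match[start:start + window]) * 100 >= min_identity_pct * window:
            for col in range(start, start + window):
                keep[col] = True
    return "".join(m if k else "N" for m, k in zip(mouse, keep) if m != "-")


# Toy alignment: the first ten columns are identical, the rest differ.
toy_mouse = "ACGTACGTAC" + "T" * 10
toy_human = "ACGTACGTAC" + "G" * 10
toy_masked = mask_promoter(toy_mouse, toy_human)
```

With sliding windows, a few mismatched bases flanking a conserved block are retained because they fall inside a window that still reaches 70% identity; a non-overlapping scheme would mask them.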
Motif discovery. For a given day, genes from the microarray are ordered on the basis of expression difference between GFP and PGC-1α (applicants use the signal to noise ratio as our difference metric). Each gene is annotated for the presence of a motif in the promoter by searching for exact k-mers (where k=6, 7, 8 or 9) or for selected motifs of interest. Applicants use the Mann-Whitney rank sum statistic U to determine whether the distribution of differential expression for those genes with a given motif differs from those genes lacking the motif. When working with promoters of unequal length (e.g., the masked promoter database), a more appropriate null hypothesis for the Mann-Whitney statistic is that the probability of detecting a motif in a promoter is proportional to its length. To assess the significance of a motif with rank sum U that appears in C promoters, applicants use Monte Carlo simulation (with 1000 samples) to estimate the null distribution of U for a sample of C ranks drawn randomly, without replacement, given relative weights proportional to the promoter lengths. For large C (C>10) and a reasonable distribution of promoter lengths, U is approximately normally distributed.
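The ranking test described above can be sketched as follows. This is a minimal illustration and not the motifADE source code: it uses exact substring matching to annotate promoters, ignores tied scores, and uses the normal approximation for equal-length promoters, with a separate Monte Carlo routine for length-weighted sampling. All function and parameter names are hypothetical.

```python
import random
from math import erfc, sqrt


def annotate(promoters, motif):
    """Flag each promoter sequence that contains the exact k-mer."""
    return [motif in seq for seq in promoters]


def motif_rank_sum(scores, has_motif):
    """Mann-Whitney U for motif-bearing genes, with normal-approximation
    z-score and one-sided P-value (motif genes ranked toward high scores).
    scores[g] is the differential expression metric of gene g."""
    n = len(scores)
    n1 = sum(has_motif)
    n2 = n - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("motif must be present in some genes and absent in others")
    order = sorted(range(n), key=lambda g: scores[g])  # rank 1 = lowest score
    rank = {g: r for r, g in enumerate(order, start=1)}
    rank_sum = sum(rank[g] for g in range(n) if has_motif[g])
    u = rank_sum - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n + 1) / 12)
    return u, z, 0.5 * erfc(z / sqrt(2))


def weighted_null_u(lengths_by_rank, c, n_samples=1000, seed=0):
    """Monte Carlo null distribution of U when c ranks are drawn without
    replacement with probability proportional to promoter length.
    lengths_by_rank[r - 1] is the promoter length of the gene at rank r."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_samples):
        ranks = list(range(1, len(lengths_by_rank) + 1))
        weights = list(lengths_by_rank)
        rank_sum = 0
        for _ in range(c):
            idx = rng.choices(range(len(ranks)), weights=weights)[0]
            rank_sum += ranks.pop(idx)
            weights.pop(idx)
        null.append(rank_sum - c * (c + 1) / 2)
    return null
```

An empirical P-value for the masked database would then be the fraction of simulated U values at least as large as the observed one.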
Promoter databases and motifADE source code are available at http://www-genome.wi.mit.edu/mpg/PGC_motifs/.
Systematic identification of transcription factors involved in biological processes in mammals remains a largely unsolved problem (17). A promising approach relates genome-wide expression profiles to promoter sequences to discover influential cis-motifs (18-21). Such methods have yielded impressive results in simple organisms such as yeast, but it has been challenging to extend these algorithms to mammalian genomes, where intergenic regions are large, annotation of gene structure is imperfect, and DNA sequence can be highly repetitive. Most of these methods seek motifs by comparison to a fixed background model of nucleotide composition (which fails to represent the fluctuations seen in large genomes) or by comparison between two sets of genes (which is likely to capture only very sharp differences). Further, many of these methods assume that the expression data are normally distributed, which may not always be true.
To overcome some of these obstacles, applicants devised a simple, nonparametric strategy for identifying motifs associated with differential expression (motifADE) (
To identify motifs related to PGC-1α action, applicants infected mouse C2C12 muscle cells with an adenovirus expressing PGC-1α and obtained gene expression profiles for 12,488 genes at 0, 1, 2, and 3 days following infection. Applicants found 649 genes that were induced at least 1.5-fold (nominal P<0.05) at day 3. As expected, these were enriched for genes involved in carbohydrate metabolism and the mitochondrion (see (1)). Interestingly, many genes involved with protein synthesis (GO terms: protein biosynthesis, mitochondrial ribosome and ribosome) are also induced.
Applicants then applied motifADE to study the 5034 mouse genes for which applicants have measures of gene expression as well as reliable annotations of the transcriptional start site (TSS) (see Methods). For each gene, the target region was defined to be a 2 kb region centered on the TSS. Applicants then tested all possible k-mers ranging in size from k=6 to k=9 nucleotides for association with differential expression on each of the three days of the timecourse. A total of 20 motifs achieved high statistical significance (p<0.001, following Bonferroni correction for multiple hypothesis testing) and these were almost exclusively related to two distinct motifs (see Table 8 and Table 9). The first motif, 5′-TGACCTTG-3′, was significant on days 1, 2, and 3 (adjusted P=2.1×10^−6, 2.9×10^−9, and 7.7×10^−7, respectively). It corresponds to the published binding site for the orphan nuclear receptor Errα (22), which is known to be capable of being co-activated by PGC-1α and -β (23-25). The Errα gene is known to be involved in metabolic processes, based on studies showing that knockout mice have reduced body weight and peripheral fat tissue, as well as altered expression of genes involved in metabolic pathways (26). The second motif is 5′-CTTCCG-3′ (adjusted P=8.9×10^−9), which is the top scoring motif on day 3. It corresponds to the published binding site for Gabpa (27), which complexes with Gabpb (15) to form the heterodimer, nuclear respiratory factor-2 (NRF-2), a factor known to regulate the expression of some OXPHOS genes (28).
Interestingly, the reverse complements of these motifs did not score as well, suggesting a preference for the orientation of these motifs, and some occurrences of the motifs occurred downstream of the TSS. While each of these motifs is individually associated with PGC-1α, our analyses suggest that a gene having both motifs typically ranks higher on the list of differentially expressed genes than genes with only one of the motifs (
Applicants next repeated the motifADE analysis using a “masked” promoter database (Table 3). Applicants still considered the 2000 bp centered on the TSS, but only considered those nucleotides aligned and conserved between mouse and human (see Methods). Still, the top ranking motifs on days 1 and 3 were related to Errα (day 1, P=4.8×10^−6; day 3, P=1.2×10^−11) and to Gabpa (day 3, P=3.1×10^−11), providing additional support that these motifs are biologically relevant.
The Errα and Gabpa motifs are particularly enriched upstream of the OXPHOS-CR genes, which exhibit reduced expression in human diabetes (5, 6). Whereas the top scoring Errα motif (5′-TGACCTTG-3′ or its reverse complement) only occurs in 12% of the promoters in the database, in 29% of the PGC-responsive genes (i.e., those genes induced at least 1.5 fold on day 3), and in 27% of the mitochondrial genes, it is found in 52% of the OXPHOS-CR genes (significance of enrichment, P=1×10^−4). About one-half of these sites are perfectly conserved in the syntenic region in human. The top scoring Gabpa binding sites (5′-CTTCCG-3′ or its reverse complement) are much more common (62% of all promoters of the database and 79% of the PGC-responsive genes), but they, too, show significant enrichment in the OXPHOS-CR genes (89%, P=0.02).
The above results suggest that Errα and Gabpa may be the key transcriptional factors mediating PGC-1α action in muscle. In this connection, it is notable that, based on the microarray data, both Errα and Gabpa are themselves induced 2-fold (P<0.01) on day 1 following expression of PGC-1, consistent with previous studies (2, 23). Moreover, careful analysis of the Errα and Gabpa genes suggests that each contains potential binding sites for both transcription factors within the vicinity of its promoter. The Errα gene has the Errα motif as well as a conserved variant of the Gabpa binding site (27) upstream of the TSS, while the Gabpa gene has an Errα site upstream of the TSS and a conserved variant of the Gabpa binding site in its first intron. These results raise the possibility that Errα and Gabpa may regulate their own and each other's expression.
Taken together, the systematic analysis of the transcriptional program driven by PGC-1α in skeletal muscle suggests a model (
Experiment 18: MotifADE Results Applied to Human Diabetic Versus Normal Expression
Applicants applied the motifADE method to identify transcription factor binding sites associated with differential expression in diabetic vs. normal human skeletal muscle (previously published data, Mootha et al., Nature Genetics 2003). The program identified exactly three motifs achieving an adjusted P-value<0.05. These are AAATCG (adjusted P-value 0.003), CCGGAAG (adjusted P-value 0.039), and AGCGTTT (adjusted P-value 0.011). Applicants note that the second motif contains a published binding site for Gabpa (the reverse complement of CTTCCG). These results suggest that Gabpa function is altered in diabetic muscle, or perhaps that of another transcription factor that binds to this element.
Experiment 6: Identification of Human Genes Having Binding Sites for Errα, Gabpa or Both
Applicants searched for the binding site motifs (forward or reverse complement) from 3 Kb upstream to 1 Kb downstream of the annotated transcription start site. In the accompanying files are the genes with either one motif (forward or reverse complement) or both motifs conserved between human and mouse. The following genes were identified: Table 10: 678 genes with Errα motif conserved between mouse and human. Table 11: 2799 genes with Gabpa motif conserved between mouse and human. Table 12: 354 genes with both motifs conserved between mouse and human.
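A search of both strands for a motif can be sketched as below. The sketch assumes the region around the transcription start site has already been extracted as a plain DNA string (for example, 3 Kb upstream plus 1 Kb downstream, with the TSS at index 3000); the function names and the toy sequence are hypothetical, and the cross-species conservation check is not shown.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif_sites(region, motif, tss_index):
    """Return sorted (offset_from_tss, strand) pairs for every exact
    occurrence of `motif` ('+') or its reverse complement ('-')."""
    region = region.upper()
    patterns = (("+", motif.upper()), ("-", reverse_complement(motif)))
    hits = set()
    for strand, pattern in patterns:
        start = region.find(pattern)
        while start != -1:
            hits.add((start - tss_index, strand))
            start = region.find(pattern, start + 1)  # allow overlapping sites
    return sorted(hits)


# Toy region with one forward and one reverse-complement Gabpa site.
toy_region = "AAACTTCCGAAACGGAAGAAA"
toy_sites = find_motif_sites(toy_region, "CTTCCG", tss_index=10)
```

Negative offsets denote sites upstream of the TSS. The same helper confirms that the reverse complement of CTTCCG is CGGAAG, which is contained in the motif CCGGAAG.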
Discussion of First Experimental Series
In this study, applicants have used a combined genomic and computational strategy to systematically dissect a mammalian transcriptional circuit central to cellular energetics. The results above have computational, biological and medical implications.
First, the motifADE algorithm provides a simple, nonparametric approach for discovering cis-elements by considering differential gene expression. It makes very few assumptions about the statistical properties of DNA composition or about the distribution of gene expression. The method is flexible, and as applicants have shown, can easily incorporate “masked” or “phylogenetically footprinted” promoters. With additional cross-species comparisons, it should be possible to interrogate conserved segments of larger upstream regions (34). Moreover, the method operates on any ordered set of genes and is particularly convenient for discovering motifs associated with human disease states, e.g., “healthy versus sick” or “treated versus control.” Clearly, the method has some limitations. For example, in the current study, applicants were confident in the identity of the transcription factors binding the motifs discovered—in general this may not be the case, and experimental strategies will be needed to systematically determine the occupancy of newly identified motifs. Moreover, a motif may be missed if it lies outside the target promoter region, or if a functional binding site is too degenerate for our motif search strategy.
Second, the analyses above indicate that the immediate effects of PGC-1α on OXPHOS genes in muscle are largely mediated through Errα and Gabpa. Recent studies have shown that PGC-1β can also co-activate Errα (25). Together, the data imply a model of gene regulation in which PGC-1α (and likely PGC-1β) initially induces the expression of Errα and Gabpa, via a double positive feedback mechanism (
Finally, the results suggest a potential approach to the treatment of type 2 diabetes. Recent studies in diabetic and pre-diabetic humans have demonstrated that there is a consistent decrease in the expression of genes of oxidative phosphorylation that are responsive to PGC-1α and PGC-1β and that treatments that induce PGC-1α (such as exercise) lead to increased expression of OXPHOS genes and improved insulin sensitivity (5, 6, 8, 9). On its face, this might argue for developing therapeutic approaches that raise the transcriptional activity of PGC-1. However, PGC-1 activates many different pathways in many tissues and such approaches may suffer from lack of specificity. For example, global transgenic overexpression of PGC-1β in mice results in resistance to obesity induced by a high-fat diet or by a genetic abnormality, though the contribution of PGC-1β expression in muscle has not been explored (25). On the other hand, a global knockout of Errα also causes a leaner phenotype and resistance to high-fat diet-induced obesity (26). The identification of the critical roles of Errα and Gabpa in mediating the transcriptional program altered in human diabetic muscle may offer a more specific target. Because Errα is an orphan nuclear receptor, it may be an attractive, “druggable” target for diabetes and for other human metabolic disorders.
Tables:
Values are mean (S.D.).
M-value is the total body glucose uptake.
VO2max is the total body aerobic capacity.
Only P-values < 0.05 are shown for pairwise comparisons, using a two-sided t-test.
Table 8 shows motifs associated with differential expression on days 1, 2, and 3.
motifADE was performed using the mouse promoter database on each of days 1, 2, and 3. All motifs achieving a Bonferroni-corrected P-value < 1 × 10^−3 are shown. Annotations of the motif and the literature references, when available, are indicated.
motifADE was performed using the mouse promoter database on each of days 1, 2, and 3. Motifs achieving a Bonferroni-corrected P-value < 0.05 are shown.
Table 10 shows motifs discovered using the masked promoter database achieving P<0.05.
motifADE was performed using the masked promoter database, consisting of regions of the promoters aligned and conserved between mouse and human. Motifs achieving a Bonferroni-corrected P-value < 0.05 are shown.
Homo sapiens hydroxyacyl-Coenzyme A dehydrogenase/3-ketoacyl-Coenzyme A,
sapiens keratin 14 (epidermolysis bullosa simplex, Dowling-Meara, Koebner)”, “(KRT14),
sapiens prostaglandin I2 (prostacyclin) receptor (IP) (PTGIR), mRNA”,
sapiens actinin, alpha 3 (ACTN3), mRNA”, gi|4557240|ref|NM_001104.1|[4557240]; 66:
Homo sapiens solute carrier family 25 (mitochondrial carrier; adenine nucleotide, “translocator),
sapiens baculoviral IAP repeat-containing 3 (BIRC3), transcript variant 1,”, mRNA,
sapiens CD1D antigen, d polypeptide (CD1D), mRNA”,
sapiens crystallin, mu (CRYM), mRNA”, gi|4503064|ref|NM_001888.1|[4503064]; 96:
sapiens potassium voltage-gated channel, subfamily H (eag-related), member”, “1 (KCNH1),
sapiens solute carrier family 18 (vesicular acetylcholine), member 3”, “(SLC18A3), mRNA”,
sapiens small nuclear ribonucleoprotein polypeptide N (SNRPN), transcript”, “variant 1,
sapiens HIR histone cell cycle regulation defective homolog A (S., “cerevisiae) (HIRA),
sapiens G protein-coupled receptor 68 (GPR68), mRNA”,
sapiens histone 1, H2bc (HIST1H2BC), mRNA”, gi|21166388|ref|NM_003526.2|[21166388];
sapiens beaded filament structural protein 2, phakinin (BFSP2), mRNA”,
sapiens carbohydrate (N-acetylglucosamine-6-O) sulfotransferase 2 (CHST2),”, mRNA,
sapiens calpain 1, (mu/I) large subunit (CAPN1), mRNA”,
sapiens protein phosphatase 1, regulatory (inhibitor) subunit 3C (PPP1R3C),”, mRNA,
sapiens ADP-ribosyltransferase (NAD+; poly (ADP-ribose) polymerase)-like 3, “(ADPRTL3),
sapiens laminin, alpha 5 (LAMA5), mRNA”, gi|21264601|ref|NM_005560.3|[21264601]; 293:
sapiens nescient helix loop helix 1 (NHLH1), mRNA”,
sapiens splicing factor, arginine/serine-rich 4 (SFRS4), mRNA”,
sapiens follistatin-like 3 (secreted glycoprotein) (FSTL3), mRNA”,
sapiens calicin (CCIN), mRNA”, gi|17738311|ref|NM_005893.1|[17738311]; 312: NM_005909,
sapiens neighbor of COX4 (NOC4), mRNA”, gi|34147520|ref|NM_006067.3|[34147520]; 320:
sapiens major histocompatibility complex, class II, DM alpha (HLA-DMA),”, mRNA,
sapiens natriuretic peptide precursor A (NPPA), mRNA”,
sapiens phosphodiesterase 6H, cGMP-specific, cone, gamma (PDE6H), mRNA”,
sapiens prophet of Pit1, paired-like homeodomain transcription factor”, “(PROP1), mRNA”,
sapiens bladder cancer associated protein (BLCAP), mRNA”,
sapiens retinoid X receptor, gamma (RXRG), mRNA”,
sapiens putative tumor suppressor 101F6 (101F6), mRNA”,
sapiens putative tumor suppressor (FUS2), mRNA”, gi|6912379|ref|NM_012191.1|[6912379];
sapiens ubiquinol-cytochrome c reductase complex (7.2 kD) (HSPC051), mRNA”,
sapiens NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 8, 19 kDa”, “(NDUFA8),
sapiens putative secreted protein ZSIG11 (ZSIG11), mRNA”,
sapiens heat shock protein 75 (TRAP1), mRNA”, gi|7706484|ref|NM_016292.1|[7706484]; 461:
sapiens deleted in esophageal cancer 1 (DEC1), mRNA”,
sapiens centaurin, alpha 2 (CENTA2), mRNA”, gi|8923762|ref|NM_018404.1|[8923762]; 513:
sapiens MCM10 minichromosome maintenance deficient 10 (S. cerevisiae), “(MCM10),
sapiens hypothetical protein FLJ10996 (FLJ10996), mRNA”,
sapiens interleukin 1 family, member 9 (IL1F9), mRNA”,
sapiens PR domain containing 10 (PRDM10), transcript variant 1, mRNA”,
sapiens hypothetical protein SP192 (SP192), mRNA”,
sapiens NEDD9 interacting protein with calponin homology and LIM domains, “(NICAL),
sapiens chromosome 14 open reading frame 127 (C14orf127), mRNA”,
sapiens ring finger protein 39 (RNF39), transcript variant 1, mRNA”,
sapiens hypothetical protein AY099107 (LOC152185), mRNA”,
sapiens hypothetical protein FLJ30656 (FLJ30656), mRNA”,
sapiens ATP-binding cassette, sub-family D (ALD), member 1 (ABCD1), mRNA”,
Homo sapiens fucosyltransferase 1 (galactoside 2-alpha-L-fucosyltransferase), “(FUT1),
sapiens hydroxyacyl-Coenzyme A dehydrogenase/3-ketoacyl-Coenzyme A, “thiolase/enoyl-
sapiens transglutaminase 1 (K polypeptide epidermal type I,”, “protein-glutamine-gamma-
sapiens CD79B antigen (immunoglobulin-associated beta) (CD79B), transcript”, “variant 1,
sapiens alcohol dehydrogenase 7 (class IV), mu or sigma polypeptide (ADH7), mRNA”,
sapiens arachidonate 12-lipoxygenase (ALOX12), mRNA”,
sapiens glutamate receptor, metabotropic 2 (GRM2), mRNA”,
sapiens ribosomal protein L8 (RPL8), transcript variant 1, mRNA”,
sapiens topoisomerase (DNA) II beta 180 kDa (TOP2B), mRNA”,
sapiens amiloride-sensitive cation channel 1, neuronal (degenerin) (ACCN1),”, “transcript
sapiens aconitase 2, mitochondrial (ACO2), nuclear gene encoding”, “mitochondrial protein,
sapiens cyclin H (CCNH), mRNA”, gi|17738313|ref|NM_001239.2|[17738313]; 191:
sapiens ectonucleoside triphosphate diphosphohydrolase 2 (ENTPD2), mRNA”,
sapiens claudin 7 (CLDN7), mRNA”, gi|34222214|ref|NM_001307.3|[34222214]; 204:
sapiens HGF activator (HGFAC), mRNA”, gi|32455241|ref|NM_001528.2|[32455241]; 245:
sapiens aryl hydrocarbon receptor (AHR), mRNA”, gi|5016091|ref|NM_001621.2|[5016091];
sapiens ADP-ribosylation factor 5 (ARF5), mRNA”, gi|6995999|ref|NM_001662.2|[6995999];
sapiens 2,3-bisphosphoglycerate mutase (BPGM), transcript variant 1, mRNA”,
sapiens CD63 antigen (melanoma 1 antigen) (CD63), mRNA”,
sapiens cyclin-dependent kinase 7 (MO15 homolog, Xenopus laevis,”, “cdk-activating kinase)
sapiens CCAAT/enhancer binding protein (C/EBP), gamma (CEBPG), mRNA”,
sapiens cannabinoid receptor 2 (macrophage) (CNR2), mRNA”,
sapiens cellular retinoic acid binding protein 2 (CRABP2), mRNA”,
sapiens crystallin, zeta (quinone reductase) (CRYZ), mRNA”,
sapiens CTP synthase (CTPS), mRNA”, gi|4503132|ref|NM_001905.1|[4503132]; 309:
sapiens D component of complement (adipsin) (DF), mRNA”,
sapiens fms-related tyrosine kinase 4 (FLT4), transcript variant 2, mRNA”,
sapiens hemopoietic cell kinase (HCK), mRNA”, gi|30795228|ref|NM_002110.2|[30795228];
sapiens homeo box D10 (HOXD10), mRNA”, gi|23510365|ref|NM_002148.2|[23510365]; 346:
sapiens histidine rich calcium binding protein (HRC), mRNA”,
sapiens mannosidase, alpha, class 2A, member 1 (MAN2A1), mRNA”,
sapiens melanocortin 1 receptor (alpha melanocyte stimulating hormone, “receptor) (MC1R),
Homo sapiens solute carrier family 25 (mitochondrial carrier; phosphate, “carrier), member 3
sapiens proteasome (prosome, macropain) subunit, alpha type, 5 (PSMA5), mRNA”,
sapiens protein tyrosine phosphatase, non-receptor type 6 (PTPN6),”, “transcript variant 1,
sapiens parvalbumin (PVALB), mRNA”, gi|4506334|ref|NM_002854.1|[4506334]; 478:
Homo sapiens Ras protein-specific guanine nucleotide-releasing factor 1, “(RASGRF1),
sapiens ring finger protein 4 (RNF4), mRNA”, gi|34305289|ref|NM_002938.2|[34305289]; 495:
Homo sapiens special AT-rich sequence binding protein 1 (binds to nuclear, “matrix/scaffold-
sapiens Sjogren syndrome antigen A1 (52 kDa, ribonucleoprotein autoantigen”, “SS-A/Ro)
sapiens src homology three (SH3) and cysteine rich domain (STAC), mRNA”,
sapiens thyrotrophic embryonic factor (TEF), mRNA”,
sapiens HIR histone cell cycle regulation defective homolog A (S., “cerevisiae) (HIRA),
sapiens ubiquitin-activating enzyme E1 (A1S9T and BN75 temperature, “sensitivity
sapiens axin 1 (AXIN1), transcript variant 1, mRNA”,
sapiens histone 1, H2bh (HIST1H2BH), mRNA”, gi|21166386|ref|NM_003524.2|[21166386];
sapiens histone 1, H3h (HIST1H3H), mRNA”, gi|15718725|ref|NM_003536.2|[15718725]; 596:
sapiens dual-specificity tyrosine-(Y)-phosphorylation regulated kinase 3 (DYRK3), mRNA”,
sapiens solute carrier family 43, member 1 (SLC43A1), mRNA”,
sapiens histone acetyltransferase 1 (HAT1), mRNA”, gi|4504340|ref|NM_003642.1|[4504340];
sapiens diacylglycerol kinase, delta 130 kDa (DGKD), transcript variant 1, mRNA”,
sapiens syntaxin 10 (STX10), mRNA”, gi|4507284|ref|NM_003765.1|[4507284]; 640:
sapiens a disintegrin and metalloproteinase domain 9 (meltrin gamma) (ADAM9), mRNA”,
sapiens cyclin A1 (CCNA1), mRNA”, gi|16306528|ref|NM_003914.2|[16306528]; 658:
sapiens genethonin 1 (GENX-3414), mRNA”, gi|4503976|ref|NM_003943.1|[4503976]; 664:
sapiens BTAF1 RNA polymerase II, B-TFIID transcription factor-associated,”, “170 kDa (Mot1
sapiens thymosin, beta 4, Y-linked (TMSB4Y), mRNA”,
sapiens phosphatidylinositol glycan, class Q (PIGQ), transcript variant 2, mRNA”,
sapiens pituitary tumor-transforming 1 (PTTG1), mRNA”,
sapiens 15 kDa selenoprotein (SEP15), transcript variant 1, mRNA”,
sapiens carbamoyl-phosphate synthetase 2, aspartate transcarbamylase, and”, “dihydroorotase
sapiens glypican 5 (GPC5), mRNA”, gi|34106705|ref|NM_004466.3|[34106705]; 752:
sapiens FK506 binding protein 2, 13 kDa (FKBP2), transcript variant 1, mRNA”,
sapiens NADH dehydrogenase (ubiquinone) Fe—S protein 5, 15 kDa”, “(NADH-coenzyme Q
sapiens Rab geranylgeranyltransferase, alpha subunit (RABGGTA), transcript”, “variant 2,
sapiens cytokeratin type II (K6HF), mRNA”, gi|4758617|ref|NM_004693.1|[4758617]; 801:
sapiens RNA, U3 small nucleolar interacting protein 2 (RNU3IP2), mRNA”,
sapiens solute carrier family 9 (sodium/hydrogen exchanger), isoform 3”, “regulatory factor 2
sapiens slit homolog 2 (Drosophila) (SLIT2), mRNA”, gi|4759145|ref|NM_004787.1|[4759145];
sapiens carbohydrate sulfotransferase 10 (CHST10), mRNA”,
sapiens vacuolar protein sorting 4B (yeast) (VPS4B), mRNA”,
sapiens multiple inositol polyphosphate histidine phosphatase, 1 (MINPP1), mRNA”,
sapiens potassium voltage-gated channel, Shab-related subfamily, member 1”, “(KCNB1),
sapiens kinesin family member 5A (KIF5A), mRNA”, gi|4826807|ref|NM_004984.1|[4826807];
sapiens protein geranylgeranyltransferase type I, beta subunit (PGGT1B), mRNA”,
sapiens phospholipase A2, group VII (platelet-activating factor”, “acetylhydrolase, plasma)
sapiens tumor necrosis factor (ligand) superfamily, member 18 (TNFSF18), mRNA”,
sapiens squamous cell carcinoma antigen recognised by T cells (SART1), mRNA”,
sapiens v-ets erythroblastosis virus E26 oncogene homolog 2 (avian) (ETS2), mRNA”,
Homo sapiens G protein-coupled receptor 37 (endothelin receptor type B-like), “(GPR37),
sapiens glucose regulated protein, 58 kDa (GRP58), mRNA”,
sapiens hemoglobin, epsilon 1 (HBE1), mRNA”, gi|28302129|ref|NM_005330.3|[28302129];
sapiens protein phosphatase 1, regulatory (inhibitor) subunit 3C (PPP1R3C), mRNA”,
Homo sapiens hyperpolarization activated cyclic nucleotide-gated potassium, “channel 4
sapiens dipeptidylpeptidase 3 (DPP3), transcript variant 1, mRNA”,
Homo sapiens ubiquitin carboxyl-terminal esterase L3 (ubiquitin thiolesterase), “(UCHL3),
sapiens heparan sulfate (glucosamine) 3-O-sulfotransferase 3A1 (HS3ST3A1), mRNA”,
sapiens TRK-fused gene (TFG), mRNA”, gi|34147663|ref|NM_006070.3|[34147663]; 1079:
sapiens sex comb on midleg-like 2 (Drosophila) (SCML2), mRNA”,
sapiens kinetochore associated 2 (KNTC2), mRNA”, gi|5174456|ref|NM_006101.1|[5174456];
sapiens translocase of outer mitochondrial membrane 40 homolog (yeast), “(TOMM40),
sapiens mannosidase,
sapiens phosphodiesterase 6H, cGMP-specific, cone, gamma (PDE6H), mRNA”,
sapiens polymerase (RNA) II (DNA directed) polypeptide H (POLR2H), mRNA”,
sapiens peripherin (PRPH), mRNA”, gi|21264344|ref|NM_006262.2|[21264344]; 1116:
sapiens C2f protein (C2F), mRNA”, gi|31652261|ref|NM_006331.3|[31652261]; 1128:
sapiens progesterone-induced blocking factor 1 (PIBF1), mRNA”,
sapiens synaptonemal complex protein SC65 (SC65), mRNA”,
sapiens chromosome 1 open reading frame 2 (C1orf2), transcript variant 1, mRNA”,
sapiens ret finger protein-like 3 (RFPL3), mRNA”, gi|5730012|ref|NM_006604.1|[5730012];
sapiens polo-like kinase 2 (Drosophila) (PLK2), mRNA”,
sapiens serologically defined colon cancer antigen 8 (SDCCAG8), mRNA”,
sapiens trophoblast glycoprotein (TPBG), mRNA”, gi|34222307|ref|NM_006670.3|[34222307];
sapiens cisplatin resistance associated (CRA), mRNA”, gi|5870890|ref|NM_006697.1|[5870890];
Homo sapiens COP9 constitutive photomorphogenic homolog subunit 5 (Arabidopsis),
sapiens inner membrane protein, mitochondrial (mitofilin) (IMMT), mRNA”,
sapiens transcription termination factor, mitochondrial (MTERF), nuclear”, “gene encoding
sapiens RAB, member of RAS oncogene family-like 2A (RABL2A), transcript”, “variant 2,
sapiens solute carrier family 6 (neurotransmitter transporter), member 14”, “(SLC6A14),
sapiens kelch-like 2, Mayven (Drosophila) (KLHL2), mRNA”,
sapiens general transcription factor IIIC, polypeptide 3, 102 kDa (GTF3C3), mRNA”,
sapiens F-box only protein 22 (FBXO22), transcript variant 2, mRNA”,
sapiens forkhead box D3 (FOXD3), mRNA”, gi|6912371|ref|NM_012183.1|[6912371]; 1343:
Homo sapiens sirtuin (silent mating type information regulation 2 homolog) 2 (S., “cerevisiae)
sapiens kelch-like ECH-associated protein 1 (KEAP1), mRNA”,
sapiens SEC22 vesicle trafficking protein-like 2 (S. cerevisiae) (SEC22L2), mRNA”,
sapiens translocase of inner mitochondrial membrane 10 homolog (yeast) (TIMM10), mRNA”,
sapiens testes-specific protease 50 (TSP50), mRNA”,
sapiens GDP-mannose pyrophosphorylase A (GMPPA), mRNA”,
sapiens unc-50 homolog (C. elegans) (UNC50), mRNA”,
sapiens response gene to complement 32 (RGC32), mRNA”,
sapiens SWI/SNF related, matrix associated, actin dependent regulator of”, “chromatin,
sapiens glyceronephosphate O-acyltransferase (GNPAT), mRNA”,
sapiens solute carrier family 25 (mitochondrial carrier; ornithine, “transporter) member 15
sapiens nitrogen fixation cluster-like (NIFU), mRNA”,
sapiens phosphatidylserine decarboxylase (PISD), mRNA”,
sapiens heat shock 27 kDa protein 8 (HSPB8), mRNA”,
sapiens TAF5-like RNA polymerase II, p300/CBP-associated factor”, “(PCAF)-associated
sapiens torsin family 1, member B (torsin B) (TOR1B), mRNA”,
sapiens nephrosis 2, idiopathic, steroid-resistant (podocin) (NPHS2), mRNA”,
sapiens KIAA0406 gene product (KIAA0406), mRNA”,
sapiens KIAA0195 gene product (KIAA0195), mRNA”,
sapiens BMS1-like, ribosome assembly protein (yeast) (BMS1L), mRNA”,
sapiens FERM and PDZ domain containing 1 (FRMPD1), mRNA”,
sapiens HSV-1 stimulation-related gene 1 (HSRG1), mRNA”,
sapiens dishevelled associated activator of morphogenesis 1 (DAAM1), mRNA”,
sapiens tripartite motif-containing 9 (TRIM9), transcript variant 1, mRNA”,
sapiens activity-dependent neuroprotector (ADNP), transcript variant 1, mRNA”,
sapiens coiled-coil domain containing 9 (CCDC9), mRNA”,
sapiens glioma tumor suppressor candidate region gene 2 (GLTSCR2), mRNA”,
sapiens Wilms tumor associated protein (WIT-1), mRNA”,
sapiens membrane-bound transcription factor protease, site 2 (MBTPS2), mRNA”,
sapiens geminin, DNA replication inhibitor (GMNN), mRNA”,
sapiens ATPase, H+ transporting, lysosomal 50/57 kD V1 subunit H (ATP6V1H), mRNA”,
sapiens solute carrier family 35, member C2 (SLC35C2), transcript variant 2, mRNA”,
sapiens thioredoxin-related transmembrane protein 2 (TMX2), mRNA”,
sapiens chromosome 14 open reading frame 111 (C14orf111), mRNA”,
C. elegans anterior pharynx defective 1A (APH-1A), mRNA”,
sapiens CGI-85 protein (CGI-85), transcript variant 2, mRNA”,
sapiens chromosome 20 open reading frame 45 (C20orf45), mRNA”,
sapiens mitochondrial ribosomal protein S16 (MRPS16), nuclear gene encoding”, “mitochondrial
sapiens mitochondrial ribosomal protein S18C (MRPS18C), nuclear gene”, “encoding
Homo sapiens mitochondria-associated protein involved in granulocyte-macrophage, “colony-
sapiens mitochondrial ribosomal protein S33 (MRPS33), nuclear gene encoding”, “mitochondrial
sapiens golgi autoantigen, golgin subfamily a, 7 (GOLGA7), mRNA”,
sapiens debranching enzyme homolog 1 (S. cerevisiae) (DBR1), mRNA”,
sapiens dehydrogenase/reductase (SDR family) member 8 (DHRS8), mRNA”,
sapiens MO25 protein (MO25), mRNA”, gi|19745179|ref|NM_016289.2|[19745179]; 1791:
sapiens RA-regulated nuclear matrix-associated protein (RAMP), mRNA”,
sapiens hydroxyacid oxidase 2 (long chain) (HAO2), mRNA”,
sapiens SCAN domain containing 1 (SCAND1), transcript variant 1, mRNA”,
sapiens TRAF and TNF receptor associated protein (TTRAP), mRNA”,
sapiens hypothetical protein LOC51321 (LOC51321), mRNA”,
sapiens frizzled homolog 3 (Drosophila) (FZD3), mRNA”,
sapiens hypothetical protein FLJ20156 (FLJ20156), mRNA”,
sapiens hypothetical protein FLJ20546 (FLJ20546), mRNA”,
sapiens peroxisome biogenesis factor 26 (PEX26), mRNA”,
sapiens zinc finger, DHHC domain containing 4 (ZDHHC4), mRNA”,
sapiens WD repeat domain 11 (WDR11), mRNA”, gi|22547233|ref|NM_018117.10|[22547233];
sapiens solute carrier family 4 (anion exchanger), member 1, adaptor”, “protein (SLC4A1AP),
sapiens hypothetical protein FLJ10661 (FLJ10661), mRNA”,
sapiens F-box and leucine-rich repeat protein 8 (FBXL8), mRNA”,
sapiens transcription factor SMIF (HSA275986), mRNA”,
sapiens chromosome 9 open reading frame 46 (C9orf46), mRNA”,
Homo sapiens LanC lantibiotic synthetase component C-like 2 (bacterial) (LANCL2), mRNA”,
sapiens cytochrome c, somatic (CYCS), nuclear gene encoding mitochondrial”, “protein,
Homo sapiens translocase of outer mitochondrial membrane 7 homolog (yeast), “(TOMM7),
sapiens hypothetical protein LOC55954 (LOC55954), mRNA”,
sapiens ATP-binding cassette, sub-family A (ABC1), member 7 (ABCA7),”, “transcript variant
sapiens chromosome 8 open reading frame 4 (C8orf4), mRNA”,
sapiens hypothetical protein from EUROIMAGE 2021883 (LOC56926), mRNA”,
sapiens ACN9 homolog (S. cerevisiae) (ACN9), mRNA”,
sapiens DC6 protein (DC6), mRNA”, gi|34222364|ref|NM_020189.4|[34222364]; 2206:
sapiens hepatocellular carcinoma susceptibility protein (HCCA3), mRNA”,
sapiens x 009 protein (MDS009), mRNA”, gi|34222368|ref|NM_020234.3|[34222368]; 2220:
sapiens chromosome 6 open reading frame 210 (C6orf210), mRNA”,
sapiens nuclear pore complex protein (NUP107), mRNA”,
sapiens poly(rC) binding protein 4 (PCBP4), transcript variant 1, mRNA”,
sapiens chromosome 11 open reading frame 17 (C11orf17), transcript variant 2, mRNA”,
sapiens brain-enriched guanylate kinase-associated protein (KIAA1446), mRNA”,
sapiens histone 1, H3f (HIST1H3F), mRNA”, gi|21396497|ref|NM_021018.2|[21396497]; 2274:
sapiens disabled homolog 1 (Drosophila) (DAB1), mRNA”,
sapiens NFS1 nitrogen fixation 1 (S. cerevisiae) (NFS1), nuclear gene”, “encoding mitochondrial
sapiens homeo box D12 (HOXD12), mRNA”, gi|23510369|ref|NM_021193.2|[23510369]; 2296:
sapiens Rho GTPase activating protein 22 (ARHGAP22), mRNA”,
sapiens potassium intermediate/small conductance calcium-activated channel,”, “subfamily N,
sapiens aristaless-like homeobox 4 (ALX4), mRNA”,
sapiens Ras-related GTP binding C (RRAGC), mRNA”,
sapiens MMS19-like (MET18 homolog, S. cerevisiae) (MMS19L), mRNA”,
sapiens rhomboid family 1 (Drosophila) (RHBDP1), mRNA”,
sapiens fibrosin 1 (FBS1), mRNA”, gi|11967986|ref|NM_022452.1|[11967986]; 2386:
sapiens early B-cell factor 2 (EBF2), mRNA”, gi|12056972|ref|NM_022659.1|[12056972]; 2399:
sapiens chromosome 10 open reading frame 66 (C10orf66), mRNA”,
sapiens hypothetical protein MGC2731 (MGC2731), mRNA”,
sapiens ankyrin repeat and SOCS box-containing 8 (ASB8), mRNA”,
sapiens hypothetical protein MGC4614 (MGC4614), mRNA”,
sapiens fukutin related protein (FKRP), mRNA”, gi|36951139|ref|NM_024301.2|[36951139];
sapiens chromosome 6 open reading frame 211 (C6orf211), mRNA”,
sapiens solute carrier family 12 (potassium/chloride transporters), member”, “8 (SLC12A8),
sapiens hypothetical protein FLJ11753 (FLJ11753), mRNA”,
sapiens THAP domain containing 9 (THAP9), mRNA”,
sapiens human immune associated nucleotide 2 (hIAN2), mRNA”,
sapiens likely ortholog of mouse Shc SH2-domain binding protein 1 (SHCBP1), mRNA”,
sapiens chromosome 9 open reading frame 82 (C9orf82), mRNA”,
sapiens hypothetical protein FLJ13385 (FLJ13385), mRNA”,
sapiens hypothetical protein FLJ21918 (FLJ21918), mRNA”,
sapiens hypothetical protein FLJ12355 (FLJ12355), mRNA”,
sapiens mitochondrial elongation factor G1 (EFG1), nuclear gene encoding”, “mitochondrial
sapiens hypothetical protein FLJ13096 (FLJ13096), mRNA”,
sapiens pseudouridylate synthase 1 (PUS1), mRNA”,
sapiens chromosome 12 open reading frame 22 (C12orf22), mRNA”,
sapiens testis expressed sequence 12 (TEX12), mRNA”,
sapiens elastin microfibril interfacer 2 (EMILIN2), mRNA”,
sapiens galactose-4-epimerase, UDP (GALE), mRNA”,
sapiens cytochrome P450, family 1, subfamily A, polypeptide 1 (CYP1A1), mRNA”,
sapiens actinin, alpha 3 (ACTN3), mRNA”, gi|4557240|ref|NM_001104.1|[4557240]; 29:
sapiens Rho GTPase activating protein 4 (ARHGAP4), mRNA”,
sapiens creatine kinase, brain (CKB), mRNA”, gi|34335231|ref|NM_001823.3|[34335231]; 42:
sapiens cytochrome c oxidase subunit VIIa polypeptide 1 (muscle) (COX7A1), mRNA”,
sapiens casein kinase 1, delta (CSNK1D), transcript variant 1, mRNA”,
sapiens glutathione peroxidase 2 (gastrointestinal) (GPX2), mRNA”,
sapiens REV3-like, catalytic subunit of DNA polymerase zeta (yeast) (REV3L), mRNA”,
sapiens ring finger protein 4 (RNF4), mRNA”, gi|34305289|ref|NM_002938.2|[34305289]; 73:
sapiens ubiquitin-conjugating enzyme E2E 1 (UBC4/5 homolog, yeast)”, “(UBE2E1), transcript
sapiens voltage-dependent anion channel 1 (VDAC1), mRNA”,
sapiens barrier to autointegration factor 1 (BANF1), mRNA”,
sapiens carbohydrate (N-acetylglucosamine-6-O) sulfotransferase 2 (CHST2), mRNA”,
Homo sapiens BRCA1 associated protein-1 (ubiquitin carboxy-terminal hydrolase), “(BAP1),
sapiens protein phosphatase 1, regulatory (inhibitor) subunit 3C (PPP1R3C), mRNA”,
sapiens neighbor of COX4 (NOC4), mRNA”, gi|34147520|ref|NM_006067.3|[34147520]; 161:
sapiens NEL-like 1 (chicken) (NELL1), mRNA”, gi|5453763|ref|NM_006157.1|[5453763]; 165:
sapiens prophet of Pit1, paired-like homeodomain transcription factor (PROP1), mRNA”,
sapiens retinoid X receptor, gamma (RXRG), mRNA”,
sapiens ubiquinol-cytochrome c reductase complex (7.2 kD) (HSPC051), mRNA”,
sapiens FERM and PDZ domain containing 1 (FRMPD1), mRNA”,
sapiens chromosome 20 open reading frame 9 (C20orf9), mRNA”,
sapiens nemo like kinase (NLK), mRNA”, gi|42734431|ref|NM_016231.2|[42734431]; 244:
sapiens solute carrier family 22 (organic anion/cation transporter), member”, “11 (SLC22A11),
sapiens PR domain containing 10 (PRDM10), transcript variant 1, mRNA”,
sapiens histone 1, H3f (HIST1H3F), mRNA”, gi|21396497|ref|NM_021018.2|[21396497]; 285:
sapiens potassium channel, subfamily K, member 10 (KCNK10), transcript”, “variant 1,
sapiens blepharophimosis, epicanthus inversus and ptosis, candidate 1 (BPESC1), mRNA”,
sapiens NEDD9 interacting protein with calponin homology and LIM domains, “(NICAL),
sapiens nucleoporin Nup37 (Nup37), mRNA”, gi|34222120|ref|NM_024057.2|[34222120]; 314:
sapiens ring finger protein 39 (RNF39), transcript variant 1, mRNA”,
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US04/19017 | 6/14/2004 | WO | | 6/15/2006 |

Number | Date | Country |
---|---|---|
60478238 | Jun 2003 | US |
60525548 | Nov 2003 | US |
60559141 | Apr 2004 | US |