Obesity is a condition characterized by the accumulation of excess body fat, and is linked to a negative effect on health. In general, accumulation of body fat is a physiological consequence when caloric intake exceeds an individual's physiological energy requirements. However, the underlying mechanisms of obesity are postulated to result from an interplay of both environmental and genetic factors. In 1962 James Neel posited the existence of a “thrifty gene” that enhances survival in times of famine and promotes metabolic diseases in times of nutritional excess. Thus, under certain dietary conditions, genes controlling appetite and metabolism may predispose an individual to obesity. Indeed, the abundant availability of food in recent times, especially energy-rich and processed foods, has been accompanied by a sharp increase in the prevalence of obesity, revealing pre-existing genetic predispositions to the efficient generation and storage of body fat.
Obesity is much more prevalent in Samoans than in most populations worldwide. This high prevalence is likely due to a combination of at least three influences: (i) changes in diet and physical activity in a small population exposed to economic development, (ii) historical selective pressures enriching for energy-efficient genetic variants that promote survival in times of starvation but obesity in the modern context of low physical activity and caloric excess, and (iii) genetic divergence from other populations due to founder effects, geographic isolation, and population bottlenecks. As a result, Samoans represent a unique population for identifying novel genetic factors contributing to obesity.
Obesity is essentially a disorder of energy homeostasis (i.e., an imbalance between intake and expenditure) and has strong genetic and environmental components. Indeed, as diets have modernized and physical activity has decreased, rates of overweight and obesity in the Samoan population have escalated to among the highest in the world. In 2003, 84% of women and 68% of men in Samoa were overweight or obese by Polynesian cutoffs (BMI>26 kg/m2); in 2010, the prevalence had increased to 91% and 80%, respectively. Although environmental contributors to this trend are clear, the estimated 45% heritability of BMI in this population remains largely unexplained. Samoan genetic susceptibility to obesity in the contemporary obesogenic environment may have resulted from putative advantages of efficient metabolism during 3,000 years of island discoveries, settlement, and population dynamics and/or from genetic drift due to founder effects, small population sizes, and population bottlenecks.
Although obesity is a serious health concern, it is a preventable cause of death. The ability to assess an individual's genetic predisposition to obesity would allow for early intervention and treatment. Additionally, understanding the molecular mechanisms regulating metabolic efficiency has the potential to provide new therapies for obesity and metabolic disorders. To address these unmet needs, new methods of assessing and characterizing obesity are urgently required.
The present invention provides compositions and methods for identifying a subject as having a genetic predisposition to obesity or at risk of developing obesity (e.g., BMI>30 kg/m2). The present invention also provides compositions and methods for expressing a CREBRF polypeptide of the invention in an adipocyte or precursor thereof and cells expressing a nucleic acid molecule encoding a CREBRF polypeptide of the invention.
Thus, in one aspect, the invention provides a recombinant cell comprising a promoter operably linked to a nucleic acid sequence encoding a CREB3 Regulatory Factor (CREBRF) polypeptide.
In an embodiment of this aspect of the invention, the cell is selected from the group consisting of a preadipocyte, an adipocyte, a hepatocyte and precursors thereof. In one embodiment, the cell is an adipocyte and is differentiated from a 3T3-L1 cell. In a related embodiment, the promoter is an adipocyte specific promoter. In another embodiment, the cell is a hepatocyte. In one embodiment, the hepatic cell is a HepG2 cell. In a related embodiment, the promoter is a hepatocyte specific promoter.
In one embodiment, the nucleic acid comprises a nucleic acid sequence set forth in
In another embodiment, the nucleic acid reduces or eliminates the expression of CREBRF polypeptide. In a further embodiment, the nucleic acid encodes an inhibitory RNA against the CREBRF mRNA. In yet another embodiment, the nucleic acid encodes a shRNA against the CREBRF mRNA.
In one embodiment, the recombinant cell comprises a CRISPR/Cas9 vector having a nucleic acid sequence that targets the CREBRF gene. In another embodiment, the nucleic acid sequence guides the deletion of exon 5 of the CREBRF gene. In yet another embodiment, the nucleic acid sequence guides the substitution of arginine at position 457 or its equivalent by glutamine.
Another aspect of the invention provides an expression vector comprising a promoter operably linked to a nucleic acid sequence encoding a CREBRF polypeptide. In one embodiment, the CREBRF polypeptide comprises a glutamine at amino acid position 457. In another embodiment, the expression vector comprises a nucleic acid sequence set forth in
In one embodiment, expression of the nucleic acid reduces or eliminates the expression of CREBRF polypeptide. In another embodiment, the nucleic acid encodes a shRNA against the CREBRF mRNA.
In another aspect, the invention provides an expression vector comprising a CRISPR/Cas9 module operably linked to a nucleic acid targeting a CREBRF gene, wherein the nucleic acid guides the deletion of exon 5 of the CREBRF gene. In one embodiment, the nucleic acid guides the substitution of arginine at position 457 or its equivalent by glutamine.
In yet another aspect, the invention provides a recombinant cell comprising the expression vector of the aspects and associated embodiments heretofore described.
In another aspect, the invention provides a nucleic acid probe that specifically binds a nucleic acid encoding a CREB3 Regulatory Factor (CREBRF) polypeptide comprising a glutamine at amino acid position 457. In one embodiment, the nucleic acid probe further comprises a detectable label. In another embodiment, the nucleic acid probe is a TaqMan® probe. In yet another embodiment, the nucleic acid probe comprises the nucleic acid sequence:
Another aspect of the invention provides a knock-in mouse comprising a nucleic acid encoding a mutant murine CREB3 Regulatory Factor (CREBRF) polypeptide or a human CREBRF polypeptide. In one embodiment, the mutant CREBRF polypeptide comprises a substitution Arg457Gln or its equivalent. In a related embodiment, the substitution Arg457Gln or its equivalent is in the mouse endogenous CREBRF locus.
In one embodiment, the mouse is a wild type mouse. In another embodiment, the mutation confers thriftiness to the mouse.
The invention also provides a variety of methods that make use of any of the various embodiments of any aspect delineated herein.
Thus, in one aspect, the invention provides a method of enhancing adipogenesis in a cell, increasing lipid accumulation in a cell or of making a cell resistant to starvation, the method comprising causing the cell to express or overexpress a CREB3 Regulatory Factor (CREBRF) polypeptide. In one embodiment, the CREBRF polypeptide comprises a glutamine at amino acid position 457. In another embodiment, the cell is selected from the group consisting of a preadipocyte, an adipocyte, a hepatocyte and precursors thereof. In one embodiment, the cell is an adipocyte and is differentiated from a 3T3-L1 cell. In another embodiment, the cell is a hepatocyte. In a related embodiment, the hepatic cell is a HepG2 cell. In yet another embodiment, the cell is in a human subject.
Another aspect of the invention provides a method of genotyping a subject comprising contacting a cell of the subject with a nucleic acid probe described herein above. In one embodiment, the method further comprises obtaining the nucleic acid probe described herein above.
In yet another embodiment, the invention provides a method of identifying a subject as obese or at risk of obesity, the method comprising detecting one or more alleles encoding a CREB3 Regulatory Factor (CREBRF) polypeptide comprising a glutamine at amino acid position 457 in a biological sample from the subject, wherein the presence of one or more alleles encoding a CREBRF polypeptide comprising a glutamine at amino acid position 457 indicates that the subject is obese or is at risk of obesity.
In a related aspect, the invention provides a method of treating a subject identified as being obese or at risk of obesity, the method comprising administering to said identified subject a therapeutically effective amount of a compound that modulates adipogenesis in a cell of said subject, wherein the subject is identified as being obese or being at risk of obesity by a method comprising detecting one or more alleles encoding a CREB3 Regulatory Factor (CREBRF) polypeptide comprising a glutamine at amino acid position 457 in a biological sample from the subject, wherein the presence of one or more alleles encoding a CREBRF polypeptide comprising a glutamine at amino acid position 457 indicates that the subject is obese or is at risk of obesity.
In one embodiment of the foregoing methods, the allele comprises an A at position 1689 of a CREBRF polynucleotide. In another embodiment, the subject is human.
Another aspect of the invention provides a method of reducing adipogenesis or lipid accumulation in a cell, the method comprising reducing, eliminating or inactivating the adipogenic function of a CREBRF polypeptide in the cell.
In a related aspect, the invention provides a method of making a cell susceptible to starvation, the method comprising reducing, eliminating or inactivating the adipogenic function of a CREB3 Regulatory Factor (CREBRF) polypeptide.
In an embodiment of these methods, exon 5 of a CREBRF gene is deleted from the endogenous CREBRF locus of the cell. In one embodiment, the cell is selected from the group consisting of a preadipocyte, an adipocyte, a hepatocyte and precursors thereof. In another embodiment, the cell is an adipocyte and is differentiated from a 3T3-L1 cell. In yet another embodiment, the cell is a hepatocyte. In one embodiment, the hepatic cell is a HepG2 cell. In still another embodiment, the cell is a cell of a human subject.
Another aspect of the invention provides a method of identifying a compound that modulates the expression of a CREB3 Regulatory Factor (CREBRF) polypeptide, comprising:
a) contacting a nucleic acid that expresses a CREBRF polypeptide with a compound under conditions suitable for expression by the nucleic acid;
b) determining the level of expression of the CREBRF polypeptide;
c) determining the level of expression of the nucleic acid in the absence of the compound; and
d) comparing the level of expression of the nucleic acid after contact with the compound with the level of expression of the nucleic acid without contact of the compound;
thereby identifying a compound that modulates expression of the CREBRF polypeptide. In one embodiment, the compound is contacted with a recombinant cell or a knock-in mouse as heretofore described herein. In one embodiment, the nucleic acid comprises a nucleic acid sequence set forth in
In a related aspect, the invention provides a method of identifying a compound that modulates adipogenesis, the method comprising contacting a recombinant cell or a knock-in mouse, as heretofore described herein, with a compound, and assaying reporter expression in the contacted cell relative to a corresponding control cell, thereby identifying a compound that modulates adipogenesis.
Another related aspect of the invention provides a method of identifying a compound that modulates adipogenesis, the method comprising contacting a recombinant cell or a knock-in mouse, as heretofore described herein, with an shRNA against a gene of interest, and analyzing adipogenesis of the cell relative to a reference, thereby identifying an adipogenesis modulator.
In one embodiment, the adipogenesis of the cell is analyzed by detecting the amplitude, period length and phase of reporter expression. In another embodiment, the reference is an untreated control cell.
In an embodiment, the compound that modulates adipogenesis is an inhibitory nucleic acid molecule, a small organic molecule, or a polypeptide. In a related embodiment, the inhibitory nucleic acid molecule is an shRNA. In another embodiment, the methods further comprise obtaining the recombinant cell or the knock-in mouse described herein above.
In another aspect, the invention provides a kit comprising an expression vector described herein above and instructions for use. In a related aspect, the invention provides a kit comprising a nucleic acid probe described herein above and instructions for use. In yet another related aspect, the invention provides a kit comprising a knock-in mouse described herein above and instructions for use.
In various embodiments of the kits provided by the invention, the instructions for use are for use in accordance with any of the methods described hereinabove.
Other features and advantages of the invention will be apparent from the detailed description, and from the claims.
The invention features compositions and methods that are useful for identifying a subject as having a genetic predisposition to obesity or at risk of developing obesity (e.g., BMI>30 kg/m2). The invention also provides compositions and methods for expressing a CREBRF polypeptide of the invention in an adipocyte or precursor thereof and cells expressing a nucleic acid molecule encoding a CREBRF polypeptide of the invention. The invention still further provides methods for identifying compounds that modulate adipogenesis (i.e., screening assays), as well as kits for practicing the methods of the inventions.
Before further description of the invention, certain terms employed in the specification, examples and appended claims are, for convenience, collected here.
By “adipocyte” is meant a cell that stores fat (e.g., triglycerides and cholesteryl ester). Adipocytes are the main constituent of body fat or adipose tissue.
By “adipogenesis” is meant the process in which a preadipocyte differentiates into an adipocyte.
By “alteration” is meant a change (increase or decrease) in the expression levels or activity of a gene or polypeptide as detected by standard art-known methods such as those described herein. As used herein, an alteration includes a 10% change in expression levels, preferably a 25% change, more preferably a 40% change, and most preferably a 50% or greater change in expression levels.
By “cell survival” is meant cell viability.
By “reducing cell death” is meant reducing the propensity or probability that a cell will die. Cell death can be apoptotic, necrotic, or by any other means.
In this disclosure, “comprises,” “comprising,” “containing” and “having” and the like can have the meaning ascribed to them in U.S. Patent law and can mean “includes,” “including,” and the like; “consisting essentially of” or “consists essentially” likewise has the meaning ascribed in U.S. Patent law and the term is open-ended, allowing for the presence of more than that which is recited so long as basic or novel characteristics of that which is recited are not changed by the presence of more than that which is recited, but excludes prior art embodiments.
By “CREB3 regulatory factor (CREBRF) polypeptide” is meant a polypeptide or fragment thereof having at least about 85% or greater amino acid identity to the amino acid sequence provided at NCBI Accession No. NP_705835 and having DNA binding, protein binding, and transcriptional regulatory activity. CREBRF binds CREB3, promotes CREB3 degradation, and represses CREB3 transcriptional activity. An exemplary CREBRF amino acid sequence having an arginine at position 457 is provided below:
An exemplary CREBRF amino acid sequence having a glutamine at position 457 is provided below:
By “CREBRF nucleic acid molecule” is meant a polynucleotide encoding a CREBRF polypeptide. An exemplary CREBRF nucleic acid molecule sequence is provided at NCBI Accession No. NM_153607. An exemplary CREBRF nucleic acid sequence having a G at nucleotide position 1689 is provided below:
An exemplary CREBRF nucleic acid sequence having an A at nucleotide position 1689 is provided below:
By “rs373863828” is meant a single nucleotide polymorphism (SNP) 1689G→A in CREBRF, resulting in an arginine to glutamine change (R457Q) in the CREBRF polypeptide.
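As a non-limiting illustration of why a single G→A nucleotide change can convert arginine to glutamine, the following Python sketch enumerates every single G→A substitution within the six standard arginine codons and reports which of them yield a glutamine codon. The sketch relies only on the standard genetic code; it makes no assertion about which codon is actually present at position 457 of CREBRF, and the names `CODON_TABLE` and `g_to_a_outcomes` are hypothetical.

```python
# Partial standard genetic code: only the codons needed for this illustration.
CODON_TABLE = {
    "CGT": "Arg", "CGC": "Arg", "CGA": "Arg", "CGG": "Arg",
    "AGA": "Arg", "AGG": "Arg",
    "CAA": "Gln", "CAG": "Gln",
    "CAT": "His", "CAC": "His",
    "AAA": "Lys", "AAG": "Lys",
}


def g_to_a_outcomes(codon: str) -> dict:
    """Return {mutant_codon: amino_acid} for every single G->A change in `codon`."""
    outcomes = {}
    for i, base in enumerate(codon):
        if base == "G":
            mutant = codon[:i] + "A" + codon[i + 1:]
            outcomes[mutant] = CODON_TABLE[mutant]
    return outcomes


arg_codons = [c for c, aa in CODON_TABLE.items() if aa == "Arg"]

# Pairs (arginine codon, mutant codon) in which one G->A change gives glutamine.
arg_to_gln = {
    (codon, mutant)
    for codon in arg_codons
    for mutant, aa in g_to_a_outcomes(codon).items()
    if aa == "Gln"
}

if __name__ == "__main__":
    for codon, mutant in sorted(arg_to_gln):
        print(f"{codon} (Arg) -> {mutant} (Gln)")
```

Under the standard genetic code, only a G→A change at the second position of a CGA or CGG codon produces glutamine; the other single G→A changes in arginine codons yield histidine, lysine, or a synonymous arginine codon.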
By “Cyclic AMP-responsive element-binding protein 3 (CREB3) polypeptide” is meant a polypeptide or fragment thereof having at least about 85% or greater amino acid identity to the amino acid sequence provided at NCBI Accession No. NP_006359 and having DNA binding, protein binding, and transcriptional regulatory activities. An exemplary CREB3 amino acid sequence is provided below:
By “CREB3 nucleic acid molecule” is meant a polynucleotide encoding a CREB3 polypeptide. An exemplary CREB3 nucleic acid molecule sequence is provided at NCBI Accession No. NM_006368. An exemplary CREB3 nucleic acid sequence is provided below:
“Derived from” as used herein refers to the process of obtaining a cell from a subject, embryo, biological sample, or cell culture.
“Detect” refers to identifying the presence, absence or amount of the object to be detected.
By “detectable reporter” is meant a composition that when linked to a molecule of interest renders the latter detectable, via spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include radioactive isotopes, magnetic beads, metallic beads, colloidal particles, fluorescent dyes, electron-dense reagents, enzymes (for example, as commonly used in an ELISA), biotin, digoxigenin, or haptens.
By “fragment” is meant a portion of a polypeptide or nucleic acid molecule. This portion contains, preferably, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the entire length of the reference nucleic acid molecule or polypeptide. A fragment may contain 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides or amino acids. In certain embodiments, the fragment retains the activity of the polypeptide or nucleic acid molecule of which it is a fragment.
As used herein, “recombinant” includes reference to a polypeptide produced using cells that express a heterologous polynucleotide encoding the polypeptide. The cells produce the recombinant polypeptide because they have been genetically altered by the introduction of the appropriate isolated nucleic acid sequence. The term also includes reference to a cell, or nucleic acid, or vector, that has been modified by the introduction of a heterologous nucleic acid or the alteration of a native nucleic acid to a form not native to that cell, or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell, express mutants of genes that are found within the native form, or express native genes that are otherwise abnormally expressed, under-expressed or not expressed at all.
By “isolated polynucleotide” is meant a nucleic acid (e.g., a DNA) that is free of the genes which, in the naturally-occurring genome of the organism from which the nucleic acid molecule of the invention is derived, flank the gene. The term therefore includes, for example, a recombinant DNA that is incorporated into a vector; into an autonomously replicating plasmid or virus; or into the genomic DNA of a prokaryote or eukaryote; or that exists as a separate molecule (for example, a cDNA or a genomic or cDNA fragment produced by PCR or restriction endonuclease digestion) independent of other sequences. In addition, the term includes an RNA molecule that is transcribed from a DNA molecule, as well as a recombinant DNA that is part of a hybrid gene encoding additional polypeptide sequence.
By an “isolated polypeptide” is meant a polypeptide of the invention that has been separated from components that naturally accompany it. Typically, the polypeptide is isolated when it is at least 60%, by weight, free from the proteins and naturally-occurring organic molecules with which it is naturally associated. Preferably, the preparation is at least 75%, more preferably at least 90%, and most preferably at least 99%, by weight, a polypeptide of the invention. It need not be purified to homogeneity. An isolated polypeptide of the invention may be obtained, for example, by extraction from a natural source, by expression of a recombinant nucleic acid encoding such a polypeptide; or by chemically synthesizing the protein. Purity can be measured by any appropriate method, for example, column chromatography, polyacrylamide gel electrophoresis, or by HPLC analysis.
By an “isolated cell” is meant a cell of the invention that has been separated from components that naturally accompany it, including, e.g., cells and cellular debris. In one embodiment, the disclosure provides an isolated cell comprising a nucleic acid sequence as disclosed herein.
By “marker” is meant any protein or polynucleotide having an alteration in expression level or activity that is associated with a disease or disorder.
As used herein, “obtaining” as in “obtaining an agent” includes synthesizing, purchasing, or otherwise acquiring the agent.
The term “obesity” as used herein, refers to a condition characterized by the accumulation of excess body fat. Obesity can have a negative effect on health, leading to reduced life expectancy and/or increased health problems. Obesity may be evaluated by assessing a subject's body mass index (BMI), which is obtained by dividing a subject's weight by the square of the subject's height and/or by assessing fat distribution via the waist-hip ratio and total cardiovascular risk factor. A BMI between 18.50-24.99 kg/m2 classifies an individual as having normal weight, between 25.00-29.99 kg/m2 as being overweight, and exceeding 30 kg/m2 as being obese.
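By way of non-limiting illustration, the BMI calculation and classification recited above can be sketched in Python as follows. The function names `bmi` and `classify_bmi` are hypothetical. The sketch assumes weight in kilograms and height in meters, treats a BMI of 30 kg/m2 or greater as obese, and uses the label “below normal range” for values under 18.50 kg/m2, a category the definition above does not name.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight divided by the square of height (kg/m^2)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def classify_bmi(value: float) -> str:
    """Classify a BMI value using the cutoffs recited in the definition."""
    if value >= 30.0:
        return "obese"
    if value >= 25.0:
        return "overweight"
    if value >= 18.5:
        return "normal weight"
    return "below normal range"  # hypothetical label; not defined in the text


if __name__ == "__main__":
    # Hypothetical subject: 90 kg, 1.75 m.
    value = bmi(90.0, 1.75)
    print(f"BMI = {value:.2f} kg/m2 -> {classify_bmi(value)}")
```

Population-specific cutoffs, such as the Polynesian cutoff of 26 kg/m2 mentioned in the background, would require different thresholds than those used in this sketch.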
By “promoter” is meant a promoter, e.g., a viral promoter, that is capable of initiating expression in a cell. Such cells include cells selected from the group consisting of a preadipocyte, an adipocyte, a hepatocyte (e.g., a HepG2 cell) and precursors thereof. In various embodiments, cell specific promoters are capable of initiating expression in that cell. In certain embodiments, such cells are mammalian cells (e.g., human cells).
As used herein, the terms “prevent,” “preventing,” “prevention,” “prophylactic treatment” and the like refer to reducing the probability of developing a disorder or condition in a subject, who does not have, but is at risk of or susceptible to developing a disorder or condition.
By “reference” is meant a standard or control condition. As is apparent to one skilled in the art, an appropriate reference is where an element is changed in order to determine the effect of the element.
A “reference sequence” is a defined sequence used as a basis for sequence comparison. A reference sequence may be a subset of or the entirety of a specified sequence; for example, a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence. For polypeptides, the length of the reference polypeptide sequence will generally be at least about 16 amino acids, preferably at least about 20 amino acids, more preferably at least about 25 amino acids, and even more preferably about 35 amino acids, about 50 amino acids, or about 100 amino acids. For nucleic acids, the length of the reference nucleic acid sequence will generally be at least about 50 nucleotides, preferably at least about 60 nucleotides, more preferably at least about 75 nucleotides, and even more preferably about 100 nucleotides or about 300 nucleotides or any integer thereabout or there between.
By “subject” is meant a mammal, including, but not limited to, a human or non-human mammal, such as a bovine, porcine, equine, canine, ovine, murine or feline.
By “modulator” is meant any compound/agent that alters a biological function or activity of a cell. A modulator includes, without limitation, compounds/agents that reduce or eliminate a biological function or activity of a cell (e.g., an “inhibitor”). For example, a modulator may inhibit adipogenesis of a cell. A modulator includes, without limitation, compounds/agents that enhance or increase a biological function or activity of a cell. For example, a modulator may promote adipogenesis of a cell.
The term “modulate” is intended to encompass, in its various grammatical forms (e.g., “modulated”, “modulation”, “modulating”, etc.), up-regulation, induction, stimulation, potentiation, localization changes (e.g., movement of a protein from one cellular compartment to another) and/or relief of inhibition, as well as inhibition and/or down-regulation.
The term “compound” is intended to include, but is not limited to, peptides, nucleic acids, carbohydrates, non-peptidic compounds, and natural product extracts.
The term “non-peptidic compound” is intended to encompass compounds that are comprised, at least in part, of molecular structures different from naturally-occurring L-amino acid residues linked by natural peptide bonds. However, “non-peptidic compounds” are intended to include compounds composed, in whole or in part, of peptidomimetic structures, such as D-amino acids, non-naturally-occurring L-amino acids, modified peptide backbones and the like, as well as compounds that are composed, in whole or in part, of molecular structures unrelated to naturally-occurring L-amino acid residues linked by natural peptide bonds, for example small organic molecules. “Non-peptidic compounds” also are intended to include natural products.
The terms “compound” and “agent” are used interchangeably in the context of the invention.
The term “operably linked” is intended to mean that molecules are functionally coupled to each other in that the change of activity or state of one molecule is affected by the activity or state of the other molecule. For example, an adipocyte specific promoter operably linked to a nucleic acid sequence encoding a CREBRF polypeptide directs expression of that polypeptide in an adipocyte.
By “substantially identical” is meant a polypeptide or nucleic acid molecule exhibiting at least 50% identity to a reference amino acid sequence (for example, any one of the amino acid sequences described herein) or nucleic acid sequence (for example, any one of the nucleic acid sequences described herein). Preferably, such a sequence is at least 60%, more preferably 80% or 85%, and more preferably 90%, 95% or even 99% identical at the amino acid level or nucleic acid to the sequence used for comparison.
Sequence identity is typically measured using sequence analysis software (for example, Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705, BLAST, BESTFIT, GAP, or PILEUP/PRETTYBOX programs). Such software matches identical or similar sequences by assigning degrees of homology to various substitutions, deletions, and/or other modifications. Conservative substitutions typically include substitutions within the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine. In an exemplary approach to determining the degree of identity, a BLAST program may be used, with a probability score between e^−3 and e^−100 indicating a closely related sequence.
The invention provides polynucleotides and polypeptides described herein, and fragments thereof; and polynucleotides and polypeptides that are substantially identical to the polynucleotides and polypeptides described herein.
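As a non-limiting illustration of the percent identity figure referred to above, the following Python sketch counts identical positions in two sequences that have already been aligned. It is not a substitute for the named software packages, which also perform the alignment and apply scoring schemes. The function name `percent_identity` is hypothetical, and the sketch assumes that gaps are written as “-” and that identity is expressed relative to the full alignment length, which is only one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    Gap columns never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical aligned fragments differing at one of ten positions.
    reference = "ACGTACGTAC"
    candidate = "ACGTACGAAC"
    identity = percent_identity(reference, candidate)
    print(f"{identity:.1f}% identical; at least 50%: {identity >= 50.0}")
```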
By “target gene” is meant a gene, the expression of which is directly or indirectly regulated by CREBRF. For example, CREB3 is a target gene directly regulated by CREBRF. The expression of an adipogenic marker gene, such as Pparg2, Cebpa, or Adipoq, can be directly or indirectly regulated by CREBRF and these adipogenic marker genes are also target genes. In one embodiment, CREBRF regulates a gene by binding to the gene's promoter.
By “transgenic” is meant any cell which includes a DNA sequence which is inserted by artifice into a cell and becomes part of the genome of the organism which develops from that cell. As used herein, the transgenic organisms are generally transgenic mammals (e.g., rodents such as rats or mice) and the DNA (transgene) is inserted by artifice into the nuclear genome. In one embodiment, the transgenic mouse is a knock-in mouse comprising a p.Arg457Gln mutation in the CREBRF gene.
As used herein the term “knock-in” is intended to encompass a genetic engineering method that involves the one-for-one substitution of DNA sequence information with a wild-type copy in a genetic locus or the insertion of sequence information not found within the locus. Typically, this is done in mice because the technology for this process is more refined and there is a high degree of shared sequence complexity between mice and humans. The difference between knock-in technology and traditional transgenic techniques is that a knock-in involves a gene inserted into a specific locus, and is thus a “targeted” insertion. The knock-in mice disclosed herein provide disease models for obesity and allow for the study of the function of the regulatory machinery (e.g. promoters) that governs the expression of the natural gene being replaced. This is accomplished by observing the new phenotype of the organism in question.
As used herein, the terms “treat,” “treating,” “treatment,” and the like refer to reducing or ameliorating a disorder and/or symptoms associated therewith. It will be appreciated that, although not precluded, treating a disorder or condition does not require that the disorder, condition or symptoms associated therewith be completely eliminated.
By “reduce” or “reduces” is meant a negative alteration of at least 10%, 25%, 50%, 75%, or 100%.
Ranges provided herein are understood to be shorthand for all of the values within the range. For example, a range of 1 to 50 is understood to include any number, combination of numbers, or sub-range from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50.
Unless specifically stated or obvious from context, as used herein, the term “or” is understood to be inclusive. Unless specifically stated or obvious from context, as used herein, the terms “a”, “an”, and “the” are understood to be singular or plural.
Unless specifically stated or obvious from context, as used herein, the term “about” is understood as within a range of normal tolerance in the art, for example within 2 standard deviations of the mean. About can be understood as within 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01% of the stated value. Unless otherwise clear from context, all numerical values provided herein are modified by the term about.
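By way of non-limiting illustration, the percentage-based reading of “about” can be sketched as a simple tolerance check. The function name `is_about` and its default tolerance of 10% are hypothetical choices made for this sketch; the alternative reading based on standard deviations of the mean is not modeled here.

```python
def is_about(value: float, stated: float, tolerance: float = 0.10) -> bool:
    """True if `value` lies within +/- `tolerance` (a fraction) of `stated`."""
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    return abs(value - stated) <= tolerance * abs(stated)


if __name__ == "__main__":
    # Hypothetical values compared against a stated value of 10.
    print(is_about(10.5, 10.0))         # within 10% of the stated value
    print(is_about(11.5, 10.0))         # outside 10% of the stated value
    print(is_about(10.04, 10.0, 0.005)) # within 0.5% of the stated value
```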
The recitation of a listing of chemical groups in any definition of a variable herein includes definitions of that variable as any single group or combination of listed groups. The recitation of an embodiment for a variable or aspect herein includes that embodiment as any single embodiment or in combination with any other embodiments or portions thereof.
The invention is based, at least in part, on the discovery of a CREBRF variant resulting in an Arg457Gln mutation that was strongly associated with body mass index (BMI) (P=5.3×10^−14), in a genome-wide association study (GWAS) of obesity-related traits conducted in 3,072 individuals from Samoa. This finding was replicated (P=1.2×10^−9) in other samples from Samoa and American Samoa. Targeted sequencing analysis revealed that this signal is associated with the missense variant rs373863828 (p.Arg457Gln) in CREBRF (meta P=1.4×10^−20). This variant is common in Samoans (allele frequency of 0.259), but rare in people of African or European descent. In Samoans, each copy of the minor allele increases BMI by 1.58 kg/m2 in females and 0.83 kg/m2 in males, an effect size that is much larger than currently known common BMI risk variants. In the 3T3-L1 preadipocyte cell model, over-expression of both wild-type (WT) and p.Arg457Gln CREBRF human variants promoted adipogenesis in the absence of standard hormonal stimulation and enhanced cell survival in response to nutrition stress. However, compared to WT CREBRF, the p.Arg457Gln CREBRF variant had greater lipid accumulation and lower energy utilization, indicating that p.Arg457Gln is a “thrifty” variant that strongly influences obesity in humans.
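By way of non-limiting illustration, the allele frequency and per-allele effect sizes reported above can be combined in a short Python sketch. The sketch assumes Hardy-Weinberg equilibrium and a strictly additive per-allele effect, neither of which is asserted by the findings above, and the names `hardy_weinberg` and `expected_bmi_shift` are hypothetical. The output describes average shifts across a population and is not a prediction for any individual subject.

```python
MINOR_ALLELE_FREQ = 0.259  # reported frequency of the minor allele in Samoans
BMI_SHIFT_PER_ALLELE = {"female": 1.58, "male": 0.83}  # kg/m2 per copy, as reported


def hardy_weinberg(q: float) -> dict:
    """Expected genotype frequencies keyed by minor-allele copy number (assumes HWE)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    p = 1.0 - q
    return {0: p * p, 1: 2.0 * p * q, 2: q * q}


def expected_bmi_shift(copies: int, sex: str) -> float:
    """Average BMI shift for 0, 1, or 2 copies under an assumed additive model."""
    if copies not in (0, 1, 2):
        raise ValueError("copies must be 0, 1, or 2")
    return copies * BMI_SHIFT_PER_ALLELE[sex]


if __name__ == "__main__":
    for copies, freq in hardy_weinberg(MINOR_ALLELE_FREQ).items():
        shift_f = expected_bmi_shift(copies, "female")
        shift_m = expected_bmi_shift(copies, "male")
        print(f"{copies} copies: frequency {freq:.3f}, "
              f"shift {shift_f:.2f} (female) / {shift_m:.2f} (male) kg/m2")
```

Under these assumptions, roughly 38% of the population would carry one copy of the minor allele and roughly 7% would carry two.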
The present disclosure further provides isolated nucleic acids encoding the disclosed CREBRF polypeptides and fragments thereof. The nucleic acids may comprise DNA or RNA and may be wholly or partially synthetic or recombinant. Reference to a nucleotide sequence as set out herein encompasses a DNA molecule with the specified sequence, and encompasses a RNA molecule with the specified sequence in which U is substituted for T, unless context requires otherwise.
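The T-to-U convention described above can be sketched as a one-line conversion. The function name is hypothetical and the input is a toy sequence, not a CREBRF sequence.

```python
def dna_to_rna(dna):
    """Return the RNA form of a DNA sequence, with U substituted for T.

    Case is preserved; all other characters are passed through unchanged.
    """
    return dna.translate(str.maketrans("Tt", "Uu"))


print(dna_to_rna("ATGCTT"))
```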
The present disclosure also provides constructs in the form of plasmids, vectors, phagemids, transcription or expression cassettes that comprise at least one nucleic acid encoding a CREBRF polypeptide or a fragment thereof, disclosed herein. The disclosure further provides a host cell that comprises one or more constructs as above.
Systems for cloning and expression of a polypeptide in a variety of different host cells are well known in the art. For cells suitable for producing polypeptides, see Gene Expression Systems, Academic Press, eds. Fernandez et al., 1999. Briefly, suitable host cells include, but are not limited to yeast, plant, algae, bacterial, mammalian, and insect cells. Mammalian cell lines available in the art for expression of a heterologous polypeptide include Chinese hamster ovary cells, HeLa cells, baby hamster kidney cells, NS0 mouse myeloma cells, and many others. A common bacterial host is E. coli. Any protein expression system compatible with the invention may be used to produce the disclosed proteins. Suitable expression systems include transgenic animals described in Gene Expression Systems, Academic Press, eds. Fernandez et al., 1999.
Suitable vectors can be chosen or constructed, so that they contain appropriate regulatory sequences, including promoter sequences, terminator sequences, polyadenylation sequences, enhancer sequences, marker genes and other sequences as appropriate. Vectors may be plasmids or viral, e.g., phage, or phagemid, as appropriate. For further details see, for example, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, 1989. Many known techniques and protocols for manipulation of nucleic acid, for example, in preparation of nucleic acid constructs, mutagenesis, sequencing, introduction of DNA into cells and gene expression, and analysis of proteins, are described in detail in Current Protocols in Molecular Biology, 2nd Edition, eds. Ausubel et al., John Wiley & Sons, 1992.
A still further aspect provides a method comprising introducing such a nucleic acid into a host cell. The introduction may employ any available technique. For eukaryotic cells, suitable techniques may include calcium phosphate transfection, DEAE-Dextran, electroporation, liposome-mediated transfection and transduction using retrovirus or other virus, e.g., vaccinia or, for insect cells, baculovirus. For bacterial cells, suitable techniques may include calcium chloride transformation, electroporation and transfection using bacteriophage. The introduction of the nucleic acid into the cells may be followed by causing or allowing expression from the nucleic acid, e.g., by culturing host cells under conditions for expression of the gene. A wide variety of host cells are available for expressing CREBRF polypeptide mutants of the present invention. Such host cells include, for example, yeast, plant, algae, bacterial, mammalian, and insect cells.
The invention provides methods for enhancing adipogenesis in a cell (e.g., preadipocyte or other adipocyte precursor), comprising expressing or overexpressing a CREBRF polypeptide (e.g., a CREBRF polypeptide having a glutamine at amino acid position 457).
Transducing viral (e.g., retroviral, adenoviral, and adeno-associated viral) vectors can be used to express CREBRF in a cell, especially because of their high efficiency of infection and stable integration and expression (see, e.g., Cayouette et al., Human Gene Therapy 8:423-430, 1997; Kido et al., Current Eye Research 15:833-844, 1996; Bloomer et al., Journal of Virology 71:6641-6649, 1997; Naldini et al., Science 272:263-267, 1996; and Miyoshi et al., Proc. Natl. Acad. Sci. U.S.A. 94:10319, 1997). For example, a polynucleotide encoding a CREBRF polypeptide, variant, or fragment thereof, can be cloned into a retroviral vector and expression can be driven from its endogenous promoter, from the retroviral long terminal repeat, or from a promoter specific for a target cell type of interest, such as an adipocyte.
Other viral vectors that can be used in the methods of the invention include, for example, a vaccinia virus, a bovine papilloma virus, or a herpes virus, such as Epstein-Barr Virus (also see, for example, the vectors of Miller, Human Gene Therapy 1:5-14, 1990; Friedman, Science 244:1275-1281, 1989; Eglitis et al., BioTechniques 6:608-614, 1988; Tolstoshev et al., Current Opinion in Biotechnology 1:55-61, 1990; Sharp, The Lancet 337:1277-1278, 1991; Cornetta et al., Nucleic Acid Research and Molecular Biology 36:311-322, 1987; Anderson, Science 226:401-409, 1984; Moen, Blood Cells 17:407-416, 1991; Miller et al., Biotechnology 7:980-990, 1989; Le Gal La Salle et al., Science 259:988-990, 1993; and Johnson, Chest 107:77S-83S, 1995). Retroviral vectors are particularly well developed and have been used in clinical settings (Rosenberg et al., N. Engl. J. Med 323:370, 1990; Anderson et al., U.S. Pat. No. 5,399,346). In one embodiment, an adeno-associated viral vector (e.g., serotype 2) is used to administer a polynucleotide to an adipocyte or precursor thereof.
Non-viral approaches can also be employed for the introduction of a CREBRF polynucleotide into an adipocyte or precursor thereof. For example, a nucleic acid molecule can be introduced into a cell by lipofection (Felgner et al., Proc. Natl. Acad. Sci. U.S.A. 84:7413, 1987; Ono et al., Neuroscience Letters 17:259, 1990; Brigham et al., Am. J. Med. Sci. 298:278, 1989; Straubinger et al., Methods in Enzymology 101:512, 1983), by asialoorosomucoid-polylysine conjugation (Wu et al., Journal of Biological Chemistry 263:14621, 1988; Wu et al., Journal of Biological Chemistry 264:16985, 1989), or by micro-injection under surgical conditions (Wolff et al., Science 247:1465, 1990). In one embodiment, the nucleic acids are administered in combination with a liposome and protamine.
Gene transfer can also be achieved using non-viral means involving transfection in vitro. Such methods include the use of calcium phosphate, DEAE dextran, electroporation, and protoplast fusion. Liposomes can also be potentially beneficial for delivery of DNA into a cell (e.g., a preadipocyte, adipocyte, or precursor thereof).
The present invention provides a number of diagnostic assays that are useful for characterizing the genotype of a subject. Desirably, the methods of the invention discriminate between polymorphisms of a gene of interest. Preferably, both alleles corresponding to a gene of interest are identified. Accordingly, the invention provides for genotyping useful in virtually any clinical setting where conventional methods of analysis are used. In various aspects, the methods of the invention determine or detect CREBRF genetic variants at the SNP rs373863828. Results obtained from CREBRF genotyping at SNP rs373863828 may be used to select an appropriate therapy for a subject.
The presence or absence of SNP rs373863828 in the CREBRF gene may be evaluated using various techniques. In certain embodiments, PCR or real-time PCR may be used to detect a single nucleotide polymorphism. Polymerase chain reaction (PCR) is widely known in the art. See for example, U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,800,159; K. Mullis, Cold Spring Harbor Symp. Quant. Biol., 51:263-273 (1986); and C. R. Newton & A. Graham, Introduction to Biotechniques: PCR, 2nd Ed., Springer-Verlag (New York: 1997). Various real-time PCR testing platforms that may be used with the present invention include: 5′ nuclease (TaqMan® probes), molecular beacons, and FRET hybridization probes. In certain embodiments, genotyping is performed using a TaqMan® assay, involving amplifying a CREBRF nucleic acid sequence, e.g., 5′-AAGGCTATGAAAATGATTCTGTAGAAGACCTGAAGGAGGTGACTTCAATATCTTCACGGAAGAGAGGTAAAAGAAGATACTTCTGGGAGTATAGTGAACAACTTACACCATCACAGCAAGAGAGGATGCTGAGACCATCTGAGTGGAACC[A/G]AGATACTTTGCCAAGTAATATGTATCAGAAAAATGGCTTACATCATGGTAAGAGGGGATTGCAGTCAGATATTTAGTGTCACTTTAATCAAGTTGAGCTACTAATCCATAATGTTTACTCCGTGTACCTA-3′, where the SNP (rs373863828; denoted in brackets) in the sequence above is detected by using the following TaqMan primers and probe sequences:
In other embodiments, a polymorphism may be detected using a technique including hybridization with a probe specific for SNP rs373863828, restriction endonuclease digestion, nucleic acid sequencing, primer extension, microarray or gene chip analysis, mass spectrometry, or a DNase protection assay. In other embodiments, DNA sequencing may be used to evaluate a polymorphism of the present invention. Sequencing techniques, such as the Sanger method, are well known to those of skill in the art. Next-generation sequencing techniques may be used that do not fall within the scope of Sanger sequencing, including for example microarray sequencing, Solexa sequencing (Illumina), Ion Torrent (Life Technologies), SOLiD (Applied Biosystems), pyrosequencing (based on the detection of released pyrophosphate (PPi); see U.S. Pat. Publ. No. 2006008824; herein incorporated by reference), single-molecule real-time sequencing (Pacific Biosciences) or other sequencing techniques being developed, including for example, nanopore sequencing and tunnelling currents sequencing.
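For any of these detection methods, the bracketed [A/G] notation used for the amplicon above can be expanded into the two allele-specific sequences. The sketch below is one hypothetical way to do so; the helper names are illustrative, and the example input is a short excerpt of the amplicon surrounding the variable site rather than the full sequence.

```python
import re

# Matches a single biallelic site written as [X/Y].
SNP_PATTERN = re.compile(r"\[([ACGT])/([ACGT])\]")


def _clean(sequence):
    """Remove whitespace introduced by line wrapping."""
    return "".join(sequence.split())


def expand_snp(sequence):
    """Return the two allele-specific sequences for one bracketed SNP."""
    cleaned = _clean(sequence)
    sites = SNP_PATTERN.findall(cleaned)
    if len(sites) != 1:
        raise ValueError("expected exactly one [X/Y] site")
    first, second = sites[0]
    return SNP_PATTERN.sub(first, cleaned), SNP_PATTERN.sub(second, cleaned)


def snp_offset(sequence):
    """Zero-based position of the variable base in the expanded sequence."""
    match = SNP_PATTERN.search(_clean(sequence))
    if match is None:
        raise ValueError("no [X/Y] site found")
    return match.start()


excerpt = "GAGTGGAACC[A/G]AGATACTTT"  # excerpt flanking rs373863828
allele_a, allele_g = expand_snp(excerpt)
print(snp_offset(excerpt), allele_a, allele_g)
```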
The genotyping methods of the invention involve detecting or determining a genetic variant or biomarker of interest in a biological sample. In one embodiment, the biological sample contains a cell having diploid DNA content. Human cells containing 46 chromosomes (e.g., human somatic cells) are diploid. In one embodiment, the biological sample is a tissue sample that includes diploid cells of a tissue (e.g., epithelial cells) or organ (e.g., skin cells). Such tissue is obtained, for example, from a cheek swab or biopsy of a tissue or organ. In another embodiment, the biological sample is a biological fluid sample. Biological fluid samples containing diploid cells include saliva, blood, blood serum, plasma, urine, hair follicle, or any other biological fluid useful in the methods of the invention.
As reported herein below, the disruption of CREBRF gene function results in reduced adipogenesis. Accordingly, the invention provides oligonucleotides that inhibit the expression of CREBRF. Such inhibitory nucleic acid molecules include single and double stranded nucleic acid molecules (e.g., DNA, RNA, and analogs thereof) that bind a nucleic acid molecule that encodes a CREBRF polypeptide (e.g., antisense molecules, siRNA, shRNA).
siRNA
Short twenty-one to twenty-five nucleotide double-stranded RNAs are effective at down-regulating gene expression (Zamore et al., Cell 101: 25-33; Elbashir et al., Nature 411: 494-498, 2001, hereby incorporated by reference). The therapeutic effectiveness of an siRNA approach in mammals was demonstrated in vivo by McCaffrey et al. (Nature 418: 38-39, 2002).
Given the sequence of a target gene, siRNAs may be designed to inactivate that gene. Such siRNAs, for example, could be administered directly to an affected tissue, or administered systemically. The nucleic acid sequence of a gene can be used to design small interfering RNAs (siRNAs). The 21 to 25 nucleotide siRNAs may be used, for example, as therapeutics to treat a B cell neoplasia.
The inhibitory nucleic acid molecules of the present invention may be employed as double-stranded RNAs for RNA interference (RNAi)-mediated knock-down of CREBRF expression. RNAi is a method for decreasing the cellular expression of specific proteins of interest (reviewed in Tuschl, Chembiochem 2:239-245, 2001; Sharp, Genes & Devel. 15:485-490, 2000; Hutvagner and Zamore, Curr. Opin. Genet. Devel. 12:225-232, 2002; and Hannon, Nature 418:244-251, 2002). The introduction of siRNAs into cells either by transfection of dsRNAs or through expression of siRNAs using a plasmid-based expression system is increasingly being used to create loss-of-function phenotypes in mammalian cells.
In one embodiment of the invention, a double-stranded RNA (dsRNA) molecule is made that includes between eight and nineteen consecutive nucleobases of a nucleobase oligomer of the invention. The dsRNA can be two distinct strands of RNA that have duplexed, or a single RNA strand that has self-duplexed (small hairpin (sh)RNA). Typically, dsRNAs are about 21 or 22 base pairs, but may be shorter or longer (up to about 29 nucleobases) if desired. dsRNA can be made using standard techniques (e.g., chemical synthesis or in vitro transcription). Kits are available, for example, from Ambion (Austin, Tex.) and Epicentre (Madison, Wis.). Methods for expressing dsRNA in mammalian cells are described in Brummelkamp et al. Science 296:550-553, 2002; Paddison et al. Genes & Devel. 16:948-958, 2002; Paul et al. Nature Biotechnol. 20:505-508, 2002; Sui et al. Proc. Natl. Acad. Sci. USA 99:5515-5520, 2002; Yu et al. Proc. Natl. Acad. Sci. USA 99:6047-6052, 2002; Miyagishi et al. Nature Biotechnol. 20:497-500, 2002; and Lee et al. Nature Biotechnol. 20:500-505, 2002, each of which is hereby incorporated by reference.
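By way of illustration only, candidate duplexes of a chosen length can be enumerated from a target transcript by sliding a window along the sequence and pairing each sense window with its reverse complement. The sketch below assumes a default length of 21 nucleotides, within the range stated above; it does not score or rank candidates. The function names and the toy target sequence are hypothetical, and the target is not a CREBRF sequence.

```python
# Watson-Crick complement table for RNA.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


def candidate_duplexes(target, length=21):
    """Yield (start, sense, antisense) for every window of `length` nt.

    `target` may be given as DNA or RNA; T is converted to U first.
    """
    rna = target.upper().replace("T", "U")
    for start in range(len(rna) - length + 1):
        sense = rna[start:start + length]
        yield start, sense, reverse_complement_rna(sense)


toy_target = "AUGGCUAGCUAGGAUCCGAUCGUAAGC"  # 27-nt toy sequence
for start, sense, antisense in candidate_duplexes(toy_target):
    print(start, sense, antisense)
```

In practice, candidates would be filtered further (for example, for specificity and composition) before use; such filters are outside the scope of this sketch.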
Small hairpin RNAs (shRNAs) comprise an RNA sequence having a stem-loop structure. A “stem-loop structure” refers to a nucleic acid having a secondary structure that includes a region of nucleotides which are known or predicted to form a double strand or duplex (stem portion) that is linked on one side by a region of predominantly single-stranded nucleotides (loop portion). The term “hairpin” is also used herein to refer to stem-loop structures. Such structures are well known in the art and the term is used consistently with its known meaning in the art. As is known in the art, the secondary structure does not require exact base-pairing. Thus, the stem can include one or more base mismatches or bulges. Alternatively, the base-pairing can be exact, i.e. not include any mismatches. The multiple stem-loop structures can be linked to one another through a linker, such as, for example, a nucleic acid linker, a miRNA flanking sequence, other molecule, or some combination thereof.
As used herein, the term “small hairpin RNA” includes a conventional stem-loop shRNA, which forms a precursor miRNA (pre-miRNA). While there may be some variation in range, a conventional stem-loop shRNA can comprise a stem ranging from 19 to 29 bp, and a loop ranging from 4 to 30 bp. “shRNA” also includes micro-RNA embedded shRNAs (miRNA-based shRNAs), wherein the guide strand and the passenger strand of the miRNA duplex are incorporated into an existing (or natural) miRNA or into a modified or synthetic (designed) miRNA. In some instances the precursor miRNA molecule can include more than one stem-loop structure. MicroRNAs are endogenously encoded RNA molecules that are about 22-nucleotides long and generally expressed in a highly tissue- or developmental-stage-specific fashion and that post-transcriptionally regulate target genes. More than 200 distinct miRNAs have been identified in plants and animals. These small regulatory RNAs are believed to serve important biological functions by two prevailing modes of action: (1) by repressing the translation of target mRNAs, and (2) through RNA interference (RNAi), that is, cleavage and degradation of mRNAs. In the latter case, miRNAs function analogously to small interfering RNAs (siRNAs). Thus, one can design and express artificial miRNAs based on the features of existing miRNA genes.
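The conventional stem-loop layout described above (sense stem, loop, antisense stem) can be sketched as follows, with the stem and loop lengths checked against the 19 to 29 and 4 to 30 ranges given in the text. The function names are hypothetical, the default loop is an arbitrary placeholder chosen for illustration, and the example stem is a toy sequence rather than a CREBRF sequence.

```python
# Watson-Crick complement table for RNA.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


def build_shrna(sense_stem, loop="UUCAAGAGA"):
    """Assemble sense stem + loop + antisense stem as one RNA strand.

    The loop default is a placeholder assumption, not a recommendation.
    """
    stem = sense_stem.upper().replace("T", "U")
    loop = loop.upper().replace("T", "U")
    if not 19 <= len(stem) <= 29:
        raise ValueError("stem must be 19 to 29 nt")
    if not 4 <= len(loop) <= 30:
        raise ValueError("loop must be 4 to 30 nt")
    return stem + loop + reverse_complement_rna(stem)


print(build_shrna("GCUAGCUAGGAUCCGAUCG"))  # 19-nt toy stem
```

This sketch produces an exactly base-paired stem; as noted above, mismatches or bulges in the stem are also permitted.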
shRNAs can be expressed from DNA vectors to provide sustained silencing and high yield delivery into almost any cell type. In some embodiments, the vector is a viral vector. Exemplary viral vectors include retroviral, including lentiviral, adenoviral, baculoviral and avian viral vectors, and including such vectors allowing for stable, single-copy genomic integrations. Retroviruses from which the retroviral plasmid vectors can be derived include, but are not limited to, Moloney Murine Leukemia Virus, spleen necrosis virus, Rous sarcoma Virus, Harvey Sarcoma Virus, avian leukosis virus, gibbon ape leukemia virus, human immunodeficiency virus, Myeloproliferative Sarcoma Virus, and mammary tumor virus. A retroviral plasmid vector can be employed to transduce packaging cell lines to form producer cell lines. Examples of packaging cells which can be transfected include, but are not limited to, the PE501, PA317, ψ-2, ψ-AM, PA12, T19-14X, VT-19-17-H2, ψCRE, ψCRIP, GP+E-86, GP+envAm12, and DAN cell lines as described in Miller, Human Gene Therapy 1:5-14 (1990), which is incorporated herein by reference in its entirety. The vector can transduce the packaging cells through any means known in the art. A producer cell line generates infectious retroviral vector particles which include polynucleotide encoding a DNA replication protein. Such retroviral vector particles then can be employed to transduce eukaryotic cells, either in vitro or in vivo. The transduced eukaryotic cells will express a DNA replication protein.
Catalytic RNA molecules or ribozymes that include an antisense sequence of the present invention can be used to inhibit expression of a CREBRF nucleic acid molecule in vivo. The inclusion of ribozyme sequences within antisense RNAs confers RNA-cleaving activity upon them, thereby increasing the activity of the constructs. The design and use of target RNA-specific ribozymes is described in Haseloff et al., Nature 334:585-591, 1988, and U.S. Patent Application Publication No. 2003/0003469 A1, each of which is incorporated by reference.
Accordingly, the invention also features a catalytic RNA molecule that includes, in the binding arm, an antisense RNA having between eight and nineteen consecutive nucleobases. In preferred embodiments of this invention, the catalytic nucleic acid molecule is formed in a hammerhead or hairpin motif. Examples of such hammerhead motifs are described by Rossi et al., Aids Research and Human Retroviruses, 8:183, 1992. Examples of hairpin motifs are described by Hampel et al., “RNA Catalyst for Cleaving Specific RNA Sequences,” filed Sep. 20, 1989, which is a continuation-in-part of U.S. Ser. No. 07/247,100 filed Sep. 20, 1988, Hampel and Tritz, Biochemistry, 28:4929, 1989, and Hampel et al., Nucleic Acids Research, 18: 299, 1990. These specific motifs are not limiting in the invention and those skilled in the art will recognize that all that is important in an enzymatic nucleic acid molecule of this invention is that it has a specific substrate binding site which is complementary to one or more of the target gene RNA regions, and that it has nucleotide sequences within or surrounding that substrate binding site which impart an RNA cleaving activity to the molecule.
Essentially any method for introducing a nucleic acid construct into cells can be employed. Physical methods of introducing nucleic acids include injection of a solution containing the construct, bombardment by particles covered by the construct, soaking a cell, tissue sample or organism in a solution of the nucleic acid, or electroporation of cell membranes in the presence of the construct. A viral construct packaged into a viral particle can be used to accomplish both efficient introduction of an expression construct into the cell and transcription of the encoded shRNA. Other methods known in the art for introducing nucleic acids to cells can be used, such as lipid-mediated carrier transport, chemical mediated transport, such as calcium phosphate, and the like. Thus the shRNA-encoding nucleic acid construct can be introduced along with components that perform one or more of the following activities: enhance RNA uptake by the cell, promote annealing of the duplex strands, stabilize the annealed strands, or otherwise increase inhibition of the target gene.
For expression within cells, DNA vectors, for example plasmid vectors comprising either an RNA polymerase II or RNA polymerase III promoter can be employed. Expression of endogenous miRNAs is controlled by RNA polymerase II (Pol II) promoters and in some cases, shRNAs are most efficiently driven by Pol II promoters, as compared to RNA polymerase III promoters (Dickins et al., 2005, Nat. Genet. 39: 914-921). In some embodiments, expression of the shRNA can be controlled by an inducible promoter or a conditional expression system, including, without limitation, RNA polymerase type II promoters. Examples of useful promoters in the context of the invention are tetracycline-inducible promoters (including TRE-tight), IPTG-inducible promoters, tetracycline transactivator systems, and reverse tetracycline transactivator (rtTA) systems. Constitutive promoters can also be used, as can cell- or tissue-specific promoters. Many promoters will be ubiquitous, such that they are expressed in all cell and tissue types. A certain embodiment uses tetracycline-responsive promoters, one of the most effective conditional gene expression systems in in vitro and in vivo studies. See International Patent Application PCT/US2003/030901 (Publication No. WO 2004-029219 A2) and Fewell et al., 2006, Drug Discovery Today 11: 975-982, for a description of inducible shRNA.
Naked polynucleotides, or analogs thereof, are capable of entering mammalian cells and inhibiting expression of a gene of interest. Nonetheless, it may be desirable to utilize a formulation that aids in the delivery of oligonucleotides or other nucleobase oligomers to cells (see, e.g., U.S. Pat. Nos. 5,656,611, 5,753,613, 5,785,992, 6,120,798, 6,221,959, 6,346,613, and 6,353,055, each of which is hereby incorporated by reference).
Therapy may be provided at home, the doctor's office, a clinic, a hospital's outpatient department, or a hospital. Treatment generally begins at a hospital so that the doctor can observe the therapy's effects closely and make any adjustments that are needed. The duration of the therapy depends on the kind of cancer being treated, the age and condition of the patient, the stage and type of the patient's disease, and how the patient's body responds to the treatment. Drug administration may be performed at different intervals (e.g., daily, weekly, or monthly).
Oligonucleotides and other Nucleobase Oligomers
At least two types of oligonucleotides induce the cleavage of RNA by RNase H: polydeoxynucleotides with phosphodiester (PO) or phosphorothioate (PS) linkages. Although 2′-OMe-RNA sequences exhibit a high affinity for RNA targets, these sequences are not substrates for RNase H. A desirable oligonucleotide is one based on 2′-modified oligonucleotides containing oligodeoxynucleotide gaps with some or all internucleotide linkages modified to phosphorothioates for nuclease resistance. The presence of methylphosphonate modifications increases the affinity of the oligonucleotide for its target RNA and thus reduces the IC50. This modification also increases the nuclease resistance of the modified oligonucleotide. It is understood that the methods and reagents of the present invention may be used in conjunction with any technologies that may be developed, including covalently-closed multiple antisense (CMAS) oligonucleotides (Moon et al., Biochem J. 346:295-303, 2000; PCT Publication No. WO 00/61595), ribbon-type antisense (RiAS) oligonucleotides (Moon et al., J. Biol. Chem. 275:4647-4653, 2000; PCT Publication No. WO 00/61595), and large circular antisense oligonucleotides (U.S. Patent Application Publication No. US 2002/0168631 A1).
As is known in the art, a nucleoside is a nucleobase-sugar combination. The base portion of the nucleoside is normally a heterocyclic base. The two most common classes of such heterocyclic bases are the purines and the pyrimidines. Nucleotides are nucleosides that further include a phosphate group covalently linked to the sugar portion of the nucleoside. For those nucleosides that include a pentofuranosyl sugar, the phosphate group can be linked to either the 2′, 3′ or 5′ hydroxyl moiety of the sugar. In forming oligonucleotides, the phosphate groups covalently link adjacent nucleosides to one another to form a linear polymeric compound. In turn, the respective ends of this linear polymeric structure can be further joined to form a circular structure; open linear structures are generally preferred. Within the oligonucleotide structure, the phosphate groups are commonly referred to as forming the backbone of the oligonucleotide. The normal linkage or backbone of RNA and DNA is a 3′ to 5′ phosphodiester linkage.
Specific examples of preferred nucleobase oligomers useful in this invention include oligonucleotides containing modified backbones or non-natural internucleoside linkages. As defined in this specification, nucleobase oligomers having modified backbones include those that retain a phosphorus atom in the backbone and those that do not have a phosphorus atom in the backbone. For the purposes of this specification, modified oligonucleotides that do not have a phosphorus atom in their internucleoside backbone are also considered to be nucleobase oligomers.
Nucleobase oligomers that have modified oligonucleotide backbones include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkyl-phosphotriesters, methyl and other alkyl phosphonates including 3′-alkylene phosphonates and chiral phosphonates, phosphinates, phosphoramidates including 3′-amino phosphoramidate and aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, and boranophosphates having normal 3′-5′ linkages, 2′-5′ linked analogs of these, and those having inverted polarity, wherein the adjacent pairs of nucleoside units are linked 3′-5′ to 5′-3′ or 2′-5′ to 5′-2′. Various salts, mixed salts and free acid forms are also included. Representative United States patents that teach the preparation of the above phosphorus-containing linkages include, but are not limited to, U.S. Pat. Nos. 3,687,808; 4,469,863; 4,476,301; 5,023,243; 5,177,196; 5,188,897; 5,264,423; 5,276,019; 5,278,302; 5,286,717; 5,321,131; 5,399,676; 5,405,939; 5,453,496; 5,455,233; 5,466,677; 5,476,925; 5,519,126; 5,536,821; 5,541,306; 5,550,111; 5,563,253; 5,571,799; 5,587,361; and 5,625,050, each of which is herein incorporated by reference.
Nucleobase oligomers having modified oligonucleotide backbones that do not include a phosphorus atom therein have backbones that are formed by short chain alkyl or cycloalkyl internucleoside linkages, mixed heteroatom and alkyl or cycloalkyl internucleoside linkages, or one or more short chain heteroatomic or heterocyclic internucleoside linkages. These include those having morpholino linkages (formed in part from the sugar portion of a nucleoside); siloxane backbones; sulfide, sulfoxide and sulfone backbones; formacetyl and thioformacetyl backbones; methylene formacetyl and thioformacetyl backbones; alkene containing backbones; sulfamate backbones; methyleneimino and methylenehydrazino backbones; sulfonate and sulfonamide backbones; amide backbones; and others having mixed N, O, S and CH2 component parts. Representative United States patents that teach the preparation of the above oligonucleotides include, but are not limited to, U.S. Pat. Nos. 5,034,506; 5,166,315; 5,185,444; 5,214,134; 5,216,141; 5,235,033; 5,264,562; 5,264,564; 5,405,938; 5,434,257; 5,466,677; 5,470,967; 5,489,677; 5,541,307; 5,561,225; 5,596,086; 5,602,240; 5,610,289; 5,602,240; 5,608,046; 5,610,289; 5,618,704; 5,623,070; 5,663,312; 5,633,360; 5,677,437; and 5,677,439, each of which is herein incorporated by reference.
In other nucleobase oligomers, both the sugar and the internucleoside linkage, i.e., the backbone, are replaced with novel groups. The nucleobase units are maintained for hybridization with a gene listed in Table 2 or 3. One such nucleobase oligomer, is referred to as a Peptide Nucleic Acid (PNA). In PNA compounds, the sugar-backbone of an oligonucleotide is replaced with an amide containing backbone, in particular an aminoethylglycine backbone. The nucleobases are retained and are bound directly or indirectly to aza nitrogen atoms of the amide portion of the backbone. Methods for making and using these nucleobase oligomers are described, for example, in “Peptide Nucleic Acids: Protocols and Applications” Ed. P. E. Nielsen, Horizon Press, Norfolk, United Kingdom, 1999. Representative United States patents that teach the preparation of PNAs include, but are not limited to, U.S. Pat. Nos. 5,539,082; 5,714,331; and 5,719,262, each of which is herein incorporated by reference. Further teaching of PNA compounds can be found in Nielsen et al., Science, 1991, 254, 1497-1500.
In particular embodiments of the invention, the nucleobase oligomers have phosphorothioate backbones and nucleosides with heteroatom backbones, and in particular —CH2—NH—O—CH2—, —CH2—N(CH3)—O—CH2— (known as a methylene (methylimino) or MMI backbone), —CH2—O—N(CH3)—CH2—, —CH2—N(CH3)—N(CH3)—CH2—, and —O—N(CH3)—CH2—CH2—. In other embodiments, the oligonucleotides have morpholino backbone structures described in U.S. Pat. No. 5,034,506.
Nucleobase oligomers may also contain one or more substituted sugar moieties. Nucleobase oligomers comprise one of the following at the 2′ position: OH; F; O-, S-, or N-alkyl; O-, S-, or N-alkenyl; O-, S- or N-alkynyl; or O-alkyl-O-alkyl, wherein the alkyl, alkenyl, and alkynyl may be substituted or unsubstituted C1 to C10 alkyl or C2 to C10 alkenyl and alkynyl. Particularly preferred are O[(CH2)nO]mCH3, O(CH2)nOCH3, O(CH2)nNH2, O(CH2)nCH3, O(CH2)nONH2, and O(CH2)nON[(CH2)nCH3]2, where n and m are from 1 to about 10. Other preferred nucleobase oligomers include one of the following at the 2′ position: C1 to C10 lower alkyl, substituted lower alkyl, alkaryl, aralkyl, O-alkaryl, or O-aralkyl, SH, SCH3, OCN, Cl, Br, CN, CF3, OCF3, SOCH3, SO2CH3, ONO2, NO2, NH2, heterocycloalkyl, heterocycloalkaryl, aminoalkylamino, polyalkylamino, substituted silyl, an RNA cleaving group, a reporter group, an intercalator, a group for improving the pharmacokinetic properties of a nucleobase oligomer, or a group for improving the pharmacodynamic properties of a nucleobase oligomer, and other substituents having similar properties. Preferred modifications are 2′-O-methyl and 2′-methoxyethoxy (2′-O—CH2CH2OCH3, also known as 2′-O-(2-methoxyethyl) or 2′-MOE). Another desirable modification is 2′-dimethylaminooxyethoxy (i.e., O(CH2)2ON(CH3)2), also known as 2′-DMAOE. Other modifications include 2′-aminopropoxy (2′-OCH2CH2CH2NH2) and 2′-fluoro (2′-F). Similar modifications may also be made at other positions on an oligonucleotide or other nucleobase oligomer, particularly the 3′ position of the sugar on the 3′ terminal nucleotide or in 2′-5′ linked oligonucleotides and the 5′ position of 5′ terminal nucleotide. Nucleobase oligomers may also have sugar mimetics such as cyclobutyl moieties in place of the pentofuranosyl sugar. Representative United States patents that teach the preparation of such modified sugar structures include, but are not limited to, U.S. Pat. Nos. 4,981,957; 5,118,800; 5,319,080; 5,359,044; 5,393,878; 5,446,137; 5,466,786; 5,514,785; 5,519,134; 5,567,811; 5,576,427; 5,591,722; 5,597,909; 5,610,300; 5,627,053; 5,639,873; 5,646,265; 5,658,873; 5,670,633; and 5,700,920, each of which is herein incorporated by reference in its entirety.
Nucleobase oligomers may also include nucleobase modifications or substitutions. As used herein, “unmodified” or “natural” nucleobases include the purine bases adenine (A) and guanine (G), and the pyrimidine bases thymine (T), cytosine (C) and uracil (U). Modified nucleobases include other synthetic and natural nucleobases, such as 5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine; 2-propyl and other alkyl derivatives of adenine and guanine; 2-thiouracil, 2-thiothymine and 2-thiocytosine; 5-halouracil and cytosine; 5-propynyl uracil and cytosine; 6-azo uracil, cytosine and thymine; 5-uracil (pseudouracil); 4-thiouracil; 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and guanines; 5-halo (e.g., 5-bromo), 5-trifluoromethyl and other 5-substituted uracils and cytosines; 7-methylguanine and 7-methyladenine; 8-azaguanine and 8-azaadenine; 7-deazaguanine and 7-deazaadenine; and 3-deazaguanine and 3-deazaadenine. Further nucleobases include those disclosed in U.S. Pat. No. 3,687,808, those disclosed in The Concise Encyclopedia Of Polymer Science And Engineering, pages 858-859, Kroschwitz, J. I., ed. John Wiley & Sons, 1990, those disclosed by Englisch et al., Angewandte Chemie, International Edition, 1991, 30, 613, and those disclosed by Sanghvi, Y. S., Chapter 15, Antisense Research and Applications, pages 289-302, Crooke, S. T. and Lebleu, B., ed., CRC Press, 1993. Certain of these nucleobases are particularly useful for increasing the binding affinity of an antisense oligonucleotide of the invention. These include 5-substituted pyrimidines, 6-azapyrimidines, and N-2, N-6 and O-6 substituted purines, including 2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine. 5-methylcytosine substitutions have been shown to increase nucleic acid duplex stability by 0.6-1.2° C. (Sanghvi, Y. S., Crooke, S. T. and Lebleu, B., eds., Antisense Research and Applications, CRC Press, Boca Raton, 1993, pp. 276-278) and are desirable base substitutions, even more particularly when combined with 2′-O-methoxyethyl or 2′-O-methyl sugar modifications. Representative United States patents that teach the preparation of certain of the above noted modified nucleobases as well as other modified nucleobases include U.S. Pat. Nos. 4,845,205; 5,130,302; 5,134,066; 5,175,273; 5,367,066; 5,432,272; 5,457,187; 5,459,255; 5,484,908; 5,502,177; 5,525,711; 5,552,540; 5,587,469; 5,594,121; 5,596,091; 5,614,617; 5,681,941; and 5,750,692, each of which is herein incorporated by reference.
Another modification of a nucleobase oligomer of the invention involves chemically linking to the nucleobase oligomer one or more moieties or conjugates that enhance the activity, cellular distribution, or cellular uptake of the oligonucleotide. Such moieties include but are not limited to lipid moieties such as a cholesterol moiety (Letsinger et al., Proc. Natl. Acad. Sci. USA, 86:6553-6556, 1989), cholic acid (Manoharan et al., Bioorg. Med. Chem. Let., 4:1053-1060, 1994), a thioether, e.g., hexyl-S-tritylthiol (Manoharan et al., Ann. N.Y. Acad. Sci., 660:306-309, 1992; Manoharan et al., Bioorg. Med. Chem. Let., 3:2765-2770, 1993), a thiocholesterol (Oberhauser et al., Nucl. Acids Res., 20:533-538, 1992), an aliphatic chain, e.g., dodecandiol or undecyl residues (Saison-Behmoaras et al., EMBO J., 10:1111-1118, 1991; Kabanov et al., FEBS Lett., 259:327-330, 1990; Svinarchuk et al., Biochimie, 75:49-54, 1993), a phospholipid, e.g., di-hexadecyl-rac-glycerol or triethylammonium 1,2-di-O-hexadecyl-rac-glycero-3-H-phosphonate (Manoharan et al., Tetrahedron Lett., 36:3651-3654, 1995; Shea et al., Nucl. Acids Res., 18:3777-3783, 1990), a polyamine or a polyethylene glycol chain (Manoharan et al., Nucleosides & Nucleotides, 14:969-973, 1995), or adamantane acetic acid (Manoharan et al., Tetrahedron Lett., 36:3651-3654, 1995), a palmityl moiety (Mishra et al., Biochim. Biophys. Acta, 1264:229-237, 1995), or an octadecylamine or hexylamino-carbonyl-oxycholesterol moiety (Crooke et al., J. Pharmacol. Exp. Ther., 277:923-937, 1996). Representative United States patents that teach the preparation of such nucleobase oligomer conjugates include U.S. Pat. Nos.
4,587,044; 4,605,735; 4,667,025; 4,762,779; 4,789,737; 4,824,941; 4,828,979; 4,835,263; 4,876,335; 4,904,582; 4,948,882; 4,958,013; 5,082,830; 5,109,124; 5,112,963; 5,118,802; 5,138,045; 5,214,136; 5,218,105; 5,245,022; 5,254,469; 5,258,506; 5,262,536; 5,272,250; 5,292,873; 5,317,098; 5,371,241, 5,391,723; 5,414,077; 5,416,203, 5,451,463; 5,486,603; 5,510,475; 5,512,439; 5,512,667; 5,514,785; 5,525,465; 5,541,313; 5,545,730; 5,552,538; 5,565,552; 5,567,810; 5,574,142; 5,578,717; 5,578,718; 5,580,731; 5,585,481; 5,587,371; 5,591,584; 5,595,726; 5,597,696; 5,599,923; 5,599,928; 5,608,046; and 5,688,941, each of which is herein incorporated by reference.
The present invention also includes nucleobase oligomers that are chimeric compounds. “Chimeric” nucleobase oligomers are nucleobase oligomers, particularly oligonucleotides, that contain two or more chemically distinct regions, each made up of at least one monomer unit, i.e., a nucleotide in the case of an oligonucleotide. These nucleobase oligomers typically contain at least one region where the nucleobase oligomer is modified to confer, upon the nucleobase oligomer, increased resistance to nuclease degradation, increased cellular uptake, and/or increased binding affinity for the target nucleic acid. An additional region of the nucleobase oligomer may serve as a substrate for enzymes capable of cleaving RNA:DNA or RNA:RNA hybrids. By way of example, RNase H is a cellular endonuclease which cleaves the RNA strand of an RNA:DNA duplex. Activation of RNase H, therefore, results in cleavage of the RNA target, thereby greatly enhancing the efficiency of nucleobase oligomer inhibition of gene expression. Consequently, comparable results can often be obtained with shorter nucleobase oligomers when chimeric nucleobase oligomers are used, compared to phosphorothioate deoxyoligonucleotides hybridizing to the same target region.
Chimeric nucleobase oligomers of the invention may be formed as composite structures of two or more nucleobase oligomers as described above. Such nucleobase oligomers, when oligonucleotides, have also been referred to in the art as hybrids or gapmers. Representative United States patents that teach the preparation of such hybrid structures include U.S. Pat. Nos. 5,013,830; 5,149,797; 5,220,007; 5,256,775; 5,366,878; 5,403,711; 5,491,133; 5,565,350; 5,623,065; 5,652,355; 5,652,356; and 5,700,922, each of which is herein incorporated by reference in its entirety.
The nucleobase oligomers used in accordance with this invention may be conveniently and routinely made through the well-known technique of solid phase synthesis. Equipment for such synthesis is sold by several vendors including, for example, Applied Biosystems (Foster City, Calif.). Any other means for such synthesis known in the art may additionally or alternatively be employed. It is well known to use similar techniques to prepare oligonucleotides such as the phosphorothioates and alkylated derivatives.
The nucleobase oligomers of the invention may also be admixed, encapsulated, conjugated or otherwise associated with other molecules, molecule structures or mixtures of compounds, as for example, liposomes, receptor targeted molecules, oral, rectal, topical or other formulations, for assisting in uptake, distribution and/or absorption. Representative United States patents that teach the preparation of such uptake, distribution and/or absorption assisting formulations include U.S. Pat. Nos. 5,108,921; 5,354,844; 5,416,016; 5,459,127; 5,521,291; 5,543,158; 5,547,932; 5,583,020; 5,591,721; 4,426,330; 4,534,899; 5,013,556; 5,108,921; 5,213,804; 5,227,170; 5,264,221; 5,356,633; 5,395,619; 5,416,016; 5,417,978; 5,462,854; 5,469,854; 5,512,295; 5,527,528; 5,534,259; 5,543,152; 5,556,948; 5,580,575; and 5,595,756, each of which is herein incorporated by reference.
The invention provides cellular compositions (e.g., preadipocytes, adipocytes or precursors of these cell types) comprising a gene whose expression regulates the adipogenesis of the cell. In particular, as reported herein below, the invention provides cells comprising the CREBRF gene that is operably linked to a promoter.
Methods of the invention are useful for the high-throughput, low-cost screening of candidate agents (e.g., inhibitory nucleic acids such as shRNAs, polypeptides, polynucleotides, small organic molecules) that modulate the adipogenesis of a cell of the invention. One skilled in the art appreciates that the effects of a candidate agent on a cell are typically assessed by comparison to a corresponding control cell not contacted with the candidate agent. Thus, the screening methods include comparing the adipogenesis of a cell contacted by a candidate agent to the adipogenesis of an untreated reference (i.e., control cell).
In one aspect, the invention provides a method of identifying a compound that modulates the expression of a CREB3 Regulatory Factor (CREBRF) polypeptide, comprising: contacting the compound with a nucleic acid that expresses a CREBRF polypeptide under conditions suitable for expression by the nucleic acid; determining the level of expression of the CREBRF polypeptide; determining the level of expression of the nucleic acid in the absence of the compound (i.e., determining the level of expression in a control or reference cell); and comparing the level of expression of the nucleic acid after contact with the compound with the level of expression of the nucleic acid without contact with the compound. Levels of gene expression are determined by methods well-known to those skilled in the art.
In one embodiment, cells of the invention are used to determine potential effects of pharmacological drugs on adipogenesis. The drugs may be proprietary, commercially available or novel compounds and are being administered to patients with various diseases such as diabetes, obesity and cardiovascular diseases. Those that have effects on adipogenesis function in our cell type-specific models would provide entry points for testing drug effects on human adipogenesis, such as changes in weight in patients.
In other embodiments, cells of the invention are used to determine the optimal time for drug administration to a subject. For example, a cell of the invention is contacted with an agent at various time points over the course of the day, and the agent's effect on cell physiology is assayed to determine whether the agent's efficacy or probability of causing adverse side effects alters as a function of the time of administration. The cellular physiology of potential interest in the context of fibroblasts, adipocytes and hepatocytes ranges from RNA and protein production, membrane transport, autophagy and cell division, to cell signaling, cell death, and metabolism. In particular, for example, hepatocytes can be used to study effects of differential temporal application of antidiabetic drugs such as Metformin and TZD, on cellular physiology such as insulin sensitivity, glycogen synthesis and gluconeogenesis, as well as on detoxification and metabolism of xenobiotics.
The effects of agents on a cell's adipogenesis can be assayed by detecting the expression or activity of an adipogenic polypeptide or polynucleotide. Polypeptide or polynucleotide expression can be detected by procedures well known in the art, such as Western blotting, flow cytometry, immunocytochemistry, binding to magnetic and/or antibody-coated beads, in situ hybridization, fluorescence in situ hybridization (FISH), ELISA, microarray analysis, RT-PCR, Northern blotting, or colorimetric assays, such as the Bradford Assay and Lowry Assay.
For example, one or more candidate agents are added at varying concentrations to the culture medium containing a cell of the invention. An agent that modulates the expression of a detectable reporter expressed in the cell is considered useful in the invention; such an agent may be used, for example, as an adipogenesis modulator. An agent identified according to a method of the invention is locally or systemically delivered to modulate the adipogenesis of a subject.
In one embodiment, the effect of a candidate agent may be measured at the level of polypeptide production using the same general approach and standard immunological techniques, such as Western blotting or immunoprecipitation with an antibody specific for an adipogenic marker. For example, immunoassays may be used to detect or monitor the expression of a protein of interest in a cell of the invention.
Alternatively, or in addition, candidate agents are identified by first assaying those that modulate the marker expression of a cell of the invention and subsequently testing their effect on cells, or on whole animals, which would have implications in human diseases. In one embodiment, an adipogenesis modulator polypeptide is assayed for its ability to interact with adipogenic marker polypeptides, for example, using a Gal4 two-hybrid screen as described herein. Such interactions can also be readily assayed using any number of standard binding techniques and functional assays (e.g., those described in Ausubel et al., supra).
The invention also provides kits for carrying out the various methods of the invention. For example, in one aspect, such kits are useful for the identification of a CREBRF polymorphism in a biological sample obtained from a subject. In various embodiments, the kit includes one or more probes or primers that identify a CREBRF nucleic acid sequence encoding a CREBRF polypeptide comprising a glutamine at amino acid position 457 (e.g., an A at nucleotide position 1689), together with instructions for using the primers to genotype a biological sample.
In another aspect, the invention also provides kits for identifying compounds/agents that modulate expression of a CREBRF. Such kits are useful for the identification of a compound/agent that regulates adipogenesis in a subject. In various embodiments, the kit includes cells of the invention comprising the CREBRF gene that is operably linked to a promoter, together with instructions for using the cells to identify a modulator.
In one embodiment, the instructions include instructions for using the kits in accordance with the methods of the invention. In certain embodiments, the instructions include at least one of the following: description of a therapeutic agent (e.g., for treatment of obesity or symptoms thereof); dosage schedule; administration precautions; warnings; indications; counter-indications; overdosage information; adverse reactions; animal pharmacology; clinical studies; and/or references. The instructions may be printed directly on the container (when present), or as a label applied to the container, or as a separate sheet, pamphlet, card, or folder supplied in or with the container.
In some embodiments, the kit comprises a sterile container which contains a composition of the invention; such containers can be boxes, ampoules, bottles, vials, tubes, bags, pouches, blister-packs, or other suitable container forms known in the art. Such containers can be made of plastic, glass, laminated paper, metal foil, or other materials suitable for holding medicaments.
The practice of the present invention employs, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are well within the purview of the skilled artisan. Such techniques are explained fully in the literature, such as, “Molecular Cloning: A Laboratory Manual”, second edition (Sambrook, 1989); “Oligonucleotide Synthesis” (Gait, 1984); “Animal Cell Culture” (Freshney, 1987); “Methods in Enzymology”; “Handbook of Experimental Immunology” (Weir, 1996); “Gene Transfer Vectors for Mammalian Cells” (Miller and Calos, 1987); “Current Protocols in Molecular Biology” (Ausubel, 1987); “PCR: The Polymerase Chain Reaction”, (Mullis, 1994); “Current Protocols in Immunology” (Coligan, 1991). These techniques are applicable to the production of the polynucleotides and polypeptides of the invention, and, as such, may be considered in making and practicing the invention. Particularly useful techniques for particular embodiments will be discussed in the sections that follow.
The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the assay, screening, and therapeutic methods of the invention, and are not intended to limit the scope of what the inventors regard as their invention.
The experimental examples below were performed using the following materials and methods.
The participants in this study are derived from the populations of the Independent State of Samoa and the United States territory of American Samoa. Two samples were used in this study: a discovery sample of 3,072 phenotyped and genotyped Samoans and a replication sample of an additional 2,583 phenotyped and genotyped Samoans and American Samoans. The parent GWAS study, sample selection and data collection methods, and phenotype levels, including lipids and lipoproteins, have been reported3. The discovery GWAS data set will be available in dbGaP (accession number: phs000914). This study has been approved by the Health Research Committee of the Samoa Ministry of Health and the Institutional Review Boards of Brown University, University of Cincinnati, and University of Pittsburgh. All participants gave informed consent.
The discovery sample is drawn from 3,475 men and women (n=1,437 male), ages 24.5 to <65 years who reported Samoan ancestry (based on having four Samoan grandparents). Recruitment took place between February and July 2010 in 33 villages across both islands (Upolu and Savaii) of Samoa3. A population-based design was employed and consenting participants completed interviews targeting lifestyle factors related to cardiometabolic health (health history, socio-economic position, dietary intake, and physical activity) and anthropometric measurements (height, weight, blood pressure, body composition), and gave fasting whole blood samples for biochemical and genetic assays. A description of the prevalence of non-communicable diseases and associated risk factors is provided in Hawley et al.3.
In the original GWAS study design, the discovery sample size goal of 2,500 (which was exceeded) was chosen so as to have high power to detect risk SNPs with realistic effect sizes. Power was estimated as follows: Quanto34,35 was used to estimate the power to detect the FTO rs9930506 SNP, which in the Sardinia study36 explained 1.34% of the BMI variance. If it is assumed that the SNP has the same allele frequencies and BMI has the same overall mean and standard deviation as in Scuteri et al.36, then at a significance level of 1×10⁻⁵, power is >80% when the risk SNP explains at least 1.1% of the variance (and power is 90% when the SNP explains 1.3% of the variance). If it is instead tested at 1×10⁻⁷, power is >80% if the SNP explains at least 1.5% of the variance.
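The power figures above can be approximated with a short calculation. The following is a minimal sketch only, and it does not reproduce Quanto's internal method. It assumes a 1-degree-of-freedom additive test whose statistic is noncentral chi-square with noncentrality n·q/(1−q), where q is the fraction of trait variance explained by the SNP and n is the design target of 2,500. The function name `snp_power` is hypothetical.

```python
from statistics import NormalDist


def snp_power(n: int, var_explained: float, alpha: float) -> float:
    """Approximate power of a two-sided 1-df test for a quantitative trait.

    Assumption (not taken from the source): the test statistic is
    noncentral chi-square with 1 df and noncentrality
    n * q / (1 - q), where q is the variance fraction explained.
    """
    std_normal = NormalDist()
    # Square root of the noncentrality parameter (shift of the Z statistic).
    delta = (n * var_explained / (1.0 - var_explained)) ** 0.5
    # Two-sided critical value; the lower-tail quantile is numerically safer.
    z_crit = -std_normal.inv_cdf(alpha / 2.0)
    # P(|Z + delta| > z_crit) for a standard normal Z.
    return std_normal.cdf(delta - z_crit) + std_normal.cdf(-delta - z_crit)


if __name__ == "__main__":
    for q, alpha in [(0.011, 1e-5), (0.013, 1e-5), (0.015, 1e-7)]:
        power = snp_power(2500, q, alpha)
        print(f"variance explained={q:.3f} alpha={alpha:g} power={power:.2f}")
```

Under these assumptions the three scenarios in the text come out close to the stated 80% and 90% levels; small differences from Quanto are to be expected.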
The replication sample consists of individuals from two samples of Samoans studied in 1990-95 and in 2002-03, which were analyzed as if all samples were unrelated because genome-wide marker data was not available. The 1990-95 study sample derives from a longitudinal study of adiposity and cardiovascular disease risk factors among adults from the U.S. territory of American Samoa and the independent nation of Samoa. Although there is substantial economic disparity between the two polities, the Samoans from both territories form a single socio-cultural unit with frequent exchange of mates. Genetically they represent a single homogeneous population37,38. Participants were between 25 and 55 years of age at baseline and reported that all four grandparents were of Samoan ancestry. Detailed descriptions of the sampling and recruitment were reported previously39-41. Briefly, participants were recruited from 46 villages and worksites in American Samoa in 1990 and nine villages in (then Western) Samoa in 1991. All participants were free of self-reported history of heart disease, hypertension, or diabetes at baseline. There were 413 and 607 genotyped and phenotyped individuals available from American Samoa in 1990 and from Samoa in 1991, respectively (Table 1). Due to lack of genome-wide marker data on these samples, relatedness cannot be inferred, and so these were treated as unrelated in the analyses.
The 2002-03 family study sample includes adults and children recruited as part of an extended family-based genetic linkage analysis of cardiometabolic traits1,42-45. Probands and relatives were unselected for obesity or related phenotypes. The recruitment process and criteria used for inclusion in this study are described in detail previously42,45. There were 590 adults, 18-89 years, from 2002 in American Samoa; and 493 adults, 19-82 years, and 409 children ages 5-<18 years, from 2003 in Samoa, available with genotypes and phenotypes (Table 1). The analyses of these samples were adjusted for relatedness using kinships derived from the known family structures (which had been verified to be consistent with relatedness estimates derived using genome-wide microsatellite markers)1.
Height, weight and BMI were measured as previously described39,46. Polynesian cutoffs were used to classify individuals as normal weight, overweight or obese based on BMI of <26 kg/m², 26-32 kg/m², and >32 kg/m², respectively2. Obesity in children was categorized from BMI using the international age and sex-specific classifications developed by Cole et al.47
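The adult classification rule above can be written out as a small function. This is an illustrative sketch; the function names are hypothetical, and it covers only the adult Polynesian cutoffs stated in the text, not the age- and sex-specific classification used for children.

```python
def bmi_kg_m2(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms over height in meters squared."""
    return weight_kg / height_m ** 2


def classify_bmi_polynesian(bmi: float) -> str:
    """Classify an adult BMI using the Polynesian cutoffs from the text.

    <26 kg/m^2 is normal weight, 26-32 kg/m^2 is overweight,
    and >32 kg/m^2 is obese.
    """
    if bmi < 26.0:
        return "normal weight"
    if bmi <= 32.0:
        return "overweight"
    return "obese"


if __name__ == "__main__":
    value = bmi_kg_m2(95.0, 1.75)
    print(round(value, 1), classify_bmi_polynesian(value))
```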
In the discovery sample, hypertension and abdominal (at the level of the umbilicus) and hip circumferences were measured in duplicate and averaged (Table 4). Bioelectrical impedance measures of resistance and reactance (RJL BIA-101Q device, RJL Systems, MI, USA) were used to estimate percent body fat based on Polynesian-specific equations2,46. Serum separated from whole blood samples, collected after a 10-hour overnight fast, was assayed for cholesterol (total, HDL and LDL), triglycerides, glucose, and insulin. The assay techniques for these metabolic markers have been described previously. Individuals were classified as having type 2 diabetes based on a fasting serum glucose >126 mg/dL or the current use of diabetes medication48. Hypertensives either had a systolic BP >140 mm Hg or diastolic BP >90 mm Hg, or were currently taking hypertension medication. Additionally, serum levels of leptin and adiponectin were obtained by using commercially available radioimmunoassay kits (EMD Millipore Inc., St. Charles, Mo., USA). HOMA-IR was calculated as glucose (mg/dL)×insulin (μU/mL)/405 as recommended.49
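The derived phenotypes defined above reduce to simple rules. The sketch below restates them directly from the text; the function and parameter names are hypothetical and the thresholds are strict inequalities, as written.

```python
def homa_ir(glucose_mg_dl: float, insulin_uU_ml: float) -> float:
    """HOMA-IR as defined in the text: glucose (mg/dL) x insulin (uU/mL) / 405."""
    return glucose_mg_dl * insulin_uU_ml / 405.0


def has_type2_diabetes(fasting_glucose_mg_dl: float, on_diabetes_medication: bool) -> bool:
    """Fasting serum glucose >126 mg/dL or current use of diabetes medication."""
    return fasting_glucose_mg_dl > 126.0 or on_diabetes_medication


def is_hypertensive(systolic_mm_hg: float, diastolic_mm_hg: float,
                    on_hypertension_medication: bool) -> bool:
    """Systolic BP >140 mm Hg, diastolic BP >90 mm Hg, or current medication."""
    return (systolic_mm_hg > 140.0
            or diastolic_mm_hg > 90.0
            or on_hypertension_medication)


if __name__ == "__main__":
    print(homa_ir(100.0, 12.0))
    print(has_type2_diabetes(131.0, False), is_hypertensive(128.0, 84.0, False))
```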
DNA was extracted from whole blood as previously reported42. In the discovery sample, genotyping was attempted on 3,298 DNA samples (including 3,194 participants, 34 duplicates and 70 positive controls) across 909,622 probes using a Genome-Wide Human SNP Array 6.0 (Affymetrix, California, USA). Genotyping of the discovery samples was performed on 96 well plates, each plate containing two reference samples: 1) REF103 provided by Affymetrix, and 2) a Coriell DNA sample, NA15510, and a negative control. A duplicate sample from the same plate was introduced in each plate with blinded IDs for the laboratory personnel. The samples were not randomized and were processed in the order collected in the field. Laboratory personnel were blind to the sample phenotypes.
Extensive quality control was conducted based on a pipeline developed by Laurie et al.50 including assessment of probe and sample quality (probes and samples excluded with missingness rates >5%), sex validation, investigation of genotyping batch effects, assessment of cryptic relatedness and population substructure, and duplicate sample and duplicate probe discordance. Of the 3,194 samples attempted for genotyping, 4 were dropped due to high genotyping missingness, 3 due to discrepancy between reported and apparent genetic gender, 7 due to apparent sex chromosome aneuploidy, 9 due to chromosomal abnormalities such as deletions and duplications, 2 due to apparent sample admixture, and 50 due to poor cluster resolution across the genome. After quality control, 3,119 samples genotyped for 895,103 unique markers were available to conduct genome-wide association studies. An additional 25 participants were excluded due to self-reported pregnancy and 3 because each is one of a pair of monozygotic twins. There were 19 participants missing BMI. Complete phenotype and genotype data were available for up to 3,072 participants.
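The sample accounting in the preceding paragraph can be re-derived as a quick consistency check. The sketch below only transcribes the counts given in the text; it assumes the exclusion categories are mutually exclusive, which is consistent with the reported totals. The variable and function names are hypothetical.

```python
# Counts transcribed from the quality-control paragraph above.
ATTEMPTED_PARTICIPANTS = 3194

GENOTYPE_QC_EXCLUSIONS = {
    "high genotyping missingness": 4,
    "reported vs. apparent genetic sex discrepancy": 3,
    "apparent sex chromosome aneuploidy": 7,
    "chromosomal abnormalities": 9,
    "apparent sample admixture": 2,
    "poor cluster resolution": 50,
}

PHENOTYPE_EXCLUSIONS = {
    "self-reported pregnancy": 25,
    "one of a pair of monozygotic twins": 3,
    "missing BMI": 19,
}


def remaining(start: int, exclusions: dict) -> int:
    """Subtract every exclusion count from the starting sample size."""
    return start - sum(exclusions.values())


after_genotype_qc = remaining(ATTEMPTED_PARTICIPANTS, GENOTYPE_QC_EXCLUSIONS)
analysis_sample = remaining(after_genotype_qc, PHENOTYPE_EXCLUSIONS)

if __name__ == "__main__":
    print(after_genotype_qc, analysis_sample)
```

The two totals agree with the 3,119 genotyped samples and the 3,072 participants with complete data reported above.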
To test for possible overlap between the samples from our three studies, 116 single-nucleotide polymorphisms (SNPs) genotyped in common across all our samples were used. These 116 SNPs, including rs12513649, were chosen based on their association signals for a whole suite of traits in the discovery sample. At loci with multiple significant SNPs, the peak SNP was chosen as representative of that locus. At loci (defined as 1 Mbp windows) with different peak SNPs for different phenotypes, the SNP with the smallest P value among the associated phenotypes was genotyped as representative of that locus. These SNPs spanned all autosomal chromosomes and the X chromosome, and were at least 1 Mb away from each other and not in linkage disequilibrium with each other (r²<0.3 for all but one pair of adjacent markers; r²=0.73 between rs4932738 and rs7252689 on chromosome 19). Genotyping of variants selected for validation (described below) in the replication sample was performed using custom-designed TaqMan® OpenArray Real-Time PCR assays (Applied Biosystems). SNPs that could not be genotyped using OpenArray assays, including rs12513649, were genotyped individually using TaqMan® SNP Genotyping assays (Applied Biosystems). For replication genotyping, in each 384 well plate (n=8), 4 duplicates from the same plate with blinded ID were included; each plate also contained 8 negative controls and 8 Coriell samples (NA15510). The quality of genotype clustering for each SNP was verified and corrected manually. Eight subjects could not be genotyped due to technical difficulties.
During quality control, significant relatedness was observed among the discovery sample participants, so empirical kinship coefficients were estimated using the genotyped markers, in two iterations. In the first iteration, 10,000 independent autosomal markers were selected using PLINK51 to generate empirical kinship coefficients using GenABEL52. Individuals with kinship coefficients less than 0.0625 (first-cousin) were considered unrelated. A maximal set of 1,891 unrelated individuals was determined using previously published methods53. In the second iteration, the kinship matrix between all participants was estimated using a new set of 10,000 independent autosomal markers that had been selected using the set of unrelated individuals.
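The idea of extracting a set of mutually unrelated individuals from pairwise kinship coefficients can be illustrated as follows. This sketch is not the previously published method cited above; it substitutes a simple greedy heuristic that repeatedly drops the individual with the most remaining relatives, and it is not guaranteed to return a maximal set. The function name and the input layout (a dictionary keyed by pairs of identifiers) are assumptions made for illustration.

```python
def greedy_unrelated_subset(ids, kinship, threshold=0.0625):
    """Return a set of individuals with no pair at or above the threshold.

    ids: iterable of sample identifiers.
    kinship: dict mapping (id_a, id_b) pairs to kinship coefficients;
             pairs that are absent are treated as unrelated (assumption).
    threshold: pairs with kinship below this value count as unrelated.
    """
    related = {i: set() for i in ids}
    for (a, b), phi in kinship.items():
        if a != b and phi >= threshold:
            related[a].add(b)
            related[b].add(a)

    kept = set(related)
    while kept:
        # Sorting makes tie-breaking deterministic across runs.
        worst = max(sorted(kept), key=lambda i: len(related[i] & kept))
        if not related[worst] & kept:
            break  # no related pairs remain among the kept individuals
        kept.remove(worst)
    return kept


if __name__ == "__main__":
    pairs = {("a", "b"): 0.25, ("b", "c"): 0.125, ("c", "d"): 0.01}
    print(sorted(greedy_unrelated_subset(["a", "b", "c", "d"], pairs)))
```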
The genetic ancestry of our discovery sample, where every individual self-reported having four Samoan grandparents, was assayed via principal components analyses using PCAiR54. We conducted two principal components analyses. Firstly, to examine the relationship of the Samoans against other continental populations, we compared the genotypes of a randomly chosen subset of 250 Samoans against genotypes from individuals comprising HapMap Phase 3. Genotype management was performed using PLINK51. HapMap Phase 3 genotypes55 were merged with the genotypes from the Samoan discovery sample. SNPs with a minor allele frequency <0.05, with a missingness rate >0.1, and located within regions problematic for the calculation of principal components analysis (the major histocompatibility locus on 6p21, the region near LCT on 2q21, and common inversion regions on 8p23 and 17q21) were dropped. Markers were further pruned down to every fourth marker. The PC-AiR algorithm was applied to the remaining 111,438 markers: the PCs were estimated in the unrelated subjects as determined by the KING-robust kinship coefficient estimator56 and extended to relatives in the dataset based on their genetic similarity. The first three principal components from this analysis are shown in
BMI was log-transformed to approximate normality. Residuals were generated by linear regression against age, age², sex and the interactions between age and sex. Association between autosomal marker genotypes and the BMI residuals was tested while using the empirical kinship matrix to adjust for subject relatedness. Note that population substructure is accounted for in our analyses by inclusion of the empirical kinship model in the analysis models, because, as Hofmann59 states “explicitly modeling the pairwise relatedness between all individuals captures both population structure and kinship”. The tests were conducted as score tests using the mmscore function in GenABEL60. The statistics between X chromosome genotypes and BMI residuals were calculated in GenABEL without adjusting for the empirical kinship estimates. Following analysis, 230,554 SNPs with a minor allele frequency <0.01 (including 23,612 monomorphic SNPs) and then 4,093 SNPs with HWE test p values <0.00005 were filtered out, resulting in 659,492 autosomal and X-linked SNPs used for analyses. Inflation due to population stratification and cryptic relatedness was assessed by estimating λGC using the lower 90% of the p value distribution61.
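For orientation, a genomic-control inflation factor can be computed from a list of association p values as sketched below. Note that this sketch uses the common median-based estimator, which is a substitution: the analysis described above estimated λGC from the lower 90% of the p value distribution, and that procedure is not reproduced here. The function name is hypothetical, and all p values are assumed to lie strictly between 0 and 1 or equal 1.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a 1-df chi-square distribution (about 0.455), obtained from the
# 75th percentile of the standard normal because chi-square(1) = Z**2.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def lambda_gc_median(p_values):
    """Median-based genomic inflation factor for 1-df test p values."""
    # Convert each two-sided p value back to a 1-df chi-square statistic.
    chi2_stats = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Evenly spaced p values mimic a well-calibrated null distribution.
    null_like = [(i + 0.5) / 1000 for i in range(1000)]
    print(round(lambda_gc_median(null_like), 3))
```

A value near 1 suggests little inflation; values well above 1 suggest residual stratification or cryptic relatedness.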
Genome-wide significance for GWAS p values (pG) was set at pG<5×10⁻⁸. Suggestive association was set at pG<10⁻⁵. Statistical power to detect signals at these thresholds was calculated using the Genetic Power Calculator (Purcell, S., Cherny, S. S. & Sham, P. C. Genetic Power Calculator: design of linkage and association genetic mapping studies of complex traits. Bioinformatics 19, 149-150 (2003)). Probe intensity plots for each significantly and suggestively associated SNP were examined for genotype-calling errors.
SNPs with pG<10⁻⁵ were chosen for association validation in the additional Samoan participants from the replication sample. At loci (defined as a 1 Mbp window) with multiple SNPs more significant than this threshold, the peak SNP was chosen as representative of that locus.
The replication sample was divided into three groups for analysis: the 1990-1995 study participants, the 2002-03 family study adults (age ≥18), and the 2002-03 family study children (age <18). For the purposes of the meta-analyses, the studies were not further subdivided by nation; doing so would have broken up pedigrees in the family study that span both nations. For consistency, the 1990-1995 study was therefore not subdivided by nation either. All samples, including those from the discovery sample, were examined using the 116 SNPs typed in common across all samples (validation genotypes) for genetic identity that might have arisen through recruitment into multiple studies over the two decades that they span. One sample of each pair that had an estimated identity-by-descent >0.9 as estimated in PLINK was removed from analysis. For the participants, both adults and children, from the 2002-03 family study, kinship coefficients were calculated from the recorded pedigrees using the kinship2 package62 in R63. Replication association analyses were performed using GenABEL52 for each group, using the kinship coefficients to adjust for relatedness in the family sample. There were insufficient marker data to infer relatedness and adjust for it in the 1990-1995 study, so these participants were treated as unrelated. The same covariates used in the GWAS analysis were used in the replication regression models, with an additional variable indicating whether subjects were from Samoa or American Samoa. Prior to meta-analysis, quality control of the summary statistics was performed using EasyQC64 to check for strand and allele frequency consistency. Meta-analysis was performed using METAL65 to generate two replication p values: one for the replication sample and one for the replication sample and discovery sample together (Table 2). The p-value-based method was used with sample sizes as weights with genomic control correction turned off. Heterogeneity across all the cohorts was assessed by calculating both Cochran's Q and the I² statistic66-68.
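The p-value-based, sample-size-weighted combination and the heterogeneity statistics mentioned above can be sketched as follows. This is an illustration of the general technique (a weighted Z-score combination with weights equal to the square root of each sample size, plus the usual inverse-variance forms of Cochran's Q and I²); it is not METAL's code, and the function names and input layouts are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def weighted_z_meta(studies):
    """Combine per-study results given as (p_value, effect_sign, n) tuples.

    Each two-sided p value is converted to a signed Z score, weighted by
    sqrt(n), and combined; returns (z_meta, two_sided_p_meta).
    """
    numerator = 0.0
    sum_sq_weights = 0.0
    for p_value, effect_sign, n in studies:
        z = -_STD_NORMAL.inv_cdf(p_value / 2.0)
        if effect_sign < 0:
            z = -z
        weight = n ** 0.5
        numerator += weight * z
        sum_sq_weights += weight * weight
    z_meta = numerator / sum_sq_weights ** 0.5
    return z_meta, 2.0 * _STD_NORMAL.cdf(-abs(z_meta))


def cochran_q_i2(betas, standard_errors):
    """Cochran's Q and I^2 (as a fraction) from effect sizes and their SEs."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2


if __name__ == "__main__":
    print(weighted_z_meta([(1e-4, +1, 3072), (0.01, +1, 1020), (0.2, +1, 1083)]))
    print(cochran_q_i2([0.030, 0.041, 0.025], [0.006, 0.012, 0.011]))
```

The numbers in the usage example are made up for demonstration and are not results from the study.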
Before undertaking targeted sequencing, we first used SHAPEIT69-73 and IMPUTE274-76 to impute in our region of interest centered on rs12513649 using the December 2013 1000 Genomes Phase I integrated variant set release haplotype reference panel. It implicated only one strongly-associated variant (with a predicted allele frequency of 0.075), but when we genotyped it in a pilot sample, it turned out to be monomorphic (as it was in the subsequent targeted sequencing experiment described below). Based on this experience, as well as on what we would expect given the unique population history of the Samoans, we believe that the best way to do accurate imputation in the Samoans is by using a Samoan-specific reference panel. This concurs with recent recommendations for optimal fine-mapping in populations with unique ancestries not found in the cosmopolitan reference panel77. A panel of 444 of our Samoans from the discovery sample is currently being whole-genome sequenced by the NHLBI TOPMed Consortium.
A 1.5 Mbp segment (NC_000005.09:171583933_173083933) around rs12513649 was chosen for targeted sequencing by finding the boundaries of the linkage disequilibrium block containing rs12513649. This block was defined by multiallelic D′ lows (calculated using gPLINK) within 2 Mbp of rs12513649 and extended from rs1433019 to rs4868246. The targeted region was then extended from these points until it encompassed 1.5 Mbp. Sequencing was performed on 96 discovery sample participants optimally chosen using INFOSTIP78. The sample size of 96 was chosen due to fiscal constraints, and was estimated to recover 94% of the information had we been able to sequence everyone. Baits were derived using SureDesign (Agilent Technologies), with additional baits derived based on blat analysis. DNA libraries were prepared using SureSelect (Agilent Technologies) and sequenced using 100 bp paired-end runs on an Illumina HiSeq 2500 with the goal that at least 95% of the targeted region achieves a coverage depth of 20× or greater. Mean bait coverage was 81×. Samples were processed using BWA, GATK3 (QD<2.0, MQ<40.0, FS>60.0, MQRankSum<-12.5, ReadPosRankSum<-8.0), and HaplotypeCaller with hard cutoffs. This resulted in 99.6% concordance to VeraCode array calls, and 98.35% of single nucleotide variants were in dbSNP 138.
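The hard cutoffs listed above can be read as a set of per-variant failure conditions. The sketch below applies them to a dictionary of annotation values; it interprets each listed expression as a condition under which a variant is filtered out (an assumption, in line with common hard-filtering practice), and it treats a missing annotation as not failing (also an assumption). It does not call GATK, and the names are hypothetical.

```python
# Failure conditions transcribed from the thresholds in the text.
HARD_FILTERS = {
    "QD": lambda value: value < 2.0,
    "MQ": lambda value: value < 40.0,
    "FS": lambda value: value > 60.0,
    "MQRankSum": lambda value: value < -12.5,
    "ReadPosRankSum": lambda value: value < -8.0,
}


def failed_filters(annotations: dict) -> list:
    """Return the names of the hard filters that a variant fails.

    annotations maps annotation names (e.g. "QD") to numeric values.
    Annotations absent from the dictionary are skipped (assumption).
    """
    return [
        name
        for name, fails in HARD_FILTERS.items()
        if name in annotations and fails(annotations[name])
    ]


if __name__ == "__main__":
    variant = {"QD": 1.5, "MQ": 58.2, "FS": 3.1,
               "MQRankSum": -0.4, "ReadPosRankSum": 0.7}
    print(failed_filters(variant) or "PASS")
```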
DNA fragmentation was performed on 200 ng of genomic DNA using a Covaris E210 system, which shears DNA to fragments 150 to 200 bp in length with 3′ or 5′ overhangs. End repair was performed where 3′ to 5′ exonuclease activity of enzymes removes 3′ overhangs and the polymerase activity fills in the 5′ overhangs. An ‘A’ base is then added to the 3′ end of the blunt phosphorylated DNA fragments which prepares the DNA fragments for ligation to the sequencing adapters, which have a single ‘T’ base overhang at their 3′ end. Ligated fragments are subsequently size selected through purification using SPRI beads and undergo PCR amplification techniques to prepare the ‘libraries’. The Caliper LabChip GX is used for quality control of the libraries to ensure adequate concentration and appropriate fragment size.
Exon capture was done using the Agilent SureSelect Human All Exon Target Enrichment system, which results in ~51 Mb of targeted sequence capture per sample. Under standard procedures, biotinylated RNA oligonucleotides were hybridized with 500 ng of the library. Magnetic bead selection is used to capture the resulting RNA-DNA hybrids. The RNA is digested and the remaining captured DNA is PCR-amplified. Sample indexing is introduced at this step. The Agilent Bioanalyzer (HiSensitivity) is used for quality control of adequate fragment sizing and quantity of captured DNA.
DNA sequencing was performed on an Illumina® HiSeq 2500 instrument using standard protocols for a 100 bp paired-end run. Six samples were run per flowcell, guaranteeing >90-95% completeness at a minimum of 20× coverage.
Illumina HiSeq reads were processed through Illumina's Real-Time Analysis (RTA) software, generating base calls and corresponding base call quality scores. The resulting data were aligned to a reference genome with the Burrows-Wheeler Alignment (BWA) tool (Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754-1760 (2009)), resulting in a SAM/BAM file. Post-processing of the aligned data included local realignment around indels, base call quality score recalibration performed by the Genome Analysis Tool Kit (GATK) (McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20, 1297-1303 (2010)), and flagging of molecular/optical duplicates using software from the Picard program suite. Per-sample and multi-sample variant calling was performed by GATK HaplotypeCaller. Per-sample data quality metrics include (but are not limited to) transition/transversion ratios (ts/tv), percent in dbSNP, concordance and heterozygote sensitivity with previously generated genotyping data, capture specificity, and percent of targeted bases covered at ≥20×.
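Of the quality metrics named above, the transition/transversion ratio is simple enough to illustrate directly. The sketch below counts transitions (purine to purine or pyrimidine to pyrimidine changes) and transversions (all other single-base changes) over a list of single nucleotide variants. The function names and the input format are assumptions for illustration and do not describe the pipeline's own tooling.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True if ref -> alt stays within purines or within pyrimidines."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snvs):
    """Compute ts/tv over an iterable of (ref, alt) single-base pairs."""
    ts = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref == alt:
            continue  # not a variant; ignore
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ts / tv


if __name__ == "__main__":
    # Two transitions (A>G, C>T) and one transversion (A>C).
    print(ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))
```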
The targeted sequencing sample was prephased using SHAPEIT69-73, and then imputed into our discovery sample using IMPUTE274-76. Association testing was carried out using ProbABEL79, adjusting for relatedness with the empirical kinship matrix generated by GenABEL. Three variants had nearly equivalent P values (rs12513649, rs150207780, rs373863828) due to nearly perfect linkage disequilibrium between them (r2>0.988); rs150207780 and rs373863828 were imputed very well (IMPUTE2 info metric=0.954 for both variants). To determine which of these variants might be the most likely causal candidate, we tested for association in the targeted sequencing region conditioned on each of these variants as well as the next most significant variant (rs3095870; info metric=0.957), using ProbABEL and adjusting for relatedness. As expected for variants in such high LD, the signals in the region were eliminated after conditioning (
For fine-mapping using the imputed variants, 160 variants were selected with minor allele frequency >0.05 on either side of the missense variant rs373863828. These 321 SNPs spanned from 172368674 to 172670745 on chromosome 5, including from the GWAS variant rs12513649 on the left to the variants with significant P values near NKX2-5 on the right (
Confirmatory genotyping.
Genotyping was attempted for both rs150207780 and rs373863828 using TaqMan® in all discovery and replication sample participants. The assay for rs150207780 failed; genotyping was not reattempted because it showed no residual association signal in the analyses of the imputed data conditioned on missense variant rs373863828 (
rs373863828 genotypes were examined for association with the additional adiposity-related phenotypes listed in Table 4. Association was assessed in both the discovery sample (Table 4 and Table 5a) and in a mega-analysis of the replication sample adults (Table 5b). While meta-analysis of properly transformed phenotypes generates more accurate P values (as we did in Table 2), we chose instead here to carry out mega-analyses because we are primarily interested in estimating the effect sizes on the traits' natural scales. Sex-stratified analyses were also conducted in both samples (Table 5). Diabetics were excluded from analyses of glucose, insulin and HOMA-IR. Since the distributions of leptin varied greatly between women and men, each sex was analyzed separately for this trait. Residuals for quantitative traits were generated using linear regression; for qualitative traits logistic regression was used. Age, age², sex and the interactions between age and sex and age² and sex were included in all models initially. For glucose, insulin, HOMA-IR, adiponectin, leptin, and diabetes status, second sets of residuals were generated including log-transformed BMI as a covariate. Sex and age-sex interactions were not included in the sex-stratified models. In the replication mega-analysis models, polity (Samoa or American Samoa) and cohort (1990s or 2000s) were included in the models initially as well. Stepwise regression was used to reduce the number of covariates for each trait separately. For quantitative traits, residuals were tested for association using the mmscore function of GenABEL52, adjusted for the empirical kinship matrix as above. Dichotomous traits were analyzed using the palogist function of ProbABEL79 while adjusting for covariates and empirical kinship. A Bonferroni-corrected P value threshold of 0.0033 was used to assess significance; this is conservative as it adjusts for 23 tests even though some of the traits are correlated with each other.
To assess a possible survivor effect as the cause of the association between the BMI-increasing allele and decreased fasting glucose levels and risk of diabetes, we conducted linear regression of age by genotype. In the discovery sample, regarding the association of rs373863828 with BMI, fasting glucose, fasting insulin, obesity risk, and diabetes risk, the addition of the first 10 ‘local’ principal components from Supplementary
For human gene expression analysis, a Human Normal cDNA Array was obtained from Origene (Cat#HMRT103 and HBRT101). The human standard curve was prepared from Control Human Total RNA (ThermoFisher Scientific, 4307281). For mouse gene expression analysis, mouse tissues were collected between 8-10 am from littermate-matched, ad lib-fed, male C57BL/6J mice at 10 weeks of age (n=6/group). The mouse standard curve was prepared from pooled kidney RNA from the above mice. mRNA was prepared using the RNeasy Lipid Tissue Mini Kit with on-column DNase treatment (Qiagen) followed by reverse transcription to cDNA using qScript cDNA Supermix (Quanta Biosciences). Gene expression was determined by qPCR (Quanta PerfeCTa SYBR Green FastMix or PerfeCTa qPCR FastMix) using an Eppendorf Realplex System. Human CREBRF was amplified using the following CREBRF-specific primers: forward 5′ ATGTATGAACTGGATAGAGAGATG, reverse 5′ GTTAGGTCTTCACAGTATGTATCC. Mouse Crebrf was amplified using a Crebrf-specific primer-probe set (ThermoFisher Scientific, Cat# Mm00661539_m1). CREBRF expression was normalized to species-specific peptidylprolyl isomerase A/Cyclophilin A as the endogenous control gene (ThermoFisher Scientific 4333763T and Mm02342430_g1 for human and mouse, respectively). Mouse data are expressed as mean plus s.e.m. Data are relative expression values, and so randomization, blinding, and statistical comparisons were not indicated. Gene expression analysis was performed in accordance with Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines. Animal experiments were approved by the University of Pittsburgh Institutional Animal Care and Use Committee and conducted in conformity with the Public Health Service Policy for Care and Use of Laboratory Animals. Human samples from Origene Technologies conform to Federal Policies for protection of human subjects (45 CFR 46) and are HIPAA compliant.
Additional information and documentation can be obtained by contacting the company.
Expression plasmids with the eGFP and human CREBRF (NM_153607.2) open reading frames were obtained from GeneCopoeia (EX-EGFP-M10, EX-E3374-M10; Rockville, Md., USA). The backbone vector was pReceiver-M10, which had a cytomegalovirus promoter and a carboxy-terminal Myc-(His)6 tag. A rare missense variant, c.1447A>G, p.Thr483Ala (rs17854147), affecting a conserved residue was present in the CREBRF open reading frame and was predicted to be a loss-of-function variant. To avoid using this potentially function-altering variant, the variant sequence was converted to wild-type CREBRF, and the BMI risk allele, c.1370G>A, p.Arg457Gln (rs373863828), was introduced using PCR mutagenesis. The segments obtained by PCR in each plasmid were verified by sequencing before large-scale plasmid purification for transfection.
The mouse embryonic fibroblast cell line 3T3-L1 was obtained from ATCC (Manassas, Va., USA). No genetic authentication has been performed. However, the phenotype of the cells is consistent with previous publications. Cells were maintained in Dulbecco's modified Eagle's medium (DMEM; Gibco, Grand Island, N.Y.) supplemented with 10% newborn calf serum (NCS; Sigma, St. Louis, Mo.), 100 units/mL penicillin and 100 μg/mL streptomycin (Sigma), 3.7 g/L NaHCO3, and 4.77 g/L HEPES in a humidified incubator at 37° C. with 5% CO2. 3T3-L1 preadipocytes were transfected in triplicate with plasmids containing eGFP-only negative control, wild-type human CREBRF, or the p.Arg457Gln variant using Lipofectamine 2000 (ThermoFisher Scientific, Waltham, Mass.). Transfected cells were kept under selection with 500 μg/mL Geneticin (G418, ThermoFisher Scientific) for 3 weeks to generate stable cell lines. Mycoplasma testing was performed by PCR and DAPI staining. All cells used in this study tested negative.
The differentiation of 3T3-L1 cells to adipocytes was carried out as described previously82. Differentiation was induced 2 days post confluence with a differentiation cocktail including 3-isobutyl-1-methylxanthine (IBMX, 0.5 mM; Sigma), dexamethasone (0.25 μM; Sigma), and human insulin (1 μg/mL; Sigma) in basic media with 10% fetal bovine serum (FBS). After 2 days, the media was replaced with maintenance media with 10% FBS and 1 μg/mL human insulin. After a further 2 days, the maintenance media was replaced with growth media containing 10% FBS, 100 units/mL penicillin and 100 μg/mL streptomycin (Sigma), which was changed every other day for up to 10 days. Geneticin (500 μg/mL) selection was maintained throughout the differentiation protocol for stably transfected cells.
Oil Red O plate assay.
Oil Red O staining has been established as a useful tool to measure intracellular triglyceride accumulation83, a quantitative measure of adipocyte differentiation. Cells were seeded in 96-well cell culture plates at 10,000 cells/well with 8 technical replicates. At endpoints of interest, cells were fixed with 4% paraformaldehyde for 15 min. The stock solution was a 0.3% Oil Red O solution prepared from Oil Red O solution purchased from Sigma (O1391). The working solution contained stock solution and water at a ratio of 24:16 v/v. After fixation, cells were rinsed with PBS and incubated with Oil Red O working solution for 15 min (30 μL per well). Cells were washed three times with PBS to remove residual Oil Red O solution. Then, 100 μL isopropanol was added to each well to elute the dye, and the absorbance was measured at 560 nm. Wells containing media only served as blanks. Blank values were subtracted from experimental samples. Cells in a parallel plate were lysed using CelLytic M (Sigma) and the protein concentration was measured using the Bradford assay84 (Bio-Rad, Hercules, Calif.). Absorbance data were normalized to protein concentration and expressed in OD560/μg units.
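The blank subtraction and protein normalization described above amount to a short calculation. The following sketch assumes that the blank is the mean of the blank-well readings and that protein is supplied in μg; the function name and the example numbers are hypothetical.

```python
from statistics import mean


def normalized_od560(sample_od, blank_ods, protein_ug):
    """Blank-subtract an OD560 reading and normalize it to protein (OD560/ug).

    sample_od  : absorbance of the stained well at 560 nm
    blank_ods  : absorbances of blank wells (averaged; an assumption)
    protein_ug : protein in the parallel lysate, in micrograms
    """
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    return (sample_od - mean(blank_ods)) / protein_ug


if __name__ == "__main__":
    # Illustrative values only, not measurements from the study.
    print(normalized_od560(0.50, [0.09, 0.11], 20.0))
```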
To visualize lipid accumulation, cells were cultured on coverslips. Eight days after confluence the media was removed and the cells were washed twice with PBS. Fixation in 4% paraformaldehyde for 10 minutes at room temperature was followed by staining with Oil Red O working solution for 30 minutes at room temperature. The Oil Red O solution was aspirated and the cells were rinsed 6 times in distilled water. The cells were counterstained with hematoxylin for 5 minutes at room temperature followed by rinsing 6 times with distilled water. The coverslips were mounted with glycerol-gelatin media and images were captured using a DM5000 (Leica Microsystems, Buffalo Grove, Ill.) photomicroscope.
Cells were harvested 8 days after confluence and the PicoProbe Triglyceride Quantification Assay Kit (Abcam, ab178780) was used to measure the level of triglycerides in the cell lysate. The triglyceride level (pmol) was normalized to the amount of protein measured by the Bradford method84 in each lysate sample.
Oxygen consumption rate (OCR), a measure of mitochondrial respiration, and extracellular acidification rate (ECAR), a measure of glycolysis, were determined using an XF96 extracellular flux analyzer (Seahorse Bioscience, North Billerica, Mass.). Transfected 3T3-L1 cells were seeded in a 96-well XF96 cell culture microplate (Seahorse Bioscience) at a density of 7000 cells per well in 200 μL DMEM (4.5 g/L glucose) supplemented with 10% FBS (Sigma) 36 hours before the measurement. Six replicates per cell type were included in the experiments and four wells were chosen evenly in the plate to correct for temperature variation. On the day of assay, the growth media was changed with assay media (unbuffered DMEM with 4.5 g/L glucose). Oligomycin at a final concentration of 2.0 μM, FCCP (carbonyl cyanide-p-trifluoromethoxyphenylhydrazone) at 1.0 μM, 2-deoxyglucose at 100 mM and rotenone at 15.0 μM were sequentially injected into each well in accordance with the manufacturer's protocol. Basal mitochondrial respiration, maximal respiration, ATP production and basal glycolysis were determined according to the manufacturer's instructions. At the conclusion of the assay cells in the analysis plate were lysed using CelLytic M (Sigma), the protein concentration was measured using the Bradford assay59 (Bio-Rad, Hercules, Calif.) and used to normalize the bioenergetic profile data.
Total RNA was harvested using an RNeasy Mini Kit (Qiagen) and cDNA was generated using SuperScript III Reverse Transcriptase (ThermoFisher Scientific). Quantitative RT-PCR analysis used SYBR Green PCR Master Mix (BioRad) with primers for human CREBRF (5′-GAAGACCTGAAGGAGGTGACT and 5′-GTTCCACTCAGATGGTCTCAGC), mouse Crebrf (5′-GAGGACTTGAAGGAGATGACG and 5′-CAGAAGGCCTCAGAATCCTC), mouse Pparg2 (5′-CCAGAGCATGGTGCCTTCGCT and 5′-CAGCAACCATTGGGTCAG), mouse Cebpa (5′-CAAGAACAGCAACGAGTACCG and 5′-GTCACTGGTCAACTCCAGCAC), and mouse beta actin (Actb, 5′-CCACTGCCGCATCCTCTTCC and 5′-CTCGTTGCCAATAGTGATGACCTG). Samples were run on a QuantStudio 12 Flex Real Time PCR System (ThermoFisher Scientific). The efficiency of the qPCR assays was determined using a template dilution series and was found to be ≥0.9. The results were analyzed using ExpressionSuite Software v1.0.4, using either the ΔΔCt method85 or the 2^(e×ΔCt) value, where e is the PCR efficiency and ΔCt is the threshold cycle difference between the target gene and beta actin (Actb) as a reference gene.
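For orientation, the ΔΔCt method named above can be written out as a short calculation. This sketch uses the standard form of the method, which assumes roughly 100% amplification efficiency for both the target and the reference gene; it does not reproduce the efficiency-weighted variant also mentioned in the text. Function names and Ct values are hypothetical.

```python
def delta_ct(ct_target, ct_reference):
    """Threshold-cycle difference between target and reference gene."""
    return ct_target - ct_reference


def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression of the sample versus the control: 2^(-ddCt)."""
    ddct = (delta_ct(ct_target_sample, ct_ref_sample)
            - delta_ct(ct_target_control, ct_ref_control))
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Illustrative Ct values: the target crosses threshold two cycles
    # earlier in the sample than in the control at equal reference Ct.
    print(fold_change_ddct(24.0, 18.0, 26.0, 18.0))
```

A two-cycle earlier threshold crossing corresponds to a fourfold higher relative expression under the 100% efficiency assumption.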
3T3-L1 preadipocytes were subjected to starvation for 2 hours, 4 hours, 12 hours, and 24 hours by culturing cells in Hank's Balanced Salt Solution (HBSS; Gibco, Grand Island, N.Y.). To investigate the response to refeeding starving cells, cells undergoing 12 hours starvation were fed with fresh growth medium for an additional 12 hours (“24 hR” in
For cell studies, adequate sample sizes were determined based on publications using similar methods and pilot experiments. No blinding was done. Each experiment was performed twice with similar results unless otherwise stated in the figure legends. The data were initially evaluated by one-way ANOVA implemented in SPSS (IBM, Armonk, N.Y.). Homogeneity of variances was examined using Levene's test. Two-sided Bonferroni and Games-Howell post hoc tests were used to compare data with equal and unequal variance, respectively. Alternatively, pairwise t-tests were used. A P value less than 0.05 was considered statistically significant. SPSS analyses were verified using the same tests implemented in the statistical programming language R63 (R Foundation, Vienna, Austria).
Selection analyses.
Based on the genome-wide Affymetrix 6.0 SNP genotype data, we used Primus86,87 to select 626 individuals from the discovery sample using a kinship threshold (0.039) halfway between first and second cousins, so that first cousins and more closely related relatives were excluded. These ‘unrelated’ individuals were then haplotyped using SHAPEIT69-73, and were annotated with ancestral allele information using the selectionTools pipeline88. Haplotype bifurcation diagrams and extended haplotype homozygosity (EHH) plots were drawn using the ‘rehh’ R package89. The haplotype bifurcation diagram90 visualizes the breakdown of linkage disequilibrium as one moves away from the core allele at the focal SNP; each branch reflects the creation of new haplotypes, and the thickness of the line reflects the number of samples with the haplotype. EHH represents the probability that two randomly chosen chromosomes are identical by descent from the focal SNP to the current position of interest90. Selection at the core allele is expected to result in EHH values close to 1 in an extended region centered on the focal SNP. To measure the deviation, we used selscan91 to compute the integrated haplotype score (iHS)92, which is defined as the log of the ratio of the integrated EHH for the derived allele over the integrated EHH for the ancestral allele. These values are then normalized in frequency bins across the whole genome (25 bins were used). Note that selscan's definition of the iHS differs from earlier definitions where the ancestral allele was in the numerator of the ratio91,92. In our case, a large positive iHS indicates that a derived allele has had its frequency increase due to selection. We computed an approximate two-sided P value under the assumption that after normalization the iHS is approximately distributed as a standard normal. We also used selscan to compute nSL (number of segregating sites by length) scores93.
The nSL is similar to the iHS, but instead of integrating over genetic distance, the nSL uses the number of segregating sites as a measure of ‘distance’. Thus the nSL is more robust to demographic assumptions than the iHS as it does not depend on a genetic map. As with the iHS, we normalized the nSL scores within 25 frequency bins across the whole genome, and computed approximate two-sided P values assuming a standard normal distribution. The selscan program was run using its assumed default values. As we are focused on testing whether there is positive selection at the missense variant, we did not adjust the P values for multiple testing.
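The scoring steps described above (log ratio of integrated EHH, normalization within allele-frequency bins, and a two-sided P value under a standard normal) can be sketched as follows. This is an illustration of the arithmetic only and is not selscan's implementation: the equal-width binning scheme, the use of the sample standard deviation, and all function names are assumptions.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def unstandardized_ihs(ihh_derived, ihh_ancestral):
    """Log ratio with the derived allele in the numerator, as in the text."""
    return math.log(ihh_derived / ihh_ancestral)


def standardize_in_bins(scores, derived_freqs, n_bins=25):
    """Z-score each value against the other values in its frequency bin.

    Bins are equal-width on [0, 1] (an assumption). Values in bins with
    fewer than two entries or zero variance are returned as NaN.
    """
    bin_of = [min(int(f * n_bins), n_bins - 1) for f in derived_freqs]
    members = defaultdict(list)
    for b, s in zip(bin_of, scores):
        members[b].append(s)
    out = []
    for b, s in zip(bin_of, scores):
        vals = members[b]
        sd = stdev(vals) if len(vals) > 1 else 0.0
        out.append((s - mean(vals)) / sd if sd > 0 else float("nan"))
    return out


def two_sided_p(z):
    """Two-sided P value for z under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    print(unstandardized_ihs(2.0, 1.0))  # positive: derived haplotypes longer
    print(two_sided_p(1.96))
```

The same normalization and P-value steps apply to nSL scores, since the text describes them as normalized within 25 frequency bins in the same way.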
Samples were prepared from 3T3-L1 cells stably overexpressing wild type or the p.Arg457Gln variant of human CREBRF with a carboxy-terminal Myc-His tag using a Pierce Agarose ChIP Kit (Thermo Scientific, #26156). An anti-cMyc antibody (ThermoFisher MA1-980) was used for immunoprecipitation according to the instructions of the manufacturer. As targets, we selected orthologs of fruit fly genes that had been demonstrated to be up- or down-regulated with 6 h rapamycin treatment in wildtype but not regulated in REPTOR mutant fruit fly larvae (Tiebe, M. et al. REPTOR and REPTOR-BP Regulate Organismal Metabolism and Transcription Downstream of TORC1. Dev Cell 33, 272-284 (2015)). CREBRF is the human ortholog of REPTOR. Immunoprecipitated chromatin was subjected to quantitative PCR analysis using SYBR Green quantification and primer sets designed to amplify the most likely promoter or upstream regulatory sequences of target genes as indicated by evolutionary conservation and ENCODE data (Yue et al. 2014). A 5% aliquot of each chromatin immunoprecipitation sample was used as the input control to calculate % enrichment.
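The text does not give the formula used to turn the 5% input control into a % enrichment value. One common convention is the percent-of-input calculation sketched below, which is offered here as an assumption rather than as the method actually used. It adjusts the input Ct for the dilution (a 5% input is a 20-fold dilution) and assumes 100% amplification efficiency. Function and parameter names are hypothetical.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.05):
    """Percent of input recovered in the immunoprecipitate.

    ct_ip          : Ct of the immunoprecipitated sample
    ct_input       : Ct of the input aliquot
    input_fraction : fraction of chromatin set aside as input (5% here)
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Shift the input Ct to what 100% of the chromatin would have given.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


if __name__ == "__main__":
    # Illustrative Ct values only.
    print(percent_input(ct_ip=27.0, ct_input=25.0))
```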
Generating transgenic mice involves five basic steps: purification of a transgenic construct, harvesting donor zygotes, microinjection of transgenic construct, implantation of microinjected zygotes into the pseudo-pregnant recipient mice, and genotyping and analysis of transgene expression in founder mice. Methods for the generation of transgenic mice are known in the art and described, for example, by Cho et al., Curr Protoc Cell Biol. 2009 March; CHAPTER: Unit-19.11, which is incorporated herein in its entirety.
An expression vector, such as an expression vector encoding CREBRF or an expression vector encoding a CREBRF variant (e.g., Arg457Gln), is generated using standard methods known in the art. Construction of transgenes can be accomplished using any suitable genetic engineering technique, such as those described in Ausubel et al. (Current Protocols in Molecular Biology, John Wiley & Sons, New York, 2000). In one embodiment, the transgene is generated using CRISPR/Cas9 technology. Many techniques of transgene construction and of expression constructs for transfection or transformation in general are known and may be used to generate the desired CREBRF expressing construct.
One skilled in the art will appreciate that a promoter is chosen that directs expression of the CREBRF gene in all tissues or in a preferred tissue. In particular embodiments, CREBRF expression is driven by a phosphoglycerate kinase 1 (PGK1) promoter (Qin et al. (2010) PLoS ONE 5(5): e10611. doi:10.1371/journal.pone.0010611), the spleen focus-forming virus (SFFV) promoter (Gonzalez-Murillo et al., Hum. Gene Ther. 2010 May; 21(5):623-30), using knock-in technology (Cohen-Tannoudji et al., Mol Hum Reprod 4:929-938, 1998; Rossant et al., Nat Med 1:592-594, 1995), a tet-off promoter (Clontech), or a human EF1α, CMV or endogenous CRBN promoter. The modular nature of transcriptional regulatory elements and the absence of position-dependence of the function of some regulatory elements, such as enhancers, make modifications such as, for example, rearrangements, deletions of some elements or extraneous sequences, and insertion of heterologous elements possible. Numerous techniques are available for dissecting the regulatory elements of genes to determine their location and function. Such information can be used to direct modification of the elements, if desired. Preferably, an intact region that includes all of the transcriptional regulatory elements of a gene is used.
Following its construction, the transgene construct is amplified by transforming bacterial cells using standard techniques. Plasmid DNA is then purified and treated to remove endogenous bacterial sequences. A fragment suitable for expression of a transgenic CREBRF under the control of a suitable promoter, such as an endogenous murine CREBRF promoter, and optionally additional regulatory elements is purified (e.g., by a sucrose gradient or a gel-purification method) in preparation for microinjection.
Foreign DNA is transferred into a mouse zygote by microinjection into the pronucleus. A fragment of the transgene DNA isolated above is microinjected into the male pronuclei of fertilized mouse eggs derived from, for example, a C57BL/6 or C3B6 F1 strain, using the techniques described in Gordon et al. (Proc. Natl. Acad. Sci. USA 77:7380, 1980). The eggs are transplanted into pseudopregnant female mice for full-term gestation, and resultant litters are analysed to identify transgenic mice.
In other embodiments, the knock-in of a mutant allele in the mouse genome can be achieved using homologous recombination (HR) in embryonic stem (ES) cells (Thomas and Capecchi 1987), similar to the methods used to generate conditional knockout mice. Specific mutations can be introduced into endogenous genes and transmitted throughout the mouse germ-line. A DNA construct containing the engineered gene of interest (e.g., a mutated oncogene) is flanked by sequences identical to those in the target locus and introduced into ES cells, where homologous sequences align and recombine, thereby introducing the altered gene into an endogenous locus. This technology allows for the expression of mutant genes from their endogenous promoter, or another promoter of interest, and avoids issues of variability and founder effects that are frequently observed with randomly integrated transgenes.
The practice of the present invention employs, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are well within the purview of the skilled artisan. Such techniques are explained fully in the literature, such as, “Molecular Cloning: A Laboratory Manual”, second edition (Sambrook, 1989); “Oligonucleotide Synthesis” (Gait, 1984); “Animal Cell Culture” (Freshney, 1987); “Methods in Enzymology”; “Handbook of Experimental Immunology” (Weir, 1996); “Gene Transfer Vectors for Mammalian Cells” (Miller and Calos, 1987); “Current Protocols in Molecular Biology” (Ausubel, 1987); “PCR: The Polymerase Chain Reaction”, (Mullis, 1994); “Current Protocols in Immunology” (Coligan, 1991). These techniques are applicable to the production of the polynucleotides and polypeptides of the invention, and, as such, may be considered in making and practicing the invention. Particularly useful techniques for particular embodiments will be discussed in the sections that follow.
The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the assay, screening, and therapeutic methods of the invention, and are not intended to limit the scope of what the inventors regard as their invention.
To discover genes influencing BMI, 659,492 markers genome-wide were genotyped in a discovery sample of 3,072 Samoans recruited from 33 villages across the ‘Upolu and Savai’i islands using the Affymetrix 6.0 chip (Table 1,
The strongest association with BMI occurred at rs12513649 (P=5.3×10−14) on chromosome 5q35.1 (
†Non-diabetics only (n = 966 men, n = 1,423 women).
The missense variant rs373863828 was genotyped in the discovery and replication samples, obtaining very significant evidence of association with BMI in adults (P=7.0×10−13) (P=3.5×10−9), with a combined meta-analysis P value of 1.4×10−20 (Table 2, Table 3). The meta-analysis showed no evidence of heterogeneity among the three studies (I2=0%; Q=1.12; P=0.571). In the discovery sample, each copy of the A allele increased BMI by 1.36 kg/m2 (
†2A alleles in 73,328 measured alleles.
‡1A allele in 908 measured alleles.
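The heterogeneity statistics reported for the meta-analysis (I2=0%; Q=1.12; P=0.571 across three studies) can be checked with a short calculation. The sketch below uses the standard definitions of Cochran's Q and I2 and, as a simplifying assumption, the closed-form chi-square tail for 2 degrees of freedom, which applies only when exactly three studies are combined. Function names are hypothetical.

```python
import math


def cochran_q(betas, ses):
    """Cochran's Q for study effect estimates with standard errors."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    return sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))


def i_squared(q, n_studies):
    """I2 (in percent): share of variation beyond chance, floored at 0."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def chi2_sf_df2(q):
    """Upper-tail chi-square probability for df = 2 only: exp(-q / 2)."""
    return math.exp(-q / 2.0)


if __name__ == "__main__":
    q = 1.12  # value reported in the text for three studies
    print(i_squared(q, 3), round(chi2_sf_df2(q), 3))
```

With Q below its degrees of freedom (1.12 < 2), I2 is truncated to 0%, and the tail probability reproduces the reported P value of 0.571.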
In addition to BMI, the A allele was also positively associated with obesity risk (OR 1.305 and 1.441 in discovery and replication cohorts, respectively) as well as measures of total and regional adiposity including percent body fat, abdominal circumference, and hip circumference in both cohorts (Table 4 and Table 5). The A allele was also positively associated with serum leptin in women (both cohorts) and men (replication cohort) before but not after adjusting for BMI. These data indicate that the association between the missense variant and BMI is indeed due to an association with adiposity.
Given the strength of the association of rs373863828 with BMI, associations between this SNP and fifteen adiposity, metabolic, and lipid health outcome phenotypes were examined (Table 4). The BMI-increasing allele (A) was positively associated with abdominal circumference, hip circumference, percent body fat, abdominal-hip ratio, hypertension risk, and obesity risk and negatively associated with total cholesterol, fasting glucose, and diabetes risk at a Bonferroni-corrected significance threshold of p=0.0027. Fasting insulin and leptin levels were positively associated with the BMI-increasing allele in models that do not include BMI as a covariate, but not in models that include it, indicative of an effect of the allele on these traits through its influence on BMI.
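P values surviving from the table body (row and column labels not recoverable): 1.78e−10, 2.05e−12, 1.19e−12, 2.23e−03, 9.52e−05, 3.25e−04, 6.89e−08, 0.002, 1.12e−05, 3.86e−07, 6.68e−09
P values surviving from the table body (row and column labels not recoverable): 1.12E−13, 1.78E−10, 2.05E−12, 1.19E−12, 9.52E−05, 6.89E−08, 1.84E−03, 2.57E−04, 2.75E−11, 6.92E−09, 3.98E−04, 5.01E−10, 1.30E−04, 1.31E−09, 3.62E−04, 3.25E−04, 3.24E−05, 8.01E−04, 1.12E−05, 3.86E−07, 6.68E−09, 2.38E−04, 6.31E−04, 3.40E−04, 5.49E−05, 1.50E−04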
†Analysis conducted only in non-diabetics
‡Leptin was not analyzed in men and women combined because
P values surviving from the table body (row and column labels not recoverable): 8.22E−10, 6.58E−04, 5.12E−10, 4.27E−09, 8.84E−04, 9.55E−06, 1.43E−04, 7.62E−06, 9.54E−07, 6.41E−05, 2.42E−06, 2.64E−05, 2.76E−06, 1.16E−04, 1.96E−03, 2.26E−04, 4.47E−04, 1.47E−03, 3.96E−04, 8.49E−06, 1.35E−04
†Analysis conducted only in non-diabetics
‡Leptin was not analyzed in men and women combined because
Higher BMI and adiposity are usually associated with greater insulin resistance (higher fasting insulin and HOMA-IR), an atherogenic lipid profile (especially, higher serum triglycerides and lower HDL cholesterol), and lower adiponectin. The BMI-increasing allele (A) of rs373863828 would therefore be expected to also be associated with these metabolic variables. However, even though the A allele was consistently associated with higher BMI and adiposity in both discovery and replication cohorts, the expected associations with the above obesity-related comorbidities were not observed, and in some cases were even reversed (Table 4, Table 5). Notably, when considering all subjects, the risk of diabetes was actually lower (OR 0.586 for discovery cohort, p=6.68E-09) or trended lower (0.742 for replication cohort, p=0.029) in carriers of the A allele. Likewise, even in non-diabetic subjects, the variant was associated with a small but significant reduction in fasting glucose in both cohorts (i.e., a change of −2.25 mg/dL and −2.09 mg/dL for each copy of the A allele in the discovery and replication cohorts, respectively). These effects became even more significant after adjusting for BMI, suggesting an independent effect of the variant on glucose homeostasis and diabetes risk. Such effects could be due to survival bias; however, no correlation between age and genotype was observed (linear regression P=0.849). These effects appear to be independent of obesity-associated insulin resistance, since associations with fasting insulin and HOMA-IR were not consistently observed across cohorts (higher only in replication cohort before adjusting for BMI). Furthermore, although the variant was associated with lower total cholesterol in the discovery cohort, consistent effects on serum lipids or adiponectin were likewise not observed.
Together, these data suggest that the missense variant does not promote, and may even protect against, obesity-associated comorbidities; however, additional studies will be required to confirm these findings and directly test this hypothesis.
Although the majority of genes contributing to obesity do so by influencing the central regulation of energy balance18, emerging evidence highlights the contribution of altered cellular metabolism to obesity19. Therefore, the impact of rs373863828 on cellular bioenergetics was examined. To do so, an established 3T3-L1 adipocyte model was selected for two reasons: 1) CREBRF is widely expressed in virtually all tissues including adipose tissue (Supplementary
CREBRF is conserved and widely expressed (
Since obesity is generally viewed as a disorder of energy homeostasis, and energy utilization (i.e. oxidative phosphorylation) increases during adipogenesis (
In addition to its role in cellular energy storage and utilization, the Drosophila CREBRF ortholog, Reptor, has recently been implicated in both cellular and organismal adaptation to nutritional stress by mediating the downstream transcriptional response of the cellular energy sensor TORC126,27. In support of this hypothesis, CREBRF orthologs are highly induced/activated upon starvation in all tissues of Drosophila26,27 as well as in human lymphoblasts28,29. Moreover, both Reptor knockout flies26 and Crebrf knockout mice30 have lower total energy storage and body weight, respectively. Similarly, nutrient starvation of 3T3-L1 cells rapidly increased Crebrf mRNA expression, which peaked at 13-fold by 4 h (P=1.1×10^−16) and remained 5-fold elevated at 24 h (P=4.1×10^−14) (
The transcription factor binding sites in the CREBRF gene were analyzed, and significant enrichment of binding sites for transcription factors was found. These transcription factors are involved in a range of biological processes, as shown in Table 6. The analysis was performed using the PANTHER Classification System at http://www.pantherdb.org. This tool classifies transcription factor binding sites within a query gene (in this case CREBRF) according to the gene ontology annotations of each transcription factor. For each gene ontology (GO) group, a statistical test compares the observed number of binding sites with the number expected if binding sites were randomly distributed within the genome. For example, 2-fold enriched means that the CREBRF gene contains twice as many binding sites for the transcription factors in that GO category as would be expected under a random distribution of those binding sites. Table 6 shows the genes upstream of CREBRF. The p value is the statistical significance of this fold enrichment.
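By way of non-limiting illustration, the fold-enrichment arithmetic described above may be sketched as follows. The site counts, the background fraction, and the one-sided binomial tail are assumptions chosen purely for illustration; the statistical test actually applied by the PANTHER tool is not specified herein, and none of the numbers below are taken from Table 6.

```python
from math import comb


def fold_enrichment(observed: int, expected: float) -> float:
    """Ratio of observed binding sites to the number expected by chance."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


def binomial_upper_tail(observed: int, n_sites: int, p_category: float) -> float:
    """P(X >= observed) for X ~ Binomial(n_sites, p_category).

    Assumed test for this sketch only: each of the n_sites binding sites
    falls in the GO category independently with probability p_category.
    """
    return sum(
        comb(n_sites, k) * p_category**k * (1 - p_category) ** (n_sites - k)
        for k in range(observed, n_sites + 1)
    )


# Hypothetical inputs (not from Table 6):
n_sites = 200        # binding sites found in the query gene
p_category = 0.05    # genome-wide fraction of sites belonging to one GO category
observed = 20        # sites in the query gene belonging to that category

expected = n_sites * p_category              # sites expected under random distribution
fold = fold_enrichment(observed, expected)   # observed / expected
p_value = binomial_upper_tail(observed, n_sites, p_category)

print(f"expected={expected:.1f} fold={fold:.2f} p={p_value:.3g}")
```

In this hypothetical case, 20 observed sites against 10 expected corresponds to a 2-fold enrichment, matching the example given in the text.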
Complementing the functional evidence of “thriftiness”, evidence of positive selection at the missense variant in Samoan genomes was identified. The core haplotype carrying the derived BMI-increasing allele showed long-range linkage disequilibrium (as shown by the single thick branch in
In 1962 James Neel posited the existence of a thrifty gene that provides a metabolic advantage in times of famine and promotes metabolic diseases in times of nutritional excess31. By carrying out a genome-wide association of BMI in the Samoan population, a strongly associated missense variant in CREBRF with a much larger effect size than any other known common BMI risk variant was discovered and replicated. Functional evidence further demonstrates that this missense variant promotes cellular energy conservation by increasing fat storage and decreasing energy utilization in an adipocyte model compared to WT. The potential importance of this variant in organismal energy homeostasis is further supported by the “lean” phenotype of mice30 and flies26 lacking this gene. These data, in combination with evidence of positive selection, support a “thrifty” variant hypothesis for human obesity and underscore the value of examining unique populations to identify novel genetic contributions to complex traits.
This variant was not detected by previous large-scale genome-wide association scans because it is extremely rare in most other populations. In Samoans, the risk allele has a much larger effect on BMI than other common BMI-associated loci found to date. In a model system, the p.Arg457Gln risk variant increases lipid accumulation while limiting energy utilization, yet provides the same protection from nutritional stress as WT CREBRF does. Together, these data support an important role for CREBRF in energy homeostasis, thereby identifying a novel pathway for therapeutic intervention in metabolic disease. Further studies of CREBRF are likely to reveal important new insights into the pathogenesis of obesity, nutrition partitioning, and the adaptive response to starvation. Future studies of obesity and other metabolic phenotypes should consider the variant's potential modifying and mediating interactions with diet and physical activity, as well as gene-gene interactions. The present studies cannot determine the evolutionary source of this variant or resolve questions about the roles of selection and drift in determining its frequency. Detailed anthropological genetic studies throughout the Pacific may help clarify this. Lastly, research is urgently needed on how to integrate and use knowledge of this obesity risk variant to benefit Samoans at both the individual and population health levels.
To determine the effect of loss of function of CREBRF polypeptide, 3T3-L1 adipocytes were transfected with an inducible shRNA construct targeting the Crebrf mRNA. Table 7 lists the shRNA clones and the gene target sequence of each clone.
The oligonucleotide encoding the shRNA was cloned into the SMARTvector inducible lentiviral shRNA vector (GE Life Science). The vector contains a TRE3G tetracycline inducible promoter. The transcription of the shRNA was induced by doxycycline. The expression of shRNA (V3SM7671-235834732) suppresses the expression of wild type and variant CREBRF gene (
To investigate the function of the CREBRF domain in which the p.Arg457Gln variant is located, recombinant 3T3-L1 cells were generated in which exon 5 of the CREBRF gene, where p.Arg457Gln is located, was deleted from the genome (the endogenous CREBRF gene locus) of the cell. For CRISPR mutagenesis, the protocol published by Feng Zhang's group (Nat Protoc. 2013 November; 8(11): 2281-308) was modified. Briefly, plasmid vector pSpCas9-2A-Puro (PX459) was obtained from Addgene and linearized by digestion with BbsI. The oligonucleotides serving as inserts were phosphorylated and annealed using T4 PNK (NEB). Ligation reactions were performed with T4 DNA Ligase (NEB). The plasmids with the oligonucleotide inserts were transformed into bacteria, individual clones were picked, the plasmid DNA was isolated, and the correct insertion of the oligonucleotides was confirmed by agarose gel electrophoresis and DNA sequencing. Recombinant plasmid vectors were transfected into 3T3-L1 cells using the Lipofectamine 2000 reagent (Invitrogen). Two weeks after transfection, the cells were cloned by limiting dilution in 96-well plates; individual clones were expanded, and the CREBRF gene was analyzed for mutations induced by CRISPR/Cas9. Out of 11 clones analyzed, one clone (531C g.20,764_21,067 del304) had a complete deletion of exon 5. The deletion of exon 5 is expected to inactivate the CREBRF gene's “thriftiness” function. Table 8 lists the guide RNA sequence for CRISPR/Cas9 mutagenesis targeting exon 5.
A similar protocol was used to generate recombinant cells in which arginine is substituted by glutamine at amino acid position 457 in human CREBRF or its murine equivalent (amino acid position 458). Below is the sequence of the single-stranded oligonucleotide for knocking in the p.Arg457Gln variant:
The pSpCas9-2A-Puro (PX459) vector has the backbone of the PX459 plasmid. The total vector size is about 9200 bp. The selectable marker is puromycin. The size of the hSpCas9-2A-Puro insert is about 6000 bp. The promoter for Cas9 is the Cbh promoter, and the promoter for the guide RNA is the U6 promoter.
To investigate whether the mutation of arginine to glutamine at position 457 of the CREBRF protein has any effect, protein-protein and protein-DNA interactions of p.Arg457Gln and wild-type CREBRF were assessed. Co-immunoprecipitation was conducted to show that CREBRF binds another transcription factor, CREBL2, and that this binding is enhanced by the p.Arg457Gln variant (
By chromatin immunoprecipitation, several target genes that CREBRF can bind to were identified. Binding of CREBRF to these genes was enhanced by starvation, and further enhanced by the p.Arg457Gln variant (denoted as “mutation” in the x axis labels) (
Sdhaf4 encodes succinate dehydrogenase complex assembly factor 4. Succinate dehydrogenase is a key mitochondrial enzyme complex linking the tricarboxylic acid cycle with the electron transport chain. Sdhaf4 facilitates the assembly of the enzyme complex. Positive regulation of Sdhaf4 by Crebrf is likely to increase the efficiency of mitochondrial respiration and limit the production of reactive oxygen species associated with the activity of unassembled succinate dehydrogenase subunits.
Mme encodes membrane metalloendopeptidase. Also known as neprilysin, Mme is a zinc-dependent endopeptidase that inactivates several peptide hormones, including glucagon and bradykinin. Up-regulation of Mme by Crebrf is expected to result in reduced glucagon availability and changes in glucose homeostasis.
Crebl2 encodes cAMP responsive element binding protein like 2. As indicated by our co-immunoprecipitation studies and by investigations of Crebrf and Crebl2 orthologs in Drosophila (Tiebe et al. 2015), Crebl2 is a transcription factor and binding partner of Crebrf. The presence of Crebrf binding sites in the Crebl2 promoter provides evidence for transcriptional positive feedback regulation of the Crebrf/Crebl2 complex.
Tbcel encodes tubulin folding cofactor E like. Tbcel is a homolog of tubulin folding cofactors that depolymerizes tubulin microtubules. Thus, Tbcel can regulate cell shape, cell division, the trafficking of cellular organelles, and the secretion of proteins.
Creg2 encodes cellular repressor of E1A-stimulated genes 2. Creg2 is a secreted glycoprotein highly expressed in neurons, with little available functional data (only 1 paper in PubMed). It is likely involved in cellular differentiation.
The endogenous CREBRF sequence has been manipulated at the genomic level (i.e., not via an expression vector) to introduce a nucleotide change that results in the arginine to glutamine substitution at amino acid position 457 in human CREBRF or its murine equivalent (amino acid position 458). The Arg457Gln variant or its equivalent in other model species can be introduced at the genomic level in cells and animals using a variety of techniques, such as CRISPR/Cas9, BAC recombineering, or any other technique known in the art.
The methods below, which use the CRISPR/Cas9 system, can be used to generate a knockin of the CREBRF variant in any murine cell type, and have been successfully used to knock in the variant in cells and mice (CREBRF knockin mice).
The sequence comparison between the wild type (WT) Crebrf and p.Arg457Gln (Mut) is shown below:
For mCrebrf, the Sequence submitted for guide is:
Two guide primers have the sequence as follows:
One backup primer has the sequence as follows:
mCREBRF Guides are as follows:
The generic primer has sequence as follows:
The mCREBRF_RG guide 1 has sequence as follows:
The mCREBRF_RG guide 6 has sequence as follows:
Primers are selected for using 400-700 bp of CREBRF for product in Primer3plus:
The forward primers are as follows:
The reverse primers are as follows:
Briefly, the above specific sgRNAs were selected because their sequences do not have any potential off-targets with fewer than 3 mismatches in the whole genome. To introduce the mutation in the locus, a 200 bp ssODN Ultramer (MT) noted herein was used as the template for homology directed repair (HDR) of the double strand break (DSB) produced by the CRISPR/Cas9 complex. The Ultramer corresponds to the genomic sequence evenly flanking the target site, but contains substitutions that: i) introduce the mutation, ii) introduce a new restriction site to facilitate genotyping, and iii) introduce mismatches in the seed sequences of the sgRNA to prevent further editing of the mutant allele by the Cas9/sgRNA complex. It should be noted that if the DSB is repaired by non-homologous end joining instead of HDR, a frameshift could cause a premature stop codon and a null allele. Therefore, in the process of making the desired mice or cells, we will also generate a complete knockout (KO).
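The sgRNA specificity criterion described above (no potential off-target with fewer than 3 mismatches) may be illustrated with the following minimal sketch. The sequences are invented placeholders rather than the Crebrf guides, and the function names are hypothetical. A genome-wide off-target search also accounts for PAM sequences and both DNA strands, which this sketch does not attempt.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def passes_specificity(guide: str, candidate_sites: list, min_mismatches: int = 3) -> bool:
    """True if every candidate off-target site differs from the guide
    by at least `min_mismatches` bases."""
    return all(mismatches(guide, site) >= min_mismatches for site in candidate_sites)


# Hypothetical 20-nt guide and two hypothetical genomic sites.
guide = "ACGTACGTACGTACGTACGT"
near_site = "ACGTACGTACGTACGTACGA"   # differs at the last position only
far_site = "TCGAACGTACCTACGTACGA"    # differs at four positions

print(mismatches(guide, near_site))                       # count for the close site
print(mismatches(guide, far_site))                        # count for the distant site
print(passes_specificity(guide, [far_site]))              # only distant sites present
print(passes_specificity(guide, [far_site, near_site]))   # a close site is present
```

A guide would be rejected under this criterion as soon as a single candidate site falls below the mismatch threshold.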
Cas9 mRNA and the sgRNA are produced according to the optimized strategy of Dr. Gingras and co-workers (Pelletier S, Gingras S, Green DR. Mouse genome engineering via CRISPR-Cas9 for study of immune function. Immunity. 2015; 42(1):18-27. doi: 10.1016/j.immuni.2015.01.004. PubMed PMID: 25607456; Martinez J, Malireddi R K, Lu Q, Cunha L D, Pelletier S, Gingras S, Orchard R, Guan J L, Tan H, Peng J, Kanneganti T D, Virgin H W, Green DR. Molecular characterization of LC3-associated phagocytosis reveals distinct roles for Rubicon, NOX2 and autophagy proteins. Nat Cell Biol. 2015; 17(7):893-906. doi: 10.1038/ncb3192. PubMed PMID: 26098576). Briefly, Cas9 mRNA transcripts (capped and poly-adenylated) are produced from a linearized plasmid encoding a human codon-optimized Cas9 nuclease using the mMESSAGE mMACHINE T7 ULTRA Kit. The sgRNA is produced from the dsDNA template using the MEGAshortscript T7 Kit. Both Cas9 mRNA and sgRNAs are purified using the MEGAclear kit and eluted in nuclease-free water (all kits from Life Technologies). Table 10 below and
Sequence | Orientation |
---|---|
AAAGAAGGTACTTCTGGGAGTATAG | Sense |
AGCAGCTTACACCATCACAGCA | Sense |
CAAAGAGACTTAGAGGCCAGTC | AntiSense |
The guide 1 and guide 6 probes have the following characteristics:
Guide 1 LNA Probe: ACCTT+G+C+C+AA+GT 67.0° C.
Guide 6 LNA Probe: CCTT+C+T+G+AGT+GG 66.0° C.
Mice were generated by the transgenic core at the University of Pittsburgh's Department of Immunology. Briefly, fertilized C57BL/6J embryos were microinjected with Cas9 mRNA (100 ng/μl), sgRNA (50 ng/μl) and ssODN (1 μM) and cultured overnight. The next day, 2-cell embryos were transferred to the oviducts of pseudo-pregnant CD1 female recipients. This approach generally results in cutting efficiencies as high as 80% and HDR efficiencies with ssODN of 8 to 65%, demonstrating that the core can create mutant mice using the CRISPR/Cas9 technology. Tail genomic DNA is tested by PCR, restriction fragment length polymorphism (RFLP) and sequencing to identify putative founders. Approaches similar to those used for embryos can be used to create a variant-specific knockin in virtually any cell type.
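The RFLP genotyping step may be illustrated by the following minimal sketch, which assumes that the knock-in allele carries a restriction site absent from the wild-type allele. The amplicon sequences are invented placeholders, and EcoRI (recognition site GAATTC, cut after the first G) is used only as a familiar example; the enzyme actually used for genotyping is not specified herein.

```python
from __future__ import annotations


def digest_fragment_lengths(amplicon: str, site: str, cut_offset: int) -> list[int]:
    """Return top-strand fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the number of bases of the recognition site that lie
    to the left of the cut (1 for EcoRI, which cuts G^AATTC).
    """
    amplicon = amplicon.upper()
    cut_positions = []
    start = 0
    while True:
        idx = amplicon.find(site, start)
        if idx == -1:
            break
        cut_positions.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cut_positions + [len(amplicon)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical 60-bp PCR products differing by one base inside the site.
wt_amplicon = "A" * 30 + "GAGTTC" + "C" * 24       # no EcoRI site
knockin_amplicon = "A" * 30 + "GAATTC" + "C" * 24  # new EcoRI site

wt_fragments = digest_fragment_lengths(wt_amplicon, "GAATTC", 1)
ki_fragments = digest_fragment_lengths(knockin_amplicon, "GAATTC", 1)

print("WT fragments:", wt_fragments)        # amplicon remains uncut
print("Knock-in fragments:", ki_fragments)  # amplicon is cut into two fragments
```

On a gel, a heterozygous founder would be expected to show both the uncut band and the two digestion products, which is why sequencing is used alongside RFLP to confirm putative founders.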
The practice of CRISPR/Cas9 employs techniques that are explained fully in the literature, such as: Lin X, Pelletier S, Gingras S, Rigaud S, Maine C J, Marquardt K, Dai Y D, Sauer K, Rodriguez A R, Martin G, Kupriyanov S, Jiang L, Yu L, Green D R, Sherman L A. CRISPR-Cas9 mediated modification of the NOD mouse genome with Ptpn22R619W mutation increases autoimmune diabetes. Diabetes. 2016. doi: 10.2337/db16-0061. PubMed PMID: 27207523; Van de Velde L A, Gingras S, Pelletier S, Murray PJ. Issues with the Specificity of Immunological Reagents for Murine IDO1. Cell Metab. 2016; 23(3):389-90. doi: 10.1016/j.cmet.2016.02.004. PubMed PMID: 26959176; Wang H, Yang H, Shivalila C S, Dawlaty M M, Cheng A W, Zhang F, Jaenisch R. One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell. 2013; 153(4):910-8. doi: 10.1016/j.cell.2013.04.025. PubMed PMID: 23643243; PMCID: PMC3969854; Bae S, Park J, Kim J S. Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics. 2014; 30(10):1473-5. doi: 10.1093/bioinformatics/btu048. PubMed PMID: 24463181; PMCID: PMC4016707. These techniques are applicable to the production of the knockin mice and, as such, may be considered in making and practicing the invention.
Hepatocytes, like adipocytes, play a critical role in determining cellular and organismal energy homeostasis and energy substrate metabolism (i.e., obesity and its complications). As reported herein below, manipulation of CREBRF in hepatocytes results in outcomes qualitatively similar to those observed in adipocytes and/or adipocyte precursors.
CREBRF is expressed in human liver (
Overexpression of wild-type or variant (p. Arg457Gln) CREBRF influences hepatocellular lipid content, mitochondrial respiration, and cell survival (
From the foregoing description, it will be apparent that variations and modifications may be made to the invention described herein to adapt it to various usages and conditions. Such embodiments are also within the scope of the following claims.
The recitation of a listing of elements in any definition of a variable herein includes definitions of that variable as any single element or combination (or subcombination) of listed elements. The recitation of an embodiment herein includes that embodiment as any single embodiment or in combination with any other embodiments or portions thereof.
All patents and publications mentioned in this specification are herein incorporated by reference to the same extent as if each independent patent and publication was specifically and individually indicated to be incorporated by reference.
The following documents are cited herein.
This application claims priority to U.S. provisional patent application No. 62/214,045, filed Sep. 3, 2015, the content of which is incorporated herein by reference in its entirety.
This invention was made with government support under Grant Nos. R01-HL093093, R01-AG009375, R01-DK059642, R01-HL090648, R01-DK055406, R01-HL052611, R01-DK090166 and P30 ES006096 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US16/50304 | 9/3/2016 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62214045 | Sep 2015 | US |