The contents of the electronic sequence listing (“BROD-5380WP.xml”; Size is 89,008 bytes and it was created on Jul. 6, 2022) are herein incorporated by reference in their entirety.
The subject matter disclosed herein is generally directed to compositions and methods for increasing COBLL1 expression or activity in adipocytes or BCL2 expression or activity to treat cardio-metabolic diseases, such as type 2 diabetes. The subject matter disclosed herein is also generally directed to adipocyte morphological and cellular profiling and metabolic genetic and polygenic risk.
Obesity and type 2 diabetes related traits are intimately linked by both environmental and genetic factors. The global prevalence of obesity and type 2 diabetes (T2D) has risen dramatically over the past century, and both diseases constitute a severely increasing health problem worldwide, with T2D predicted to rise in prevalence from 451 to 693 million people between the years 2017 and 2045 (Cho et al. 2018). Most epidemiological and genetic studies have linked obesity to the pathogenesis of T2D through positive phenotypic correlations between adiposity and T2D. However, a small number of loci have been reported that do not follow this observation or even correlate in the opposite phenotypic direction (Lu et al. 2016). In fact, up to 45% of obese individuals do not present with poor glycemic and/or lipid profiles, commonly called the metabolically healthy obese (MHO). Concurrently, up to 30% of normal-weight individuals present with cardiometabolic risk factors, the metabolically obese normal-weight (MONW) (Hosseinpanah et al. 2011; Caleyachetty et al. 2017; Arnlöv et al. 2010; Aung et al. 2014; Calori et al. 2011; Wildman et al. 2008; Primeau et al. 2011; Yaghootkar et al. 2014). Accordingly, there is a need to identify risk markers that can identify patient populations at increased risk for, but not presenting with, the typical characteristics associated with T2D. Likewise, there is a need for new therapeutic targets that can help treat T2D in general, and in patient populations possessing MONW/MOH risk loci in particular.
The majority of genetic loci identified through genome-wide association studies (GWAS) map to more than one disease or trait (Watanabe et al. 2019). This highlights extensive pleiotropy between human diseases and traits and suggests that most loci act through multiple cell types and tissues giving rise to complex disease phenotypes. However, the mechanisms that ultimately converge to modulate disease susceptibility of seemingly unrelated traits and complex diseases are unclear. Type 2 Diabetes is a particularly heterogeneous disease with hundreds of loci associated (Mahajan et al. 2018) and multiple tissues implicated in mediating genetic susceptibility (Torres et al. 2020). During disease pathogenesis, T2D manifests as hyperglycemia, which results from a loss of insulin secretion from pancreatic beta-cells and/or a lack of insulin response in peripheral tissues, such as liver, adipose, and skeletal muscle. Disease heterogeneity of T2D gains further complexity by its diverse clinical presentation. Although T2D is more frequent in obese patients, there is growing evidence for a subset of patients presenting with T2D despite otherwise normal weight or even lower weight (Udler et al. 2018).
Citation or identification of any document in this application is not an admission that such a document is available as prior art to the present invention.
In one aspect, the present invention provides for a method of treating subjects at risk for, or suffering from a metabolic disease comprising, administering, to a subject in need thereof, a therapeutically effective amount of one or more agents that: increases the expression or activity of COBLL1, BCL2, or KDSR in one or more lipid-accumulating cells; reduces the expression or activity of VPS4B in one or more lipid-accumulating cells; enhances actin remodeling in one or more lipid-accumulating cells; or inhibits apoptosis in one or more lipid-accumulating cells.
In certain embodiments, the one or more lipid-accumulating cells is selected from the group consisting of adipocyte progenitors, adipocytes, and skeletal muscle. In certain embodiments, the metabolic disease is T2D, MONW/MOH, lipodystrophy, insulin resistance with a “lipodystrophy-like” fat distribution, insulin sensitivity, BMI-adjusted T2D, and/or increased BMI-adjusted waist-to-hip ratio (WHRadjBMI). In certain embodiments, the subject has decreased expression of COBLL1 in adipocytes and/or adipocyte progenitors; decreased expression of BCL2 and/or KDSR in adipose-derived mesenchymal stem cells (AMSCs); decreased expression of BCL2 in skeletal muscle; and/or increased expression of VPS4B in AMSCs.
In certain embodiments, the subject has an impairment of actin cytoskeleton remodeling in adipocytes and/or adipocyte progenitors; and/or comprises one or more MONW/MOH risk loci, preferably, the rs6712203 variant. In certain embodiments, the subject has decreased expression of BCL2 and/or KDSR in AMSCs, decreased expression of BCL2 in skeletal muscle, increased expression of VPS4B in AMSCs, and/or increased apoptosis in adipocytes; and/or comprises one or more lipodystrophy risk loci, preferably, the rs12454712 variant.
In certain embodiments, the one or more agents that enhances actin remodeling is selected from the group consisting of geodiamolides (Geodiamolide H), Jasplakinolide, Chondramide (Chondramide A), ADF/Cofilin, Arp2/3 complex, Profilin, Gelsolin (Flightless-I), Formin, Villin (Advillin), and Adseverin. In certain embodiments, the metabolic disease is Type-2 Diabetes (T2D) and/or MONW/MOH.
In certain embodiments, the one or more agents that inhibits apoptosis is selected from the group consisting of Ginkgo biloba extract (EGb 761), Rhodiola crenulata extract (RCF), salidroside, dehydroepiandrosterone, allopregnanolone, diosmin, glycine, M50054, BI-6C9, TC9-305 (2-sulfonyl-pyrimidinyl derivatives), BI-11A7, 3-o-tolylthiazolidine-2,4-dione, minocycline, methazolamide, melatonin, gamma-tocotrienol (GTT), 3-hydroxypropyl-triphenylphosphonium (TPP)-conjugated imidazole-substituted oleic acid (TPP-IOA), TPP-conjugated stearic acid (TPP-ISA), TPP-6-ISA, CLZ-8, Xanthan gum (XG), PD98059, Vitamin E, and Tanshinone. In certain embodiments, the metabolic disease is lipodystrophy, insulin resistance with a “lipodystrophy-like” fat distribution, insulin sensitivity, BMI-adjusted T2D, increased BMI-adjusted waist-to-hip ratio (WHRadjBMI), and/or Type-2 Diabetes (T2D).
In certain embodiments, the expression or activity of COBLL1 is increased in adipocyte progenitors or adipocytes. In certain embodiments, the metabolic disease is Type-2 Diabetes (T2D) and/or MONW/MOH.
In certain embodiments, the expression or activity of BCL2 or KDSR is increased in adipocyte progenitors. In certain embodiments, the adipocyte progenitors are subcutaneous adipose-derived mesenchymal stem cells (AMSCs). In certain embodiments, the expression or activity of BCL2 is increased in skeletal muscle. In certain embodiments, the expression or activity of VPS4B is reduced in adipocyte progenitors. In certain embodiments, the adipocyte progenitors are visceral AMSCs. In certain embodiments, the metabolic disease is lipodystrophy, insulin resistance with a “lipodystrophy-like” fat distribution, insulin sensitivity, BMI-adjusted T2D, increased BMI-adjusted waist-to-hip ratio (WHRadjBMI), and/or Type-2 Diabetes (T2D).
In certain embodiments, the one or more agents are one or more small molecules that enhances the activity or expression of COBLL1. In certain embodiments, the one or more agents are one or more small molecules that enhances the activity or expression of BCL2 or KDSR. In certain embodiments, the one or more agents are one or more small molecules that reduces the activity or expression of VPS4B.
In certain embodiments, the one or more agents is a polynucleotide comprising a sequence encoding COBLL1. In certain embodiments, the polynucleotide is part of a vector system comprising adipocyte specific regulatory sequences for tissue- and/or cell type-specific expression of the one or more agents. In certain embodiments, the vector system comprises a viral vector system. In certain embodiments, the viral vector system has tropism for adipose tissue. In certain embodiments, the one or more agents is a recombinant polypeptide derived from the COBLL1 gene or functional variant thereof.
In certain embodiments, the one or more agents is a fusion protein, comprising a DNA binding element of a programmable nuclease configured to specifically bind to a sequence in proximity or distant to the COBLL1 gene and wherein the protein activates expression of COBLL1; or configured to specifically bind to a sequence in proximity or distant to the 18q21.33 locus and wherein the protein activates expression of BCL2 and/or KDSR. In certain embodiments, the DNA-binding portion comprises a zinc finger protein or DNA-binding domain thereof, TALEN protein or DNA-binding domain thereof, or a Cas nuclease protein or DNA-binding domain thereof. In certain embodiments, the DNA-binding portion is linked to an activation domain. In certain embodiments, the activation domain is derived from an alternative splicing variant of POU2F2 that activates expression. In certain embodiments, the fusion protein is encoded in a polynucleotide vector. In certain embodiments, the vector system comprises adipocyte specific regulatory sequences for tissue specific expression of the one or more agents. In certain embodiments, the vector system comprises a viral vector system optionally comprising a tropism for adipose tissue.
In another aspect, the present invention provides for a method of treating subjects suffering from or at risk of developing T2D or lipodystrophy, comprising administering a gene editing system that corrects one or more genomic variants that decrease the expression or activity of COBLL1 in adipocytes and/or adipocyte progenitors; or that decrease the expression or activity of BCL2 and KDSR in adipocyte progenitors, decrease the expression or activity of BCL2 in skeletal muscle, and increase the expression or activity of VPS4B in adipocyte progenitors.
In another aspect, the present invention provides for a method of treating subjects suffering from or at risk of developing a metabolic disease, comprising administering a gene editing system that corrects one or more genomic risk variants selected from the group consisting of rs6712203 (COBLL1 locus), rs9686661, rs4804833, rs2972144, rs13389219, rs11837287, rs7903146 (TCF7L2 locus), rs1534696 (SNX10 locus), rs287621, rs1412956, rs13133548, rs11667352, rs12454712 (BCL2 locus), rs673918, rs646123, rs2963449, rs1572993, rs632057, rs11637681, rs6063048, rs7660000, rs1421085, rs7258937, rs9939609, rs998584, rs4925109, rs12641088, and any variant that is within the haplotype for the above variants.
In certain embodiments, the gene editing system is a zinc finger nuclease, a TALEN, a meganuclease, or a CRISPR-Cas system. In certain embodiments, the gene editing system is a CRISPR-Cas system. In certain embodiments, the method further comprises a donor template, configured to replace a portion of a genomic sequence comprising the one or more genomic risk variants with a wild-type or non-risk variant. In certain embodiments, the one or more variants comprises rs6712203 or rs12454712.
In certain embodiments, the gene editing system is a base editing system that corrects one or more of the genomic variants to a wild type or non-risk variant. In certain embodiments, the base editing system is a CRISPR-Cas base editing system. In certain embodiments, the one or more genomic variants include rs6712203 or rs12454712.
In certain embodiments, a C allele/risk genotype of rs6712203 is edited to the T allele/non-risk genotype; or wherein a T allele/risk genotype of rs12454712 is edited to the C allele/non-risk genotype.
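The allele assignments in the preceding paragraph can be written as a small lookup, shown here as a minimal sketch. It assumes that genotypes are reported as two-letter strings on the same strand as the alleles named above; the function names `count_risk_alleles` and `edit_target` are hypothetical and are not part of any genotyping or editing tool.

```python
# Risk and non-risk alleles as stated in the text for the two lead variants.
RISK_ALLELES = {
    "rs6712203": {"risk": "C", "non_risk": "T"},
    "rs12454712": {"risk": "T", "non_risk": "C"},
}


def count_risk_alleles(rsid, genotype):
    """Count risk alleles in a two-letter genotype such as 'CT'.

    Assumes the genotype is reported on the same strand as the alleles
    listed in RISK_ALLELES (an assumption of this sketch).
    """
    info = RISK_ALLELES[rsid]
    genotype = genotype.upper()
    allowed = {info["risk"], info["non_risk"]}
    if len(genotype) != 2 or not set(genotype) <= allowed:
        raise ValueError(f"unexpected genotype {genotype!r} for {rsid}")
    return genotype.count(info["risk"])


def edit_target(rsid, genotype):
    """Return the genotype after every risk allele is replaced by the non-risk allele."""
    info = RISK_ALLELES[rsid]
    count_risk_alleles(rsid, genotype)  # validates the genotype string
    return genotype.upper().replace(info["risk"], info["non_risk"])
```

A heterozygous rs6712203 carrier ("CT") has one risk allele, and the intended edit outcome is "TT".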
In certain embodiments, the gene editing system is a prime editing system that corrects one or more of the genomic variants to a wild type or non-risk variant. In certain embodiments, the one or more genomic variants include rs6712203 or rs12454712. In certain embodiments, the PEG RNA encodes a donor template to replace the rs6712203 or rs12454712 variant with a wild-type or non-risk variant. In certain embodiments, the gene editing system is a prime editing system and wherein the PEG RNA encodes a donor template to replace the one or more genomic risk variants with a wild type or non-risk variant.
In certain embodiments, the gene editing system is a programmable transposition system that corrects one or more of the genomic variants to a wild type or non-risk variant. In certain embodiments, the one or more genomic variants include rs6712203 or rs12454712. In certain embodiments, the programmable transposition system is a CAST system. In certain embodiments, the guide polynucleotide of the CAST system comprises a donor construct comprising a donor sequence to replace a genomic region comprising the rs6712203 or rs12454712 variant with a wild type sequence.
In another aspect, the present invention provides for a method of treating Type-2 Diabetes in subjects comprising one or more variants that decrease COBLL1 expression or activity by decreasing binding of POU2F2 to a binding site in an enhancer regulating COBLL1 expression comprising, administering to a subject in need thereof 1) allogenic adipocyte progenitors that exhibit wild type COBLL1 expression, or 2) autologous adipocyte progenitors genetically edited to correct the one or more variants to a wild-type sequence.
In another aspect, the present invention provides for a method of treating a metabolic disorder in subjects comprising administering to a subject in need thereof 1) allogenic adipocyte progenitors that do not comprise one or more genomic risk variants selected from the group consisting of rs6712203 (COBLL1 locus), rs9686661, rs4804833, rs2972144, rs13389219, rs11837287, rs7903146 (TCF7L2 locus), rs1534696 (SNX10 locus), rs287621, rs1412956, rs13133548, rs11667352, rs12454712 (BCL2 locus), rs673918, rs646123, rs2963449, rs1572993, rs632057, rs11637681, rs6063048, rs7660000, rs1421085, rs7258937, rs9939609, rs998584, rs4925109, rs12641088, and any variant that is within the haplotype for the above variants; or, 2) autologous adipocyte progenitors genetically edited to correct the one or more genomic risk variants to a wild-type or non-risk variant. In certain embodiments, the one or more variants comprise rs6712203 or rs12454712.
In certain embodiments, the adipocyte progenitors are adipose-derived mesenchymal stem cells (AMSCs). In certain embodiments, the autologous adipocyte progenitors are edited to change a C allele/risk genotype of rs6712203 to the T allele/non-risk genotype.
In another aspect, the present invention provides for a method for detecting a variant in a subject, comprising, detecting whether a rs6712203 or rs12454712 variant is present in a subject by conducting a genotyping assay on a biological sample from the subject and detecting whether the rs6712203 or rs12454712 variant is present. In certain embodiments, genotyping is conducted by restriction fragment length polymorphism identification, random amplified polymorphic detection, amplified fragment length polymorphism, PCR, DNA sequencing, allele specific oligonucleotide hybridization, or microarray hybridization. In certain embodiments, the method further comprises administering a) a therapeutically effective amount of one or more agents that increase the expression or activity of COBLL1, or enhance actin remodeling in adipocytes or adipocyte progenitors, b) a therapeutically effective amount of one or more agents that increase the expression or activity of BCL2 and/or KDSR, or inhibit apoptosis in adipocytes or adipocyte progenitors, c) a gene editing system that corrects the one or more variants to a wild type sequence, d) adoptive cell transfer comprising allogenic adipocyte or adipocyte progenitor donors exhibiting wild type COBLL1 expression, or autologous adipocyte or adipocyte progenitor donors genetically modified to correct the one or more variants to a wild type sequence, or e) adoptive cell transfer comprising allogenic adipocyte progenitor donors exhibiting wild type BCL2 and/or KDSR expression, or autologous adipocyte progenitor donors genetically modified to correct the one or more variants to a wild type sequence.
In another aspect, the present invention provides for a method of treating T2D comprising: performing a genotyping assay on a biological sample from a subject to determine if the subject has one or more variants that decrease COBLL1 expression or activity by decreasing binding of POU2F2 to a binding site in an enhancer regulating COBLL1 expression; and if the subject has the one or more variants administering a) a therapeutically effective amount of one or more agents that increase the expression or activity of COBLL1, or enhance actin remodeling in adipocytes or adipocyte progenitors, b) a gene editing system that corrects the one or more variants to a wild type sequence, or c) adoptive cell transfer comprising allogenic adipocyte donors exhibiting wild type COBLL1 expression, or autologous adipocyte donors genetically modified to correct the one or more variants to a wild type sequence; or if the subject does not have the one or more variants, administering a standard-of-care T2D therapy.
In another aspect, the present invention provides for a method of treating lipodystrophy comprising: performing a genotyping assay on a biological sample from a subject to determine if the subject has one or more variants that decrease the expression or activity of BCL2 and KDSR in adipocyte progenitors, decrease the expression or activity of BCL2 in skeletal muscle, and increase the expression or activity of VPS4B in adipocyte progenitors; and if the subject has the one or more variants administering a) a therapeutically effective amount of one or more agents that increase the expression or activity of BCL2 and/or KDSR, or inhibit apoptosis in adipocytes or adipocyte progenitors, b) a gene editing system that corrects the one or more variants to a wild type sequence, or c) adoptive cell transfer comprising allogenic adipocyte progenitor donors exhibiting wild type BCL2 and/or KDSR expression, or autologous adipocyte progenitor donors genetically modified to correct the one or more variants to a wild type sequence; or if the subject does not have the one or more variants, administering a standard-of-care lipodystrophy therapy.
In another aspect, the present invention provides for a method for diagnosing metabolically obese normal weight (MONW) subjects at increased risk for developing T2D comprising, detecting one or more variants that decrease the expression or activity of COBLL1 in adipocytes and/or adipocyte progenitors, and diagnosing the subject as being at increased risk of T2D if the one or more variants are detected. In certain embodiments, the one or more variants decrease binding of POU2F2 to a binding site in an enhancer regulating COBLL1 expression. In certain embodiments, the one or more variants comprises rs6712203.
In another aspect, the present invention provides for a method for diagnosing lipodystrophy subjects at increased risk for developing T2D or heart disease comprising, detecting one or more variants that decrease the expression or activity of BCL2 and KDSR in adipocyte progenitors, decrease the expression or activity of BCL2 in skeletal muscle, and increase the expression or activity of VPS4B in adipocyte progenitors and diagnosing the subject as being at increased risk of T2D or heart disease if the one or more variants are detected. In certain embodiments, the one or more variants comprises rs12454712.
In another aspect, the present invention provides for a method of screening for agents capable of treating T2D in subjects with a MONW/MOH risk phenotype comprising: treating a population of cells comprising adipocytes having the rs6712203 variant with an agent; and detecting actin remodeling and/or one or more COBLL1 co-regulated genes, wherein detecting an increase in actin remodeling and/or the one or more genes identifies the agent as capable of treating T2D in subjects having a MONW/MOH risk phenotype. In certain embodiments, the one or more COBLL1 co-regulated genes are selected from the group consisting of ITGAM, PIK3CA, ROCK2, ITGA1, ARHGEF7, CRK, FGFR2, and ARHGEF6.
In another aspect, the present invention provides for a method of screening for agents capable of treating lipodystrophy in subjects with a lipodystrophy risk phenotype comprising: treating a population of cells comprising adipocytes having the rs12454712 variant with an agent; and detecting apoptosis and/or one or more apoptosis genes, wherein detecting a decrease in apoptosis and/or one or more apoptosis genes identifies the agent as capable of treating lipodystrophy in subjects having a lipodystrophy risk phenotype.
In another aspect, the present invention provides for an unbiased high-throughput multiplex profiling method for simultaneously identifying morphological and cellular phenotypes for lipid-accumulating cells comprising: staining a cellular system comprising one or more lipid-accumulating cells with one or more stains that differentiate cellular compartments selected from the group consisting of nuclei, cytoplasm and total cell and differentiate organelles selected from the group consisting of DNA, mitochondria, actin, Golgi, plasma membrane, lipids, nucleoli and cytoplasmic RNA; imaging the stained cells using an automated image analysis pipeline; and identifying one or more morphological features for each of the organelles from the resulting images, wherein the features comprise one or more features selected from the group consisting of object size, object shape, intensity, granularity, texture, colocalization, number of objects, distance to neighboring objects, cellular compartment, and combinations thereof. In certain embodiments, about 100 or more cells are imaged for the cellular system. In certain embodiments, about 500 or more cells are imaged for the cellular system. In certain embodiments, each feature for each organelle includes a quantitative range comprising at least two values for the feature. In certain embodiments, a pattern of morphological features is linked to a cellular phenotype. In certain embodiments, the morphological features are linked to one or more gene expression programs.
In certain embodiments, the cellular system is obtained from a subject. In certain embodiments, the cellular system comprises lipocytes. In certain embodiments, the lipocytes are selected from the group consisting of adipocytes, hepatocytes, macrophages/foam cells and glial cells. In certain embodiments, the lipocytes are part of a pathophysiological process in cells selected from the group consisting of vascular smooth muscle cells, skeletal muscle cells, renal podocytes, and cancer cells. In certain embodiments, the cellular system comprises stem cells differentiated over a time course, wherein the cells from the cellular system are stained and imaged at different time points. In certain embodiments, the time points comprise one or more time points selected from the group consisting of 0 days, 3 days, 8 days and 14 days. In certain embodiments, the cellular system comprises adipose-derived mesenchymal stem cells (AMSCs) differentiated to adipocytes, wherein the cellular system is stained over a time course. In certain embodiments, the AMSCs are obtained from a subject. In certain embodiments, the AMSCs are subcutaneous AMSCs. In certain embodiments, the AMSCs are visceral AMSCs.
In certain embodiments, the method further comprises performing RNA-seq on the lipid-accumulating cells.
In certain embodiments, the cellular system is stained with one or more fluorescent dyes selected from the group consisting of Hoechst, MitoTracker Red, Phalloidin, wheat germ agglutinin (WGA), BODIPY, and SYTO14. In certain embodiments, the imaging is taken across four channels. In certain embodiments, the image analysis pipeline comprises image analysis software and a novel algorithm.
In certain embodiments, cells are clustered based on patterns of features identified.
In certain embodiments, the imaging pipeline comprises artificial intelligence, machine learning, deep learning, neural networks, and/or linear regression modeling.
In certain embodiments, the cellular system comprises cells comprising a SNP of interest, whereby morphological and cellular phenotypes can be determined for the SNP. In certain embodiments, the cellular system comprises cells perturbed with one or more drugs, whereby morphological and cellular phenotypes can be determined for the one or more drugs. In certain embodiments, the cellular system comprises cells perturbed at one or more genomic loci, whereby morphological and cellular phenotypes can be determined for the one or more genomic loci. In certain embodiments, the cells are perturbed with a programmable nuclease or RNAi.
In another aspect, the present invention provides for a method of identifying morphological features for predicting metabolic clinical characteristics in a subject in need thereof comprising: identifying morphological features according to the method of any embodiment herein for one or more cellular systems derived from one or more subjects having a metabolic clinical characteristic; and fitting a logistic regression model for the clinical characteristic on the entire set of features and selecting features that best fit the model. In certain embodiments, the method further comprises: identifying a subset of features comprising: constructing an interaction network between the features, wherein nodes represent features, edges indicate interactions between two nodes, and edge weight indicates the strength of the interaction, and selecting a subset of nodes with at least one edge above a cutoff weight, whereby features with high-weight interactions are selected; and fitting a logistic regression model for the clinical characteristic on the entire set of features and selecting features that best fit the model. In certain embodiments, the method further comprises grouping the features into a compartment category selected from the group consisting of lipid, actin/Golgi/plasma membrane (AGP), Mito, DNA, and other, and stratifying by differentiation day, wherein the number of features that can be modeled in every grouped and stratified category are the features.
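The network-based subset selection described above can be sketched as follows. This is a minimal illustration, assuming the interaction network is supplied as a list of (feature, feature, weight) tuples with non-negative weights. The function name and the feature names in the usage note are hypothetical, and how the edge weights are computed is not specified here.

```python
def select_features_by_edge_weight(edges, cutoff):
    """Keep every feature (node) that has at least one edge above the cutoff.

    edges  -- iterable of (feature_a, feature_b, weight) tuples; the weight is
              taken as the strength of the interaction (assumed non-negative).
    cutoff -- weight threshold; an edge counts only if weight > cutoff.
    """
    selected = set()
    for feature_a, feature_b, weight in edges:
        if weight > cutoff:
            # Both endpoints of a high-weight edge are retained.
            selected.add(feature_a)
            selected.add(feature_b)
    return sorted(selected)
```

For example, with the hypothetical edges `[("lipid_granularity", "cell_area", 0.9), ("cell_area", "mito_intensity", 0.2)]` and a cutoff of 0.5, only `cell_area` and `lipid_granularity` are kept; the retained subset would then be passed to the regression step.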
In another aspect, the present invention provides for a method of predicting metabolic clinical characteristics in a subject in need thereof comprising: identifying morphological or cellular features according to the method of any embodiment herein for one or more cellular systems derived from the subject; and estimating a metabolic clinical characteristic from one or more of the features. In certain embodiments, the one or more features used for estimating the clinical characteristic are selected according to any embodiment herein.
In another aspect, the present invention provides for a method of identifying histological features for predicting metabolic clinical characteristics in a subject in need thereof comprising: identifying features for one or more histological images of adipose tissue samples obtained from one or more subjects having a metabolic clinical characteristic, wherein the features are identified by a method comprising: grouping at least 100-500 cells from an image into cell area (μm²) categories, wherein the categories are defined by cell area ranges for a plurality of control subjects of the same sample tissue type; determining for each cell area category one or more features selected from: the fraction of cells in the cell area category, median area of cells in the category, 25% interquartile point in the category, and 75% interquartile point in the category; and fitting a logistic regression model for the clinical characteristic on the entire set of features and selecting features that best fit the model. In certain embodiments, the cells are grouped into 5 area categories consisting of: a cell area <25% quartile point for the control group (very small), a cell area ≥25% quartile point for the control group and <the median cell area for the control group (small), a cell area ≥median cell area for the control group and <mean cell area for the control group (medium), a cell area ≥mean area for the control group and <75% quartile point for the control group (large), and a cell area ≥75% quartile point for the control group (very large).
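The five-category grouping and the per-category features can be sketched in a few lines. This is a minimal illustration only. It assumes cell areas are already available as plain lists of numbers in μm², reads the "25% and 75% interquartile points" as the 25th and 75th percentiles, and uses the default quantile method of Python's `statistics` module, which may differ from the quantile definition used in practice. All function names are hypothetical.

```python
import statistics

CATEGORIES = ["very small", "small", "medium", "large", "very large"]


def control_cutpoints(control_areas):
    """Cut points derived from the control group's cell areas."""
    q1, median, q3 = statistics.quantiles(control_areas, n=4)
    return {"q1": q1, "median": median,
            "mean": statistics.fmean(control_areas), "q3": q3}


def categorize(area, cuts):
    """Assign one cell area to one of the five categories."""
    if area < cuts["q1"]:
        return "very small"
    if area < cuts["median"]:
        return "small"
    if area < cuts["mean"]:
        return "medium"
    if area < cuts["q3"]:
        return "large"
    return "very large"


def category_features(sample_areas, cuts):
    """Fraction, median, 25th and 75th percentile of cell area per category."""
    groups = {name: [] for name in CATEGORIES}
    for area in sample_areas:
        groups[categorize(area, cuts)].append(area)
    total = len(sample_areas)
    features = {}
    for name, areas in groups.items():
        if len(areas) >= 2:
            p25, med, p75 = statistics.quantiles(areas, n=4)
        elif areas:
            p25 = med = p75 = areas[0]  # single cell: all points coincide
        else:
            p25 = med = p75 = None      # empty category: nothing to summarize
        features[name] = {
            "fraction": len(areas) / total if total else 0.0,
            "median": med, "p25": p25, "p75": p75,
        }
    return features
```

The ordering of the checks in `categorize` relies on the control-group mean lying between the median and the 75% quartile point, as the category definitions imply; if the mean falls outside that range, one category is simply empty.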
In another aspect, the present invention provides for a method of predicting metabolic clinical characteristics in a subject in need thereof comprising: identifying features from a histological image of an adipose tissue sample obtained from the subject comprising: grouping at least 100-500 cells from the image into cell area (μm²) categories, wherein the categories are defined by cell area ranges for a plurality of control subjects of the same cell tissue type; determining for each cell area category one or more features selected from the fraction of cells in the cell area category, median area of cells in the category, 25% interquartile point in the category, and 75% interquartile point in the category; and estimating a metabolic clinical characteristic from one or more of the features. In certain embodiments, the cells are grouped into 5 area categories consisting of: a cell area <25% quartile point for the control group (very small), a cell area ≥25% quartile point for the control group and <the median cell area for the control group (small), a cell area ≥median cell area for the control group and <mean cell area for the control group (medium), a cell area ≥mean area for the control group and <75% quartile point for the control group (large), and a cell area ≥75% quartile point for the control group (very large). In certain embodiments, the one or more features used for estimating the clinical characteristic are selected according to any embodiment herein. In certain embodiments, the tissue is subcutaneous adipose tissue. In certain embodiments, the tissue is visceral adipose tissue.
In another aspect, the present invention provides for a method of predicting metabolic clinical characteristics in a subject in need thereof comprising determining clinical characteristics using morphological features and using histological features; and comparing the clinical characteristics to predict clinical characteristics for the subject.
In certain embodiments, the logistic regression model is a linear model with logit link (GLM). In certain embodiments, the linear association with binomial distribution is implemented using the R glm function, wherein the default glm convergence criterion on deviances is used to stop the iterations, wherein the DeLong method is used to calculate confidence intervals for the c-statistics, wherein forward feature selection (R step function) is used to select the features, and/or wherein the Akaike information criterion (AIC) is used as the stop condition for the feature selection procedure.
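The text names R's glm and step functions for this procedure. The sketch below is a pure-Python stand-in, not that implementation: it fits a logistic regression by Newton-Raphson with a fixed iteration cap (rather than glm's deviance-based convergence test) and adds features one at a time until AIC stops decreasing. The DeLong confidence intervals are not covered, and all function names are hypothetical.

```python
import math


def _solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_aic(X, y, columns, iterations=25):
    """Fit intercept + chosen columns by Newton-Raphson and return the AIC.

    Assumes 0/1 outcomes and data without perfect separation.
    """
    rows = [[1.0] + [row[c] for c in columns] for row in X]
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iterations):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for r, yi in zip(rows, y):
            mu = _sigmoid(sum(b * v for b, v in zip(beta, r)))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * r[j]
                for k in range(p):
                    hess[j][k] += w * r[j] * r[k]
        step = _solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    loglik = 0.0
    for r, yi in zip(rows, y):
        mu = _sigmoid(sum(b * v for b, v in zip(beta, r)))
        loglik += math.log(mu) if yi == 1 else math.log(1.0 - mu)
    return 2 * p - 2 * loglik  # AIC = 2k - 2 ln(L)


def forward_select(X, y):
    """Start from the intercept-only model; add features while AIC decreases."""
    selected, remaining = [], list(range(len(X[0])))
    best_aic = logistic_aic(X, y, selected)
    while remaining:
        trials = [(logistic_aic(X, y, selected + [c]), c) for c in remaining]
        aic, col = min(trials)
        if aic >= best_aic:
            break  # stop condition: no candidate lowers the AIC
        selected.append(col)
        remaining.remove(col)
        best_aic = aic
    return selected, best_aic
```

`forward_select` returns the indices of the chosen feature columns together with the AIC of the final model. A production analysis would more likely rely on an established statistics package than on this hand-rolled fit.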
In another aspect, the present invention provides for a method of detecting HOMA-IR or WHRadjBMI risk in a subject comprising, detecting one or more features according to the method of any embodiment herein, wherein the one or more features are selected from the group consisting of: increased lipid granularity in visceral adipocytes; increased lipid texture_SumEntropy in visceral adipocytes; increased cell area/shape in visceral adipocytes; decreased lipid texture_InverseDifferenceMoment in visceral adipocytes; decreased BODIPY Texture_AngularSecondMoment; upregulation of one or more genes selected from the group consisting of GYS-1, TPI1, PFKP and PGK; and downregulation of one or more genes selected from the group consisting of ACAA1 and SCP2.
In another aspect, the present invention provides for a method of detecting lipodystrophy risk in a subject comprising, detecting one or more features according to the method of any embodiment herein, wherein the one or more features are selected from the group consisting of: increased mitochondrial stain intensity; smaller lipid droplets on average compared to adipocytes from individuals with low polygenic risk; upregulation of one or more genes selected from the group consisting of EHHADH and NFATC3.
In certain embodiments, the method further comprises a treatment step comprising administering one or more of insulin, thiazolidinedione, biguanide, meglitinide, DPP-4 inhibitors, Sodium-glucose transporter 2 (SGLT2) inhibitor, alpha-glucosidase inhibitor, bile acid sequestrant, sulfonylureas and/or amylin analogs.
These and other aspects, objects, features, and advantages of the example embodiments will become apparent to those having ordinary skill in the art upon consideration of the following detailed description of example embodiments.
An understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention may be utilized, and the accompanying drawings of which:
The figures herein are for illustrative purposes only and are not necessarily drawn to scale.
Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. Definitions of common terms and techniques in molecular biology may be found in Molecular Cloning: A Laboratory Manual, 2nd edition (1989) (Sambrook, Fritsch, and Maniatis); Molecular Cloning: A Laboratory Manual, 4th edition (2012) (Green and Sambrook); Current Protocols in Molecular Biology (1987) (F. M. Ausubel et al. eds.); the series Methods in Enzymology (Academic Press, Inc.); PCR 2: A Practical Approach (1995) (M. J. MacPherson, B. D. Hames, and G. R. Taylor eds.); Antibodies, A Laboratory Manual (1988) (Harlow and Lane, eds.); Antibodies A Laboratory Manual, 2nd edition 2013 (E. A. Greenfield ed.); Animal Cell Culture (1987) (R. I. Freshney, ed.); Benjamin Lewin, Genes IX, published by Jones and Bartlett, 2008 (ISBN 0763752223); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0632021829); Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 9780471185710); Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, N.Y. 1994); March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, N.Y. 1992); and Marten H. Hofker and Jan van Deursen, Transgenic Mouse Methods and Protocols, 2nd edition (2011).
As used herein, the singular forms “a”, “an”, and “the” include both singular and plural referents unless the context clearly dictates otherwise.
The term “optional” or “optionally” means that the subsequent described event, circumstance or substituent may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.
The recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within the respective ranges, as well as the recited endpoints.
The terms “about” or “approximately” as used herein when referring to a measurable value such as a parameter, an amount, a temporal duration, and the like, are meant to encompass variations of and from the specified value, such as variations of +/−10% or less, +/−5% or less, +/−1% or less, and +/−0.1% or less of and from the specified value, insofar as such variations are appropriate to perform in the disclosed invention. It is to be understood that the value to which the modifier “about” or “approximately” refers is itself also specifically, and preferably, disclosed.
As used herein, a “biological sample” may contain whole cells and/or live cells and/or cell debris. The biological sample may contain (or be derived from) a “bodily fluid”. The present invention encompasses embodiments wherein the bodily fluid is selected from amniotic fluid, aqueous humour, vitreous humour, bile, blood serum, breast milk, cerebrospinal fluid, cerumen (earwax), chyle, chyme, endolymph, perilymph, exudates, feces, female ejaculate, gastric acid, gastric juice, lymph, mucus (including nasal drainage and phlegm), pericardial fluid, peritoneal fluid, pleural fluid, pus, rheum, saliva, sebum (skin oil), semen, sputum, synovial fluid, sweat, tears, urine, vaginal secretion, vomit and mixtures of one or more thereof. Biological samples include cell cultures, bodily fluids, cell cultures from bodily fluids. Bodily fluids may be obtained from a mammal organism, for example by puncture, or other collecting or sampling procedures.
The terms “subject,” “individual,” and “patient” are used interchangeably herein to refer to a vertebrate, preferably a mammal, more preferably a human. Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells and their progeny of a biological entity obtained in vivo or cultured in vitro are also encompassed.
As used herein, an “allele” is one of a pair or series of genetic variants of a polymorphism at a specific genomic location. A “response allele” is an allele that is associated with altered response to a treatment. Where a SNP is biallelic, both alleles will be response alleles (e.g., one will be associated with a positive response, while the other allele is associated with no or a negative response, or some variation thereof).
As used herein, “genotype” refers to the diploid combination of alleles for a given genetic polymorphism. A homozygous subject carries two copies of the same allele and a heterozygous subject carries two different alleles.
As used herein, a “haplotype” is one or a set of signature genetic changes (polymorphisms) that are normally grouped closely together on the DNA strand and are usually inherited as a group; the polymorphisms are also referred to herein as “markers.” A “haplotype” as used herein is information regarding the presence or absence of one or more genetic markers in a given chromosomal region in a subject. A haplotype can consist of a variety of genetic markers, including indels (insertions or deletions of the DNA at particular locations on the chromosome); single nucleotide polymorphisms (SNPs) in which a particular nucleotide is changed; microsatellites; and minisatellites.
As used herein, the term “type 2 diabetes”, also known as type 2 diabetes mellitus, and often referred to as diabetes includes, e.g., adult-onset diabetes.
There are multiple terms for stem cells derived from adipose tissue, for example, preadipocytes, adipose-derived stromal cells (ADSC), processed lipoaspirated cells, adipose-derived mesenchymal stem cells (AMSC), adipose-derived adult stem cells. (Tsuji W, Rubin J P, Marra K G. Adipose-derived stem cells: Implications in tissue regeneration. World J Stem Cells. 2014; 6 (3): 312-321). These terms are used interchangeably throughout the specification. As used herein, “adipocyte progenitors” can refer to stem cells or any cell intermediates that differentiate into adipocytes.
As used in this context, to “treat” means to cure, ameliorate, stabilize, prevent, or reduce the severity of at least one symptom or a disease, pathological condition, or disorder. This term includes active treatment, that is, treatment directed specifically toward the improvement of a disease, pathological condition, or disorder, and also includes causal treatment, that is, treatment directed toward removal of the cause of the associated disease, pathological condition, or disorder. In addition, this term includes palliative treatment, that is, treatment designed for the relief of symptoms rather than the curing of the disease, pathological condition, or disorder; preventative treatment, that is, treatment directed to minimizing or partially or completely inhibiting the development of the associated disease, pathological condition, or disorder; and supportive treatment, that is, treatment employed to supplement another specific therapy directed toward the improvement of the associated disease, pathological condition, or disorder. It is understood that treatment, while intended to cure, ameliorate, stabilize, or prevent a disease, pathological condition, or disorder, need not actually result in the cure, amelioration, stabilization or prevention. The effects of treatment can be measured or assessed as described herein and as known in the art as is suitable for the disease, pathological condition, or disorder involved. Such measurements and assessments can be made in qualitative and/or quantitative terms. Thus, for example, characteristics or features of a disease, pathological condition, or disorder and/or symptoms of a disease, pathological condition, or disorder can be reduced to any effect or to any amount.
The term “in need of treatment” as used herein refers to a judgment made by a caregiver (e.g., physician, nurse, nurse practitioner, or individual in the case of humans; veterinarian in the case of animals, including non-human animals) that a subject requires or will benefit from treatment. This judgment is made based on a variety of factors that are in the realm of a caregiver's experience, but that include the knowledge that the subject is ill, or will be ill, as the result of a condition that is treatable by the compositions and therapeutic agents described herein. In embodiments, the judgment by the caregiver has been made, and the subject identified as requiring or benefitting from treatment.
The administration of compositions, agents, cells, or populations of cells, as disclosed herein may be carried out in any convenient manner including by aerosol inhalation, injection, ingestion, transfusion, implantation or transplantation. The cells or population of cells may be administered to a patient subcutaneously, intradermally, intratumorally, intranodally, intramedullary, intramuscularly, intrathecally, by intravenous or intralymphatic injection, or intraperitoneally.
Various embodiments are described hereinafter. It should be noted that the specific embodiments are not intended as an exhaustive description or as a limitation to the broader aspects discussed herein. One aspect described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced with any other embodiment(s). Reference throughout this specification to “one embodiment”, “an embodiment,” “an example embodiment,” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” or “an example embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment but may. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to a person skilled in the art from this disclosure, in one or more embodiments. Furthermore, while some embodiments described herein include some, but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention. For example, in the appended claims, any of the claimed embodiments can be used in any combination.
Reference is made to the manuscript posted Jul. 19, 2021 on BioRxiv and entitled, “Discovering cellular programs of intrinsic and extrinsic drivers of metabolic traits using LipocyteProfiler” and having as authors Samantha Laber, Sophie Strobel, Josep-Maria Mercader, Hesam Dashti, Alina Ainbinder, Julius Honecker, Garrett Garborcauskas, David R. Stirling, Aaron Leong, Katherine Figueroa, Nasa Sinnott-Armstrong, Maria Kost-Alimova, Giacomo Deodato, Alycen Harney, Gregory P. Way, Alham Saadat, Sierra Harken, Saskia Reibe-Pal, Hannah Ebert, Yixin Zhang, Virtu Calabuig-Navarro, Elizabeth McGonagle, Adam Stefek, Josée Dupuis, Beth A. Cimini, Hans Hauner, Miriam S. Udler, Anne E. Carpenter, Jose C. Florez, Cecilia M. Lindgren, Suzanne B. R. Jacobs, Melina Claussnitzer (bioRxiv 2021.07.17.452050; doi: doi.org/10.1101/2021.07.17.452050).
All publications, published patent documents, and patent applications cited herein are hereby incorporated by reference to the same extent as though each individual publication, published patent document, or patent application was specifically and individually indicated as being incorporated by reference.
Most disease-associated genetic loci map to more than one disease or trait, suggesting they act through multiple cell types and tissues giving rise to complex disease phenotypes. This pervasive pleiotropy of human diseases presents a tremendous burden on identifying mediating mechanisms and therapeutic targets. Multiple metabolic risk haplotypes are associated with risk for metabolic diseases. However, whether a haplotype actually causes a disease and the mechanisms that cause the disease are unknown. For example, a risk haplotype may be important for disease in a specific cell type at a specific time. Integration of phenotypic and transcriptional profiling in primary human cells allows for functional characterization of disease-associated genetic variants. Applicants have analyzed multiple risk haplotypes and determined the function of risk haplotypes involved in causation of specific metabolic phenotypes, such as type 2 diabetes and lipodystrophy.
The metabolic risk haplotype at 2q24.3 displays cross-phenotype association signatures that are reminiscent of the MONW/MHO phenotype and associate with increased risk of T2D, increased HOMA-IR, increased WHR adjusted for BMI (WHRadjBMI) and decreased body fat percentage, decreased estimated subcutaneous adipose tissue mass, and cardiometabolic trait risk (Kooner et al. 2011; DIAbetes Genetics Replication And Met . . . ; Morris et al. 2012; Heid et al. 2010; Lu et al. 2016). Consistent with these associations, the 2q24.3 locus falls into the lipodystrophy cluster of T2D loci (Udler et al. 2018), suggesting adipocytes as the mediating cell type at this locus. Notably, amongst the 20 loci identified in the T2D lipodystrophy process-specific cluster (Udler et al. 2018), the 2q24.3 locus is the top scoring one, implying the strongest contribution to a ‘lipodystrophic-like’ phenotype amongst the T2D GWAS loci. However, similar to the vast majority of genetic risk loci identified through GWAS, the function of the 2q24.3 metabolic risk locus is currently unknown.
Applicants have identified causal variants leading to reduced COBLL1 expression. Applicants further demonstrate that the cellular program under the genetic control of the 2q24.3 risk locus and the effector gene COBLL1 is characterized by an impairment of actin cytoskeleton remodeling processes in differentiating subcutaneous adipocytes and a subsequent failure of these cells to accumulate lipids and develop into metabolically active and insulin-sensitive subcutaneous adipocytes. While not being bound by a particular scientific theory, individual risk for T2D and fasting insulin is believed to be modified by changes to the mass, distribution, and function of adipose tissue (Lotta et al 2017; Small et al 2018), and a metabolically healthy state is believed to be largely dependent on subcutaneous adipose tissue expandability. As disclosed in further detail herein, Applicants have, for the first time, identified actin cytoskeleton remodeling as a critical factor for subcutaneous adipocyte function and as causally involved in metabolic disease progression in humans, thus identifying COBLL1 and causal variants impacting COBLL1 expression or function as viable therapeutic targets for treating and/or preventing T2D.
Using an unbiased approach based on phenotypic profiling of primary human adipocytes, Applicants dissected the function of a genomic risk locus at 18q21.33 that is strongly associated with a lipodystrophy-like metabolic phenotype. Applicants showed that the haplotype modifies gene expression of at least three target genes (BCL2, KDSR, and VPS4B) in at least three diabetes-related tissues (subcutaneous adipose tissue, visceral adipose tissue, and skeletal muscle) during specific temporal windows with distinct cellular and morphological consequences that converge to modulate disease susceptibility. BCL2 and KDSR showed reduced expression in subcutaneous adipose-derived mesenchymal stem cells (AMSCs); however, reduced apoptosis and mitochondrial morphological features were observed in mature adipocytes that are terminally differentiated. BCL2 also showed reduced expression in skeletal muscle. VPS4B showed increased expression in visceral adipose-derived mesenchymal stem cells (AMSCs); however, mitochondrial morphological features were observed in mature adipocytes that are terminally differentiated. The genotype-mediated effect on target gene expression was observed in AMSCs and the morphological features were observed in differentiated adipocytes. Applicants identified that the rs12454712 variant increases apoptosis and apoptosis-related genes in adipocytes. Thus, inhibiting apoptosis can be used to treat metabolic diseases caused by this mechanism.
Specifically, phenotype-informed clustering of T2D identified a subset of T2D loci that follow clinical presentation of insulin resistance with a “lipodystrophy-like” fat distribution (low BMI, adiponectin, and high-density lipoprotein cholesterol, and high triglycerides) (Udler et al. 2018). Amongst those genetic signals was rs12454712 on 18q21.33, a genetic locus of unknown function that maps to the first intron of the BCL2 gene. The 18q21.33 locus, like most genetic risk loci, lies within the non-coding genome, making the identification of mediating target genes and mechanisms challenging and experimentally intensive. Non-coding variants may regulate one or more genes across long genomic distances, and the same variant might have very context-specific functions, including regulating different genes in different cell types under specific environmental conditions. Applicants set out to decipher the function of the 18q21.33 metabolic risk locus. By combining novel experimental and statistical methods, Applicants mechanistically dissect this pleiotropic locus into mediating cell types and target genes, developmental time-points of action and cellular functions that could account for the associated phenotypes in humans.
Together, the findings highlight the complexities underlying disease-associated loci in humans and showcase an approach of unbiased dissection of mediating mechanisms.
Accordingly, embodiments disclosed herein are directed to methods for treating subjects at risk for, or suffering from, Type-2 Diabetes (T2D) or lipodystrophy. A subject may be at risk for T2D if they clinically demonstrate impaired glucose tolerance, increased insulin resistance, are identified as possessing a MONW/MHO risk locus or lipodystrophy risk locus, or a combination thereof. Thus, treatment methods disclosed herein are directed to subjects who are either at risk for T2D or lipodystrophy or have been diagnosed with T2D or lipodystrophy. In example embodiments, the methods provide treatment options for individuals who possess certain metabolic risk loci, in particular those who classify as MONW/MHO. In one aspect, embodiments disclosed herein are directed to methods of treating subjects at risk for, or suffering from, T2D by administering one or more agents that increase COBLL1 expression or COBLL1 activity in adipocyte or adipocyte-progenitor cell types. In another aspect, embodiments disclosed herein are directed to methods of treating subjects at risk for, or suffering from, T2D by administering one or more agents that can edit causal risk variants in adipocytes or adipocyte progenitors to a wild-type or non-risk variant. In certain example embodiments, the causal risk variant is an intronic variant in the COBLL1 gene. In certain example embodiments, the intronic variant alters the binding affinity of POU Class 2 Homeobox 2 (POU2F2) to an enhancer controlling COBLL1 expression. In certain example embodiments, the causal variant includes rs6712203. In another aspect, embodiments disclosed herein are directed to methods of treating subjects at risk for, or suffering from, lipodystrophy by administering one or more apoptosis inhibitors.
In another aspect, embodiments disclosed herein are directed to methods of treating subjects at risk for, or suffering from, lipodystrophy by administering one or more agents that can edit causal risk variants in adipocyte-progenitors to a wild-type or non-risk variant. In certain example embodiments, the causal variant includes rs12454712.
In another aspect, embodiments disclosed herein are directed to a method for identifying the presence of a rs6712203 or rs12454712 variant in a subject by conducting a genotyping assay on a biological sample from the subject. In one example embodiment, identification of the rs6712203 variant further comprises treating the subject with one or more agents that increase the expression or activity of COBLL1, or enhance actin remodeling; with a gene editing system that corrects the one or more variants to a wild-type sequence; or with adoptive cell transfer comprising adipocytes from an allogeneic donor exhibiting wild-type COBLL1 expression, or autologous adipocytes genetically modified to correct the one or more variants to a wild-type sequence. In one example embodiment, identification of the rs12454712 variant further comprises treating the subject with one or more agents that increase the expression or activity of BCL2, or inhibit apoptosis; with a gene editing system that corrects the one or more variants to a wild-type sequence; or with adoptive cell transfer comprising adipocytes from an allogeneic donor exhibiting wild-type expression, or autologous adipocytes genetically modified to correct the one or more variants to a wild-type sequence.
In another aspect, embodiments disclosed herein are directed to a method of treating a person at risk for, or suffering from, T2D, based on detecting one or more polygenic risk indicators, and administering one or more treatments that increase the expression or activity of COBLL1, or that enhance actin remodeling in adipocytes or adipocyte progenitors, if the polygenic risk indicator is detected, or treating the subject with a T2D standard-of-care therapy if the polygenic risk indicator is not detected.
In another aspect, embodiments disclosed herein are directed to methods for unbiased high-throughput multiplex profiling of morphological and cell phenotypes simultaneously. At least four fluorescent dyes may be used to stain cells. The stained cells are imaged using an automated image analysis pipeline, and morphological and cellular phenotypes are identified from the resulting images.
In one example embodiment, a method of treating subjects that are at risk for, or suffering from, Type-2 Diabetes (T2D) comprises administering to a subject in need thereof a therapeutically effective amount of one or more agents that increase the expression or activity of COBLL1, or that enhance actin remodeling, in adipocytes or adipocyte progenitors. In one example embodiment, the subject may suffer from a cellular dysfunction that leads to impairment of actin cytoskeleton remodeling in adipocytes and/or adipocyte progenitors. In another example embodiment, the subject may have one or more MONW/MHO risk loci.
In another example embodiment, a method of treating subjects that are at risk for, or suffering from lipodystrophy, comprises administering to a subject in need thereof, a therapeutically effective amount of one or more agents that increase the expression or activity of BCL2 or KDSR, decrease the expression or activity of VPS4B, or that inhibit apoptosis, in adipocytes or adipocyte progenitors. In one example embodiment, the subject may suffer from a cellular dysfunction that leads to impairment of mitochondrial mechanisms that prevent apoptosis in adipocytes. In another example embodiment, the subject may have one or more lipodystrophy risk loci.
As used herein “lipodystrophy” refers to a group of genetic or acquired disorders in which the body is unable to produce and maintain healthy fat tissue. The medical condition is characterized by abnormal or degenerative conditions of the body's adipose tissue. (“Lipo” is Greek for “fat”, and “dystrophy” is Greek for “abnormal or degenerative condition”.) This condition is also characterized by a lack of circulating leptin which may lead to osteosclerosis. The absence of fat tissue is associated with insulin resistance, hypertriglyceridemia, non-alcoholic fatty liver disease (NAFLD) and metabolic syndrome. Due to an insufficient capacity of subcutaneous adipose tissue to store fat, fat is deposited in non-adipose tissue (lipotoxicity), leading to insulin resistance. Patients display hypertriglyceridemia, severe fatty liver disease and little or no adipose tissue. Average patient lifespan is approximately 30 years before death, with liver failure being the usual cause of death. In contrast to the high levels seen in non-alcoholic fatty liver disease associated with obesity, leptin levels are very low in lipodystrophy. In certain embodiments, polygenic lipodystrophy includes insulin resistance with a “lipodystrophy-like” fat distribution, insulin sensitivity, BMI-adjusted T2D, increased BMI-adjusted waist-to-hip ratio (WHRadjBMI), and/or Type-2 Diabetes (T2D).
In certain example embodiments, a method of treating subjects that are at risk for, or are suffering from, Type 2 Diabetes (T2D) comprises administering one or more small molecules that increase expression of COBLL1, increase binding of POU2F2 to a binding site in an enhancer regulating COBLL1 expression, or enhance actin remodeling in adipocytes or adipocyte progenitors.
In certain example embodiments, a method of treating subjects that are at risk for, or are suffering from, lipodystrophy comprises administering one or more small molecules that increase expression of BCL2 in pre-adipocytes (e.g., subcutaneous AMSCs) and/or skeletal muscle, increase binding of BCL2 to pro-apoptotic proteins, or inhibit apoptosis in adipocytes.
In certain example embodiments, a method of treating subjects that are at risk for, or are suffering from, lipodystrophy comprises administering one or more small molecules that increase expression of KDSR in pre-adipocytes (e.g., subcutaneous AMSCs), increase activity of KDSR, or enhance mitochondrial function in adipocytes.
In certain example embodiments, a method of treating subjects that are at risk for, or are suffering from, lipodystrophy comprises administering one or more small molecules that decrease expression of VPS4B in pre-adipocytes (e.g., visceral AMSCs), decrease activity of VPS4B, or enhance mitochondrial function in adipocytes.
The term “small molecule” refers to compounds, preferably organic compounds, with a size comparable to those organic molecules generally used in pharmaceuticals. The term excludes biological macromolecules (e.g., proteins, peptides, nucleic acids, etc.). Preferred small organic molecules range in size up to about 5000 Da, e.g., up to about 4000, preferably up to 3000 Da, more preferably up to 2000 Da, even more preferably up to about 1000 Da, e.g., up to about 900, 800, 700, 600 or up to about 500 Da. In example embodiments, the small molecule may act as an antagonist or agonist.
In one example embodiment, a method for treating subjects suffering from, or at risk of, T2D comprises administering small molecules that target a similar mechanism of action as COBLL1, that is, enhancing actin remodeling in adipocytes or adipocyte progenitors.
Actin is a protein, and vertebrates have three main monomer isoforms: α-isoforms of skeletal, cardiac, and smooth muscles; β-isoforms in non-muscle and muscle cells; and γ-isoforms in non-muscle and muscle cells. Actin participates in protein-protein interactions and can transition between monomeric states called G-actin and filamentous states called F-actin. Actin plays a role in many cellular functions such as cell motility, cell shape, polarity, and regulation of transcription. Actin belongs to a structural superfamily with sugar kinases, hexokinases, and Hsp70 proteins. Actin comprises around 375 amino acids and folds into two major α/β domains, referred to as the inner and outer domains, which together comprise four subdomains.
The actin cytoskeleton comprises a network of fibrous actin and is the system that allows organelle, chromosome, and cell movement. It is also the structural support for a cell and can change the cell morphology by assembling or disassembling. This reorganization is also called actin remodeling and is controlled by actin-binding proteins that regulate nucleation, branching, elongation, bundling, severing, and capping of actin filaments.
Herein, improper actin cytoskeleton remodeling is implicated in metabolic disease progression. It has been shown that adipocyte size is positively correlated with impaired insulin sensitivity and glucose tolerance. Moreover, adipocyte size was shown to predict Type-2 diabetes. (Hansson, B., et al. Adipose cell size changes are associated with a drastic actin remodeling. Sci. Rep. 9, 12941, 2019). As a major structural modifier in adipocytes, actin cytoskeleton remodeling can be regulated as a treatment method for preventing or treating metabolic disorders or diseases. Further, the actin cytoskeleton remodeling process is required for differentiating subcutaneous adipocytes, and subsequent accumulation of lipids and development into metabolically active and insulin-sensitive subcutaneous adipocytes. Treatment may be regulation of COBLL1 expression, regulation of POU2F2 binding, and/or modification of a rs6712203 genetic variant.
In one example embodiment, actin remodeling can be enhanced by an agent selected from the group consisting of geodiamolides (Geodiamolide H), Jasplakinolide, Chondramide (Chondramide A), ADF/Cofilin, Arp2/3 complex, Profilin, Gelsolin (Flightless-I), Formin, Villin (Advillin), and Adseverin. In another example embodiment, the agent is a geodiamolide, which is a cyclodepsipeptide commonly derived from marine sponges. In specific non-limiting embodiments, the geodiamolide is Geodiamolide H. In another example embodiment, the agent is a jasplakinolide, also known as jaspamide, which is a cyclic peptide with a fifteen-carbon macrocyclic ring containing three amino acid residues: L-alanine, N-methyl-2-bromotryptophan, and β-tyrosine. In another example embodiment, the agent is chondramide, which is a cyclodepsipeptide isolated from the myxobacterium Chondromyces crocatus. In another example embodiment, the agent is ADF/cofilin, which are actin-binding proteins of the actin-depolymerization factor family. ADF may also be known as destrin. In some embodiments, the agent is Arp2/3 complex, which is an assembly of seven protein subunits. Two of the seven subunits are actin-related proteins ARP2 and ARP3. In another example embodiment, the agent is profilin, which is an actin-binding protein. In another example embodiment, the agent is gelsolin, which is an actin binding/regulatory protein. In specific non-limiting embodiments, the gelsolin is Flightless I. In another example embodiment, the agent is formin, which is a protein with a conserved FH2 domain that stabilizes actin. In certain example embodiments, the agent is villin, which is a calcium-regulated actin-binding protein. In specific non-limiting embodiments, the villin is advillin, a member of the gelsolin/villin superfamily of actin binding and regulatory proteins. In another example embodiment, the agent is adseverin, also known as scinderin, which belongs to the gelsolin superfamily and is an actin severing and capping protein.
Small Molecules that Inhibit Apoptosis or Target BCL2 Expression
In one example embodiment, a method for treating subjects suffering from, or at risk of, lipodystrophy comprises administering small molecules that inhibit apoptosis or enhance BCL2 expression in adipocytes or adipocyte progenitors (e.g., BCL2). In one example embodiment, apoptosis can be inhibited by an agent selected from the group consisting of Ginkgo biloba extract (EGb 761), Rhodiola crenulata extract (RCE), salidroside, dehydroepiandrosterone, allopregnanolone, diosmin, glycine, M50054, BI-6C9, TC9-305 (2-sulfonyl-pyrimidinyl derivatives), BI-11A7, 3-o-tolylthiazolidine-2,4-dione, minocycline, methazolamide, melatonin, gamma-tocotrienol (GTT), 3-hydroxypropyl-triphenylphosphonium (TPP)-conjugated imidazole-substituted oleic acid (TPP-IOA), TPP-conjugated stearic acid (TPP-ISA), TPP-6-ISA, CLZ-8, Xanthan gum (XG), PD98059, Vitamin E, and Tanshinone (see, e.g., El-Shimaa Mohamed Naguib Abdelhafez, Sara Mohamed Naguib Abdelhafez Ali, Mohamed Ramadan Eisa Hassan and Adel Mohammed Abdel-Hakem (June 20th 2019). Apoptotic Inhibitors as Therapeutic Targets for Cell Survival, Cytotoxicity-Definition, Identification, and Cytotoxic Compounds, Erman Salih Istifli and Hasan Basri Ila, IntechOpen, DOI: 10.5772/intechopen.85465).
Rhodiola crenulata extract (RCE) is an edible alcohol extract, conserving greatly the mitochondrial integrity and in turn prohibiting the release of cytochrome C, which leads to cell death. The effective concentration of the most important component, salidroside, was ~4% (w/w). Glycine can upregulate Bcl2 and Bcl2-bax (apoptosis regulator BAX). Minocycline directly inhibits the release of cytochrome C from mitochondria. Methazolamide was FDA approved for the treatment of glaucoma, while melatonin inhibited oxygen/glucose deprivation induced cell death, loss of mitochondrial membrane potential, release of mitochondrial factors, pro-IL-1β processing, and activation of caspase-1 and -3. Gamma-tocotrienol (GTT) prevents the activation of caspase-3 and caspase-9, reducing the release of cytochrome C from the mitochondria and preventing H2O2-induced apoptosis. 3-hydroxypropyl-triphenylphosphonium (TPP)-conjugated imidazole-substituted oleic acid (TPP-IOA) and stearic acid (TPP-ISA) exert strong specific liganding of heme-iron in cytochrome C/cardiolipin (CL) complex and effectively suppress its peroxidase activity and CL peroxidation, thus preventing cytochrome C release and cell death. TPP-6-ISA is an effective inhibitor of the peroxidase function of cyt c/CL complexes with a significant antiapoptotic activity. CLZ-8 is capable of targeting a PUMA protein and provides for apoptosis resistance. Xanthan gum (XG) is an extracellular polysaccharide secreted by microorganisms that decreases the apoptosis of chondrocytes, downregulates the expressions of active caspase-9, active caspase-3 and bax, and upregulates the expression of bcl-2. PD98059 inhibits apoptosis through inhibition of BAX and other factors. Vitamin E can modify BAX and BCL-2 expression levels. Tanshinone can inhibit the expression of Bax and stimulate the expression of Bcl-2.
In one example embodiment, subjects at risk for, or suffering from, T2D are treated by increasing expression of COBLL1 using a gene therapy approach. As used herein, the terms “gene therapy”, “gene delivery”, “gene transfer” and “genetic modification” are used interchangeably and refer to modifying or manipulating the expression of a gene to alter the biological properties of living cells for therapeutic use.
In one example embodiment, a vector for use in gene therapy comprises a sequence encoding COBLL1 or a functional fragment thereof, and is used to deliver said sequence to adipocytes or adipocyte progenitors to increase expression of COBLL1 in those cell types. The vector may further comprise one or more regulatory elements to control expression of COBLL1. The vector may further comprise regulatory/control elements, e.g., promoters, enhancers, introns, polyadenylation signals, Kozak consensus sequences, or internal ribosome entry sites (IRES). The vector may further comprise cellular localization signals, such as a nuclear localization signal (NLS) or nuclear export signal (NES). The vector may further comprise a targeting moiety that directs the vector specifically to adipocytes or adipocyte progenitors. In another example embodiment, the vector may comprise a viral vector with a tropism specific for adipocytes and adipocyte progenitors.
COBLL1, also known as CORDON-BLEU WH2 REPEAT PROTEIN-LIKE 1; CORDON-BLEU PROTEIN-LIKE 1; COBL-LIKE 1; COBLR1; and KIAA0977, is located on the human 2q24.3 locus. In one example embodiment, the polynucleotide sequence included in the vector is a DNA sequence derived from the primary accession number Q53SF7. In another example embodiment, the DNA sequence is Q53SF7. In another example embodiment, the DNA sequence is derived from the secondary accession numbers A6NMZ3, Q6IQ33, Q7Z316, Q9BRH4, Q9UG88, and Q9Y2I3. In another example embodiment, the DNA sequence is selected from the group consisting of A6NMZ3, Q6IQ33, Q7Z316, Q9BRH4, Q9UG88, and Q9Y2I3.
In another example embodiment, the polynucleotide sequence included in the vector is an RNA sequence derived from: NM_001365672; NM_014900; NM_001278458; NM_001278460; NM_001278461; NM_001365670; NM_001365671; NM_001365673; NM_001365674; or NM_001365675. In another example embodiment, the polynucleotide sequence included in the vector is an RNA sequence selected from the group consisting of: NM_001365672; NM_014900; NM_001278458; NM_001278460; NM_001278461; NM_001365670; NM_001365671; NM_001365673; NM_001365674; or NM_001365675. In another example embodiment, the sequence included in the vector is derived from mRNA selected from the group consisting of: AB023194.1; AI261693.1; AK001813.1; AK002054.1; AK002057.1; AK075181.1; AK225849.1; AK294937.1; AL049939.1; AL832824.1; BC006264.2; BC071588.1; BX537877.1; BX648994.1; BX649112.1; or CB989062.1. In another example embodiment, the sequence included in the vector is an mRNA sequence selected from the group consisting of: AB023194.1; AI261693.1; AK001813.1; AK002054.1; AK002057.1; AK075181.1; AK225849.1; AK294937.1; AL049939.1; AL832824.1; BC006264.2; BC071588.1; BX537877.1; BX648994.1; BX649112.1; or CB989062.1.
All gene name symbols as used throughout the specification refer to the gene as commonly known in the art. The examples described herein that refer to gene names are to be understood to encompass human genes, as well as genes in any other organism (e.g., homologous, orthologous genes). The term, homolog, may apply to the relationship between genes separated by the event of speciation (e.g., ortholog). Orthologs are genes in different species that evolved from a common ancestral gene by speciation. Normally, orthologs retain the same function in the course of evolution. Gene symbols may be those referred to by the HUGO Gene Nomenclature Committee (HGNC) or National Center for Biotechnology Information (NCBI). Any reference to the gene symbol is a reference made to the entire gene or variants of the gene. Reference to a gene encompasses the gene product (e.g., protein encoded for by the gene).
Recombinant expression vectors can comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory elements, which may be selected on the basis of the host cells to be used for expression, that is operably-linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, “operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory element(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). The term “operably linked” as used herein also refers to the functional relationship and position of a promoter sequence relative to a polynucleotide of interest (e.g., a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of that sequence). Typically, an operably linked promoter is contiguous with the sequence of interest. However, enhancers need not be contiguous with the sequence of interest to control its expression. The term “promoter”, as used herein, refers to a nucleic acid fragment that functions to control the transcription of one or more polynucleotides, located upstream of the polynucleotide sequence(s), and which is structurally identified by the presence of a binding site for DNA-dependent RNA polymerase, transcription initiation sites, and any other DNA sequences including, but not limited to, transcription factor binding sites, repressor, and activator protein binding sites, and any other sequences of nucleotides known in the art to act directly or indirectly to regulate the amount of transcription from the promoter. A “tissue-specific” promoter is only active in specific types of differentiated cells or tissues.
In another embodiment, the vector of the invention further comprises expression control sequences including, but not limited to, appropriate transcription sequences (i.e., initiation, termination, promoter, and enhancer), efficient RNA processing signals (e.g., splicing and polyadenylation (polyA) signals), sequences that stabilize cytoplasmic mRNA, sequences that enhance translation efficiency (i.e., Kozak consensus sequence), and sequences that enhance protein stability. A great number of expression control sequences, including promoters which are native, constitutive, inducible, or tissue-specific are known in the art and may be utilized according to the present invention.
In another embodiment, the vector of the invention further comprises a post-transcriptional regulatory region. In a preferred embodiment, the post-transcriptional regulatory region is the Woodchuck Hepatitis Virus post-transcriptional region (WPRE) or functional variants and fragments thereof and the PPT-CTS or functional variants and fragments thereof (see, e.g., Zufferey R, et al., J. Virol. 1999; 73:2886-2892; and Kappes J, et al., WO 2001/044481). In a particular embodiment, the post-transcriptional regulatory region is WPRE. The term “Woodchuck hepatitis virus posttranscriptional regulatory element” or “WPRE”, as used herein, refers to a DNA sequence that, when transcribed, creates a tertiary structure capable of enhancing the expression of a gene (see, e.g., Lee Y, et al., Exp. Physiol. 2005; 90 (1): 33-37 and Donello J, et al, J. Virol. 1998; 72 (6): 5085-5092).
The term “regulatory element” is intended to include promoters, enhancers, internal ribosomal entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences). Such regulatory elements are described, for example, in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990).
Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). A tissue-specific promoter may direct expression primarily in a desired tissue of interest, such as adipose tissue or particular cell types (e.g., adipocytes or adipocyte progenitors). Regulatory elements may also direct expression in a temporal-dependent manner, such as in a cell-cycle dependent or developmental stage-dependent manner, which may or may not also be tissue or cell-type specific. In some embodiments, a vector comprises one or more pol III promoters (e.g., 1, 2, 3, 4, 5, or more pol III promoters), one or more pol II promoters (e.g., 1, 2, 3, 4, 5, or more pol II promoters), one or more pol I promoters (e.g., 1, 2, 3, 4, 5, or more pol I promoters), or combinations thereof. Also encompassed by the term “regulatory element” are enhancer elements (e.g., adipose specific enhancers or Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE)). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression desired, etc. A vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., COBLL1).
In a preferred embodiment, the adipose-tissue specific regulatory region according to the invention comprises the adipose-specific aP2 enhancer and the basal aP2 promoter (see, e.g., Rival Y, et al., J. Pharmacol. Exp. Ther. 2004; 311 (2): 467-475). The region comprising the adipose-specific aP2 enhancer and the basal aP2 promoter is also known as “mini/aP2 regulatory region” and is formed by the basal promoter of the aP2 gene and the adipose-specific enhancer of said aP2 gene. Preferably, the aP2 promoter is murine. (See, e.g., Graves R, et al, Mol. Cell Biol. 1992; 12 (3): 1202-1208; and Ross S, et al, Proc. Natl. Acad. Sci. USA 1990; 87:9590-9594).
In another preferred embodiment, the adipose-tissue specific regulatory region according to the invention comprises the adipose-specific UCP1 enhancer and the basal UCP1 promoter. (See, e.g., del Mar Gonzalez-Barroso M, et al, J. Biol. Chem. 2000; 275 (41): 31722-31732; and Rim J, et al, J. Biol. Chem. 2002; 277 (37): 34589-34600). The region comprising the adipose-specific UCP1 enhancer and the basal UCP1 promoter is also known as “mini/UCP regulatory region” and refers to a combination of the basal promoter of the UCP1 gene and the adipose-specific enhancer of said UCP1 gene. Preferably, a rat UCP1 promoter is used. (See, e.g., Larose M, et al, J. Biol. Chem. 1996; 271 (49): 31533-31542; and Cassard-Doulcier A, et al, Biochem. J. 1998; 333:243-246).
In general, and throughout this specification, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. There are no limitations regarding the type of vector that can be used. The vector can be a cloning vector, suitable for propagation and for obtaining polynucleotides, gene constructs or expression vectors incorporated to several heterologous organisms. Suitable vectors include eukaryotic expression vectors based on viral vectors (e.g., adenoviruses, adeno-associated viruses as well as retroviruses and lentiviruses), as well as non-viral vectors such as plasmids.
In one example embodiment, the vector is a viral vector, wherein virally-derived DNA or RNA sequences are present in the vector for packaging into a virus (e.g., retroviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, and adeno-associated viruses). Viral vectors also include polynucleotides carried by a virus for transfection into a host cell. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operably-linked. Such vectors are referred to herein as “expression vectors.” Vectors for and that result in expression in a eukaryotic cell can be referred to herein as “eukaryotic expression vectors.” In another example embodiment, the vector integrates the gene into the cell genome or is maintained episomally.
In one example embodiment, COBLL1 is introduced to adipocytes or adipocyte progenitors by means of an AAV viral vector. The terms “adeno-associated virus”, “AAV virion”, and “AAV particle”, as used interchangeably herein, refer to a virion composed of at least one AAV capsid protein (preferably all capsid proteins of a particular AAV serotype) and an encapsidated polynucleotide AAV genome. If the particle comprises a heterologous polynucleotide flanked by AAV inverted terminal repeats (i.e., a polynucleotide that is not a wild-type AAV genome, e.g., a transgene to be delivered to a mammalian cell), it is often referred to as an “AAV vector particle” or “AAV vector”. AAV refers to a virus belonging to the genus Dependovirus of the family Parvoviridae. The AAV genome is approximately 4.7 kilobases long and consists of single-stranded deoxyribonucleic acid (ssDNA), which can be in either the positive or negative orientation. The genome comprises Inverted Terminal Repeats (ITRs) at both ends of the DNA strand, and two Open Reading Frames (ORFs): rep and cap. The rep frame is formed by four overlapping genes encoding the Rep proteins required for the AAV life cycle. The cap frame contains overlapping nucleotide sequences of the capsid proteins VP1, VP2, and VP3, which interact together to form a capsid of icosahedral symmetry (see, e.g., Carter B, Adeno-associated virus and adeno-associated virus vectors for gene delivery, Lassic D, et al, eds., “Gene Therapy: Therapeutic Mechanisms and Strategies” (Marcel Dekker, Inc., New York, NY, US, 2000); and Gao G, et al, J. Virol. 2004; 78 (12): 6381-6388). The term “adeno-associated virus ITR” or “AAV ITR” as used herein refers to inverted terminal repeats present at both ends of the DNA strand of the genome of an adeno-associated virus. The ITR sequences are required for efficient replication of the AAV genome. Another characteristic of these sequences is their ability to form hairpins. This property contributes to their self-priming, which allows synthesis of the second DNA strand independent of a primase. It has also been shown that ITRs are essential for integration of wild-type AAV DNA into the host cell genome (i.e., chromosome 19 in humans) and its rescue therefrom, and for efficient encapsidation of the AAV DNA, yielding fully assembled, DNase-resistant AAV particles.
The term “AAV vector” as used herein further refers to a vector comprising one or more polynucleotides of interest (or transgenes) flanked by AAV inverted terminal repeats (ITRs). Such AAV vectors can be replicated and packaged as infectious viral particles when present in a host cell that has been transfected with a vector that can encode and express Rep and Cap gene products (i.e., AAV Rep and Cap proteins), and wherein the host cell has been transfected with a vector that encodes and expresses proteins from adenovirus open reading frame E4orf 6. When an AAV vector is incorporated into a larger polynucleotide (e.g., a chromosome or another vector, such as a plasmid for cloning or transfection), then the AAV vector is typically referred to as a “pro-vector”. This pro-vector can be “rescued” by replication and encapsidation in the presence of AAV packaging functions and the necessary helper functions provided by E4orf 6.
In one example embodiment, gene therapy uses an adeno-associated viral (AAV) vector comprising a recombinant viral genome wherein said recombinant viral genome comprises an expression cassette comprising an adipose tissue-specific transcriptional regulatory region operably linked to a polynucleotide encoding for COBLL1 (AAV vectors can also be used for any compositions described herein, such as a programmable nuclease). AAV according to the present invention can include any serotype of the 42 serotypes of AAV known. In another example embodiment, the AAV is as described previously for adipose tissue specific tropism (see, e.g., WO2014020149A1; and Bates R, Huang W, Cao L. Adipose Tissue: An Emerging Target for Adeno-associated Viral Vectors. Mol Ther Methods Clin Dev. 2020; 19:236-249). In particular, the AAV may include an adipocyte specific promoter.
In particular, the AAV of the present invention may belong to the serotype AAV1, AAV2, AAV3 (including types 3A and 3B), AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11 and any other AAV. In a preferred embodiment, the adeno-associated viral vector of the invention is of a serotype selected from the group consisting of the AAV6, AAV7, AAV8, and AAV9 serotypes. In more preferred embodiments, the adeno-associated viral vector of the invention is an AAV8 serotype. In more preferred embodiments, the adeno-associated viral vector of the invention is the engineered hybrid serotype Rec2 (see, e.g., Charbel Issa, et al., 2013, Assessment of tropism and effectiveness of new primate-derived hybrid recombinant AAV serotypes in the mouse and primate retina PLOS ONE, 8 (2013), p. e60361). In one example embodiment, Rec2 can be used for oral administration, as oral administration of Rec2 results in preferential transduction of BAT with absence of transduction in the gastrointestinal tract.
The genome of the AAV according to the invention typically comprises the cis-acting 5′ and 3′ inverted terminal repeat sequences and an expression cassette (see, e.g., Tijssen P, Ed., “Handbook of Parvoviruses” (CRC Press, Boca Raton, FL, US, 1990, pp. 155-168)).
The polynucleotide of the invention can comprise ITRs derived from any one of the AAV serotypes. In a preferred embodiment, the ITRs are derived from the AAV2 serotype. The AAV of the invention comprises a capsid from any serotype. In a particular embodiment, the capsid is derived from an AAV of the group consisting of AAV1, AAV2, AAV4, AAV5, AAV6, AAV7, AAV8 and AAV9. In a preferred embodiment, the AAV of the invention comprises a capsid derived from the AAV8 or AAV9 serotypes.
In another particular embodiment, the AAV vector is a pseudotyped AAV vector (i.e., the vector comprises sequences or components originating from at least two distinct AAV serotypes). In a particular embodiment, the pseudotyped AAV vector comprises an AAV genome derived from one AAV serotype (e.g., AAV2), and a capsid derived at least in part from a distinct AAV serotype. In a preferred embodiment, the adeno-associated viral vector used in the method for transducing cells in vitro or in vivo has a serotype selected from the group consisting of AAV6, AAV7, AAV8, and AAV9, and the adeno-associated virus ITRs are AAV2 ITRs.
In one example embodiment, adeno-associated viral vectors of the AAV6, AAV7, AAV8, and AAV9 serotypes are capable of transducing adipose tissue cells efficiently. This feature makes possible the development of methods for the treatment of diseases which require or may benefit from the expression of a polynucleotide of interest in adipocytes (e.g., COBLL1). In particular, this finding facilitates the delivery of polypeptides of interest to a subject in need thereof by administering the AAV vectors of the invention to the patient, thus generating adipocytes capable of expressing the polynucleotide of interest and its encoded polypeptide in vivo (e.g., COBLL1).
In one embodiment the AAV vector contains one promoter with the addition of at least one target sequence of at least one miRNA.
In one example embodiment, the transcriptional regulatory region within the AAV comprises a mini/aP2 regulatory region when white adipocytes or stem cells for differentiating to white adipocytes are transduced. In another example embodiment, the transcriptional regulatory region within the AAV comprises a mini/UCP1 regulatory region when brown adipocytes or stem cells for differentiating to brown adipocytes are transduced. In another example embodiment, the transduced cells can be implanted in the human or animal body to obtain the desired therapeutic effect (described further herein in section on ACT). Thus, the invention also relates to a method for the treatment or prevention of a disease which comprises administering to a subject in need thereof the adipocytes or cell compositions obtained according to the method of the invention.
In one example embodiment, COBLL1 is introduced to adipocytes or adipocyte progenitors by means of a lentiviral vector (see, e.g., Balkow A, Hoffmann L S, Klepac K, et al. Direct lentivirus injection for fast and efficient gene transfer into brown and beige adipose tissue. J Biol Methods. 2016; 3 (3): e48. Published 2016 Jul. 16. doi: 10.14440/jbm.2016.123). Lentiviruses are enveloped, single stranded RNA viruses that belong to the family of Retroviridae. Moreover, lentiviral vectors are preferred as they are able to transduce or infect non-dividing cells and typically produce high viral titers.
In one example embodiment, the vector is a “plasmid,” which refers to a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques.
In one example embodiment, the vector is an mRNA vector (see, e.g., Sahin, U, Kariko, K and Tureci, O (2014). mRNA-based therapeutics-developing a new class of drugs. Nat Rev Drug Discov 13:759-780; Weissman D, Kariko K. mRNA: Fulfilling the Promise of Gene Therapy. Mol Ther. 2015; 23 (9): 1416-1417. doi: 10.1038/mt.2015.138; Kowalski P S, Rudra A, Miao L, Anderson D G. Delivering the Messenger: Advances in Technologies for Therapeutic mRNA Delivery. Mol Ther. 2019; 27 (4): 710-728. doi: 10.1016/j.ymthe.2019.02.012; Magadum A, Kaur K, Zangi L. mRNA-Based Protein Replacement Therapy for the Heart. Mol Ther. 2019; 27 (4): 785-793. doi: 10.1016/j.ymthe.2018.11.018; Reichmuth A M, Oberli M A, Jaklenec A, Langer R, Blankschtein D. mRNA vaccine delivery using lipid nanoparticles. Ther Deliv. 2016; 7 (5): 319-334. doi: 10.4155/tde-2016-0006; and Khalil A S, Yu X, Umhoefer J M, et al. Single-dose mRNA therapy via biomaterial-mediated sequestration of overexpressed proteins. Sci Adv. 2020; 6 (27): eaba2422). In an exemplary embodiment, mRNA encoding for COBLL1 is delivered using lipid nanoparticles (see, e.g., Reichmuth, et al., 2016) and administered directly to adipose tissue. In an exemplary embodiment, mRNA encoding for COBLL1 is delivered using biomaterial-mediated sequestration (see, e.g., Khalil, et al., 2020) and administered directly to adipose tissue. Sequences present in mRNA molecules, as described further herein, are applicable to mRNA vectors (e.g., Kozak consensus sequence, miRNA target sites and WPRE).
In one example embodiment, the non-viral vector for use in gene transfer and/or nanoparticle formulations is a lipid. In one example embodiment the non-viral lipid vector may comprise: 1,2-Dioleoyl-sn-glycero-3-phosphatidylcholine; 1,2-Dioleoyl-sn-glycero-3-phosphatidylethanolamine; Cholesterol; N-[1-(2,3-Dioleyloxy)propyl]-N,N,N-trimethylammonium chloride; 1,2-Dioleoyloxy-3-trimethylammonium-propane; Dioctadecylamidoglycylspermine; N-(3-Aminopropyl)-N,N-dimethyl-2,3-bis(dodecyloxy)-1-propanaminium bromide; Cetyltrimethylammonium bromide; 6-Lauroxyhexyl ornithinate; 1-(2,3-Dioleoyloxypropyl)-2,4,6-trimethylpyridinium; 2,3-Dioleyloxy-N-[2-(sperminecarboxamido)ethyl]-N,N-dimethyl-1-propanaminium trifluoroacetate; 1,2-Dioleyl-3-trimethylammonium-propane; N-(2-Hydroxyethyl)-N,N-dimethyl-2,3-bis(tetradecyloxy)-1-propanaminium bromide; Dimyristooxypropyl dimethyl hydroxyethyl ammonium bromide; 3β-[N—(N′,N′-Dimethylaminoethane)-carbamoyl]cholesterol; Bis-guanidium-tren-cholesterol; 1,3-Dioleoyloxy-2-(6-carboxy-spermyl)-propylamide; Dimethyloctadecylammonium bromide; Dioctadecylamidoglicylspermidin; rac-[(2,3-Dioctadecyloxypropyl) (2-hydroxyethyl)]-dimethylammonium chloride; rac-[2 (2,3-Dihexadecyloxypropyl-oxymethyloxy)ethyl]trimethylammonium bromide; Ethyldimyristoylphosphatidylcholine; 1,2-Distearyloxy-N,N-dimethyl-3-aminopropane; 1,2-Dimyristoyl-trimethylammonium propane; O,O′-Dimyristyl-N-lysyl aspartate; 1,2-Distearoyl-sn-glycero-3-ethylphosphocholine; N-Palmitoyl D-erythro-sphingosyl carbamoyl-spermine; N-t-Butyl-N0-tetradecyl-3-tetradecylaminopropionamidine; Octadecenolyoxy [ethyl-2-heptadecenyl-3 hydroxyethyl]imidazolinium chloride; N1-Cholesteryloxycarbonyl-3,7-diazanonane-1,9-diamine; 2-(3-[Bis(3-amino-propyl)-amino]propylamino)-N-ditetradecylcarbamoylme-ethyl-acetamide; 1,2-dilinoleyloxy-3-dimethylaminopropane; 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane; and dilinoleyl-methyl-4-dimethylaminobutyrate.
In one example embodiment, the non-viral vector for use in gene transfer and/or nanoparticle formulations is a polymer. In one example embodiment the non-viral polymer vector may comprise: Poly(ethylene)glycol; Polyethylenimine; Dithiobis(succinimidylpropionate); Dimethyl-3,3′-dithiobispropionimidate; Poly(ethylene imine)biscarbamate; Poly(L-lysine); Histidine modified PLL; Poly(N-vinylpyrrolidone); Poly(propylenimine); Poly(amidoamine); Poly(amido ethylenimine); Triethylenetetramine; Poly(β-aminoester); Poly(4-hydroxy-L-proline ester); Poly(allylamine); Poly(α-[4-aminobutyl]-L-glycolic acid); Poly(D,L-lactic-co-glycolic acid); Poly(N-ethyl-4-vinylpyridinium bromide); Poly(phosphazene)s; Poly(phosphoester)s; Poly(phosphoramidate)s; Poly(N-2-hydroxypropylmethacrylamide); Poly(2-(dimethylamino)ethyl methacrylate); Poly(2-aminoethyl propylene phosphate); Chitosan; Galactosylated chitosan; N-Dodecylated chitosan; Histone; Collagen; and Dextran-spermine.
In one example embodiment, gene therapy vectors are used that have tropism for expression in adipocytes or adipocyte progenitors. In another example embodiment, the transcriptional regulatory region may comprise a promoter and, optionally, an enhancer region. Preferably, the promoter is specific for adipose tissue. The enhancer need not be specific for adipose tissue. Alternatively, the transcriptional regulatory region may comprise an adipose tissue-specific promoter and an adipose tissue-specific enhancer. In one embodiment, the tissue-specific promoter is an adipocyte-specific promoter such as, for example, the adipocyte protein 2 (aP2, also known as fatty acid binding protein 4 (FABP4)) promoter, the PPARγ promoter, the adiponectin promoter, the phosphoenolpyruvate carboxykinase (PEPCK) promoter, the promoter derived from human aromatase cytochrome p450 (p450arom), or the Foxa-2 promoter (see, e.g., Graves R, et al, Genes Dev. 1991; 5:428-437; Ross S, et al, Proc. Natl. Acad. Sci. USA 1990; 87:9590-9594; Simpson E, et al., U.S. Pat. No. 5,446,143; Mahendroo M, et al., J. Biol. Chem. 1993; 268:19463-19470; Simpson E, et al., Clin. Chem. 1993; 39:317-324; and Sasaki H, et al., Cell 1994; 76:103-115). In a preferred embodiment, the enhancer region is selected from the group consisting of the adipose-specific aP2 enhancer and the adipose-specific UCP1 enhancer. In another example embodiment, an adipose-specific promoter is much less potent than a ubiquitous promoter. Thus, a ubiquitous promoter, such as hybrid cytomegalovirus enhancer/chicken β-actin (CBA or CAG) or cytomegalovirus (CMV), is used. In another example embodiment, a ubiquitous promoter is used in combination with any adipose targeting strategy described herein or when the vector is administered locally to adipose tissue. In another example embodiment, systemic delivery utilizes an adipose-specific promoter with a higher dose, while local delivery utilizes a CBA or CMV promoter with a lower dosage.
In one embodiment, the vector contains at least one target sequence of at least one miRNA expressed in non-adipose tissue. In another example embodiment, liver- and heart-specific abundant miRNAs are used to de-target or suppress transgene expression in liver and heart by embedding the miRNA target sequences in the vectors, in particular for AAV8 vectors. In one embodiment, the target sequence of at least one miRNA is located in the 3′ untranslated region (3′UTR) of cellular messenger RNA (mRNA). Exemplary target sequences of the at least one miRNA include, but are not limited to, miR1 (miRbase database accession numbers MI0000651 and MI0000437), miR122 or miR122a (MI0000442), miR152 (MI0000462), miR199 (MI0000242), miR215 (MI0000291), miR192 (MI0000234), miR148a (MI0000253), miR194 (MI0000488), miR133 (MI0000450), miR206 (MI0000490), miR208 (MI0000251), miR124 (MI0000443), miR125 (MI0000469), miR216 (MI0000292), and miR130 (MI0000448). In preferred embodiments, the miRNA target sites are selected from miRNA122a and miRNA1. In another example embodiment, 1, 2, 3, or 4 repeat target sites for each miRNA can be used. Sequence references are publicly available and may be obtained from the miRbase (www.mirbase.org/). The term “microRNAs” or “miRNAs”, as used herein, refers to small (~22-nt), evolutionarily conserved, regulatory RNAs involved in RNA-mediated gene silencing at the post-transcriptional level (see, e.g., Bartel DP. Cell 2004; 116:281-297). Through base pairing with complementary regions (most often in the 3′ untranslated region (3′UTR) of cellular messenger RNA (mRNA)), miRNAs can act to suppress mRNA translation or, upon high-sequence homology, cause the catalytic degradation of mRNA. Because of the highly differential tissue expression of many miRNAs, cellular miRNAs can be exploited to mediate tissue-specific targeting of gene therapy vectors. By engineering tandem copies of target elements perfectly complementary to tissue-specific miRNAs (miRT) within vectors, transgene expression in undesired tissues can be efficiently inhibited.
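By way of non-limiting illustration only, the construction of tandem, perfectly complementary miRNA target elements may be sketched in Python as follows. The function names, the spacer between repeats, and the miRNA sequence shown are hypothetical placeholders introduced for this sketch and are not taken from miRbase or from the present disclosure; an actual design would use the mature miRNA sequence for the accession of interest.

```python
# Illustrative sketch only: names, spacer, and the example miRNA are hypothetical.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def perfect_target_site(mirna: str) -> str:
    """Return the DNA sequence (5'->3') perfectly complementary to a mature miRNA (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(mirna.upper()))


def tandem_target_cassette(mirna: str, copies: int = 4, spacer: str = "CGAT") -> str:
    """Join 1 to 4 copies of the target site, separated by an arbitrary (assumed) spacer."""
    if not 1 <= copies <= 4:
        raise ValueError("this sketch supports 1, 2, 3, or 4 repeat target sites")
    site = perfect_target_site(mirna)
    return spacer.join([site] * copies)


if __name__ == "__main__":
    placeholder_mirna = "AUGGCUUACGGAUCCAUGCAAU"  # made-up 22-nt sequence, not a real miRNA
    print(tandem_target_cassette(placeholder_mirna, copies=4))
```

The resulting cassette would be placed in the 3′UTR of the transgene; the spacer length and composition are design choices that this sketch does not attempt to optimize.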
In another example embodiment, a method for treating subjects at risk for, or suffering from, T2D comprises administering a COBLL1 recombinant polypeptide. In certain embodiments, recombinant COBLL1 protein is delivered intracellularly to a subject in need thereof and is used as a protein therapeutic. Protein therapeutics offer high specificity, and the ability to treat “undruggable” targets, in diseases associated with protein deficiencies or mutations (e.g., COBLL1). As used herein COBLL1 protein includes all variants and protein fragments, described further herein. Previous studies have found that COBLL1 interacts with ROR1 (Plešingerová, et al. Expression of COBLL1 encoding novel ROR1 binding partner is robust predictor of survival in chronic lymphocytic leukemia. Haematologica. 2018; 103 (2): 313-324). Applicants discovered that COBLL1 plays a role in the remodeling of the actin cytoskeleton, specifically, actin remodeling in differentiating adipocytes. Thus, while not being bound by a particular scientific theory, it is expected that administration of functional COBLL1 protein may restore proper actin remodeling in differentiating adipocytes.
COBLL1 has the following domains: WH2, COBL-like, and Cordon-bleu_ubiquitin_domain. The WH2 (WASP-Homology 2, or Wiskott-Aldrich homology 2) domain is an ~18 amino acid actin-binding motif. Single WH2 domains can sequester G-actin. COBL contains three G-actin-binding WH2 domains and acts as a dynamizer of actin assembly. COBL has profilin-like filament nucleating and severing activities. The Cordon-bleu_ubiquitin_domain protein domain is highly conserved among vertebrates. The sequence contains three repeated lysine, arginine, and proline-rich regions, the KKRAP motif. It is expressed specifically in the node. This domain has a ubiquitin-like fold. In certain embodiments, full length COBLL1 protein is administered. In one example embodiment, a COBLL1 sequence selected from Table A is administered. In certain embodiments, a truncated COBLL1 protein is administered. For example, protein domains that function in the nucleus are not required for the recombinant protein (e.g., AR interacting domains). Further, only the actin binding domains and domains required for actin remodeling are required. Various methods can be used for delivery of COBLL1 to adipose cells. In certain embodiments, COBLL1 is delivered in a composition capable of delivering COBLL1 intracellularly.
In another example embodiment, a method for treating subjects at risk for, or suffering from, lipodystrophy comprises administering a BCL2 recombinant polypeptide. In certain embodiments, recombinant BCL2 protein is delivered intracellularly to a subject in need thereof and is used as a protein therapeutic. Protein therapeutics offer high specificity, and the ability to treat “undruggable” targets, in diseases associated with protein deficiencies or mutations (e.g., BCL2). As used herein BCL2 protein includes all variants and protein fragments, described further herein. Previous studies have found that BCL2 promotes and inhibits apoptosis, and that the BCL-2 family proteins are evolutionarily conserved and share BCL2 homology (BH) domains. Choudhury, A comparative analysis of BCL-2 family, Bioinformation. 2019; 15 (4): 299-306. In an aspect, the BCL2 is selected from three groups based on their primary function: (1) anti-apoptotic proteins (BCL-2, BCL-XL, BCL-W, MCL-1, BFL-1/A1), (2) pro-apoptotic pore-formers (BAX, BAK, BOK) and (3) pro-apoptotic BH3-only proteins (BAD, BID, BIK, BIM, BMF, HRK, NOXA, PUMA, etc.). In an aspect, the BCL-2 comprises a BH3 domain. In embodiments, the BCL-2 protein is an anti-apoptotic or pore-former protein and comprises BH1, BH2, BH3 and BH4 domains. See, e.g., Kale, J., Osterlund, E. & Andrews, D. BCL-2 family proteins: changing partners in the dance towards death. Cell Death Differ 25, 65-80 (2018). Residues of the domains in BCL-2 are generally conserved: BH1 (residues 136-155), BH2 (187-202), BH3 (93-107) and BH4 (10-30). See, e.g., Reed J C, Zha H, Aime-Sempe C, Takayama S, Wang H G. Structure-function analysis of Bcl-2 family proteins. Regulators of programmed cell death. Adv Exp Med Biol. 1996; 406:99-112. In an aspect, the BCL-2 is an anti-apoptotic protein and comprises both BH1 and BH2 domains. In an aspect, the BCL-2 protein may be truncated at the BH4 domain.
As disclosed herein, the variant causes BCL2 to be reduced in subcutaneous AMSCs and skeletal muscle. The reduction is in the stem cells at day 0, but the effect on increased apoptosis is seen in mature adipocytes. Thus, while not being bound by a particular scientific theory, it is expected that administration of functional BCL2 protein may improve or enhance modulation of disease susceptibility in T2D. In an aspect, the administration of BCL-2 is provided when the risk allele rs12454712 is present.
In certain embodiments, full length BCL2 protein is administered. In one example embodiment, a BCL2 sequence selected from Table 2 is administered. In certain embodiments, a truncated BCL2 protein is administered. In an aspect, an isoform of a BCL-2 or BCL-2-like protein, for example, BCL2L1, BCL2L2, BCL2L10, BCL2L12, BCL2L13, BCL2L14, BCL2L15 is provided. Various methods can be used for delivery of BCL2 to adipose cells. In certain embodiments, BCL2 is delivered in a composition capable of delivering BCL2 intracellularly. In embodiments, BCL2 is administered to skeletal muscle or AMSCs.
In an example embodiment, a method for treating subjects at risk for, or suffering from, lipodystrophy comprises administering a 3-ketodihydrosphingosine reductase (KDSR) recombinant polypeptide. In certain embodiments, recombinant KDSR protein is delivered intracellularly to a subject in need thereof and is used as a protein therapeutic. Protein therapeutics offer high specificity, and the ability to treat “undruggable” targets, in diseases associated with protein deficiencies or mutations (e.g., KDSR). As used herein KDSR protein includes all variants and protein fragments, described further herein. In an aspect, KDSR comprises the sequence
Previous studies have found that the putative active site residues of the KDSR-encoded protein are found on the cytosolic side of the endoplasmic reticulum membrane. Key structural elements of KDSR include transmembrane anchors near the N-terminal and C-terminal ends of the protein, Rossmann folds, and a highly conserved domain containing three putative catalytic sites. See, e.g., Bariana, T. K., et al. (2019). Sphingolipid dysregulation due to lack of functional KDSR impairs proplatelet formation causing thrombocytopenia. Haematologica, 104 (5), 1036-1045. Doi: 10.3324/haematol.2018.20478. The TyrXXXLys, Asn, and Ser residues form the canonical catalytic triad, and the putative NAD binding site is identified as ThrGlyXXXGlyXGly (SEQ ID NO: 21). See, Boyden et al., Mutations in KDSR Cause Recessive Progressive Symmetric Erythrokeratoderma, The American Journal of Human Genetics 100, 978-984, Jun. 1, 2017; doi: 10.1016/j.ajhg.2017.05.003. Applicants discovered that KDSR plays a role in adipocytes. Thus, while not being bound by a particular scientific theory, it is expected that administration of functional KDSR protein may provide treatment for metabolic disease, alone or in combination with BCL2 and/or COBLL1.
In certain embodiments, full length KDSR protein is administered. In certain embodiments, a truncated KDSR protein is administered. Various methods can be used for delivery of KDSR to adipose cells. In certain embodiments, KDSR is delivered in a composition capable of delivering KDSR intracellularly, in an aspect delivered to AMSCs.
In an example embodiment, a method for treating subjects at risk for, or suffering from, lipodystrophy comprises reducing the expression or activity of a Vacuolar protein sorting-associated protein 4B (VPS4B) recombinant polypeptide. As used herein, VPS4B protein includes all variants and protein fragments, described further herein. In an aspect, VPS4B comprises the sequence:
Vps4 is an adenosine triphosphatase associated with diverse cellular activities (AAA) family member, a subfamily of the AAA+ superfamily. AAA+ ATPases function in assembly/disassembly of protein complexes, protein transport and protein degradation. See, e.g., Ogura T, Wilkinson A J. AAA+ superfamily ATPases: common structure-diverse function. Genes Cells 2001; 6:575-597. VPS4B is a mammalian homologue of Vps4p, and is also referred to as suppressor of K+ transport growth defect 1 (SKD1). VPS4B comprises an AAA domain, which is further divided into an alpha/beta domain and an alpha helical domain, a beta-domain inserted within the AAA alpha helical domain, and a C-terminal alpha helix (helix alpha10). See, Inoue et al., Traffic (2008) 9:12, 2180-2189. Human VPS4B shows 96% amino acid sequence identity with mouse SKD1; however, the apo form of the human VPS4B structure comprises an N-terminal beta strand structure, an N-terminal region (residues 1-122) including the microtubule-interacting and trafficking (MIT) domain, and α/β domains (residues 123-300 and 425-444).
Applicants discovered that increased expression or activity of VPS4B plays a role in lipid-accumulating cells; for example, increased expression is associated with risk or presence of metabolic disease. Thus, while not being bound by a particular scientific theory, it is expected that administration of a catalytically inactive VPS4B or a molecule that inhibits VPS4B may be used for treatment of subjects suffering from or at risk of metabolic disease. In an aspect, the VPS4B comprises one or more mutations. In one embodiment, inhibition of VPS4B function is by short hairpin VPS4B (sh-VPS4B) or expression of dominant negative VPS4B (E235Q). See, Lin et al., Identification of an AAA ATPase VPS4B-Dependent Pathway That Modulates Epidermal Growth Factor Receptor Abundance and Signaling during Hypoxia, (2012) Mol. and Cell. Biol. 32:6, 1124-1138; doi: 10.1128/MCB.06053-11. In certain embodiments, a short hairpin VPS4B protein is administered.
In one embodiment, a method of treating subjects at risk for, or suffering from, T2D comprises administering a gene editing system that corrects one or more genomic variants that decrease the expression of COBLL1 in adipocytes and/or adipocyte progenitors. In one example embodiment, the gene editing system is used to edit one or more variants that reduce COBLL1 expression. In one example embodiment, the one or more variants reduce binding of POU2FA to an enhancer controlling COBLL1 expression. In another example embodiment, the gene editing system is used to edit a rs6712203 variant from C to T. In one embodiment, a method of treating subjects at risk for, or suffering from, lipodystrophy comprises administering a gene editing system that corrects one or more genomic variants that decrease the expression of BCL2 in adipose-derived mesenchymal stem cells (AMSCs) or skeletal muscle and/or KDSR in AMSCs. In one example embodiment, the gene editing system is used to edit one or more variants that reduce BCL2 and/or KDSR expression. In one embodiment, a method of treating subjects at risk for, or suffering from, lipodystrophy comprises administering a gene editing system that corrects one or more genomic variants that increase the expression of VPS4B in AMSCs. In one example embodiment, the gene editing system is used to edit one or more variants that increase VPS4B expression. In another example embodiment, the gene editing system is used to edit a rs12454712 variant from T to C.
In certain example embodiments, a programmable nuclease may be used to edit a genomic region comprising one or more genomic variants associated with decreased expression or activity of COBLL1 in adipocytes or adipocyte progenitors. In certain example embodiments, a programmable nuclease may be used to edit a genomic region comprising one or more genomic variants associated with increased expression or activity of VPS4B in AMSCs. In example embodiments, a programmable nuclease may be used to edit a genomic region comprising one or more genomic variants associated with decreased expression or activity of BCL2 in skeletal muscle, or with decreased expression or activity of BCL2 or KDSR in AMSCs. Gene editing using programmable nucleases may utilize two different cell repair pathways, non-homologous end joining (NHEJ) and homology directed repair (HDR). In certain example embodiments, HDR is used to provide a template that replaces a genomic region comprising the variant with a donor that edits the risk variant to a wild-type or non-risk variant. Example programmable nucleases for use in this manner include zinc finger nucleases (ZFNs), TALE nucleases (TALENs), meganucleases, and CRISPR-Cas systems.
In one example embodiment, the gene editing system is a CRISPR-Cas system. The CRISPR-Cas systems comprise a Cas polypeptide and a guide sequence, wherein the guide sequence is capable of forming a CRISPR-Cas complex with the Cas polypeptide and directing site-specific binding of the CRISPR-Cas sequence to a target sequence. The Cas polypeptide may induce a double- or single-stranded break at a designated site in the target sequence. The site of CRISPR-Cas cleavage, for most CRISPR-Cas systems, is dictated by distance from a protospacer-adjacent motif (PAM), discussed in further detail below. Accordingly, a guide sequence may be selected to direct the CRISPR-Cas system to induce cleavage at a desired target site at or near the one or more variants.
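As a non-limiting sketch of how a guide sequence might be selected relative to a PAM so that cleavage falls at or near a variant, the following Python example scans one strand of a sequence for candidate sites. It rests on assumptions not stated in the present disclosure: an SpCas9-style 5′-NGG-3′ PAM, a 20-nt protospacer immediately 5′ of the PAM, and a cut site 3 bp upstream of the PAM. The function name and the window parameter are hypothetical, and only the given strand is scanned.

```python
# Illustrative sketch only, assuming an SpCas9-style NGG PAM and a cut 3 bp 5' of the PAM.
from typing import List, Tuple


def candidate_guides(seq: str, variant_idx: int, window: int = 25,
                     spacer_len: int = 20) -> List[Tuple[str, str, int]]:
    """Return (protospacer, PAM, cut_index) for NGG sites whose predicted cut
    lies within `window` bp of the 0-based variant position on the given strand."""
    seq = seq.upper()
    hits = []
    for pam_start in range(spacer_len, len(seq) - 2):
        if seq[pam_start + 1:pam_start + 3] == "GG":      # N-G-G
            cut_index = pam_start - 3                      # cut falls just 5' of this index
            if abs(cut_index - variant_idx) <= window:
                protospacer = seq[pam_start - spacer_len:pam_start]
                hits.append((protospacer, seq[pam_start:pam_start + 3], cut_index))
    return hits


if __name__ == "__main__":
    toy_sequence = "A" * 20 + "TGG" + "A" * 5   # toy input, not a genomic sequence
    for protospacer, pam, cut in candidate_guides(toy_sequence, variant_idx=17):
        print(protospacer, pam, cut)
```

A fuller implementation would also scan the reverse complement strand and accommodate the PAM requirements of other Cas polypeptides.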
In one example embodiment, the CRISPR-Cas system is used to introduce one or more insertions or deletions that restore POU2FA binding to an enhancer that controls expression of COBLL1. More than one guide sequence may be selected to introduce multiple insertions, deletions, or combinations thereof. Likewise, more than one Cas protein type may be used, for example, to maximize target sites adjacent to different PAMs. In one example embodiment, a guide sequence is selected that directs the CRISPR-Cas system to make one or more insertions or deletions within the enhancer region containing a variant that reduces POU2FA binding to an enhancer controlling COBLL1 expression. In one example embodiment, a guide sequence is selected that directs the CRISPR-Cas system to make an insertion 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 base pairs upstream of a variant that reduces POU2FA binding to an enhancer controlling COBLL1 expression. In one example embodiment, a guide sequence is selected that directs the CRISPR-Cas system to make an insertion 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 base pairs downstream of a variant that reduces POU2FA binding to an enhancer controlling COBLL1 expression. In one example embodiment, a guide sequence is selected that directs the CRISPR-Cas system to make a deletion 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 base pairs downstream of a variant that reduces POU2FA binding to an enhancer controlling COBLL1 expression.
In one example embodiment, the above insertions and/or deletions are made relative to the rs6712203 variant position.
In one example embodiment, a donor template is provided to replace a genomic sequence comprising one or more variants that reduce COBLL1 expression. A donor template may comprise an insertion sequence flanked by two homology regions. The insertion sequence comprises an edited sequence to be inserted in place of the target sequence (e.g., a portion of genomic DNA comprising the one or more variants). The homology regions comprise sequences that are homologous to the genomic DNA strands at the site of the CRISPR-Cas induced double-strand break (DSB). Cellular HDR mechanisms then facilitate insertion of the insertion sequence at the site of the DSB.
Accordingly, in certain example embodiments, a donor template and guide sequence are selected to direct excision and replacement of a section of genomic DNA comprising a variant that reduces POU2FA binding to an enhancer controlling COBLL1 expression with an insertion sequence that edits the one or more variants to a wild-type or non-risk variant. In one example embodiment, the insertion sequence comprises a wild-type or non-risk variant that restores or increases POU2FA binding to the enhancer. In one example embodiment, the insertion sequence encodes a portion of genomic DNA in which the rs6712203 variant is changed from a C to a T.
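For illustration only, the donor template layout described above (an edited sequence flanked by two homology regions) may be sketched in Python for a single-nucleotide change such as C to T. The function name, its parameters, and the toy sequence are hypothetical and are not drawn from the sequence listing; a symmetric arm length is assumed purely for simplicity.

```python
# Illustrative sketch only: names and the toy sequence are hypothetical.
def build_donor(genomic: str, variant_idx: int, ref_base: str,
                edited_base: str, arm_len: int) -> str:
    """Return left homology arm + edited base + right homology arm.

    `variant_idx` is the 0-based position of the variant in `genomic`;
    both arms are `arm_len` nucleotides long (an assumption of this sketch).
    """
    genomic = genomic.upper()
    if variant_idx - arm_len < 0 or variant_idx + 1 + arm_len > len(genomic):
        raise ValueError("genomic sequence too short for the requested homology arms")
    if genomic[variant_idx] != ref_base.upper():
        raise ValueError("reference base does not match the genomic sequence")
    left_arm = genomic[variant_idx - arm_len:variant_idx]
    right_arm = genomic[variant_idx + 1:variant_idx + 1 + arm_len]
    return left_arm + edited_base.upper() + right_arm


if __name__ == "__main__":
    toy_locus = "AAAACGGGG"  # toy input with a C at index 4
    print(build_donor(toy_locus, variant_idx=4, ref_base="C", edited_base="T", arm_len=3))
```

In practice the arms may be of unequal length, and the edited region may span more than one nucleotide; the sketch covers only the simplest case.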
The donor template may include a sequence which results in a change in sequence of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more nucleotides of the target sequence.
A donor template may be of any suitable length, such as about or more than about 10, 15, 20, 25, 50, 75, 100, 150, 200, 500, 1000, or more nucleotides in length. In an embodiment, the template nucleic acid may be 20+/−10, 30+/−10, 40+/−10, 50+/−10, 60+/−10, 70+/−10, 80+/−10, 90+/−10, 100+/−10, 110+/−10, 120+/−10, 130+/−10, 140+/−10, 150+/−10, 160+/−10, 170+/−10, 180+/−10, 190+/−10, 200+/−10, 210+/−10, or 220+/−10 nucleotides in length. In an embodiment, the template nucleic acid may be 30+/−20, 40+/−20, 50+/−20, 60+/−20, 70+/−20, 80+/−20, 90+/−20, 100+/−20, 110+/−20, 120+/−20, 130+/−20, 140+/−20, 150+/−20, 160+/−20, 170+/−20, 180+/−20, 190+/−20, 200+/−20, 210+/−20, or 220+/−20 nucleotides in length. In an embodiment, the template nucleic acid is 10 to 1,000, 20 to 900, 30 to 800, 40 to 700, 50 to 600, 50 to 500, 50 to 400, 50 to 300, 50 to 200, or 50 to 100 nucleotides in length.
The homology regions of the donor template may be complementary to a portion of a polynucleotide comprising the target sequence. When optimally aligned, a donor template might overlap with one or more nucleotides of a target sequences (e.g. about or more than about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 or more nucleotides). In some embodiments, when a template sequence and a polynucleotide comprising a target sequence are optimally aligned, the nearest nucleotide of the template polynucleotide is within about 1, 5, 10, 15, 20, 25, 50, 75, 100, 200, 300, 400, 500, 1000, 5000, 10000, or more nucleotides from the target sequence.
The donor template comprises a sequence to be integrated (e.g., a mutated gene). The sequence for integration may be a sequence endogenous or exogenous to the cell. Examples of a sequence to be integrated include polynucleotides encoding a protein or a non-coding RNA (e.g., a microRNA). Thus, the sequence for integration may be operably linked to an appropriate control sequence or sequences. Alternatively, the sequence to be integrated may provide a regulatory function.
Homology arms of the donor template may comprise from about 20 bp to about 2500 bp, for example, about 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, or 2500 bp. In some methods, the exemplary upstream or downstream sequences have about 200 bp to about 2000 bp, about 600 bp to about 1000 bp, or more particularly about 700 bp to about 1000 bp.
In one example embodiment, one or both homology arms may be shortened to avoid including certain sequence repeat elements. For example, a 5′ homology arm may be shortened to avoid a sequence repeat element. In other embodiments, a 3′ homology arm may be shortened to avoid a sequence repeat element. In some embodiments, both the 5′ and the 3′ homology arms may be shortened to avoid including certain sequence repeat elements.
The donor template may further comprise a marker. Such a marker may make it easy to screen for targeted integrations. Examples of suitable markers include restriction sites, fluorescent proteins, or selectable markers. The donor template of the disclosure can be constructed using recombinant techniques (see, for example, Sambrook et al., 2001 and Ausubel et al., 1996).
In one example embodiment, a donor template is a single-stranded oligonucleotide. When using a single-stranded oligonucleotide, 5′ and 3′ homology arms may range up to about 200 base pairs (bp) in length, e.g., at least 25, 50, 75, 100, 125, 150, 175, or 200 bp in length.
Suzuki et al. describe in vivo genome editing via CRISPR/Cas9 mediated homology-independent targeted integration (2016, Nature 540:144-149).
The CRISPR-Cas therapeutic methods disclosed herein may be designed for use with Class 1 CRISPR-Cas systems. In certain example embodiments, the Class 1 system may be Type I, Type III or Type IV CRISPR-Cas as described in Makarova et al. “Evolutionary classification of CRISPR-Cas systems: a burst of class 2 and derived variants” Nature Reviews Microbiology, 18:67-81 (February 2020), incorporated in its entirety herein by reference, and particularly as described in
The CRISPR-Cas therapeutic methods disclosed herein may be designed for use with Class 2 CRISPR-Cas systems. Class 2 systems are distinguished from Class 1 systems in that they have a single, large, multi-domain effector protein. In certain example embodiments, the Class 2 system can be a Type II, Type V, or Type VI system, which are described in Makarova et al. “Evolutionary classification of CRISPR-Cas systems: a burst of class 2 and derived variants” Nature Reviews Microbiology, 18:67-81 (February 2020), incorporated herein by reference. Each type of Class 2 system is further divided into subtypes. See Makarova et al. 2020, particularly at Figure 2. Class 2, Type II systems can be divided into 4 subtypes: II-A, II-B, II-C1, and II-C2. Class 2, Type V systems can be divided into 17 subtypes: V-A, V-B1, V-B2, V-C, V-D, V-E, V-F1, V-F1 (V-U3), V-F2, V-F3, V-G, V-H, V-I, V-K (V-U5), V-U1, V-U2, and V-U4. Class 2, Type VI systems can be divided into 5 subtypes: VI-A, VI-B1, VI-B2, VI-C, and VI-D.
The distinguishing feature of these types is that their effector complexes consist of a single, large, multi-domain protein. Type V systems differ from Type II effectors (e.g., Cas9), which contain two nuclease domains that are each responsible for the cleavage of one strand of the target DNA, with the HNH nuclease inserted inside a split RuvC-like nuclease domain sequence. The Type V systems (e.g., Cas12) only contain a RuvC-like nuclease domain that cleaves both strands. Some Type V systems have also been found to possess collateral activity against single-stranded DNA in in vitro contexts.
In one example embodiment, the Class 2 system is a Type II system. In one example embodiment, the Type II CRISPR-Cas system is a II-A CRISPR-Cas system. In one example embodiment, the Type II CRISPR-Cas system is a II-B CRISPR-Cas system. In one example embodiment, the Type II CRISPR-Cas system is a II-C1 CRISPR-Cas system. In one example embodiment, the Type II CRISPR-Cas system is a II-C2 CRISPR-Cas system. In some example embodiments, the Type II system is a Cas9 system. In some embodiments, the Type II system includes a Cas9.
In one example embodiment, the Class 2 system is a Type V system. In one example embodiment, the Type V CRISPR-Cas system is a V-A CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-B1 CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-B2 CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-C CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-D CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-E CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-F1 CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-F1 (V-U3) CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-F2 CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-F3 CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-G CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-H CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-I CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-K (V-U5) CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-U1 CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-U2 CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas system is a V-U4 CRISPR-Cas system. In one example embodiment, the Type V CRISPR-Cas is a Cas12a (Cpf1), Cas12b (C2c1), Cas12c (C2c3), Cas12d (CasY), Cas12e (CasX), Cas14, and/or CasΦ.
The following include general design principles that may be applied to the guide molecule. The terms guide molecule, guide sequence and guide polynucleotide refer to polynucleotides capable of guiding Cas to a target genomic locus and are used interchangeably as in foregoing cited documents such as International Patent Publication No. WO 2014/093622 (PCT/US2013/074667). In general, a guide sequence is any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of a CRISPR complex to the target sequence. The guide molecule can be a polynucleotide.
The ability of a guide sequence (within a nucleic acid-targeting guide RNA) to direct sequence-specific binding of a nucleic acid-targeting complex to a target nucleic acid sequence may be assessed by any suitable assay. For example, the components of a nucleic acid-targeting CRISPR system sufficient to form a nucleic acid-targeting complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target nucleic acid sequence, such as by transfection with vectors encoding the components of the nucleic acid-targeting complex, followed by an assessment of preferential targeting (e.g., cleavage) within the target nucleic acid sequence, such as by Surveyor assay (Qiu et al. 2004. BioTechniques. 36 (4) 702-707). Similarly, cleavage of a target nucleic acid sequence may be evaluated in a test tube by providing the target nucleic acid sequence and components of a nucleic acid-targeting complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions. Other assays are possible and will occur to those skilled in the art.
In some embodiments, the guide molecule is an RNA. The guide molecule(s) (also referred to interchangeably herein as guide polynucleotide and guide sequence) that are included in the CRISPR-Cas or Cas based system can be any polynucleotide sequence having sufficient complementarity with a target nucleic acid sequence to hybridize with the target nucleic acid sequence and direct sequence-specific binding of a nucleic acid-targeting complex to the target nucleic acid sequence. In some embodiments, the degree of complementarity, when optimally aligned using a suitable alignment algorithm, can be about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies; available at www.novocraft.com), ELAND (Illumina, San Diego, CA), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net).
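As a simplified, non-limiting sketch, a degree of complementarity between a guide sequence and a target strand of equal length may be computed as below. This example deliberately substitutes an ungapped, position-by-position comparison for the alignment algorithms named above (e.g., Smith-Waterman or Needleman-Wunsch), counts Watson-Crick pairs only, and uses a hypothetical function name; it illustrates the percentage calculation and is not a replacement for an optimal alignment.

```python
# Illustrative sketch only: ungapped comparison, Watson-Crick pairs only.
WATSON_CRICK = {("A", "T"), ("A", "U"), ("U", "A"), ("T", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(guide: str, target: str) -> float:
    """Percent of guide positions (5'->3') that pair with the target strand
    (also written 5'->3') when the two are aligned antiparallel without gaps."""
    if not guide or len(guide) != len(target):
        raise ValueError("guide and target must be non-empty and of equal length")
    target_3_to_5 = target.upper()[::-1]
    paired = sum((g, t) in WATSON_CRICK for g, t in zip(guide.upper(), target_3_to_5))
    return 100.0 * paired / len(guide)


if __name__ == "__main__":
    # Toy sequences chosen for illustration; not taken from the sequence listing.
    print(percent_complementarity("AUGC", "GCAT"))
    print(percent_complementarity("AUGC", "GCAA"))
```

The returned value can then be compared against a chosen threshold, such as one of the percentages recited above.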
A guide sequence, and hence a nucleic acid-targeting guide, may be selected to target any target nucleic acid sequence. The target sequence may be DNA. The target sequence may be any RNA sequence. In some embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of messenger RNA (mRNA), pre-mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), micro-RNA (miRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), double stranded RNA (dsRNA), non-coding RNA (ncRNA), long non-coding RNA (lncRNA), and small cytoplasmatic RNA (scRNA). In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of mRNA, pre-mRNA, and rRNA. In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of ncRNA, and lncRNA. In some more preferred embodiments, the target sequence may be a sequence within an mRNA molecule or a pre-mRNA molecule.
In some embodiments, a nucleic acid-targeting guide is selected to reduce the degree of secondary structure within the nucleic acid-targeting guide. In some embodiments, about or less than about 75%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, or fewer of the nucleotides of the nucleic acid-targeting guide participate in self-complementary base pairing when optimally folded. Optimal folding may be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimal Gibbs free energy. An example of one such algorithm is mFold, as described by Zuker and Stiegler (Nucleic Acids Res. 9 (1981), 133-148). Another example folding algorithm is the online webserver RNAfold, developed at the Institute for Theoretical Chemistry at the University of Vienna, using the centroid structure prediction algorithm (see e.g., A. R. Gruber et al., 2008, Cell 106 (1): 23-24; and P A Carr and G M Church, 2009, Nature Biotechnology 27 (12): 1151-62).
In one example embodiment, a guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat (DR) sequence and a guide sequence or spacer sequence. In another example embodiment, the guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat sequence fused or linked to a guide sequence or spacer sequence. In another example embodiment, the direct repeat sequence may be located upstream (i.e., 5′) from the guide sequence or spacer sequence. In other embodiments, the direct repeat sequence may be located downstream (i.e., 3′) from the guide sequence or spacer sequence.
In one example embodiment, the crRNA comprises a stem loop, preferably a single stem loop. In one example embodiment, the direct repeat sequence forms a stem loop, preferably a single stem loop.
In one example embodiment, the spacer length of the guide RNA is from 15 to 35 nt. In another example embodiment, the spacer length of the guide RNA is at least 15 nucleotides. In another example embodiment, the spacer length is from 15 to 17 nt, e.g., 15, 16, or 17 nt, from 17 to 20 nt, e.g., 17, 18, 19, or 20 nt, from 20 to 24 nt, e.g., 20, 21, 22, 23, or 24 nt, from 23 to 25 nt, e.g., 23, 24, or 25 nt, from 24 to 27 nt, e.g., 24, 25, 26, or 27 nt, from 27 to 30 nt, e.g., 27, 28, 29, or 30 nt, from 30 to 35 nt, e.g., 30, 31, 32, 33, 34, or 35 nt, or 35 nt or longer.
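Purely as an illustrative sketch, the arrangement of a direct repeat and a spacer into a crRNA, together with a check on a minimum spacer length of 15 nt, may be expressed in Python as follows. The function name, the argument names, and the toy sequences are hypothetical; no actual direct repeat sequence of any Cas system is represented.

```python
# Illustrative sketch only: names and sequences are hypothetical placeholders.
def assemble_crrna(direct_repeat: str, spacer: str,
                   repeat_position: str = "5prime", min_spacer_len: int = 15) -> str:
    """Concatenate a direct repeat and a spacer in the requested orientation.

    repeat_position="5prime" places the direct repeat upstream of the spacer;
    "3prime" places it downstream.
    """
    if len(spacer) < min_spacer_len:
        raise ValueError(f"spacer must be at least {min_spacer_len} nt")
    if repeat_position == "5prime":
        return direct_repeat + spacer
    if repeat_position == "3prime":
        return spacer + direct_repeat
    raise ValueError("repeat_position must be '5prime' or '3prime'")


if __name__ == "__main__":
    toy_repeat = "GGGG"        # placeholder, not a real direct repeat
    toy_spacer = "A" * 20      # placeholder 20-nt spacer
    print(assemble_crrna(toy_repeat, toy_spacer, repeat_position="5prime"))
```

Which orientation applies depends on the Cas polypeptide used, so the orientation is left as an explicit argument rather than a fixed default choice.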
The “tracrRNA” sequence or analogous terms includes any polynucleotide sequence that has sufficient complementarity with a crRNA sequence to hybridize. In some embodiments, the degree of complementarity between the tracrRNA sequence and crRNA sequence along the length of the shorter of the two when optimally aligned is about or more than about 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97.5%, 99%, or higher. In some embodiments, the tracr sequence is about or more than about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, or more nucleotides in length. In some embodiments, the tracr sequence and crRNA sequence are contained within a single transcript, such that hybridization between the two produces a transcript having a secondary structure, such as a hairpin.
In general, degree of complementarity is with reference to the optimal alignment of the sca sequence and tracr sequence, along the length of the shorter of the two sequences. Optimal alignment may be determined by any suitable alignment algorithm and may further account for secondary structures, such as self-complementarity within either the sca sequence or tracr sequence. In some embodiments, the degree of complementarity between the tracr sequence and sca sequence along the length of the shorter of the two when optimally aligned is about or more than about 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97.5%, 99%, or higher.
In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence can be about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or 100%; a guide or RNA or sgRNA can be about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length; or guide or RNA or sgRNA can be less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length; and tracr RNA can be 30 or 50 nucleotides in length. In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence is greater than 94.5% or 95% or 95.5% or 96% or 96.5% or 97% or 97.5% or 98% or 98.5% or 99% or 99.5% or 99.9%, or 100%. Off target is less than 100% or 99.9% or 99.5% or 99% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% or 94% or 93% or 92% or 91% or 90% or 89% or 88% or 87% or 86% or 85% or 84% or 83% or 82% or 81% or 80% complementarity between the sequence and the guide, with it being advantageous that off target is 100% or 99.9% or 99.5% or 99% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% complementarity between the sequence and the guide.
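By way of non-limiting illustration only, the degree of complementarity between a guide sequence and a target sequence can be sketched as a simple count of base-paired positions. The following is a minimal sketch under stated assumptions and is not the method of the disclosure: it assumes an ungapped, equal-length alignment, counts only Watson-Crick pairs (no wobble pairs), and uses the hypothetical function name percent_complementarity.

```python
# Watson-Crick partners (guide base, target base) for an RNA guide
# pairing with a DNA or RNA target. Wobble pairs are deliberately
# ignored (assumption for this sketch).
PAIRS = {("A", "T"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(guide_rna: str, target: str) -> float:
    """Percent of guide positions that base-pair with the target.

    Both sequences are given 5'->3'. The target is reversed so that the
    two strands are compared in antiparallel orientation.
    """
    guide = guide_rna.upper()
    tgt = target.upper()[::-1]
    if not guide or len(guide) != len(tgt):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum((g, t) in PAIRS for g, t in zip(guide, tgt))
    return 100.0 * paired / len(guide)


if __name__ == "__main__":
    # Fully complementary toy example, then one with a single mismatch.
    print(percent_complementarity("GAUC", "GATC"))
    print(percent_complementarity("GAUC", "GATT"))
```

A practical tool would instead use an optimal alignment algorithm, as noted above, and could account for secondary structure.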
In some embodiments according to the invention, the guide RNA (capable of guiding Cas to a target locus) may comprise (1) a guide sequence capable of hybridizing to a genomic target locus in the eukaryotic cell; (2) a tracr sequence; and (3) a tracr mate sequence. All of (1) to (3) may reside in a single RNA, i.e., an sgRNA (arranged in a 5′ to 3′ orientation), or the tracr RNA may be a different RNA than the RNA containing the guide and tracr sequence. The tracr hybridizes to the tracr mate sequence and directs the CRISPR/Cas complex to the target sequence. Where the tracr RNA is on a different RNA than the RNA containing the guide and tracr sequence, the length of each RNA may be optimized to be shortened from their respective native lengths, and each may be independently chemically modified to protect from degradation by cellular RNase or otherwise increase stability.
Many modifications to guide sequences are known in the art and are further contemplated within the context of this invention. Various modifications may be used to increase the specificity of binding to the target sequence and/or increase the activity of the Cas protein and/or reduce off-target effects. Example guide sequence modifications are described in International Patent Application No. PCT/US2019/045582, specifically paragraphs [0178]-[0333], which is incorporated herein by reference.
In the context of formation of a CRISPR complex, “target sequence” refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a CRISPR complex. In other words, the target polynucleotide can be a polynucleotide or a part of a polynucleotide to which a part of the guide sequence is designed to have complementarity with and to which the effector function mediated by the complex comprising the CRISPR effector protein and a guide molecule is to be directed. In some embodiments, a target sequence is located in the nucleus or cytoplasm of a cell.
PAM elements are sequences that can be recognized and bound by Cas proteins. Cas proteins/effector complexes can then unwind the dsDNA at a position adjacent to the PAM element. It will be appreciated that Cas proteins and systems that target RNA do not require PAM sequences (Marraffini et al. 2010. Nature. 463:568-571). Instead, many rely on PFSs, which are discussed elsewhere herein. In one example embodiment, the target sequence should be associated with a PAM (protospacer adjacent motif) or PFS (protospacer flanking sequence or site), that is, a short sequence recognized by the CRISPR complex. Depending on the nature of the CRISPR-Cas protein, the target sequence should be selected such that its complementary sequence in the DNA duplex (also referred to herein as the non-target sequence) is upstream or downstream of the PAM. In the embodiments, the complementary sequence of the target sequence is downstream or 3′ of the PAM or upstream or 5′ of the PAM. The precise sequence and length requirements for the PAM differ depending on the Cas protein used, but PAMs are typically 2-5 base pair sequences adjacent to the protospacer (that is, the target sequence). Examples of the natural PAM sequences for different Cas proteins are provided herein below and the skilled person will be able to identify further PAM sequences for use with a given Cas protein.
The ability to recognize different PAM sequences depends on the Cas polypeptide(s) included in the system. See e.g., Gleditzsch et al. 2019. RNA Biology. 16 (4): 504-517. Table C (from Gleditzsch et al. 2019) below shows several Cas polypeptides and the PAM sequence they recognize.
In a preferred embodiment, the CRISPR effector protein may recognize a 3′ PAM. In one example embodiment, the CRISPR effector protein may recognize a 3′ PAM which is 5′H, wherein H is A, C or U.
Further, engineering of the PAM Interacting (PI) domain on the Cas protein may allow programming of PAM specificity, improve target site recognition fidelity, and increase the versatility of the CRISPR-Cas protein, for example as described for Cas9 in Kleinstiver B P et al. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature. 2015 Jul. 23; 523 (7561): 481-5. doi: 10.1038/nature14592. As further detailed herein, the skilled person will understand that Cas13 proteins may be modified analogously. See also Gao et al., “Engineered Cpf1 Enzymes with Altered PAM Specificities,” bioRxiv 091611; doi: http://dx.doi.org/10.1101/091611 (Dec. 4, 2016). Doench et al. created a pool of sgRNAs, tiling across all possible target sites of a panel of six endogenous mouse and three endogenous human genes and quantitatively assessed their ability to produce null alleles of their target gene by antibody staining and flow cytometry. The authors showed that optimization of the PAM improved activity and also provided an on-line tool for designing sgRNAs.
PAM sequences can be identified in a polynucleotide using appropriate design tools, which are available commercially as well as online. Freely available tools include, but are not limited to, CRISPRFinder and CRISPRTarget. See Mojica et al. 2009. Microbiol. 155 (Pt. 3): 733-740; Altschul et al. 1990. J. Mol. Biol. 215:403-410; Biswas et al. 2013. RNA Biol. 10:817-827; and Grissa et al. 2007. Nucleic Acid Res. 35: W52-57. Experimental approaches to PAM identification can include, but are not limited to, plasmid depletion assays (Jiang et al. 2013. Nat. Biotechnol. 31:233-239; Esvelt et al. 2013. Nat. Methods. 10:1116-1121; Kleinstiver et al. 2015. Nature. 523:481-485), screening with a high-throughput in vivo method called PAM-SCANR (Pattanayak et al. 2013. Nat. Biotechnol. 31:839-843 and Leenay et al. 2016. Mol. Cell. 16:253), and negative screening (Zetsche et al. 2015. Cell. 163:759-771).
As previously mentioned, CRISPR-Cas systems that target RNA do not typically rely on PAM sequences. Instead, such systems typically recognize protospacer flanking sites (PFSs). Thus, Type VI CRISPR-Cas systems, which employ a Cas13, typically recognize PFSs instead of PAMs. PFSs represent an analogue to PAMs for RNA targets. Some Cas13 proteins analyzed to date, such as Cas13a (C2c2) identified from Leptotrichia shahii (LshCas13a), have a specific discrimination against G at the 3′ end of the target RNA. The presence of a C at the corresponding crRNA repeat site can indicate that nucleotide pairing at this position is rejected. However, some Cas13 proteins (e.g., LwaCas13a and PspCas13b) do not seem to have a PFS preference. See e.g., Gleditzsch et al. 2019. RNA Biology. 16 (4): 504-517.
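By way of non-limiting illustration only, computational identification of PAM-adjacent target sequences can be sketched as a scan of a DNA sequence for a motif written in IUPAC nucleotide codes. The sketch below is an assumption-laden example and does not reproduce any of the tools cited above: it scans the given strand only, assumes a PAM located 3′ of the protospacer, defaults to the NGG motif commonly associated with Streptococcus pyogenes Cas9 and a 20-nt protospacer, and uses the hypothetical function name find_protospacers.

```python
# IUPAC nucleotide codes mapped to the DNA bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_protospacers(dna: str, pam: str = "NGG", spacer_len: int = 20):
    """Return (start, protospacer, pam_site) for each 3' PAM match.

    Only the strand supplied is scanned; a fuller tool would also scan
    the reverse complement (omitted here for brevity).
    """
    dna = dna.upper()
    pam = pam.upper()
    hits = []
    last_start = len(dna) - spacer_len - len(pam)
    for start in range(last_start + 1):
        pam_start = start + spacer_len
        site = dna[pam_start:pam_start + len(pam)]
        # Every base of the candidate site must be allowed by the PAM code.
        if all(base in IUPAC[code] for base, code in zip(site, pam)):
            hits.append((start, dna[start:pam_start], site))
    return hits


if __name__ == "__main__":
    toy = "ACGTACGTACGTACGTACGTTGGA"  # hypothetical sequence
    for hit in find_protospacers(toy):
        print(hit)
```

For a Cas protein that recognizes a 5′ PAM, the same scan could be adapted by testing the motif immediately upstream of each candidate protospacer.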
Some Type VI proteins, such as subtype B, have 5′-recognition of D (G, T, A) and a 3′-motif requirement of NAN or NNA. One example is the Cas13b protein identified in Bergeyella zoohelcum (BzCas13b). See e.g., Gleditzsch et al. 2019. RNA Biology. 16 (4): 504-517.
Overall, Type VI CRISPR-Cas systems appear to have less restrictive rules for substrate (e.g., target sequence) recognition than those that target DNA (e.g., Type V and Type II).
In some embodiments, one or more components (e.g., the Cas protein) in the composition for engineering cells may comprise one or more sequences related to nucleus targeting and transportation. Such sequences may facilitate the one or more components in the composition for targeting a sequence within a cell. In order to improve targeting of the CRISPR-Cas protein used in the methods of the present disclosure to the nucleus, it may be advantageous to provide one or both of these components with one or more nuclear localization sequences (NLSs).
In one example embodiment, the NLSs used in the context of the present disclosure are heterologous to the proteins. Non-limiting examples of NLSs include an NLS sequence derived from: the NLS of the SV40 virus large T-antigen, having the amino acid sequence PKKKRKV (SEQ ID NO: 23) or PKKKRKVEAS (SEQ ID NO: 24); the NLS from nucleoplasmin (e.g., the nucleoplasmin bipartite NLS with the sequence KRPAATKKAGQAKKKK (SEQ ID NO: 25)); the c-myc NLS having the amino acid sequence PAAKRVKLD (SEQ ID NO: 26) or RQRRNELKRSP (SEQ ID NO: 27); the hRNPA1 M9 NLS having the sequence NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY (SEQ ID NO: 28); the sequence RMRIZFKNKGKDTAELRRRRVEVSVELRKAKKDEQILKRRNV (SEQ ID NO: 29) of the IBB domain from importin-alpha; the sequences VSRKRPRP (SEQ ID NO: 30) and PPKKARED (SEQ ID NO: 31) of the myoma T protein; the sequence PQPKKKPL (SEQ ID NO: 32) of human p53; the sequence SALIKKKKKMAP (SEQ ID NO: 33) of mouse c-abl IV; the sequences DRLRR (SEQ ID NO: 34) and PKQKKRK (SEQ ID NO: 35) of the influenza virus NS1; the sequence RKLKKKIKKL (SEQ ID NO: 36) of the Hepatitis virus delta antigen; the sequence REKKKFLKRR (SEQ ID NO: 37) of the mouse Mx1 protein; the sequence KRKGDEVDGVDEVAKKKSKK (SEQ ID NO: 38) of the human poly(ADP-ribose) polymerase; and the sequence RKCLQAGMNLEARKTKK (SEQ ID NO: 39) of the steroid hormone receptors (human) glucocorticoid. In general, the one or more NLSs are of sufficient strength to drive accumulation of the DNA-targeting Cas protein in a detectable amount in the nucleus of a eukaryotic cell. In general, strength of nuclear localization activity may derive from the number of NLSs in the CRISPR-Cas protein, the particular NLS(s) used, or a combination of these factors. Detection of accumulation in the nucleus may be performed by any suitable technique.
For example, a detectable marker may be fused to the nucleic acid-targeting protein, such that location within a cell may be visualized, such as in combination with a means for detecting the location of the nucleus (e.g., a stain specific for the nucleus such as DAPI). Cell nuclei may also be isolated from cells, the contents of which may then be analyzed by any suitable process for detecting protein, such as immunohistochemistry, Western blot, or enzyme activity assay. Accumulation in the nucleus may also be determined indirectly, such as by an assay for the effect of nucleic acid-targeting complex formation (e.g., assay for deaminase activity) at the target sequence, or assay for altered gene expression activity affected by DNA-targeting complex formation and/or DNA targeting, as compared to a control not exposed to the Cas protein, or exposed to a Cas protein lacking the one or more NLSs.
The Cas proteins may be provided with 1 or more, such as 2, 3, 4, 5, 6, 7, 8, 9, 10, or more, heterologous NLSs. In some embodiments, the proteins comprise about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the amino-terminus, about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the carboxy-terminus, or a combination of these (e.g., zero or at least one or more NLS at the amino-terminus and zero or at least one or more NLS at the carboxy-terminus). When more than one NLS is present, each may be selected independently of the others, such that a single NLS may be present in more than one copy and/or in combination with one or more other NLSs present in one or more copies. In some embodiments, an NLS is considered near the N- or C-terminus when the nearest amino acid of the NLS is within about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, or more amino acids along the polypeptide chain from the N- or C-terminus. In preferred embodiments of the Cas proteins, an NLS is attached to the C-terminus of the protein.
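By way of non-limiting illustration only, the criterion that an NLS is "near" a terminus when its nearest amino acid lies within a given number of residues of that terminus can be sketched as follows. The function name nls_near_terminus, the default threshold of 50 residues, and the toy protein sequence are assumptions made for this example; the SV40 NLS sequence PKKKRKV is taken from the list above.

```python
def nls_near_terminus(protein: str, nls: str, max_distance: int = 50) -> dict:
    """Report whether any copy of `nls` lies near the N- or C-terminus.

    Distance is the number of residues between the terminus and the
    nearest residue of the NLS (0 means the NLS sits at the terminus).
    """
    if not nls:
        raise ValueError("nls must be a non-empty sequence")
    near_n = near_c = False
    pos = protein.find(nls)
    while pos != -1:
        dist_n = pos                              # residues before the NLS
        dist_c = len(protein) - (pos + len(nls))  # residues after the NLS
        near_n = near_n or dist_n <= max_distance
        near_c = near_c or dist_c <= max_distance
        pos = protein.find(nls, pos + 1)          # look for further copies
    return {"N": near_n, "C": near_c}


if __name__ == "__main__":
    sv40_nls = "PKKKRKV"
    # Hypothetical protein: 101 filler residues followed by a C-terminal NLS.
    toy_protein = "M" + "A" * 100 + sv40_nls
    print(nls_near_terminus(toy_protein, sv40_nls))
```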
Other preferred tools for genome editing for use in the context of this invention include zinc finger systems. One type of programmable DNA-binding domain is provided by artificial zinc-finger (ZF) technology, which involves arrays of ZF modules to target new DNA-binding sites in the genome. Each finger module in a ZF array targets three DNA bases. A customized array of individual zinc finger domains is assembled into a ZF protein (ZFP).
Zinc Finger proteins can comprise a functional domain (e.g., activator domain). The first synthetic zinc finger nucleases (ZFNs) were developed by fusing a ZF protein to the catalytic domain of the Type IIS restriction enzyme FokI. (Kim, Y. G. et al., 1994, Chimeric restriction endonuclease, Proc. Natl. Acad. Sci. U.S.A. 91, 883-887; Kim, Y. G. et al., 1996, Hybrid restriction enzymes: zinc finger fusions to Fok I cleavage domain. Proc. Natl. Acad. Sci. U.S.A. 93, 1156-1160). Increased cleavage specificity can be attained with decreased off target activity by use of paired ZFN heterodimers, each targeting different nucleotide sequences separated by a short spacer. (Doyon, Y. et al., 2011, Enhancing zinc-finger-nuclease activity with improved obligate heterodimeric architectures. Nat. Methods 8, 74-79). ZFPs can also be designed as transcription activators and repressors and have been used to target many genes in a wide variety of organisms. Exemplary methods of genome editing using ZFNs can be found for example in U.S. Pat. Nos. 6,534,261, 6,607,882, 6,746,838, 6,794,136, 6,824,978, 6,866,997, 6,933,113, 6,979,539, 7,013,219, 7,030,215, 7,220,719, 7,241,573, 7,241,574, 7,585,849, 7,595,376, 6,903,185, and 6,479,626, all of which are specifically incorporated by reference.
As disclosed herein, editing can be made by way of the transcription activator-like effector nucleases (TALENs) system. Transcription activator-like effectors (TALEs) can be engineered to bind practically any desired DNA sequence. Exemplary methods of genome editing using the TALEN system can be found for example in Cermak T, Doyle E L, Christian M, Wang L, Zhang Y, Schmidt C, et al. Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting. Nucleic Acids Res. 2011; 39: e82; Zhang F, Cong L, Lodato S, Kosuri S, Church G M, Arlotta P. Efficient construction of sequence-specific TAL effectors for modulating mammalian transcription. Nat Biotechnol. 2011; 29:149-153; and U.S. Pat. Nos. 8,450,471, 8,440,431 and 8,440,432, all of which are specifically incorporated by reference.
In some embodiments, a TALE nuclease or TALE nuclease system can be used to modify a polynucleotide. In some embodiments, the methods provided herein use isolated, non-naturally occurring, recombinant or engineered DNA binding proteins that comprise TALE monomers or half monomers as a part of their organizational structure that enable the targeting of nucleic acid sequences with improved efficiency and expanded specificity.
Naturally occurring TALEs or “wild type TALEs” are nucleic acid binding proteins secreted by numerous species of proteobacteria. TALE polypeptides contain a nucleic acid binding domain composed of tandem repeats of highly conserved monomer polypeptides that are predominantly 33, 34 or 35 amino acids in length and that differ from each other mainly in amino acid positions 12 and 13. In advantageous embodiments the nucleic acid is DNA. As used herein, the term “polypeptide monomers”, “TALE monomers” or “monomers” will be used to refer to the highly conserved repetitive polypeptide sequences within the TALE nucleic acid binding domain and the term “repeat variable di-residues” or “RVD” will be used to refer to the highly variable amino acids at positions 12 and 13 of the polypeptide monomers. As provided throughout the disclosure, the amino acid residues of the RVD are depicted using the IUPAC single letter code for amino acids. A general representation of a TALE monomer which is comprised within the DNA binding domain is X1-11-(X12X13)-X14-33 or 34 or 35, where the subscript indicates the amino acid position and X represents any amino acid. X12X13 indicate the RVDs. In some polypeptide monomers, the variable amino acid at position 13 is missing or absent and in such monomers, the RVD consists of a single amino acid. In such cases the RVD may be alternatively represented as X*, where X represents X12 and (*) indicates that X13 is absent. The DNA binding domain comprises several repeats of TALE monomers and this may be represented as (X1-11-(X12X13)-X14-33 or 34 or 35)z, where in an advantageous embodiment, z is at least 5 to 40. In a further advantageous embodiment, z is at least 10 to 26.
The TALE monomers can have a nucleotide binding affinity that is determined by the identity of the amino acids in its RVD. For example, polypeptide monomers with an RVD of NI can preferentially bind to adenine (A), monomers with an RVD of NG can preferentially bind to thymine (T), monomers with an RVD of HD can preferentially bind to cytosine (C) and monomers with an RVD of NN can preferentially bind to both adenine (A) and guanine (G). In some embodiments, monomers with an RVD of IG can preferentially bind to T. Thus, the number and order of the polypeptide monomer repeats in the nucleic acid binding domain of a TALE determines its nucleic acid target specificity. In some embodiments, monomers with an RVD of NS can recognize all four base pairs and can bind to A, T, G or C. The structure and function of TALEs is further described in, for example, Moscou et al., Science 326:1501 (2009); Boch et al., Science 326:1509-1512 (2009); and Zhang et al., Nature Biotechnology 29:149-153 (2011), each of which is incorporated herein by reference in its entirety.
The polypeptides used in methods of the invention can be isolated, non-naturally occurring, recombinant or engineered nucleic acid-binding proteins that have nucleic acid or DNA binding regions containing polypeptide monomer repeats that are designed to target specific nucleic acid sequences.
As described herein, polypeptide monomers having an RVD of HN or NH preferentially bind to guanine and thereby allow the generation of TALE polypeptides with high binding specificity for guanine containing target nucleic acid sequences. In some embodiments, polypeptide monomers having RVDs RN, NN, NK, SN, NH, KN, HN, NQ, HH, RG, KH, RH and SS can preferentially bind to guanine. In some embodiments, polypeptide monomers having RVDs RN, NK, NQ, HH, KH, RH, SS and SN can preferentially bind to guanine and can thus allow the generation of TALE polypeptides with high binding specificity for guanine containing target nucleic acid sequences. In some embodiments, polypeptide monomers having RVDs HH, KH, NH, NK, NQ, RH, RN and SS can preferentially bind to guanine and thereby allow the generation of TALE polypeptides with high binding specificity for guanine containing target nucleic acid sequences. In some embodiments, the RVDs that have high binding specificity for guanine are RN, NH, RH and KH. Furthermore, polypeptide monomers having an RVD of NV can preferentially bind to adenine and guanine. In some embodiments, monomers having RVDs of H*, HA, KA, N*, NA, NC, NS, RA, and S* bind to adenine, guanine, cytosine and thymine with comparable affinity.
The predetermined N-terminal to C-terminal order of the one or more polypeptide monomers of the nucleic acid or DNA binding domain determines the corresponding predetermined target nucleic acid sequence to which the polypeptides of the invention will bind. As used herein the monomers and at least one or more half monomers are “specifically ordered to target” the genomic locus or gene of interest. In plant genomes, the natural TALE-binding sites always begin with a thymine (T), which may be specified by a cryptic signal within the non-repetitive N-terminus of the TALE polypeptide; in some cases, this region may be referred to as repeat 0. In animal genomes, TALE binding sites do not necessarily have to begin with a thymine (T) and polypeptides of the invention may target DNA sequences that begin with T, A, G or C. The tandem repeat of TALE monomers always ends with a half-length repeat or a stretch of sequence that may share identity with only the first 20 amino acids of a repetitive full-length TALE monomer and this half repeat may be referred to as a half-monomer. Therefore, it follows that the length of the nucleic acid or DNA being targeted is equal to the number of full monomers plus two.
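By way of non-limiting illustration only, the relationship between a target DNA sequence and an ordered series of RVDs can be sketched as a lookup from each base to one RVD described above (NI for A, NG for T, HD for C, and NH for G). The sketch assumes a target beginning with T that is specified by the N-terminus ("repeat 0"), a final base read by the half-monomer, and a single RVD choice per base; the function name design_rvds and the choice of NH rather than another guanine-preferring RVD are assumptions made for this example.

```python
# One RVD per base, chosen from those described in the text
# (NH is one of several guanine-preferring options; an assumption here).
RVD_FOR_BASE = {"A": "NI", "T": "NG", "C": "HD", "G": "NH"}


def design_rvds(target: str, require_5prime_t: bool = True):
    """Return (full_monomer_rvds, half_monomer_rvd) for a DNA target.

    Position 0 is assumed to be specified by the N-terminus and receives
    no monomer; the last base is assigned to the half-monomer, so the
    target length equals the number of full monomers plus two.
    """
    target = target.upper()
    if len(target) < 3 or set(target) - set(RVD_FOR_BASE):
        raise ValueError("target must be at least 3 bases of A, C, G, T")
    if require_5prime_t and target[0] != "T":
        raise ValueError("target is expected to begin with T")
    rvds = [RVD_FOR_BASE[base] for base in target[1:]]
    return rvds[:-1], rvds[-1]


if __name__ == "__main__":
    full, half = design_rvds("TACGT")  # hypothetical 5-base target
    print(full, half)
    print(len(full) + 2)  # equals the target length
```

Setting require_5prime_t to False reflects the statement above that binding sites in animal genomes need not begin with T.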
As described in Zhang et al., Nature Biotechnology 29:149-153 (2011), TALE polypeptide binding efficiency may be increased by including amino acid sequences from the “capping regions” that are directly N-terminal or C-terminal of the DNA binding region of naturally occurring TALEs into the engineered TALEs at positions N-terminal or C-terminal of the engineered TALE DNA binding region. Thus, in one example embodiment, the TALE polypeptides described herein further comprise an N-terminal capping region and/or a C-terminal capping region.
An exemplary amino acid sequence of a N-terminal capping region is:
An exemplary amino acid sequence of a C-terminal capping region is:
As used herein the predetermined “N-terminus” to “C terminus” orientation of the N-terminal capping region, the DNA binding domain comprising the repeat TALE monomers and the C-terminal capping region provide structural basis for the organization of different domains in the d-TALEs or polypeptides of the invention.
The entire N-terminal and/or C-terminal capping regions are not necessary to enhance the binding activity of the DNA binding region. Therefore, in one example embodiment, fragments of the N-terminal and/or C-terminal capping regions are included in the TALE polypeptides described herein.
In one example embodiment, the TALE polypeptides described herein contain an N-terminal capping region fragment that includes at least 10, 20, 30, 40, 50, 54, 60, 70, 80, 87, 90, 94, 100, 102, 110, 117, 120, 130, 140, 147, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260 or 270 amino acids of an N-terminal capping region. In another example embodiment, the N-terminal capping region fragment amino acids are of the C-terminus (the DNA-binding region proximal end) of an N-terminal capping region. As described in Zhang et al., Nature Biotechnology 29:149-153 (2011), N-terminal capping region fragments that include the C-terminal 240 amino acids enhance binding activity equal to the full length capping region, while fragments that include the C-terminal 147 amino acids retain greater than 80% of the efficacy of the full length capping region, and fragments that include the C-terminal 117 amino acids retain greater than 50% of the activity of the full-length capping region.
In some embodiments, the TALE polypeptides described herein contain a C-terminal capping region fragment that includes at least 6, 10, 20, 30, 37, 40, 50, 60, 68, 70, 80, 90, 100, 110, 120, 127, 130, 140, 150, 155, 160, 170, or 180 amino acids of a C-terminal capping region. In one example embodiment, the C-terminal capping region fragment amino acids are of the N-terminus (the DNA-binding region proximal end) of a C-terminal capping region. As described in Zhang et al., Nature Biotechnology 29:149-153 (2011), C-terminal capping region fragments that include the C-terminal 68 amino acids enhance binding activity equal to the full-length capping region, while fragments that include the C-terminal 20 amino acids retain greater than 50% of the efficacy of the full-length capping region.
In one example embodiment, the capping regions of the TALE polypeptides described herein do not need to have identical sequences to the capping region sequences provided herein. Thus, in some embodiments, the capping region of the TALE polypeptides described herein have sequences that are at least 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical or share identity to the capping region amino acid sequences provided herein. Sequence identity is related to sequence homology. Homology comparisons may be conducted by eye, or more usually, with the aid of readily available sequence comparison programs. These commercially available computer programs may calculate percent (%) homology between two or more sequences and may also calculate the sequence identity shared by two or more amino acid or nucleic acid sequences. In some preferred embodiments, the capping region of the TALE polypeptides described herein have sequences that are at least 95% identical or share identity to the capping region amino acid sequences provided herein.
Sequence homologies can be generated by any of a number of computer programs known in the art, which include but are not limited to BLAST or FASTA. Suitable computer programs for carrying out alignments like the GCG Wisconsin Bestfit package may also be used. Once the software has produced an optimal alignment, it is possible to calculate % homology, preferably % sequence identity. The software typically does this as part of the sequence comparison and generates a numerical result.
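By way of non-limiting illustration only, the calculation of percent sequence identity from an alignment can be sketched as follows. The sketch assumes that the two sequences have already been aligned (for example, by one of the programs mentioned above) and are supplied with "-" as the gap character, and that identity is expressed relative to the number of aligned columns; other conventions for the denominator exist. The function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-computed alignment.

    Columns in which both sequences carry a gap are ignored; a gap
    opposite a residue counts as a non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no comparable columns")
    identical = sum(a == b and a != gap for a, b in columns)
    return 100.0 * identical / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned fragments: 4 identical columns out of 6.
    print(round(percent_identity("MKV-LA", "MKVTLG"), 1))
```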
In some embodiments described herein, the TALE polypeptides of the invention include a nucleic acid binding domain linked to the one or more effector domains. The terms “effector domain” or “regulatory and functional domain” refer to a polypeptide sequence that has an activity other than binding to the nucleic acid sequence recognized by the nucleic acid binding domain. By combining a nucleic acid binding domain with one or more effector domains, the polypeptides of the invention may be used to target the one or more functions or activities mediated by the effector domain to a particular target DNA sequence to which the nucleic acid binding domain specifically binds.
In some embodiments of the TALE polypeptides described herein, the activity mediated by the effector domain is a biological activity. For example, in some embodiments the effector domain is a transcriptional inhibitor (i.e., a repressor domain), such as an mSin interaction domain (SID), SID4X domain, or a Krüppel-associated box (KRAB) or fragments of the KRAB domain. In some embodiments, the effector domain is an enhancer of transcription (i.e., an activation domain), such as the VP16, VP64 or p65 activation domain. In some embodiments, the nucleic acid binding domain is linked, for example, with an effector domain that includes but is not limited to a transposase, integrase, recombinase, resolvase, invertase, protease, DNA methyltransferase, DNA demethylase, histone acetylase, histone deacetylase, nuclease, transcriptional repressor, transcriptional activator, transcription factor recruiting, protein nuclear-localization signal or cellular uptake signal.
In some embodiments, the effector domain is a protein domain which exhibits activities which include but are not limited to transposase activity, integrase activity, recombinase activity, resolvase activity, invertase activity, protease activity, DNA methyltransferase activity, DNA demethylase activity, histone acetylase activity, histone deacetylase activity, nuclease activity, nuclear-localization signaling activity, transcriptional repressor activity, transcriptional activator activity, transcription factor recruiting activity, or cellular uptake signaling activity. Other preferred embodiments of the invention may include any combination of the activities described herein.
Other preferred tools for genome editing for use in the context of this invention include zinc finger systems and TALE systems. One type of programmable DNA-binding domain is provided by artificial zinc-finger (ZF) technology, which involves arrays of ZF modules to target new DNA-binding sites in the genome. Each finger module in a ZF array targets three DNA bases. A customized array of individual zinc finger domains is assembled into a ZF protein (ZFP).
In some embodiments, a meganuclease or system thereof can be used to modify a polynucleotide. Meganucleases are endodeoxyribonucleases characterized by a large recognition site (double-stranded DNA sequences of 12 to 40 base pairs). Exemplary methods for using meganucleases can be found in U.S. Pat. Nos. 8,163,514, 8,133,697, 8,021,867, 8,119,361, 8,119,381, 8,124,369, and 8,129,134, which are specifically incorporated herein by reference.
In one example embodiment, a programmable nuclease system is used to recruit an activator protein to the COBLL1 gene in order to enhance expression. In one example embodiment, the activator protein is recruited to the enhancer region of the COBLL1 gene. In another example embodiment, the nuclease system is programmed to bind a sequence variant responsible for decreased COBLL1 expression. In another example embodiment, the nuclease system is recruited to a POU2F2 binding site comprising a mutation that decreases or eliminates binding by POU2F2. In a preferred embodiment, the mutation is rs6712203. In another embodiment, the mutation is rs6712203 and the nuclease system is recruited within 20 base pairs surrounding it. In another example embodiment, the nuclease system is recruited to an enhancer possessing the variant. For example, if a subject comprises a variant that prevents binding of a transcription factor to an enhancer controlling expression of COBLL1, a catalytically inactive Cas protein (“dCas”) fused to an activator can be used to recruit that activator protein to the mutated sequence. Accordingly, a guide sequence is designed to direct binding of the dCas-activator fusion such that the activator can interact with the target genomic region and induce COBLL1 expression. In one example embodiment, the guide is designed to bind within 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 or up to 500 base pairs of the variant nucleotide. In one example embodiment, a CRISPR guide sequence includes the specific variant nucleotide. In one example embodiment, POU2F2 or the activation domain thereof is recruited to the COBLL1 enhancer. The Cas protein used may be any of the Cas proteins disclosed above. In one example embodiment, the Cas protein is a dCas9.
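By way of non-limiting illustration only, the selection of candidate guide binding sites located within a chosen number of base pairs of a variant nucleotide can be sketched as a sliding window over a reference sequence. The sketch is an assumption-laden example: it ignores PAM requirements and strand, uses a hypothetical 0-based variant index rather than any actual genomic coordinate of rs6712203, and the function name guides_near_variant, the 20-nt guide length, and the default distance of 20 base pairs are chosen for this example only.

```python
def guides_near_variant(seq: str, variant_idx: int,
                        guide_len: int = 20, max_dist: int = 20):
    """List guide-length windows lying within `max_dist` bp of a variant.

    Distance is 0 when the window covers the variant; otherwise it is
    the gap between the variant and the nearest end of the window.
    """
    if not 0 <= variant_idx < len(seq):
        raise ValueError("variant_idx must fall inside the sequence")
    candidates = []
    for start in range(len(seq) - guide_len + 1):
        end = start + guide_len - 1  # inclusive end of the window
        if start <= variant_idx <= end:
            dist = 0
        elif variant_idx < start:
            dist = start - variant_idx
        else:
            dist = variant_idx - end
        if dist <= max_dist:
            candidates.append({
                "start": start,
                "guide": seq[start:end + 1],
                "distance": dist,
                "covers_variant": dist == 0,
            })
    return candidates


if __name__ == "__main__":
    toy_seq = "ACGT" * 25  # hypothetical 100-bp region
    found = guides_near_variant(toy_seq, variant_idx=50, max_dist=10)
    print(len(found), sum(c["covers_variant"] for c in found))
```

In practice such candidates would be filtered further, for example by the PAM requirement of the chosen Cas protein and by predicted off-target activity.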
In one embodiment, the programmable nuclease system is a CRISPRa system (see, e.g., US20180057810A1; and Konermann et al. “Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex” Nature. 2014 Dec. 10. doi: 10.1038/nature14136). Numerous genetic variants associated with disease phenotypes are found in non-coding regions of the genome, and frequently coincide with transcription factor (TF) binding sites and non-coding RNA genes. In one embodiment, a CRISPR system may be used to activate gene transcription. A nuclease-dead RNA-guided DNA binding domain, dCas9, tethered to transcriptional activator domains that promote gene activation (e.g., p65) may be used for “CRISPRa” that activates transcription. In one example embodiment, for use of dCas9 as an activator (CRISPRa), a guide RNA is engineered to carry RNA binding motifs (e.g., MS2) that recruit effector domains fused to RNA-motif binding proteins, increasing transcription. A key dendritic cell molecule, p65, may be used as a signal amplifier, but is not required.
In certain embodiments, one or more activator domains are recruited. In one example embodiment, the activation domain is linked to the CRISPR enzyme. In another example embodiment, the guide sequence includes aptamer sequences that bind to adaptor proteins fused to an activation domain. In general, the positioning of the one or more activator domains on the inactivated CRISPR enzyme or CRISPR complex is one which allows for correct spatial orientation for the activator domain to affect the target with the attributed functional effect. For example, the transcription activator is placed in a spatial orientation which allows it to affect the transcription of the target. This may include positions other than the N-/C-terminus of the CRISPR enzyme.
In another example embodiment, a zinc finger system is used to recruit an activation domain to the COBLL1 gene. In one example embodiment, the activation domain is linked to the zinc finger system. In general, the positioning of the one or more activator domains on the zinc finger system is one which allows for correct spatial orientation for the activator domain to affect the target with the attributed functional effect.
In another example embodiment, a TALE system is used to recruit an activation domain to the COBLL1 gene. In one example embodiment, the activation domain is linked to the TALE system. In general, the positioning of the one or more activator domains on the TALE system is one which allows for correct spatial orientation for the activator domain to affect the target with the attributed functional effect. For example, the transcription activator is placed in a spatial orientation which allows it to affect the transcription of the target.
In another example embodiment, a meganuclease system is used to recruit an activation domain to the COBLL1 gene. In one example embodiment, the activation domain is linked to the meganuclease system. In general, the positioning of the one or more activator domains on the inactivated meganuclease system is one which allows for correct spatial orientation for the activator domain to affect the target with the attributed functional effect. For example, the transcription activator is placed in a spatial orientation which allows it to affect the transcription of the target.
In one example embodiment, a method of treating subjects suffering from, or at risk of developing, T2D comprises administering a base editing system that corrects one or more variants associated with decreased expression or activity of COBLL1 in adipocytes and/or adipocyte progenitors. A base editing system may comprise a Cas polypeptide linked to a nucleobase deaminase and a guide molecule capable of forming a complex with the Cas polypeptide and directing sequence-specific binding of the base editing system at a target sequence. In one example embodiment, the Cas polypeptide is catalytically inactive. In another example embodiment, the Cas polypeptide is a nickase. The Cas polypeptide may be any of the Cas polypeptides disclosed above. In one example embodiment, the Cas polypeptide is a Type II Cas polypeptide. In one example embodiment, the Cas polypeptide is a Cas9 polypeptide. In another example embodiment, the Cas polypeptide is a Type V Cas polypeptide. In one example embodiment, the Cas polypeptide is a Cas12a or Cas12b polypeptide. The base editing system may be a cytosine base editor (CBE) or an adenine base editor (ABE). CBEs convert a C⋅G base pair into a T⋅A base pair (Komor et al. 2016. Nature. 533:420-424; Nishida et al. 2016. Science. 353; and Li et al. Nat. Biotech. 36:324-327) and ABEs convert an A⋅T base pair to a G⋅C base pair. Collectively, CBEs and ABEs can mediate all four possible transition mutations (C to T, A to G, T to C, and G to A). Example base editing systems are disclosed in Rees and Liu. 2018. Nat. Rev. Genet. 19 (12): 770-788, particularly at
The editing window of a base editing system may span 5-8 nucleotides, depending on the base editing system used. Id. Accordingly, given the base editing system used, a guide sequence may be selected to direct the base editing system to convert a base or base pair of one or more variants resulting in reduced POU2F2 binding to an enhancer controlling COBLL1 expression to a wild-type or non-risk variant. In one example embodiment, the variant is rs6712203. Accordingly, in one example embodiment, the base editing system comprises a CBE capable of editing the C of rs6712203 to a T. In one embodiment, the variant is rs12454712. Accordingly, in one example embodiment, the base editing system comprises an ABE capable of editing the T of rs12454712 to a C by deaminating the complementary A on the opposite strand.
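By way of non-limiting illustration only, the choice of editor class for a desired transition and the check that a target base falls inside the editing window can be sketched as follows. The sketch assumes an SpCas9-type 3′ NGG PAM, a 20-nucleotide protospacer numbered 1 to 20 from the PAM-distal end, and an illustrative window of positions 4 to 8; the actual window depends on the base editing system used. The names `choose_editor` and `window_hits` and the toy sequence are hypothetical, and only forward-strand protospacers are scanned for brevity.

```python
# Transition (as written on the reference strand) -> (editor, strand deaminated)
TRANSITIONS = {
    ("C", "T"): ("CBE", "same"),      # C deaminated on the reference strand
    ("G", "A"): ("CBE", "opposite"),  # C deaminated on the opposite strand
    ("A", "G"): ("ABE", "same"),      # A deaminated on the reference strand
    ("T", "C"): ("ABE", "opposite"),  # A deaminated on the opposite strand
}


def choose_editor(ref, alt):
    """Return (editor, strand) for a transition, or None for any change
    that a CBE or ABE cannot make (e.g., a transversion)."""
    return TRANSITIONS.get((ref.upper(), alt.upper()))


def window_hits(seq, target_idx, window=(4, 8), spacer_len=20):
    """Find forward-strand NGG protospacers that place seq[target_idx]
    inside the editing window.

    Returns tuples of (protospacer_start, position_in_protospacer,
    bystanders), where bystanders counts other bases of the same kind
    inside the window that might also be edited.
    """
    seq = seq.upper()
    base = seq[target_idx]
    out = []
    for start in range(len(seq) - spacer_len - 2):
        if seq[start + spacer_len + 1:start + spacer_len + 3] != "GG":
            continue  # no NGG PAM immediately 3' of this protospacer
        pos = target_idx - start + 1  # 1-based position in protospacer
        if window[0] <= pos <= window[1]:
            win = seq[start + window[0] - 1:start + window[1]]
            out.append((start, pos, win.count(base) - 1))
    return out


if __name__ == "__main__":
    toy = "AAAAAC" + "A" * 14 + "TGG" + "AAA"  # hypothetical sequence
    print(choose_editor("C", "T"))
    print(window_hits(toy, target_idx=5))
```

A nonzero bystander count flags guides for which a narrower-window editor or a different protospacer may be preferable.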
In one example embodiment, a method of treating subjects suffering from, or at risk of developing, T2D comprises administering an ARCUS base editing system. Exemplary methods for using ARCUS can be found in U.S. Pat. No. 10,851,358, US Publication No. 2020-0239544, and WIPO Publication No. 2020/206231 which are incorporated herein by reference.
In one example embodiment, a method of treating subjects suffering from, or at risk of developing, T2D comprises administering a prime editing system that corrects one or more variants associated with decreased expression or activity of COBLL1 in adipocytes and/or adipocyte progenitors. In one example embodiment, a method of treating subjects suffering from, or at risk of developing, lipodystrophy comprises administering a prime editing system that corrects one or more variants associated with decreased expression or activity of BCL2 in skeletal muscle or AMSCs and/or KDSR in AMSCs. In an example embodiment, a method of treating subjects suffering from, or at risk of developing, lipodystrophy comprises administering a prime editing system that corrects one or more variants associated with increased expression or activity of VPS4B in AMSCs. In one example embodiment, a prime editing system comprises a Cas polypeptide having nickase activity, a reverse transcriptase, and a prime editing guide RNA (pegRNA). The Cas polypeptide and the reverse transcriptase can be coupled together or otherwise associate with each other to form a prime editing complex and edit a target sequence. The Cas polypeptide may be any of the Cas polypeptides disclosed above. In one example embodiment, the Cas polypeptide is a Type II Cas polypeptide. In another example embodiment, the Cas polypeptide is a Cas9 nickase. In one example embodiment, the Cas polypeptide is a Type V Cas polypeptide. In another example embodiment, the Cas polypeptide is a Cas12a or Cas12b.
The prime editing guide molecule (pegRNA) comprises a primer binding site (PBS) configured to hybridize with a portion of a nicked strand on a target polynucleotide (e.g., genomic DNA), a reverse transcriptase (RT) template comprising the edit to be inserted in the genomic DNA, and a spacer sequence designed to hybridize to a target sequence at the site of the desired edit. The nicking site is dependent on the Cas polypeptide used and the standard cutting preference for that Cas polypeptide relative to the PAM. Thus, based on the Cas polypeptide used, a pegRNA can be designed to direct the prime editing system to introduce a nick where the desired edit should take place. In one example embodiment, a pegRNA is configured to direct the prime editing system to convert a single base or base pair of the one or more variants associated with reduced COBLL1 expression to a wild-type or non-risk variant. In one example embodiment, a pegRNA is configured to direct the prime editing system to convert a single base or base pair of one or more variants associated with reduced POU2F2 binding to an enhancer controlling COBLL1 expression such that POU2F2 binding affinity to the enhancer is restored. In another example embodiment, a pegRNA is configured to direct the prime editing system to convert the C of rs6712203 to a T. In another example embodiment, a pegRNA is configured to direct the prime editing system to excise a portion of genomic DNA comprising one or more variants associated with reduced expression of COBLL1 and replace it with a sequence in which the one or more variants are replaced with a wild-type or non-risk variant. In another example embodiment, a pegRNA is configured to direct the prime editing system to excise a portion of genomic DNA comprising one or more variants that reduce POU2F2 binding to an enhancer controlling COBLL1 expression such that the binding affinity of POU2F2 is restored. In one example embodiment, the one or more variants comprise rs6712203.
Accordingly, in one example embodiment, a pegRNA is configured to direct the prime editing system to excise a portion of genomic DNA comprising rs6712203 and replace it with a polynucleotide sequence in which the C of rs6712203 is replaced with a T.
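By way of non-limiting illustration only, the derivation of the spacer, RT template, and PBS for a single-base substitution can be sketched as follows. The sketch assumes an SpCas9-type nickase that nicks the PAM-containing strand three nucleotides 5′ of an NGG PAM, consistent with the general design described in Anzalone et al. 2019; the function name `design_pegrna`, the default PBS and RT template lengths, and the toy sequence are hypothetical. Sequences are written in the DNA alphabet, and the guide scaffold between the spacer and the 3′ extension is omitted.

```python
def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_pegrna(pam_strand, pam_start, edit_idx, new_base,
                  pbs_len=13, rt_len=13, spacer_len=20):
    """Derive pegRNA parts for a one-base substitution.

    pam_strand : sequence of the PAM-containing strand, 5'->3'
    pam_start  : 0-based index of the N in the NGG PAM
    edit_idx   : 0-based index of the base to change on pam_strand
    Assumption: the nick falls 3 nt 5' of the PAM (SpCas9-type nickase).
    """
    s = pam_strand.upper()
    if s[pam_start + 1:pam_start + 3] != "GG":
        raise ValueError("no NGG PAM at pam_start")
    nick = pam_start - 3  # index of the first base 3' of the nick
    if nick - pbs_len < 0 or pam_start - spacer_len < 0:
        raise ValueError("not enough sequence 5' of the nick")
    if nick + rt_len > len(s):
        raise ValueError("not enough sequence 3' of the nick")
    if not nick <= edit_idx < nick + rt_len:
        raise ValueError("edit must lie within the RT template")
    edited = s[:edit_idx] + new_base.upper() + s[edit_idx + 1:]
    # PBS pairs with the 3' end of the nicked strand (just 5' of the nick).
    pbs = revcomp(s[nick - pbs_len:nick])
    # RT template encodes the edited sequence 3' of the nick.
    rt_template = revcomp(edited[nick:nick + rt_len])
    return {
        "spacer": s[pam_start - spacer_len:pam_start],
        "rt_template": rt_template,
        "pbs": pbs,
        "extension": rt_template + pbs,  # 5'->3' order of the 3' extension
    }


if __name__ == "__main__":
    toy = "GATTACAGATTACAGATTACTGGCATCAT"  # hypothetical sequence
    print(design_pegrna(toy, pam_start=20, edit_idx=18, new_base="G",
                        pbs_len=5, rt_len=6))
```

The PBS and RT template lengths generally need empirical optimization for each target, so the defaults above are placeholders rather than recommendations.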
The pegRNA can be about 10 to about 200 or more nucleotides in length, such as 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, or 200 or more nucleotides in length. Optimization of the peg guide molecule can be accomplished as described in Anzalone et al. 2019. Nature. 576:149-157, particularly at pg. 3,
In one example embodiment, a method of treating a subject suffering from, or at risk of developing, T2D comprises administering a CAST system that replaces a genomic region comprising one or more variants associated with decreased expression or activity of COBLL1 in adipocytes and/or adipocyte progenitors with a polynucleotide sequence comprising a wild-type sequence or non-risk variant. In one example embodiment, a CAST system is used to replace all or a portion of an enhancer controlling COBLL1 expression and comprising one or more variants that reduce POU2F2 binding to the enhancer. In one example embodiment, a CAST system is used to replace a portion of genomic DNA comprising the rs6712203 variant with a sequence that replaces the C of rs6712203 with a T.
In one example embodiment, a method of treating a subject suffering from, or at risk of developing, lipodystrophy comprises administering a CAST system that replaces a genomic region comprising one or more variants associated with decreased expression or activity of BCL2 in AMSCs or skeletal muscle and/or KDSR in AMSCs with a polynucleotide sequence comprising a wild-type sequence or non-risk variant. In one example embodiment, a method of treating a subject suffering from, or at risk of developing, lipodystrophy comprises administering a CAST system that replaces a genomic region comprising one or more variants associated with increased expression or activity of VPS4B in AMSCs. In one example embodiment, a CAST system is used to replace a portion of genomic DNA comprising the rs12454712 variant with a sequence that replaces the T of rs12454712 with a C.
CAST systems comprise a Cas polypeptide, a guide molecule, a transposase, and a donor construct. The transposase is linked to or otherwise capable of forming a complex with the Cas polypeptide. The donor construct comprises a donor sequence to be inserted into a target polynucleotide and one or more transposase recognition elements. The transposase is capable of binding the donor construct, excising the donor sequence, and directing insertion of the donor sequence into a target site on a target polynucleotide (e.g., genomic DNA). The guide molecule is capable of forming a CRISPR-Cas complex with the Cas polypeptide, and can be programmed to direct the entire CAST complex such that the transposase is positioned to insert the donor sequence at the target site on the target polynucleotide. For multimeric transposases, only those transposase subunits needed for recognition of the donor construct and transposition of the donor sequence into the target polynucleotide may be required. The Cas polypeptide may be naturally catalytically inactive or engineered to be catalytically inactive.
In one example embodiment, the CAST system is a Tn7-like CAST system, wherein the transposase comprises one or more polypeptides from a Tn7 or Tn7-like transposase. The Cas polypeptide of the Tn7-like CAST system may be a Class 1 (multimeric effector complex) or Class 2 (single protein effector) Cas polypeptide.
In one example embodiment, the Cas polypeptide is a Class 1 Type I-F Cas polypeptide. In one example embodiment, the Cas polypeptide may comprise a cas6, a cas7, and a cas8-cas5 fusion. In one example embodiment, the Tn7 transposase may comprise TnsB, TnsC, and TniQ. In another example embodiment, the Tn7 transposase may comprise TnsB, TnsC, and TnsD. In certain example embodiments, the Tn7 transposase may comprise TnsD, TnsE, or both. As used herein, the terms “TnsAB”, “TnsAC”, “TnsBC”, or “TnsABC” refer to a transposon complex comprising TnsA and TnsB, TnsA and TnsC, TnsB and TnsC, or TnsA and TnsB and TnsC, respectively. In these combinations, the transposases (TnsA, TnsB, TnsC) may form complexes or fusion proteins with each other. Similarly, the term “TnsABC-TniQ” refers to a transposon complex comprising TnsA, TnsB, TnsC, and TniQ, in the form of a complex or fusion protein. An example Type I-F Tn7 CAST system is described in Klompe et al. Nature, 2019, 571:219-224 and Vo et al. bioRxiv, 2021, doi.org/10.1101/2021.02.11.430876, which are incorporated herein by reference.
In one example embodiment, the Cas polypeptide is a Class 1 Type I-B Cas polypeptide. In one example embodiment, the Cas polypeptide may comprise a cas6, a cas7, and a cas8b (e.g., a cas8b3). In one example embodiment, the Tn7 transposase may comprise TnsB, TnsC, and TniQ. In another example embodiment, the Tn7 transposase may comprise TnsB, TnsC, and TnsD. In certain example embodiments, the Tn7 transposase may comprise TnsD, TnsE, or both. As used herein, the terms “TnsAB”, “TnsAC”, “TnsBC”, or “TnsABC” refer to a transposon complex comprising TnsA and TnsB, TnsA and TnsC, TnsB and TnsC, or TnsA and TnsB and TnsC, respectively. In these combinations, the transposases (TnsA, TnsB, TnsC) may form complexes or fusion proteins with each other. Similarly, the term “TnsABC-TniQ” refers to a transposon complex comprising TnsA, TnsB, TnsC, and TniQ, in the form of a complex or fusion protein.
In one example embodiment, the Cas polypeptide is a Class 2, Type V Cas polypeptide. In one example embodiment, the Type V Cas polypeptide is a Cas12k. In one example embodiment, the Tn7 transposase may comprise TnsB, TnsC, and TniQ. In another example embodiment, the Tn7 transposase may comprise TnsB, TnsC, and TnsD. In certain example embodiments, the Tn7 transposase may comprise TnsD, TnsE, or both. As used herein, the terms “TnsAB”, “TnsAC”, “TnsBC”, or “TnsABC” refer to a transposon complex comprising TnsA and TnsB, TnsA and TnsC, TnsB and TnsC, or TnsA and TnsB and TnsC, respectively. In these combinations, the transposases (TnsA, TnsB, TnsC) may form complexes or fusion proteins with each other. Similarly, the term “TnsABC-TniQ” refers to a transposon complex comprising TnsA, TnsB, TnsC, and TniQ, in the form of a complex or fusion protein. An example Cas12k-Tn7 CAST system is described in Strecker et al. Science, 2019, 365:48-53, which is incorporated herein by reference.
In one example embodiment, the CAST system is a Mu CAST system, wherein the transposase comprises one or more polypeptides of a Mu transposase. An example Mu CAST system is disclosed in WO/2021/041922 which is incorporated herein by reference.
In one example embodiment, the CAST system comprises a catalytically inactive Type II Cas polypeptide (e.g., dCas9) fused to one or more polypeptides of a Tn5 transposase. In another example embodiment, the CAST system comprises a catalytically inactive Type II Cas polypeptide (e.g., dCas9) fused to a piggyBac transposase.
The system may further comprise one or more donor polynucleotides (e.g., for insertion into the target polynucleotide). A donor polynucleotide may be an equivalent of a transposable element that can be inserted or integrated into a target site. The donor polynucleotide may be or comprise one or more components of a transposon. A donor polynucleotide may be any type of polynucleotide, including, but not limited to, a gene, a gene fragment, a non-coding polynucleotide, a regulatory polynucleotide, a synthetic polynucleotide, etc. The donor polynucleotide may include a transposon left end (LE) and transposon right end (RE). The LE and RE sequences may be endogenous sequences for the CAST used or may be heterologous sequences recognizable by the CAST used, or the LE or RE may be synthetic sequences that comprise a sequence or structural feature recognized by the CAST and sufficient to allow insertion of the donor polynucleotide into the target polynucleotide. In certain example embodiments, the LE and RE sequences are truncated. In certain example embodiments, the LE and RE sequences may be between 100-200 base pairs, 100-190 base pairs, 100-180 base pairs, 100-170 base pairs, 100-160 base pairs, 100-150 base pairs, 100-140 base pairs, 100-130 base pairs, 100-120 base pairs, 100-110 base pairs, 20-100 base pairs, 20-90 base pairs, 20-80 base pairs, 20-70 base pairs, 20-60 base pairs, 20-50 base pairs, 20-40 base pairs, 20-30 base pairs, 50-100 base pairs, 60-100 base pairs, 70-100 base pairs, 80-100 base pairs, or 90-100 base pairs in length.
The donor polynucleotide may be inserted at a position upstream or downstream of a PAM on a target polynucleotide. In some embodiments, a donor polynucleotide comprises a PAM sequence. Examples of PAM sequences include TTTN, ATTN, NGTN, RGTR, VGTD, or VGTR.
The donor polynucleotide may be inserted at a position between 10 bases and 200 bases, e.g., between 20 bases and 150 bases, between 30 bases and 100 bases, between 45 bases and 70 bases, between 45 bases and 60 bases, between 55 bases and 70 bases, between 49 bases and 56 bases or between 60 bases and 66 bases, from a PAM sequence on the target polynucleotide. In some cases, the insertion is at a position upstream of the PAM sequence. In some cases, the insertion is at a position downstream of the PAM sequence. In some cases, the insertion is at a position from 49 to 56 bases or base pairs downstream from a PAM sequence. In some cases, the insertion is at a position from 60 to 66 bases or base pairs downstream from a PAM sequence.
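By way of non-limiting illustration only, the expected insertion interval relative to a PAM can be computed as sketched below. The sketch assumes that “downstream” means the 3′ direction along the strand carrying the PAM and that coordinates are 0-based positions on the forward strand; the function name `insertion_window` and its parameters are hypothetical, and the default offsets simply reuse the 60 to 66 base range recited above.

```python
def insertion_window(pam_idx, strand="+", lo=60, hi=66, seq_len=None):
    """Return (start, end) forward-strand coordinates of the interval in
    which insertion is expected, lo..hi bases downstream of the PAM.

    pam_idx : forward-strand coordinate of the PAM reference base
    strand  : "+" if the PAM is read on the forward strand, "-" otherwise
    seq_len : optional target length; the interval is clipped to it
    Returns None if the interval falls entirely outside the target.
    """
    if strand == "+":
        start, end = pam_idx + lo, pam_idx + hi
    elif strand == "-":
        # Downstream on the reverse strand runs toward lower coordinates.
        start, end = pam_idx - hi, pam_idx - lo
    else:
        raise ValueError("strand must be '+' or '-'")
    start = max(start, 0)
    if seq_len is not None:
        end = min(end, seq_len - 1)
    return (start, end) if start <= end else None


def covers(window, position):
    """True if a forward-strand position lies inside the interval."""
    return window is not None and window[0] <= position <= window[1]


if __name__ == "__main__":
    w = insertion_window(100, strand="+")
    print(w, covers(w, 163))
```

Such a calculation may help when choosing a PAM so that the predicted insertion interval spans or flanks the genomic region to be replaced.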
The donor polynucleotide may be used for editing the target polynucleotide. In some cases, the donor polynucleotide comprises one or more mutations to be introduced into the target polynucleotide. Examples of such mutations include substitutions, deletions, insertions, or a combination thereof. The mutations may cause a shift in an open reading frame on the target polynucleotide. In some cases, the donor polynucleotide alters a stop codon in the target polynucleotide. For example, the donor polynucleotide may correct a premature stop codon. The correction may be achieved by deleting the stop codon or introducing one or more mutations to the stop codon. In other example embodiments, the donor polynucleotide addresses loss of function mutations, deletions, or translocations that may occur, for example, in certain disease contexts by inserting or restoring a functional copy of a gene, or functional fragment thereof, or a functional regulatory sequence or functional fragment of a regulatory sequence. A functional fragment refers to less than the entire copy of a gene that nonetheless provides sufficient nucleotide sequence to restore the functionality of a wild-type gene or non-coding regulatory sequence (e.g., sequences encoding long non-coding RNA). In certain example embodiments, the systems disclosed herein may be used to replace a single allele of a defective gene or defective fragment thereof. In another example embodiment, the systems disclosed herein may be used to replace both alleles of a defective gene or defective gene fragment. A “defective gene” or “defective gene fragment” is a gene or portion of a gene that when expressed fails to generate a functioning protein or non-coding RNA with the functionality of a corresponding wild-type gene. In certain example embodiments, these defective genes may be associated with one or more disease phenotypes.
In certain example embodiments, the defective gene or gene fragment is not replaced but the systems described herein are used to insert donor polynucleotides that encode gene or gene fragments that compensate for or override defective gene expression such that cell phenotypes associated with defective gene expression are eliminated or changed to a different or desired cellular phenotype.
In certain embodiments of the invention, the donor may include, but not be limited to, genes or gene fragments, encoding proteins or RNA transcripts to be expressed, regulatory elements, repair templates, and the like. According to the invention, the donor polynucleotides may comprise left end and right end sequence elements that function with transposition components that mediate insertion.
In certain cases, the donor polynucleotide manipulates a splicing site on the target polynucleotide. In some examples, the donor polynucleotide disrupts a splicing site. The disruption may be achieved by inserting the polynucleotide to a splicing site and/or introducing one or more mutations to the splicing site. In certain examples, the donor polynucleotide may restore a splicing site. For example, the polynucleotide may comprise a splicing site sequence.
The donor polynucleotide to be inserted may have a size from 10 bases to 50 kb in length, e.g., from 50 bases to 40 kb, from 100 bases to 30 kb, from 100 bases to 300 bases, from 200 bases to 400 bases, from 300 bases to 500 bases, from 400 bases to 600 bases, from 500 bases to 700 bases, from 600 bases to 800 bases, from 700 bases to 900 bases, from 800 bases to 1000 bases, from 900 bases to 1100 bases, from 1000 bases to 1200 bases, from 1100 bases to 1300 bases, from 1200 bases to 1400 bases, from 1300 bases to 1500 bases, from 1400 bases to 1600 bases, from 1500 bases to 1700 bases, from 1600 bases to 1800 bases, from 1700 bases to 1900 bases, from 1800 bases to 2000 bases, from 1900 bases to 2100 bases, from 2000 bases to 2200 bases, from 2100 bases to 2300 bases, from 2200 bases to 2400 bases, from 2300 bases to 2500 bases, from 2400 bases to 2600 bases, from 2500 bases to 2700 bases, from 2600 bases to 2800 bases, from 2700 bases to 2900 bases, or from 2800 bases to 3000 bases in length.
The components in the systems herein may comprise one or more mutations that alter their (e.g., the transposase(s)) binding affinity to the donor polynucleotide. In some examples, the mutations increase the binding affinity between the transposase(s) and the donor polynucleotide. In certain examples, the mutations decrease the binding affinity between the transposase(s) and the donor polynucleotide. The mutations may alter the activity of the Cas and/or transposase(s).
In certain embodiments, the systems disclosed herein are capable of unidirectional insertion, that is the system inserts the donor polynucleotide in only one orientation.
Delivery mechanisms for CAST systems include those discussed above for CRISPR-Cas systems.
In one example embodiment, a subject at risk for, or suffering from, Type-2 Diabetes (T2D) due to decreased COBLL1 expression or activity or aberrant actin remodeling is treated by transplanting AMSCs having normal function to adipose tissue in the subject (ACT). As used herein, “transplant” refers to transferring cells to a subject to replace or supplement cells or tissue causing disease and can be used interchangeably with “ACT”. The AMSCs may be obtained from a donor (allogenic) or obtained from the subject (autologous) and modified using gene therapy to have normal function when differentiated into adipocytes. As used herein, “ACT”, “adoptive cell therapy” and “adoptive cell transfer” may be used interchangeably. In another example embodiment, adoptive cell therapy (ACT) can refer to the transfer of cells to a patient with the goal of transferring the functionality and characteristics into the new host by engraftment of the cells. As used herein, the terms “engraft” or “engraftment” refer to the process of cell incorporation into a tissue of interest in vivo through contact with existing cells of the tissue. Adoptive cell therapy (ACT) can refer to the transfer of cells back into the same patient or into a new recipient host with the goal of transferring the functionality and characteristics into the new host (e.g., adipocyte function). In another example embodiment, use of autologous cells helps the subject by minimizing graft-versus-host disease (GVHD). In another example embodiment, allogenic AMSCs can be transferred to a subject, as AMSCs are hypoimmunogenic. In another example embodiment, allogenic cells can be edited to reduce alloreactivity and prevent GVHD. Thus, use of allogenic cells allows for cells to be obtained from healthy donors and prepared for use in patients as opposed to preparing autologous cells from a patient after diagnosis. In another example embodiment, gene therapy as described herein can be used to modify cells ex vivo before ACT.
In another example embodiment, a programmable nuclease is used to enhance expression of the endogenous COBLL1 gene. In another example embodiment, a polynucleotide sequence encoding COBLL1 is transferred to cells. In another example embodiment, genome editing is used to repair expression of the endogenous COBLL1 gene.
In another example embodiment, a programmable nuclease is used to enhance expression of the endogenous BCL2 gene. In another example embodiment, a polynucleotide sequence encoding BCL2 is transferred to cells. In another example embodiment, genome editing is used to repair expression of the endogenous BCL2 gene. In another example embodiment, a programmable nuclease is used to enhance expression of the endogenous KDSR gene. In another example embodiment, a polynucleotide sequence encoding KDSR is transferred to cells. In another example embodiment, genome editing is used to repair expression of the endogenous KDSR gene. In another example embodiment, a programmable nuclease is used to reduce expression of the endogenous VPS4B gene. In another example embodiment, genome editing is used to repair expression of the endogenous VPS4B gene. In another example embodiment, a programmable nuclease is used to enhance expression of the endogenous VPS4B gene. In another example embodiment, the modified cells can be implanted in the human or animal body to obtain the desired therapeutic effect.
Mesenchymal stem cells are multipotent stromal cells that can differentiate into a variety of cell types, including osteoblasts (bone cells), chondrocytes (cartilage cells), myocytes (muscle cells) and adipocytes, which are fat cells that give rise to marrow adipose tissue. The bone marrow (BM) stroma contains a heterogeneous population of cells, including endothelial cells, fibroblasts, adipocytes and osteogenic cells, and it was initially thought to function primarily as a structural framework upon which hematopoiesis occurs. However, it turns out that at least two distinct stem cell populations reside in the bone marrow of many mammalian species: hematopoietic stem cells (HSCs) and mesenchymal stem cells (MSCs), with the latter responsible for the maintenance of the non-hematopoietic bone marrow cells. MSCs, also termed multipotent marrow stromal cells or mesenchymal stromal cells, are a heterogeneous population of plastic-adherent, fibroblast-like cells, which can self-renew and differentiate into bone, adipose and cartilage tissue in culture. Single cell suspensions of BM stroma can generate colonies of adherent fibroblast-like cells in vitro. These colony-forming unit fibroblasts (CFU-Fs) are capable of osteogenic differentiation and provide evidence for a clonogenic precursor for cells of the bone lineage. Functional in vitro characterization of the stromal compartment has also revealed its importance in regulating the proliferation, differentiation and survival of HSCs. CFU-F initiating cells in vivo have been shown to be quiescent, existing at a low frequency in human bone marrow.
Although MSCs are traditionally isolated from bone marrow, cells with MSC-like characteristics have been isolated from a variety of fetal, neonatal and adult tissues, including cord blood, peripheral blood, fetal liver and lung, adipose tissue, compact bone, dental pulp, dermis, human islet, adult brain, skeletal muscle, amniotic fluid, synovium, and the circulatory system. There is evidence indicating a perivascular location for these MSC-like cells in all tissues, implying that all MSCs are pericytes that closely encircle endothelial cells in capillaries and microvessels in multiple organs. Pericytes are thought to stabilize blood vessels, contribute to tissue homeostasis under physiological conditions, and play an active role in response to focal tissue injury through the release of bioactive molecules with trophic and immunomodulatory properties. Pericytes and adventitial cells also natively express mesenchymal markers and share similar gene expression profiles as well as developmental and differentiation potential with mesenchymal cells. Pericytes may represent a subpopulation of the total pool of assayable MSCs at least within the bone marrow.
AMSCs can be collected from a subject or donor and can be maintained and expanded in culture for long periods of time without losing their differentiation capacity (see, e.g., Mazini, et al. “Regenerative Capacity of Adipose Derived Stem Cells (ADSCs), Comparison with Mesenchymal Stem Cells (MSCs).” International journal of molecular sciences vol. 20, 10 2523. 22 May 2019, doi: 10.3390/ijms20102523; and Mazini L, Ezzoubi M, Malka G. Overview of current adipose-derived stem cell (ADSCs) processing involved in therapeutic advancements: flow chart and regulation updates before and after COVID-19. Stem Cell Res Ther. 2021; 12 (1): 1). In another example embodiment, AMSCs are isolated from the subcutaneous adipose tissue (see, e.g., Palumbo, et al. In vitro evaluation of different methods of handling human liposuction aspirate and their effect on adipocytes and adipose derived stem cells. J Cell Physiol. 2015; 230 (8): 1974-1981), which allows for them to be rapidly acquired in large numbers and with a high cellular activity. AMSCs are found in abundant quantities, are harvested by a minimally invasive procedure, can differentiate into multiple cell lineages in a regulated and reproducible manner, and can be safely transplanted in both autologous and allogeneic settings (see, e.g., Mazini, et al., 2019). Commercial kits for collection and separation of the stromal vascular fraction (SVF) to isolate AMSCs are available (see, e.g., Mazini, et al., 2019, Table 1). AMSC differentiation into adipocytes is well established and adipose tissue regeneration can be performed in vivo (see, e.g., Tsuji W, Rubin J P, Marra K G. Adipose-derived stem cells: Implications in tissue regeneration. World J Stem Cells. 2014; 6 (3): 312-321).
In one example embodiment, AMSCs are administered in combination with bio-engineered materials (e.g., biomaterials, growth factors, plastic support, nanostructures, polymers, etc., as a support of a tissue or organ repair based on tissue engineering) (see, e.g., Mazini, et al., 2019). In another example embodiment, adipose tissue is generated in vivo using a combination of AMSCs and scaffolds. In an example embodiment, acellular scaffolds in combination with drugs or growth factors are used. Exemplary scaffolds include, but are not limited to, type I collagen, fibrin, silk fibroin, alginate, hyaluronic acid, and matrigel (see, e.g., Choi, et al., Adipose tissue engineering for soft tissue regeneration. Tissue Eng Part B Rev. 2010; 16:413-426; Tsuji, et al., Adipogenesis induced by human adipose tissue-derived stem cells. Tissue Eng Part A. 2009; 15:83-93; and Ito, et al., Adipogenesis using human adipose tissue-derived stromal cells combined with a collagen/gelatin sponge sustaining release of basic fibroblast growth factor. J Tissue Eng Regen Med. 2012: Epub ahead of print). In an example embodiment, injectable scaffolds are used, as minimally invasive therapies would be widely adopted by surgeons. In an example embodiment, methods of drug delivery include, but are not limited to, using polymeric microspheres to control the release of factors such as bFGF, insulin, and dexamethasone (see, e.g., Marra, et al., FGF-2 enhances vascularization for adipose tissue engineering. Plast Reconstr Surg. 2008; 121:1153-1164; Kimura, et al., Time course of de novo adipogenesis in matrigel by gelatin microspheres incorporating basic fibroblast growth factor. Tissue Eng. 2002; 8:603-613; and Rubin, et al., Encapsulation of adipogenic factors to promote differentiation of adipose-derived stem cells. J Drug Target. 2009; 17:207-215).
In one example embodiment, AMSCs are administered in a dose of about 1-5×10⁶ AMSCs/kg of body weight; however, the dose can be adjusted based on time and administration route and schedule.
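As a rough illustration of the weight-based dosing arithmetic above, the following sketch computes the total cell count corresponding to about 1-5×10⁶ AMSCs/kg for a given body weight. The function name `amsc_dose_range` and its parameters are hypothetical and are shown only to make the calculation concrete; the sketch does not model any adjustment for administration route or schedule.

```python
def amsc_dose_range(body_weight_kg, low_per_kg=1e6, high_per_kg=5e6):
    """Return (low, high) total AMSC counts for a subject.

    Assumption: the per-kg bounds default to the approximate
    1-5 x 10^6 AMSCs/kg range stated in the text.
    """
    if body_weight_kg <= 0:
        raise ValueError("body_weight_kg must be positive")
    # Total cells scale linearly with body weight.
    return (low_per_kg * body_weight_kg, high_per_kg * body_weight_kg)


if __name__ == "__main__":
    low, high = amsc_dose_range(70)
    print(f"70 kg subject: {low:.2e} to {high:.2e} AMSCs")
```

For a 70 kg subject, this corresponds to roughly 7×10⁷ to 3.5×10⁸ cells in total.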
In one example embodiment, allogenic AMSCs are used for ACT. In another example embodiment, donors for allogenic AMSCs are screened for mutations/variants that decrease COBLL1 expression as described herein. In another example embodiment, COBLL1 expression is modified in allogenic cells even in situations where the cells do not have a COBLL1 variant or a decrease in function. In another example embodiment, increased COBLL1 expression or activity in transferred cells can compensate for host cells having decreased expression or activity. AMSCs are commonly known for their low immunogenicity and modulatory effects (see, e.g., Puissant, et al. Immunomodulatory effect of human adipose tissue-derived adult stem cells: comparison with bone marrow mesenchymal stem cells. Br J Haematol. 2005; 129 (1): 118-129). Less than 1% of AMSCs express the HLA-DR protein on their surface, leading to immunosuppressive effects and making them suitable for clinical applications in allogeneic transplantation and in therapies for the treatment of resistant immune disorders. Id. Further, adipogenic differentiated allogenic AMSCs can form new adipose tissue without immune rejection, such that adipogenic differentiated AMSCs can be used as a “universal donor” for soft-tissue engineering in MHC-mismatched recipients (see, e.g., Kim, et al., Clinical implication of allogenic implantation of adipogenic differentiated adipose-derived stem cells. Stem Cells Transl Med. 2014; 3 (11): 1312-1321).
In one example embodiment, the potential immunogenicity of allogeneic cells might cause their rejection after infusion. AMSC differentiation may alter their immunogenic phenotype, increasing HLA class-I and HLA class-II expression (see, e.g., Ceccarelli, et al., Immunomodulatory Effect of Adipose-Derived Stem Cells: The Cutting Edge of Clinical Application. Front Cell Dev Biol. 2020; 8:236). In another example embodiment, adipose tissue from HLA identical siblings, haplo-identical relatives, or HLA-screened healthy volunteers is used for collection and storage until used in an HLA-matched patient for allogenic transfer.
In one example embodiment, autologous AMSCs are used for ACT. In one embodiment, autologous AMSCs are used for chronic pathologies because the time required for the isolation and expansion of cells is not a limit given the non-acute nature of the diseases (e.g., T2D, lipodystrophy). In another example embodiment, autologous AMSCs are obtained from a subject in need thereof and cells for ACT are genetically modified using any of the methods described herein (e.g., repair of the mutation decreasing expression of COBLL1 or BCL2, overexpressing COBLL1 or BCL2 using gene therapy). CRISPR-Cas editing has been used to repair a variant in primary adipocytes and AMSCs (see, e.g., Claussnitzer, et al. FTO Obesity Variant Circuitry and Adipocyte Browning in Humans. N Engl J Med. 2015; 373 (10): 895-907).
Also described herein are pharmaceutical formulations that can contain an amount, effective amount, and/or least effective amount, and/or therapeutically effective amount of one or more compounds, molecules, compositions, vectors, vector systems, cells as described above, or a combination thereof (which are also referred to as the primary active agent or ingredient elsewhere herein) described in greater detail elsewhere herein and a pharmaceutically acceptable carrier or excipient. As used herein, “pharmaceutical formulation” refers to the combination of an active agent, compound, or ingredient with a pharmaceutically acceptable carrier or excipient, making the composition suitable for diagnostic, therapeutic, or preventive use in vitro, in vivo, or ex vivo. As used herein, “pharmaceutically acceptable carrier or excipient” refers to a carrier or excipient that is useful in preparing a pharmaceutical formulation that is generally safe, non-toxic, and is neither biologically nor otherwise undesirable, and includes a carrier or excipient that is acceptable for veterinary use as well as human pharmaceutical use. A “pharmaceutically acceptable carrier or excipient” as used in the specification and claims includes both one and more than one such carrier or excipient. When present, the compound can optionally be present in the pharmaceutical formulation as a pharmaceutically acceptable salt. In some embodiments, the pharmaceutical formulation can include, as an active ingredient, a CRISPR-Cas system or component thereof described in greater detail elsewhere herein. In some embodiments, the pharmaceutical formulation can include, as an active ingredient, a CRISPR-Cas polynucleotide described in greater detail elsewhere herein. In some embodiments, the pharmaceutical formulation can include, as an active ingredient, one or more modified cells, such as one or more modified cells described in greater detail elsewhere herein.
In some embodiments, the active ingredient is present as a pharmaceutically acceptable salt of the active ingredient. As used herein, “pharmaceutically acceptable salt” refers to any acid or base addition salt whose counter-ions are non-toxic to the subject to which they are administered in pharmaceutical doses of the salts. Suitable salts include hydrobromide, iodide, nitrate, bisulfate, phosphate, isonicotinate, lactate, salicylate, acid citrate, tartrate, oleate, tannate, pantothenate, bitartrate, ascorbate, succinate, maleate, gentisinate, fumarate, gluconate, glucuronate, saccharate, formate, benzoate, glutamate, methanesulfonate, ethanesulfonate, benzenesulfonate, p-toluenesulfonate, camphorsulfonate, naphthalenesulfonate, propionate, malonate, mandelate, malate, phthalate, and pamoate.
The pharmaceutical formulations described herein can be administered to a subject in need thereof via any suitable method or route. Suitable administration routes can include, but are not limited to, auricular (otic), buccal, conjunctival, cutaneous, dental, electro-osmosis, endocervical, endosinusial, endotracheal, enteral, epidural, extra-amniotic, extracorporeal, hemodialysis, infiltration, interstitial, intra-abdominal, intra-amniotic, intra-arterial, intra-articular, intrabiliary, intrabronchial, intrabursal, intracardiac, intracartilaginous, intracaudal, intracavernous, intracavitary, intracerebral, intracisternal, intracorneal, intracoronal (dental), intracoronary, intracorporus cavernosum, intradermal, intradiscal, intraductal, intraduodenal, intradural, intraepidermal, intraesophageal, intragastric, intragingival, intraileal, intralesional, intraluminal, intralymphatic, intramedullary, intrameningeal, intramuscular, intraocular, intraovarian, intrapericardial, intraperitoneal, intrapleural, intraprostatic, intrapulmonary, intrasinal, intraspinal, intrasynovial, intratendinous, intratesticular, intrathecal, intrathoracic, intratubular, intratumor, intratympanic, intrauterine, intravascular, intravenous, intravenous bolus, intravenous drip, intraventricular, intravesical, intravitreal, iontophoresis, irrigation, laryngeal, nasal, nasogastric, occlusive dressing technique, ophthalmic, oral, oropharyngeal, other, parenteral, percutaneous, periarticular, peridural, perineural, periodontal, rectal, respiratory (inhalation), retrobulbar, soft tissue, subarachnoid, subconjunctival, subcutaneous, sublingual, submucosal, topical, transdermal, transmucosal, transplacental, transtracheal, transtympanic, ureteral, urethral, and/or vaginal administration, and/or any combination of the above administration routes, which typically depends on the disease to be treated and/or the active ingredient(s).
Where appropriate, compounds, molecules, compositions, vectors, vector systems, cells, or a combination thereof described in greater detail elsewhere herein can be provided to a subject in need thereof as an ingredient, such as an active ingredient or agent, in a pharmaceutical formulation. As such, also described are pharmaceutical formulations containing one or more of the compounds and salts thereof, or pharmaceutically acceptable salts thereof described herein. Suitable salts include hydrobromide, iodide, nitrate, bisulfate, phosphate, isonicotinate, lactate, salicylate, acid citrate, tartrate, oleate, tannate, pantothenate, bitartrate, ascorbate, succinate, maleate, gentisinate, fumarate, gluconate, glucuronate, saccharate, formate, benzoate, glutamate, methanesulfonate, ethanesulfonate, benzenesulfonate, p-toluenesulfonate, camphorsulfonate, naphthalenesulfonate, propionate, malonate, mandelate, malate, phthalate, and pamoate.
In some embodiments, the subject in need thereof has or is suspected of having Type 2 diabetes or a symptom thereof. In some embodiments, the subject in need thereof has or is suspected of having a metabolic disease or disorder, insulin resistance, or glucose intolerance, or a combination thereof. As used herein, “agent” refers to any substance, compound, molecule, and the like, which can be biologically active or otherwise can induce a biological and/or physiological effect on a subject to which it is administered. As used herein, “active agent” or “active ingredient” refers to a substance, compound, or molecule, which is biologically active or otherwise induces a biological or physiological effect on a subject to which it is administered. In other words, “active agent” or “active ingredient” refers to a component or components of a composition to which the whole or part of the effect of the composition is attributed. An agent can be a primary active agent, or in other words, the component(s) of a composition to which the whole or part of the effect of the composition is attributed. An agent can be a secondary agent, or in other words, the component(s) of a composition to which an additional part and/or other effect of the composition is attributed.
The pharmaceutical formulation can include a pharmaceutically acceptable carrier. Suitable pharmaceutically acceptable carriers include, but are not limited to water, salt solutions, alcohols, gum arabic, vegetable oils, benzyl alcohols, polyethylene glycols, gelatin, carbohydrates such as lactose, amylose or starch, magnesium stearate, talc, silicic acid, viscous paraffin, perfume oil, fatty acid esters, hydroxy methylcellulose, and polyvinyl pyrrolidone, which do not deleteriously react with the active composition.
The pharmaceutical formulations can be sterilized, and if desired, mixed with agents, such as lubricants, preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic pressure, buffers, coloring, flavoring and/or aromatic substances, and the like which do not deleteriously react with the active compound.
In some embodiments, the pharmaceutical formulation can also include an effective amount of secondary active agents, including, but not limited to, biologic agents or molecules such as polynucleotides, amino acids, peptides, polypeptides, antibodies, aptamers, ribozymes, hormones, immunomodulators, antipyretics, anxiolytics, antipsychotics, analgesics, antispasmodics, anti-inflammatories, anti-histamines, anti-infectives, chemotherapeutics, and combinations thereof.
In some embodiments, the amount of the primary active agent and/or optional secondary agent can be an effective amount, least effective amount, and/or therapeutically effective amount. As used herein, “effective amount” refers to the amount of the primary and/or optional secondary agent included in the pharmaceutical formulation that achieves one or more therapeutic effects or desired effects. As used herein, “least effective amount” refers to the lowest amount of the primary and/or optional secondary agent that achieves the one or more therapeutic or other desired effects. As used herein, “therapeutically effective amount” refers to the amount of the primary and/or optional secondary agent included in the pharmaceutical formulation that achieves one or more therapeutic effects. In some embodiments, the one or more therapeutic effects are promoting actin cytoskeleton remodeling processes, promoting accumulation of lipids in targeted cells, and promoting insulin-sensitivity.
The effective amount, least effective amount, and/or therapeutically effective amount of the primary and optional secondary active agent described elsewhere herein contained in the pharmaceutical formulation can range from about 0 to 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 1000 pg, ng, μg, mg, or g or be any numerical value within any of these ranges.
In some embodiments, the effective amount, least effective amount, and/or therapeutically effective amount can be an effective concentration, least effective concentration, and/or therapeutically effective concentration, which can each range from about 0 to 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 1000 pM, nM, μM, mM, or M or be any numerical value within any of these ranges.
In other embodiments, the effective amount, least effective amount, and/or therapeutically effective amount of the primary and optional secondary active agent can range from about 0 to 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 1000 IU or be any numerical value within any of these ranges.
In some embodiments, the primary and/or the optional secondary active agent present in the pharmaceutical formulation can range from about 0 to 0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008, 0.009, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.2, 0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.3, 0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.4, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.5, 0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.6, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.7, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.8, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.9, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99, to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 99.1, 99.2, 99.3, 99.4, 99.5, 99.6, 99.7, 99.8, 99.9% w/w, v/v, or w/v of the pharmaceutical formulation.
In some embodiments where a cell population is present in the pharmaceutical formulation (e.g., as a primary and/or secondary active agent), the effective amount of cells can range from about 2 cells to 1×10¹/mL, 1×10²⁰/mL or more, such as about 1×10¹/mL, 1×10²/mL, 1×10³/mL, 1×10⁴/mL, 1×10⁵/mL, 1×10⁶/mL, 1×10⁷/mL, 1×10⁸/mL, 1×10⁹/mL, 1×10¹⁰/mL, 1×10¹¹/mL, 1×10¹²/mL, 1×10¹³/mL, 1×10¹⁴/mL, 1×10¹⁵/mL, 1×10¹⁶/mL, 1×10¹⁷/mL, 1×10¹⁸/mL, 1×10¹⁹/mL, to/or about 1×10²⁰/mL.
In some embodiments, particularly where an infective particle is being delivered (e.g. a virus particle having the primary or secondary agent as a cargo), the effective amount of virus particles can be expressed as a titer (plaque forming units per unit of volume) or as a MOI (multiplicity of infection). In some embodiments, the effective amount can be 1×10¹ particles per pL, nL, μL, mL, or L to 1×10²⁰ particles per pL, nL, μL, mL, or L or more, such as about 1×10¹, 1×10², 1×10³, 1×10⁴, 1×10⁵, 1×10⁶, 1×10⁷, 1×10⁸, 1×10⁹, 1×10¹⁰, 1×10¹¹, 1×10¹², 1×10¹³, 1×10¹⁴, 1×10¹⁵, 1×10¹⁶, 1×10¹⁷, 1×10¹⁸, 1×10¹⁹, to/or about 1×10²⁰ particles per pL, nL, μL, mL, or L. In some embodiments, the effective titer can be about 1×10¹ transforming units per pL, nL, μL, mL, or L to 1×10²⁰ transforming units per pL, nL, μL, mL, or L or more, such as about 1×10¹, 1×10², 1×10³, 1×10⁴, 1×10⁵, 1×10⁶, 1×10⁷, 1×10⁸, 1×10⁹, 1×10¹⁰, 1×10¹¹, 1×10¹², 1×10¹³, 1×10¹⁴, 1×10¹⁵, 1×10¹⁶, 1×10¹⁷, 1×10¹⁸, 1×10¹⁹, to/or about 1×10²⁰ transforming units per pL, nL, μL, mL, or L. In some embodiments, the MOI of the pharmaceutical formulation can range from about 0.1 to 10 or more, such as 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 6, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, 7, 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 8, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9, 9, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 10 or more.
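To make the relationship between titer and MOI concrete, the following sketch estimates the volume of a viral stock needed to reach a target MOI. It assumes the conventional definition of MOI as the ratio of infectious (or transforming) units to target cells, which is not spelled out in the text; the function name `stock_volume_ml` and its parameters are hypothetical.

```python
def stock_volume_ml(moi, n_cells, titer_per_ml):
    """Volume (mL) of viral stock needed to reach a target MOI.

    Assumption: MOI = infectious units / number of target cells,
    and titer_per_ml is expressed in infectious units per mL.
    """
    if moi <= 0 or n_cells <= 0 or titer_per_ml <= 0:
        raise ValueError("moi, n_cells, and titer_per_ml must be positive")
    # Required infectious units divided by units available per mL.
    return moi * n_cells / titer_per_ml


if __name__ == "__main__":
    # Example: MOI of 5 on 1e6 cells with a stock at 1e8 units/mL.
    print(stock_volume_ml(5, 1e6, 1e8), "mL")
```

Under these assumptions, an MOI of 5 on 1×10⁶ cells with a 1×10⁸ units/mL stock calls for about 0.05 mL of stock.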
In some embodiments, the amount or effective amount of the one or more of the active agent(s) described herein contained in the pharmaceutical formulation can range from about 1 pg/kg to about 10 mg/kg based upon the bodyweight of the subject in need thereof or average bodyweight of the specific patient population to which the pharmaceutical formulation can be administered.
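The bodyweight-based range above can be illustrated with a short sketch that converts a per-kilogram amount into a total amount for a subject. The names `MG_PER_UNIT` and `total_dose_mg` are hypothetical, and the hard range check is a simplification: the text states "about" 1 pg/kg to "about" 10 mg/kg, so the strict bounds here are an assumption made only for illustration.

```python
# Conversion factors from each mass unit to milligrams.
MG_PER_UNIT = {"pg": 1e-9, "ng": 1e-6, "ug": 1e-3, "mg": 1.0}


def total_dose_mg(amount_per_kg, unit, body_weight_kg):
    """Total dose in mg for a per-kg amount given in pg, ng, ug, or mg."""
    if unit not in MG_PER_UNIT:
        raise ValueError(f"unsupported unit: {unit}")
    if body_weight_kg <= 0:
        raise ValueError("body_weight_kg must be positive")
    per_kg_mg = amount_per_kg * MG_PER_UNIT[unit]
    # Illustrative bounds taken from the stated range (1 pg/kg to 10 mg/kg).
    if not (1e-9 <= per_kg_mg <= 10.0):
        raise ValueError("amount is outside the 1 pg/kg to 10 mg/kg range")
    return per_kg_mg * body_weight_kg


if __name__ == "__main__":
    # 2 ug/kg for a 70 kg subject.
    print(total_dose_mg(2, "ug", 70), "mg")
```

The same function can be applied to the average bodyweight of a patient population instead of an individual's bodyweight.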
In embodiments where there is a secondary agent contained in the pharmaceutical formulation, the effective amount of the secondary active agent will vary depending on the secondary agent, the primary agent, the administration route, subject age, disease, and stage of disease, among other things, as will be appreciated by one of ordinary skill in the art.
When optionally present in the pharmaceutical formulation, the secondary active agent can be included in the pharmaceutical formulation or can exist as a stand-alone compound or pharmaceutical formulation that can be administered contemporaneously or sequentially with the compound, derivative thereof, or pharmaceutical formulation thereof.
In some embodiments, the effective amount of the secondary active agent can range from about 0 to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 99.1, 99.2, 99.3, 99.4, 99.5, 99.6, 99.7, 99.8, 99.9% w/w, v/v, or w/v of the total secondary active agent in the pharmaceutical formulation. In additional embodiments, the effective amount of the secondary active agent can range from about 0 to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 99.1, 99.2, 99.3, 99.4, 99.5, 99.6, 99.7, 99.8, 99.9% w/w, v/v, or w/v of the total pharmaceutical formulation.
In some embodiments, the pharmaceutical formulations described herein can be provided in a dosage form. The dosage form can be administered to a subject in need thereof. The dosage form can be effective to generate a specific concentration, such as an effective concentration, at a given site in the subject in need thereof. As used herein, “dose,” “unit dose,” or “dosage” can refer to physically discrete units suitable for use in a subject, each unit containing a predetermined quantity of the primary active agent, and optionally present secondary active ingredient, and/or a pharmaceutical formulation thereof calculated to produce the desired response or responses in association with its administration. In some embodiments, the given site is proximal to the administration site. In some embodiments, the given site is distal to the administration site. In some cases, the dosage form contains a greater amount of one or more of the active ingredients present in the pharmaceutical formulation than the final intended amount needed to reach a specific region or location within the subject to account for loss of the active components such as via first and second pass metabolism.
The dosage forms can be adapted for administration by any appropriate route. Appropriate routes include, but are not limited to, oral (including buccal or sublingual), rectal, intraocular, inhaled, intranasal, topical (including buccal, sublingual, or transdermal), vaginal, parenteral, subcutaneous, intramuscular, intravenous, and intradermal. Other appropriate routes are described elsewhere herein. Such formulations can be prepared by any method known in the art.
Dosage forms adapted for oral administration can be discrete dosage units such as capsules, pellets or tablets, powders or granules, solutions, or suspensions in aqueous or non-aqueous liquids; edible foams or whips; or oil-in-water liquid emulsions or water-in-oil liquid emulsions. In some embodiments, the pharmaceutical formulations adapted for oral administration also include one or more agents which flavor, preserve, color, or help disperse the pharmaceutical formulation. Dosage forms prepared for oral administration can also be in the form of a liquid solution that can be delivered as a foam, spray, or liquid solution. The oral dosage form can be administered to a subject in need thereof. Where appropriate, the dosage forms described herein can be microencapsulated.
The dosage form can also be prepared to prolong or sustain the release of any ingredient. In some embodiments, compounds, molecules, compositions, vectors, vector systems, cells, or a combination thereof described herein can be the ingredient whose release is delayed. In some embodiments the primary active agent is the ingredient whose release is delayed. In some embodiments, an optional secondary agent can be the ingredient whose release is delayed. Suitable methods for delaying the release of an ingredient include, but are not limited to, coating or embedding the ingredients in materials such as polymers, wax, gels, and the like. Delayed release dosage formulations can be prepared as described in standard references such as “Pharmaceutical dosage form tablets,” eds. Lieberman et al. (New York, Marcel Dekker, Inc., 1989), “Remington: The science and practice of pharmacy”, 20th ed., Lippincott Williams & Wilkins, Baltimore, MD, 2000, and “Pharmaceutical dosage forms and drug delivery systems”, 6th Edition, Ansel et al., (Media, PA: Williams and Wilkins, 1995). These references provide information on excipients, materials, equipment, and processes for preparing tablets and capsules and delayed release dosage forms of tablets and pellets, capsules, and granules. The delayed release can be anywhere from about an hour to about 3 months or more.
Examples of suitable coating materials include, but are not limited to, cellulose polymers such as cellulose acetate phthalate, hydroxypropyl cellulose, hydroxypropyl methylcellulose, hydroxypropyl methylcellulose phthalate, and hydroxypropyl methylcellulose acetate succinate; polyvinyl acetate phthalate, acrylic acid polymers and copolymers, and methacrylic resins that are commercially available under the trade name EUDRAGIT® (Röhm Pharma, Weiterstadt, Germany), zein, shellac, and polysaccharides.
Coatings may be formed with different ratios of water-soluble polymers, water-insoluble polymers, and/or pH-dependent polymers, with or without water-insoluble/water-soluble non-polymeric excipients, to produce the desired release profile. The coating is performed on the dosage form (matrix or simple), which includes, but is not limited to, tablets (compressed with or without coated beads), capsules (with or without coated beads), beads, particle compositions, and “ingredient as is” formulated as, but not limited to, a suspension form or a sprinkle dosage form.
Where appropriate, the dosage forms described herein can be a liposome. In these embodiments, primary active ingredient(s), and/or optional secondary active ingredient(s), and/or pharmaceutically acceptable salt thereof where appropriate are incorporated into a liposome. In embodiments where the dosage form is a liposome, the pharmaceutical formulation is thus a liposomal formulation. The liposomal formulation can be administered to a subject in need thereof.
Dosage forms adapted for topical administration can be formulated as ointments, creams, suspensions, lotions, powders, solutions, pastes, gels, sprays, aerosols, or oils. In some embodiments for treatments of the eye or other external tissues, for example the mouth or the skin, the pharmaceutical formulations are applied as a topical ointment or cream. When formulated in an ointment, a primary active ingredient, optional secondary active ingredient, and/or pharmaceutically acceptable salt thereof where appropriate can be formulated with a paraffinic or water-miscible ointment base. In other embodiments, the primary and/or secondary active ingredient can be formulated in a cream with an oil-in-water cream base or a water-in-oil base. Dosage forms adapted for topical administration in the mouth include lozenges, pastilles, and mouth washes.
Dosage forms adapted for nasal or inhalation administration include aerosols, solutions, suspension drops, gels, or dry powders. In some embodiments, a primary active ingredient, optional secondary active ingredient, and/or pharmaceutically acceptable salt thereof where appropriate in a dosage form adapted for inhalation is in a particle-size-reduced form that is obtained or obtainable by micronization. In some embodiments, the particle size of the size-reduced (e.g. micronized) compound or salt or solvate thereof is defined by a D50 value of about 0.5 to about 10 microns as measured by an appropriate method known in the art. Dosage forms adapted for administration by inhalation also include particle dusts or mists. Suitable dosage forms wherein the carrier or excipient is a liquid for administration as a nasal spray or drops include aqueous or oil solutions/suspensions of an active (primary and/or secondary) ingredient, which may be generated by various types of metered dose pressurized aerosols, nebulizers, or insufflators. The nasal/inhalation formulations can be administered to a subject in need thereof.
In some embodiments, the dosage forms are aerosol formulations suitable for administration by inhalation. In some of these embodiments, the aerosol formulation contains a solution or fine suspension of a primary active ingredient, secondary active ingredient, and/or pharmaceutically acceptable salt thereof where appropriate and a pharmaceutically acceptable aqueous or non-aqueous solvent. Aerosol formulations can be presented in single or multi-dose quantities in sterile form in a sealed container. For some of these embodiments, the sealed container is a single-dose or multi-dose nasal or aerosol dispenser fitted with a metering valve (e.g. metered dose inhaler), which is intended for disposal once the contents of the container have been exhausted.
Where the aerosol dosage form is contained in an aerosol dispenser, the dispenser contains a suitable propellant under pressure, such as compressed air, carbon dioxide, or an organic propellant, including but not limited to a hydrofluorocarbon. The aerosol formulation dosage forms in other embodiments are contained in a pump-atomizer. The pressurized aerosol formulation can also contain a solution or a suspension of a primary active ingredient, optional secondary active ingredient, and/or pharmaceutically acceptable salt thereof. In further embodiments, the aerosol formulation also contains co-solvents and/or modifiers incorporated to improve, for example, the stability and/or taste and/or fine particle mass characteristics (amount and/or profile) of the formulation. Administration of the aerosol formulation can be once daily or several times daily, for example 2, 3, 4, or 8 times daily, in which 1, 2, 3 or more doses are delivered each time. The aerosol formulations can be administered to a subject in need thereof.
For some dosage forms suitable and/or adapted for inhaled administration, the pharmaceutical formulation is a dry powder inhalable formulation. In addition to a primary active agent, optional secondary active ingredient, and/or pharmaceutically acceptable salt thereof where appropriate, such a dosage form can contain a powder base such as lactose, glucose, trehalose, mannitol, and/or starch. In some of these embodiments, a primary active agent, secondary active ingredient, and/or pharmaceutically acceptable salt thereof where appropriate is in a particle-size reduced form. In further embodiments, a performance modifier, such as L-leucine or another amino acid, cellobiose octaacetate, and/or metal salts of stearic acid, such as magnesium or calcium stearate, is included. In some embodiments, the aerosol formulations are arranged so that each metered dose of aerosol contains a predetermined amount of an active ingredient, such as the one or more of the compositions, compounds, vector(s), molecules, cells, and combinations thereof described herein.
Dosage forms adapted for vaginal administration can be presented as pessaries, tampons, creams, gels, pastes, foams, or spray formulations. Dosage forms adapted for rectal administration include suppositories or enemas. The vaginal formulations can be administered to a subject in need thereof.
Dosage forms adapted for parenteral administration and/or adapted for injection can include aqueous and/or non-aqueous sterile injection solutions, which can contain antioxidants, buffers, bacteriostats, solutes that render the composition isotonic with the blood of the subject, and aqueous and non-aqueous sterile suspensions, which can include suspending agents and thickening agents. The dosage forms adapted for parenteral administration can be presented in a single-unit dose or multi-unit dose containers, including but not limited to sealed ampoules or vials. The doses can be lyophilized and re-suspended in a sterile carrier to reconstitute the dose prior to administration. Extemporaneous injection solutions and suspensions can be prepared in some embodiments, from sterile powders, granules, and tablets. The parenteral formulations can be administered to a subject in need thereof.
For some embodiments, the dosage form contains a predetermined amount of a primary active agent, secondary active ingredient, and/or pharmaceutically acceptable salt thereof where appropriate per unit dose. In an embodiment, the predetermined amount of primary active agent, secondary active ingredient, and/or pharmaceutically acceptable salt thereof where appropriate can be an effective amount, a least effective amount, and/or a therapeutically effective amount. In other embodiments, the predetermined amount of a primary active agent, secondary active agent, and/or pharmaceutically acceptable salt thereof where appropriate, can be an appropriate fraction of the effective amount of the active ingredient.
In some embodiments, the pharmaceutical formulation(s) described herein can be part of a combination treatment or combination therapy. The combination treatment can include the pharmaceutical formulation described herein and an additional treatment modality. The additional treatment modality can be a chemotherapeutic, a biological therapeutic, surgery, radiation, diet modulation, environmental modulation, a physical activity modulation, and combinations thereof.
In some embodiments, the co-therapy or combination therapy can additionally include, but is not limited to, polynucleotides, amino acids, peptides, polypeptides, antibodies, aptamers, ribozymes, hormones, immunomodulators, antipyretics, anxiolytics, antipsychotics, analgesics, antispasmodics, anti-inflammatories, anti-histamines, anti-infectives, chemotherapeutics, and combinations thereof.
The pharmaceutical formulations or dosage forms thereof described herein can be administered one or more times hourly, daily, monthly, or yearly (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more times hourly, daily, monthly, or yearly). In some embodiments, the pharmaceutical formulations or dosage forms thereof described herein can be administered continuously over a period of time ranging from minutes to hours to days. Devices and dosage forms are known in the art and described herein that are effective to provide continuous administration of the pharmaceutical formulations described herein. In some embodiments, the first one or a few initial amount(s) administered can be a higher dose than subsequent doses. This is typically referred to in the art as a loading dose or doses and a maintenance dose, respectively. In some embodiments, the pharmaceutical formulations can be administered such that the doses are tapered (increased or decreased) over time so as to wean a subject gradually off of a pharmaceutical formulation or gradually introduce a subject to the pharmaceutical formulation.
As previously discussed, the pharmaceutical formulation can contain a predetermined amount of a primary active agent, secondary active agent, and/or pharmaceutically acceptable salt thereof where appropriate. In some of these embodiments, the predetermined amount can be an appropriate fraction of the effective amount of the active ingredient. Such unit doses may therefore be administered once or more than once a day, month, or year (e.g. 1, 2, 3, 4, 5, 6, or more times per day, month, or year). Such pharmaceutical formulations may be prepared by any of the methods well known in the art.
Where co-therapies or multiple pharmaceutical formulations are to be delivered to a subject, the different therapies or formulations can be administered sequentially or simultaneously. Sequential administration is administration where an appreciable amount of time occurs between administrations, such as more than about 15, 20, 30, 45, 60 minutes or more. The time between administrations in sequential administration can be on the order of hours, days, months, or even years, depending on the active agent present in each administration. Simultaneous administration refers to administration of two or more formulations at the same time or substantially at the same time (e.g. within seconds or just a few minutes apart), where the intent is that the formulations be administered together at the same time.
Compositions of the invention may be formulated for delivery to human subjects, as well as to animals for veterinary purposes (e.g. livestock (cattle, pigs, others)), and other non-human mammalian subjects. The dosage of the formulation can be measured or calculated as viral particles or as genome copies (“GC”)/viral genomes (“vg”). Any method known in the art can be used to determine the genome copy (GC) number of the viral compositions of the invention. In one example embodiment, the viral compositions can be formulated in dosage units to contain an amount of viral vectors that is in the range of about 1.0×10⁹ GC to about 1.0×10¹⁵ GC (to treat an average subject of 70 kg in body weight), and preferably 1.0×10¹² GC to 1.0×10¹⁴ GC for a human patient. Preferably, the dose of virus in the formulation is 1.0×10⁹ GC, 5.0×10⁹ GC, 1.0×10¹⁰ GC, 5.0×10¹⁰ GC, 1.0×10¹¹ GC, 5.0×10¹¹ GC, 1.0×10¹² GC, 5.0×10¹² GC, 1.0×10¹³ GC, 5.0×10¹³ GC, 1.0×10¹⁴ GC, 5.0×10¹⁴ GC, or 1.0×10¹⁵ GC.
The viral vectors can be formulated in a conventional manner using one or more physiologically acceptable carriers or excipients. The viral vectors may be formulated for parenteral administration by injection (e.g. by bolus injection or continuous infusion). Formulations for injection may be presented in unit dosage form (e.g. in ampoules or in multi-dose containers) with an added preservative. The viral compositions may take such forms as suspensions, solutions, or emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing, or dispersing agents. Liquid preparations of the viral vector formulations may be prepared by conventional means with pharmaceutically acceptable additives such as suspending agents (e.g. sorbitol syrup, cellulose derivatives or hydrogenated edible fats), emulsifying agents (e.g. lecithin or acacia), non-aqueous vehicles (e.g. almond oil, oily esters, ethyl alcohol or fractionated vegetable oils), and preservatives (e.g. methyl or propyl-p-hydroxybenzoates or sorbic acid). The preparations may also contain buffer salts. Alternatively, the compositions may be in powder form for constitution with a suitable vehicle (e.g. sterile pyrogen-free water) before use.
In one example embodiment, virus like particles (VLPs) are used to facilitate intracellular recombinant protein therapy (see, e.g., WO2020252455A1, U.S. Pat. No. 10,577,397B2). In certain embodiments, VLPs include a Gag-COBLL1 fusion protein. The Gag-COBLL1 fusion protein may include a matrix protein, a capsid protein, and/or a nucleocapsid protein covalently linked to COBLL1. In certain embodiments, the VLPs include a membrane comprising a phospholipid bilayer with one or more human endogenous retrovirus (HERV) derived ENV/glycoprotein(s) on the external side, a HERV-derived GAG protein in the VLP core, and a COBLL1 fusion protein on the inside of the membrane, wherein COBLL1 is fused to a human-endogenous GAG or other plasma membrane recruitment domain (see, e.g., WO2020252455A1). Fusion proteins can be obtained using standard recombinant protein technology.
In one example embodiment, cell-penetrating peptides (CPPs) are used to facilitate intracellular recombinant protein therapy (see, e.g., Dinca A, Chien W-M, Chin M T. Intracellular Delivery of Proteins with Cell-Penetrating Peptides for Therapeutic Uses in Human Disease. International Journal of Molecular Sciences. 2016; 17 (2): 263). In certain embodiments, cell-penetrating peptides can be conjugated to COBLL1, for example, using standard recombinant protein technology. In certain embodiments, cell-penetrating peptides can be concurrently delivered with recombinant COBLL1.
In one example embodiment, nanocarriers are used to facilitate intracellular recombinant protein therapy (see, e.g., Lee Y W, Luther D C, Kretzmann J A, Burden A, Jeon T, Zhai S, Rotello V M. Protein Delivery into the Cell Cytosol using Non-Viral Nanocarriers. Theranostics 2019; 9 (11): 3280-3292). Non-limiting examples of nanocarriers include nanoparticles (e.g., silica, gold), polymers, and lipid-based carriers (e.g., cationic lipid within a polymer shell, lipid-like nanoparticles).
The pharmaceutical composition of the invention may be administered locally or systemically. In a preferred embodiment, the pharmaceutical composition is administered near the tissue whose cells are to be transduced. In a particular embodiment, the pharmaceutical composition of the invention is administered locally to the subcutaneous adipose tissue, which is composed of varying amounts of the two different types of adipose tissue: white adipose tissue (WAT) that stores energy in the form of triacylglycerol (TAG) and brown adipose tissue (BAT) that dissipates energy as heat, “burning” fatty acids to maintain body temperature. In one example embodiment, the pharmaceutical composition of the invention is administered in the white adipose tissue (WAT) and/or in the brown adipose tissue (BAT) by intra-WAT or intra-BAT injection. In another preferred embodiment, the pharmaceutical composition of the invention is administered systemically.
The “adeno-associated virus” (AAV) can be formulated with a physiologically acceptable carrier for use in gene transfer and gene therapy applications. The dosage of the formulation can be measured or calculated as viral particles or as genome copies (“GC”)/viral genomes (“vg”). Any method known in the art can be used to determine the genome copy (GC) number of the viral compositions of the invention. One method for performing AAV GC number titration is as follows: purified AAV vector samples are first treated with DNase to eliminate un-encapsulated AAV genome DNA or contaminating plasmid DNA from the production process. The DNase-resistant particles are then subjected to heat treatment to release the genome from the capsid. The released genomes are then quantitated by real-time PCR using primer/probe sets targeting a specific region of the viral genome.
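As an illustration of the quantitation step, the following Python sketch converts a real-time PCR threshold cycle (Ct) into a genome copy titer. It is a minimal sketch under stated assumptions, not a prescribed protocol: it assumes a linear standard curve of the form Ct = slope × log10(copies) + intercept obtained from a dilution series of a reference standard, and the function names, parameter names, and numeric values are hypothetical. It does not correct for differences between single-stranded vector genomes and double-stranded standards.

```python
def copies_per_reaction(ct: float, slope: float, intercept: float) -> float:
    """Invert an assumed standard curve: Ct = slope * log10(copies) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def titer_gc_per_ml(ct: float, slope: float, intercept: float,
                    template_ul: float, dilution_factor: float) -> float:
    """Estimate genome copies (GC) per mL of the undiluted vector preparation.

    template_ul     -- volume of diluted sample added to the PCR reaction (uL)
    dilution_factor -- total fold dilution applied before PCR (hypothetical)
    """
    copies = copies_per_reaction(ct, slope, intercept)
    gc_per_ul = copies / template_ul * dilution_factor
    return gc_per_ul * 1000.0  # 1000 uL per mL


if __name__ == "__main__":
    # Hypothetical standard-curve parameters and sample reading.
    titer = titer_gc_per_ml(ct=17.0, slope=-3.5, intercept=38.0,
                            template_ul=2.0, dilution_factor=1000.0)
    print(f"{titer:.2e} GC/mL")
```

In practice the slope and intercept would be fitted from the standards run on the same plate as the samples.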
In any of the described methods the one or more vectors may be comprised in a delivery system. In any of the described methods the vectors may be delivered via liposomes, particles (e.g., nanoparticles), exosomes, microvesicles, or a gene gun. In any of the described methods viral vectors may be delivered by transduction of viral particles. The delivery systems may be administered systemically or by localized administration (e.g., direct injection). The terms “systemically administered” and “systemic administration”, as used herein, mean that the polynucleotides, vectors, polypeptides, or pharmaceutical compositions of the invention are administered to a subject in a non-localized manner. The systemic administration of the polynucleotides, vectors, polypeptides, or pharmaceutical compositions of the invention may reach several organs or tissues throughout the body of the subject or may reach specific organs or tissues of the subject. For example, the intravenous administration of a pharmaceutical composition of the invention may result in the transduction of more than one tissue or organ in a subject. The term “transduce” or “transduction”, as used herein, refers to the process whereby a foreign nucleotide sequence is introduced into a cell via a viral vector. The term “transfection”, as used herein, refers to the introduction of DNA into a recipient eukaryotic cell.
Recombinant protein compositions described herein may be administered systemically (e.g., intravenously) or administered locally to adipose tissue (e.g., injection). In preferred embodiments, the recombinant protein compositions are administered with an appropriate carrier to be administered to a mammal, especially a human, preferably a pharmaceutically acceptable composition. A “pharmaceutically acceptable composition” refers to a non-toxic semisolid, liquid, or aerosolized filler, diluent, encapsulating material, colloidal suspension or formulation auxiliary of any type. Preferably, this composition is suitable for injection. These may be in particular isotonic, sterile, saline solutions (monosodium or disodium phosphate, sodium, potassium, calcium or magnesium chloride and similar solutions or mixtures of such salts), or dry, especially freeze-dried compositions which upon addition, depending on the case, of sterilized water or physiological saline, permit the constitution of injectable solutions.
The CRISPR-Cas systems disclosed herein may be delivered using vectors comprising polynucleotides encoding the Cas polypeptide and the guide molecule. For HDR based embodiments, the donor template may also be encoded on a vector. Vectors, dosages, and adipocyte-specific configurations suitable for delivery of these components include those discussed above.
The vector(s) can include regulatory element(s), e.g., promoter(s). The vector(s) can comprise Cas encoding sequences, and/or a single, but possibly also can comprise at least 3 or 8 or 16 or 32 or 48 or 50 guide RNA(s) (e.g., sgRNAs) encoding sequences, such as 1-2, 1-3, 1-4, 1-5, 3-6, 3-7, 3-8, 3-9, 3-10, 3-16, 3-30, 3-32, 3-48, 3-50 RNA(s) (e.g., sgRNAs). In a single vector there can be a promoter for each RNA (e.g., sgRNA), advantageously when there are up to about 16 RNA(s); and, when a single vector provides for more than 16 RNA(s), one or more promoter(s) can drive expression of more than one of the RNA(s), e.g., when there are 32 RNA(s), each promoter can drive expression of two RNA(s), and when there are 48 RNA(s), each promoter can drive expression of three RNA(s). By simple arithmetic and well-established cloning protocols and the teachings in this disclosure one skilled in the art can readily practice the invention as to the RNA(s) for a suitable exemplary vector such as AAV, and a suitable promoter such as the U6 promoter. For example, the packaging limit of AAV is ~4.7 kb. The length of a single U6-gRNA (plus restriction sites for cloning) is 361 bp. Therefore, the skilled person can readily fit about 12-16, e.g., 13 U6-gRNA cassettes in a single vector. This can be assembled by any suitable means, such as a golden gate strategy used for TALE assembly (genome-engineering.org/taleffectors/). The skilled person can also use a tandem guide strategy to increase the number of U6-gRNAs by approximately 1.5 times, e.g., to increase from 12-16, e.g., 13 to approximately 18-24, e.g., about 19 U6-gRNAs. Therefore, one skilled in the art can readily reach approximately 18-24, e.g., about 19 promoter-RNAs, e.g., U6-gRNAs in a single vector, e.g., an AAV vector. A further means for increasing the number of promoters and RNAs in a vector is to use a single promoter (e.g., U6) to express an array of RNAs separated by cleavable sequences.
And an even further means for increasing the number of promoter-RNAs in a vector is to express an array of promoter-RNAs separated by cleavable sequences in the intron of a coding sequence or gene; and, in this instance, it is advantageous to use a polymerase II promoter, which can have increased expression and enable the transcription of long RNA in a tissue specific manner. (see, e.g., Chung K H, Hart C C, Al-Bassam S, et al. Polycistronic RNA polymerase II expression vectors for RNA interference based on BIC/miR-155. Nucleic Acids Res. 2006; 34 (7): e53). In an advantageous embodiment, AAV may package U6 tandem gRNA targeting up to about 50 genes. Accordingly, from the knowledge in the art and the teachings in this disclosure the skilled person can readily make and use vector(s), e.g., a single vector, expressing multiple RNAs or guides under the control or operatively or functionally linked to one or more promoters, especially as to the numbers of RNAs or guides discussed herein, without any undue experimentation.
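The packaging arithmetic referred to above can be illustrated with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and it uses the approximate figures stated above (an AAV packaging limit of about 4.7 kb, 361 bp per U6-gRNA cassette, and an approximately 1.5-fold gain from a tandem guide strategy) while ignoring inverted terminal repeats and any other vector elements that would occupy part of the capacity.

```python
AAV_PACKAGING_LIMIT_BP = 4700   # ~4.7 kb, as stated above
U6_GRNA_CASSETTE_BP = 361       # one U6-gRNA plus cloning sites, as stated above
TANDEM_FOLD_INCREASE = 1.5      # approximate gain from a tandem guide strategy


def max_cassettes(limit_bp: int = AAV_PACKAGING_LIMIT_BP,
                  cassette_bp: int = U6_GRNA_CASSETTE_BP) -> int:
    """Whole number of cassettes that fit within the packaging limit."""
    return limit_bp // cassette_bp


def tandem_estimate(n_cassettes: int,
                    fold: float = TANDEM_FOLD_INCREASE) -> int:
    """Approximate guide count with a tandem strategy, rounded down."""
    return int(n_cassettes * fold)


if __name__ == "__main__":
    n = max_cassettes()             # 4700 // 361 = 13
    print(n, tandem_estimate(n))    # 13 cassettes, about 19 with tandem guides
```

Applying the same 1.5-fold factor to the stated range of 12-16 cassettes gives 18-24, consistent with the range recited above.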
The Cas polypeptide and guide molecule (and donor) may also be delivered as a pre-formed ribonucleoprotein complex (RNP). Methods for delivering RNPs include the virus like particles, cell-penetrating peptides, and nanocarriers discussed above.
Delivery mechanisms for CRISPRa systems include virus like particles, cell-penetrating peptides, and nanocarriers discussed above for CRISPR-Cas systems.
Base editing systems may be delivered on one or more vectors encoding the Cas-nucleobase deaminase and guide sequence. Vector systems suitable for this purpose include those discussed above. Alternatively, base editing systems may be delivered as a pre-formed ribonucleoprotein complex (RNP). Systems for delivering RNPs include the protein delivery systems discussed above: virus like particles, cell-penetrating peptides, and nanocarriers.
A further example method for delivery of base-editing systems may include use of a split-intein approach to divide CBE and ABE into reconstitutable halves, as described in Levy et al. Nature Biomedical Engineering doi.org/10.1038/s41441-019-0505-5 (2019), which is incorporated herein by reference.
In another aspect, the variants resulting in reduced COBLL1 expression may also be used in diagnostic and theranostic methods to detect increased risk for T2D and to guide treatment decisions.
In one example embodiment, a method for treating a subject suffering from, or at risk for, T2D comprises detecting one or more polygenic metabolic risk factors in a subject in need thereof, and administering one of the treatments for increasing COBLL1 expression and/or COBLL1 activity in adipocytes or adipocyte progenitors if the metabolic risk factors are detected, or administering a T2D standard of care if the metabolic risk factors are not detected. In one example embodiment, the one or more risk indicators are selected from the group consisting of: a heterogeneous lipid-associated morphological profile in visceral adipocytes, heterogeneity in lipid droplet size in visceral adipocytes, heterogeneity in lipid droplet number in visceral adipocytes, heterogeneity in lipid droplet distribution in visceral adipocytes, the subject being post-menopausal, optionally older than 50 years old, increased adipocyte diameter, expression of one or more of the 51 genes in Table 6, up-regulation of one or more genes selected from the group consisting of ACAA1 and SCP2, expression of one or more genes selected from the group consisting of PLIN, ABHD5, MGLL, ATGL, and HIS as compared to an average level for adipocytes, increased lipid accumulation in mature visceral adipocytes, and reduced degradation in mature visceral adipocytes. In another example embodiment, the one or more risk factors are selected from the group consisting of higher intensity/reading of BODIPY, higher intensity/reading of mitochondrial-related intensity, higher count of BODIPY-related objects, and decreased BODIPY-related granularity, which may be detected using the methods described in the “Profiling Adipocyte Section” below.
In another example embodiment, a method for detecting T2D, or an increased risk of developing T2D, comprises detecting one or more variants associated with decreased expression of COBLL1 or activity of COBLL1, wherein detection of the one or more variants indicates a subject has, or is at an increased risk of developing, T2D, or alternatively where the subject possesses a MONW/MOH risk phenotype. In certain example embodiments, the one or more variants include rs6712203. Detection of the one or more variants may be determined using any of the methods disclosed in the “Genotyping” section below. In certain example embodiments, the method may further comprise a treatment step comprising administering a therapeutically effective amount of a) one or more agents that increase the expression or activity of COBLL1 or enhance actin remodeling in adipocytes or adipocyte progenitors, b) a gene editing system that corrects one or more variants to a wild-type or non-risk variant, or c) an adoptive cell transfer comprising allogeneic or autologous adipocyte donors as disclosed in the therapeutic embodiments above.
In another example embodiment, a method for detecting lipodystrophy, or an increased risk of developing lipodystrophy, comprises detecting one or more variants associated with decreased expression of BCL2 and/or KDSR or activity of BCL2 and/or KDSR, or detecting one or more variants associated with increased expression of VPS4B or activity of VPS4B, wherein detection of the one or more variants indicates a subject has, or is at an increased risk of developing, lipodystrophy, or alternatively where the subject possesses a lipodystrophy risk phenotype. In certain example embodiments, the one or more variants include rs12454712. Detection of the one or more variants may be determined using any of the methods disclosed in the “Genotyping” section below. In certain example embodiments, the method may further comprise a treatment step comprising administering a therapeutically effective amount of a) one or more agents that increase the expression or activity of BCL2 and/or KDSR or decrease expression of VPS4B, b) a gene editing system that corrects one or more variants to a wild-type or non-risk variant, or c) an adoptive cell transfer comprising allogeneic or autologous adipocyte donors as disclosed in the therapeutic embodiments above.
In another example embodiment, a method of treating T2D comprises performing a genotyping assay on a biological sample from a subject to determine if the subject has one or more risk variants that decrease COBLL1 expression or activity, and administering one of the therapeutic modalities described above in the “Methods of Treatment” section if the one or more variants are detected, or administering a T2D standard-of-care therapy, as further defined below, if the one or more variants are not detected. In one example embodiment, the one or more variants comprise rs6712203.
In an example embodiment, a method of treating lipodystrophy comprises performing a genotyping assay on a biological sample from a subject to determine if the subject has one or more risk variants that decrease BCL2 and/or KDSR expression or activity, or one or more risk variants that increase VPS4B expression or activity, and administering one of the therapeutic modalities described above in the “Methods of Treatment” section if the one or more variants are detected, or administering a T2D standard-of-care therapy, as further defined below, if the one or more variants are not detected. In one example embodiment, the one or more variants comprise rs12454712.
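The genotype-guided selection described in the preceding embodiments can be sketched as simple decision logic. The following Python example is illustrative only: the function name, the arm labels, and the data structure are hypothetical, and only the two variant identifiers named above are taken from this disclosure. It is not a clinical decision tool.

```python
from typing import Iterable

# Risk variants named in the embodiments above, keyed by condition.
RISK_VARIANTS = {
    "T2D": frozenset({"rs6712203"}),
    "lipodystrophy": frozenset({"rs12454712"}),
}


def select_treatment_arm(condition: str, detected_variants: Iterable[str]) -> str:
    """Return a hypothetical arm label based on a genotyping result.

    'targeted_modality' -- at least one listed risk variant was detected
    'standard_of_care'  -- no listed risk variant was detected
    """
    risk = RISK_VARIANTS[condition]
    if risk & set(detected_variants):
        return "targeted_modality"
    return "standard_of_care"


if __name__ == "__main__":
    print(select_treatment_arm("T2D", ["rs6712203"]))
    print(select_treatment_arm("lipodystrophy", []))
```

Additional risk variants could be supported by extending the sets in the hypothetical `RISK_VARIANTS` mapping.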
In any of the above diagnostic/theranostic embodiments, identifying whether a metabolic risk factor is present includes obtaining information regarding the identity (i.e., of a specific nucleotide), presence or absence of one or more specific risk loci in a subject. Determining the presence of a risk locus can, but need not, include obtaining a sample comprising DNA from a subject. The individual or organization who determines the presence of a risk locus need not actually carry out the physical analysis of a sample from a subject; the methods can include using information obtained by analysis of the sample by a third party. Thus, the methods can include steps that occur at more than one site. For example, a sample can be obtained from a subject at a first site, such as at a health care provider, or at the subject's home in the case of a self-testing kit. The sample can be analyzed at the same or a second site, e.g., at a laboratory or other testing facility. Identifying the presence of a risk locus can be done by any DNA detection method known in the art, including sequencing at least part of a genome of one or more cells from the subject. In certain example embodiments, risk loci are detected via detection of a single nucleotide polymorphism (SNP), e.g., rs6712203.
SNPs may be detected through hybridization-based methods, including dynamic allele-specific hybridization (DASH), molecular beacons, and SNP microarrays, enzyme-based methods including RFLP, PCR-based, e.g., allelic-specific polymerase chain reaction (AS-PCR), polymerase chain reaction restriction fragment length polymorphism (PCR-RFLP), multiplex PCR real-time invader assay (mPCR-RETINA), amplification refractory mutation system (ARMS), Flap endonuclease, primer extension, 5′ nuclease, e.g., Taqman or 5′ nuclease allelic discrimination assay, and oligonucleotide ligation assay, and methods such as single strand conformation polymorphism, temperature gradient gel electrophoresis, denaturing high performance liquid chromatography, high-resolution melting of the entire amplicon, use of DNA mismatch-binding proteins, SNPlex, and Surveyor nuclease assay.
In certain example embodiments, detection of SNPs can be done by sequencing. Sequencing can be, for example, whole genome sequencing. In one example embodiment, the invention involves high-throughput and/or targeted nucleic acid profiling (for example, sequencing, quantitative reverse transcription polymerase chain reaction, and the like).
In certain embodiments, sequencing comprises high-throughput (formerly “next-generation”) technologies to generate sequencing reads. In DNA sequencing, a read is an inferred sequence of base pairs (or base pair probabilities) corresponding to all or part of a single DNA fragment. A typical sequencing experiment involves fragmentation of the genome into millions of molecules or generating complementary DNA (cDNA) fragments, which are size-selected and ligated to adapters. The set of fragments is referred to as a sequencing library, which is sequenced to produce a set of reads. Methods for constructing sequencing libraries are known in the art (see, e.g., Head et al., Library construction for next-generation sequencing: Overviews and challenges. Biotechniques. 2014; 56 (2): 61-77; Trombetta, J. J., Gennert, D., Lu, D., Satija, R., Shalek, A. K. & Regev, A. Preparation of Single-Cell RNA-Seq Libraries for Next Generation Sequencing. Curr Protoc Mol Biol. 107, 4.22.1-4.22.17, doi: 10.1002/0471142727.mb0422s107 (2014). PMCID: 4338574). A “library” or “fragment library” may be a collection of nucleic acid molecules derived from one or more nucleic acid samples, in which fragments of nucleic acid have been modified, generally by incorporating terminal adapter sequences comprising one or more primer binding sites and identifiable sequence tags. In certain embodiments, the library members (e.g., genomic DNA, cDNA) may include sequencing adaptors that are compatible with use in, e.g., Illumina's reversible terminator method, long read nanopore sequencing, Roche's pyrosequencing method (454), Life Technologies' sequencing by ligation (the SOLiD platform) or Life Technologies' Ion Torrent platform. Examples of such methods are described in the following references: Margulies et al (Nature 2005 437:376-80); Schneider and Dekker (Nat Biotechnol. 2012 Apr. 10; 30 (4): 326-8); Ronaghi et al (Analytical Biochemistry 1996 242:84-9); Shendure et al (Science 2005 309:1728-32); Imelfort et al (Brief Bioinform. 2009 10:609-18); Fox et al (Methods Mol. Biol. 2009; 553:79-108); Appleby et al (Methods Mol. Biol. 2009; 513:19-39); and Morozova et al (Genomics. 2008 92:255-64), which are incorporated by reference for the general descriptions of the methods and the particular steps of the methods, including all starting products, reagents, and final products for each of the steps.
In certain embodiments, the present invention includes whole genome sequencing. Whole genome sequencing (also known as WGS, full genome sequencing, complete genome sequencing, or entire genome sequencing) is the process of determining the complete DNA sequence of an organism's genome at a single time. This entails sequencing all of an organism's chromosomal DNA as well as DNA contained in the mitochondria and, for plants, in the chloroplast. “Whole genome amplification” (“WGA”) refers to any amplification method that aims to produce an amplification product that is representative of the genome from which it was amplified. Non-limiting WGA methods include Primer extension PCR (PEP) and improved PEP (I-PEP), Degenerated oligonucleotide primed PCR (DOP-PCR), Ligation-mediated PCR (LMP), T7-based linear amplification of DNA (TLAD), and Multiple displacement amplification (MDA).
In certain embodiments, the present invention includes whole exome sequencing. Exome sequencing, also known as whole exome sequencing (WES), is a genomic technique for sequencing all of the protein-coding genes in a genome (known as the exome) (see, e.g., Ng et al., 2009, Nature volume 461, pages 272-276). It consists of two steps: the first step is to select only the subset of DNA that encodes proteins. These regions are known as exons—humans have about 180,000 exons, constituting about 1% of the human genome, or approximately 30 million base pairs. The second step is to sequence the exonic DNA using any high-throughput DNA sequencing technology. In certain embodiments, whole exome sequencing is used to determine mutations in genes associated with disease.
In certain embodiments, targeted sequencing is used in the present invention (see, e.g., Mantere et al., PLOS Genet 12 e1005816 2016; and Carneiro et al. BMC Genomics, 2012 13:375). Targeted gene sequencing panels are useful tools for analyzing specific mutations in a given sample. Focused panels contain a select set of genes or gene regions that have known or suspected associations with the disease or phenotype under study. In certain embodiments, targeted sequencing is used to detect mutations associated with a disease in a subject in need thereof. Targeted sequencing can increase the cost-effectiveness of variant discovery and detection.
As noted above, when a metabolic risk factor is not detected, a standard of care therapy may be administered instead. A standard of care therapy may comprise administration of metformin, thiazolidinediones (glitazones), biguanides, meglitinides, DPP-4 inhibitors, sodium-glucose transporter 2 (SGLT2) inhibitors, alpha-glucosidase inhibitors, bile acid sequestrants, incretin based therapies, sulfonylureas and amylin analogs. In some embodiments, the biguanide is metformin. In some embodiments, the meglitinide is repaglinide or nateglinide. Sulfonylureas include, for example, chlorpropamide, glipizide, glyburide and glimepiride. Rosiglitazone (Avandia) and pioglitazone (ACTOS) are exemplary thiazolidinediones. DPP-4 inhibitors include sitagliptin (Januvia), saxagliptin (Onglyza), linagliptin (Tradjenta), and alogliptin (Nesina). Sodium-glucose transporter 2 (SGLT2) inhibitors include canagliflozin (Invokana) and dapagliflozin (Farxiga). Acarbose (Precose) and miglitol (Glyset) are exemplary alpha-glucosidase inhibitors. An exemplary bile acid sequestrant is colesevelam (Welchol), which is a cholesterol-lowering medication that can reduce blood glucose levels. In some embodiments, more than one drug can be used in a combination therapy, in particular when the drugs act in different ways to lower blood glucose levels. Treatment may also include, alone, or in addition to drug therapy, intensive lifestyle interventions including modifications to diet and exercise. Initiating a treatment can include devising a treatment plan based on the risk group, which corresponds to the polygenic risk score (PRS) calculated for the subject. In some embodiments, the polygenic risk score is used to guide enhanced monitoring strategies. In some embodiments, the polygenic risk score is used to guide intensive lifestyle interventions.
As used herein, “polygenic risk score” refers to an assessment of the risk of a specific condition based on the collective influence of many genetic variants or a score based on the number of variants related to the disease a subject has.
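The two forms of score in this definition can be illustrated with a short calculation. The following Python sketch is a minimal example under stated assumptions: it treats the "collective influence" form as a weighted sum of risk-allele counts (0, 1, or 2 per variant) and the second form as an unweighted count of risk alleles carried. The function names, variant labels, and weights are hypothetical placeholders and are not data from this disclosure.

```python
from typing import Iterable, Mapping


def weighted_prs(allele_counts: Mapping[str, int],
                 weights: Mapping[str, float]) -> float:
    """Weighted sum of risk-allele counts; variants not genotyped count as 0."""
    return sum(w * allele_counts.get(variant, 0) for variant, w in weights.items())


def count_prs(allele_counts: Mapping[str, int],
              variants: Iterable[str]) -> int:
    """Unweighted score: total number of risk alleles carried."""
    return sum(allele_counts.get(variant, 0) for variant in variants)


if __name__ == "__main__":
    # Placeholder genotype and per-variant weights (illustrative values only).
    genotype = {"variant_A": 2, "variant_B": 1, "variant_C": 0}
    weights = {"variant_A": 0.10, "variant_B": 0.25, "variant_C": 0.40}
    print(weighted_prs(genotype, weights))
    print(count_prs(genotype, weights))
```

A subject's score would then be compared against thresholds defining the risk groups; how those thresholds are chosen is outside the scope of this sketch.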
In certain example embodiments, where a metabolic risk factor is detected, the methods of treatment for increasing COBLL1, BCL2 or KDSR expression or COBLL1, BCL2 or KDSR activity disclosed herein may also be co-administered with a standard of care therapy. Similarly, in an example embodiment, where a metabolic risk factor is detected, the methods of treatment for decreasing VPS4B expression or VPS4B activity disclosed herein may also be co-administered with a standard of care therapy.
Applicants have performed functional analysis (morphological and histological) of additional SNPs associated with metabolic diseases. For example, SNPs in the BCL2 gene result in cellular phenotypes associated with lipodystrophy. Lipodystrophy syndromes are a group of genetic or acquired disorders in which the body is unable to produce and maintain healthy fat tissue. Other SNPs analyzed using the methods of the present invention include rs9686661, rs4804833, rs2972144, rs13389219, rs11837287, TCF7L2, rs1534696 (SNX10), rs287621, rs1412956, rs13133548, rs11667352, rs12454712 (BCL2), rs673918, rs646123, rs2963449, rs1572993, rs632057, rs11637681, rs6063048, rs7660000, rs1421085, rs7258937, rs9939609, rs998584, rs4925109, and rs12641088. In certain embodiments, the present invention provides for a method of treating subjects suffering from or at risk of developing a metabolic disease, comprising administering a gene editing system that corrects one or more genomic risk variants selected from the group consisting of rs9686661, rs4804833, rs2972144, rs13389219, rs11837287, TCF7L2, rs1534696 (SNX10), rs287621, rs1412956, rs13133548, rs11667352, rs12454712 (BCL2), rs673918, rs646123, rs2963449, rs1572993, rs632057, rs11637681, rs6063048, rs7660000, rs1421085, rs7258937, rs9939609, rs998584, rs4925109, and rs12641088. In certain embodiments, the present invention provides for a method of diagnosing subjects suffering from or at risk of developing a metabolic disease, comprising detecting one or more genomic risk variants selected from the group consisting of rs9686661, rs4804833, rs2972144, rs13389219, rs11837287, TCF7L2, rs1534696 (SNX10), rs287621, rs1412956, rs13133548, rs11667352, rs12454712 (BCL2), rs673918, rs646123, rs2963449, rs1572993, rs632057, rs11637681, rs6063048, rs7660000, rs1421085, rs7258937, rs9939609, rs998584, rs4925109, and rs12641088.
In certain embodiments, high-throughput multiplex profiling for simultaneously identifying morphological and cellular phenotypes is performed on a cellular system. The cellular system may be a homogenous population of cells. The cellular system may be derived from a subject. The subject can be a control healthy subject or a subject having a specific clinical phenotype. Methods of obtaining cells from a subject are known in the art and are described further herein. The cellular system can include cells that were isolated and expanded or differentiated. In preferred embodiments, the cellular system may comprise lipid-accumulating cells. The lipid-accumulating cells may be lipocytes. As used herein, lipocytes are any fat storing cell. The lipocytes may be adipocytes, hepatocytes, macrophages/foam cells, or glial cells. The lipocytes may be part of a pathophysiological process in cells that include fat storing cells, such as vascular smooth muscle cells, skeletal muscle cells, renal podocytes, and cancer cells. In certain embodiments, high-throughput multiplex and simultaneous profiling of morphological and cellular phenotypes is performed on adipose tissue or adipose cells (e.g., AMSCs, adipocytes). As used herein, adipocytes, also known as lipocytes and fat cells, are the cells that primarily compose adipose tissue, specialized in storing energy as fat. Adipocytes are derived from mesenchymal stem cells which give rise to adipocytes through adipogenesis. The cellular system may include stem cells differentiated over a time course, wherein the cells from the cellular system are stained and imaged at different time points. The time points may be one or more days of differentiation, such as, but not limited to, 0 days, 1 day, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 8 days, 9 days, 10 days, 11 days, 12 days, 13 days or 14 or more days. The stem cells may be adipose-derived mesenchymal stem cells (AMSCs) differentiated to adipocytes. The AMSCs may be obtained from a subject.
The AMSCs may be subcutaneous AMSCs. The AMSCs may be visceral AMSCs. The adipose tissue beneath the skin is called subcutaneous adipose tissue (SAT), whereas the one lining internal organs is termed visceral adipose tissue (VAT).
The method can include a combination of fluorescent dyes that are used to stain various biological models present in adipocytes. The cells can be imaged simultaneously. The images can be analyzed by an automated image analysis pipeline to identify morphological and cellular phenotypes from the resulting images.
In certain embodiments, the cellular system is stained to differentiate cellular compartments. The cellular compartments can include the nucleus, cytoplasm or the entire cell (e.g., including nucleus and cytoplasm). In certain embodiments, the cellular system is stained to differentiate organelles. The organelles can include DNA (e.g., genomic DNA), mitochondria, actin, Golgi, plasma membrane, lipids (e.g., lipid-containing vesicles), nucleoli and cytoplasmic RNA. In certain embodiments, actin, Golgi, and plasma membrane are represented as a single organelle (AGP). In certain embodiments, the stain can indicate intensity, granularity, and/or texture for each stained compartment or organelle. The size and shape of each identified object can be determined (e.g., lipid droplets). The colocalization, number of objects, and distance to neighboring objects can also be determined by staining. Methods of staining non-lipocyte cells may be used, such as Cell Painting (Bray M A, Singh S, Han H, et al. Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes. Nat Protoc. 2016; 11 (9): 1757-1774).
In certain embodiments, features can be extracted from the images. In certain embodiments, the features are categorized based on a range of values for each feature. For example, each separate feature can be divided into at least 2 categories by dividing its values based on a range. Each separate feature may be divided into 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more sub-features. For example, object size may be divided into 5 size categories. Each size category may have different categories of intensity, texture or granularity. Features can be combinations of object size, object shape, intensity, granularity, texture, colocalization, number of objects, distance to neighboring objects, and/or cellular compartment (see tables and figures for example features).
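By way of illustration only, dividing a feature into range-based categories can be sketched as follows. This is a minimal sketch that assumes equal-width bins over the observed range; the function name `bin_feature`, the example values, and the choice of equal-width bins are hypothetical and are not taken from the disclosure, which does not specify how the ranges are chosen.

```python
def bin_feature(values, n_bins=5):
    """Assign each value to one of n_bins equal-width categories (0 .. n_bins-1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls in a single category.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land at index n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical object sizes (arbitrary units), e.g. lipid droplets in one image.
sizes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 11.0]
categories = bin_feature(sizes, n_bins=5)
print(categories)  # [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
```

Each resulting size category could then be profiled separately for intensity, texture or granularity, as described above.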
A number of bioimaging software packages (free and commercial) exist for morphological feature extraction (Eliceiri K W, et al. Biological imaging software tools. Nat Methods. 2012; 9:697-710). In one example, CellProfiler and a novel pipeline can be used to automate imaging (see, e.g., Carpenter et al., (2006) CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biology 7: R100. PMID: 17076895; and Kamentsky et al., (2011) Improved structure, function, and compatibility for CellProfiler: modular high-throughput image analysis software. Bioinformatics 2011/doi. PMID: 21349861 PMCID: PMC3072555). The image feature extraction workflow for Cell Painting is divided into three tasks, each of which is performed by a CellProfiler pipeline: (a) illumination correction, (b) quality control, and (c) morphological feature extraction.
In one example embodiment, the features can be linked to specific phenotypes. The phenotypes can be specific gene programs (biological programs) by comparing features to gene programs in the same cellular system and by determining genes associated with morphological characteristics. As used herein the term “gene program” or “biological program” can be used interchangeably with “expression program” and refers to a set of biomarkers that share a role in a biological function (e.g., lipolysis). Biological programs can include a pattern of biomarker expression that results in a corresponding physiological event or phenotypic trait. Biological programs can include up to several hundred biomarkers that are expressed in a spatially and temporally controlled fashion. The phenotypes can be specific clinical features. In certain embodiments, features associated with clinical characteristics are identified by comparing features in a control group of subjects having a clinical characteristic. Clinical characteristics can include risk for a disease, such as type 2 diabetes (T2D) or coronary disease. Clinical characteristics can also include age, weight, BMI, etc.
In certain embodiments, more than one cell needs to be imaged in order to determine morphological features for a subject or cellular system. In example embodiments, 50 or more cells per cellular system are imaged, more preferably, more than 100, more preferably about 500 or more cells are imaged per cellular system.
In certain embodiments, a cellular system is stained with one or more fluorescent dyes. As used herein, the terms “fluorescent dye”, “reactive dye”, and “fluorophore” are used interchangeably and refer to non-protein molecules that absorb photons and re-emit them. Fluorescent dyes typically contain several combined aromatic groups, or planar or cyclic molecules with several π-bonds. Fluorescent dyes are usually targeted to proteins of interest by antibody conjugates or peptide tags. Fluorescent dyes may be used alone, as a tracer fluid, as a dye for staining of certain structures, or as a probe or indicator. As an indicator, a fluorescent dye may fluoresce as a result of its environment, such as but not limited to, polarity or ions.
In one example embodiment, one or more fluorescent dyes are selected from the group consisting of Hoechst, Phalloidin, WGA, MitoTracker Red, BODIPY, and SYTO14. As used herein, “Hoechst” and “Hoechst 33342” are used interchangeably. The CAS name for Hoechst is 2,5′-bi-1H-benzimidazole, 2′-(4-ethoxyphenyl)-5-(4-methyl-1-piperazinyl). Hoechst is a bis-benzimide derivative that binds to AT-rich sequences in the minor groove of double-stranded DNA. The emission wavelengths of Hoechst are in the red visible spectrum around 630-650 nm and the blue visible spectrum around 405-450 nm.
Phalloidin is a bicyclic peptide that belongs to a class of toxins called phallotoxins that binds to F-actin. These phallotoxins are isolated from Amanita phalloides. Phalloidin conjugates include: Alexa Fluor 350 Phalloidin, whose excitation/emission wavelength is around 346/442 nm respectively; NBD phallacidin, whose excitation/emission wavelength is around 465/536 nm respectively; Alexa Fluor Plus 405 Phalloidin, whose excitation/emission wavelength is around 405/450 nm respectively; Fluorescein phalloidin, whose excitation/emission wavelength is around 496/516 nm respectively; Alexa Fluor 488 Phalloidin, whose excitation/emission wavelength is around 496/519 nm respectively; Oregon Green 488 phalloidin, whose excitation/emission wavelength is around 496/520 nm respectively; Rhodamine phalloidin, whose excitation/emission wavelength is around 540/565 nm respectively; Alexa Fluor Plus 555 phalloidin, whose excitation/emission wavelength is around 555/565 nm respectively; BODIPY 558/568 phalloidin, whose excitation/emission wavelength is around 558/569 nm respectively; Alexa Fluor 594 Phalloidin, whose excitation/emission wavelength is around 590/617 nm respectively; Texas Red-X phalloidin, whose excitation/emission wavelength is around 591/608 nm respectively; Alexa Fluor Plus 647 phalloidin, whose excitation/emission wavelength is around 650/668 nm respectively; Alexa Fluor 680 Phalloidin, whose excitation/emission wavelength is around 679/702 nm respectively; Biotin-XX Phalloidin; and Alexa Fluor Plus 750 Phalloidin, whose excitation/emission wavelength is around 758/784 nm respectively.
Wheat germ agglutinin or WGA is a carbohydrate-binding protein. The excitation/emission wavelengths are around 495/519 nm respectively.
MitoTracker Deep Red is a highly conjugated compound that selectively binds to mitochondria. Additional MitoTracker probes include: MitoTracker Green FM, whose absorption/emission wavelength is around 490/516 nm respectively; MitoTracker Orange CMTMRos, whose absorption/emission wavelength is around 551/576 nm respectively; MitoTracker Orange CM-H2TMRos, whose absorption/emission wavelength is around 551/576 nm respectively; MitoTracker Red CMXRos, whose absorption/emission wavelength is around 578/599 nm respectively; MitoTracker Red CM-H2XRos, whose absorption/emission wavelength is around 578/599 nm respectively; and MitoTracker Red FM, whose absorption/emission wavelength is around 581/644 nm respectively.
As used herein, the terms “BODIPY”, “dipyrrometheneboron difluoride”, and “boron-dipyrromethene” are used interchangeably. The BODIPY IUPAC name is 4,4-difluoro-4-bora-3a,4a-diaza-s-indacene. BODIPY probes have fluorescence excitation maxima from around 500-600 nm and emission maxima from around 510-665 nm. In one example embodiment, BODIPY refers to BODIPY 505/515, whose excitation/emission wavelength is around 502/512 nm respectively. In another example embodiment, BODIPY probes include: BODIPY FL, whose absorption/emission wavelength is around 503/512 nm respectively; BODIPY R6G, whose absorption/emission wavelength is around 528/547 nm respectively; BODIPY TMR, whose absorption/emission wavelength is around 544/570 nm respectively; BODIPY 581/591, whose absorption/emission wavelength is around 581/591 nm respectively; BODIPY TR, whose absorption/emission wavelength is around 588/616 nm respectively; BODIPY 630/650, whose absorption/emission wavelength is around 625/640 nm respectively; and BODIPY 650/665, whose absorption/emission wavelength is around 646/660 nm respectively.
SYTO14 dye binds to both DNA and RNA. SYTO14 has a fluorescence excitation/emission wavelength of around 517/549 nm for DNA and 521/547 nm for RNA, respectively. Additional SYTO dyes include: SYTO 40 blue-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 419/445 nm respectively; SYTO 41 blue-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 426/455 nm respectively; SYTO 42 blue-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 430/460 nm respectively; SYTO 45 blue-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 452/484 nm respectively; SYTO RNASelect green-fluorescent cell stain, whose excitation/emission wavelength is around 490/530 nm respectively; SYTO 9 green-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 483/503 nm respectively; SYTO 10 green-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 484/505 nm respectively; SYTO BC green-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 485/500 nm respectively; SYTO 13 green-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 488/509 nm respectively; SYTO 16 green-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 488/518 nm respectively; SYTO 24 green-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 490/515 nm respectively; SYTO 21 green-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 494/517 nm respectively; SYTO 12 green-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 500/522 nm respectively; SYTO 11 green-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 508/527 nm respectively; SYTO 25 green-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 521/556 nm respectively; SYTO 81 orange-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 530/544 nm respectively; SYTO 80 orange-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 531/545 nm respectively; SYTO 82 orange-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 541/560 nm respectively; SYTO 83 orange-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 543/559 nm respectively; SYTO 84 orange-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 567/582 nm respectively; SYTO 85 orange-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 567/583 nm respectively; SYTO 64 red-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 598/620 nm respectively; SYTO 61 red-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 620/647 nm respectively; SYTO 17 red-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 621/634 nm respectively; SYTO 59 red-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 622/645 nm respectively; SYTO 62 red-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 649/680 nm respectively; SYTO 60 red-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 652/678 nm respectively; and SYTO 63 red-fluorescent nucleic acid stain, whose excitation/emission wavelength is around 654/675 nm respectively.
In certain embodiments, a dye may be a non-protein organic dye belonging to a family such as Xanthene, Cyanine, Squaraine, Squaraine rotaxane, Naphthalene, Coumarin, Oxadiazole, Anthracene, Pyrene, Oxazine, Acridine, Arylmethine, Tetrapyrrole, or Dipyrromethene.
In certain embodiments, a dye may be a fluorescent protein such as green fluorescent protein (GFP), enhanced green fluorescent protein (EGFP), red fluorescent protein (RFP), blue fluorescent protein (BFP), cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), miRFP, miRFP670, mCherry, tdTomato, DsRed-Monomer, DsRed-Express, DsRed-Express2, DsRed2, AsRed2, mStrawberry, mPlum, mRaspberry, HcRed1, E2-Crimson, mOrange, mOrange2, mBanana, ZsYellow1, TagBFP, mTagBFP2, Azurite, EBFP2, mKalama1, Sirius, Sapphire, T-Sapphire, ECFP, Cerulean, SCFP3A, mTurquoise, mTurquoise2, monomeric Midoriishi-Cyan, TagCFP, mTFP1, Emerald, Superfolder GFP, Monomeric Azami Green, TagGFP2, mUKG, mWasabi, Clover, mNeonGreen, Citrine, Venus, SYFP2, TagYFP, Monomeric Kusabira-Orange, mKOk, mKO2, mTangerine, mApple, mRuby, mRuby2, HcRed-Tandem, mKate2, mNeptune, NiFP, mKeima Red, LSS-mKate1, LSS-mKate2, mBeRFP, PA-GFP, PAmCherry1, PATagRFP, TagRFP657, IFP1.2, iRFP, Kaede (green), Kaede (red), KikGR1 (green), KikGR1 (red), PS-CFP2, mEos2 (green), mEos2 (red), mEos3.2 (green), mEos3.2 (red), PSmOrange, Dronpa, Dendra2, Timer, AmCyan1, GFPuv, mCFP, CyPet, mKeima-Red, AmCyan1, mTFP1, Midoriishi Cyan, Wild Type GFP, TurboGFP, ZsGreen1, EYFP, Topaz, mCitrine, YPet, TurboYFP, ZsYellow1, Kusabira Orange, Allophycocyanin, TurboRFP, DsRed monomer, TurboFP602, mRFP1, J-Red, R-phycoerythrin, RPE, B-phycoerythrin, BPE, HcRed1, Katushka, Peridinin Chlorophyll, PerCP, TagFP635, TurboFP635, or a combination thereof.
In certain embodiments, a dye may be a cell function dye such as Indo-1, Fluo-3, Fluo-4, DCFH, DHR, or SNARF.
In certain embodiments, a dye may be a nucleic acid dye such as DAPI, SYTOX Blue, Chromomycin A3, Mithramycin, YOYO-1, Ethidium Bromide, Acridine Orange, SYTOX Green, TOTO-1, TO-PRO-1, TO-PRO: Cyanine Monomer, Thiazole Orange, CyTRAK Orange, Propidium Iodide (PI), LDS 751, 7-AAD, SYTOX Orange, TOTO-3, TO-PRO-3, DRAQ5, or DRAQ7.
In certain embodiments, a dye may be a reactive and conjugated dye such as Allophycocyanin (APC), Aminocoumarin, APC-Cy7 conjugates, Cascade Blue, Cy2, Cy3, Cy3.5, Cy3B, Cy5, Cy5.5, Cy7, Fluorescein, FluorX, G-Dye100, G-Dye200, G-Dye300, G-Dye400, Hydroxycoumarin, Lissamine Rhodamine B, Lucifer yellow, Methoxycoumarin, NBD, Pacific Blue, Pacific Orange, PE-Cy5 conjugates, PE-Cy7 conjugates, PerCP, R-Phycoerythrin (PE), Red 613, Texas Red, TRITC, TruRed, or X-Rhodamine.
In certain embodiments, a dye may be CF dye, DRAQ and CyTRAK probes, EverFluor, Alexa Fluor, Bella Fluor, Dylight Fluor, Atto and Tracy, FluoProbes, Abberior Dyes, DY and MegaStokes Dyes, Sulfo Cy dyes, HiLyte Fluor, Seta, SeTau and Square Dyes, Quasar and Cal Fluor dyes, SureLight Dyes, APC, APCXL, RPE, BPE, or Vio Dyes.
In certain embodiments, morphological profiling is performed on a cellular system and RNA-seq is performed on the same cellular system. In certain embodiments, a separate sample of the cellular system is sequenced. Thus, in one example, RNA-seq data can be linked to morphological imaging data. In certain embodiments, a transcriptome is sequenced. As used herein the term “transcriptome” refers to the set of transcript molecules. In some embodiments, transcript refers to RNA molecules, e.g., messenger RNA (mRNA) molecules, small interfering RNA (siRNA) molecules, transfer RNA (tRNA) molecules, ribosomal RNA (rRNA) molecules, and complementary sequences, e.g., cDNA molecules. In some embodiments, a transcriptome refers to a set of mRNA molecules. In some embodiments, a transcriptome refers to a set of cDNA molecules. In some embodiments, a transcriptome refers to one or more of mRNA molecules, siRNA molecules, tRNA molecules, rRNA molecules, in a sample, for example, a single cell or a population of cells. In some embodiments, a transcriptome refers to cDNA generated from one or more of mRNA molecules, siRNA molecules, tRNA molecules, rRNA molecules, in a sample, for example, a single cell or a population of cells. In some embodiments, a transcriptome refers to 50%, 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 99.9, or 100% of transcripts from a single cell or a population of cells. In some embodiments, transcriptome not only refers to the species of transcripts, such as mRNA species, but also the amount of each species in the sample. In some embodiments, a transcriptome includes each mRNA molecule in the sample, such as all the mRNA molecules in a single cell.
In certain embodiments, samples or cells are clustered based on the features identified. Clustering can use features from varying sources (e.g., LipocyteProfiler, RNA-seq) (see, e.g., International Application No. PCT/US2018/061348).
In certain embodiments, morphological features and optionally gene programs are determined for a SNP of interest. For example, cells are stained that include a SNP and where the SNP is active (e.g., a gene is expressed that is under control of a regulatory element comprising the SNP) or expressed (i.e., the SNP is expressed in the cell type). The function of the SNP may be determined based on determining morphological features. In certain embodiments, morphological features and optionally gene programs are determined for a candidate drug. In certain embodiments, the drug is suspected to alter one or more characteristics of a lipid accumulating cell. In certain embodiments, features associated with perturbation of one or more genomic loci are determined. In preferred embodiments, a cellular system is perturbed with a programmable nuclease system as described herein or an RNAi system as described herein.
In certain embodiments, clinical characteristics can be predicted by determining features for a cellular system obtained from a subject and comparing the features to features identified for a characteristic. In certain embodiments, the features are chosen by fitting a logistic regression model for the clinical characteristic on the entire set of features identified for subjects having a characteristic. Features can be further determined by connecting features in a network and generating a cutoff value to select features with a specific weight of interaction with other features. In another embodiment, features can be the number of features that can be modeled in a specific compartment category. The features that can be modeled can be adjusted based on cutoff values for each feature.
The logistic regression model may be a linear model with logit link (GLM). The linear association with binomial distribution may be implemented using the R glm function. The default glm convergence criteria on deviances may be used to stop the iterations. The DeLong method may be used to calculate confidence intervals for the c-statistics. Forward feature selection (R step function) may be used to select the features. The Akaike information criterion (AIC) may be used as the stop condition for the feature selection procedure.
Histology
In certain embodiments, histological staining is performed on a tissue sample. The tissue sample may be obtained from a subject. The subject can be a control healthy subject or a subject having a specific clinical phenotype. Methods of obtaining tissues from a subject are known in the art and are described further herein. In certain embodiments, the tissue sample comprises lipid-accumulating cells. In preferred embodiments, the tissue sample is adipose tissue. The adipose tissue may be subcutaneous adipose tissue (SAT) or visceral adipose tissue (VAT).
Histology, also known as microscopic anatomy or microanatomy, is the branch of biology which studies the microscopic anatomy of biological tissues. Histology is the microscopic counterpart to gross anatomy, which looks at larger structures visible without a microscope. Although one may divide microscopic anatomy into organology, the study of organs, histology, the study of tissues, and cytology, the study of cells, modern usage places these topics under the field of histology. In medicine, histopathology is the branch of histology that includes the microscopic identification and study of diseased tissue. Biological tissue has little inherent contrast in either the light or electron microscope. Staining is employed to give both contrast to the tissue as well as highlighting particular features of interest. When the stain is used to target a specific chemical component of the tissue (and not the general structure), the term histochemistry is used. Antibodies can be used to specifically visualize proteins, carbohydrates, and lipids. This process is called immunohistochemistry, or when the stain is a fluorescent molecule, immunofluorescence. This technique has greatly increased the ability to identify categories of cells under a microscope. Other advanced techniques, such as nonradioactive in situ hybridization, can be combined with immunochemistry to identify specific DNA or RNA molecules with fluorescent probes or tags that can be used for immunofluorescence and enzyme-linked fluorescence amplification.
In certain embodiments, features are extracted from the histological images (see, e.g., Glastonbury C A, Pulit S L, Honecker J, et al. Machine Learning based histology phenotyping to investigate the epidemiologic and genetic basis of adipocyte morphology and cardiometabolic traits. PLOS Comput Biol. 2020; 16 (8): e1008044. Published 2020 Aug. 14. doi: 10.1371/journal.pcbi.1008044). Applicants have identified specific cell area features that associate with clinical features. Previously, cell area could only be associated with BMI (Glastonbury, et al. 2020). In certain embodiments, the histological features are cell area (μm²) features. In certain embodiments, the histological features are cell shape features. In one exemplary embodiment, cell area features include 5, 6, 7, 8, 9, 10, 15, or 20 or more features, preferably 20 features. The features may be determined by grouping cells into two or more size categories (e.g., 5). The size categories may be “very small”, “small”, “medium”, “large” and “very large.” The size categories may be determined by determining cell areas for the same tissue type in a large cohort of the same tissue type (e.g., control group). The cohort may include healthy and diseased subjects. In an example embodiment, the categories are determined by grouping cells according to: cell area < 25% quartile point for the control group (very small), cell area ≥ 25% quartile point for the control group and < the median cell area for the control group (small), cell area ≥ median cell area for the control group and < mean cell area for the control group (medium), cell area ≥ mean area for the control group and < 75% quartile point for the control group (large), and cell area ≥ 75% quartile point for the control group (very large). The size categories above would, for example, result in 5 features. Each size category can be further divided to determine further features. For example, each size category can be divided into 2, 3, 4 or more features.
In an example embodiment, each size category is divided based on the fraction of cells in the cell area category, median area of cells in the category, 25% interquartile point in the category, and 75% interquartile point in the category. Thus, the features in this example that can be determined for each tissue sample would be 20 features. In an example embodiment, the 20 features can be used to predict clinical features that could not be predicted with previous cell area methods. Moreover, the features can be used to predict morphological features. Combining predictions made using both histological and morphological features may provide an improved prediction.
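The 20-feature scheme described above (5 size categories, each summarized by 4 values) can be sketched as follows. This is an illustrative sketch only and is not Applicants' implementation: the function and feature names, the example areas, and the use of the "inclusive" quartile method from the Python standard library are all assumptions. The category order also assumes the control-group mean lies between the median and the 75% quartile point; if it does not, the "medium" or "large" category is simply empty.

```python
import statistics

CATEGORIES = ["very_small", "small", "medium", "large", "very_large"]


def quartiles(values):
    """Return (Q1, median, Q3); a single value serves as all three."""
    if len(values) == 1:
        v = float(values[0])
        return v, v, v
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, med, q3


def reference_cutoffs(control_areas):
    """Cut points from the control cohort: Q1, median, mean, Q3."""
    q1, med, q3 = quartiles(control_areas)
    return q1, med, statistics.mean(control_areas), q3


def categorize(area, cutoffs):
    """Map one cell area onto the five size categories."""
    q1, med, mean, q3 = cutoffs
    if area < q1:
        return "very_small"
    if area < med:
        return "small"
    if area < mean:
        return "medium"
    if area < q3:
        return "large"
    return "very_large"


def cell_area_features(sample_areas, cutoffs):
    """Build 5 categories x 4 summaries = 20 features for one tissue sample."""
    groups = {name: [] for name in CATEGORIES}
    for area in sample_areas:
        groups[categorize(area, cutoffs)].append(area)
    features = {}
    for name, areas in groups.items():
        # None marks a category with no cells in this sample.
        q1, med, q3 = quartiles(areas) if areas else (None, None, None)
        features[f"{name}_fraction"] = len(areas) / len(sample_areas)
        features[f"{name}_median"] = med
        features[f"{name}_q25"] = q1
        features[f"{name}_q75"] = q3
    return features


# Hypothetical cell areas in square micrometers.
control_areas = [100, 200, 300, 400, 500, 600, 700, 1500]
cutoffs = reference_cutoffs(control_areas)  # (275.0, 450.0, 537.5, 625.0)
features = cell_area_features([150, 300, 480, 500, 600, 900], cutoffs)
print(len(features))  # 20
```

In practice many more cells per sample would be used than in this toy input, consistent with the cell counts discussed herein.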
In one example embodiment, the features can be linked to specific phenotypes. The phenotypes can be specific gene programs (biological programs) by comparing features to gene programs in the tissue sample and by determining genes associated with histological characteristics. The phenotypes can be specific clinical features. In certain embodiments, features associated with clinical characteristics are identified by comparing features in a control group of subjects having a clinical characteristic. Clinical characteristics can include risk for a disease, such as type 2 diabetes (T2D) or coronary disease. Clinical characteristics can also include age, weight, BMI, etc.
In certain embodiments, more than one cell needs to be imaged in order to determine histological features for a subject. In example embodiments, 50 or more cells per tissue sample are imaged, more preferably, more than 100, more preferably about 500 or more cells are imaged per tissue sample.
In certain embodiments, histological features and optionally gene programs are determined for a SNP of interest. For example, tissues are stained from a subject having a SNP and where the SNP is active (e.g., a gene is expressed that is under control of a regulatory element comprising the SNP) or expressed in the tissue. The function of the SNP may be determined based on determining histological features. In certain embodiments, histological features and optionally gene programs are determined for a candidate drug. In certain embodiments, the drug is suspected to alter one or more characteristics of a lipid accumulating cell. For example, a subject or animal model is treated with a drug before histological analysis. In certain embodiments, features associated with perturbation of one or more genomic loci are determined. In preferred embodiments, a cellular system is perturbed in vivo (e.g., animal model) with a programmable nuclease system as described herein or an RNAi system as described herein.
In certain embodiments, clinical characteristics can be predicted by determining histological features for a tissue obtained from a subject and comparing the features to features identified for a characteristic. In certain embodiments, the features are chosen by fitting a logistic regression model for the clinical characteristic on the entire set of features identified for subjects having a characteristic. The logistic regression model may be a linear model with logit link (GLM). The linear association with binomial distribution may be implemented using the R glm function. The default glm convergence criteria on deviances may be used to stop the iterations. The DeLong method may be used to calculate confidence intervals for the c-statistics. Forward feature selection (R step function) may be used to select the features. The Akaike information criterion (AIC) may be used as the stop condition for the feature selection procedure.
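The feature selection procedure described above may be sketched as follows. The passage names the R glm and step functions; the sketch below is written in Python purely for illustration and is not Applicants' implementation. It fits a logit-link binomial model by Newton-Raphson, stops on relative change in deviance (the tolerance 1e-8 and the 25-iteration cap are assumptions chosen to resemble common GLM defaults), and adds features one at a time while the AIC decreases. The feature names and the toy data are hypothetical, and the DeLong confidence intervals are not shown.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def probabilities(beta, X):
    return [1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, row)))) for row in X]


def deviance(p, y):
    return -2.0 * sum(math.log(pi if yi == 1 else 1.0 - pi) for pi, yi in zip(p, y))


def fit_logistic(columns, y, max_iter=25, eps=1e-8):
    """Fit intercept + columns; return (coefficients, AIC)."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    beta = [0.0] * k
    p = probabilities(beta, X)
    dev = deviance(p, y)
    for _ in range(max_iter):
        grad = [sum((yi - pi) * row[j] for yi, pi, row in zip(y, p, X)) for j in range(k)]
        hess = [[sum(pi * (1.0 - pi) * row[r] * row[c] for pi, row in zip(p, X))
                 for c in range(k)] for r in range(k)]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
        p = probabilities(beta, X)
        new_dev = deviance(p, y)
        converged = abs(new_dev - dev) / (abs(new_dev) + 0.1) < eps
        dev = new_dev
        if converged:
            break
    # For 0/1 outcomes the deviance equals -2 log-likelihood, so AIC = deviance + 2k.
    return beta, dev + 2 * k


def forward_select(features, y):
    """Add the feature that lowers AIC most; stop when no addition lowers it."""
    selected = []
    best_aic = fit_logistic([], y)[1]  # intercept-only model
    while True:
        scored = [(fit_logistic([features[s] for s in selected + [name]], y)[1], name)
                  for name in features if name not in selected]
        if not scored:
            break
        aic, name = min(scored)
        if aic >= best_aic:
            break
        selected.append(name)
        best_aic = aic
    return selected, best_aic


# Hypothetical toy data: one informative feature and one uninformative feature.
y = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1]
features = {
    "large_fraction": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "uninformative": [2, 7, 1, 8, 2, 8, 8, 2, 8, 1, 7, 2],
}
selected, best_aic = forward_select(features, y)
print(selected)
```

This sketch does not guard against perfectly separable data, for which the Newton-Raphson updates would not converge; a production analysis would rely on an established statistics package.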
In certain embodiments, the cell subset frequency and/or differential cell states (e.g., intrinsic immune response) can be detected for screening of novel therapeutic agents. In certain embodiments, the present invention can be used to identify improved treatments by monitoring the identified cell states in a subject undergoing an experimental treatment. In certain embodiments, an organoid system is used to detect shifts in the identified cell states to identify agents capable of shifting a subject from a severe disease state to a mild/moderate state (see, e.g., Yin X, Mead B E, Safaee H, Langer R, Karp J M, Levy O. Engineering Stem Cell Organoids. Cell Stem Cell. 2016; 18 (1): 25-38). As used herein, the term “organoid” or “epithelial organoid” refers to a cell cluster or aggregate that resembles an organ, or part of an organ, and possesses cell types relevant to that particular organ. Organoid systems have been described previously, for example, for brain, retinal, stomach, lung, thyroid, small intestine, colon, liver, kidney, pancreas, prostate, mammary gland, fallopian tube, taste buds, salivary glands, and esophagus (see, e.g., Clevers, Modeling Development and Disease with Organoids, Cell. 2016 Jun. 16; 165 (7): 1586-1597). In certain embodiments, a tissue system or tissue explant is used to detect shifts in the identified cell states to identify agents capable of shifting a subject from a severe disease state to a mild/moderate state (see, e.g., Grivel J C, Margolis L. Use of human tissue explants to study human infectious agents. Nat Protoc. 2009; 4 (2): 256-269). In certain embodiments, an animal model is used to detect shifts in the identified cell states to identify agents capable of shifting a subject from a severe disease state to a mild/moderate state (see, e.g., Muñoz-Fontela C, Dowling W E, Funnell S G P, et al. Animal models for COVID-19. Nature. 2020; 586 (7830): 509-515).
In certain embodiments, candidate agents are screened. The term “agent” broadly encompasses any condition, substance or agent capable of modulating one or more phenotypic aspects of a cell or cell population as disclosed herein. Such conditions, substances or agents may be of physical, chemical, biochemical and/or biological nature. The term “candidate agent” refers to any condition, substance or agent that is being examined for the ability to modulate one or more phenotypic aspects of a cell or cell population as disclosed herein in a method comprising applying the candidate agent to the cell or cell population (e.g., exposing the cell or cell population to the candidate agent or contacting the cell or cell population with the candidate agent) and observing whether the desired modulation takes place.
Agents may include any potential class of biologically active conditions, substances or agents, such as for instance antibodies, proteins, peptides, nucleic acids, oligonucleotides, small molecules, or combinations thereof, as described herein.
The terms “therapeutic agent”, “therapeutic capable agent” or “treatment agent” are used interchangeably and refer to a molecule or compound that confers some beneficial effect upon administration to a subject. The beneficial effect includes enablement of diagnostic determinations; amelioration of a disease, symptom, disorder, or pathological condition; reducing or preventing the onset of a disease, symptom, disorder or condition; and generally counteracting a disease, symptom, disorder or pathological condition.
In certain embodiments, the present invention provides for gene signature screening to identify agents that shift expression of the gene targets described herein (e.g., cell subset markers and differentially expressed genes). The concept of signature screening was introduced by Stegmaier et al. (Gene expression-based high-throughput screening (GE-HTS) and application to leukemia differentiation. Nature Genet. 36, 257-263 (2004)), who realized that if a gene-expression signature was the proxy for a phenotype of interest, it could be used to find small molecules that effect that phenotype without knowledge of a validated drug target. The gene signatures or biological programs of the present invention may be used to screen for drugs that reduce the signature or biological program in cells as described herein.
The Connectivity Map (cmap) is a collection of genome-wide transcriptional expression data from cultured human cells treated with bioactive small molecules and simple pattern-matching algorithms that together enable the discovery of functional connections between drugs, genes and diseases through the transitory feature of common gene-expression changes (see, Lamb et al., The Connectivity Map: Using Gene-Expression Signatures to Connect Small Molecules, Genes, and Disease. Science 29 Sep. 2006: Vol. 313, Issue 5795, pp. 1929-1935, DOI: 10.1126/science.1132939; and Lamb, J., The Connectivity Map: a new tool for biomedical research. Nature Reviews Cancer January 2007: Vol. 7, pp. 54-60). In certain embodiments, cmap can be used to identify small molecules capable of modulating a gene signature or biological program of the present invention in silico.
Further embodiments are illustrated in the following Examples which are given for illustrative purposes only and are not intended to limit the scope of the invention.
Here, Applicants provide LipocyteProfiler (also referred to herein as Adipocyte Profiler) which is a metabolic disease-orientated phenotypic profiling system for lipid-accumulating cells. LipocyteProfiler expands on CellPainting (Bray M A, Singh S, Han H, et al. Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes. Nat Protoc. 2016; 11 (9): 1757-1774) and is an unbiased profiling assay, that multiplexes a combination of dyes that make it amenable to large-scale and high-throughput profiling of generic morphological as well as cell type-specific cellular traits. Lipid droplets are storage organelles at the center of whole body metabolism and energy homeostasis and are highly dynamic organelles, that are ubiquitous to cell types (Olzmann and Carvalho 2019) either as part of cellular homeostasis in lipocytes, such as adipocytes, hepatocytes, macrophages/foam cells and glial cells (Liu et al. 2015; Olzmann and Carvalho 2019; Wang et al. 2013; Grandl and Schmitz. 2010; Robichaud et al. 2021) or as part of pathophysiological processes in cells such as vascular smooth muscle cells, skeletal muscle cells, renal podocytes, and cancer cells (Hershey et al. 2019; Cruz et al. 2020; Wang et al. 2005; Weinert et al. 2013; Prats et al. 2006). Applicants vetted LipocyteProfiler in adipocytes, which are highly specialized cells for the storage of excess energy in the form of lipid droplets. First, Applicants connected known biology with rich phenotypic signatures at spatiotemporal resolution, by characterizing feature profiles of known biological processes, including adipocyte differentiation, distinct characteristics of white and brown adipocyte lineages and targeted perturbation of lipid accumulation via CRISPR/Cas9-mediated knockout of specific marker genes, and drug perturbations. 
Next, Applicants correlated LipocyteProfiles with transcriptomic data from RNAseq to link gene sets with morphological and cellular features that capture a broad range of cell activity in adipocytes. Applicants then used LipocyteProfiler to connect polygenic risk scores for Type 2 Diabetes (T2D)-related traits to intermediate cellular phenotypes, and found trait-specific cellular mechanisms underlying polygenic risk. Finally, Applicants used the method to uncover cellular traits under the genetic control of an individual genetic risk locus, as shown for the 2p23.3 metabolic risk locus at DNMT3A. Applicants demonstrated that the customized morphometric approach is capable of identifying diverse cellular mechanisms by generating depot-specific, trait/process-specific and allele-specific morphological and cellular profiles. In the present study, Applicants show the power of LipocyteProfiler to identify genetically informed cellular programs in adipocytes driving metabolic diseases. The approach demonstrated here paves the way to large-scale and high-throughput forward and reverse phenotypic genetic profiling in lipid-storing cell types in the future.
To quantitatively map dynamic, context-dependent morphological and cellular signatures in lipocytes and to discover intrinsic and extrinsic drivers of cellular programs, Applicants developed a profiling approach called LipocyteProfiler, based on high-content imaging (
Intensity features, which are a collection of features that measure pixel intensities across an image, cover 15.2% of all LipocyteProfiler extracted features. To test if LipocyteProfiler extracts tractable intensity features, Applicants used an established white adipocyte line (hWAT) (Xue et al. 2015) and mapped the phenotypic signature of progressive lipid accumulation over the course of adipocyte differentiation. Applicants showed that intensity of BODIPY, a proxy of overall lipid content within a cell, significantly increases with adipogenic differentiation (
The second class of feature measurement, Granularity, is informative for size spectra and covers 5.9% of total LipocyteProfiler features. Adipocyte differentiation is characterized by the progressive accumulation of lipid droplets that first increase in number and then enlarge and fuse into larger lipid droplets over the course of maturation (Fei et al. 2011). Consistent with this, Applicants found dynamic changes of BODIPY Granularity during the course of differentiation (
The third main class of features are Texture features (67.8% of total features) that describe the complexity within an image. During adipogenesis of hWAT, AGP Texture_AngularSecondMoment, a measure for image homogeneity, was decreased, whereas it was increased for BODIPY (
LipocyteProfiler extracts a fourth class of Other features, which reflect various measurements of Area, Shape and Size. These size estimates intuitively change over the course of differentiation as cells become lipid-laden, grow in size, and as nuclei become more round and compact (
To identify relevant processes that converge into morphological and cellular features and to identify pathways of a given set of features, Applicants next used a linear mixed model to correlate the expression of 60,000 genes derived from RNAseq with each of the 2,760 image-based features derived from LipocyteProfiler in adipocytes at day 14 of differentiation. Applicants found 44,736 non-redundant significant feature-gene connections (FDR<0.1) that were composed of 10,931 genes and 869 features, that mapped across all channels (
LipocyteProfiler Identifies Distinct Depot-Specific Morphological Signatures Associated with Differentiation Trajectories in Both Visceral and Subcutaneous Adipocytes
Applicants next sought to distinguish primary human AMSCs derived from the two main adipose tissue depots in the body, namely subcutaneous and visceral, across the course of differentiation. Applicants used those profiles to resolve adipogenesis into temporal dynamic changes in cell morphology (
Lastly, to assess the in vivo relevance of morphological features of in vitro differentiated adipocytes, Applicants correlated BODIPY Granularity features with tissue-derived size estimates of mature floating adipocytes. Applicants showed that changes of BODIPY Granularity of in vitro differentiated female subcutaneous adipocytes correlate significantly with the mean adipocyte size from tissue (
To investigate whether LipocyteProfiler is capable of identifying effects of drug perturbations on morphological and cellular profiles, Applicants first assayed isoproterenol-stimulated compared to DMSO-treated subcutaneous and visceral adipocytes. Isoproterenol is a β-adrenergic agonist that binds to the β-adrenergic receptor (ADRB) in adipocytes. While isoproterenol is known to induce lipolysis and increase mitochondrial energy dissipation (Miller et al. 2015), Applicants set out to find out whether its concerted effects on morphological and cellular signatures could be captured using LipocyteProfiler (
Next, Applicants assayed the effects of oleic acid and metformin in primary human hepatocytes (PHH) using LipocyteProfiler. It has been shown that free fatty acid treatment induces lipid droplet accumulation in PHH (Liu et al. 2014). The results showed that 24h treatment of PHH with oleic acid (OA) resulted in changes predominantly of BODIPY features (
Next, Applicants used LipocyteProfiler to discover cellular programs of metabolic polygenic risk in adipocytes. For systematic profiling of AMSCs in the context of natural genetic variation (Table 4), Applicants first assessed the effect of both technical and biological variance on LipocyteProfiler features in the setting. To obtain a measure of batch-to-batch variance associated with the experimental set-up, Applicants differentiated hWAT, hBAT and SGBS preadipocytes (Fischer-Posovszky et al. 2008) in three independent experiments and found no significant batch effect (BEscore 0.0047, 0.0001, 0.0003,
Applicants then constructed individual genome-wide polygenic risk scores (PRSs) for three T2D-related traits that have been linked to adipose tissue, using the latest summary statistics for T2D (Mahajan et al. 2018), HOMA-IR (Dupuis et al. 2010), a proxy of insulin resistance (Matthews et al. 1985), and waist-to-hip ratio adjusted for BMI (WHRadjBMI) (Pulit et al. 2019). Applicants selected donors from the bottom and top 25 percentiles of these genome-wide PRSs (henceforth referred to as low polygenic risk and high polygenic risk, respectively) and compared LipocyteProfiles across the time course of visceral and subcutaneous adipocyte differentiation in high and low polygenic risk groups for each of the traits (
Applicants found significant polygenic effects on image-based cellular signatures for HOMA-IR and WHRadjBMI, but no effect for T2D (Table 5). More specifically, Applicants observed an effect of HOMA-IR polygenic risk on morphological profiles at day 14 in visceral adipocytes (43 features, FDR<5%,
Polygenic Risk for Lipodystrophy-Like Phenotype Manifests in Cellular Programs that Indicate Reduced Lipid Accumulation Capacity in Subcutaneous Adipocytes
To resolve polygenic effects on adipocyte cellular programs beyond heterogeneous T2D and insulin resistance traits, Applicants used clinically informed process-specific, partitioned PRSs of lipodystrophy (Udler et al. 2018) and correlated those scores with morphological features throughout adipocyte differentiation. Those lipodystrophy PRSs were constructed based on 20 T2D-associated loci with a lipodystrophy-like phenotype (
To identify cellular pathways of lipodystrophy polygenic risk that could underlie the morphological signature in subcutaneous adipocytes, Applicants created a network of genes linked to features identified to be under the control of lipodystrophy polygenic risk. This analysis identified 23 genes that had equal or more than 10 connections to features derived from the polygenic risk lipodystrophy LipocyteProfile (FDR 0.1%,
To confirm that Applicants can apply LipocyteProfiler to link an individual genetic risk locus to meaningful cellular profiles in adipocytes, Applicants investigated a locus on chromosome 2, location 2p23.3, spanning the DNMT3A gene. The metabolic risk haplotype (minor allele frequency of 0.35 in 1000 Genomes Phase 3 combined populations), associated with higher risk for T2D and WHRadjBMI (
In this study, Applicants present a new imaging framework, LipocyteProfiler, and demonstrate its power in unraveling causal disease mechanisms. Applicants showed that the mechanistic information gained from LipocyteProfiles is not limited to generic cellular organelles but reflects a physiological state of the cell that yields insight into disease-relevant cellular mechanisms. Using LipocyteProfiler, Applicants were able to detect subtle phenotypic differences driven by drug treatment and natural genetic variation at relatively small sample size. This is potentially because LipocyteProfiler is a more granular assay with high sensitivity for small effect sizes, as it assesses cellular phenotypes that represent the amalgamation of genomic, transcriptional and proteomic states. Applicants showed that polygenic risk for T2D-related traits converges into discrete pathways and mechanisms and demonstrated that LipocyteProfiler determines morphological and cellular signatures underlying differential polygenic risk that were specific to adipocyte depot, trait and developmental time point. Applicants generated a resource and assay that enables unbiased mechanistic interrogation of the hundreds of metabolic disease loci whose function still remains unknown. Applicants showed that LipocyteProfiler could be used to characterize and map underlying mechanisms of donor contribution and drug perturbation to cell behavior. This approach can pave the way for future cellular GWAS linking common genetic variation to phenotypes and can accelerate therapeutic pathway discovery.
Applicants obtained AMSCs from subcutaneous and visceral adipose tissue from patients undergoing a range of abdominal laparoscopic surgeries (sleeve gastrectomy, fundoplication or appendectomy). The visceral adipose tissue was derived from the proximity of the angle of His and subcutaneous adipose tissue was obtained from beneath the skin at the site of surgical incision. Additionally, human liposuction material was obtained. Each participant gave written informed consent before inclusion and the study protocol was approved by the ethics committee of the Technical University of Munich (Study No 5716/13). Isolation of AMSCs was performed as previously described (Hauner et al. 2001). For a subset of donors, purity of AMSCs was assessed as previously described (Raajendiran et al. 2019). Briefly, cells were stained with 0.05 ug CD34, 0.125 ug CD29, 0.375 ug CD31, 0.125 ug CD45 per 250K cells and analyzed on CytoFlex together with negative control samples of corresponding AMSCs.
For imaging, cells were seeded at 10K cells/well in 96-well plates (Cell Carrier, Perkin Elmer #6005550) and induced 4 days after seeding. For RNAseq, cells were seeded at 40K cells/well in 12-well dishes (Corning). Before induction, cells were cultured in proliferation medium (Basic medium consisting of DMEM-F12, 1% Penicillin-Streptomycin, 33 μM Biotin and 17 μM Pantothenate supplemented with 0.13 μM Insulin, 0.01 ug/ml EGF, 0.001 ug/ml FGF, 2.5% FCS). Adipogenic differentiation was induced by changing culture medium to induction medium (Basic medium supplemented with 0.861 pM Insulin, 1 nM T3, 0.1 μM Cortisol, 0.01 mg/ml Transferrin, 1 μM Rosiglitazone, 25 nM Dexamethasone, 2.5 nM IBMX). On day 3 of adipogenic differentiation, culture medium was changed to differentiation medium (Basic medium supplemented with 0.861 pM Insulin, 1 nM T3, 0.1 μM Cortisol, 0.01 mg/ml Transferrin). Medium was changed every 3 days. Visceral-derived AMSCs were differentiated by further adding 2% FBS as well as 0.1 mM oleic and linoleic acid to the induction and differentiation media. For isoproterenol stimulation experiments, 1 μM isoproterenol was added to the differentiation media and cells were treated overnight.
Mature adipocyte isolation was carried out as described earlier (Fischer B, Schöttl T, Schempp C, et al. Inverse relationship between body mass index and mitochondrial oxidative phosphorylation capacity in human subcutaneous adipocytes. Am J Physiol Endocrinol Metab. 2015; 309 (4): E380-E387). Immediately after isolation, approximately 50 μl of the adipocyte suspension was pipetted onto a glass slide and the diameter of 100 cells was manually determined under a light microscope.
Primary human hepatocytes (PHH) were purchased from BioIVT. Donor lot YNZ was used in this study. PHH were thawed and immediately resuspended in CP media (BioIVT) supplemented with torpedo antibiotic (BioIVT). Cell count and viability were assessed by trypan blue exclusion test prior to plating. Hepatocytes were plated onto collagen-coated Cellcarrier-96 Ultra Microplates (Perkin Elmer) at a density of 50,000 cells per well in CP media supplemented. Four hours after plating, media was replaced with fresh CP media. After 24 h, media was replaced with fresh CP media or CP media containing oleic acid (0.3 mM) or metformin (5 mM). Hepatocytes were incubated for an additional 24 h prior to processing.
Human primary AMSCs and PHH were plated in 96-well CellCarrier plates (PerkinElmer #6005550). AMSCs were differentiated for 14 days and high content imaging was performed at day 0, day 3, day 8 and day 14 of adipogenic differentiation. Primary human hepatocytes were stained after 48 h in culture, and 24 h following treatment with oleic acid or metformin. On the respective day of the assay, cell culture media was removed and replaced by 0.5 μM Mitotracker staining solution (1 mM MitoTracker Deep Red stock (Invitrogen #M22426) diluted in culture media) in each well, followed by 30 minutes incubation at 37° C. protected from light. After 30 min, Mitotracker staining solution was removed and cells were washed twice with Dulbecco's Phosphate-Buffered Saline (1×), DPBS (Corning® #21-030-CV) and 2.9 μM BODIPY staining solution (3.8 mM BODIPY 505/515 stock (Thermofisher #D3921) diluted in DPBS) was added followed by 15 minutes incubation at 37° C. protected from light. Subsequently, cells were fixed by adding 16% Methanol-free Paraformaldehyde, PFA (Electron Microscopy Sciences #15710-S) directly to the BODIPY staining solution to a final concentration of 3.2% and incubated for 20 minutes at RT protected from light. PFA was removed and cells were washed once with Hank's Balanced Salt Solution (1×), HBSS (Gibco #14025076). To permeabilize cells, 0.1% Triton X-100 (Sigma Aldrich #X100) was added and incubated at RT for 10 minutes protected from light. After permeabilization, multi-stain solution (10 units of Alexa Fluor™ 568 Phalloidin (ThermoFisher #A12380), 0.01 mg/ml Hoechst 33342 (Invitrogen #H3570), 0.0015 mg/ml Wheat Germ Agglutinin, Alexa Fluor™ 555 Conjugate (ThermoFisher #W32464), 3 uM SYTO™ 14 Green Fluorescent Nucleic Acid Stain (Invitrogen #S7576) diluted in HBSS) was added and cells were incubated at RT for 10 minutes protected from light. Finally, staining solution was removed and cells were washed three times with HBSS.
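The stock and working concentrations given in the staining protocol imply simple dilution arithmetic (C1·V1 = C2·V2). The following is a minimal sketch of that arithmetic only; the function names and the 100 μl per-well staining volume are illustrative assumptions and are not stated in the protocol.

```python
def dilution_factor(stock_conc, final_conc):
    """Fold dilution from stock to working concentration (same units)."""
    return stock_conc / final_conc


def volume_to_add(existing_volume, stock_conc, final_conc):
    """Volume of stock to add directly to an existing volume so that the
    mixture reaches final_conc: V_add = V * C_final / (C_stock - C_final)."""
    return existing_volume * final_conc / (stock_conc - final_conc)


# MitoTracker: 1 mM stock (1000 uM) diluted to 0.5 uM working solution.
mito_fold = dilution_factor(1000.0, 0.5)

# BODIPY 505/515: 3.8 mM stock (3800 uM) diluted to 2.9 uM working solution.
bodipy_fold = dilution_factor(3800.0, 2.9)

# PFA: 16% stock added directly to the staining solution, 3.2% final.
# The 100 ul staining volume per well is a hypothetical value.
pfa_volume_ul = volume_to_add(100.0, 16.0, 3.2)
```

Under these assumptions the MitoTracker stock is diluted 2000-fold, the BODIPY stock roughly 1300-fold, and one volume of 16% PFA is added per four volumes of staining solution.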
Cells were imaged using an Opera Phenix high-content screening system. Applicants imaged 25 fields per well.
DNA was extracted and sent to the Oxford Genotyping Center for genotyping on the Infinium HTS assay on Global Screening Array bead-chips. Genotype QC was done using GenomeStudio and genotypes were converted into PLINK format for downstream analysis. Applicants checked sample missingness but found no sample with missingness >5%. For the remaining sample quality control (QC) steps, Applicants reduced the genotyping data down to a set of high-quality SNPs. These SNPs were: (a) Common (minor allele frequency >10%); (b) Had missingness <0.1%; (c) Independent, pruned at a linkage disequilibrium (r2) threshold of 0.2; (d) Autosomal only; (e) Outside the lactase locus (chr2), the major histocompatibility complex (MHC, chr6), and outside the inversions on chr8 and chr17; (f) In Hardy-Weinberg equilibrium (P>1×10−3).
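The per-SNP criteria (a), (b), (d) and (f) can be expressed as a simple predicate. The sketch below is illustrative and is not the PLINK workflow itself: the record field names are hypothetical, the Hardy-Weinberg threshold is read as 1×10−3, and LD pruning (c) and region exclusion (e) are omitted because they require genotype-level data and region coordinates not given in the text.

```python
AUTOSOMES = {str(c) for c in range(1, 23)}


def passes_snp_filters(snp, maf_min=0.10, miss_max=0.001, hwe_p_min=1e-3):
    """Return True if a SNP record meets criteria (a), (b), (d) and (f).

    `snp` is a dict with hypothetical keys: 'chrom', 'maf',
    'missingness' (fraction of missing calls) and 'hwe_p'.
    """
    return (
        snp["maf"] > maf_min                # (a) common
        and snp["missingness"] < miss_max   # (b) missingness < 0.1%
        and snp["chrom"] in AUTOSOMES       # (d) autosomal only
        and snp["hwe_p"] > hwe_p_min        # (f) in Hardy-Weinberg equilibrium
    )


# Toy records for illustration only; not data from the study.
example_snps = [
    {"id": "snpA", "chrom": "1", "maf": 0.25, "missingness": 0.0, "hwe_p": 0.4},
    {"id": "snpB", "chrom": "X", "maf": 0.30, "missingness": 0.0, "hwe_p": 0.5},
    {"id": "snpC", "chrom": "7", "maf": 0.05, "missingness": 0.0, "hwe_p": 0.9},
    {"id": "snpD", "chrom": "9", "maf": 0.40, "missingness": 0.0, "hwe_p": 1e-5},
]
kept_ids = [s["id"] for s in example_snps if passes_snp_filters(s)]
```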
Using the remaining ˜65,000 SNPs, Applicants checked samples for inbreeding (--het in PLINK), but found no samples with excess homozygosity or heterozygosity (no sample >6 standard deviations from the mean). Applicants also checked for relatedness (--genome in PLINK) and found one pair of samples to be identical; Applicants kept the sample with the higher overall genotyping rate. Finally, Applicants performed PCA using EIGENSTRAT and projected the samples onto data from HapMap3, which includes samples from 11 global populations. Six samples appeared to have some amount of non-European ancestral background, while the majority of samples appeared to be of European descent. Applicants removed no samples at this step, selecting to adjust for principal components in genome-wide testing. However, adjustment for principal components failed to eliminate population stratification, and Applicants therefore restricted to samples of European descent only, defined as samples falling within +/−10 standard deviations of the first and second principal component values of the CEU (Northern and Western European-ancestry samples living in Utah) and TSI (Tuscans in Italy) samples included in the HapMap 3 dataset (refs. 42, 43). Finally, sex information was received after initial sample QC was complete. As a result, one sample with potentially mismatching sex information (comparing genotypes and phenotype information) was discovered after analyses were complete and therefore remained in the analysis.
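The European-ancestry restriction described above can be sketched as a range check on the first two principal components. This is an illustration under assumptions: the function name and data layout are hypothetical, and the reference mean and standard deviation are taken over the pooled CEU and TSI samples, which the text does not specify in detail.

```python
from statistics import mean, stdev


def within_reference_range(sample_pcs, reference_pcs, n_sd=10.0):
    """Check whether a sample lies within n_sd standard deviations of the
    reference samples on both PC1 and PC2.

    sample_pcs: (pc1, pc2) for the study sample.
    reference_pcs: list of (pc1, pc2) for reference samples (e.g. CEU and TSI).
    """
    for k in range(2):
        ref_values = [pcs[k] for pcs in reference_pcs]
        centre = mean(ref_values)
        spread = stdev(ref_values)
        if abs(sample_pcs[k] - centre) > n_sd * spread:
            return False
    return True


# Toy reference values for illustration only (mean 1.0, SD 1.0 on each PC).
toy_reference = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
inside = within_reference_range((5.0, 5.0), toy_reference)
outside = within_reference_range((20.0, 1.0), toy_reference)
```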
Applicants removed all SNPs with missingness >5% and out of HWE, P<1×10−6. Applicants also removed monomorphic SNPs. Finally, Applicants set heterozygous haploid sites to missing to enable downstream imputation.
The final cleaned dataset included 190 samples and ˜700,000 SNPs. Applicants note that histology data was not available for all genotyped samples.
For the genotyped cohorts without imputation data (ENDOX and MOBB) Applicants performed imputation via the Michigan Imputation Server. Applicants aligned SNPs to the positive strand, and then uploaded the data (in VCF format) to the server. Applicants imputed the data with the Haplotype Reference Consortium (HRC) panel, to be consistent with the fatDIVA data which was already imputed with the HRC panel. Applicants selected EAGLE as the phasing tool to phase the data. To impute chromosome X, Applicants followed the server protocol for imputing this chromosome (including using SHAPEIT to perform the phasing step).
Applicants constructed GRSs for BMI, WHR, and WHRadjBMI using independent (r2<0.05) primary (“index”, associated with each obesity trait P<5×10−9) SNPs in the combined-sexes analyses in a recent GWAS (ref. 3; see data availability). Applicants excluded SNPs with duplicated positions, missingness >0.05, HWE P<1×10−6, and minor allele frequency <0.05 in the imputed data, after filtering on imputation info >0.3 in the imputed cohorts and restricting the GTEx cohort to those of European ancestry and excluding one individual due to relatedness. For these analyses, the individual in MOBB with potential sex mismatch between genotypic and phenotypic sex was removed. Only SNPs available in all cohorts after quality control were included, resulting in a final set of 530, 259, and 274 SNPs for BMI, WHR and WHRadjBMI, respectively. The SNPs were aligned so that the effect allele corresponded to the obesity-trait increasing allele. GRSs were then computed for each participant by taking the sum of the participant's obesity-increasing alleles weighted by the SNP's effect estimate, using plink v1.90b3 (ref. 50).
Applicants then investigated associations with subcutaneous and visceral mean adipocyte area per 1-unit higher obesity GRS, corresponding to a predicted one standard deviation higher obesity trait, using linear regression in R version 3.4.3 (ref. 51). All analyses were performed both with adipocyte area in μm2 and in standard deviation units, computed through rank inverse normal transformation of the residuals and adjusting for any covariates at this stage. Applicants adjusted for age, sex, and ten principal components, and with and without adjusting for BMI in the GTEx, MOBB, and fatDIVA cohorts. As Applicants did not have access to data about age and BMI in the all-female ENDOX cohort, Applicants only adjusted for ten principal components in that cohort and with and without adjusting for chip type. Applicants then meta-analyzed the cohorts, assuming a fixed-effects model. In the main meta-analysis model, ENDOX was included using the estimates adjusted for chip type. As a sensitivity analysis, Applicants also reran the meta-analyses using the ENDOX estimates unadjusted for chip type and completely excluding the ENDOX cohort, yielding highly similar results.
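The allele alignment and weighted-sum steps of the GRS computation can be sketched as follows. This is a minimal illustration rather than the plink implementation; the function names, the dictionary layout and the toy SNP identifiers and weights are hypothetical.

```python
def align_to_increasing_allele(dosage, beta):
    """Flip a SNP so that the counted allele is the trait-increasing one.

    dosage: number of copies (0 to 2) of the originally reported effect allele.
    beta: effect estimate for that allele.
    """
    if beta < 0:
        return 2.0 - dosage, -beta
    return float(dosage), beta


def genetic_risk_score(dosages, betas):
    """Sum of trait-increasing allele counts weighted by SNP effect estimates.

    dosages, betas: dicts keyed by SNP identifier.
    """
    score = 0.0
    for snp_id, beta in betas.items():
        aligned_dosage, aligned_beta = align_to_increasing_allele(dosages[snp_id], beta)
        score += aligned_dosage * aligned_beta
    return score


# Toy example: snp2 is reported for the trait-decreasing allele and is flipped.
toy_betas = {"snp1": 0.1, "snp2": -0.2}
toy_dosages = {"snp1": 2, "snp2": 0}
toy_score = genetic_risk_score(toy_dosages, toy_betas)
```

In the toy example the participant carries two trait-increasing alleles at each SNP after alignment, giving a score of 2 × 0.1 + 2 × 0.2.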
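A fixed-effects meta-analysis of per-cohort regression estimates is commonly computed by inverse-variance weighting. The sketch below shows that standard estimator; the text does not state which weighting scheme or software was used, so this is an assumption, and the function name and inputs are hypothetical.

```python
from math import sqrt


def fixed_effects_meta(estimates):
    """Inverse-variance weighted fixed-effects meta-analysis.

    estimates: list of (beta, standard_error) tuples, one per cohort.
    Returns the pooled beta and its standard error.
    """
    weights = [1.0 / (se ** 2) for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Toy per-cohort estimates for illustration only.
toy_beta, toy_se = fixed_effects_meta([(1.0, 1.0), (3.0, 1.0)])
```

A sensitivity analysis of the kind described can be run by calling the function again with one cohort's tuple replaced or removed.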
Quantitation was performed using CellProfiler 3.1.9. Prior to processing, flat field illumination correction was performed using functions generated from the median intensity across each plate. Nuclei were identified using the DAPI stain and then expanded to identify whole cells using the Phalloidin/WGA and BODIPY stains. Regions of cytoplasm were then determined by removing the Nuclei from the Cell segmentations. Speckles of BODIPY staining were enhanced to assist in detection of small and large individual Lipid objects. For each object set, measurements were collected representing size, shape, intensity, granularity, texture, colocalization and distance to neighbouring objects. After LipocyteProfiler (LP) feature extraction, data were filtered by applying automated and manual quality control steps. First, fields with a total cell count of less than 50 cells were removed. Second, every field was assessed visually and fields that were corrupted by experimentally induced technical artifacts were removed. Furthermore, blocklisted features (Way, Gregory (2019): Blocklist Features-Cell Profiler. figshare. Dataset. doi.org/10.6084/m9.figshare.10255811.v3) and LP-features of the measurement categories Manders, RWC and Costes, which are known to be noisy and generally unreliable, were removed. Additionally, LP-features named SmallLipidObjects, which measure small objects stained by SYTO14 rather than lipid-informative objects, were also removed. After filtering, data were normalised per plate using a robust scaling approach (Pedregosa et al. 2011) that subtracts the median from each variable and divides it by the interquartile range. Individual wells were aggregated for downstream analysis by cell depot and day of differentiation.
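The per-plate robust scaling step (subtract the median, divide by the interquartile range) can be sketched with the Python standard library. This is an illustrative re-implementation of the idea rather than the scikit-learn routine cited above; the function names and the plate-keyed data layout are hypothetical, and quartiles are computed by linear interpolation.

```python
from statistics import median, quantiles


def robust_scale(values):
    """Scale one feature: (value - median) / interquartile range."""
    centre = median(values)
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("interquartile range is zero; feature cannot be scaled")
    return [(v - centre) / iqr for v in values]


def scale_per_plate(values_by_plate):
    """Apply robust scaling separately within each plate.

    values_by_plate: dict mapping a plate identifier to the list of
    values of a single feature measured on that plate.
    """
    return {plate: robust_scale(vals) for plate, vals in values_by_plate.items()}


# Toy feature values for illustration only.
scaled = robust_scale([1, 2, 3, 4, 5])
```

Because the median and interquartile range are insensitive to a few extreme wells, this scaling is less affected by outliers than mean and standard deviation normalisation.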
Subsequent data analyses were performed in R 3.6.1 and Matlab using base packages unless noted. To assess batch effects, Applicants visualized the data using a principal component analysis and quantified batch effects using a Kolmogorov-Smirnov test implemented in the “BEclear” R package (Akulenko et al. 2016). Additionally, Applicants applied a k-nearest neighbour (knn) supervised machine learning algorithm implemented in the “class” R package (Venables and Ripley 2002) to investigate the accuracy of predicting biological and technical variation. For this analysis the data set, consisting of 3 different cell types (hWAT, hBAT, SGBS) distributed on the 96-well plate, imaged at 4 days of differentiation, was split into equally balanced testing (n=18) and training (n=56) sets. Accuracy of the classification model was predicted based on three different categories: cell type, batch and column of the 96-well plate. (github)
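The k-nearest-neighbour accuracy check can be illustrated with a small standard-library classifier. This sketch is not the “class” R package: the Euclidean distance, the majority vote and the value k=3 are assumptions, since the text does not state the distance metric or k, and the toy data are invented for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train_features, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training samples (Euclidean distance)."""
    ranked = sorted(zip(train_features, train_labels),
                    key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(train_features, train_labels, test_features, test_labels, k=3):
    """Fraction of test samples whose label is predicted correctly."""
    hits = sum(
        knn_predict(train_features, train_labels, x, k) == y
        for x, y in zip(test_features, test_labels)
    )
    return hits / len(test_labels)


# Toy two-feature profiles for two cell types (illustration only).
toy_train_x = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
toy_train_y = ["hWAT", "hWAT", "hWAT", "hBAT", "hBAT", "hBAT"]
toy_accuracy = knn_accuracy(toy_train_x, toy_train_y,
                            [(0.1, 0.1), (5.1, 5.0)], ["hWAT", "hBAT"])
```

Running the same function with batch or plate-column labels in place of cell-type labels gives the comparison described above: high accuracy for cell type and near-chance accuracy for batch or column would indicate that biological rather than technical variation dominates the profiles.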
For dimensionality reduction visualisation Uniform manifold approximation and projection maps (UMAP) were created using the UMAP R package version 0.2.7.0 (McInnes et al. 2018) (github). To visualise LipocyteProfiles and their effect size ComplexHeatmap Bioconductor package version 2.7.7 (Gu et al. 2016) was used. (github)
To identify patterns of adipocyte differentiation underlying the morphological profiles, a sample progression discovery (SPD) analysis was performed using the algorithm previously described (Qiu et al. 2011). Briefly, the two adipose depots were analyzed separately and features were clustered into modules based on correlation (correlation coefficient 0.6). Minimal spanning trees (MST) were constructed for each module and the MSTs of the modules were correlated to each other. Modules that supported a common MST were selected and an overall MST based on features of all selected modules was reconstructed.
Variance component analysis was performed by fitting multivariable linear regression models, y_i ~ x_i + z_i + ..., where y_i denotes a LipocyteProfiler feature of individual i and x_i, z_i, etc. are independent variables that could confound identification of biological sources of variability of the dataset. Independent variables are experimental batch, adipose depot, passaging before freezing, season and year of AMSCs isolation, sex, age, BMI, T2D status of individual, LipocyteProfiler feature Cells_Neighbors_PercentTouching_Adjacent corresponding to density of cell seeding and identification numbers of individuals. (github)
To test whether there is a difference of morphological profiles on the tail ends of polygenic risk scores (PRS) for T2D, HOMA-IR and WHRadjBMI, a multi-way analysis of variance (ANOVA) was performed. Individuals belonging to the top 25% and bottom 25% of the PRS distribution were categorized into a categorical variable with 2 levels, top 25% or bottom 25%, according to their PRS percentile. Differences of morphological profiles were predicted using the categorised PRS variable adjusted for sex, age, BMI and batch. For the process-specific lipodystrophy polygenic risk score, a linear regression model adjusted for sex, age, BMI and batch was fitted to predict differences of morphological and cellular profiles. To overcome the multiple testing burden, p-values were corrected using the false discovery rate (FDR) as described in the R package “qvalue” (qvalue). Features with FDR <5% were classified to be significantly impacted by the PRS variable. (github)
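The tail categorisation and multiple-testing steps can be sketched as below. Two points are assumptions: quartile cut-offs are computed by linear interpolation, and the Benjamini-Hochberg adjustment is used as a simple stand-in for false discovery rate control. It is not the q-value procedure of the “qvalue” R package named in the text. Function names and toy values are hypothetical.

```python
from statistics import quantiles


def prs_tail_groups(scores):
    """Label each score 'low' (bottom 25%), 'high' (top 25%) or None."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    labels = []
    for s in scores:
        if s <= q1:
            labels.append("low")
        elif s >= q3:
            labels.append("high")
        else:
            labels.append(None)
    return labels


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy values for illustration only.
toy_groups = prs_tail_groups([1, 2, 3, 4, 5, 6, 7, 8])
toy_fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [i for i, q in enumerate(toy_fdr) if q < 0.05]
```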
For each group of interest, cells were pooled and divided into 100 clusters via K-Means clustering (scikit-learn). Individual cells were then sampled from the cluster closest to a theoretical point representing the mean of all object measurements, as determined by a euclidean distance matrix.
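The selection of representative cells can be sketched as follows, assuming cluster labels have already been assigned (for example by K-Means). The sketch measures closeness by the Euclidean distance between each cluster centroid and the overall mean of all cells, which is one reading of the description above; the function names, the random seed and the toy data are hypothetical.

```python
import random
from math import dist


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def representative_cells(cells, labels, n_samples=1, seed=0):
    """Sample cells from the cluster whose centroid lies closest to the
    mean of all cells.

    cells: list of feature vectors, one per cell.
    labels: cluster label for each cell (e.g. from K-Means).
    Returns the chosen cluster label and the sampled cells.
    """
    overall_mean = mean_vector(cells)
    clusters = {}
    for cell, label in zip(cells, labels):
        clusters.setdefault(label, []).append(cell)
    closest = min(clusters,
                  key=lambda lab: dist(mean_vector(clusters[lab]), overall_mean))
    members = clusters[closest]
    rng = random.Random(seed)
    return closest, rng.sample(members, min(n_samples, len(members)))


# Toy cells in three clusters (illustration only).
toy_cells = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 10], [10, 11]]
toy_labels = [0, 0, 1, 1, 2, 2]
toy_cluster, toy_sample = representative_cells(toy_cells, toy_labels)
```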
RNA-seq data were processed using FastQC (Krueger and Others 2015) and spliced reads were mapped using STAR (Dobin et al. 2013) followed by counting gene levels using featureCounts (Liao et al. 2014). Next, raw read counts were normalized using DESseq2 R package (Love et al. 2014). For differential expression analysis on the tail ends of polygenic risk scores (PRS) for HOMA-IR a multi-way analysis of variance (ANOVA) was performed on subset of 512 genes (GSEA hallmark gene sets for adipogenesis, fatty acid metabolism and glycolysis). Individuals belonging to top 25% and bottom 25% of PRS score distribution are categorized into a categorical variable with 2 levels, top 25% or 25% bottom, according to their PRS percentile. Differences in transcriptional profiles are predicted using categorised PRS variable adjusted for sex, age, BMI and batch. To overcome multiple testing burden p-values were corrected using false positive rate (FDR) described in R package “qvalue” (Storey J D, Bass A J, Dabney A, Robinson D, 2020). Genes with FDR <10% were classified to be significantly impacted by PRS and were uploaded to Enrichr to analyze them as a gene list against the WikiPathways.
A linear regression model of 2,760 LP features and global transcriptome RNA-seq data, adjusted for sex, age, BMI and batch, was fitted in subcutaneous AMSCs at day 14 of differentiation. Gene-LP feature associations were declared significant when passing an FDR cut-off of 0.1%. LP features belonging to the Cells category were used for further analysis. Associations between genes and LP features were visualized using the “igraph” R package (Csardi et al. 2006) (github). Genes connected to top-scoring LP features were uploaded to Enrichr for analysis as a gene list against WikiPathways or BioPlanet. Adipocyte marker genes, SCD, PLIN2, LIPE, INSR, GLUT4 and TIMM22, were chosen to demonstrate morphological profiles matching their known pathways, by identifying LP features that associate with those genes at a global significance level of 5% FDR. (github)
Applicants generated a hWAT cell-line stably expressing Cas9 as previously described (Shalem et al. 2014). Applicants validated the generated line by assessing Cas9 activity (90%) and adipocyte differentiation capacity using adipocyte marker gene expression and morphological profiling. CRISPR/Cas9 mediated knockdown of PPARG, PGC1A, MFN1 and PLIN1 was performed in pre-adipocytes (5 days before differentiation) using three replicates per guide and two guides per gene (guide sequences targeting PPARG: ATACACAGGTGCAATCAAAG (SEQ ID NO: 42) and CAACTTTGGGATCAGCTCCG (SEQ ID NO: 43); PGC1A: TATTGAACGCACCTTAAGTG (SEQ ID NO: 44) and AGTCCTCACTGGTGGACACG (SEQ ID NO: 45); MFN1: CACCAGGTCATCTCTCAAGA (SEQ ID NO: 46) and TTATATGGCCAATCCCACTA (SEQ ID NO: 47); PLIN1: TCACGGCAGATACTTACCAG (SEQ ID NO: 48) and TCTGCACGGTGTATCGAGAG (SEQ ID NO: 49)) as well as five non-targeted controls (control guide sequences: ATCAGGCCTTGTCCGTGATT (SEQ ID NO: 50), TACGTCATTAAGAGTTCAAC (SEQ ID NO: 51), GACAGTGAAATTAGCTCCCA (SEQ ID NO: 52), GATTCATACTAAACACTCTA (SEQ ID NO: 53), CCTAGTTCATAAGCTACGCC (SEQ ID NO: 54)) in a 96-well arrayed format. Guide on-target efficiency was assessed using next-generation sequencing followed by CRISPResso analysis (Pinello et al. 2016). AMSCs were stained using LipocytePainting (see above) on day 14 of differentiation. After feature extraction and QC steps (see also LipocyteProfiling), Applicants removed samples where guide cutting efficiency was <10% or where the discrepancy between the two guides was equal to or above 10%.
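The sample filter described at the end of this paragraph can be sketched as a simple predicate. The sketch assumes that efficiencies are expressed as percentages of edited reads and that the "discrepancy" between guides is their absolute difference in percentage points; both are interpretations, and the function name is hypothetical.

```python
def keep_sample(eff_guide_a, eff_guide_b, min_eff=10.0, max_gap=10.0):
    """Return True if a sample passes both guide-efficiency filters.

    Efficiencies are percentages of edited reads for the two guides
    targeting the same gene.
    """
    if eff_guide_a < min_eff or eff_guide_b < min_eff:
        return False  # cutting efficiency below 10%
    if abs(eff_guide_a - eff_guide_b) >= max_gap:
        return False  # guides differ by 10 percentage points or more
    return True


# Hypothetical efficiency pairs (guide 1, guide 2)
pairs = [(45.0, 50.0), (8.0, 50.0), (40.0, 50.0), (41.0, 50.0)]
kept = [pair for pair in pairs if keep_sample(*pair)]
print(kept)
```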
Genotyping of all samples was performed in two separate batches using the Infinium HTS assay on Global Screening Array bead-chips. Since the two sets of samples were genotyped with different versions of the bead-chips and in different batches, Applicants performed QC, imputed genotypes, and generated the genome-wide polygenic scores separately for each set and combined the results afterwards.
A 3-step quality control protocol was applied using PLINK (Purcell et al. 2007; Chang et al. 2015), and included 2 stages of SNP removal and an intermediate stage of sample exclusion.
The exclusion criteria for genetic markers consisted of: proportion of missingness ≥0.05, HWE p≤1×10−20 for all the cohort, and MAF <0.001. This protocol for genetic markers was performed twice, before and after sample exclusion.
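The three marker exclusion criteria can be written out as a small predicate. The QC itself was run in PLINK; the sketch below is only an illustration of the thresholds, with hypothetical marker names and values.

```python
def marker_passes_qc(missingness, hwe_p, maf):
    """Apply the three marker exclusion criteria stated above."""
    if missingness >= 0.05:
        return False  # too many missing genotype calls
    if hwe_p <= 1e-20:
        return False  # extreme deviation from Hardy-Weinberg equilibrium
    if maf < 0.001:
        return False  # minor allele too rare
    return True


# Hypothetical markers: (missingness, HWE p-value, minor allele frequency)
markers = {
    "snp_a": (0.01, 0.30, 0.20),
    "snp_b": (0.08, 0.50, 0.10),
    "snp_c": (0.00, 1e-25, 0.30),
    "snp_d": (0.02, 0.70, 0.0005),
}
kept = [name for name, vals in markers.items() if marker_passes_qc(*vals)]
print(kept)
```

Because sample exclusion changes per-marker missingness and allele frequencies, the same predicate would be applied once before and once after samples are removed.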
For the individuals, Applicants considered the following exclusion criteria: gender discordance, subject relatedness (pairs with PI-HAT ≥0.125 from which Applicants removed the individual with the highest proportion of missingness), sample call rates ≥0.02 and population structure showing more than 4 standard deviations within the distribution of the study population according to the first seven principal components. After QC, 35 subjects remained for the analysis for which Applicants had matched LipocyteProfiler imaging data.
Genotypes were phased with SHAPEIT2 (Delaneau et al. 2013), and genotype imputation was then performed with the Michigan Imputation Server, using the Haplotype Reference Consortium (HRC) (Consortium and the Haplotype Reference . . . ) as reference panel. Applicants excluded variants with an imputation info r-squared <0.5 and a MAF <0.005.
Genome-wide polygenic scores were computed using PRS-CS (Ge et al. 2019) and using the “auto” parameter to specify the phi shrinkage parameter. Applicants computed the PRS-CS polygenic scores for the following traits: T2D (Mahajan et al. 2018), BMI, waist-to-hip ratio adjusted and unadjusted by BMI, and stratified by sex and combined (Pulit et al. 2019). Genome-wide PRS for HOMA-IR were computed with LdPred (Vilhjálmsson et al. 2015) using summary statistics from Dupuis et al (Dupuis et al. 2010).
Process-specific PRSs were constructed based on five clusters defined in Udler et al. (Udler et al. 2018) by selecting the SNPs that had a weight larger than 0.75 for a given cluster. Applicants used the effect sizes described in Mahajan et al. as weights for the polygenic scores (Mahajan et al. 2018).
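The construction of a process-specific score can be sketched in two steps: select the SNPs whose cluster weight exceeds 0.75, then sum risk-allele dosages weighted by effect size. The sketch assumes an additive score over allele dosages; the SNP identifiers, weights, effect sizes and dosages are placeholders, not values from Udler et al. or Mahajan et al.

```python
def cluster_snps(cluster_weights, threshold=0.75):
    """Return SNP IDs whose weight in the cluster exceeds the threshold."""
    return [snp for snp, w in cluster_weights.items() if w > threshold]


def polygenic_score(dosages, effect_sizes, snps):
    """Weighted sum of risk-allele dosages over the selected SNPs."""
    return sum(dosages[snp] * effect_sizes[snp] for snp in snps)


# Placeholder inputs for one cluster and one individual
weights = {"rsA": 0.90, "rsB": 0.50, "rsC": 0.80}   # cluster weights
betas = {"rsA": 0.10, "rsB": 0.20, "rsC": 0.05}     # per-allele effect sizes
dosages = {"rsA": 2, "rsB": 1, "rsC": 1}            # risk-allele counts

selected = cluster_snps(weights)
score = polygenic_score(dosages, betas, selected)
print(selected, score)
```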
All PRSs were tested for association with T2D and with BMI using the 30,240 MGB Biobank samples of European ancestry, defined based on self-report and principal components.
The MGB Biobank (Karlson et al. 2016) maintains blood and DNA samples from more than 60,000 consented patients seen at Partners HealthCare hospitals, including Massachusetts General Hospital, Brigham and Women's Hospital, McLean Hospital, and Spaulding Rehabilitation Hospital, all in the USA. Patients are recruited in the context of clinical care appointments at more than 40 sites, clinics and also electronically through the patient portal at Partners HealthCare. Biobank subjects provide consent for the use of their samples and data in broad-based research. The Partners Biobank works closely with the Partners Research Patient Data Registry (RPDR), the Partners' enterprise scale data repository designed to foster investigator access to a wide variety of phenotypic data on more than 4 million Partners HealthCare patients. Approval for analysis of Biobank data was obtained by Partners IRB, study 2016P001018.
Type 2 diabetes status was defined based on “curated phenotypes” developed by the Biobank Portal team using both structured and unstructured electronic medical record (EMR) data and clinical, computational and statistical methods. Natural Language Processing (NLP) was used to extract data from narrative text. Chart reviews by disease experts helped identify features and variables associated with particular phenotypes and were also used to validate results of the algorithms. The process produced robust phenotype algorithms that were evaluated using metrics such as sensitivity, the proportion of true positives correctly identified as such, and positive predictive value (PPV), the proportion of individuals classified as cases by the algorithm who are true cases (Yu et al. 2015).
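The two evaluation metrics named above follow directly from confusion-matrix counts. The short sketch below computes both; the counts are invented for illustration and do not come from the Biobank validation.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of true cases that the algorithm identifies as cases."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Proportion of algorithm-classified cases that are true cases."""
    return true_pos / (true_pos + false_pos)


# Hypothetical chart-review counts
tp, fn, fp = 90, 10, 5
sens = sensitivity(tp, fn)
ppv = positive_predictive_value(tp, fp)
print(round(sens, 3), round(ppv, 3))
```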
A 3-step quality control protocol was applied using PLINK (Purcell et al. 2007; Chang et al. 2015), and included 2 stages of SNP removal and an intermediate stage of sample exclusion.
The exclusion criteria for genetic markers consisted of: proportion of missingness ≥0.05, HWE p≤1×10−20 for all the cohort, and MAF <0.001. This protocol for genetic markers was performed twice, before and after sample exclusion.
For the individuals, Applicants considered the following exclusion criteria: gender discordance, subject relatedness (pairs with PI-HAT ≥0.125 from which Applicants removed the individual with the highest proportion of missingness), sample call rates ≥0.02 and population structure showing more than 4 standard deviations within the distribution of the study population according to the first seven principal components.
Genotypes were phased with SHAPEIT2 (Delaneau et al. 2013), and genotype imputation was then performed with the Michigan Imputation Server, using the Haplotype Reference Consortium (HRC) as reference panel. Applicants excluded variants with an imputation info r-squared <0.5 and a MAF <0.005.
Human liposuction material used for isolation of preadipocytes was obtained from a collaborating private plastic surgery clinic Medaesthetic Privatklinik Hoffmann & Hoffmann in Munich, Germany. Harvested subcutaneous liposuction material was filled into sterile 1 L laboratory bottles and immediately transported to the laboratory in a secure transportation box. The fat was aliquoted into sterile straight-sided wide-mouth jars, excluding the transfer of liposuction fluid. The fat was stored in cold Adipocyte Basal medium (AC-BM) at a 1:1 ratio of fat to medium at 4° C. to be processed the following day. Additionally, small quantities of the original liposuction material were aliquoted into T-25 flasks at a 1:1 ratio of fat to medium as controls to check for contamination. These control flasks were stored in the 37° C. incubator and were not processed. Krebs-Ringer Phosphate (KRP) buffer was prepared containing 200 U/ml of collagenase and 4% heat shock fraction BSA and sterilized by filtration using a Bottle Top Filter 0.22 μm. When the fat reached RT, 12.5 ml of liposuction material was aliquoted into sterile 50-ml tubes with plug seal caps. The tubes were filled to 47.5 ml with warm KRP-BSA-collagenase buffer and the caps were securely tightened and wrapped in Parafilm to avoid leakage. The tubes were incubated in a shaking water bath for 30 minutes at 37° C. with strong shaking. After 30 minutes, the oil on top was discarded and the supernatant was initially filtered through a nylon mesh. The supernatant of all tubes was combined after filtration and centrifuged at 200×g for 10 minutes. The supernatant was discarded and each pellet was resuspended with 3 ml of erythrocyte lysis buffer, then all the pellets were pooled in one tube and incubated for 10 minutes at RT. The cell suspension was filtered through a 250 μm filter and then through a 150 μm filter, followed by centrifugation at 200×g for 10 minutes.
The supernatant was discarded and the pellet containing preadipocytes was resuspended in an appropriate amount of DMEM/F12 with 1% P/S and 10% FCS and seeded in T75 cell culture flasks and stored in the incubator (37° C., 5% CO2). The next day the medium was changed to PAC-PM. Once preadipocytes reached 100% confluency in T25 or T75 flasks they were split into 6-well plates at a seeding density of 250,000 cells per plate in PAC-PM. Once they reached 100% confluency, PAC-IM was prepared fresh and added to the preadipocytes to induce differentiation. On day 3 after induction, the medium was changed to PAC-DM and replaced twice a week.
Subcutaneous adipose tissue was sampled from the abdominal area at the site of incision and visceral adipose tissue from the angle of His from patients undergoing elective abdominal laparoscopic surgery. Each patient gave written informed consent prior to inclusion and the study protocol was approved by the ethics committee of the Technical University of Munich (Study nr. 5716/13). Connective tissue and blood vessels were dissected and one gram of minced adipose tissue was digested with 5 ml of Krebs-Ringer phosphate buffer containing 200 U/ml of collagenase (SERVA, Heidelberg, Germany). Digestion was carried out at 37° C. for 60 minutes in a shaking water bath. Afterwards the suspension was centrifuged at 200×g for 10 minutes and the supernatant was discarded. The pellet containing the SVF was resuspended in DMEM/F12 (Gibco, Thermo Fisher Scientific, Darmstadt) containing 10% FCS (F7524, Sigma-Aldrich, Taufkirchen, Germany) and 1% penicillin-streptomycin (P/S; PAA Laboratories, Linz, Austria). After filtering the cell suspension through a 70 μm cell strainer the cells were plated, washed with PBS on the next day and medium was changed to proliferation medium. Proliferation and differentiation of isolated preadipocytes was carried out as described earlier [DOI: 10.1056/NEJMoa1502214].
Human primary AMSCs were isolated from liposuction material. Each patient gave written informed consent prior to inclusion and the study protocol was approved by the ethics committee of the Technical University of Munich (study nr. 5716/13). The liposuction material was immediately transported to the laboratory and stored with an equal amount of DMEM-F12 (Gibco, Thermo Fisher Scientific, Darmstadt) containing 1% penicillin-streptomycin (P/S; PAA Laboratories, Linz, Austria) overnight at 4° C. On the next day the samples were digested in a 1:4 ratio with Krebs-Ringer Phosphate (KRP) buffer containing 200 U/ml collagenase (SERVA, Heidelberg, Germany) at 37° C. in a shaking water bath for 60 minutes. After digestion the adipocyte/oil containing layer was removed and the remaining liquid containing the SVF was filtered through a 2000 μm nylon mesh. The SVF was pelleted through centrifugation for 10 minutes at 200×g. The supernatant was discarded and the pellet was resuspended in 37° C. warm erythrocyte lysis buffer (155 mM NH4Cl, 5.7 mM K2HPO4, 0.1 mM EDTA dihydrate) and incubated at room temperature for 10 minutes. The cell suspension was filtered through a 250 μm filter and then through a 150 μm filter, followed by centrifugation at 200×g for 10 minutes. The supernatant was discarded and the pellet containing AMSCs was resuspended in DMEM/F12 containing 1% P/S and 10% FCS (Sigma, F7524). Cells were seeded and washed with PBS on the next day before switching to proliferation medium. Proliferation and differentiation was carried out as described earlier [DOI: 10.1056/NEJMoa1502214].
The 2q24.3 MONW Risk Locus Overlaps with Enhancer Signatures in Adipocyte Progenitors
To identify diseases and traits associated with the 2q24.3 locus, Applicants visualized large-scale phenome-wide associations from the UK Biobank (UKBB) (Gagliano et al., 2020). Jointly analyzing phenotypes across the UKBB Applicants observed that the 2q24.3 locus associated with increased T2D risk as well as a series of body fat-related traits (
The 2q24.3 locus encompasses 55 kilobases, spanning from COBLL1 intronic regions to the intergenic region between GRB14 and COBLL1 (
Next, Applicants examined whether the two haplotypes show differences in chromatin structure during adipocyte differentiation. Specifically, Applicants performed assays for enhancer activity (H3K27ac ChIP-seq) and chromatin accessibility (ATAC-seq) on adipose-derived mesenchymal stem cells (AMSCs) from heterozygous individuals across a time course of differentiation (before induction (Day 0), early differentiation (Day 2), intermediate differentiation (Day 6) and terminal differentiation (Day 14)) and compared the numbers of reads from the two haplotypes (
To identify which of the 19 candidate regulatory variants is likely mediating the differential enhancer activity in adipocyte progenitors (
Applicants next performed in silico saturation mutagenesis to evaluate the predicted change in chromatin accessibility from mutation at every position to each alternative nucleotide within a 20 bp region surrounding rs6712203 using ATAC-Seq data during AMSC differentiation. Applicants found that the rs6712203 T allele is critical for a POU2F2 motif (
Applicants next sought to establish rs6712203 causality by directly confirming that the haplotype-specific effects on enhancer activity and POU2F2 binding are mediated by rs6712203 using CRISPR-based genome editing at this SNP. Applicants edited SGBS preadipocytes (n=5) that are heterozygous at rs6712203 to create isogenic lines for the TT (non-risk genotype) and CC (risk genotype) alleles. Applicants observed that cells harboring the CC homozygous risk showed 2.4-fold lower COBLL1 expression levels compared to the TT non-risk genotype (
Applicants used three-dimensional genome conformation data from Hi-C assays in embryonic stem cell-derived MSCs (Dixon et al. 2015) to define the physical boundaries of potential proximal and long-distant target genes and found that the locus lies in a well-defined contact domain containing only two genes, COBLL1 and GRB14 (
To understand the role of COBLL1 in adipocyte cellular programs, Applicants first examined the gene expression and cellular localization of COBLL1 in differentiating adipocytes and observed that COBLL1 is expressed at any given stage of adipocyte differentiation with an increase in mRNA and protein levels over the course of differentiation (
To connect the 2q24.3 locus to cellular functions in adipose Applicants used genome-wide co-expression matrices in adipocytes matched with a series of cellular assays. Applicants identified COBLL1 co-regulated genes in genome-wide expression data from primary human AMSCs in a cohort of 12 healthy, non-obese individuals. COBLL1 co-expressed genes were highly enriched in biological processes related to ‘Regulation of actin cytoskeleton’ and ‘Regulation of lipolysis in adipocytes’, including ITGAM (Integrin Subunit Alpha M), PIK3CA (Phosphatidylinositol-4,5-Bisphosphate 3-Kinase Catalytic Subunit Alpha), ROCK2 (Rho-associated protein kinase 2), ITGA1 (Integrin alpha-1), ARHGEF7 (Rho Guanine Nucleotide Exchange Factor 7), CRK, FGFR2 (Fibroblast Growth Factor Receptor 2), ARHGEF6 (Rho Guanine Nucleotide Exchange Factor 6) (
To identify morphological and cellular traits associated with altered COBLL1 expression, Applicants used siRNA-mediated knockdown of COBLL1 in AMSCs coupled with a high-content imaging read-out that Applicants recently developed, Adipocyte Profiler (See Example BioRXiv). Adipocyte Profiler allows examination of generic as well as adipocyte-specific cellular traits at four time-points of adipocyte differentiation (before differentiation (day 0), three days (day 3), nine days (day 9) and 14 days (day 14) after adipogenic induction) (
To investigate if the COBLL1 effect on actin remodeling in adipocytes impacts adipocyte cellular programs related to metabolic disease, Applicants performed stable ablation of COBLL1 using lentivirus (shCOBLL1) in differentiating adipocytes. Applicants observed that ablation of COBLL1 resulted in decreased capacity to differentiate into metabolically active round-shaped lipid filled mature adipocytes, as shown by decreased Oil-Red-O staining of accumulated triglycerides (
Applicants further examined the effect of GRB14 stable knockdown in AMSCs and observed that GRB14 ablation did not significantly decrease adipocyte differentiation capacity as measured by Oil-Red-O staining, GPDH activity (
Together, Applicants connect COBLL1, a 2q24 effector gene, to actin cytoskeleton remodeling processes in differentiating subcutaneous adipocytes, accompanied by a failure in adipocyte differentiation and function, including increased glucose uptake in response to insulin, and lipid break-down to free fatty acids.
To confirm that the changes on the actin cytoskeleton and subsequent effects on adipocyte functions are under the genetic control of the rs6712203 MONW risk haplotype, Applicants used Adipocyte Profiler (see Example 1) to phenotypically profile primary human adipocytes across differentiation from individuals carrying the risk haplotype (n=6) compared to the non-risk haplotype (n=7) (
Applicants generated a CRISPR engineered Cobll1 knockout (Cobll1−/−) mouse model to determine a potential role for Cobll1 in the regulation of metabolic function in vivo. First, Applicants sought to assess the effect of Cobll1 knockout on morphological and cellular profiles in differentiating murine perigonadal AMSCs by Adipocyte Profiler (day 0, day 2 and day 10 of differentiation,
To assess the impact of the 2q24.3 MONW locus effector COBLL1 on organismal processes, Applicants assayed growth and body composition phenotypes in Cobll1−/− mice. At 10 weeks of age, Applicants found that Cobll1−/− homozygous animals displayed 20-25% less weight gain compared to the WT control and Cobll1 heterozygous (Cobll1+/−) littermates (
The 2q24.3 locus is pleiotropic in nature and, intriguingly, is associated with increased risk of T2D and simultaneously with decreased body fat percentage, reminiscent of a MONW/MOH phenotype association signature. Here, Applicants applied a series of experimental and computational approaches to systematically dissect the 2q24.3 metabolic risk locus and link it to a causal variant (rs6712203), its effector gene (COBLL1), its causal cell type and cell context (developmental time point, adipose depot) and the cellular mechanisms the locus affects (actin remodeling). Together, these alter cellular functions that are relevant for T2D and body fat percentage and distribution, including adipocyte differentiation into metabolically active subcutaneous adipocytes, lipid metabolism and insulin-responsive glucose uptake. When ablating Cobll1 in mice, Applicants show a ‘lipodystrophy-like phenotype’, recapitulating the pleiotropic association with T2D and decreased body fat mass in humans. These data use genetic evidence to provide mechanistic evidence that a common genetic variant limits peripheral energy storage capacity and simultaneously affects insulin responsiveness.
The results of this study lend support to the common hypothesis that the individual risk of T2D and fasting insulin is modified by changes to the mass, distribution and function of adipose tissue (Lotta et al. 2017; Small et al. 2018), and that a metabolically healthy state is largely dependent on subcutaneous adipose tissue expandability. Inherited and acquired lipodystrophies, as characterized by the selective or global perturbation of adipose tissue function, mass and distribution, result in severe forms of insulin resistance and diabetes, and shared molecular mechanisms between rare familial partial lipodystrophy type 1 and common forms of insulin resistance at the genetic level have been previously suggested (Lotta et al. 2017). Several common metabolic risk loci are characterized by a MONW/MOH association, and distinct association signatures suggest multiple mechanisms at play, most of which remain to be identified (Loos and Kilpeläinen 2018; Kilpeläinen et al. 2011; Fathzadeh et al. 2020). Previous work has convincingly implicated variants at the FAM13A locus to affect metabolic disease risk by affecting subcutaneous adipocyte differentiation (Fathzadeh et al. 2020). In this work, Applicants implicate for the first time actin cytoskeleton remodeling as a critical factor for subcutaneous adipocyte function and as causally involved in metabolic disease progression in humans, stressing the notion that MONW/MOH predisposing loci control distinct cellular programs.
Applicants observed evidence of sex-dimorphic effects when conditioning MONW traits on rs6712203, which is in line with a reported sexual dimorphism for WHR consistently conveying stronger effects in women (Heid et al. 2010; Morris et al. 2012; Randall et al. 2013; Sung et al. 2016) and a sex-independent effect on T2D (Vujkovic et al. 2020; Spracklen et al. 2020) and with a sex-dimorphic effect on gene expression for COBLL1, but not for GRB14 (Lagou et al. 2021).
The COBLL1 protein has been introduced as a biomarker of high prognostic value for different types of cancer (Gordon et al., 2003, 2009; Wang et al., 2013; Han et al., 2017), a modulator of cell morphology in prostate cancer (Takayama et al. 2018), and lipid metabolism and insulin signaling in adipocytes (Chen et al. 2020). Here, Applicants establish a chain-of-causation linking the 2q24.3 locus to its functional variant, its adipocyte cell type and context specific effect, its regulatory element, its effector gene COBLL1, and finally its causal cellular function, i.e. actin remodeling in differentiating adipocytes, which is under the genetic control of both the locus and the target gene. Consequently, Applicants establish the gene as a key regulator of subcutaneous adipocyte differentiation, lipid metabolism and insulin sensitivity at the cellular as well as the organismal level. These findings are in line with recent reports linking actin dynamics, regulated by the F/G-actin ratio, and insulin-stimulated trafficking and fusion of GLUT4 vesicles (Chen et al. 2018; Kanzaki and Pessin 2001; Kim et al. 2019).
While the insulin receptor adaptor protein GRB14 (Growth Factor Receptor Bound Protein 14) is an intuitive effector target gene at the 2q24.3 locus and has been shown to affect glucose tolerance (Cariou et al. 2004; Cooney et al. 2004; Chen et al. 2020), Applicants causally implicate COBLL1 as at least one causal effector gene at the locus. Applicants note that COBLL1 as the effector gene that underlies the T2D association is further corroborated by the T2D-associated coding variant Asn939Asp in COBLL1 (MAF=0.12, p=4.7×10−11) (Fuchsberger et al. 2016). Furthermore, recent rare variant aggregation analyses at COBLL1 revealed nominal association with WHR (Kan et al. 2016), concordant with the findings that COBLL1 drives at least part of the 2q24.3 genetic risk. The sequence-based predictive models score rs6712203 highest across all 2q24.3 haplotype variants, though multiple other variants at the locus are predicted to affect regulatory activity as well. Therefore, while beyond the scope of this study, Applicants note that multiple variants could act in concert at this locus, potentially implicating GRB14 along with COBLL1 as effector genes.
The 2q24.3 locus is a prime example of a common genetic locus that predisposes to limited peripheral adipose storage capacity and insulin resistance, driven by an impairment of dynamic actin cytoskeleton remodeling process of the differentiating subcutaneous adipocyte.
Applicants obtained AMSCs from subcutaneous and visceral adipose tissue from patients undergoing a range of abdominal laparoscopic surgeries (sleeve gastrectomy, fundoplication or appendectomy). The visceral adipose tissue is derived from the proximity of the angle of His and subcutaneous adipose tissue obtained from beneath the skin at the site of surgical incision. Additionally, human liposuction material was obtained from a collaborating private plastic surgery clinic Medaesthetic Privatklinik Hoffmann & Hoffmann in Munich, Germany. Isolation of AMSCs was performed as previously described (Claussnitzer 2014; Hauner et al. 2001).
For imaging, cells were seeded at 10K cells/well in 96-well plates (High-content imaging; Cell Carrier, Perkin Elmer #6005550) or seeded at 18,000 cells/well in collagen IV coated 8 well μ-slides (Higher-resolution imaging; ibidi, Gräfelfing, Germany #80822) and induced 4 days after seeding. For RNAseq, cells were seeded at 40K cells/well in 12-well dishes (Corning). Before induction, cells were cultured in proliferation medium (Basic medium consisting of DMEM-F12, 1% Penicillin-Streptomycin, 33 μM Biotin and 17 μM Pantothenate supplemented with 0.13 μM Insulin, 0.01 μg/ml EGF, 0.001 μg/ml FGF, 2.5% FCS). Adipogenic differentiation was induced by changing culture medium to induction medium (Basic medium supplemented with 0.861 μM Insulin, 1 nM T3, 0.1 μM Cortisol, 0.01 mg/ml Transferrin, 1 μM Rosiglitazone, 25 nM Dexamethasone, 2.5 nM IBMX). On day 3 of adipogenic differentiation culture medium was changed to differentiation medium (Basic medium supplemented with 0.861 μM Insulin, 1 nM T3, 0.1 μM Cortisol, 0.01 mg/ml Transferrin). Medium was changed every 3 days. Visceral-derived AMSCs were differentiated by further adding 2% FBS as well as 0.1 mM oleic and linoleic acid to the induction and differentiation media.
Genotyping was performed using the Illumina Global Screening beadchip array. DNA was extracted using Qiagen DNeasy Blood and Tissue Kit (Qiagen 69504) and sent to the Oxford Genotyping Center for genotyping on the Infinium HTS assay on Global Screening Array bead-chips. Genotype QC was done using GenomeStudio and genotypes were converted into PLINK format for downstream analysis. Applicants checked sample missingness but found no sample with missingness >5%. For the remaining sample quality control (QC) steps, Applicants reduced the genotyping data down to a set of high-quality SNPs. These SNPs were: (a) Common (minor allele frequency >10%); (b) Had missingness <0.1%; (c) Independent, pruned at a linkage disequilibrium (R2) threshold of 0.2; (d) Autosomal only; (e) Outside the lactase locus (chr2), the major histocompatibility complex (MHC, chr6), and outside the inversions on chr8 and chr17; (f) In Hardy-Weinberg equilibrium (P>1×10−3). Using the remaining ~65,000 SNPs, Applicants checked samples for inbreeding (--het in PLINK), but found no samples with excess homozygosity or heterozygosity (no sample >6 standard deviations from the mean). Applicants also checked for relatedness (--genome in PLINK) and found one pair of samples to be identical; Applicants kept the sample with the higher overall genotyping rate. Finally, Applicants performed PCA using EIGENSTRAT and projected the samples onto data from HapMap3, which includes samples from 11 global populations. Six samples appeared to have some amount of non-European ancestral background, while the majority of samples appeared to be of European descent. Applicants removed no samples at this step, selecting to adjust for principal components in genome-wide testing.
However, adjustment for principal components failed to eliminate population stratification, and Applicants therefore restricted to samples of European descent only, defined as samples falling within +/−10 standard deviations of the first and second principal component values of the CEU (Northern and Western European-ancestry samples living in Utah) and TSI (Tuscans in Italy) samples included in the HapMap 3 dataset. Finally, sex information was received after initial sample QC was complete. As a result, one sample with potentially mismatching sex information (comparing genotypes and phenotype information) was discovered after analyses were complete and therefore remained in the analysis.
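The ancestry restriction can be sketched as a window test on the first two principal components. The sketch assumes the window is centred on the mean of the pooled CEU and TSI reference samples and scaled by their sample standard deviation on each component; the PC values are invented and the function name is hypothetical.

```python
import statistics


def within_reference_window(sample_pcs, reference_pcs, n_sd=10.0):
    """True if PC1 and PC2 of a sample both fall within n_sd standard
    deviations of the reference-panel mean."""
    for axis in range(2):  # PC1 and PC2 only
        ref_values = [pcs[axis] for pcs in reference_pcs]
        center = statistics.mean(ref_values)
        spread = statistics.stdev(ref_values)
        if abs(sample_pcs[axis] - center) > n_sd * spread:
            return False
    return True


# Invented (PC1, PC2) values for pooled reference samples
reference = [(0.010, 0.020), (0.012, 0.018), (0.008, 0.022), (0.011, 0.019)]
inside = within_reference_window((0.015, 0.021), reference)
outside = within_reference_window((0.090, 0.020), reference)
print(inside, outside)
```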
Applicants removed all SNPs with missingness >5% and out of HWE, P<1×10−6. Applicants also removed monomorphic SNPs. Finally, Applicants set heterozygous haploid sites to missing to enable downstream imputation. The final cleaned dataset included 190 samples and ~700,000 SNPs. Applicants note that histology data was not available for all genotyped samples.
For the genotyped cohorts without imputation data (ENDOX and MOBB) Applicants performed imputation via the Michigan Imputation Server. Applicants aligned SNPs to the positive strand, and then uploaded the data (in VCF format) to the server. Applicants imputed the data with the Haplotype Reference Consortium (HRC) panel, to be consistent with the fatDIVA data which was already imputed with the HRC panel. Applicants selected EAGLE as the phasing tool to phase the data. To impute chromosome X, Applicants followed the server protocol for imputing this chromosome (including using SHAPEIT to perform the phasing step).
ATAC-seq was performed by adapting the protocol from Buenrostro et al. (2015) by adding a nuclei preparation step. Differentiating cells were lysed directly in the cell culture plate at four time-points during differentiation (before adipogenesis was induced (D0), during early (D3) and advanced differentiation (D6), as well as at terminal differentiation (D24)). Ice-cold lysis buffer was added directly onto cells grown in a 12-well plate. Plates were incubated on ice for 10 minutes until cells were permeabilized and nuclei released. Cells in lysis buffer were gently scraped off the well and transferred into a chilled 1.5 ml tube to create crude nuclei. Nuclei were spun down at 600×g for 10 minutes at 4° C. Nuclei pellets were then re-suspended in 40 μl Tagmentation DNA (TD) Buffer (Nextera, FC-121-1031) and quality of nuclei assessed using trypan blue. Volume of 50,000 nuclei was determined using a haemocytometer. Transposition reaction was performed as previously described (Buenrostro et al., 2015). All tagmented DNA was PCR amplified for 8 cycles using the following PCR conditions: 72° C. for 5 minutes, 98° C. for 30 seconds, followed by thermocycling at 98° C. for 10 seconds, 63° C. for 30 seconds and 72° C. for 1 minute. Quality of ATAC-seq libraries was assessed using a Bioanalyzer High Sensitivity Chip (Applied Biosystems). The profiles showed that all libraries had a mean fragment size of ~200 bp and characteristic nucleosome patterning, indicating good quality. Libraries were pooled and sequenced on an Illumina HiSeq 4000, generating 50 million reads/sample, 75 bp paired end. To reduce bias due to PCR amplification of libraries, duplicate reads were removed. Sequencing reads were aligned to hs37d5 and BWA-MEM was used for mapping. All experiments were performed in technical duplicates.
Human primary AMSCs and mouse AMSCs were plated and differentiated in 96-well CellCarrier plates (PerkinElmer #6005550) for 14 days for high content imaging at day 0, day 3, day 8 and day 14 of adipogenic differentiation. On the respective day of the assay, cell culture media was removed and replaced by 0.5 μM MitoTracker staining solution (1 mM MitoTracker Deep Red stock (Invitrogen #M22426) diluted in culture media) in each well, followed by 30 minutes incubation at 37° C. protected from light. After 30 min, the MitoTracker staining solution was removed and cells were washed twice with Dulbecco's Phosphate-Buffered Saline (1×), DPBS (Corning® #21-030-CV), and 2.9 μM BODIPY staining solution (3.8 mM BODIPY 505/515 stock (Thermofisher #D3921) diluted in DPBS) was added, followed by 15 minutes incubation at 37° C. protected from light. Subsequently, cells were fixed by adding 16% Methanol-free Paraformaldehyde, PFA (Electron Microscopy Sciences #15710-S) directly to the BODIPY staining solution to a final concentration of 3.2% and incubated for 20 minutes at RT protected from light. PFA was removed and cells were washed once with Hank's Balanced Salt Solution (1×), HBSS (Gibco #14025076). To permeabilize cells, 0.1% Triton X-100 (Sigma Aldrich #X100) was added and incubated at RT for 10 minutes protected from light. After permeabilization, multi-stain solution (10 units of Alexa Fluor™ 568 Phalloidin (ThermoFisher #A12380), 0.01 mg/ml Hoechst 33342 (Invitrogen #H3570), 0.0015 mg/ml Wheat Germ Agglutinin, Alexa Fluor™ 555 Conjugate (ThermoFisher #W32464), 3 μM SYTO™ 14 Green Fluorescent Nucleic Acid Stain (Invitrogen #S7576) diluted in HBSS) was added and cells were incubated at RT for 10 minutes protected from light. Finally, staining solution was removed and cells were washed three times with HBSS. Cells were imaged using an Opera Phenix High content screening system. Per well, Applicants imaged 25 fields.
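By way of non-limiting illustration, the volume of 16% PFA stock that must be added directly to the staining solution to reach a final concentration of 3.2% follows from a simple mass balance. The sketch below is an assumption-laden example only: the function name and the 100 μl well volume are hypothetical and are not stated in the protocol above.

```python
def stock_volume_to_add(existing_volume_ul, stock_pct, final_pct):
    """Volume of stock (in ul) to add to an existing volume that contains
    none of the reagent, so that the mixture reaches final_pct.

    Mass balance: stock_pct * v = final_pct * (existing_volume_ul + v)
    """
    if not 0 < final_pct < stock_pct:
        raise ValueError("final concentration must be positive and below the stock")
    return existing_volume_ul * final_pct / (stock_pct - final_pct)


# Hypothetical well volume of 100 ul of BODIPY staining solution
# (the actual volume per well is not given in the text).
pfa_ul = stock_volume_to_add(100.0, stock_pct=16.0, final_pct=3.2)
print(f"Add {pfa_ul:.1f} ul of 16% PFA per 100 ul of staining solution")
```

Under these assumptions, one part of 16% stock is added to four parts of staining solution.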
To stain the actin cytoskeleton, COBLL1 and nuclei, cells were washed twice with ice cold PBS and fixed with paraformaldehyde Roti-Histofix 4% (Roth, Karlsruhe, Germany) for 15 min. Cells were washed twice with ice cold PBS for 5 min and incubated with ice cold 0.1% Triton X/PBS (Roth, Karlsruhe, Germany) for 5 min. Cells were washed again twice with PBS and incubated for 1 hour at RT with 4% BSA, then incubated with 1:100 primary COBLL1-antibody (specification: HPA053344; Atlas Antibodies, Bromma, Sweden) overnight at 4° C., followed by one hour at room temperature. Cells were washed twice with PBS and stained with 0.46% Bisbenzimide H 33258 (Sigma-Aldrich, Steinheim, Germany), 1% Phalloidin-Atto-565 (Sigma-Aldrich, Steinheim, Germany) and the secondary antibody against COBLL1 1:200 Alexa-Fluor 488 (Abcam, Cambridge, UK). Cells were incubated for one hour at RT in the dark. Afterwards, cells were washed twice with PBS for 5 min and kept in PBS at 4° C. until imaging. Images were acquired on a Leica DMi8 microscope using the HC PL APO ×63/1.40 oil objective. Images were processed using the Leica LasX software.
Quantitation was performed using CellProfiler 3.1.9. Prior to processing, flat field illumination correction was performed using functions generated from the mean intensity across each plate. Nuclei were identified using the DAPI stain and then expanded to identify whole cells using the AGP and Bodipy stains. Regions of cytoplasm were then determined by removing the Nuclei from the Cell segmentations. Speckles of Bodipy staining were enhanced to assist in detection of small and large individual Bodipy objects. For each object set, measurements were collected representing size, shape, intensity, granularity, texture, colocalization and distance to neighboring objects. After feature extraction, data were filtered by applying automated and manual quality control steps. First, fields with a total cell count of less than 50 cells were removed. Second, fields corrupted by experimentally induced technical artifacts were removed by applying a manually defined quality control mask. Furthermore, blocklisted features that are known to be noisy and generally unreliable were removed. After filtering, data were normalised per plate using a robust scaling approach that subtracts the median from each variable and divides the result by the interquartile range. For each individual, wells were aggregated for downstream analysis by cell depot and day of differentiation. Subsequent data analyses were performed in R 3.6.1 using base packages unless noted. For dimensionality reduction visualization, Uniform Manifold Approximation and Projection (UMAP) maps were created using the UMAP R package (github.com/lmcinnes/umap) with default settings.
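By way of non-limiting illustration, the per-plate robust scaling described above (subtract the median, divide by the interquartile range) may be sketched as follows. This is a minimal example and not the pipeline actually used: the record layout, the function name, and the choice of the "inclusive" quartile definition are assumptions made for the sketch.

```python
from collections import defaultdict
from statistics import median, quantiles


def robust_scale_per_plate(rows, feature):
    """Return copies of `rows` with `feature` scaled as (x - median) / IQR,
    where median and IQR are computed separately for each plate.

    rows: list of dicts, each with a 'plate' key and a numeric `feature` key
    (hypothetical layout). Each plate needs at least two values.
    """
    values_by_plate = defaultdict(list)
    for row in rows:
        values_by_plate[row["plate"]].append(row[feature])

    stats = {}
    for plate, values in values_by_plate.items():
        # Quartile convention is an assumption; other conventions differ slightly.
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        stats[plate] = (median(values), q3 - q1)

    scaled = []
    for row in rows:
        med, iqr = stats[row["plate"]]
        new_row = dict(row)
        # A constant feature (IQR of zero) is mapped to 0.0 rather than dividing by zero.
        new_row[feature] = (row[feature] - med) / iqr if iqr else 0.0
        scaled.append(new_row)
    return scaled
```

Scaling within each plate keeps plate-level intensity shifts from dominating downstream comparisons, and the median and interquartile range are less sensitive to outlier wells than the mean and standard deviation.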
To test whether there is a difference of morphological profiles at any day of differentiation due to COBLL1 KD, both individuals were analyzed separately using a t-test. To test whether there is a difference of morphological profiles at any day of differentiation between risk and non-risk haplotype, a multi-way analysis of variance (ANOVA) was performed. Differences in morphological profiles between TT (n=7) and CC (n=6) allele carriers were adjusted for sex, age, BMI and batch. To overcome the multiple testing burden, p-values were corrected using the false discovery rate (FDR) as described in the R package “qvalue” (github.com/StoreyLab/qvalue). Features with FDR <5% were classified as significant and filtered based on redundancy and effect size.
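By way of non-limiting illustration, false discovery rate control over a set of per-feature p-values may be sketched as follows. Note that this sketch uses the Benjamini-Hochberg step-up adjustment, which is a simpler, related procedure and is not the Storey q-value estimator implemented in the “qvalue” package named above; the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    This is a stand-in for illustration only; the q-value method additionally
    estimates the proportion of true null hypotheses.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_features(names, pvalues, threshold=0.05):
    """Names of features whose adjusted p-value is below `threshold`."""
    adjusted = benjamini_hochberg(pvalues)
    return [n for n, q in zip(names, adjusted) if q < threshold]
```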
COBLL1 Silencing Using siRNA
All silencing experiments were performed on 4 technical replicates. One day before silencing, AMSCs were plated into 96-well plates with 10K cells/well or collagen IV coated 8-well glass μ-slides with 18K cells/well using growth medium. RNA-based silencing of COBLL1 was performed using RNAiMAX Reagent (ThermoFisher #13778075) and following the manufacturer's protocol. Briefly, Lipofectamine® RNAiMAX Reagent was diluted in Opti-MEM medium (Gibco, Cat #11058021). At the same time, siRNA was diluted in Opti-MEM medium. Then, diluted siRNA was added to the diluted Lipofectamine® RNAiMAX reagent at a ratio of 1:1 and incubated for 5 min. For coated 8-well glass μ-slides, the mixture was incubated for 20 min at RT. The amounts of reagents per well in a 96-well plate were 0.5 μl (10 μM) of silencing oligo (Ambion Cat #4392420, ID s22467) or negative control duplex (Ambion Cat #4390846), and 1.5 μl of Lipofectamine RNAiMAX Reagent. The plate was gently swirled and placed in a 37° C. incubator at 5% CO2 for three days. Cells were then induced to differentiate following the standard differentiation cocktail or harvested for gene expression analysis to assess knockdown efficiency.
RNA Preparation and qPCR
Total RNA was extracted with Trizol (Ambion 15596026) and the Direct-zol RNA MiniPrep Kit (Zymo R2052) following the manufacturer's instructions. cDNA was synthesized with High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems 4368814) following the manufacturer's instructions. qPCR was performed using Thermo Scientific PCR Master Mix (Thermo Scientific K0172) and TaqMan probes for target gene COBLL1 (Thermo Scientific, Cat #4448892, ID Hs01117513_m1) and housekeeping gene CANX (Thermo Scientific, Cat #4448892, ID Hs01558409_m1). Relative gene expression was calculated by the delta delta Ct method. Target gene expression was normalized to expression of CANX.
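By way of non-limiting illustration, the delta delta Ct calculation may be sketched as follows. The sketch assumes an amplification efficiency of 100% (a doubling per cycle); the function name and the example Ct values are hypothetical and are not measured data.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene in a sample versus a control,
    normalized to a reference (housekeeping) gene, by the delta delta Ct method.

    Assumes perfect doubling per PCR cycle (efficiency of 100%).
    """
    delta_sample = ct_target_sample - ct_ref_sample      # e.g. COBLL1 minus CANX, knockdown
    delta_control = ct_target_control - ct_ref_control   # e.g. COBLL1 minus CANX, control duplex
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values for illustration only.
fold = relative_expression(26.0, 20.0, 24.0, 20.0)
print(f"Relative expression: {fold:.2f}")  # a value below 1 indicates reduced expression
```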
RNA-seq reads were trimmed using SeqPurge with the following command:
For transcript-level quantification, trimmed reads were analysed using Kallisto (with 25 bootstraps) and the TPM estimates were log-transformed and the top 10 PCs were computed. Next, reads were summed across all transcripts of a given gene to obtain gene-level estimates of the expression in each sample.
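By way of non-limiting illustration, the aggregation of transcript-level estimates into gene-level estimates may be sketched as follows. The mapping structure, the function names, and the log base and pseudocount used for the log transformation are assumptions made for the sketch; they are not specified in the text above.

```python
import math
from collections import defaultdict


def gene_level_totals(transcript_values, transcript_to_gene):
    """Sum transcript-level estimates over all transcripts of each gene.

    transcript_values: dict of transcript ID -> estimate for one sample
    transcript_to_gene: dict of transcript ID -> gene ID (hypothetical mapping)
    """
    totals = defaultdict(float)
    for transcript_id, value in transcript_values.items():
        totals[transcript_to_gene[transcript_id]] += value
    return dict(totals)


def log_transform(tpm, pseudocount=1.0):
    """log2(TPM + pseudocount); base and pseudocount are assumed, not from the text."""
    return math.log2(tpm + pseudocount)
```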
For splicing analysis with Leafcutter, reads were mapped with STAR using the following arguments: STAR --twopassMode Basic --outSAMstrandField intronMotif --readFilesCommand zcat --outSAMtype BAM Unsorted
And then processed using samtools and regtools to convert to a junc file with the following command: regtools junctions extract -s 1 -a 8 -m 50 -M 500000
Finally, reads were clustered into splicing events with the following command from the Leafcutter project: leafcutter_cluster_regtools.py -j <files> -m 50 -l 500000
These clusters were then converted to transcripts per million and modeled as a function of rs6712203 genotype.
Transcript-level (log) RNA expression was compared between COBLL1 and all other quantified genes using linear regression. The effect of COBLL1 on other genes was estimated with adjustment for expression PCs (described above), sample depot source, cell line, and day of differentiation. This resulted in effect sizes of individual genes in terms of how similar they are to COBLL1. Genes with a Bonferroni adjusted P-value >1e-3, or with an absolute effect size <0.1 or >10, were excluded. This left a list of similarly expressed genes with strong association with COBLL1, which were uploaded to Enrichr and analysed as a gene list against the KEGG, WikiPathways, and HCI pathways. The full set of tests is available at maayanlab.cloud/Enrichr/enrich?dataset=1a9a07019bfd8bbddc6cb6c26641bfcf and the sensitivity evaluation, in which very lowly expressed and highly expressed genes were not excluded (via the thresholds on absolute effect size described above), is available at maayanlab.cloud/Enrichr/enrich?dataset=231b12708d04818007d93364c489fab7.
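By way of non-limiting illustration, the exclusion thresholds stated above may be sketched as a simple filter. The function name, argument names, and the treatment of values exactly at a threshold (retained) are assumptions of the sketch.

```python
def keep_gene(p_value, effect_size, n_tests,
              max_adjusted_p=1e-3, min_abs_effect=0.1, max_abs_effect=10.0):
    """Return True if a gene is retained for the enrichment gene list.

    A gene is excluded when its Bonferroni adjusted P-value is > 1e-3, or when
    its absolute effect size is < 0.1 or > 10.
    """
    adjusted_p = min(1.0, p_value * n_tests)  # Bonferroni adjustment
    if adjusted_p > max_adjusted_p:
        return False
    return min_abs_effect <= abs(effect_size) <= max_abs_effect


def similar_genes(results, n_tests):
    """results: dict of gene -> (p_value, effect_size); returns retained gene names."""
    return [gene for gene, (p, beta) in results.items() if keep_gene(p, beta, n_tests)]
```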
PMCA results were replicated from (Claussnitzer et al., 2014). Briefly, transcription factor binding sites and their co-occurrence across species were tallied and classified into complex and non-complex regions. Complex regions were counted on the basis of motifs aligned across species, and those were then plotted against the Basset scores (below) to discover putative causal variants.
Basset models were trained and evaluated as in (Sinnott-Armstrong et al. 2021). Briefly, models were trained to capture chromatin regulation relevant to adipocyte differentiation and these effects were estimated by determining the difference in effect between alleles at each variant. The variants with the largest effect on accessibility were considered the most important and most likely to be causal.
Allele-specific analyses were performed as in (Sinnott-Armstrong et al. 2021). Briefly, reads were aligned from a heterozygous individual on the basis of the variant and the number of reads supporting each allele were tallied at each timepoint and across variants on the haplotype.
Variants (n=6167) within 100 kb of rs6712203 were included in the analysis. White British individuals in the UK Biobank were analyzed with phenotypes type 2 diabetes (as described in (Eastwood et al. 2016)), log waist-to-hip ratio adjusted for body mass index, hip circumference, and whole body fat mass. Individuals were stratified on the basis of reported sex and filtered to the White British unrelated individuals as described in (Sinnott-Armstrong et al. 2021). Conditional analyses and all associations were performed using Plink2.
EMSA experiments were performed using double stranded Cy5-labelled probes with the risk or non-risk allele of each variant at mid-position. The forward Cy5-labelled strand (for rs6712203 5′-TTAATTTGCCTCATTCATCA[C/T]ATGCAATTCTGGCAAGGAA-3′ (SEQ ID NOS: 57-58) and for rs10195252 5′-CCCCACTTCCCTCTAGGGAA[T/C]GGGAAAGAACATTTAACCT-3′ (SEQ ID NOS: 59-60)) and respective unlabeled reverse complementary strands were synthesized (Eurofins, Ebersberg, Germany), annealed and purified from single stranded remains by excision from a 12% polyacrylamide gel. Nuclear protein extracts from primary mature human adipocytes were extracted according to the protocol described by Dugail and colleagues (Dugail 2001).
For EMSA experiments, 1-2 μl buffer containing 3-5 μg proteins was added to 10 mM TrisHCl (pH 7.5), 1 mM MgCl2, 50 mM NaCl, 0.5 mM EDTA, 4% (v/v) glycerol, 0.5 mM DTT and 30 ng/μl poly(dI-dC). After 10 minutes incubation at 4° C., 1 ng of the respective Cy5-labelled probe was added and the samples were incubated for 20 min at 4° C. After addition of loading buffer with 25 mM TrisHCl pH 7.5, 0.02% OrangeG, 4% glycerol, the samples were loaded onto a nondenaturing 5.3% polyacrylamide gel. After gel separation, Cy5 fluorescence was detected using the Typhoon TRIO I imager (GE Healthcare, Germany).
The Intragenomic Replicates (IGR) method was used for POU2F2 affinity modeling using POU2F2 ChIP-seq data as previously described (Cowper-Sal lari et al. 2012). In order to correct for systematic bias in the sequencing depth around particular k-mers, all scores were offset by a “baseline” value, defined as the average signal between the forward and reverse complement instances of the k-mer between −200 and −195 and between 195 and 199 bases away from the k-mer center. Thus, if the value across the whole −200 to +199 context was approximately equal, then the overall score is approximately zero, and positive estimated affinities are only possible in cases where the score in the middle of the context is significantly higher than the outside. To further include only large effect binding differences, the “prominence” was defined as the maximum score across any point in the context for either the forward or reverse complement version of the k-mer for both alleles and the “maximum difference” as the maximum absolute difference in scores between the two alleles at any point in the window. The “baseline ratio” was defined as the ratio of the maximum difference to the prominence, which varies between 0 (if the two alleles are equal at all points) and 2 (if they are perfectly complementary at their highest absolute point).
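By way of non-limiting illustration, the “prominence”, “maximum difference” and “baseline ratio” defined above may be sketched as follows. The data layout (one baseline-corrected score profile per strand per allele) and the strand-wise comparison of alleles are assumptions of the sketch, as are all names; this is one possible reading and not the IGR implementation itself.

```python
def baseline_ratio(ref, alt):
    """Ratio of the maximum allelic difference to the prominence.

    ref, alt: dicts with keys 'fwd' and 'rc', each a list of baseline-corrected
    scores across the context window (hypothetical layout; all four lists
    must have equal length).
    """
    # Prominence: highest score at any position, either strand, either allele.
    prominence = max(max(profile) for allele in (ref, alt) for profile in allele.values())
    if prominence <= 0:
        raise ValueError("no positive signal in the window; ratio is undefined")
    # Maximum difference: largest absolute gap between alleles at any position.
    max_difference = max(
        abs(r - a)
        for strand in ("fwd", "rc")
        for r, a in zip(ref[strand], alt[strand])
    )
    return max_difference / prominence
```

Under this reading, two identical alleles give a ratio of 0, and the ratio grows as the alleles diverge relative to the strongest signal in the window.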
In order to find only high-quality putative disrupted binding sites, the k-mer sequence that gave the highest affinity under the germline was recorded as “reference” and the k-mer sequence which gives the highest affinity under the somatic variant as “alternate.” The “quality” of a given kmer was defined as the correlation between the average context plot forward and the reverse of the average context plot of the reverse complement, and the “symmetry” of a given k-mer as the correlation between the average context plot forward and the average context plot reverse. Quality is high when the antiparallel binding is preserved and symmetry is high when the peak signal is centered with respect to the variant. The results were included as “passed” when the Bonferroni corrected p-value for the comparison is less than 0.05, the baseline ratio is greater than 0.5, the quality and symmetry are both greater than 0.85 for one of the alleles, and the quality and symmetry are both greater than 0.5 for the other allele.
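By way of non-limiting illustration, the “passed” criteria described above may be expressed as a single predicate. The function and argument names are hypothetical; the thresholds are those stated in the preceding paragraph.

```python
def passes_filters(bonferroni_p, baseline_ratio, allele_a, allele_b):
    """Return True when a candidate disrupted binding site meets all criteria.

    allele_a, allele_b: (quality, symmetry) tuples for the two alleles.
    """
    def both_above(quality_symmetry, threshold):
        quality, symmetry = quality_symmetry
        return quality > threshold and symmetry > threshold

    # One allele must clear the strict threshold, the other the relaxed one.
    strict_and_relaxed = (
        (both_above(allele_a, 0.85) and both_above(allele_b, 0.5))
        or (both_above(allele_b, 0.85) and both_above(allele_a, 0.5))
    )
    return bonferroni_p < 0.05 and baseline_ratio > 0.5 and strict_and_relaxed
```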
A global gene expression measurement was performed, using Illumina HumanRef-8 v.3 BeadChip microarrays from whole abdominal subcutaneous adipose tissue. Signal intensities were quantile normalized before the correlation analysis.
To edit the rs6712203 heterozygous allele in SGBS preadipocytes to the homozygous risk (CC) and non-risk (TT) alleles, Applicants applied the CRISPR/Cas9 homology directed repair genome editing approach. The hCas9 vector was purchased from Addgene (Plasmid ID #41815). The guide sequence was selected using the design tool (Zhang Lab, MIT 2013) with a predicted number of 228 potential off target sites, located 211 bp upstream of rs6712203. It was cloned in front of the U6 promoter into the BbsI cloning site of the sgRNA-expression vector (Dr. Ralf Kühn, Helmholtz Zentrum München-Neuherberg), using double stranded oligonucleotides 5′-CACCGACTCTCCACTACCATTGCCA-3′ (SEQ ID NO: 61) and 5′-AAACTGGCAATGGTAGTGGAGAGTC-3′ (SEQ ID NO: 62). For amplification of the 2009 bp homology region with the risk or non-risk allele of rs6712203 at mid position, genomic DNA of SGBS cells was amplified with primers 5′-GGTGGTCCCATTAAAAAGAAAGAAGCTTGG-3′ (SEQ ID NO: 63) and 5′-CTTCTCTTTTACCCTGCTGGCTACTGGTTG-3′ (SEQ ID NO: 64) using the High-Fidelity Q5 DNA polymerase (NEB). The gel purified PCR product was cloned into the blunt end pJet1.2 vector using the CloneJET PCR Cloning kit (Fermentas). A clone with the rs6712203 C allele was selected and the corresponding T allele vector was generated using the Q5 Site-Directed Mutagenesis Kit (NEB) with the primers 5′-TCATTCATCATATGCAATTCTGG-3′ (SEQ ID NO: 65) and 5′-GGCAAATTAATATTTAGGATTATATC-3′ (SEQ ID NO: 66). To avoid Cas9 reactivity after genome editing, the NGG guide target sequence was mutated to NCG in both homology vectors with the primers 5′-CCATTGCCAACGGCTGAGTCAG-3′ (SEQ ID NO: 67) and 5′-TAGTGGAGAGTTCTCACAAAAC-3′ (SEQ ID NO: 68). SGBS cells were co-transfected with the GFP (Lonza), the hCas9, the respective sgRNA, and the pMACS 4.1 (Miltenyi) plasmids using the Amaxa-Nucleofector device (program U-033) (Lonza). The cells were sorted using the MACSelect™ Transfected Cell Selection kit (Miltenyi).
The integrity of each edited vector construct and the SGBS cell nucleotide exchange was confirmed by DNA sequencing (Eurofins, Ebersberg, Germany).
For the production of lentiviral particles, the MISSION® Lentiviral Packaging Mix (Sigma Aldrich, Steinheim, Germany) was used according to the manufacturer's instructions. Briefly, packaging cells HEK293T were grown in a low antibiotic growth medium (DMEM, 10% FCS, 0.1% penicillin/streptomycin). When cells were about 70% confluent, they were co-transfected, using X-treme GENE HP (Roche, Penzberg, Germany), with the packaging plasmid pCMVdeltaR8.91, the envelope plasmid pMD2.G and the pLKO-based plasmid containing shRNA against the human target gene COBLL1 NM_014900.2-3071s1c1, COBLL1 NM_014900.2-4440s1c1, GRB14 NM_004490.1-1581s1c1 or empty-vector MISSION® TRC2 pLKO.5-puro plasmid (Sigma Aldrich, Steinheim, Germany). The cells were incubated for 24 hours, after which the medium was discarded and replaced with a serum rich medium (30% FCS). The supernatant containing the viable virus particles was collected 48 and 72 hours post transfection, centrifuged to remove cellular debris, and stored at −80° C.
SGBS cells were seeded at a concentration of 2.6×10^4 cells per 6-well plate and grown in normal growth medium. After 24 hours the medium was replaced and supplemented with 8 μg/ml Polybrene (Sigma-Aldrich, Steinheim, Germany) and virus supernatant with a multiplicity of infection (MOI) of 2. On the consecutive 2 days cells were washed with PBS and medium was replaced to remove the virus. The medium was supplemented with 0.5 μg/ml puromycin 96 hours after infection, to select stable clones. When cells had grown to confluence, puromycin was removed from the medium and the cells were differentiated until day 16. Target gene silencing was confirmed after selection and on the day of each experiment by qRT-PCR.
Cells were grown to confluence and differentiated until day 16 in 6 well plates. Cells were collected in a GPDH buffer with 0.05 M Tris/HCl (pH 7.4), 1 mM EDTA and 1 mM Mercaptoethanol, before they were stored at −80° C. until further use. Samples were gently defrosted at 4° C., sonicated for 7 sec at 29% and centrifuged for 10 min at 10,000×g at 4° C. GPDH activity was measured as previously described (Pairault and Green 1979). Briefly, GPDH activity was assessed by measuring the conversion of dihydroxyacetone phosphate (DHAP) (Sigma, St. Louis, USA), in the presence of the coenzyme nicotinamide adenine dinucleotide (NADH) (Omnilab-Applichem, Bremen, Germany) at a wavelength of 340 nm, using the Tecan Infinite 200 (Tecan, Crailsheim, Germany). Protein concentrations were assessed using the BCA-RAC protein assay kit (Thermo Scientific, Germany), with BSA standard samples in GPDH buffer for quantification. The value for each condition was calculated using the ratio between GPDH activity and protein concentration.
For glucose, glycerol and Western Blot analysis, shRNA COBLL1 and shRNA empty-vector SGBS cells were differentiated until day 16 in 6 well plates. The insulin-stimulated 2-deoxy-D-glucose (2-DG) uptake experiment was performed as previously described (Claussnitzer et al. 2011). Briefly, cells were incubated in glucose-free DMEM and F12 (1:1) containing 1% penicillin/streptomycin, 16 μM biotin, 36 μM pantothenic acid, 14.3 mM NaHCO3 and 0.5 mM Na-pyruvate (Sigma-Aldrich, Steinheim, Germany) for 12 hours. The medium was replaced with 118 mM NaCl, 1.2 mM KH2PO4, 4.8 mM KCl, 1.2 mM MgSO4, 2.5 mM CaCl2, 10 mM HEPES, 2.5 mM Na-pyruvate (Sigma-Aldrich, Steinheim, Germany), 0.5% BSA (Sigma-Aldrich, Steinheim, Germany) (pH 7.35). After 1.5 hours the same buffer was added fresh either without supplement or with 1 μM insulin for 30 min. The radioactive uptake was started by addition of KRH [3H]-2-deoxy-D-glucose ([3H]-2-DG) at an activity of 1 μCi/ml and 50 μM 2-deoxy-D-glucose. Cells were incubated for 30 min and then washed with PBS. The cells were scraped off after addition of 200 μL IGEPAL and 150 μM phloretin. The radioactivity was measured using liquid scintillation counting with an external standard.
For the measurement of glycerol release, cells were washed with PBS and incubated for 3 hours in phenol red free DMEM containing 2% FFA (free fatty acid)-free BSA (Roth, Karlsruhe, Germany). The medium was changed and the cells were incubated for 1 hour without supplement for basal lipolysis or addition of 10 μM Isoproterenol (Sigma-Aldrich, Steinheim, Germany) and 0.5 mM IBMX for stimulated lipolysis. The supernatant was collected for spectrophotometric glycerol measurement in a Sirius tube luminometer (Berthold Technologies, Bad Wildbad, Germany), using the glycerokinase (Sigma-Aldrich, Steinheim, Germany) and the ATP Kit SL (BioThema, Handen, Sweden). Remaining cells were collected for protein quantification and Western Blot analysis in RIPA buffer containing 50 mM TrisHCl (pH 8), 150 mM NaCl, 0.2% SDS, 1% NP-40, 0.5% deoxycholate, 1 mM PMSF, phosphatase and protease inhibitors. Western Blot analysis was performed using a mouse anti-human GAPDH IgG (Ambion-Thermo Fisher Scientific, Waltham, USA) and the Lipolysis Activation Antibody Sampler Kit #8334 (Cell Signaling, Danvers, USA) according to the manufacturer's protocol. Secondary IRDye IgG (LI-COR, Bad Homburg, Germany) were used to generate the fluorescence, detected by the Odyssey scanner (LI-COR, Bad Homburg, Germany).
Relative Gene Expression qRT-PCR
Primer pairs were designed using published nucleotide sequences from the human genome GenBank NCBI/UCSC and Ensembl. “primer3input” (Untergasser et al. 2012) was used for primer design, “net primer” (Premier Biosoft, San Francisco, USA) for optimization and “primer blast” NCBI GenBank (Ye et al. 2012) to verify specificity against the gene of interest. Primers against the human target genes LEPTIN (forward TGGGAAGGAAAATGCATTGGG (SEQ ID NO: 69); reverse ATAAGGTCAGGATGGGGTGG (SEQ ID NO: 70)) and GLUT4 (forward CTGTGCCATCCTGATGACTG (SEQ ID NO: 71); reverse CCAGGGCCAATCTCAAAA (SEQ ID NO: 72)) and the reference genes HPRT (forward TGAAAAGGACCCCACGAAG (SEQ ID NO: 73); reverse AAGCAGATGGCCACAGAACTAG (SEQ ID NO: 74)), PPIA (forward TGGTTCCCAGTTTTTCATC (SEQ ID NO: 75); reverse CGAGTTGTCCACAGTCAGC (SEQ ID NO: 76)) and IPO8 (forward CGGATTATAGTCTCTGACCATGTG (SEQ ID NO: 77); reverse TGTGTCACCATGTTCTTCAGG (SEQ ID NO: 78)) were synthesised by Eurofins (Ebersberg, Germany).
Total RNA was extracted using the RNeasy Mini Kit (Qiagen, Hilden, Germany) and 0.5 μg was reverse transcribed using the High-Capacity cDNA Reverse Transcription Kit (Thermo Fisher Scientific, Waltham, USA). qRT-PCR was performed using 96 well plates (black frame, white wells) with Heat Sealing Films, fixed by the 4s2 Automated Heat Sealer (all from 4titude, Surrey, UK). The Maxima SYBR Green Mix (Thermo Fisher Scientific, Waltham, USA) was used for amplification in a qRT-PCR Mastercycler® ep realplex (Eppendorf, Hamburg, Germany), with a denaturation step of 95° C. for 10 min and 40 cycles of 95° C. for 15 sec and 60° C. for 40 sec, followed by a melting curve. Relative gene expression was calculated by the delta delta Ct method (Pfaffl 2001) with a reference gene index of HPRT, PPIA and IPO8.
All mice (C57BL/6J) were originally obtained from Charles River Laboratories, Inc. (Wilmington, Massachusetts, USA). To genetically engineer a Cobll1 whole-body knockout (Cobll1−/−) model, Applicants used the CRISPR/Cas9 genome editing system. Male mice were weaned at 4 weeks of age, and body weight was measured every week from 4 to 14 weeks of age. Mice were housed on a 12-hour light/dark cycle with ad libitum access to food (Normal diet: 14% fat, 64.8% carbohydrate, and 21.2% protein, Harlan Teklad). In order to analyze the body fat mass (%), body length (cm), and bone mineral density (BMD, g/cm2), Applicants used the Dual-Energy X-ray Absorptiometry (DEXA) scan. Prior to scanning, animals were anesthetized with ketamine. All procedures were conducted with approval of the Institutional Animal Care and Use Committee (IACUC) of University of Chicago.
To confirm directly that ablation of Cobll1 affects T2D-related phenotypes in vivo, Applicants applied the CRISPR/Cas9 system to genetically engineer a Cobll1 whole-body knockout (Cobll1−/−) model. Using specific guide RNAs (sgRNAs), Applicants targeted the Cobll1 gene in the C57BL/6 genetic background. Mice homozygous for a Cobll1-null allele are viable with no evidence of embryonic lethality (data not shown). Applicants used guides with the following sequences: gRNA (exon 2) 5′-TTGCTCACTAGTGGGGTCGCAGG-3′ (SEQ ID NO: 79) and gRNA (exon 6) 5′-CTTCCTCCGGCCGAGACGAAGGG-3′ (SEQ ID NO: 80).
The genotypes of Cobll1 mutant mice were determined by PCR amplification of genomic DNA extracted from tails. PCR was performed for 30 cycles at 95° C. for 30 sec, 60° C. for 15 sec, and 72° C. for 30 sec, with a final extension at 72° C. for 5 min. PCR amplification was performed using the primer sets: Forward 5′-AAAAGTTTCCTGATGTGAAAGTCA-3′ (SEQ ID NO: 81) and Reverse 5′-AAAAACAGATGCTCCCCAGA-3′ (SEQ ID NO: 82). The PCR products were size-separated by electrophoresis on a 4% agarose gel for 1 h.
At 16 weeks of age, the animals were tested for glucose sensitivity by intraperitoneal glucose tolerance test (IPGTT). Prior to IPGTT, mice were fasted for 4 h and an initial blood glucose reading was taken. This fast was followed by intraperitoneal injection of 2 mg/kg dextrose (Millipore Sigma), and subsequent blood glucose checks using an AccuChek Aviva glucometer (Roche). Blood glucose readings were taken at 15, 30, 60, and 120 min after dextrose injection. After IPGTT, mice resumed a high fat diet. An unpaired two-sided Student's t-test was used to test for significance.
Mouse Real Time qPCR
After establishment of stable Cobll1 knockout mice, the ablation of the Cobll1 expression was confirmed by quantitative RT-PCR in relevant tissues which showed significant decrease in the mRNA fold change of Cobll1 knockout mice compared to WT and heterozygous litter mates. Total RNA was isolated from the inguinal white fat pad (iWAT), kidney and liver using the RNA extraction reagent RNeasy Mini Kit (Qiagen). cDNA synthesis was performed using SuperScript III First-Strand Synthesis System (Thermo Fisher Scientific). Real time qPCR reactions were performed by using SsoAdvanced Universal SYBR Green Supermix. Real time qPCR amplification was performed using the primer sets: qPcrF 5′-CGTCACAGAGCAACAAGACA-3′ (SEQ ID NO: 83) and qPcrR 5′-ACTGAGCACAGAGGAACACG-3′ (SEQ ID NO: 84).
Total RNA was isolated from the inguinal white fat pad (iWAT) using the RNA extraction reagent RNeasy Mini Kit (Qiagen) from adult Cobll1 null mice and WT litter mate. The RNA-sequencing libraries were generated using the NEBNext Ultra™ II RNA Library Prep (New England Biolabs) and were sequenced on Illumina NovaSEQ platform (Illumina).
Primary adipocytes were isolated from dissected perigonadal white fat pad (pWAT) of 6-week-old mice and digested in 1 g/mL type I collagenase solution (containing 3.5% BSA, v/v) in a 37° C. water bath with shaking at 120 rpm for 45 min. The suspension was centrifuged at 250×g for 5 min, and then the cell pellet was resuspended in culture media (DMEM High Glucose, 20% FBS, 100 units/ml penicillin and 0.1 mg/ml streptomycin), filtered through a 45-μm strainer, and seeded in 25-cm2 flasks. Confluent pre-adipocytes were induced for two days with an adipogenic media (DMEM High Glucose, 10% FBS, Penicillin/Streptomycin (10000 U/ml, 10000 μg/ml), 850 nM insulin, 1 nM T3, 500 μM IBMX, 1 μM Dexamethasone, 125 μM Indometacin and 1 μM Rosiglitazone), and then switched to differentiation medium (adipogenic media without IBMX, Dexamethasone and Indometacin). Cells were harvested on the 8th day of differentiation and used for further analysis.
Oil Red O staining was used to assess the presence of lipids in mature adipocytes. For Oil Red O staining, cells were washed with phosphate-buffered saline (PBS) and fixed with 4% paraformaldehyde. The fixed cells were then covered with 3 mg/ml Oil Red O dissolved in 60% isopropanol (v/v) for 20 min and then the dye was washed away with H2O. For determination of GPDH activity Applicants used a commercially available kit from TAKARA Bio Inc. (Shiga, Japan), by monitoring the dihydroxyacetone phosphate-dependent oxidation of NADH at 340 nm. The enzyme activity was calculated by the formula described in the manufacturer's protocol and GPDH activity was expressed as unit/mg of protein.
Applicants developed a novel model to link in vitro LipocyteProfiler features to histology cell size estimate features, and showed that these features, independently and together, can be linked to clinical characteristics. Applicants used a comprehensive and multimodal databank of adipose-derived mesenchymal cells (AMSCs) at Melina Claussnitzer Lab (MCL). The databank is a unique resource to investigate associations between in-vivo, cellular, and clinical characteristics of patients. Applicants have used the data to develop a novel analytical pipeline for predicting clinical characteristics in patients. Applicants used datasets for two depots, visceral and subcutaneous adipose cells, containing the cell areas from histology images as reported by Glastonbury C A, Pulit S L, Honecker J, et al. (Machine Learning based histology phenotyping to investigate the epidemiologic and genetic basis of adipocyte morphology and cardiometabolic traits. PLOS Comput Biol. 2020; 16 (8): e1008044. Published 2020 Aug. 14. doi: 10.1371/journal.pcbi.1008044), morphological features identified by LipocyteProfiler (see Example 1), and clinical characteristics of 32 patients (
Applicants first confirmed the known associations between histology-derived features and BMI of patients, and showed that the computational pipeline could identify novel associations between histology-derived features and type-2 diabetes (T2D) in visceral samples. Applicants show some of the associations between in-vitro cellular features and clinical characteristics in Example 1, and expanded upon these results by identifying novel associations between the cellular traits and in-vivo histology derived traits. Applicants show that in-vitro features can be used to estimate histology features (mainly in subcutaneous depot) and similarly the in-vivo features can be used to estimate a diverse set of cellular features in both depots and during the examined differentiation time points (days 0, 3, 8, 14).
Applicants hypothesized that by using linear models with an expanded set of features, associations between the traits can be identified (
The method for predicting any of in vitro, in vivo and clinical characteristics uses preprocessing pipelines. Applicants used two preprocessing pipelines to prepare the in-vivo histology traits, from the Adipocyte U-Net, and in-vitro cellular traits, from the LipocyteProfiler pipeline outputs.
In-vivo histology traits were processed to generate histology features. Applicants previously showed the association between the mean cell sizes and BMI. In order to increase the dimensionality of the features extracted from histology images and to be able to predict further clinical characteristics, Applicants defined five cell-size categories and calculated four features per category. Adipocyte U-Net reported 500 cell areas (μm2) per patient. For each depot, Applicants calculated the mean, median, and 25% and 75% quartile points of the 500 cell areas. These values were then used to define five cell-size categories of ‘very small’, ‘small’, ‘medium’, ‘large’, ‘very large’ per depot (‘very small’: cell area <25% point; ‘small’: 25% point≤cell area <median; ‘medium’: median≤cell area <mean; ‘large’: mean≤cell area <75% point, and ‘very large’: 75% point≤cell area). For every sample, Applicants grouped the 500 cell areas into the five categories, and for each category Applicants calculated (i) the fraction of the number of cells in the category over 500, (ii) the median area, and (iii) the 25% and (iv) the 75% quartile points. Therefore, the histology traits of every sample are captured and represented with 20 features (5 categories×4 variables) (
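By way of non-limiting illustration, the derivation of the 20 histology features from the cell areas of one sample may be sketched as follows. The function and feature names, the "inclusive" quartile convention, and the handling of categories containing fewer than two cells are assumptions of the sketch and are not taken from the pipeline itself.

```python
from bisect import bisect_right
from statistics import quantiles

CATEGORIES = ("very small", "small", "medium", "large", "very large")


def histology_features(areas, thresholds):
    """Return a dict of 20 features (5 size categories x 4 variables).

    areas: cell areas (um^2) for one sample, e.g. the 500 areas per patient.
    thresholds: depot-level (25% point, median, mean, 75% point), assumed to be
    in ascending order, which holds for right-skewed area distributions.
    """
    groups = {category: [] for category in CATEGORIES}
    for area in areas:
        # bisect_right places an area equal to a threshold in the upper category,
        # matching the "threshold <= cell area" boundaries in the text.
        groups[CATEGORIES[bisect_right(thresholds, area)]].append(area)

    features = {}
    for category, values in groups.items():
        if len(values) >= 2:
            q25, med, q75 = quantiles(values, n=4, method="inclusive")
        elif values:
            q25 = med = q75 = values[0]   # single cell: all summaries coincide
        else:
            q25 = med = q75 = None        # empty category: summaries undefined
        features[f"{category}: fraction"] = len(values) / len(areas)
        features[f"{category}: median area"] = med
        features[f"{category}: 25% point"] = q25
        features[f"{category}: 75% point"] = q75
    return features
```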
In-vitro cellular traits were processed to generate morphology features. Examples of features used to represent LipocyteProfiler images are shown.
Applicants examined associations between the three datasets (in-vitro and in-vivo imaging traits and the clinical characteristics of the patients). This contains four sets of analyses: Applicants investigated estimating every variable from the clinical characteristics using (i) in-vitro and (ii) in-vivo imaging traits, (iii) estimating in-vivo imaging traits using the in-vitro variables, and (iv) estimating in-vitro imaging traits using the in-vivo variables.
Applicants used the analysis for estimating clinical characteristics from the in-vivo traits as an example; the same process applies to all four sets of analyses. To estimate a clinical characteristic, a logistic regression model (a generalized linear model (GLM) with logit link) was fit on the entire set of the imaging traits. The linear association with binomial distribution was implemented using the R glm function. The default glm convergence criterion on deviances was used to stop the iterations. The DeLong method was used to calculate confidence intervals for the c-statistics. The Bonferroni-adjusted Pearson correlations between the actual and estimated values are also reported. For every clinical variable, Applicants used forward feature selection (R step function) to select the most important imaging traits. The Akaike information criterion (AIC) was used as the stop condition for the feature selection procedure. The R function preProcess was used to normalize (center and scale) the non-dichotomous variables.
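As a non-limiting illustration of two of the quantities named above, the following Python sketch centers and scales a variable and computes a c-statistic (the probability that a randomly chosen case receives a higher estimated value than a randomly chosen control, with ties counted as one half). It does not reproduce the R glm, step, or preProcess functions or the DeLong confidence intervals; the function names and the use of the sample standard deviation are assumptions of the sketch.

```python
from statistics import mean, stdev


def center_and_scale(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        raise ValueError("cannot scale a constant variable")
    return [(v - mu) / sd for v in values]


def c_statistic(labels, scores):
    """Concordance between binary labels (1 = case, 0 = control) and model scores."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(cases) * len(controls))
```

A c-statistic of 0.5 corresponds to an uninformative model and 1.0 to perfect separation of cases from controls.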
Applicants used histology-derived size estimates to model clinical characteristics.
Applicants also used LipocyteProfiler traits to model clinical characteristics.
Applicants also used LipocyteProfiler traits to model histology-derived size estimates.
Applicants used clinical characteristics, histology, and Adipocyte Profiler derived morphological traits to study associations between the traits. Applicants developed methods of modeling clinical characteristics. In one example, histology adipocyte size traits were used. While most of the clinical characteristics could be modeled using the visceral adipose samples, the models on the subcutaneous samples showed partial success for BMI, weight, and T2D. In another example, Adipocyte Profiler traits were used. Most clinical characteristics could be modeled at some scattered differentiation time points. Applicants observed no trend in the success rate of the models.
Applicants also show modeling histology-derived adipocyte size traits using in-vitro Adipocyte Profiler features. Higher rates of success were observed during early differentiation days using the visceral cohort. Alternatively, using the subcutaneous cohort the traits could be modeled at almost all time points.
Applicants also show modeling cellular adipocyte traits using histology-derived size estimates. A variety of traits from the compartment subgroups (AGP, BODIPY, DNA, Mito, Other) could be modeled at different differentiation time points.
The modeling of clinical characteristics using histology-derived adipocyte traits aligns with the current knowledge. The results connecting in-vitro Adipocyte Profiler high-content imaging traits to clinical traits are shown herein for the first time. Histology-derived size estimates can be modeled by in-vitro Adipocyte Profiler traits, which validates the in-vitro adipocyte model system. The results also show novel modeling of in-vitro Adipocyte Profiler traits using histology-derived adipocyte size estimates.
Using PheWAS jointly analyzing many traits and disease states (Taliun et al. 2020), Applicants found rs12454712 to be associated with a number of metabolic traits.
To identify the possible effector transcript(s) mediating risk at the 18q21.33 locus, Applicants next used orthogonal approaches assessing the regulatory architecture surrounding rs12454712. Three-dimensional chromosomal conformation in human mesenchymal stem cells (Dixon et al. 2015) shows that the locus has a concise topologically associated domain (TAD) structure, encompassing three coding genes: BCL2, KDSR, and VPS4B.
To identify possible functional consequences of rs12454712 in adipocytes, Applicants compared morphological and cellular profiles from TT and CC allele carriers in primary human subcutaneous and visceral AMSCs throughout adipocyte differentiation.
Intriguingly, although target gene expression changes were restricted to undifferentiated pre-adipocytes, the described haplotype-driven cellular consequences on mitochondria manifested in maturing adipocytes. To further assess the effect of target gene expression changes in adipocyte progenitors on function in mature adipocytes, Applicants next correlated BCL2, KDSR and VPS4B gene expression across 26 subcutaneous pre-adipocytes at day 0 (the cell stage in which Applicants see a genotype-driven effect on BCL2 and KDSR gene expression) with their morphological profile at day 8 (the cell stage where Applicants observed haplotype-driven effects on mitochondrial morphology and function). Applicants found that BCL2 and KDSR expression in undifferentiated AMSCs correlated with mitochondrial features at day 8 that resembled haplotype-driven effects when comparing TT with CC allele carriers at this time-point.
In terminally differentiated subcutaneous AMSCs (day 14), the TT risk haplotype manifested in a cellular profile that differed in 171 features from the CC non-risk haplotype. Those features spread across all four channels and across all feature classes.
To identify possible, Applicants generated a network consisting of all genes associated with haplotype-driven differential features (<5% FDR) at day 8 based on a linear regression model of LipocyteProfiler-derived features and transcriptome-wide gene expression data of subcutaneous differentiated adipocytes (day 14). Applicants identified 2539 genes that associated significantly (FDR 0.1%) with the morphological and cellular profile of the rs12454712 genotype in subcutaneous adipocytes. The identified genes were significantly enriched for pathways characterizing fatty acid catabolic process (GO: 0009062) and apoptosis (GO: 1900117, 1900118, 1900119) (Table 18). Together, both morphological profiling and gene expression results point towards rs12454712 mediating apoptotic and lipid degradation processes.
To further validate whether the rs12454712-associated morphological profile resembles cellular signatures that Applicants would expect to see in a state of increased apoptosis, Applicants next generated a cellular reference profile of apoptosis by silencing the well-known anti-apoptotic gene BCL2 using siRNA (~60% knockdown efficiency).
To test if siBCL2-induced cellular changes are associated with altered mitochondrial ROS in a similar fashion as Applicants observed for the rs12454712 genotype, Applicants next applied the same ML-based approach used to predict ROS levels in TT versus CC allele carriers, now in LipocyteProfiles from siBCL2-treated cells. Applicants found that at day 8 and 14 of adipocyte differentiation, adipocytes of BCL2-KD had reduced predicted ROS levels.
Applicants next investigated whether these BCL2-KD-induced morphological changes translate into altered mitochondrial respiration in a mitochondrial stress test using the Seahorse Bioflux Analyser. BCL2-KD increased oxygen consumption rate (OCR) and extracellular acidification rate (ECAR).
Due to their apoptotic properties, BCL2 inhibitors are currently used in the clinic for chronic lymphocytic leukemia and small lymphocytic lymphoma. Importantly, pharmacological inhibition of BCL2 using venetoclax has been reported to cause hyperglycemia in 16% of patients and severe hyperglycemia in 5% in a 1 year follow up clinical trial (Roberts et al. 2016), as well as loss of body weight in 11-13% of cases, indicating that BCL2 inhibition can lead to systemic metabolic adverse effects including reduced insulin sensitivity.
In visceral AMSCs, Applicants observed a rs12454712 genotype-driven effect on predominantly mitochondrial-associated morphological features at day 14 of differentiation.
Specifically, higher expression of VPS4B (as seen in pre-adipocytes of the TT risk allele) correlated negatively with Mito intensity and texture, but positively with a feature describing overlap between mitochondrial and lipid stain, suggesting a profile of reduced mitochondrial membrane potential and higher colocalization of mitochondria to lipid droplets. This cellular profile is indicative of altered thermogenesis, as mitochondria are anchored to lipid droplets during ATP production and lipid droplet expansion but dissociate during browning-induced fatty acid oxidation (Benador et al. 2018). Disruption of this process has been linked to reduced insulin sensitivity in adipose tissue.
To assess whether visceral AMSCs from TT risk allele carriers resemble morphological profiles that Applicants would expect in a state of reduced thermogenic capacity, Applicants next compared rs12454712-driven morphological signatures with that of isoproterenol-treated visceral AMSCs at day 14 (see Examples 1 and 3). Isoproterenol is an adrenergic agonist known to induce adipocyte browning and increase thermogenic capacity. Applicants found features that are significantly different following isoproterenol treatment and/or between the rs12454712 haplotypes (FDR <5%). Those features mapped predominantly to the lipid and mitochondria channels.
Applicants finally sought to decipher whether the mechanisms identified for rs12454712 would align with global cellular drivers of polygenic risk for increased WHRadjBMI. Applicants compared morphological profiles of high and low polygenic risk female individuals for WHRadjBMI.
Taken together, Applicants have deciphered multiple mechanisms underlying a metabolic risk locus of previously unknown function that presents pleiotropy at every layer of its regulatory circuitry. Applicants have shown that rs12454712 regulates at least three target genes, in three tissues with distinct cellular and morphological consequences, that converge to modulate disease susceptibility and together manifest in a complex metabolic phenotype. Applicants' findings highlight the complexities that one encounters when dissecting disease-associated loci in humans. Here Applicants have showcased a framework based on integration of high content imaging coupled with transcriptomics in a relatively small set of primary human AMSCs that enables unbiased mechanistic interrogation of genetic risk loci. Specifically, this allowed Applicants to i) unravel the spatio-temporal complexities of a risk locus that modulated target gene expression at a specific developmental window and manifested in cellular phenotypes at another, and to ii) identify cellular mechanisms by comparing haplotype-driven morphological profiles with signatures of cellular traits (e.g. ROS, apoptosis and thermogenesis).
In conclusion, natural genetic variation in human primary cells manifested in cellular profiles that made it possible to assign molecular mechanisms of the rs12454712 locus that are consistent with an organismal phenotype of adverse body fat distribution and metabolic disease.
BCL2 Silencing Using siRNA
All silencing experiments were performed on 4 technical replicates. One day before silencing, AMSCs were plated into 96-well plates with 10K cells/well using growth medium. RNA-based silencing of BCL2 was performed using RNAiMAX Reagent (ThermoFisher #13778075) following the manufacturer's protocol. Briefly, Lipofectamine® RNAiMAX Reagent was diluted in Opti-MEM medium (Gibco, Cat #11058021). At the same time, siRNA was diluted in Opti-MEM medium. Then, diluted siRNA was added to the diluted Lipofectamine® RNAiMAX reagent at a ratio of 1:1 and incubated for 5 min. The amounts of reagents per well in a 96-well plate were 0.5 μl (10 μM) of silencing oligo (Ambion Cat #4392421, ID s1915) or negative control duplex (Ambion Cat #4390844), and 1.5 μl of Lipofectamine® RNAiMAX Reagent. The plate was gently swirled and placed in a 37° C. incubator at 5% CO2 for three days. Cells were then induced to differentiate following the standard differentiation cocktail or harvested for gene expression analysis to assess knockdown efficiency.
RNA Preparation and qPCR
Total RNA was extracted with Trizol (Ambion 15596026) and the Direct-zol RNA MiniPrep Kit (Zymo R2052) following the manufacturer's instructions. cDNA was synthesized with High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems 4368814) following the manufacturer's instructions. qPCR was performed using Thermo Scientific PCR Master Mix (Thermo Scientific K0172) and TaqMan probes for target gene BCL2 (Thermo Scientific, Cat #4448892, ID Hs04986394_s1) and housekeeping gene CANX (Thermo Scientific, Cat #4448892, ID Hs01558409_m1). Relative gene expression was calculated by the delta delta Ct method. Target gene expression was normalized to expression of CANX.
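For illustration, the delta delta Ct calculation referred to above can be written as a short Python function. The function and argument names are hypothetical, and the sketch assumes the usual simplification that the amount of product doubles with each PCR cycle (approximately 100% amplification efficiency) for both the target and the housekeeping gene.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene (e.g. BCL2) versus a control condition.

    Each Ct is normalized to the housekeeping gene (e.g. CANX) first,
    then the treated sample is compared with the control sample.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    # Assumes doubling per cycle, hence base 2.
    return 2.0 ** (-delta_delta_ct)
```

For example, a target gene that crosses threshold two cycles later (relative to the housekeeping gene) in siRNA-treated cells than in negative-control cells yields a relative expression of 0.25, i.e. a 75% knockdown under these assumptions.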
Applicants obtained subcutaneous and visceral adipose tissue histology slides from a total of 188 morbidly obese male (35%) and female (65%) patients undergoing a range of abdominal laparoscopic surgeries (sleeve gastrectomy, fundoplication or appendectomy). The visceral adipose tissue is derived from the proximity of the angle of His and subcutaneous adipose tissue obtained from beneath the skin at the site of surgical incision. Images were acquired at 20× magnification with a micron per pixel value of 0.193 μm/pixel. Collagenase digestion and size determination of mature adipocytes was performed as described previously. All samples had genotypes called using the Illumina Global Screening beadchip array.
DNA was extracted and sent to the Oxford Genotyping Center for genotyping on the Infinium HTS assay on Global Screening Array bead-chips. Genotype QC was done using GenomeStudio and genotypes were converted into PLINK format for downstream analysis.
Applicants checked sample missingness but found no sample with missingness >5%. For the remaining sample quality control (QC) steps, Applicants reduced the genotyping data down to a set of high-quality SNPs. These SNPs were:
Using the remaining ~65,000 SNPs, Applicants checked samples for inbreeding (--het in PLINK), but found no samples with excess homozygosity or heterozygosity (no sample >6 standard deviations from the mean). Applicants also checked for relatedness (--genome in PLINK) and found one pair of samples to be identical; Applicants kept the sample with the higher overall genotyping rate. Finally, Applicants performed PCA using EIGENSTRAT and projected the samples onto data from HapMap3, which includes samples from 11 global populations. Six samples appeared to have some amount of non-European ancestral background, while the majority of samples appeared to be of European descent. Applicants initially removed no samples at this step, electing to adjust for principal components in genome-wide testing. However, adjustment for principal components failed to eliminate population stratification, and Applicants therefore restricted to samples of European descent only, defined as samples falling within ±10 standard deviations of the first and second principal component values of the CEU (Northern and Western European-ancestry samples living in Utah) and TSI (Tuscans in Italy) samples included in the HapMap 3 dataset. Finally, sex information was received after initial sample QC was complete. As a result, one sample with potentially mismatching sex information (comparing genotypes and phenotype information) was discovered after analyses were complete and therefore remained in the analysis.
Applicants removed all SNPs with missingness >5% and all SNPs out of Hardy-Weinberg equilibrium (HWE; P<1×10⁻⁶). Applicants also removed monomorphic SNPs. Finally, Applicants set heterozygous haploid sites to missing to enable downstream imputation.
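The SNP-level filters above amount to three threshold checks per variant, which the following Python sketch illustrates. It is not the PLINK workflow used by Applicants: the SnpStats record and its field names are hypothetical, and the HWE p-value is assumed to have been computed beforehand by a separate tool.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    """Hypothetical per-SNP summary; field names are assumptions of this sketch."""
    snp_id: str
    n_missing: int           # samples with no genotype call
    n_samples: int           # total samples
    minor_allele_count: int  # 0 means the SNP is monomorphic
    hwe_p: float             # precomputed Hardy-Weinberg p-value


def passes_snp_qc(snp, max_missing=0.05, min_hwe_p=1e-6):
    """Apply the missingness, HWE and monomorphic filters described in the text."""
    if snp.n_missing / snp.n_samples > max_missing:
        return False  # missingness > 5%
    if snp.hwe_p < min_hwe_p:
        return False  # out of HWE
    if snp.minor_allele_count == 0:
        return False  # monomorphic
    return True


def filter_snps(snps):
    """Return the IDs of SNPs that pass all three filters."""
    return [s.snp_id for s in snps if passes_snp_qc(s)]
```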
The final cleaned dataset included 190 samples and ˜700,000 SNPs. Applicants note that histology data was not available for all genotyped samples.
For the genotyped cohorts without imputation data (ENDOX and MOBB) Applicants performed imputation via the Michigan Imputation Server. Applicants aligned SNPs to the positive strand, and then uploaded the data (in VCF format) to the server. Applicants imputed the data with the Haplotype Reference Consortium (HRC) panel, to be consistent with the fatDIVA data which was already imputed with the HRC panel. Applicants selected EAGLE as the phasing tool to phase the data. To impute chromosome X, Applicants followed the server protocol for imputing this chromosome (including using SHAPEIT to perform the phasing step).
Human liposuction material used for isolation of preadipocytes was obtained from a collaborating private plastic surgery clinic, Medaesthetic Privatklinik Hoffmann & Hoffmann in Munich, Germany. Harvested subcutaneous liposuction material was filled into sterile 1 L laboratory bottles and immediately transported to the laboratory in a secure transportation box. The fat was aliquoted into sterile straight-sided wide-mouth jars, excluding the transfer of liposuction fluid. The fat was stored in cold Adipocyte Basal medium (AC-BM) at a 1:1 ratio of fat to medium and stored at 4° C. to be processed the following day. Additionally, small quantities of the original liposuction material were aliquoted into T-25 flasks at a 1:1 ratio of fat to medium as controls to check for contamination. These control flasks were stored in the 37° C. incubator and were not processed. Krebs-Ringer Phosphate (KRP) buffer was prepared containing 200 U/ml of collagenase and 4% heat shock fraction BSA and sterilized by filtration using a BottleTop Filter 0.22 μm. When the fat reached RT, 12.5 ml of liposuction material was aliquoted into sterile 50-ml tubes with plug seal caps. The tubes were filled to 47.5 ml with warm KRP-BSA-collagenase buffer and the caps were securely tightened and wrapped in Parafilm to avoid leakage. The tubes were incubated in a shaking water bath for 30 minutes at 37° C. with strong shaking. After 30 minutes, the oil on top was discarded and the supernatant was initially filtered through a 2000-μm nylon mesh. The supernatant of all tubes was combined after filtration and centrifuged at 200×g for 10 minutes. The supernatant was discarded and each pellet was resuspended with 3 ml of erythrocyte lysis buffer, then all the pellets were pooled in one tube and incubated for 10 minutes at RT. The cell suspension was filtered through a 250 μm filter and then through a 150 μm filter, followed by centrifugation at 200×g for 10 minutes.
The supernatant was discarded and the pellet containing preadipocytes was resuspended in an appropriate amount of DMEM/F12 with 1% P/S and 10% FCS and seeded in T75 cell culture flasks and stored in the incubator (37° C., 5% CO2). The next day the medium was changed to PAC-PM. Once preadipocytes reached 100% confluency in T25 or T75 flasks they were split into 6-well plates at a seeding density of 250,000 cells per plate in PAC-PM. Once they reached 100% confluency, PAC-IM was prepared fresh and added to the preadipocytes to induce differentiation. On day 3 after induction, the medium was changed to PAC-DM and replaced twice a week.
Subcutaneous adipose tissue was sampled from the abdominal area at the site of incision and visceral adipose tissue from the angle of His from patients undergoing elective abdominal laparoscopic surgery. Each patient gave written informed consent prior to inclusion and the study protocol was approved by the ethics committee of the Technical University of Munich (Study nr. 5716/13). Connective tissue and blood vessels were dissected and one gram of minced adipose tissue was digested with 5 ml of Krebs-Ringer phosphate buffer containing 200 U/ml of collagenase (SERVA, Heidelberg, Germany). Digestion was carried out at 37° C. for 60 minutes in a shaking water bath. Afterwards the suspension was centrifuged at 200×g for 10 minutes and the supernatant was discarded. The pellet containing the SVF was resuspended in DMEM/F12 (Gibco, Thermo Fisher Scientific, Darmstadt) containing 10% FCS (F7524, Sigma-Aldrich, Taufkirchen, Germany) and 1% penicillin-streptomycin (P/S; PAA Laboratories, Linz, Austria). After filtering the cell suspension through a 70 μm cell strainer, the cells were plated and washed with PBS on the next day, and the medium was changed to proliferation medium. Proliferation and differentiation of isolated preadipocytes was carried out as described earlier. [DOI: 10.1056/NEJMoa1502214]
Human primary AMSCs were isolated from liposuction material. Each patient gave written informed consent prior to inclusion and the study protocol was approved by the ethics committee of the Technical University of Munich (study nr. 5716/13). The liposuction material was immediately transported to the laboratory and stored with an equal amount of DMEM-F12 (Gibco, Thermo Fisher Scientific, Darmstadt) containing 1% penicillin-streptomycin (P/S; PAA Laboratories, Linz, Austria) overnight at 4° C. On the next day the samples were digested in a 1:4 ratio with Krebs-Ringer Phosphate (KRP) buffer containing 200 U/ml collagenase (SERVA, Heidelberg, Germany) at 37° C. in a shaking water bath for 60 minutes. After digestion the adipocyte/oil containing layer was removed and the remaining liquid containing the SVF was filtered through a 2000 μm nylon mesh. The SVF was pelleted through centrifugation for 10 minutes at 200×g. The supernatant was discarded and the pellet was resuspended in warm (37° C.) erythrocyte lysis buffer (155 mM NH4Cl, 5.7 mM K2HPO4, 0.1 mM EDTA dihydrate) and incubated at room temperature for 10 minutes. The cell suspension was filtered through a 250 μm filter and then through a 150 μm filter, followed by centrifugation at 200×g for 10 minutes. The supernatant was discarded and the pellet containing AMSCs was resuspended in DMEM/F12 containing 1% P/S and 10% FCS (Sigma, F7524). Cells were seeded and washed with PBS on the next day before switching to proliferation medium. Proliferation and differentiation was carried out as described earlier. [DOI: 10.1056/NEJMoa1502214]
Purity of AMSCs was assessed as previously described (Raajendiran et al. 2019). Briefly, cells were stained with 0.05 μg CD34, 0.125 μg CD29, 0.375 μg CD31, 0.125 μg CD45 per 250K cells and analyzed on CytoFlex together with negative control samples of corresponding AMSCs.
For imaging, cells were seeded at 10K cells/well in 96-well plates (Cell Carrier, Perkin Elmer #6005550) and induced 4 days after seeding. For RNAseq, cells were seeded at 40K cells/well in 12-well dishes (Corning). Before induction, cells were cultured in proliferation medium. Adipogenic differentiation was induced by changing culture medium to induction medium. On day 3 of adipogenic differentiation, culture medium was changed to differentiation medium. Medium was changed every 3 days. Visceral-derived AMSCs were differentiated by adding FFA.
Human primary AMSCs were plated and differentiated in 96-well CellCarrier plates (PerkinElmer #6005550) for 14 days for high content imaging at day 0, day 3, day 8 and day 14 of adipogenic differentiation. On the respective day of the assay, cell culture media was removed and replaced in each well by 0.5 μM MitoTracker staining solution (1 mM MitoTracker Deep Red stock (Invitrogen #M22426) diluted in culture media), followed by 30 minutes incubation at 37° C. protected from light. After 30 min, the MitoTracker staining solution was removed and cells were washed twice with Dulbecco's Phosphate-Buffered Saline (1×), DPBS (Corning® #21-030-CV), and 2.9 μM BODIPY staining solution (3.8 mM BODIPY 505/515 stock (Thermofisher #D3921) diluted in DPBS) was added, followed by 15 minutes incubation at 37° C. protected from light. Subsequently, cells were fixed by adding 16% Methanol-free Paraformaldehyde, PFA (Electron Microscopy Sciences #15710-S) directly to the BODIPY staining solution to a final concentration of 3.2% and incubated for 20 minutes at RT protected from light. PFA was removed and cells were washed once with Hank's Balanced Salt Solution (1×), HBSS (Gibco #14025076). To permeabilize cells, 0.1% Triton X-100 (Sigma Aldrich #X100) was added and incubated at RT for 10 minutes protected from light. After permeabilization, multi-stain solution (10 units of Alexa Fluor™ 568 Phalloidin (ThermoFisher #A12380), 0.01 mg/ml Hoechst 33342 (Invitrogen #H3570), 0.0015 mg/ml Wheat Germ Agglutinin, Alexa Fluor™ 555 Conjugate (ThermoFisher #W32464), 3 μM SYTO™ 14 Green Fluorescent Nucleic Acid Stain (Invitrogen #S7576) diluted in HBSS) was added and cells were incubated at RT for 10 minutes protected from light. Finally, staining solution was removed and cells were washed three times with HBSS. Cells were imaged using an Opera Phenix High content screening system. Twenty-five fields were imaged per well.
Quantitation was performed using CellProfiler 3.1.9. Prior to processing, flat field illumination correction was performed using functions generated from the mean intensity across each plate. Nuclei were identified using the DAPI stain and then expanded to identify whole cells using the AGP and Bodipy stains. Regions of cytoplasm were then determined by removing the Nuclei from the Cell segmentations. Speckles of Bodipy staining were enhanced to assist in detection of small and large individual Bodipy objects. For each object set measurements were collected representing size, shape, intensity, granularity, texture, colocalisation and distance to neighbouring objects.
After feature extraction, data were filtered by applying automated and manual quality control steps. First, fields with a total cell count of less than 50 cells were removed. Second, fields that were corrupted by experimentally induced technical artifacts were removed by applying a manually defined quality control mask. Furthermore, blocklisted features that are known to be noisy and generally unreliable were removed. After filtering, data were normalised per plate using a robust scaling approach that subtracts the median from each variable and divides it by the interquartile range. For each individual, wells were aggregated for downstream analysis by cell depot and day of differentiation.
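The robust scaling step described above can be illustrated by the following Python sketch, which would be applied to the values of one feature on one plate. The function name and the "inclusive" quartile interpolation are assumptions of the sketch; the quartile definition used by Applicants is not stated.

```python
from statistics import median, quantiles


def robust_scale(values):
    """Subtract the median and divide by the interquartile range (IQR)."""
    q25, _, q75 = quantiles(values, n=4, method="inclusive")
    iqr = q75 - q25
    if iqr == 0:
        raise ValueError("IQR is zero; feature cannot be robust-scaled")
    med = median(values)
    return [(v - med) / iqr for v in values]
```

Because the median and IQR are insensitive to a few extreme wells, this scaling is less affected by outliers than centering on the mean and dividing by the standard deviation.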
Subsequent data analyses were performed in R 3.6.1 and Matlab using base packages unless noted. To check for batch effects, Applicants visualised the data using a principal component analysis and quantified batch effects using a Kolmogorov-Smirnov test implemented in the “BEclear” R package. Additionally, Applicants applied a k-nearest neighbour (kNN) supervised machine learning algorithm, implemented in the “class” R package (Venables and Ripley 2002), to assess prediction accuracy. For that, the data set, consisting of 3 different cell types (hWAT, hBAT, SGBS) distributed in a balanced manner on the 96-well plate and imaged at 4 days of differentiation, was split into equally distributed testing (n=18) and training (n=56) sets. Accuracy of the classification model was assessed for three different categories: cell type, batch, and column of the 96-well plate.
For dimensionality reduction visualization Uniform manifold approximation and projection maps (UMAP) were created using the UMAP R package (McInnes and Healy 2018).
To identify patterns of adipocyte differentiation underlying the morphological profiles a sample progression discovery analysis (SPD) was performed using the algorithm described by Peng Qiu et al. Briefly, the two adipose depots were analysed separately and features were clustered into modules based on correlation (correlation coefficient 0.6). Minimal spanning trees (MST) were constructed for each module and MSTs of each module are correlated to each other. Modules that support common MST were selected and an overall MST based on features of all selected modules is reconstructed.
Variance component analysis was performed by fitting multivariable linear regression models:
where y denotes a LipocyteProfiler feature of individual i, and x, z, etc. denote independent variables that could confound the variability of the dataset. The independent variables are day of differentiation, experimental batch, column in 96-well plate, adipose depot, free fatty acid treatment, passaging before freezing, season and year of AMSCs isolation, sex, age, T2D status of individual, the LipocyteProfiler feature Cells_Neighbors_PercentTouching_Adjacent (corresponding to density of cell seeding), and identification numbers of individuals.
To test whether there is a difference in morphological profiles between the tail ends of polygenic risk scores (PRS), a multi-way analysis of variance (ANOVA) was performed. For that, individuals belonging to the top 25% and bottom 25% of the PRS distribution were assigned to a categorical variable with 2 levels, top 25% or bottom 25%, according to their PRS percentile. Differences in morphological profiles were predicted using the categorized PRS variable adjusted for sex, age, BMI and batch.
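The categorization of individuals into PRS tails can be sketched in Python as follows. Only the grouping step is shown; the adjusted ANOVA itself is not implemented. The function name, the group labels, the "inclusive" quartile interpolation, and the choice to include individuals exactly at a quartile boundary are assumptions of the sketch.

```python
from statistics import quantiles


def prs_tail_groups(prs_by_individual):
    """Map individual ID -> 'bottom_25' or 'top_25'; the middle 50% is omitted."""
    scores = list(prs_by_individual.values())
    q25, _, q75 = quantiles(scores, n=4, method="inclusive")
    groups = {}
    for individual, score in prs_by_individual.items():
        if score <= q25:
            groups[individual] = "bottom_25"
        elif score >= q75:
            groups[individual] = "top_25"
    return groups
```

The resulting two-level variable would then enter the model as the predictor of interest alongside the covariates named in the text (sex, age, BMI and batch).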
To overcome the multiple testing burden, p-values were corrected using the false discovery rate (FDR) as implemented in the R package “qvalue” (Storey J D, Bass A J, Dabney A, Robinson D, 2020). Features with FDR <5% were classified as significantly impacted by the PRS variable.
RNA Silencing
Pre-adipocytes were seeded to be 60-70% confluent at time of transfection. Silencing was performed using Lipofectamine® RNAiMAX Transfection Reagent (ThermoFisher #13778075) following the manufacturer's protocol. Briefly, Lipofectamine® RNAiMAX Reagent was diluted in Opti-MEM medium. At the same time, siRNA was diluted in Opti-MEM medium. Then, diluted siRNA was added to the diluted Lipofectamine® RNAiMAX reagent at a ratio of 1:1 and incubated for 5 min. All silencing experiments were performed on 4 technical replicates. The plate was gently swirled and placed in a 37° C. incubator at 5% CO2 for 48 hours. Cells were then induced to differentiate following the standard differentiation cocktail or harvested for gene expression analysis to assess knockdown efficiency. Silencing efficiency was compared between experiments using RT-qPCR with TaqMan probes for BCL2 (Assay ID Hs04986394_s1) and CANX as a housekeeping control gene (see RT qPCR section for detailed methods).
Silencer Select Pre-designed siRNA for BCL2 (Ambion, Life Technologies, #4392421, s1915) and Silencer Select Negative Control (Ambion, Life Technologies, #4390844) were diluted to 100 μM in water.
The protocol for a standard bioenergetics profile is composed of basal mitochondrial respiration, ATP turnover, proton leak and mitochondrial respiratory capacity. First, oxygen consumption rate (OCR) in basal conditions was determined and used to calculate the basal mitochondrial respiration. After this, 2 μM oligomycin was injected from the first port to inhibit ATP synthase, resulting in an accumulation of protons in the mitochondrial intermembrane space and a reduced activity of the electron transport chain (ETC). The resulting decrease in OCR reveals the respiration driving ATP synthesis in the cells, indicating ATP turnover. Residual oxygen consumption can be attributed to the proton leak maintaining a minimal ETC activity and to non-mitochondrial respiration. Next, 2 μM of the mitochondrial uncoupler FCCP was injected, which results in an increase in OCR as the proton gradient across the inner mitochondrial membrane is dissipated and ETC activity resumes. This measurement reflects the maximal mitochondrial respiratory capacity. Finally, 2 μM Rotenone/Antimycin A was injected to completely stop ETC activity, and the OCR reading at this phase reflects non-mitochondrial respiration. Applicants normalized all data to the relative number of live cells in each well of the 96-well Seahorse plate.
Oxygen Consumption and Bioenergetics Profile was measured using the XF24 extracellular flux analyzer from Seahorse Bioscience. The protocol used in this assay was adapted from Gesta et al., 2011. For this assay, pre-adipocytes were counted and 10K cells per well were seeded onto seahorse 96 well plate in 50 μl of growth media and left to adhere overnight. The next day, silencing was performed as seen in the previous section. Three days later, cells were induced to differentiate within the seahorse plate following the adipogenic differentiation protocol as described previously. Each cell type was run in 8 replicates. When the cells were terminally differentiated at day 14 post adipogenic induction, the assay was performed. The evening before the assay, the seahorse XF-24 instrument cartridge was loaded with seahorse calibrant and placed in a CO2-free incubator at 37° C. overnight.
On the day of the assay, cells were washed in XF Assay Media supplemented with 2 mM L-glutamine, 2 mM sodium pyruvate, and 10 mM glucose (pH was measured and adjusted to pH 7.4 at 37° C.). The Seahorse plate containing the differentiated adipocytes was then incubated for at least 1 hour at 37° C. in a CO2-free incubator to allow CO2 to diffuse out of solution. According to the manufacturer's protocol, the ports of the Seahorse XF-24 analyzer cartridge were then loaded with the compounds to be injected during the assay: oligomycin, FCCP, and Rotenone/Antimycin A.
Before running the assay, the XF-24 instrument cartridge was calibrated.
For total oxygen consumption rate (OCR) measurements, the minimum OCR reading after Rotenone/Antimycin A treatment was subtracted from the initial untreated level, following the manufacturer's protocol. To directly measure mitochondrial thermogenesis, uncoupled respiration (proton leak) was calculated by subtracting the minimum OCR level after Rotenone/Antimycin A treatment from the minimum level after oligomycin treatment. Oxygen concentrations were measured over time periods of 4 min, with 2 min waiting and 2 min mixing.
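The subtractions described above can be illustrated with a minimal Python sketch. It is not the instrument software or the analysis code actually used. The `OcrTrace` container, the function name `bioenergetics_profile`, and the choice to take the mean of the pre-injection readings as the "initial untreated level" are assumptions made for illustration only. The ATP turnover and maximal capacity values follow the general description of the bioenergetics profile, and the optional division by a relative live cell number mirrors the normalization mentioned for the assay.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Sequence


@dataclass
class OcrTrace:
    """OCR readings for one well, grouped by assay phase (hypothetical layout)."""
    basal: Sequence[float]       # readings before any injection
    oligomycin: Sequence[float]  # readings after oligomycin injection
    fccp: Sequence[float]        # readings after FCCP injection
    rot_aa: Sequence[float]      # readings after Rotenone/Antimycin A injection


def bioenergetics_profile(trace: OcrTrace,
                          relative_live_cells: float = 1.0) -> Dict[str, float]:
    """Derive bioenergetics parameters from one well's OCR trace.

    Assumption: the "initial untreated level" is taken as the mean of the
    basal readings; other conventions (e.g. the last basal reading) exist.
    """
    if relative_live_cells <= 0:
        raise ValueError("relative_live_cells must be positive")

    untreated = mean(trace.basal)
    min_oligo = min(trace.oligomycin)
    non_mito = min(trace.rot_aa)  # minimum OCR after Rotenone/Antimycin A

    profile = {
        # untreated level minus minimum after Rotenone/Antimycin A
        "total_ocr": untreated - non_mito,
        # minimum after oligomycin minus minimum after Rotenone/Antimycin A
        "proton_leak": min_oligo - non_mito,
        # drop in OCR caused by ATP synthase inhibition
        "atp_turnover": untreated - min_oligo,
        # peak uncoupled OCR above the non-mitochondrial floor
        "maximal_capacity": max(trace.fccp) - non_mito,
        "non_mitochondrial": non_mito,
    }
    # Normalize every value to the relative number of live cells in the well.
    return {name: value / relative_live_cells for name, value in profile.items()}


if __name__ == "__main__":
    # Made-up readings, for illustration only.
    example = OcrTrace(basal=[100.0, 100.0], oligomycin=[40.0, 35.0],
                       fccp=[180.0, 200.0], rot_aa=[20.0, 15.0])
    print(bioenergetics_profile(example))
```

Because proton leak and ATP turnover are both referenced to the same minimum after oligomycin, their sum equals the total OCR value in this sketch, which provides a simple consistency check on any per-well calculation.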
Various modifications and variations of the described methods, pharmaceutical compositions, and kits of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific embodiments, it will be understood that it is capable of further modifications and that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in the art are intended to be within the scope of the invention. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known customary practice within the art to which the invention pertains and as may be applied to the essential features hereinbefore set forth.
This application claims the benefit of U.S. Provisional Application No. 63/218,656, filed Jul. 6, 2021. The entire contents of the above-identified application are hereby fully incorporated herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US22/73454 | 7/6/2022 | WO | |

Number | Date | Country |
---|---|---|
63218656 | Jul 2021 | US |