The contents of the electronic sequence listing (“BROD_2610_ST25.txt”; size is 10 kilobytes and it was created on Mar. 17, 2020) are herein incorporated by reference in their entirety.
The subject matter disclosed herein is generally directed to modulation of Th17 differentiation and pathogenicity by use of metabolic targets.
The immune system must strike a balance between mounting proper responses to pathogens and avoiding uncontrolled, autoimmune reaction. Pro-inflammatory IL-17-producing Th17 cells are a prime case in point: as a part of the adaptive immune system, Th17 cells mediate clearance of fungal infections, but they are also strongly implicated in the pathogenesis of autoimmunity (Korn et al., 2009). In mice, although Th17 cells are present at sites of tissue inflammation and autoimmunity (Korn et al., 2009), they are also normally present at mucosal barrier sites, where they maintain barrier functions without inducing tissue inflammation (Blaschitz and Raffatellu, 2010).
Interleukin (IL)-17-producing helper T cells (Th17 cells) have been identified as a distinct lineage of CD4+ T helper cells producing IL-17A and IL-17F and are critical drivers of autoimmune tissue inflammation in experimental autoimmune encephalomyelitis (EAE) and in other autoimmune conditions (Korn et al., 2009). In a recent study, it has been shown that the Th17 cell differentiation program is regulated through two self-reinforcing and mutually antagonistic modules of positive and negative regulators (Yosef et al., 2013). This model was supported by transcriptional silencing and genetic ablation experiments (Yosef et al., 2013), as well as by chromatin immunoprecipitation (ChIP)-seq data (Xiao et al., 2014). The positive regulators promote the Th17 cell program while inhibiting the transcriptional programs of other T helper (Th) cell lineages (Th1, Treg). This suggests that there is not only a need for positive regulators to push the differentiation into a positive direction but also for concurrent inhibition of opposing differentiation programs to achieve unidirectional T cell differentiation. Other studies also support such a mutually antagonistic design in other Th lineages (O'Shea and Paul, 2010); however, how this is achieved for Th17 cells has not been elucidated.
In humans, functionally distinct Th17 cells have been described; for instance, Th17 cells play a protective role in clearing different types of pathogens like Candida albicans (Hernandez-Santos and Gaffen, 2012) or Staphylococcus aureus (Lin et al., 2009), and promote barrier functions at the mucosal surfaces (Symons et al., 2012), despite their pro-inflammatory role in autoimmune diseases such as rheumatoid arthritis, multiple sclerosis, psoriasis, systemic lupus erythematosus and asthma (Waite and Skokos, 2012). Thus, there is considerable diversity in the biological function of Th17 cells and in their ability to induce tissue inflammation or provide tissue protection.
Accordingly, there exists a need for a better understanding of the dynamic regulatory network that modulates, controls, or otherwise influences T cell balance, including Th17 cell differentiation, maintenance and function, and means for exploiting this network in a variety of therapeutic and diagnostic methods.
In one aspect, the present invention provides for a method of shifting T cell balance in a population of cells comprising T cells, said method comprising contacting the population of cells with one or more agents capable of modulating the polyamine pathway. In certain embodiments, Th17 cell balance is shifted towards Treg-like cells and/or is shifted away from Th17 cells; or is shifted towards Th17 cells and/or is shifted away from Treg-like cells. In certain embodiments, Th17 cell balance is shifted towards non-pathogenic Th17 cells and/or is shifted away from pathogenic Th17 cells; or is shifted towards pathogenic Th17 cells and/or is shifted away from non-pathogenic Th17 cells.
In certain embodiments, the one or more agents capable of shifting T cell balance towards Treg-like cells and/or away from Th17 cells comprise a polyamine or polyamine analogue. In certain embodiments, the polyamine analogue is 2-(difluoromethyl)ornithine (DFMO) or a derivative thereof.
In certain embodiments, the one or more agents modulate the expression, activity or function of one or more proteins in the polyamine pathway or downstream targets thereof. In certain embodiments, the one or more agents modulate the expression, activity or function of SAT1. In certain embodiments, the one or more agents comprise Diminazene-aceturate or a derivative thereof. In certain embodiments, the one or more agents modulate the expression, activity or function of ODC1. In certain embodiments, the one or more agents comprise DFMO or a derivative thereof. In certain embodiments, the one or more agents modulate the expression, activity or function of spermidine synthase (SRM). In certain embodiments, the one or more agents comprise trans-4-methylcyclohexylamine (MCHA) or a derivative thereof. In certain embodiments, the one or more agents modulate the expression, activity or function of spermine synthase (SMS). In certain embodiments, the one or more agents comprise N-(3-aminopropyl)-cyclohexyl amine (APCHA) or a derivative thereof. In certain embodiments, the one or more agents modulate the expression, activity or function of one or more genes or gene products selected from the group consisting of JMJD3, POU2F2, POU2F1, POU5F1B, POU3F4, POU1F1, POU3F2, POU3F3, POU4F2, POU2F3, POU3F1, POU4F1, NFAT5, NFATC2, c-MAF and BATF. In certain embodiments, the one or more agents capable of shifting T cell balance towards Th17 cells and/or away from Treg-like cells comprises GSK-J1. In certain embodiments, the one or more agents capable of shifting T cell balance towards Treg-like cells and/or away from Th17 cells comprises an agonist of JMJD3.
In certain embodiments, the one or more agents comprise a small molecule, small molecule degrader, genetic modifying agent, antibody, antibody fragment, antibody-like protein scaffold, aptamer, protein, or any combination thereof. In certain embodiments, the genetic modifying agent comprises a CRISPR system, RNAi system, zinc finger nuclease system, TALE system, or a meganuclease. In certain embodiments, the CRISPR system is a Class 1 or Class 2 CRISPR system. In certain embodiments, the Class 2 system comprises a Type II Cas polypeptide. In certain embodiments, the Type II Cas is a Cas9. In certain embodiments, the Class 2 system comprises a Type V Cas polypeptide. In certain embodiments, the Type V Cas is Cas12a, Cas12b, Cas12c, Cas12d (CasY), Cas12e (CasX), or Cas14. In certain embodiments, the Class 2 system comprises a Type VI Cas polypeptide. In certain embodiments, the Type VI Cas is Cas13a, Cas13b, Cas13c or Cas13d. In certain embodiments, the CRISPR system comprises a dCas fused or otherwise linked to a nucleotide deaminase. In certain embodiments, the nucleotide deaminase is a cytidine deaminase or an adenosine deaminase. In certain embodiments, the dCas is a dCas9, dCas12 or dCas13. In certain embodiments, the CRISPR system is a prime editing system.
In certain embodiments, the population of cells comprises naïve T cells, Th1 cells and/or Th17 cells. In certain embodiments, the population of cells are in vitro cells. In certain embodiments, the population of cells is an in vivo population of cells in a subject at risk for or suffering from a disease or condition characterized by an aberrant immune response, whereby the one or more agents are used to treat the disease or condition. In certain embodiments, the population of cells are ex vivo cells obtained from a healthy donor subject or from a subject at risk for or suffering from a disease or condition characterized by an aberrant immune response. In certain embodiments, the disease is an inflammatory and/or autoimmune disorder. In certain embodiments, the inflammatory disorder is selected from the group consisting of Multiple Sclerosis (MS), Inflammatory Bowel Disease (IBD), Crohn's disease, ulcerative colitis, spondyloarthritides, Systemic Lupus Erythematosus (SLE), Vitiligo, rheumatoid arthritis, psoriasis, Sjögren's syndrome, diabetes, Irritable bowel syndrome (IBS), allergic asthma, and food allergies. In certain embodiments, the condition is an autoimmune response. In certain embodiments, the subject at risk for or suffering from an autoimmune response is a subject undergoing immunotherapy. In certain embodiments, the immunotherapy is checkpoint blockade therapy and/or adoptive cell transfer. In certain embodiments, the checkpoint blockade therapy comprises anti-PD1, anti-CTLA4, anti-PD-L1, anti-TIM3, anti-TIGIT, anti-LAG3, or combinations thereof. In certain embodiments, the one or more agents are administered before, during or after administering the immunotherapy. In certain embodiments, the subject undergoing immunotherapy is suffering from cancer.
In certain embodiments, the naïve T cells are differentiated into Th17 cells, Th1 cells and/or Treg cells. In certain embodiments, the one or more agents are administered to the population of cells during differentiation. In certain embodiments, the differentiation conditions comprise cell culture media supplemented with IL-6 and TGF-β1 or supplemented with IL-1β, IL-6 and IL-23. In certain embodiments, T cell differentiation is shifted towards Treg cells and/or is shifted away from Th17 cells. In certain embodiments, T cell differentiation is shifted towards Th17 cells and/or is shifted away from Treg cells. In certain embodiments, T cell differentiation is shifted towards Th1 cells and/or is shifted away from Th17 cells. In certain embodiments, T cell differentiation is shifted towards Th17 cells and/or is shifted away from Th1 cells. In certain embodiments, T cell differentiation is shifted towards non-pathogenic Th17 cells and/or is shifted away from pathogenic Th17 cells.
In another aspect, the present invention provides for a population of T cells obtained by the method according to any embodiment herein (claims 1-48). In another aspect, the present invention provides for a pharmaceutical composition comprising the population of T cells. In another aspect, the present invention provides for a method of treating a disease or condition characterized by an aberrant immune response comprising administering the pharmaceutical composition to a subject in need thereof.
In another aspect, the present invention provides for a method of monitoring Th17 mediated autoimmunity in a subject comprising detecting one or more polyamines in the subject, wherein increased polyamines as compared to a control indicates autoimmunity.
In another aspect, the present invention provides for a method of treating autoimmunity in a subject in need thereof, comprising monitoring Th17 mediated autoimmunity in the subject by detecting one or more polyamines in the subject; and treating the subject according to any embodiment herein when increased polyamines are detected.
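The comparison against a control described in the two preceding aspects can be sketched as follows. This is a minimal illustration only: the function name, the 1.5-fold cutoff, and the example values are hypothetical and are not taken from the disclosure, which does not specify a threshold or a statistic.

```python
from statistics import mean


def polyamines_increased(subject_level, control_levels, min_fold_change=1.5):
    """Return True if the subject's polyamine level is at least
    `min_fold_change` times the mean of the control levels.

    The fold-change cutoff is a hypothetical assumption, not a disclosed value.
    """
    control_mean = mean(control_levels)
    if control_mean <= 0:
        raise ValueError("control levels must have a positive mean")
    return subject_level / control_mean >= min_fold_change


# Made-up measurements in arbitrary units, for illustration only.
controls = [1.0, 1.2, 0.8]
print(polyamines_increased(2.0, controls))  # subject well above the control mean
print(polyamines_increased(1.1, controls))  # subject close to the control mean
```

In practice the choice of control population, the polyamine species measured, and the decision rule would all need to be established experimentally.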
In another aspect, the present invention provides for a method of shifting Th17 cell pathogenicity in a population of cells comprising T cells, said method comprising contacting the population of cells with one or more agents capable of modulating a reaction of the glycolysis pathway according to Table 1 or 2. In certain embodiments, the one or more agents modulate expression, activity, or function of a gene or gene product selected from the group consisting of: G6PD, PKM, Aldo, PFKM, TA, G6PC, PGAM, GK, ENO1, PCK1, TPI1, PGK1, GAPDHS, PDHA1, and GPD1. In certain embodiments, the one or more agents is selected from the group consisting of 2,5-Anhydro-D-glucitol-1,6-diphosphate, S-HD-CoA, DHEA, TX1, Gimeracil, Shikonin, Pyruvate Kinase Inhibitor III, 2,3-Dihydroxypropyl dichloroacetate (DCA), 2,9-Dimethyl-BC, Koningic acid, CBR-470-1, EGCG, SF2312, PhAh, ENOblock, 3-MPA, and 6,8-Bis(benzylthio)octanoic acid.
In certain embodiments, the one or more agents comprise a small molecule, small molecule degrader, genetic modifying agent, antibody, antibody fragment, antibody-like protein scaffold, aptamer, protein, or any combination thereof. In certain embodiments, the genetic modifying agent comprises a CRISPR system, RNAi system, zinc finger nuclease system, TALE system, or a meganuclease. In certain embodiments, the CRISPR system is a Class 1 or Class 2 CRISPR system. In certain embodiments, the Class 2 system comprises a Type II Cas polypeptide. In certain embodiments, the Type II Cas is a Cas9. In certain embodiments, the Class 2 system comprises a Type V Cas polypeptide. In certain embodiments, the Type V Cas is Cas12a, Cas12b, Cas12c, Cas12d (CasY), Cas12e (CasX), or Cas14. In certain embodiments, the Class 2 system comprises a Type VI Cas polypeptide. In certain embodiments, the Type VI Cas is Cas13a, Cas13b, Cas13c or Cas13d. In certain embodiments, the CRISPR system comprises a dCas fused or otherwise linked to a nucleotide deaminase. In certain embodiments, the nucleotide deaminase is a cytidine deaminase or an adenosine deaminase. In certain embodiments, the dCas is a dCas9, dCas12 or dCas13. In certain embodiments, the CRISPR system is a prime editing system.
In certain embodiments, the population of cells comprises naïve T cells, Th1 cells and/or Th17 cells. In certain embodiments, the population of cells are in vitro cells. In certain embodiments, the population of cells is an in vivo population of cells in a subject at risk for or suffering from a disease or condition characterized by an aberrant immune response, whereby the one or more agents are used to treat the disease or condition. In certain embodiments, the population of cells are ex vivo cells obtained from a healthy donor subject or from a subject at risk for or suffering from a disease or condition characterized by an aberrant immune response. In certain embodiments, the disease is an inflammatory and/or autoimmune disorder. In certain embodiments, the inflammatory disorder is selected from the group consisting of Multiple Sclerosis (MS), Inflammatory Bowel Disease (IBD), Crohn's disease, ulcerative colitis, spondyloarthritides, Systemic Lupus Erythematosus (SLE), Vitiligo, rheumatoid arthritis, psoriasis, Sjögren's syndrome, diabetes, Irritable bowel syndrome (IBS), allergic asthma, and food allergies. In certain embodiments, the condition is an autoimmune response. In certain embodiments, the subject at risk for or suffering from an autoimmune response is a subject undergoing immunotherapy. In certain embodiments, the immunotherapy is checkpoint blockade therapy and/or adoptive cell transfer. In certain embodiments, the checkpoint blockade therapy comprises anti-PD1, anti-CTLA4, anti-PD-L1, anti-TIM3, anti-TIGIT, anti-LAG3, or combinations thereof. In certain embodiments, the one or more agents are administered before, during or after administering the immunotherapy. In certain embodiments, the subject undergoing immunotherapy is suffering from cancer.
In certain embodiments, the naïve T cells are differentiated into Th17 cells. In certain embodiments, the one or more agents are administered to the population of cells during differentiation. In certain embodiments, the differentiation conditions comprise cell culture media supplemented with IL-6 and TGF-β1 or supplemented with IL-1β, IL-6 and IL-23. In certain embodiments, T cell differentiation is shifted towards non-pathogenic Th17 cells and/or is shifted away from pathogenic Th17 cells.
In another aspect, the present invention provides for a population of T cells obtained by the method according to any embodiment herein (claims 54-83). In another aspect, the present invention provides for a pharmaceutical composition comprising the population of T cells. In another aspect, the present invention provides for a method of treating a disease or condition characterized by an aberrant immune response comprising administering the pharmaceutical composition to a subject in need thereof.
In another aspect, the present invention provides for a data driven method of detecting metabolic target genes and pathways comprising: providing single cell RNA-seq reads obtained from a population of cells or an RNA library from multiple cells, wherein each single cell represents an observation, and the number of observations allows statistical power to discern statistically significant metabolic targets; providing metabolic data comprising the metabolic reactions in the population of cells; simulating the metabolic fluxes at the single-cell level by projecting the transcriptomic profile onto the metabolic network, thereby producing a quantitative metabolic profile of each cell. In certain embodiments, spatial, temporal or lineage delineated RNA-seq data is used to identify the metabolic state in single cells across a tissue, over time or in a cell lineage. In certain embodiments, the method comprises treating a population of cells with a drug for use in identifying metabolic adaptation to the drug. In certain embodiments, the method comprises predicting targets that will shift a population of cells in one state to another state. In certain embodiments, the state is shifted towards Treg-like cells and/or is shifted away from Th17 cells; or towards Th17 cells and/or is shifted away from Treg-like cells; or towards non-pathogenic Th17 cells and/or is shifted away from pathogenic Th17 cells; or towards pathogenic Th17 cells and/or is shifted away from non-pathogenic Th17 cells. In certain embodiments, the method is used to determine resistance to a drug, wherein the method comprises determining metabolic pathways modulated in resistant subjects as compared to sensitive subjects. In certain embodiments, the method comprises analyzing single cells obtained from a diseased tissue for use in determining metabolic shifts in disease. In certain embodiments, the disease comprises a degenerative disease, cancer, a metabolic disease, aging, autoimmunity or inflammation. 
In certain embodiments, the disease comprises cardiovascular disease. In certain embodiments, the disease comprises diabetes. In certain embodiments, the single cells comprise cells from an animal, plant, or bacteria. In certain embodiments, the method comprises identifying metabolic shifts in a host cell contacted with a microbiome (e.g., symbiotic microbial cells harbored by a host organism consisting of trillions of microorganisms (also called microbiota or microbes) of thousands of different species including not only bacteria, but fungi, parasites, and viruses).
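The step of projecting a transcriptomic profile onto a metabolic network can be illustrated with a deliberately simplified sketch. It assumes each reaction is annotated with the genes encoding its enzymes and scores a reaction in a cell by the mean expression of those genes. The scoring rule, the reaction labels, and the expression values are illustrative assumptions; the sketch does not simulate fluxes and is not the claimed method.

```python
def reaction_scores(cell_expression, reaction_genes):
    """Score each reaction in one cell as the mean expression of its genes.

    Genes absent from the cell's profile are counted as 0.0. This averaging
    rule is a simplifying assumption for illustration.
    """
    scores = {}
    for reaction, genes in reaction_genes.items():
        values = [cell_expression.get(gene, 0.0) for gene in genes]
        scores[reaction] = sum(values) / len(values) if values else 0.0
    return scores


# Hypothetical reaction-to-gene annotation for three polyamine pathway steps.
reaction_genes = {
    "ornithine_decarboxylation": ["ODC1"],
    "spermidine_synthesis": ["SRM"],
    "spermine_synthesis": ["SMS"],
}

# Made-up expression values; each cell is one observation.
cells = {
    "cell_1": {"ODC1": 4.0, "SRM": 2.0, "SMS": 1.0},
    "cell_2": {"ODC1": 0.5, "SRM": 0.0},
}

# One metabolic profile (reaction -> score) per cell.
profiles = {cell: reaction_scores(expr, reaction_genes) for cell, expr in cells.items()}
for cell, profile in profiles.items():
    print(cell, profile)
```

Keeping one profile per cell preserves the single-cell resolution that the method relies on for statistical power.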
In another aspect, the present invention provides for a population of T cells obtained by the method according to any embodiment herein. In another aspect, the present invention provides for a pharmaceutical composition comprising the population of T cells according to any embodiment herein. In another aspect, the present invention provides for a method of treating a disease or condition characterized by an aberrant immune response comprising administering the pharmaceutical composition of any embodiment herein to a subject in need thereof.
These and other aspects, objects, features, and advantages of the example embodiments will become apparent to those having ordinary skill in the art upon consideration of the following detailed description of illustrated example embodiments.
An understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention may be utilized, and the accompanying drawings of which:
The figures herein are for illustrative purposes only and are not necessarily drawn to scale.
Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. Definitions of common terms and techniques in molecular biology may be found in Molecular Cloning: A Laboratory Manual, 2nd edition (1989) (Sambrook, Fritsch, and Maniatis); Molecular Cloning: A Laboratory Manual, 4th edition (2012) (Green and Sambrook); Current Protocols in Molecular Biology (1987) (F. M. Ausubel et al. eds.); the series Methods in Enzymology (Academic Press, Inc.); PCR 2: A Practical Approach (1995) (M. J. MacPherson, B. D. Hames, and G. R. Taylor eds.); Antibodies, A Laboratory Manual (1988) (Harlow and Lane, eds.); Antibodies A Laboratory Manual, 2nd edition 2013 (E. A. Greenfield ed.); Animal Cell Culture (1987) (R. I. Freshney, ed.); Benjamin Lewin, Genes IX, published by Jones and Bartlett, 2008 (ISBN 0763752223); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0632021829); Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 9780471185710); Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, N.Y. 1994); March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, N.Y. 1992); and Marten H. Hofker and Jan van Deursen, Transgenic Mouse Methods and Protocols, 2nd edition (2011).
As used herein, the singular forms “a”, “an”, and “the” include both singular and plural referents unless the context clearly dictates otherwise.
The term “optional” or “optionally” means that the subsequent described event, circumstance or substituent may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.
The recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within the respective ranges, as well as the recited endpoints.
The terms “about” or “approximately” as used herein when referring to a measurable value such as a parameter, an amount, a temporal duration, and the like, are meant to encompass variations of and from the specified value, such as variations of +/−10% or less, +/−5% or less, +/−1% or less, and +/−0.1% or less of and from the specified value, insofar as such variations are appropriate to perform in the disclosed invention. It is to be understood that the value to which the modifier “about” or “approximately” refers is itself also specifically, and preferably, disclosed.
As used herein, a “biological sample” may contain whole cells and/or live cells and/or cell debris. The biological sample may contain (or be derived from) a “bodily fluid”. The present invention encompasses embodiments wherein the bodily fluid is selected from amniotic fluid, aqueous humour, vitreous humour, bile, blood serum, breast milk, cerebrospinal fluid, cerumen (earwax), chyle, chyme, endolymph, perilymph, exudates, feces, female ejaculate, gastric acid, gastric juice, lymph, mucus (including nasal drainage and phlegm), pericardial fluid, peritoneal fluid, pleural fluid, pus, rheum, saliva, sebum (skin oil), semen, sputum, synovial fluid, sweat, tears, urine, vaginal secretion, vomit and mixtures of one or more thereof. Biological samples include cell cultures, bodily fluids, and cell cultures from bodily fluids. Bodily fluids may be obtained from a mammalian organism, for example by puncture, or other collecting or sampling procedures.
The terms “subject,” “individual,” and “patient” are used interchangeably herein to refer to a vertebrate, preferably a mammal, more preferably a human. Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells and their progeny of a biological entity obtained in vivo or cultured in vitro are also encompassed.
Various embodiments are described hereinafter. It should be noted that the specific embodiments are not intended as an exhaustive description or as a limitation to the broader aspects discussed herein. One aspect described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced with any other embodiment(s). Reference throughout this specification to “one embodiment”, “an embodiment,” “an example embodiment,” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” or “an example embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to a person skilled in the art from this disclosure, in one or more embodiments. Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention. For example, in the appended claims, any of the claimed embodiments can be used in any combination.
Reference is made to the manuscript and included tables submitted in a Biorxiv posting titled “Metabolic and Epigenomic Regulation of Th17/Treg Balance by the Polyamine Pathway,” by Chao Wang, Allon Wagner, Johannes Fessler, Julian Avila-Pacheco, Jim Karminski, Pratiksha Thakore, Sarah Zaghouani, Kerry Pierce, Lloyd Bod, Alexandra Schnell, David DeTomaso, Noga Ron-Harel, Marcia Haigis, Daniel Puleston, Erika Pearce, Manoocher Soleimani, Ray Sobel, Clary Clish, Aviv Regev, Nir Yosef, and Vijay K. Kuchroo, (Wang et al., 2020). Reference is also made to the manuscript and included Tables submitted in a Biorxiv posting titled “In Silico Modeling of Metabolic State in Single Th17 Cells Reveals Novel Regulators of Inflammation and Autoimmunity,” by Allon Wagner, Chao Wang, David DeTomaso, Julian Avila-Pacheco, Sarah Zaghouani, Johannes Fessler, Elliot Akama-Garren, Kerry Pierce, Noga Ron-Harel, Vivian Paraskevi Douglas, Marcia Haigis, Raymond A. Sobel, Clary Clish, Aviv Regev, Vijay K. Kuchroo, and Nir Yosef, (Wagner et al., 2020).
All publications, published patent documents, and patent applications cited herein are hereby incorporated by reference to the same extent as though each individual publication, published patent document, or patent application was specifically and individually indicated as being incorporated by reference.
Embodiments disclosed herein provide methods of shifting T cell balance in a population of cells comprising T cells and therapeutic compositions thereof. Embodiments disclosed herein also provide for methods of treating inflammatory diseases and autoimmune responses. In certain embodiments, T cell differentiation is shifted towards or away from Th17 cell gene expression, is shifted towards or away from Treg cell gene expression, and/or is shifted towards or away from Th1 cell gene expression. In certain embodiments, T cell balance is shifted by contacting the T cells with a polyamine, polyamine analogue or an agent capable of modulating the polyamine pathway. In certain embodiments, T cell balance is shifted by contacting the T cells with a drug targeting a reaction in the glycolysis pathway.
Cellular metabolism is a powerful regulator of immune responses. Th17 cells become very proliferative and active after they are stimulated by an antigen, and this transition depends on a metabolic shift—from oxidative phosphorylation to glycolysis. This shift makes them divergent from immunosuppressive T cells that remain dependent on fatty acid oxidation and the TCA cycle (see, e.g., O'Sullivan & L Pearce, 2014, Fatty acid synthesis tips the TH17-Treg cell balance, Nature Medicine volume 20, pages 1235-1236). For example, Tregs are dependent on fatty acid oxidation and oxidative phosphorylation and Th17 cells are dependent on de-novo fatty acid synthesis and glycolysis.
Applicants identified the polyamine pathway and glycolysis pathway as associated with Th17 cell pathogenicity using both a novel algorithm (COMPASS) and fluxomics/metabolomics. Applicants analyzed metabolic pathways associated with Th17 pathogenicity using COMPASS, a computational algorithm used to characterize the metabolic landscape of single cells based on single-cell RNA-Seq profiles and flux balance analysis. Applicants used COMPASS to characterize the metabolic heterogeneity in Th17 cells, whose pathogenic state triggers autoimmunity, yet whose non-pathogenic form promotes tissue homeostasis and barrier functions. COMPASS recovered known metabolic switches and predicted that the polyamine pathway should be a novel, powerful regulator of Th17 pathogenicity. Applicants validated the pathway's effect through an array of transcriptome, LC/MS metabolome, and functional assays. Deletion of polyamine enzymes in T cells resulted in altered metabolic space, T cell functions and, most importantly, aggravated symptoms in EAE, a murine model of multiple sclerosis. Further, Applicants identified for the first time that treatment of T cells with polyamines and polyamine analogues altered T cell balance. Applicants showed that inhibition of the polyamine pathway by a drug, DFMO, in Th17 cells is effective in reducing canonical Th17 genes and shifts Th17 cells to a Treg-like transcriptome. DFMO specifically reduced accessibility in regions specific to Th17 cells in ATAC-seq. Applicants showed that DFMO is effective in reducing EAE. DFMO reduces the expression of Sat1, an enzyme involved in the polyamine pathway, and Applicants showed that conditional deletion of Sat1 in T cells resulted in increased Treg frequency, delayed EAE onset and reduced severity, similar to DFMO treatment. Applicants also identified that polyamines are significantly upregulated in MS patients and in IBD patients.
Applicants identified that inhibitors of glycolysis pathway enzymes could also shift Th17 pathogenicity.
Cellular metabolism can orchestrate immune cell function. Previously it was demonstrated that lipid biosynthesis represents one such gatekeeper to Th17 cell functional state. Utilizing a transcriptome-based in silico fluxomics tool, Applicants constructed a comprehensive metabolic circuitry in association with Th17 cell function and identified the polyamine pathway as a candidate metabolic node, the flux of which regulates the inflammatory function of T cells. Indeed, expression and activities of enzymes of the polyamine pathway are suppressed in regulatory T cells and Th17 cells at the regulatory state. Perturbation of the polyamine pathway in Th17 cells suppressed canonical Th17 cell cytokines and promoted the expression of Foxp3, accompanied by a dramatic shift in transcriptome and epigenome, transitioning Th17 cells into a Treg-like state in a cMaf-dependent manner. Importantly, genetic and molecular perturbation of the polyamine pathway resulted in attenuation of autoimmune inflammation in the EAE model.
Cellular metabolism is a powerful modulator of immune response that can now be studied through the lens of single-cell RNA-Seq. However, single-cell analysis requires novel computational methods to address its unique challenges and unlock its unique potential. Here, Applicants present Compass, an algorithm to characterize the metabolic landscape of single cells based on single-cell RNA-Seq profiles and flux balance analysis. Applicants used Compass to study the landscape of metabolic heterogeneity in T helper 17 (Th17) cells and search for novel metabolic regulators of their immune functions. Compass recovered known metabolic switches but surprisingly predicted a glycolytic reaction (phosphoglycerate mutase) that, contrary to common immunometabolic understanding of glycolysis, promotes an anti-inflammatory phenotype. Applicants validated the predicted effects through an array of transcriptome, LC/MS and 13C-traced metabolome, and functional assays. While the study is concerned with Th17 cells, Compass is generally applicable, and can be used to characterize the metabolic states of any cell population based on its single-cell transcriptome profiles.
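Flux balance analysis, on which Compass is described as building, rests on a steady-state assumption: for a stoichiometric matrix S and a flux vector v, every internal metabolite is produced as fast as it is consumed, so S·v = 0. The sketch below checks only that mass-balance condition on a hypothetical three-reaction toy network. The network, function names, and tolerance are assumptions for illustration; this is neither an optimizer nor the Compass algorithm itself.

```python
def net_production(stoichiometry, fluxes):
    """Return S·v, the net production rate of each metabolite (one per row)."""
    return [
        sum(coefficient * flux for coefficient, flux in zip(row, fluxes))
        for row in stoichiometry
    ]


def is_steady_state(stoichiometry, fluxes, tol=1e-9):
    """True if every metabolite's net production is zero within `tol`."""
    return all(abs(rate) <= tol for rate in net_production(stoichiometry, fluxes))


# Hypothetical toy network (columns are reactions):
#   R1: (uptake) -> A      R2: A -> B      R3: B -> (export)
S = [
    [1, -1, 0],  # metabolite A: produced by R1, consumed by R2
    [0, 1, -1],  # metabolite B: produced by R2, consumed by R3
]

print(is_steady_state(S, [1.0, 1.0, 1.0]))  # all three fluxes balanced
print(is_steady_state(S, [1.0, 2.0, 1.0]))  # R2 consumes A faster than R1 supplies it
```

A full flux balance analysis would additionally impose flux bounds and optimize an objective over the set of flux vectors that pass this check, typically with a linear programming solver.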
T lymphocytes include a variety of T cell types, e.g., Th17, regulatory T cells (Tregs), Treg-like cells, Th1 cells or Th1-like cells, or naïve T cells. As used herein, terms such as “Th17 cell” and/or “Th17 phenotype” and all grammatical variations thereof refer to a differentiated T helper cell that expresses one or more cytokines selected from the group consisting of interleukin 17A (IL-17A), interleukin 17F (IL-17F), and interleukin 17A/F heterodimer (IL17-AF). As used herein, terms such as “Th1 cell” and/or “Th1 phenotype” and all grammatical variations thereof refer to a differentiated T helper cell that expresses interferon gamma (IFNγ). As used herein, terms such as “Th2 cell” and/or “Th2 phenotype” and all grammatical variations thereof refer to a differentiated T helper cell that expresses one or more cytokines selected from the group consisting of interleukin 4 (IL-4), interleukin 5 (IL-5) and interleukin 13 (IL-13). As used herein, terms such as “Treg cell” and/or “Treg phenotype” and all grammatical variations thereof refer to a differentiated T cell that expresses Foxp3. “Naïve T cells” and/or “naïve T cell phenotype” and all grammatical variations thereof as used herein refer to T cells that are typically unable to produce proinflammatory cytokines and are precursors for T-effector subsets. Naïve T cells typically lack expression of markers of previous activation, such as, for example, CD25, CD44, CD69, CD45RO, or HLA-DR (see, e.g., T. Eagar and S. Miller, Helper T-Cell Subsets and Control of the Inflammatory Response, Clinical Immunology (Fifth Edition), 2019).
The invention also provides compositions and methods for modulating T cell balance. The invention provides T cell modulating agents that modulate T cell balance. For example, in some embodiments, the invention provides T cell modulating agents and methods of using these T cell modulating agents to regulate, influence, shift or otherwise impact the level of and/or balance between T cell types, e.g., between Th17 and other T cell types, for example, regulatory T cells (Tregs), Treg-like cells, Th1 cells or Th1-like cells. For example, in some embodiments, the invention provides T cell modulating agents and methods of using these T cell modulating agents to regulate, influence, shift, or otherwise impact the level of and/or balance between Th17 activity and inflammatory potential. Shifting the balance in a population of cells comprising T cells can comprise a change in T cell differentiation. T cell differentiation can shift towards non-pathogenic Th17 cells, Th1 cells, or Treg cells, and/or can shift away from pathogenic Th17 cells, Treg cells, or Th1 cells. Methods of shifting the T cell balance can comprise differentiation of naïve T cells into Th17 cells, Th1 cells and/or Treg cells.
As used herein, terms such as “pathogenic Th17 cell” and/or “pathogenic Th17 phenotype” and all grammatical variations thereof refer to Th17 cells that, when induced in the presence of TGF-β3 or TGF-β1+IL-6+IL-23, express an elevated level of one or more genes selected from Cxcl3, IL22, IL3, Ccl4, Gzmb, Lrmp, Ccl5, Casp1, Csf2, Ccl3, Tbx21, Icos, IL17r, Stat4, Lgals3 and Lag, as compared to the level of expression in TGF-β1+IL-6-induced Th17 cells. As used herein, terms such as “non-pathogenic Th17 cell” and/or “non-pathogenic Th17 phenotype” and all grammatical variations thereof refer to Th17 cells that, when induced in the presence of TGF-β1+IL-6, express an increased level of one or more genes selected from IL6st, IL1rn, Ikzf3, Maf, Ahr, IL9 and IL10, as compared to the level of expression in TGF-β3-induced or TGF-β1+IL-6+IL-23-induced Th17 cells.
Depending on the cytokines used for differentiation (pathogenic conditions are TGF-β3 or TGF-β1+IL-6+IL-23 and non-pathogenic conditions are TGF-β1+IL-6), in vitro polarized Th17 cells can either cause severe autoimmune responses upon adoptive transfer (‘pathogenic Th17 cells’) or have little or no effect in inducing autoimmune disease (‘non-pathogenic cells’) (Ghoreschi et al., 2010; Lee et al., 2012). In vitro differentiation of naïve CD4 T cells in the presence of TGF-β1+IL-6 induces an IL-17A- and IL-10-producing population of Th17 cells that is generally nonpathogenic, whereas activation of naïve T cells in the presence of IL-1β+IL-6+IL-23 induces a T cell population that produces IL-17A and IFN-γ and is a potent inducer of autoimmune disease (Ghoreschi et al., 2010).
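By way of non-limiting illustration, the comparison underlying the pathogenic and non-pathogenic definitions above can be sketched in Python. The gene lists are those recited in the definitions; the function name, the dictionary-based expression inputs, the example expression values, and the 1.5-fold cutoff are hypothetical choices made only for this sketch, since the definitions require an "elevated" or "increased" level without fixing a numerical threshold.

```python
# Illustrative sketch only: the function name, input format and fold cutoff
# are assumptions, not part of the definitions in the text.
PATHOGENIC_GENES = {
    "Cxcl3", "IL22", "IL3", "Ccl4", "Gzmb", "Lrmp", "Ccl5", "Casp1",
    "Csf2", "Ccl3", "Tbx21", "Icos", "IL17r", "Stat4", "Lgals3", "Lag",
}
NON_PATHOGENIC_GENES = {"IL6st", "IL1rn", "Ikzf3", "Maf", "Ahr", "IL9", "IL10"}


def elevated_signature_genes(test_expr, reference_expr, signature, min_fold=1.5):
    """Return signature genes expressed at least min_fold higher in test_expr.

    Both inputs map gene symbol -> expression level (linear, arbitrary units).
    Genes missing from either input, or with a non-positive reference level,
    are skipped rather than guessed at.
    """
    hits = set()
    for gene in signature:
        test = test_expr.get(gene)
        ref = reference_expr.get(gene)
        if test is None or ref is None or ref <= 0:
            continue
        if test / ref >= min_fold:
            hits.add(gene)
    return sorted(hits)


# Hypothetical expression values, made up for the example.
non_pathogenic_condition = {"Gzmb": 10.0, "Csf2": 5.0, "IL10": 40.0, "Maf": 30.0}
pathogenic_condition = {"Gzmb": 35.0, "Csf2": 20.0, "IL10": 8.0, "Maf": 28.0}

pathogenic_hits = elevated_signature_genes(
    pathogenic_condition, non_pathogenic_condition, PATHOGENIC_GENES
)
non_pathogenic_hits = elevated_signature_genes(
    non_pathogenic_condition, pathogenic_condition, NON_PATHOGENIC_GENES
)
print(pathogenic_hits)
print(non_pathogenic_hits)
```

In this sketch the same function serves both definitions; only the direction of the comparison and the gene list change.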
A dynamic regulatory network controls Th17 differentiation (See e.g., Yosef et al., Dynamic regulatory network controlling Th17 cell differentiation, Nature, vol. 496: 461-468 (2013); Wang et al., CD5L/AIM Regulates Lipid Biosynthesis and Restrains Th17 Cell Pathogenicity, Cell Volume 163, Issue 6, p1413-1427, 3 Dec. 2015; Gaublomme et al., Single-Cell Genomics Unveils Critical Regulators of Th17 Cell Pathogenicity, Cell Volume 163, Issue 6, p1400-1412, 3 Dec. 2015; and International Patent Publication Nos. WO2016138488A2, WO2015130968, WO/2012/048265, WO/2014/145631 and WO/2014/134351 the contents of which are hereby incorporated herein by reference in their entirety). Accordingly, shifting the T cell balance in a population of cells may include contacting the population of cells with IL-6 and TGF-β1 or IL-1β, IL-6, and IL-23. In certain embodiments, the IL-6 and TGF-β1 or IL-1β, IL-6, and IL-23 supplement a cell culture media. In one embodiment, the administration of the agents differentiates naïve T cells into Th17 cells. Optionally, the agents are administered to the population of cells during differentiation.
As used herein, a population of cells contacted with one or more agents can be in vivo or in vitro or ex vivo.
As used herein, the term “polyamine” refers to an organic compound having more than two amino groups. Polyamines are naturally occurring polycations that are required for cell growth, and manipulation of cellular polyamine levels can lead to decreased proliferation, and, in some cases, increased cell death. Natural polyamine biosynthesis is regulated by the rate-limiting enzymes ornithine decarboxylase (ODC) and S-Adenosylmethionine decarboxylase (SAMDC), while polyamine catabolism is driven by spermidine/spermine N1-acetyltransferase/polyamine oxidase (SSAT/PAO) and spermine oxidase SMO(PAOh1). (See, e.g., Huang et al., Cancer Biol Ther. 2005 September; 4(9): 1006-1013).
In certain embodiments, genes and polypeptides belonging to the polyamine pathway are modulated or targeted. All gene name symbols as used herein refer to the gene as commonly known in the art. The examples described herein that refer to the mouse gene names are to be understood to also encompass human genes, as well as genes in any other organism (e.g., homologous, orthologous genes). The term homolog may apply to the relationship between genes separated by the event of speciation (e.g., ortholog). Orthologs are genes in different species that evolved from a common ancestral gene by speciation. Normally, orthologs retain the same function in the course of evolution. Gene symbols may be those referred to by the HUGO Gene Nomenclature Committee (HGNC) or National Center for Biotechnology Information (NCBI). Any reference to the gene symbol is a reference made to the entire gene or variants of the gene. The signature as described herein may encompass any of the genes described herein.
The gene name SAT1, SSAT-1, SSAT, SAT, Spermidine/Spermine N1-Acetyltransferase 1, Polyamine N-Acetyltransferase 1, Diamine N-Acetyltransferase 1, Putrescine Acetyltransferase, Spermidine/Spermine N1-Acetyltransferase Alpha, Spermidine/Spermine N(1)-Acetyltransferase 1, Spermidine/Spermine N1-Acetyltransferase, Diamine Acetyltransferase 1, EC 2.3.1.57, KFSDX, DC21, and KFSD may refer to the gene or polypeptide according to NCBI Reference Sequence accession numbers NM_002970.3 and NM_009121.4. SAT1 is a highly regulated enzyme that allows a fine attenuation of the intracellular concentration of polyamines. SAT1 is also involved in the regulation of polyamine transport out of cells. SAT1 acts on 1,3-diaminopropane, 1,5-diaminopentane, putrescine, spermidine (forming N(1)- and N(8)-acetyl spermidine), spermine, N(1)-acetyl spermidine and N(8)-acetyl spermidine. As described further herein, SAT1 is a top-ranking gene associated with Th17 pathogenicity and SAT1 activity is associated with pathogenicity.
As used herein, “modulating” or “to modulate” generally means either reducing or inhibiting the expression or activity of, or alternatively increasing the expression or activity of a target (e.g., polyamine pathway). In particular, “modulating” or “to modulate” can mean either reducing or inhibiting the activity of, or alternatively increasing a (relevant or intended) biological activity of, a target or antigen as measured using a suitable in vitro, cellular or in vivo assay (which will usually depend on the target involved), by at least 5%, at least 10%, at least 25%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or more, compared to activity of the target in the same assay under the same conditions but without the presence of an agent. An “increase” or “decrease” refers to a statistically significant increase or decrease respectively. For the avoidance of doubt, an increase or decrease will be at least 10% relative to a reference, such as at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 97%, at least 98%, or more, up to and including at least 100% or more, in the case of an increase, for example, at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, at least 10-fold, at least 50-fold, at least 100-fold, or more. “Modulating” can also involve effecting a change (which can either be an increase or a decrease) in affinity, avidity, specificity and/or selectivity of a target or antigen, such as polyamine pathway enzyme binding. 
“Modulating” can also mean effecting a change with respect to one or more biological or physiological mechanisms, effects, responses, functions, pathways or activities in which the target or antigen (or in which its substrate(s), ligand(s) or pathway(s) are involved, such as its signaling pathway or metabolic pathway and their associated biological or physiological effects) is involved. Again, as will be clear to the skilled person, such an action as an agonist or an antagonist can be determined in any suitable manner and/or using any suitable assay known or described herein (e.g., in vitro or cellular assay), depending on the target or antigen involved.
Modulating can, for example, also involve allosteric modulation of the target and/or reducing or inhibiting the binding of the target to one of its substrates or ligands and/or competing with a natural ligand or substrate for binding to the target. Modulating can also involve activating the target or the mechanism or pathway in which it is involved. Modulating can, for example, also involve effecting a change in respect of the folding or conformation of the target, or in respect of the ability of the target to fold, to change its conformation (for example, upon binding of a ligand), to associate with other (sub)units, or to disassociate. Modulating can, for example, also involve effecting a change in the ability of the target to signal, phosphorylate, dephosphorylate, and the like.
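By way of non-limiting illustration, the percentage and fold thresholds recited in the definition of “modulating” can be expressed as simple arithmetic relative to a reference measurement. The following Python sketch uses hypothetical function names and assumes both measurements come from the same assay under the same conditions; it does not assess statistical significance, which the definition of an “increase” or “decrease” additionally requires.

```python
# Illustrative sketch only: function names are hypothetical, and statistical
# significance (required by the definition in the text) is not evaluated here.

def percent_change(test_value, reference_value):
    """Signed percent change of test_value relative to reference_value."""
    if reference_value == 0:
        raise ValueError("reference_value must be non-zero")
    return (test_value - reference_value) * 100.0 / reference_value


def fold_change(test_value, reference_value):
    """Ratio of test_value to reference_value (e.g. 3.0 means 3-fold)."""
    if reference_value == 0:
        raise ValueError("reference_value must be non-zero")
    return test_value / reference_value


def meets_threshold(test_value, reference_value, min_percent=10.0):
    """True if the magnitude of the percent change is at least min_percent."""
    return abs(percent_change(test_value, reference_value)) >= min_percent


# Hypothetical activity readouts with and without an agent.
print(percent_change(120.0, 100.0))   # increase relative to the reference
print(meets_threshold(95.0, 100.0))   # a 5% decrease does not reach 10%
print(fold_change(300.0, 100.0))      # increase expressed as a fold value
```

The 10% default mirrors the "at least 10% relative to a reference" language above; any of the other recited thresholds can be passed as `min_percent`.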
In certain embodiments, a T cell modulating agent comprises a polyamine analogue. Polyamine analogues have been synthesized as metabolic modulators that deplete natural intracellular polyamine pools, or polyamine mimetics that displace the natural polyamines from binding sites, but do not substitute for their growth promoting function. Symmetrically substituted bis(alkyl)polyamine analogues represent the first generation of these analogues, some of which downregulate polyamine biosynthesis and increase SSAT activity in certain tumor cell types like non-small cell lung cancer cells, melanoma and human breast cancer cells. A second generation of polyamine analogues are unsymmetrically substituted compounds that display structure-dependent and cell type-specific effects on regulation of polyamine metabolism. Recently, a series of new polyamine analogues designated conformationally restricted, cyclic and oligoamine analogues have been developed. Some of these agents incorporate alterations that limit the free rotation of the single bonds in otherwise flexible molecules such as spermine or its analogues, thus restricting the molecular conformation that they may assume. Oligoamine analogues consist of synthetic octa-, deca-, dodeca- and tetradecamines with longer chains than natural mammalian polyamine molecules, with or without conformational restriction. Some of these novel analogues have shown significant activity against multiple human tumors both in vitro and in vivo (See, e.g., Huang et al., Cancer Biol Ther. 2005 September; 4(9): 1006-1013).
The fluorinated ornithine analog α-difluoromethylornithine (DFMO, eflornithine, alpha-difluoromethylornithine, Ornidyl®, Vaniqa®) is an FDA approved irreversible suicide inhibitor of ornithine decarboxylase (ODC), the first and rate-limiting enzyme of polyamine biosynthesis (see, e.g., LoGiudice et al., Alpha-Difluoromethylornithine, an Irreversible Inhibitor of Polyamine Biosynthesis, as a Therapeutic Strategy against Hyperproliferative and Infectious Diseases. Med. Sci. 2018, 6(1), 12; US20170273926A1). DFMO is a structural analog of the amino acid L-ornithine and has a chemical formula C6H12N2O2F2. DFMO can be employed in the methods of the invention as a racemic (50/50) mixture of D- and L-enantiomers, or as a mixture of D- and L-isomers where the D-isomer is enriched relative to the L-isomer, for example, 70%, 80%, 90% or more by weight of the D-isomer relative to the L-isomer. The DFMO employed may also be substantially free of the L-enantiomer.
The initial promise of DFMO as a therapeutic ODC inhibitor for use in the treatment of various neoplasias has failed to translate into the clinic because, although DFMO does, in fact, irreversibly inhibit ODC activity, cells treated in vivo with DFMO significantly increase their uptake of exogenous putrescine as described in U.S. Pat. No. 4,925,835.
The use of eflornithine (DFMO) is disclosed in U.S. Pat. No. 6,653,351. U.S. Pat. No. 6,277,411 discloses formulations for the administration of eflornithine, including a core having rapid release DFMO-containing granules and slow release granules, and an outer layer surrounding the core comprising a pH responsive coating.
In certain embodiments, DFMO can be administered either orally or by injection, such as intravenously or intraperitoneally. In certain embodiments, the daily dose of DFMO is about 3.0 to 9.0 g/m2 given in three equal administrations every eight hours. In other embodiments, the dose of eflornithine may be varied considering the treatment and condition of the subject. Such modifications of dosage are generally routine to one of skill in the art. The forms of eflornithine include both isolated L-eflornithine and D-eflornithine, as well as a racemic mixture of L- and D-eflornithine. A higher dose of the D-form may be utilized, such as about 20 g/m2, about 30 g/m2, about 40 g/m2, or about 50 g/m2. Strategies to make DFMO more acceptable to human patients are described in U.S. Pat. No. 4,859,452. Formulations of DFMO are described which include essential amino acids in combination with either arginine or ornithine to help reduce DFMO-induced toxicities.
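By way of non-limiting illustration, the arithmetic implied by a daily dose expressed in g/m2 and divided into three equal administrations can be sketched as follows. The function name and the body surface area value are hypothetical and are chosen only so that the example numbers are easy to follow; the sketch shows the unit conversion and is not a dosing recommendation.

```python
# Illustrative arithmetic only: the function name and the body surface area
# used in the example are assumptions, not values taken from the text.

def dose_per_administration(daily_dose_g_per_m2, bsa_m2, administrations_per_day=3):
    """Grams per administration when a daily g/m2 dose is split equally."""
    if daily_dose_g_per_m2 <= 0 or bsa_m2 <= 0 or administrations_per_day <= 0:
        raise ValueError("all inputs must be positive")
    daily_total_g = daily_dose_g_per_m2 * bsa_m2
    return daily_total_g / administrations_per_day


example_bsa_m2 = 2.0  # hypothetical body surface area
low = dose_per_administration(3.0, example_bsa_m2)   # lower end of about 3.0 g/m2
high = dose_per_administration(9.0, example_bsa_m2)  # upper end of about 9.0 g/m2
print(low, high)
```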
In certain embodiments, a histone demethylation agent is used to modulate Th17/Treg balance. A non-limiting example inhibitor is GSK-J1 (C22H23N5O2) (see, e.g., Kruidenier et al (2012) A selective jumonji H3K27 demethylase inhibitor modulates the proinflammatory macrophage response. Nature 488 404; and Heinemann et al (2014) Inhibition of demethylases by GSK-J1/J4. Nature 514 E1). GSK-J1 is a potent inhibitor of the H3K27 histone demethylases JMJD3 (KDM6B) and UTX (KDM6A) (IC50 values are 28 and 53 nM respectively). GSK-J1 also inhibits KDM5B, KDM5C and KDM5A (IC50 values are 170, 550 and 6,800 nM respectively). GSK-J1 exhibits no activity against a panel of other histone demethylases (IC50>20 μM), and displays no significant inhibitory activity against 100 protein kinases at a concentration of 30 μM.
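By way of non-limiting illustration, the IC50 values recited above can be compared as fold selectivity relative to the most potently inhibited target. In the following Python sketch the IC50 values are those stated in the text, while the dictionary layout and the function name are hypothetical.

```python
# Illustrative sketch only: IC50 values (nM) are those recited in the text;
# the data layout and function name are assumptions.
GSK_J1_IC50_NM = {
    "JMJD3 (KDM6B)": 28.0,
    "UTX (KDM6A)": 53.0,
    "KDM5B": 170.0,
    "KDM5C": 550.0,
    "KDM5A": 6800.0,
}


def fold_selectivity(ic50_by_target, reference_target):
    """IC50 of each target divided by the IC50 of reference_target.

    A value above 1 means the target is inhibited less potently than the
    reference target.
    """
    reference_ic50 = ic50_by_target[reference_target]
    return {target: ic50 / reference_ic50 for target, ic50 in ic50_by_target.items()}


selectivity = fold_selectivity(GSK_J1_IC50_NM, "JMJD3 (KDM6B)")
for target, fold in sorted(selectivity.items(), key=lambda item: item[1]):
    print(f"{target}: {fold:.1f}-fold")
```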
In certain embodiments, the one or more agents is a small molecule. The term “small molecule” refers to compounds, preferably organic compounds, with a size comparable to those organic molecules generally used in pharmaceuticals. The term excludes biological macromolecules (e.g., proteins, peptides, nucleic acids, etc.). Preferred small organic molecules range in size up to about 5000 Da, e.g., up to about 4000, preferably up to 3000 Da, more preferably up to 2000 Da, even more preferably up to about 1000 Da, e.g., up to about 900, 800, 700, 600 or up to about 500 Da. In certain embodiments, the small molecule may act as an antagonist or agonist (e.g., blocking an enzyme active site or activating a receptor by binding to a ligand binding site).
One type of small molecule applicable to the present invention is a degrader molecule. Proteolysis Targeting Chimera (PROTAC) technology is a rapidly emerging alternative therapeutic strategy with the potential to address many of the challenges currently faced in modern drug development programs. PROTAC technology employs small molecules that recruit target proteins for ubiquitination and removal by the proteasome (see, e.g., Zhou et al., Discovery of a Small-Molecule Degrader of Bromodomain and Extra-Terminal (BET) Proteins with Picomolar Cellular Potencies and Capable of Achieving Tumor Regression. J. Med. Chem. 2018, 61, 462-481; Bondeson and Crews, Targeted Protein Degradation by Small Molecules, Annu Rev Pharmacol Toxicol. 2017 Jan. 6; 57: 107-123; and Lai et al., Modular PROTAC Design for the Degradation of Oncogenic BCR-ABL Angew Chem Int Ed Engl. 2016 Jan. 11; 55(2): 807-810).
In certain embodiments, the small molecule inhibits an enzyme in the polyamine pathway. In certain embodiments, the small molecule includes, but is not limited to diminazene aceturate (Berenil) (PMID: 1510731) (inhibitor of SAT1), trans-4-methyl cyclohexyl amine (MCHA) (spermidine synthase inhibitor), or N-(3-aminopropyl)cyclohexylamine (APCHA) (spermine synthase inhibitor).
In certain embodiments, the small molecule targets an enzyme in the glycolysis pathway. The small molecules may modulate the activity or function of a gene or gene product selected from the group consisting of: PGAM, G6PD, PKM, Aldo, PFKM, TA, G6PC, GK, ENO1, PCK1, TPI1, PGK1, GAPDHS, PDHA1, and GPD1. Small molecules known to inhibit the enzymes include EGCG (see, e.g., Nagle, et al., Epigallocatechin-3-gallate (EGCG): Chemical and biomedical perspectives, Phytochemistry. 2006 September; 67(17): 1849-1855), 2,5-Anhydro-D-glucitol-1,6-diphosphate (see, e.g., US20180133192A1), S-hexadecyl-CoA (S-HD-CoA) (see, e.g., Jenkins et al., Reversible High Affinity Inhibition of Phosphofructokinase-1 by Acyl-CoA: A Mechanism Integrating Glycolytic Flux with Lipid Metabolism, J Biol Chem. 2011 Apr 8; 286(14): 11937-11950), DHEA (see, e.g., Schwartz and Pashko, Dehydroepiandrosterone, glucose-6-phosphate dehydrogenase, and longevity. Ageing Res Rev. 2004 Apr; 3(2):171-87), polydatin (see, e.g., Mele, et al., A new inhibitor of glucose-6-phosphate dehydrogenase blocks pentose phosphate pathway and suppresses malignant proliferation and metastasis in vivo, Cell Death & Disease volume 9, Article number: 572 (2018)), TX1 (see, e.g., Stancu, et al., fasebj.31.1_supplement.921.1; and Cho, et al., A Fluorescence-Based High-Throughput Assay for the Identification of Anticancer Reagents Targeting Fructose-1,6-Bisphosphate Aldolase. SLAS Discov. 2018 January; 23(1):1-10), Gimeracil (see, e.g., Sakata, et al., Gimeracil, an inhibitor of dihydropyrimidine dehydrogenase, inhibits the early step in homologous recombination. Cancer Sci. 2011 September; 102(9):1712-6), Shikonin (see, e.g., Wang, et al., PKM2 Inhibitor Shikonin Overcomes the Cisplatin Resistance in Bladder Cancer by Inducing Necroptosis. Int J Biol Sci. 2018 Oct 20; 14(13):1883-1891), Pyruvate Kinase Inhibitor III (see, e.g., Vander Heiden, M. G., et al. 2010. Biochem. Pharmacol. 79, 1118), 2,3-dihydroxypropyl dichloroacetate (DCA) (see, e.g., Tisdale and Threadgill, (+/−)2,3-Dihydroxypropyl dichloroacetate, an inhibitor of glycerol kinase. Cancer Biochem Biophys. 1984 September; 7(3):253-9), 2,9-Dimethyl-BC (see, e.g., Bonnet, et al., The strong inhibition of triosephosphate isomerase by the natural beta-carbolines may explain their neurotoxic actions. Neuroscience. 2004; 127(2):443-53), Koningic acid (see, e.g., Endo A et al. Specific inhibition of glyceraldehyde-3-phosphate dehydrogenase by koningic acid (heptelidic acid). J Antibiot (Tokyo) 38:920-5 (1985)), CBR-470-1 (see, e.g., Bollong, et al., A metabolite-derived protein modification integrates glycolysis with KEAP1-NRF2 signalling. Nature. 2018 October; 562(7728):600-604), SF2312 (see, e.g., Leonard, et al., SF2312 is a natural phosphonate inhibitor of enolase. Nat Chem Biol. 2016 December; 12(12):1053-1058), PhAh (see, e.g., Anderson, et al., “Reaction intermediate analogues for enolase,” Biochemistry, 23(12):2779-2789, 1984), ENOblock (see, e.g., Cho, et al., ENOblock, a unique small molecule inhibitor of the non-glycolytic functions of enolase, alleviates the symptoms of type 2 diabetes. Sci Rep. 2017 Mar. 8; 7:44186), 3-MPA (see, e.g., Ma, et al. A Pck1-directed glycogen metabolic program regulates formation and maintenance of memory CD8+ T cells. Nat Cell Biol. (2018) 20:21-7), and 6,8-Bis(benzylthio)octanoic acid (see, e.g., U.S. Ser. No. 10/391,177B2). Shikonin inhibits PKM2, dehydroepiandrosterone (DHEA) inhibits G6PD, epigallocatechin-3-gallate (EGCG) inhibits PGAM1, and 2,3-dihydroxypropyl-dichloroacetate (DCA) inhibits GK.
In certain embodiments, the one or more modulating agents may be a genetic modifying agent. The genetic modifying agent may comprise a CRISPR system, a zinc finger nuclease system, a TALEN, a meganuclease or RNAi system. In some embodiments, a polynucleotide of the present invention described elsewhere herein can be modified using a genetic modifying agent (e.g., one or more target genes are selected from SAT1, ODC1, SRM, SMS, JMJD3, POU2F2, POU2F1, POU5F1B, POU3F4, POU1F1, POU3F2, POU3F3, POU4F2, POU2F3, POU3F1, POU4F1, NFAT5, NFATC2, c-MAF and BATF; or one or more target genes are selected from PGAM, G6PD, PKM, Aldo, PFKM, TA, G6PC, GK, ENO1, PCK1, TPI1, PGK1, GAPDHS, PDHA1, and GPD1; or a combination of one or more of the genes). In preferred embodiments, modulation of expression of a gene using a genetic modifying agent (e.g., enzyme) is temporary (e.g., modulated for a period of time to shift T cell balance without adverse effects). Temporary modulation may be achieved by targeting RNA (e.g., RNA targeting CRISPR system, RNAi) or by targeting regulatory elements (e.g., CRISPRa/i).
In some embodiments, a polynucleotide of the present invention described elsewhere herein can be modified using a CRISPR-Cas and/or Cas-based system.
In general, a CRISPR-Cas or CRISPR system as used herein and in documents, such as International Patent Publication No. WO 2014/093622 (PCT/US2013/074667), refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene, a tracr (transactivating CRISPR) sequence (e.g. tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), or “RNA(s)” as that term is herein used (e.g., RNA(s) to guide Cas, such as Cas9, e.g. CRISPR RNA and transactivating (tracr) RNA or a single guide RNA (sgRNA) (chimeric RNA)) or other sequences and transcripts from a CRISPR locus. In general, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system). See, e.g., Shmakov et al. (2015) “Discovery and Functional Characterization of Diverse Class 2 CRISPR-Cas Systems”, Molecular Cell, DOI: dx.doi.org/10.1016/j.molcel.2015.10.008.
CRISPR-Cas systems generally fall into two classes based on the architecture of their effector modules, and each class is further subdivided by type and subtype. The two classes are Class 1 and Class 2. Class 1 CRISPR-Cas systems have effector modules composed of multiple Cas proteins, some of which form crRNA-binding complexes, while Class 2 CRISPR-Cas systems include a single, multi-domain crRNA-binding protein.
In some embodiments, the CRISPR-Cas system that can be used to modify a polynucleotide of the present invention described herein can be a Class 1 CRISPR-Cas system. In some embodiments, the CRISPR-Cas system that can be used to modify a polynucleotide of the present invention described herein can be a Class 2 CRISPR-Cas system.
Class 1 CRISPR-Cas systems are divided into Types I, III, and IV, as described in Makarova et al. 2020. Nat. Rev. Microbiol. 18: 67-83.
The Class 1 systems typically use a multi-protein effector complex, which can, in some embodiments, include ancillary proteins, such as one or more proteins in a complex referred to as a CRISPR-associated complex for antiviral defense (Cascade), one or more adaptation proteins (e.g., Cas1, Cas2, RNA nuclease), and/or one or more accessory proteins (e.g., Cas4, DNA nuclease), CRISPR associated Rossmann fold (CARF) domain containing proteins, and/or RNA transcriptase.
The backbone of the Class 1 CRISPR-Cas system effector complexes can be formed by RNA recognition motif domain-containing protein(s) of the repeat-associated mysterious proteins (RAMPs) family subunits (e.g., Cas5, Cas6, and/or Cas7). RAMP proteins are characterized by having one or more RNA recognition motif domains. In some embodiments, multiple copies of RAMPs can be present. In some embodiments, the Class 1 CRISPR-Cas system can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more Cas5, Cas6, and/or Cas7 proteins. In some embodiments, the Cas6 protein is an RNase, which can be responsible for pre-crRNA processing. When present in a Class 1 CRISPR-Cas system, Cas6 can be optionally physically associated with the effector complex.
Class 1 CRISPR-Cas system effector complexes can, in some embodiments, also include a large subunit. The large subunit can be composed of or include a Cas8 and/or Cas10 protein.
Class 1 CRISPR-Cas system effector complexes can, in some embodiments, include a small subunit (for example, Cas11).
In some embodiments, the Class 1 CRISPR-Cas system can be a Type I CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-A CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-B CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-C CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-D CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-E CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-F1 CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-F2 CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-F3 CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-G CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a CRISPR-Cas variant, such as Type I-A, I-B, I-E, I-F and I-U variants, which can include variants carried by transposons and plasmids, including versions of subtype I-F encoded by a large family of Tn7-like transposons and smaller groups of Tn7-like transposons that encode similarly degraded subtype I-B systems as previously described.
In some embodiments, the Class 1 CRISPR-Cas system can be a Type III CRISPR-Cas system. In some embodiments, the Type III CRISPR-Cas system can be a subtype III-A CRISPR-Cas system. In some embodiments, the Type III CRISPR-Cas system can be a subtype III-B CRISPR-Cas system. In some embodiments, the Type III CRISPR-Cas system can be a subtype III-C CRISPR-Cas system. In some embodiments, the Type III CRISPR-Cas system can be a subtype III-D CRISPR-Cas system. In some embodiments, the Type III CRISPR-Cas system can be a subtype III-E CRISPR-Cas system. In some embodiments, the Type III CRISPR-Cas system can be a subtype III-F CRISPR-Cas system.
In some embodiments, the Class 1 CRISPR-Cas system can be a Type IV CRISPR-Cas system. In some embodiments, the Type IV CRISPR-Cas system can be a subtype IV-A CRISPR-Cas system. In some embodiments, the Type IV CRISPR-Cas system can be a subtype IV-B CRISPR-Cas system. In some embodiments, the Type IV CRISPR-Cas system can be a subtype IV-C CRISPR-Cas system.
The effector complex of a Class 1 CRISPR-Cas system can, in some embodiments, include a Cas3 protein that is optionally fused to a Cas2 protein, a Cas4, a Cas5, a Cas6, a Cas7, a Cas8, a Cas10, a Cas11, or a combination thereof. In some embodiments, the effector complex of a Class 1 CRISPR-Cas system can have multiple copies, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, of any one or more Cas proteins.
Class 2 CRISPR-Cas Systems
The compositions, systems, and methods described in greater detail elsewhere herein can be designed and adapted for use with Class 2 CRISPR-Cas systems. Thus, in some embodiments, the CRISPR-Cas system is a Class 2 CRISPR-Cas system. Class 2 systems are distinguished from Class 1 systems in that they have a single, large, multi-domain effector protein. In certain example embodiments, the Class 2 system can be a Type II, Type V, or Type VI system, which are described in Makarova et al. “Evolutionary classification of CRISPR-Cas systems: a burst of class 2 and derived variants” Nature Reviews Microbiology, 18:67-83 (February 2020), incorporated herein by reference. Each type of Class 2 system is further divided into subtypes. See Makarova et al. 2020, particularly at Figure 2. Class 2, Type II systems can be divided into 4 subtypes: II-A, II-B, II-C1, and II-C2. Class 2, Type V systems can be divided into 17 subtypes: V-A, V-B1, V-B2, V-C, V-D, V-E, V-F1, V-F1 (V-U3), V-F2, V-F3, V-G, V-H, V-I, V-K (V-U5), V-U1, V-U2, and V-U4. Class 2, Type VI systems can be divided into 5 subtypes: VI-A, VI-B1, VI-B2, VI-C, and VI-D.
The distinguishing feature of these types is that their effector complexes consist of a single, large, multi-domain protein. Type V systems differ from Type II effectors (e.g., Cas9), which contain two nuclease domains that are each responsible for the cleavage of one strand of the target DNA, with the HNH nuclease inserted inside the RuvC-like nuclease domain sequence. The Type V systems (e.g., Cas12) only contain a RuvC-like nuclease domain that cleaves both strands. Type VI (Cas13) effectors are unrelated to the effectors of Type II and V systems, contain two HEPN domains, and target RNA. Cas13 proteins also display collateral activity that is triggered by target recognition. Some Type V systems have also been found to possess this collateral activity against single-stranded DNA in in vitro contexts.
In some embodiments, the Class 2 system is a Type II system. In some embodiments, the Type II CRISPR-Cas system is a II-A CRISPR-Cas system. In some embodiments, the Type II CRISPR-Cas system is a II-B CRISPR-Cas system. In some embodiments, the Type II CRISPR-Cas system is a II-C1 CRISPR-Cas system. In some embodiments, the Type II CRISPR-Cas system is a II-C2 CRISPR-Cas system. In some embodiments, the Type II system is a Cas9 system. In some embodiments, the Type II system includes a Cas9.
In some embodiments, the Class 2 system is a Type V system. In some embodiments, the Type V CRISPR-Cas system is a V-A CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-B1 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-B2 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-C CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-D CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-E CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F1 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F1 (V-U3) CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F2 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F3 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-G CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-H CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-I CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-K (V-U5) CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-U1 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-U2 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-U4 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system includes a Cas12a (Cpf1), Cas12b (C2c1), Cas12c (C2c3), CasX, and/or Cas14.
In some embodiments, the Class 2 system is a Type VI system. In some embodiments, the Type VI CRISPR-Cas system is a VI-A CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system is a VI-B1 CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system is a VI-B2 CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system is a VI-C CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system is a VI-D CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system includes a Cas13a (C2c2), Cas13b (Group 29/30), Cas13c, and/or Cas13d.
In some embodiments, the system is a Cas-based system that is capable of performing a specialized function or activity. For example, the Cas protein may be fused, operably coupled to, or otherwise associated with one or more functional domains. In certain example embodiments, the Cas protein may be a catalytically dead Cas protein (“dCas”) and/or have nickase activity. A nickase is a Cas protein that cuts only one strand of a double stranded target. In such embodiments, the dCas or nickase provides a sequence specific targeting functionality that delivers the functional domain to or proximate a target sequence. Example functional domains that may be fused to, operably coupled to, or otherwise associated with a Cas protein can be or include, but are not limited to a nuclear localization signal (NLS) domain, a nuclear export signal (NES) domain, a translational activation domain, a transcriptional activation domain (e.g., VP64, p65, MyoD1, HSF1, RTA, and SET7/9), a translation initiation domain, a transcriptional repression domain (e.g., a KRAB domain, NuE domain, NcoR domain, and a SID domain such as a SID4X domain), a nuclease domain (e.g., FokI), a histone modification domain (e.g., a histone acetyltransferase), a light inducible/controllable domain, a chemically inducible/controllable domain, a transposase domain, a homologous recombination machinery domain, a recombinase domain, an integrase domain, and combinations thereof. Methods for generating catalytically dead Cas9 or a nickase Cas9 (WO 2014/204725, Ran et al. Cell. 2013 Sep. 12; 154(6):1380-1389), Cas12 (Liu et al. Nature Communications, 8, 2095 (2017)), and Cas13 (International Patent Publication Nos. WO 2019/005884 and WO 2019/060746) are known in the art and incorporated herein by reference.
In some embodiments, the functional domains can have one or more of the following activities: methylase activity, demethylase activity, translation activation activity, translation initiation activity, translation repression activity, transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, nuclease activity, single-strand RNA cleavage activity, double-strand RNA cleavage activity, single-strand DNA cleavage activity, double-strand DNA cleavage activity, molecular switch activity, chemical inducibility, light inducibility, and nucleic acid binding activity. In some embodiments, the one or more functional domains may comprise epitope tags or reporters. Non-limiting examples of epitope tags include histidine (His) tags, V5 tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags, and thioredoxin (Trx) tags. Examples of reporters include, but are not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, luciferase, green fluorescent protein (GFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and auto-fluorescent proteins including blue fluorescent protein (BFP).
The one or more functional domain(s) may be positioned at, near, and/or in proximity to a terminus of the effector protein (e.g., a Cas protein). In embodiments having two or more functional domains, each of the functional domains can be positioned at or near or in proximity to a terminus of the effector protein (e.g., a Cas protein). In some embodiments, such as those where the functional domain is operably coupled to the effector protein, the one or more functional domains can be tethered or linked via a suitable linker (including, but not limited to, GlySer linkers) to the effector protein (e.g., a Cas protein). When there is more than one functional domain, the functional domains can be the same or different. In some embodiments, all the functional domains are the same. In some embodiments, all of the functional domains are different from each other. In some embodiments, at least two of the functional domains are different from each other. In some embodiments, at least two of the functional domains are the same as each other.
Other suitable functional domains can be found, for example, in International Application Publication No. WO 2019/018423.
In some embodiments, the CRISPR-Cas system is a split CRISPR-Cas system. See e.g., Zetsche et al., 2015. Nat. Biotechnol. 33(2): 139-142 and WO 2019/018423, the compositions and techniques of which can be used in and/or adapted for use with the present invention. Split CRISPR-Cas proteins are set forth in further detail herein and in documents incorporated herein by reference. In certain embodiments, each part of a split CRISPR protein is attached to a member of a specific binding pair, and when bound with each other, the members of the specific binding pair maintain the parts of the CRISPR protein in proximity. In certain embodiments, each part of a split CRISPR protein is associated with an inducible binding pair. An inducible binding pair is one which is capable of being switched “on” or “off” by a protein or small molecule that binds to both members of the inducible binding pair. In some embodiments, CRISPR proteins may preferably be split between domains, leaving domains intact. In particular embodiments, said Cas split domains (e.g., RuvC and HNH domains in the case of Cas9) can be simultaneously or sequentially introduced into the cell such that said split Cas domain(s) process the target nucleic acid sequence in the algae cell. The reduced size of the split Cas compared to the wild type Cas allows other methods of delivery of the systems to the cells, such as the use of cell penetrating peptides as described herein.
In some embodiments, a polynucleotide of the present invention described elsewhere herein can be modified using a base editing system. In some embodiments, a Cas protein is connected or fused to a nucleotide deaminase. Thus, in some embodiments the Cas-based system can be a base editing system. As used herein “base editing” refers generally to the process of polynucleotide modification via a CRISPR-Cas-based or Cas-based system that does not include excising nucleotides to make the modification. Base editing can convert base pairs at precise locations without generating excess undesired editing byproducts that can be made using traditional CRISPR-Cas systems.
In certain example embodiments, the nucleotide deaminase may be a DNA base editor used in combination with a DNA binding Cas protein such as, but not limited to, Class 2 Type II and Type V systems. Two classes of DNA base editors are generally known: cytosine base editors (CBEs) and adenine base editors (ABEs). CBEs convert a C•G base pair into a T•A base pair (Komor et al. 2016. Nature. 533:420-424; Nishida et al. 2016. Science. 353; and Li et al. Nat. Biotech. 36:324-327) and ABEs convert an A•T base pair to a G•C base pair. Collectively, CBEs and ABEs can mediate all four possible transition mutations (C to T, A to G, T to C, and G to A). Rees and Liu. 2018. Nat. Rev. Genet. 19(12): 770-788, particularly at FIGS. 1b, 2a-2c, 3a-3f, and Table 1. In some embodiments, the base editing system includes a CBE and/or an ABE. In some embodiments, a polynucleotide of the present invention described elsewhere herein can be modified using a base editing system. Rees and Liu. 2018. Nat. Rev. Genet. 19(12):770-788. Base editors also generally do not need a DNA donor template and/or do not rely on homology-directed repair. Komor et al. 2016. Nature. 533:420-424; Nishida et al. 2016. Science. 353; and Gaudelli et al. 2017. Nature. 551:464-471. Upon binding to a target locus in the DNA, base pairing between the guide RNA of the system and the target DNA strand leads to displacement of a small segment of ssDNA in an “R-loop”. Nishimasu et al. Cell. 156:935-949. DNA bases within the ssDNA bubble are modified by the enzyme component, such as a deaminase. In some systems, the catalytically disabled Cas protein can be a variant or modified Cas that has nickase functionality and can generate a nick in the non-edited DNA strand to induce cells to repair the non-edited strand using the edited strand as a template. Komor et al. 2016. Nature. 533:420-424; Nishida et al. 2016. Science. 353; and Gaudelli et al. 2017. Nature. 551:464-471.
Other example Type V base editing systems are described in International Patent Publication Nos. WO 2018/213708 and WO 2018/213726, and International Patent Application Nos. PCT/US2018/067207, PCT/US2018/067225, and PCT/US2018/067307, which are incorporated herein by reference.
In certain example embodiments, the base editing system may be an RNA base editing system. As with DNA base editors, a nucleotide deaminase capable of converting nucleotide bases may be fused to a Cas protein. However, in these embodiments, the Cas protein will need to be capable of binding RNA. Example RNA binding Cas proteins include, but are not limited to, RNA-binding Cas9s such as Francisella novicida Cas9 (“FnCas9”), and Class 2 Type VI Cas systems. The nucleotide deaminase may be a cytidine deaminase or an adenosine deaminase, or an adenosine deaminase engineered to have cytidine deaminase activity. In certain example embodiments, the RNA base editor may be used to delete or introduce a post-translational modification site in the expressed mRNA. In contrast to DNA base editors, whose edits are permanent in the modified cell, RNA base editors can provide edits where finer temporal control may be needed, for example in modulating a particular immune response. Example Type VI RNA-base editing systems are described in Cox et al. 2017. Science 358: 1019-1027, International Patent Publication Nos. WO 2019/005884, WO 2019/005886, and WO 2019/071048, and International Patent Application Nos. PCT/US20018/05179 and PCT/US2018/067207, which are incorporated herein by reference. An example FnCas9 system that may be adapted for RNA base editing purposes is described in International Patent Publication No. WO 2016/106236, which is incorporated herein by reference.
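By way of illustration only, the transition chemistry described above can be sketched in Python. This is an idealized model, not a description of any particular editor: the function name `base_edit`, the editor labels, and the default editing window of protospacer positions 4 to 8 are assumptions chosen for the example, and real editors differ in window, efficiency, and sequence context preferences.

```python
def base_edit(protospacer: str, editor: str, window: tuple = (4, 8)) -> str:
    """Apply an idealized base edit to a protospacer written 5'->3'.

    editor: "CBE" converts C to T, "ABE" converts A to G.
    window: 1-based inclusive positions assumed editable (hypothetical default).
    """
    conversions = {"CBE": ("C", "T"), "ABE": ("A", "G")}
    if editor not in conversions:
        raise ValueError("editor must be 'CBE' or 'ABE'")
    source, product = conversions[editor]
    start, end = window
    edited = []
    for position, base in enumerate(protospacer.upper(), start=1):
        # Only the substrate base inside the assumed window is converted.
        if start <= position <= end and base == source:
            edited.append(product)
        else:
            edited.append(base)
    return "".join(edited)


if __name__ == "__main__":
    site = "GACCAGTCAGTTGACCAGTA"
    print(base_edit(site, "CBE"))  # C at positions 4 and 8 becomes T
    print(base_edit(site, "ABE"))  # A at position 5 becomes G
```

Because each edit on one strand is read as the complementary change on the other strand (C to T is read as G to A, and A to G is read as T to C), the two editor classes together cover the four transitions listed above.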
An example method for delivery of base-editing systems, including use of a split-intein approach to divide CBE and ABE into reconstitutable halves, is described in Levy et al. Nature Biomedical Engineering doi.org/10.1038/s41441-019-0505-5 (2019), which is incorporated herein by reference.
In some embodiments, a polynucleotide of the present invention described elsewhere herein can be modified using a prime editing system. See e.g. Anzalone et al. 2019. Nature. 576: 149-157. Like base editing systems, prime editing systems can be capable of targeted modification of a polynucleotide without generating double stranded breaks and do not require donor templates. Further, prime editing systems can be capable of all 12 possible combination swaps. Prime editing can operate via a “search-and-replace” methodology and can mediate targeted insertions, deletions, all 12 possible base-to-base conversions, and combinations thereof. Generally, a prime editing system, as exemplified by PE1, PE2, and PE3 (Id.), can include a reverse transcriptase fused or otherwise coupled or associated with an RNA-programmable nickase, and a prime-editing extended guide RNA (pegRNA) to facilitate direct copying of genetic information from the extension on the pegRNA into the target polynucleotide. Embodiments that can be used with the present invention include these and variants thereof. Prime editing can have the advantage of lower off-target activity than traditional CRISPR-Cas systems along with few byproducts and greater or similar efficiency as compared to traditional CRISPR-Cas systems.
In some embodiments, the prime editing guide molecule can specify both the target polynucleotide information (e.g. sequence) and contain a new polynucleotide cargo that replaces target polynucleotides. To initiate transfer from the guide molecule to the target polynucleotide, the PE system can nick the target polynucleotide at a target site to expose a 3′ hydroxyl group, which can prime reverse transcription of an edit-encoding extension region of the guide molecule (e.g. a prime editing guide molecule or peg guide molecule) directly into the target site in the target polynucleotide. See e.g. Anzalone et al. 2019. Nature. 576: 149-157, particularly at FIGS. 1b, 1c, related discussion, and Supplementary discussion.
In some embodiments, a prime editing system can be composed of a Cas polypeptide having nickase activity, a reverse transcriptase, and a guide molecule. The Cas polypeptide can lack nuclease activity. The guide molecule can include a target binding sequence as well as a primer binding sequence and a template containing the edited polynucleotide sequence. The guide molecule, Cas polypeptide, and/or reverse transcriptase can be coupled together or otherwise associate with each other to form an effector complex and edit a target sequence. In some embodiments, the Cas polypeptide is a Class 2, Type V Cas polypeptide. In some embodiments, the Cas polypeptide is a Cas9 polypeptide (e.g. is a Cas9 nickase). In some embodiments, the Cas polypeptide is fused to the reverse transcriptase. In some embodiments, the Cas polypeptide is linked to the reverse transcriptase.
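As a non-limiting sketch of how the primer binding sequence and the edit-containing template of such a guide molecule relate to the nicked strand, the following Python example builds a 3′ extension for a peg guide molecule. It assumes a layout in which the template lies 5′ of the primer binding sequence, that the primer binding sequence is complementary to the DNA immediately 5′ of the nick, and that the template is complementary to the desired edited sequence 3′ of the nick. The function names and the default primer binding length of 13 nucleotides are hypothetical and chosen only for illustration.

```python
DNA_TO_RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def reverse_complement_rna(dna: str) -> str:
    """Return the RNA reverse complement (5'->3') of a DNA sequence (5'->3')."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def peg_extension(upstream_of_nick: str, edited_downstream: str, pbs_len: int = 13) -> str:
    """Build an illustrative 3' extension: template followed by primer binding sequence.

    upstream_of_nick:  nicked-strand DNA ending at the exposed 3' hydroxyl (5'->3').
    edited_downstream: desired edited DNA to be written 3' of the nick (5'->3').
    pbs_len:           assumed primer binding sequence length (hypothetical default).
    """
    if not 0 < pbs_len <= len(upstream_of_nick):
        raise ValueError("pbs_len must be between 1 and the upstream length")
    # The primer binding sequence anneals to the 3' end of the nicked strand.
    primer_binding = reverse_complement_rna(upstream_of_nick[-pbs_len:])
    # The template is copied by the reverse transcriptase into the edited DNA.
    template = reverse_complement_rna(edited_downstream)
    return template + primer_binding


if __name__ == "__main__":
    print(peg_extension("ACGTACGTACGTACGTA", "TTGCA", pbs_len=5))
```

Under these assumptions the extension is simply the RNA reverse complement of the primer-annealing DNA joined to the edited DNA, which gives a convenient consistency check on any designed sequence.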
In some embodiments, the prime editing system can be a PE1 system or variant thereof, a PE2 system or variant thereof, or a PE3 (e.g. PE3, PE3b) system. See e.g., Anzalone et al. 2019. Nature. 576: 149-157, particularly at pgs. 2-3,
The peg guide molecule can be about 10 to about 200 or more nucleotides in length, such as 10 to/or 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, or 200 or more nucleotides in length. Optimization of the peg guide molecule can be accomplished as described in Anzalone et al. 2019. Nature. 576: 149-157, particularly at pg. 3, FIG. 2a-2b, and Extended Data FIGS. 5a-c.
In some embodiments, a polynucleotide of the present invention described elsewhere herein can be modified using a CRISPR Associated Transposase (“CAST”) system. CAST systems can include a Cas protein that is catalytically inactive, or engineered to be catalytically active, and can further comprise a transposase (or subunits thereof) that catalyzes RNA-guided DNA transposition. Such systems are able to insert DNA sequences at a target site in a DNA molecule without relying on host cell repair machinery. CAST systems can be Class 1 or Class 2 CAST systems. An example Class 1 system is described in Klompe et al. Nature, doi:10.1038/s41586-019-1323, which is incorporated herein by reference. An example Class 2 system is described in Strecker et al. Science. doi:10.1126/science.aax9181 (2019), and PCT/US2019/066835, which are incorporated herein by reference.
The CRISPR-Cas or Cas-Based system described herein can, in some embodiments, include one or more guide molecules. The terms guide molecule, guide sequence and guide polynucleotide, refer to polynucleotides capable of guiding Cas to a target genomic locus and are used interchangeably as in foregoing cited documents such as International Patent Publication No. WO 2014/093622 (PCT/US2013/074667). In general, a guide sequence is any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of a CRISPR complex to the target sequence. The guide molecule can be a polynucleotide.
The ability of a guide sequence (within a nucleic acid-targeting guide RNA) to direct sequence-specific binding of a nucleic acid-targeting complex to a target nucleic acid sequence may be assessed by any suitable assay. For example, the components of a nucleic acid-targeting CRISPR system sufficient to form a nucleic acid-targeting complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target nucleic acid sequence, such as by transfection with vectors encoding the components of the nucleic acid-targeting complex, followed by an assessment of preferential targeting (e.g., cleavage) within the target nucleic acid sequence, such as by Surveyor assay (Qiu et al. 2004. BioTechniques. 36(4):702-707). Similarly, cleavage of a target nucleic acid sequence may be evaluated in a test tube by providing the target nucleic acid sequence, components of a nucleic acid-targeting complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions. Other assays are possible and will occur to those skilled in the art.
In some embodiments, the guide molecule is an RNA. The guide molecule(s) (also referred to interchangeably herein as guide polynucleotide and guide sequence) that are included in the CRISPR-Cas or Cas based system can be any polynucleotide sequence having sufficient complementarity with a target nucleic acid sequence to hybridize with the target nucleic acid sequence and direct sequence-specific binding of a nucleic acid-targeting complex to the target nucleic acid sequence. In some embodiments, the degree of complementarity, when optimally aligned using a suitable alignment algorithm, can be about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies; available at www.novocraft.com), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net).
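For illustration, the degree of complementarity between a guide sequence and a target sequence of equal length can be estimated as in the following Python sketch. It is deliberately simpler than the alignment algorithms named above: it assumes an ungapped, antiparallel pairing, counts only Watson-Crick pairs (no G•U wobble), and the function name `percent_complementarity` is hypothetical.

```python
# Allowed (guide RNA base, target base) pairs; T is accepted for DNA targets.
WATSON_CRICK = {("A", "U"), ("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(guide: str, target: str) -> float:
    """Percent of guide positions that pair with the target.

    Both sequences are written 5'->3'. The guide is RNA; the target may be
    RNA or DNA. Pairing is antiparallel, so the guide is compared against
    the target read in reverse. No gaps are considered in this sketch.
    """
    guide, target = guide.upper(), target.upper()
    if not guide or len(guide) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum((g, t) in WATSON_CRICK for g, t in zip(guide, reversed(target)))
    return 100.0 * paired / len(guide)


if __name__ == "__main__":
    print(percent_complementarity("GACU", "AGUC"))  # fully complementary RNA target
    print(percent_complementarity("GACU", "AGTG"))  # DNA target with one mismatch
```

A gapped local alignment such as Smith-Waterman would be needed where insertions or deletions between guide and target are to be tolerated.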
A guide sequence, and hence a nucleic acid-targeting guide may be selected to target any target nucleic acid sequence. The target sequence may be DNA. The target sequence may be any RNA sequence. In some embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of messenger RNA (mRNA), pre-mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), micro-RNA (miRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), double stranded RNA (dsRNA), non-coding RNA (ncRNA), long non-coding RNA (lncRNA), and small cytoplasmatic RNA (scRNA). In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of mRNA, pre-mRNA, and rRNA. In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of ncRNA, and lncRNA. In some more preferred embodiments, the target sequence may be a sequence within an mRNA molecule or a pre-mRNA molecule.
In some embodiments, a nucleic acid-targeting guide is selected to reduce the degree of secondary structure within the nucleic acid-targeting guide. In some embodiments, about or less than about 75%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, or fewer of the nucleotides of the nucleic acid-targeting guide participate in self-complementary base pairing when optimally folded. Optimal folding may be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimal Gibbs free energy. An example of one such algorithm is mFold, as described by Zuker and Stiegler (Nucleic Acids Res. 9 (1981), 133-148). Another example folding algorithm is the online webserver RNAfold, developed at the Institute for Theoretical Chemistry at the University of Vienna, using the centroid structure prediction algorithm (see e.g., A. R. Gruber et al., 2008, Cell 106(1): 23-24; and PA Carr and GM Church, 2009, Nature Biotechnology 27(12): 1151-62).
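As a rough, non-limiting illustration of the fraction of nucleotides that participate in self-complementary base pairing, the following Python sketch uses a simple maximum base-pairing recursion (Nussinov-style dynamic programming). This is a stand-in and not mFold or RNAfold: it maximizes the number of pairs rather than minimizing Gibbs free energy, it allows G•U wobble pairs, and the minimum hairpin loop of 3 nucleotides and the function names are assumptions made for the example.

```python
PAIRABLE = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(sequence: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs, with at least min_loop unpaired
    nucleotides enclosed by any hairpin (an assumed, simplified constraint)."""
    seq = sequence.upper().replace("T", "U")
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: nucleotide j is unpaired
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRABLE:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


def paired_fraction(sequence: str, min_loop: int = 3) -> float:
    """Fraction of nucleotides engaged in pairing in the maximum-pairing fold."""
    if not sequence:
        return 0.0
    return 2 * max_base_pairs(sequence, min_loop) / len(sequence)


if __name__ == "__main__":
    print(paired_fraction("GGGAAAACCC"))  # a short hairpin-forming sequence
    print(paired_fraction("AAAAAAAA"))    # no pairing possible
```

Because it ignores stacking energies and loop penalties, this recursion tends to overestimate pairing, so it is better suited to ranking candidate guides than to predicting an actual structure.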
In certain embodiments, a guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat (DR) sequence and a guide sequence or spacer sequence. In certain embodiments, the guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat sequence fused or linked to a guide sequence or spacer sequence. In certain embodiments, the direct repeat sequence may be located upstream (i.e., 5′) from the guide sequence or spacer sequence. In other embodiments, the direct repeat sequence may be located downstream (i.e., 3′) from the guide sequence or spacer sequence.
In certain embodiments, the crRNA comprises a stem loop, preferably a single stem loop. In certain embodiments, the direct repeat sequence forms a stem loop, preferably a single stem loop.
In certain embodiments, the spacer length of the guide RNA is from 15 to 35 nt. In certain embodiments, the spacer length of the guide RNA is at least 15 nucleotides. In certain embodiments, the spacer length is from 15 to 17 nt, e.g., 15, 16, or 17 nt, from 17 to 20 nt, e.g., 17, 18, 19, or 20 nt, from 20 to 24 nt, e.g., 20, 21, 22, 23, or 24 nt, from 23 to 25 nt, e.g., 23, 24, or 25 nt, from 24 to 27 nt, e.g., 24, 25, 26, or 27 nt, from 27 to 30 nt, e.g., 27, 28, 29, or 30 nt, from 30 to 35 nt, e.g., 30, 31, 32, 33, 34, or 35 nt, or 35 nt or longer.
The “tracrRNA” sequence or analogous terms includes any polynucleotide sequence that has sufficient complementarity with a crRNA sequence to hybridize. In some embodiments, the degree of complementarity between the tracrRNA sequence and crRNA sequence along the length of the shorter of the two when optimally aligned is about or more than about 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97.5%, 99%, or higher. In some embodiments, the tracr sequence is about or more than about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, or more nucleotides in length. In some embodiments, the tracr sequence and crRNA sequence are contained within a single transcript, such that hybridization between the two produces a transcript having a secondary structure, such as a hairpin.
In general, degree of complementarity is with reference to the optimal alignment of the sca sequence and tracr sequence, along the length of the shorter of the two sequences. Optimal alignment may be determined by any suitable alignment algorithm, and may further account for secondary structures, such as self-complementarity within either the sca sequence or tracr sequence. In some embodiments, the degree of complementarity between the tracr sequence and sca sequence along the length of the shorter of the two when optimally aligned is about or more than about 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97.5%, 99%, or higher.
In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence can be about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or 100%; a guide or RNA or sgRNA can be about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length; or guide or RNA or sgRNA can be less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length; and tracr RNA can be 30 or 50 nucleotides in length. In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence is greater than 94.5% or 95% or 95.5% or 96% or 96.5% or 97% or 97.5% or 98% or 98.5% or 99% or 99.5% or 99.9%, or 100%. Off target is less than 100% or 99.9% or 99.5% or 99% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% or 94% or 93% or 92% or 91% or 90% or 89% or 88% or 87% or 86% or 85% or 84% or 83% or 82% or 81% or 80% complementarity between the sequence and the guide, with it advantageous that off target is 100% or 99.9% or 99.5% or 99% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% complementarity between the sequence and the guide.
In some embodiments according to the invention, the guide RNA (capable of guiding Cas to a target locus) may comprise (1) a guide sequence capable of hybridizing to a genomic target locus in the eukaryotic cell; (2) a tracr sequence; and (3) a tracr mate sequence. All (1) to (3) may reside in a single RNA, i.e., an sgRNA (arranged in a 5′ to 3′ orientation), or the tracr RNA may be a different RNA than the RNA containing the guide and tracr sequence. The tracr hybridizes to the tracr mate sequence and directs the CRISPR/Cas complex to the target sequence. Where the tracr RNA is on a different RNA than the RNA containing the guide and tracr sequence, the length of each RNA may be optimized to be shortened from their respective native lengths, and each may be independently chemically modified to protect from degradation by cellular RNase or otherwise increase stability.
Many modifications to guide sequences are known in the art and are further contemplated within the context of this invention. Various modifications may be used to increase the specificity of binding to the target sequence and/or increase the activity of the Cas protein and/or reduce off-target effects. Example guide sequence modifications are described in International Patent Application No. PCT/US2019/045582, specifically paragraphs [0178]-[0333], which is incorporated herein by reference.
In the context of formation of a CRISPR complex, “target sequence” refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a CRISPR complex. A target sequence may comprise RNA polynucleotides. The term “target RNA” refers to an RNA polynucleotide being or comprising the target sequence. In other words, the target polynucleotide can be a polynucleotide or a part of a polynucleotide to which a part of the guide sequence is designed to have complementarity with and to which the effector function mediated by the complex comprising the CRISPR effector protein and a guide molecule is to be directed. In some embodiments, a target sequence is located in the nucleus or cytoplasm of a cell.
The guide sequence can specifically bind a target sequence in a target polynucleotide. The target polynucleotide may be DNA. The target polynucleotide may be RNA. The target polynucleotide can have one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc. or more) target sequences. The target polynucleotide can be on a vector. The target polynucleotide can be genomic DNA. The target polynucleotide can be episomal. Other forms of the target polynucleotide are described elsewhere herein.
The target sequence may be DNA. The target sequence may be any RNA sequence. In some embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of messenger RNA (mRNA), pre-mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), micro-RNA (miRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), double stranded RNA (dsRNA), non-coding RNA (ncRNA), long non-coding RNA (lncRNA), and small cytoplasmatic RNA (scRNA). In some preferred embodiments, the target sequence (also referred to herein as a target polynucleotide) may be a sequence within an RNA molecule selected from the group consisting of mRNA, pre-mRNA, and rRNA. In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of ncRNA, and lncRNA. In some more preferred embodiments, the target sequence may be a sequence within an mRNA molecule or a pre-mRNA molecule.
PAM elements are sequences that can be recognized and bound by Cas proteins. Cas proteins/effector complexes can then unwind the dsDNA at a position adjacent to the PAM element. It will be appreciated that Cas proteins and systems that include them that target RNA do not require PAM sequences (Marraffini et al. 2010. Nature. 463:568-571). Instead, many rely on PFSs, which are discussed elsewhere herein. In certain embodiments, the target sequence should be associated with a PAM (protospacer adjacent motif) or PFS (protospacer flanking sequence or site), that is, a short sequence recognized by the CRISPR complex. Depending on the nature of the CRISPR-Cas protein, the target sequence should be selected, such that its complementary sequence in the DNA duplex (also referred to herein as the non-target sequence) is upstream or downstream of the PAM. In the embodiments, the complementary sequence of the target sequence is downstream or 3′ of the PAM or upstream or 5′ of the PAM. The precise sequence and length requirements for the PAM differ depending on the Cas protein used, but PAMs are typically 2-5 base pair sequences adjacent the protospacer (that is, the target sequence). Examples of the natural PAM sequences for different Cas proteins are provided herein below and the skilled person will be able to identify further PAM sequences for use with a given Cas protein.
The ability to recognize different PAM sequences depends on the Cas polypeptide(s) included in the system. See e.g., Gleditzsch et al. 2019. RNA Biology. 16(4):504-517. Table 3 (from Gleditzsch et al. 2019) below shows several Cas polypeptides and the PAM sequence they recognize.
In a preferred embodiment, the CRISPR effector protein may recognize a 3′ PAM. In certain embodiments, the CRISPR effector protein may recognize a 3′ PAM which is 5′H, wherein H is A, C or U.
Further, engineering of the PAM Interacting (PI) domain on the Cas protein may allow programing of PAM specificity, improve target site recognition fidelity, and increase the versatility of the CRISPR-Cas protein, for example as described for Cas9 in Kleinstiver B P et al. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature. 2015 Jul. 23; 523(7561):481-5. doi: 10.1038/nature14592. As further detailed herein, the skilled person will understand that Cas13 proteins may be modified analogously. Gao et al, “Engineered Cpf1 Enzymes with Altered PAM Specificities,” bioRxiv 091611; doi: http://dx.doi.org/10.1101/091611 (Dec. 4, 2016). Doench et al. created a pool of sgRNAs, tiling across all possible target sites of a panel of six endogenous mouse and three endogenous human genes and quantitatively assessed their ability to produce null alleles of their target gene by antibody staining and flow cytometry. The authors showed that optimization of the PAM improved activity and also provided an on-line tool for designing sgRNAs.
PAM sequences can be identified in a polynucleotide using an appropriate design tool; such tools are available commercially as well as freely online. Freely available tools include, but are not limited to, CRISPRFinder and CRISPRTarget. See, e.g., Mojica et al. 2009. Microbiol. 155 (Pt. 3):733-740; Altschul et al. 1990. J. Mol. Biol. 215:403-410; Biswas et al. 2013. RNA Biol. 10:817-827; and Grissa et al. 2007. Nucleic Acids Res. 35:W52-57. Experimental approaches to PAM identification can include, but are not limited to, plasmid depletion assays (Jiang et al. 2013. Nat. Biotechnol. 31:233-239; Esvelt et al. 2013. Nat. Methods. 10:1116-1121; Kleinstiver et al. 2015. Nature. 523:481-485), screening by a high-throughput in vivo model called PAM-SCANR (Pattanayak et al. 2013. Nat. Biotechnol. 31:839-843 and Leenay et al. 2016. Mol. Cell. 16:253), and negative screening (Zetsche et al. 2015. Cell. 163:759-771).
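As a rough illustration of how a design tool might locate candidate target sites next to a PAM, the following Python sketch scans one strand of a DNA sequence for a PAM written in IUPAC codes and reports the adjacent protospacer. The function name, the default NGG motif (the commonly cited SpCas9 PAM, used here only as a placeholder), and the 20-nucleotide protospacer length are assumptions made for illustration and are not taken from this disclosure; the actual PAM and the side on which it lies depend on the Cas protein used.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_pam_sites(seq, pam="NGG", protospacer_len=20, pam_side="3prime"):
    """Return (protospacer_start, protospacer, pam) tuples for one strand.

    pam_side="3prime": the PAM immediately follows the protospacer.
    pam_side="5prime": the PAM immediately precedes the protospacer.
    Only the strand given in `seq` is scanned (hypothetical helper).
    """
    if pam_side not in ("3prime", "5prime"):
        raise ValueError("pam_side must be '3prime' or '5prime'")
    seq = seq.upper()
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead is used so that overlapping PAM occurrences are all found.
    pattern = re.compile(f"(?=({pam_regex}))")
    hits = []
    for match in pattern.finditer(seq):
        pam_start = match.start()
        if pam_side == "3prime":
            start, end = pam_start - protospacer_len, pam_start
        else:
            start = pam_start + len(pam)
            end = start + protospacer_len
        # Skip PAMs too close to either end to have a full protospacer.
        if start < 0 or end > len(seq):
            continue
        hits.append((start, seq[start:end], match.group(1)))
    return hits
```

A fuller tool would also scan the reverse complement and score each candidate; this sketch only shows the motif search itself.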
As previously mentioned, CRISPR-Cas systems that target RNA do not typically rely on PAM sequences. Instead, such systems, including Type VI CRISPR-Cas systems, typically recognize protospacer flanking sites (PFSs). PFSs represent an analogue to PAMs for RNA targets. Type VI CRISPR-Cas systems employ a Cas13. Some Cas13 proteins analyzed to date, such as Cas13a (C2c2) identified from Leptotrichia shahii (LshCas13a), have a specific discrimination against G at the 3′ end of the target RNA. The presence of a C at the corresponding crRNA repeat site can indicate that nucleotide pairing at this position is rejected. However, some Cas13 proteins (e.g., LwaCas13a and PspCas13b) do not seem to have a PFS preference. See e.g., Gleditzsch et al. 2019. RNA Biology. 16(4):504-517.
Some Type VI proteins, such as subtype B, have 5′-recognition of D (G, T, A) and a 3′-motif requirement of NAN or NNA. One example is the Cas13b protein identified in Bergeyella zoohelcum (BzCas13b). See e.g., Gleditzsch et al. 2019. RNA Biology. 16(4):504-517.
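The PFS preferences described above can be expressed as simple checks on the nucleotides flanking a candidate protospacer in the target RNA. The Python sketch below is illustrative only: the function names are hypothetical, and the positions examined (the single base immediately 3′ of the protospacer for the Cas13a-type rule; the single base immediately 5′ and the three bases immediately 3′ for the subtype B rule) are simplifying assumptions about how the stated motifs are positioned.

```python
def _as_rna(seq):
    """Upper-case a sequence and write T as U so DNA or RNA input works."""
    return seq.upper().replace("T", "U")


def pfs_ok_non_g_3prime(target_rna, start, length):
    """Cas13a-type rule from the text: the 3' flanking base must not be G."""
    rna = _as_rna(target_rna)
    flank = start + length
    if flank >= len(rna):
        return False  # no 3' flanking base available to evaluate
    return rna[flank] != "G"


def pfs_ok_subtype_b(target_rna, start, length):
    """Subtype B rule from the text: 5' D (G, U/T or A) and 3' NAN or NNA."""
    rna = _as_rna(target_rna)
    if start < 1:
        return False  # no 5' flanking base available
    five_prime = rna[start - 1]
    three_prime = rna[start + length:start + length + 3]
    if len(three_prime) < 3:
        return False  # fewer than three 3' flanking bases available
    five_ok = five_prime in "GUA"  # D means "not C"
    three_ok = three_prime[1] == "A" or three_prime[2] == "A"
    return five_ok and three_ok
```

For Cas13 proteins reported to have no PFS preference, no such filter would be applied.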
Overall, Type VI CRISPR-Cas systems appear to have less restrictive rules for substrate (e.g., target sequence) recognition than those that target DNA (e.g., Type V and Type II).
In some embodiments, the polynucleotide is modified using a Zinc Finger nuclease or system thereof. One type of programmable DNA-binding domain is provided by artificial zinc-finger (ZF) technology, which involves arrays of ZF modules to target new DNA-binding sites in the genome. Each finger module in a ZF array targets three DNA bases. A customized array of individual zinc finger domains is assembled into a ZF protein (ZFP).
ZFPs can comprise a functional domain. The first synthetic zinc finger nucleases (ZFNs) were developed by fusing a ZF protein to the catalytic domain of the Type IIS restriction enzyme FokI. (Kim, Y. G. et al., 1994, Chimeric restriction endonuclease, Proc. Natl. Acad. Sci. U.S.A. 91, 883-887; Kim, Y. G. et al., 1996, Hybrid restriction enzymes: zinc finger fusions to Fok I cleavage domain. Proc. Natl. Acad. Sci. U.S.A. 93, 1156-1160). Increased cleavage specificity can be attained with decreased off target activity by use of paired ZFN heterodimers, each targeting different nucleotide sequences separated by a short spacer. (Doyon, Y. et al., 2011, Enhancing zinc-finger-nuclease activity with improved obligate heterodimeric architectures. Nat. Methods 8, 74-79). ZFPs can also be designed as transcription activators and repressors and have been used to target many genes in a wide variety of organisms. Exemplary methods of genome editing using ZFNs can be found for example in U.S. Pat. Nos. 6,534,261, 6,607,882, 6,746,838, 6,794,136, 6,824,978, 6,866,997, 6,933,113, 6,979,539, 7,013,219, 7,030,215, 7,220,719, 7,241,573, 7,241,574, 7,585,849, 7,595,376, 6,903,185, and 6,479,626, all of which are specifically incorporated herein by reference.
In some embodiments, one or more components (e.g., the Cas protein and/or deaminase) in the composition for engineering cells may comprise one or more sequences related to nucleus targeting and transportation. Such sequence may facilitate the one or more components in the composition for targeting a sequence within a cell. In order to improve targeting of the CRISPR-Cas protein and/or the nucleotide deaminase protein or catalytic domain thereof used in the methods of the present disclosure to the nucleus, it may be advantageous to provide one or both of these components with one or more nuclear localization sequences (NLSs).
In some embodiments, the NLSs used in the context of the present disclosure are heterologous to the proteins. Non-limiting examples of NLSs include an NLS sequence derived from: the NLS of the SV40 virus large T-antigen, having the amino acid sequence PKKKRKV (SEQ ID NO:1) or PKKKRKVEAS (SEQ ID NO:2); the NLS from nucleoplasmin (e.g., the nucleoplasmin bipartite NLS with the sequence KRPAATKKAGQAKKKK (SEQ ID NO:3)); the c-myc NLS having the amino acid sequence PAAKRVKLD (SEQ ID NO:4) or RQRRNELKRSP (SEQ ID NO:5); the hRNPA1 M9 NLS having the sequence NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY (SEQ ID NO:6); the sequence RMRIZFKNKGKDTAELRRRRVEVSVELRKAKKDEQILKRRNV (SEQ ID NO:7) of the IBB domain from importin-alpha; the sequences VSRKRPRP (SEQ ID NO:8) and PPKKARED (SEQ ID NO:9) of the myoma T protein; the sequence PQPKKKPL (SEQ ID NO:10) of human p53; the sequence SALIKKKKKMAP (SEQ ID NO:11) of mouse c-abl IV; the sequences DRLRR (SEQ ID NO:12) and PKQKKRK (SEQ ID NO:13) of the influenza virus NS1; the sequence RKLKKKIKKL (SEQ ID NO:14) of the Hepatitis virus delta antigen; the sequence REKKKFLKRR (SEQ ID NO:15) of the mouse Mx1 protein; the sequence KRKGDEVDGVDEVAKKKSKK (SEQ ID NO:16) of the human poly(ADP-ribose) polymerase; and the sequence RKCLQAGMNLEARKTKK (SEQ ID NO:17) of the steroid hormone receptors (human) glucocorticoid. In general, the one or more NLSs are of sufficient strength to drive accumulation of the DNA-targeting Cas protein in a detectable amount in the nucleus of a eukaryotic cell. In general, strength of nuclear localization activity may derive from the number of NLSs in the CRISPR-Cas protein, the particular NLS(s) used, or a combination of these factors. Detection of accumulation in the nucleus may be performed by any suitable technique. 
For example, a detectable marker may be fused to the nucleic acid-targeting protein, such that location within a cell may be visualized, such as in combination with a means for detecting the location of the nucleus (e.g., a stain specific for the nucleus such as DAPI). Cell nuclei may also be isolated from cells, the contents of which may then be analyzed by any suitable process for detecting protein, such as immunohistochemistry, Western blot, or enzyme activity assay. Accumulation in the nucleus may also be determined indirectly, such as by an assay for the effect of nucleic acid-targeting complex formation (e.g., assay for deaminase activity) at the target sequence, or assay for altered gene expression activity affected by DNA-targeting complex formation and/or DNA-targeting, as compared to a control not exposed to the CRISPR-Cas protein and deaminase protein, or exposed to a CRISPR-Cas and/or deaminase protein lacking the one or more NLSs.
The CRISPR-Cas and/or nucleotide deaminase proteins may be provided with 1 or more, such as 2, 3, 4, 5, 6, 7, 8, 9, 10, or more, heterologous NLSs. In some embodiments, the proteins comprise about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the amino-terminus, about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the carboxy-terminus, or a combination of these (e.g., zero or at least one or more NLS at the amino-terminus and zero or at least one or more NLS at the carboxy-terminus). When more than one NLS is present, each may be selected independently of the others, such that a single NLS may be present in more than one copy and/or in combination with one or more other NLSs present in one or more copies. In some embodiments, an NLS is considered near the N- or C-terminus when the nearest amino acid of the NLS is within about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, or more amino acids along the polypeptide chain from the N- or C-terminus. In preferred embodiments of the CRISPR-Cas proteins, an NLS is attached to the C-terminus of the protein.
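The notion of an NLS being "near" a terminus can be made concrete with a short calculation. The following Python sketch, given as an illustration under stated assumptions, locates every copy of an NLS in a protein sequence and reports the number of residues separating its nearest amino acid from each terminus. The function name, the dictionary layout, and the default window of 50 residues are hypothetical choices; the distance convention (residues lying between the terminus and the NLS) is one possible reading of the definition above.

```python
def nls_positions(protein, nls, window=50):
    """List each occurrence of `nls` in `protein` with terminus distances.

    dist_n: residues between the N-terminus and the first NLS residue.
    dist_c: residues between the last NLS residue and the C-terminus.
    An occurrence is flagged as near a terminus when the corresponding
    distance does not exceed `window` (an assumed, adjustable cutoff).
    """
    results = []
    pos = protein.find(nls)
    while pos != -1:
        dist_n = pos
        dist_c = len(protein) - (pos + len(nls))
        results.append({
            "start": pos,
            "dist_n": dist_n,
            "dist_c": dist_c,
            "near_n": dist_n <= window,
            "near_c": dist_c <= window,
        })
        # Continue the search one residue further on to catch later copies.
        pos = protein.find(nls, pos + 1)
    return results
```

For instance, calling the function with the SV40 large T-antigen NLS PKKKRKV (SEQ ID NO:1) on a construct carrying one copy at each end would return two entries, one flagged near the N-terminus and one flagged near the C-terminus.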
In certain embodiments, the CRISPR-Cas protein and the deaminase protein are delivered to the cell or expressed within the cell as separate proteins. In these embodiments, each of the CRISPR-Cas and deaminase protein can be provided with one or more NLSs as described herein. In certain embodiments, the CRISPR-Cas and deaminase proteins are delivered to the cell or expressed within the cell as a fusion protein. In these embodiments one or both of the CRISPR-Cas and deaminase protein is provided with one or more NLSs. Where the nucleotide deaminase is fused to an adaptor protein (such as MS2) as described above, the one or more NLS can be provided on the adaptor protein, provided that this does not interfere with aptamer binding. In particular embodiments, the one or more NLS sequences may also function as linker sequences between the nucleotide deaminase and the CRISPR-Cas protein.
In certain embodiments, guides of the disclosure comprise specific binding sites (e.g. aptamers) for adapter proteins, which may be linked to or fused to a nucleotide deaminase or catalytic domain thereof. When such a guide forms a CRISPR complex (e.g., CRISPR-Cas protein binding to guide and target), the adapter proteins bind, and the nucleotide deaminase or catalytic domain thereof associated with the adapter protein is positioned in a spatial orientation which is advantageous for the attributed function to be effective.
The skilled person will understand that modifications to the guide which allow for binding of the adapter+nucleotide deaminase, but not proper positioning of the adapter+nucleotide deaminase (e.g. due to steric hindrance within the three-dimensional structure of the CRISPR complex) are modifications which are not intended. The one or more modified guide may be modified at the tetra loop, the stem loop 1, stem loop 2, or stem loop 3, as described herein, preferably at either the tetra loop or stem loop 2, and in some cases at both the tetra loop and stem loop 2.
In some embodiments, a component (e.g., the dead Cas protein, the nucleotide deaminase protein or catalytic domain thereof, or a combination thereof) in the systems may comprise one or more nuclear export signals (NES), one or more nuclear localization signals (NLS), or any combinations thereof. In some cases, the NES may be an HIV Rev NES. In certain cases, the NES may be a MAPK NES. When the component is a protein, the NES or NLS may be at the C-terminus of the component. Alternatively or additionally, the NES or NLS may be at the N-terminus of the component. In some examples, the Cas protein and optionally said nucleotide deaminase protein or catalytic domain thereof comprise one or more heterologous nuclear export signal(s) (NES(s)) or nuclear localization signal(s) (NLS(s)), preferably an HIV Rev NES or MAPK NES, preferably C-terminal.
In some embodiments, the composition for engineering cells comprises a template, e.g., a recombination template. A template may be a component of another vector as described herein, contained in a separate vector, or provided as a separate polynucleotide. In some embodiments, a recombination template is designed to serve as a template in homologous recombination, such as within or near a target sequence nicked or cleaved by a nucleic acid-targeting effector protein as a part of a nucleic acid-targeting complex.
In an embodiment, the template nucleic acid alters the sequence of the target position. In an embodiment, the template nucleic acid results in the incorporation of a modified, or non-naturally occurring base into the target nucleic acid.
The template sequence may undergo a breakage mediated or catalyzed recombination with the target sequence. In an embodiment, the template nucleic acid may include a sequence that corresponds to a site on the target sequence that is cleaved by a Cas protein mediated cleavage event. In an embodiment, the template nucleic acid may include a sequence that corresponds to both, a first site on the target sequence that is cleaved in a first Cas protein mediated event, and a second site on the target sequence that is cleaved in a second Cas protein mediated event.
In certain embodiments, the template nucleic acid can include a sequence which results in an alteration in the coding sequence of a translated sequence, e.g., one which results in the substitution of one amino acid for another in a protein product, e.g., transforming a mutant allele into a wild type allele, transforming a wild type allele into a mutant allele, and/or introducing a stop codon, insertion of an amino acid residue, deletion of an amino acid residue, or a nonsense mutation. In certain embodiments, the template nucleic acid can include a sequence which results in an alteration in a non-coding sequence, e.g., an alteration in an exon or in a 5′ or 3′ non-translated or non-transcribed region. Such alterations include an alteration in a control element, e.g., a promoter, enhancer, and an alteration in a cis-acting or trans-acting control element.
A template nucleic acid having homology with a target position in a target gene may be used to alter the structure of a target sequence. The template sequence may be used to alter an unwanted structure, e.g., an unwanted or mutant nucleotide. The template nucleic acid may include a sequence which, when integrated, results in decreasing the activity of a positive control element; increasing the activity of a positive control element; decreasing the activity of a negative control element; increasing the activity of a negative control element; decreasing the expression of a gene; increasing the expression of a gene; increasing resistance to a disorder or disease; increasing resistance to viral entry; correcting a mutation or altering an unwanted amino acid residue conferring, increasing, abolishing or decreasing a biological property of a gene product, e.g., increasing the enzymatic activity of an enzyme, or increasing the ability of a gene product to interact with another molecule.
The template nucleic acid may include sequence which results in a change in sequence of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more nucleotides of the target sequence.
A template polynucleotide may be of any suitable length, such as about or more than about 10, 15, 20, 25, 50, 75, 100, 150, 200, 500, 1000, or more nucleotides in length. In an embodiment, the template nucleic acid may be 20+/−10, 30+/−10, 40+/−10, 50+/−10, 60+/−10, 70+/−10, 80+/−10, 90+/−10, 100+/−10, 110+/−10, 120+/−10, 130+/−10, 140+/−10, 150+/−10, 160+/−10, 170+/−10, 180+/−10, 190+/−10, 200+/−10, 210+/−10, or 220+/−10 nucleotides in length. In an embodiment, the template nucleic acid may be 30+/−20, 40+/−20, 50+/−20, 60+/−20, 70+/−20, 80+/−20, 90+/−20, 100+/−20, 110+/−20, 120+/−20, 130+/−20, 140+/−20, 150+/−20, 160+/−20, 170+/−20, 180+/−20, 190+/−20, 200+/−20, 210+/−20, or 220+/−20 nucleotides in length. In an embodiment, the template nucleic acid is 10 to 1,000, 20 to 900, 30 to 800, 40 to 700, 50 to 600, 50 to 500, 50 to 400, 50 to 300, 50 to 200, or 50 to 100 nucleotides in length.
In some embodiments, the template polynucleotide is complementary to a portion of a polynucleotide comprising the target sequence. When optimally aligned, a template polynucleotide might overlap with one or more nucleotides of a target sequence (e.g., about or more than about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 or more nucleotides). In some embodiments, when a template sequence and a polynucleotide comprising a target sequence are optimally aligned, the nearest nucleotide of the template polynucleotide is within about 1, 5, 10, 15, 20, 25, 50, 75, 100, 200, 300, 400, 500, 1000, 5000, 10000, or more nucleotides from the target sequence.
The exogenous polynucleotide template comprises a sequence to be integrated (e.g., a mutated gene). The sequence for integration may be a sequence endogenous or exogenous to the cell. Examples of a sequence to be integrated include polynucleotides encoding a protein or a non-coding RNA (e.g., a microRNA). Thus, the sequence for integration may be operably linked to an appropriate control sequence or sequences. Alternatively, the sequence to be integrated may provide a regulatory function.
An upstream or downstream sequence may comprise from about 20 bp to about 2500 bp, for example, about 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, or 2500 bp. In some methods, the exemplary upstream or downstream sequences have about 200 bp to about 2000 bp, about 600 bp to about 1000 bp, or more particularly about 700 bp to about 1000 bp.
In certain embodiments, one or both homology arms may be shortened to avoid including certain sequence repeat elements. For example, a 5′ homology arm may be shortened to avoid a sequence repeat element. In other embodiments, a 3′ homology arm may be shortened to avoid a sequence repeat element. In some embodiments, both the 5′ and the 3′ homology arms may be shortened to avoid including certain sequence repeat elements.
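The relationship between the upstream and downstream sequences and the sequence to be integrated can be sketched as follows. This Python example is a minimal illustration, assuming that the template is assembled as upstream sequence, insert, and downstream sequence around a single cut position; the function name and the 800 bp default arm length (chosen from within the ranges recited above) are hypothetical. Separate length parameters are provided so that either arm may be shortened, for example to avoid a repeat element.

```python
def build_donor_template(genome, cut_site, insert, up_len=800, down_len=800):
    """Assemble upstream arm + insert + downstream arm around `cut_site`.

    `cut_site` is the 0-based index of the first base after the cut.
    Arms are truncated automatically if the cut lies close to either end
    of the supplied genomic sequence.
    """
    if not 0 <= cut_site <= len(genome):
        raise ValueError("cut_site must lie within the genomic sequence")
    if up_len < 0 or down_len < 0:
        raise ValueError("arm lengths must be non-negative")
    upstream = genome[max(0, cut_site - up_len):cut_site]
    downstream = genome[cut_site:cut_site + down_len]
    return upstream + insert + downstream
```

Shortening the 5′ arm to avoid a repeat would then amount to passing a smaller up_len while leaving down_len unchanged.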
In some methods, the exogenous polynucleotide template may further comprise a marker. Such a marker may make it easy to screen for targeted integrations. Examples of suitable markers include restriction sites, fluorescent proteins, or selectable markers. The exogenous polynucleotide template of the disclosure can be constructed using recombinant techniques (see, for example, Sambrook et al., 2001 and Ausubel et al., 1996).
In certain embodiments, a template nucleic acid for correcting a mutation may be designed for use as a single-stranded oligonucleotide. When using a single-stranded oligonucleotide, 5′ and 3′ homology arms may range up to about 200 base pairs (bp) in length, e.g., at least 25, 50, 75, 100, 125, 150, 175, or 200 bp in length.
Suzuki et al. describe in vivo genome editing via CRISPR/Cas9 mediated homology-independent targeted integration (2016, Nature 540:144-149).
In some embodiments, a TALE nuclease or TALE nuclease system can be used to modify a polynucleotide. In some embodiments, the methods provided herein use isolated, non-naturally occurring, recombinant or engineered DNA binding proteins that comprise TALE monomers or TALE monomers or half monomers as a part of their organizational structure that enable the targeting of nucleic acid sequences with improved efficiency and expanded specificity.
Naturally occurring TALEs or “wild type TALEs” are nucleic acid binding proteins secreted by numerous species of proteobacteria. TALE polypeptides contain a nucleic acid binding domain composed of tandem repeats of highly conserved monomer polypeptides that are predominantly 33, 34 or 35 amino acids in length and that differ from each other mainly in amino acid positions 12 and 13. In advantageous embodiments the nucleic acid is DNA. As used herein, the term “polypeptide monomers”, “TALE monomers” or “monomers” will be used to refer to the highly conserved repetitive polypeptide sequences within the TALE nucleic acid binding domain and the term “repeat variable di-residues” or “RVD” will be used to refer to the highly variable amino acids at positions 12 and 13 of the polypeptide monomers. As provided throughout the disclosure, the amino acid residues of the RVD are depicted using the IUPAC single letter code for amino acids. A general representation of a TALE monomer which is comprised within the DNA binding domain is X1-11—(X12X13)—X14-33 or 34 or 35, where the subscript indicates the amino acid position and X represents any amino acid. X12X13 indicate the RVDs. In some polypeptide monomers, the variable amino acid at position 13 is missing or absent and in such monomers, the RVD consists of a single amino acid. In such cases the RVD may be alternatively represented as X*, where X represents X12 and (*) indicates that X13 is absent. The DNA binding domain comprises several repeats of TALE monomers and this may be represented as (X1-11—(X12X13)—X14-33 or 34 or 35)z, where in an advantageous embodiment, z is at least 5 to 40. In a further advantageous embodiment, z is at least 10 to 26.
The TALE monomers can have a nucleotide binding affinity that is determined by the identity of the amino acids in its RVD. For example, polypeptide monomers with an RVD of NI can preferentially bind to adenine (A), monomers with an RVD of NG can preferentially bind to thymine (T), monomers with an RVD of HD can preferentially bind to cytosine (C) and monomers with an RVD of NN can preferentially bind to both adenine (A) and guanine (G). In some embodiments, monomers with an RVD of IG can preferentially bind to T. Thus, the number and order of the polypeptide monomer repeats in the nucleic acid binding domain of a TALE determines its nucleic acid target specificity. In some embodiments, monomers with an RVD of NS can recognize all four base pairs and can bind to A, T, G or C. The structure and function of TALEs is further described in, for example, Moscou et al., Science 326:1501 (2009); Boch et al., Science 326:1509-1512 (2009); and Zhang et al., Nature Biotechnology 29:149-153 (2011).
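Because the order of the RVDs determines the target sequence, the correspondence described in this paragraph can be written as a lookup table. The Python sketch below is illustrative only; the table contains just the RVDs named in this paragraph, the function name is hypothetical, and the use of IUPAC ambiguity codes (R for A or G, N for any base) to express non-unique preferences is a presentational assumption.

```python
# RVD -> preferred base(s), restricted to the RVDs named in the paragraph.
RVD_TO_IUPAC = {
    "NI": "A",  # adenine
    "NG": "T",  # thymine
    "IG": "T",  # thymine
    "HD": "C",  # cytosine
    "NN": "R",  # adenine or guanine
    "NS": "N",  # any of A, T, G or C
}


def rvds_to_target(rvds):
    """Translate an ordered list of RVDs into an IUPAC target string."""
    try:
        return "".join(RVD_TO_IUPAC[rvd.upper()] for rvd in rvds)
    except KeyError as exc:
        raise ValueError(f"RVD not in the illustrative table: {exc.args[0]}") from None
```

The output covers only the bases read by the repeat monomers and does not include any base specified by the N-terminal region.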
The polypeptides used in methods of the invention can be isolated, non-naturally occurring, recombinant or engineered nucleic acid-binding proteins that have nucleic acid or DNA binding regions containing polypeptide monomer repeats that are designed to target specific nucleic acid sequences.
As described herein, polypeptide monomers having an RVD of HN or NH preferentially bind to guanine and thereby allow the generation of TALE polypeptides with high binding specificity for guanine containing target nucleic acid sequences. In some embodiments, polypeptide monomers having RVDs RN, NN, NK, SN, NH, KN, HN, NQ, HH, RG, KH, RH and SS can preferentially bind to guanine. In some embodiments, polypeptide monomers having RVDs RN, NK, NQ, HH, KH, RH, SS and SN can preferentially bind to guanine and can thus allow the generation of TALE polypeptides with high binding specificity for guanine containing target nucleic acid sequences. In some embodiments, polypeptide monomers having RVDs HH, KH, NH, NK, NQ, RH, RN and SS can preferentially bind to guanine and thereby allow the generation of TALE polypeptides with high binding specificity for guanine containing target nucleic acid sequences. In some embodiments, the RVDs that have high binding specificity for guanine are RN, NH, RH and KH. Furthermore, polypeptide monomers having an RVD of NV can preferentially bind to adenine and guanine. In some embodiments, monomers having RVDs of H*, HA, KA, N*, NA, NC, NS, RA, and S* bind to adenine, guanine, cytosine and thymine with comparable affinity.
The predetermined N-terminal to C-terminal order of the one or more polypeptide monomers of the nucleic acid or DNA binding domain determines the corresponding predetermined target nucleic acid sequence to which the polypeptides of the invention will bind. As used herein the monomers and at least one or more half monomers are “specifically ordered to target” the genomic locus or gene of interest. In plant genomes, the natural TALE-binding sites always begin with a thymine (T), which may be specified by a cryptic signal within the non-repetitive N-terminus of the TALE polypeptide; in some cases, this region may be referred to as repeat 0. In animal genomes, TALE binding sites do not necessarily have to begin with a thymine (T) and polypeptides of the invention may target DNA sequences that begin with T, A, G or C. The tandem repeat of TALE monomers always ends with a half-length repeat or a stretch of sequence that may share identity with only the first 20 amino acids of a repetitive full-length TALE monomer and this half repeat may be referred to as a half-monomer. Therefore, it follows that the length of the nucleic acid or DNA being targeted is equal to the number of full monomers plus two.
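The length relationship stated above (target length equals the number of full monomers plus two) can be checked with a small design sketch. In the Python example below, the first base of the target is treated as the thymine specified by the N-terminal region, the last base is assigned to the half-monomer, and every base in between is assigned to a full monomer. The function name is hypothetical, and the choice of one RVD per base (NI, NG, HD, and NH for guanine) is an assumption drawn from the preferences described herein; other RVDs could equally be selected.

```python
# One illustrative RVD per base; NH is used for G as a guanine-preferring RVD.
BASE_TO_RVD = {"A": "NI", "T": "NG", "C": "HD", "G": "NH"}


def design_tale_rvds(target, require_t0=True):
    """Return (full_monomer_rvds, half_monomer_rvd) for a DNA target.

    target[0] is the base read by the N-terminal region ("repeat 0"),
    target[1:-1] are read by full monomers, target[-1] by the half-monomer.
    """
    target = target.upper()
    if len(target) < 2:
        raise ValueError("target must contain at least two bases")
    if require_t0 and target[0] != "T":
        raise ValueError("target is expected to begin with T")
    rvds = [BASE_TO_RVD[base] for base in target[1:]]
    full, half = rvds[:-1], rvds[-1]
    # Length rule from the text: target length = full monomers + 2.
    assert len(target) == len(full) + 2
    return full, half
```

Setting require_t0 to False reflects the statement that binding sites in animal genomes need not begin with a thymine.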
As described in Zhang et al., Nature Biotechnology 29:149-153 (2011), TALE polypeptide binding efficiency may be increased by including amino acid sequences from the “capping regions” that are directly N-terminal or C-terminal of the DNA binding region of naturally occurring TALEs into the engineered TALEs at positions N-terminal or C-terminal of the engineered TALE DNA binding region. Thus, in certain embodiments, the TALE polypeptides described herein further comprise an N-terminal capping region and/or a C-terminal capping region.
An exemplary amino acid sequence of a N-terminal capping region is:
An exemplary amino acid sequence of a C-terminal capping region is:
As used herein the predetermined “N-terminus” to “C terminus” orientation of the N-terminal capping region, the DNA binding domain comprising the repeat TALE monomers and the C-terminal capping region provide structural basis for the organization of different domains in the d-TALEs or polypeptides of the invention.
The entire N-terminal and/or C-terminal capping regions are not necessary to enhance the binding activity of the DNA binding region. Therefore, in certain embodiments, fragments of the N-terminal and/or C-terminal capping regions are included in the TALE polypeptides described herein.
In certain embodiments, the TALE polypeptides described herein contain an N-terminal capping region fragment that includes at least 10, 20, 30, 40, 50, 54, 60, 70, 80, 87, 90, 94, 100, 102, 110, 117, 120, 130, 140, 147, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260 or 270 amino acids of an N-terminal capping region. In certain embodiments, the N-terminal capping region fragment amino acids are of the C-terminus (the DNA-binding region proximal end) of an N-terminal capping region. As described in Zhang et al., Nature Biotechnology 29:149-153 (2011), N-terminal capping region fragments that include the C-terminal 240 amino acids enhance binding activity equal to the full-length capping region, while fragments that include the C-terminal 147 amino acids retain greater than 80% of the efficacy of the full-length capping region, and fragments that include the C-terminal 117 amino acids retain greater than 50% of the activity of the full-length capping region.
In some embodiments, the TALE polypeptides described herein contain a C-terminal capping region fragment that includes at least 6, 10, 20, 30, 37, 40, 50, 60, 68, 70, 80, 90, 100, 110, 120, 127, 130, 140, 150, 155, 160, 170, or 180 amino acids of a C-terminal capping region. In certain embodiments, the C-terminal capping region fragment amino acids are of the N-terminus (the DNA-binding region proximal end) of a C-terminal capping region. As described in Zhang et al., Nature Biotechnology 29:149-153 (2011), C-terminal capping region fragments that include the C-terminal 68 amino acids enhance binding activity equal to the full-length capping region, while fragments that include the C-terminal 20 amino acids retain greater than 50% of the efficacy of the full-length capping region.
In certain embodiments, the capping regions of the TALE polypeptides described herein do not need to have identical sequences to the capping region sequences provided herein. Thus, in some embodiments, the capping region of the TALE polypeptides described herein have sequences that are at least 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical or share identity to the capping region amino acid sequences provided herein. Sequence identity is related to sequence homology. Homology comparisons may be conducted by eye, or more usually, with the aid of readily available sequence comparison programs. These commercially available computer programs may calculate percent (%) homology between two or more sequences and may also calculate the sequence identity shared by two or more amino acid or nucleic acid sequences. In some preferred embodiments, the capping region of the TALE polypeptides described herein have sequences that are at least 95% identical or share identity to the capping region amino acid sequences provided herein.
Sequence homologies can be generated by any of a number of computer programs known in the art, which include but are not limited to BLAST or FASTA. Suitable computer programs for carrying out alignments like the GCG Wisconsin Bestfit package may also be used. Once the software has produced an optimal alignment, it is possible to calculate % homology, preferably % sequence identity. The software typically does this as part of the sequence comparison and generates a numerical result.
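Once an alignment has been produced, the percent identity calculation itself is straightforward. The following Python sketch is a simplified illustration and is not the algorithm of BLAST, FASTA, or Bestfit: it assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts identical columns over all aligned columns. Programs differ in how they treat gaps in the denominator, so this convention is an assumption; the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns in which both sequences carry a gap are ignored; a column
    with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=95.0):
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

A candidate capping region could be compared against a reference capping region in this way to see whether it satisfies, for example, the at least 95% identity criterion.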
In some embodiments described herein, the TALE polypeptides of the invention include a nucleic acid binding domain linked to the one or more effector domains. The terms “effector domain” or “regulatory and functional domain” refer to a polypeptide sequence that has an activity other than binding to the nucleic acid sequence recognized by the nucleic acid binding domain. By combining a nucleic acid binding domain with one or more effector domains, the polypeptides of the invention may be used to target the one or more functions or activities mediated by the effector domain to a particular target DNA sequence to which the nucleic acid binding domain specifically binds.
In some embodiments of the TALE polypeptides described herein, the activity mediated by the effector domain is a biological activity. For example, in some embodiments the effector domain is a transcriptional inhibitor (i.e., a repressor domain), such as an mSin interaction domain (SID), a SID4X domain, or a Kruppel-associated box (KRAB) or fragments of the KRAB domain. In some embodiments the effector domain is an enhancer of transcription (i.e., an activation domain), such as the VP16, VP64 or p65 activation domain. In some embodiments, the nucleic acid binding domain is linked, for example, with an effector domain that includes but is not limited to a transposase, integrase, recombinase, resolvase, invertase, protease, DNA methyltransferase, DNA demethylase, histone acetylase, histone deacetylase, nuclease, transcriptional repressor, transcriptional activator, transcription factor recruiting, protein nuclear-localization signal or cellular uptake signal.
In some embodiments, the effector domain is a protein domain which exhibits activities which include but are not limited to transposase activity, integrase activity, recombinase activity, resolvase activity, invertase activity, protease activity, DNA methyltransferase activity, DNA demethylase activity, histone acetylase activity, histone deacetylase activity, nuclease activity, nuclear-localization signaling activity, transcriptional repressor activity, transcriptional activator activity, transcription factor recruiting activity, or cellular uptake signaling activity. Other preferred embodiments of the invention may include any combination of the activities described herein.
In some embodiments, a meganuclease or system thereof can be used to modify a polynucleotide. Meganucleases are endodeoxyribonucleases characterized by a large recognition site (double-stranded DNA sequences of 12 to 40 base pairs). Exemplary methods for using meganucleases can be found in U.S. Pat. Nos. 8,163,514, 8,133,697, 8,021,867, 8,119,361, 8,119,381, 8,124,369, and 8,129,134, which are specifically incorporated herein by reference.
In certain embodiments, the genetic modifying agent is RNAi (e.g., shRNA). As used herein, “gene silencing” or “gene silenced” in reference to an activity of an RNAi molecule, for example a siRNA or miRNA, refers to a decrease in the mRNA level in a cell for a target gene by at least about 5%, about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 95%, about 99%, or about 100% of the mRNA level found in the cell without the presence of the miRNA or RNA interference molecule. In one preferred embodiment, the mRNA levels are decreased by at least about 70%, about 80%, about 90%, about 95%, about 99%, or about 100%.
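As a minimal sketch of the arithmetic behind this definition, the decrease in mRNA level can be expressed as a percentage of the level measured in the cell without the RNAi molecule. The function names, the input units, and the default 70% threshold (taken from the preferred embodiment above) are choices made for this example only; how the mRNA levels themselves are measured is outside the sketch.

```python
def percent_knockdown(mrna_with_rnai: float, mrna_control: float) -> float:
    """Percent decrease in target mRNA relative to the control (no RNAi) level.

    Both inputs must be in the same (arbitrary) units, e.g. normalized
    expression values. These inputs are hypothetical for this sketch.
    """
    if mrna_control <= 0:
        raise ValueError("control mRNA level must be positive")
    if mrna_with_rnai < 0:
        raise ValueError("mRNA level cannot be negative")
    return 100.0 * (1.0 - mrna_with_rnai / mrna_control)


def is_silenced(mrna_with_rnai: float, mrna_control: float,
                threshold_percent: float = 70.0) -> bool:
    """True if the decrease meets the chosen threshold (70% assumed by default)."""
    return percent_knockdown(mrna_with_rnai, mrna_control) >= threshold_percent


# Hypothetical measurement: 25 units remaining against a control of 100 units.
knockdown = percent_knockdown(25.0, 100.0)
```

A lower threshold (for example 5% or 50%) can be passed as `threshold_percent` to reflect the broader ranges recited above.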
As used herein, the term “RNAi” refers to any type of interfering RNA, including but not limited to, siRNAi, shRNAi, endogenous microRNA and artificial microRNA. For instance, it includes sequences previously identified as siRNA, regardless of the mechanism of down-stream processing of the RNA (i.e. although siRNAs are believed to have a specific method of in vivo processing resulting in the cleavage of mRNA, such sequences can be incorporated into the vectors in the context of the flanking sequences described herein). The term “RNAi” can include both gene silencing RNAi molecules, and also RNAi effector molecules which activate the expression of a gene.
As used herein, a “siRNA” refers to a nucleic acid that forms a double stranded RNA, which double stranded RNA has the ability to reduce or inhibit expression of a gene or target gene when the siRNA is present or expressed in the same cell as the target gene. The double stranded RNA siRNA can be formed by the complementary strands. In one embodiment, a siRNA refers to a nucleic acid that can form a double stranded siRNA. The sequence of the siRNA can correspond to the full-length target gene, or a subsequence thereof. Typically, the siRNA is at least about 15-50 nucleotides in length (e.g., each complementary sequence of the double stranded siRNA is about 15-50 nucleotides in length, and the double stranded siRNA is about 15-50 base pairs in length, preferably about 19-30 nucleotides, preferably about 20-25 nucleotides in length, e.g., 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length).
As used herein “shRNA” or “small hairpin RNA” (also called stem loop) is a type of siRNA. In one embodiment, these shRNAs are composed of a short, e.g. about 19 to about 25 nucleotide, antisense strand, followed by a nucleotide loop of about 5 to about 9 nucleotides, and the analogous sense strand. Alternatively, the sense strand can precede the nucleotide loop structure and the antisense strand can follow.
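The strand arrangement described above can be illustrated with a short sketch that assembles a hairpin sequence from a sense strand and a loop. The function names, the example sense sequence, and the example loop are hypothetical and are not sequences from this disclosure; treating the "about 19 to about 25" and "about 5 to about 9" nucleotide ranges as strict limits is likewise an assumption made only to keep the example simple.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    seq = seq.upper()
    if set(seq) - set("ACGU"):
        raise ValueError("sequence must contain only A, C, G, U")
    return seq.translate(_COMPLEMENT)[::-1]


def build_shrna(sense: str, loop: str, antisense_first: bool = True) -> str:
    """Assemble antisense-loop-sense (default) or sense-loop-antisense."""
    sense = sense.upper()
    loop = loop.upper()
    if not 19 <= len(sense) <= 25:
        raise ValueError("stem strand expected to be 19-25 nt in this sketch")
    if not 5 <= len(loop) <= 9:
        raise ValueError("loop expected to be 5-9 nt in this sketch")
    if set(loop) - set("ACGU"):
        raise ValueError("loop must contain only A, C, G, U")
    antisense = reverse_complement_rna(sense)
    if antisense_first:
        return antisense + loop + sense
    return sense + loop + antisense


# Hypothetical 21-nt sense strand and 9-nt loop, for illustration only.
sense = "GACUUCAGGAUCCAUGCAAGU"
loop = "UUCAAGAGA"
hairpin = build_shrna(sense, loop)
alternative = build_shrna(sense, loop, antisense_first=False)
```

Passing `antisense_first=False` gives the alternative arrangement in which the sense strand precedes the loop and the antisense strand follows.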
The terms “microRNA” or “miRNA” are used interchangeably herein and refer to endogenous RNAs, some of which are known to regulate the expression of protein-coding genes at the posttranscriptional level. Endogenous microRNAs are small RNAs naturally present in the genome that are capable of modulating the productive utilization of mRNA. The term artificial microRNA includes any type of RNA sequence, other than endogenous microRNA, which is capable of modulating the productive utilization of mRNA. MicroRNA sequences have been described in publications such as Lim, et al., Genes & Development, 17, p. 991-1008 (2003), Lim et al Science 299, 1540 (2003), Lee and Ambros Science, 294, 862 (2001), Lau et al., Science 294, 858-861 (2001), Lagos-Quintana et al, Current Biology, 12, 735-739 (2002), Lagos Quintana et al, Science 294, 853-857 (2001), and Lagos-Quintana et al, RNA, 9, 175-179 (2003), which are incorporated by reference. Multiple microRNAs can also be incorporated into a precursor molecule. Furthermore, miRNA-like stem-loops can be expressed in cells as a vehicle to deliver artificial miRNAs and short interfering RNAs (siRNAs) for the purpose of modulating the expression of endogenous genes through the miRNA and/or RNAi pathways.
As used herein, “double stranded RNA” or “dsRNA” refers to RNA molecules that are comprised of two strands. Double-stranded molecules include those comprised of a single RNA molecule that doubles back on itself to form a two-stranded structure. For example, the stem loop structure of the progenitor molecules from which the single-stranded miRNA is derived, called the pre-miRNA (Bartel et al. 2004. Cell 116:281-297), comprises a dsRNA molecule.
In certain embodiments, the one or more agents is an antibody. The term “antibody” is used interchangeably with the term “immunoglobulin” herein, and includes intact antibodies, fragments of antibodies, e.g., Fab, F(ab′)2 fragments, and intact antibodies and fragments that have been mutated either in their constant and/or variable region (e.g., mutations to produce chimeric, partially humanized, or fully humanized antibodies, as well as to produce antibodies with a desired trait, e.g., enhanced binding and/or reduced FcR binding). The term “fragment” refers to a part or portion of an antibody or antibody chain comprising fewer amino acid residues than an intact or complete antibody or antibody chain. Fragments can be obtained via chemical or enzymatic treatment of an intact or complete antibody or antibody chain. Fragments can also be obtained by recombinant means. Exemplary fragments include Fab, Fab′, F(ab′)2, Fabc, Fd, dAb, VHH and scFv and/or Fv fragments.
As used herein, a preparation of antibody protein having less than about 50% of non-antibody protein (also referred to herein as a “contaminating protein”), or of chemical precursors, is considered to be “substantially free.” Preferably, a preparation having less than about 40%, 30%, 20%, or 10%, and more preferably less than about 5% (by dry weight), of non-antibody protein, or of chemical precursors, is considered to be substantially free. When the antibody protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 30%, preferably less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume or mass of the protein preparation.
The term “antigen-binding fragment” refers to a polypeptide fragment of an immunoglobulin or antibody that binds antigen or competes with intact antibody (i.e., with the intact antibody from which they were derived) for antigen binding (i.e., specific binding). As such these antibodies or fragments thereof are included in the scope of the invention, provided that the antibody or fragment binds specifically to a target molecule.
It is intended that the term “antibody” encompass any Ig class or any Ig subclass (e.g. the IgG1, IgG2, IgG3, and IgG4 subclasses of IgG) obtained from any source (e.g., humans, non-human primates, rodents, lagomorphs, caprines, bovines, equines, ovines, etc.).
The term “Ig class” or “immunoglobulin class”, as used herein, refers to the five classes of immunoglobulin that have been identified in humans and higher mammals, IgG, IgM, IgA, IgD, and IgE. The term “Ig subclass” refers to the two subclasses of IgM (H and L), three subclasses of IgA (IgA1, IgA2, and secretory IgA), and four subclasses of IgG (IgG1, IgG2, IgG3, and IgG4) that have been identified in humans and higher mammals. The antibodies can exist in monomeric or polymeric form; for example, IgM antibodies exist in pentameric form, and IgA antibodies exist in monomeric, dimeric or multimeric form.
The term “IgG subclass” refers to the four subclasses of immunoglobulin class IgG (IgG1, IgG2, IgG3, and IgG4) that have been identified in humans and higher mammals by the heavy chains of the immunoglobulins, γ1-γ4, respectively. The term “single-chain immunoglobulin” or “single-chain antibody” (used interchangeably herein) refers to a protein having a two-polypeptide chain structure consisting of a heavy and a light chain, said chains being stabilized, for example, by interchain peptide linkers, which has the ability to specifically bind antigen. The term “domain” refers to a globular region of a heavy or light chain polypeptide comprising peptide loops (e.g., comprising 3 to 4 peptide loops) stabilized, for example, by β pleated sheet and/or intrachain disulfide bond. Domains are further referred to herein as “constant” or “variable”, based on the relative lack of sequence variation within the domains of various class members in the case of a “constant” domain, or the significant variation within the domains of various class members in the case of a “variable” domain. Antibody or polypeptide “domains” are often referred to interchangeably in the art as antibody or polypeptide “regions”. The “constant” domains of an antibody light chain are referred to interchangeably as “light chain constant regions”, “light chain constant domains”, “CL” regions or “CL” domains. The “constant” domains of an antibody heavy chain are referred to interchangeably as “heavy chain constant regions”, “heavy chain constant domains”, “CH” regions or “CH” domains. The “variable” domains of an antibody light chain are referred to interchangeably as “light chain variable regions”, “light chain variable domains”, “VL” regions or “VL” domains. The “variable” domains of an antibody heavy chain are referred to interchangeably as “heavy chain variable regions”, “heavy chain variable domains”, “VH” regions or “VH” domains.
The term “region” can also refer to a part or portion of an antibody chain or antibody chain domain (e.g., a part or portion of a heavy or light chain or a part or portion of a constant or variable domain, as defined herein), as well as more discrete parts or portions of said chains or domains. For example, light and heavy chains or light and heavy chain variable domains include “complementarity determining regions” or “CDRs” interspersed among “framework regions” or “FRs”, as defined herein.
The term “conformation” refers to the tertiary structure of a protein or polypeptide (e.g., an antibody, antibody chain, domain or region thereof). For example, the phrase “light (or heavy) chain conformation” refers to the tertiary structure of a light (or heavy) chain variable region, and the phrase “antibody conformation” or “antibody fragment conformation” refers to the tertiary structure of an antibody or fragment thereof.
The term “antibody-like protein scaffolds” or “engineered protein scaffolds” broadly encompasses proteinaceous non-immunoglobulin specific-binding agents, typically obtained by combinatorial engineering (such as site-directed random mutagenesis in combination with phage display or other molecular selection techniques). Usually, such scaffolds are derived from robust and small soluble monomeric proteins (such as Kunitz inhibitors or lipocalins) or from a stably folded extra-membrane domain of a cell surface receptor (such as protein A, fibronectin or the ankyrin repeat).
Such scaffolds have been extensively reviewed in Binz et al. (Engineering novel binding proteins from nonimmunoglobulin domains. Nat Biotechnol 2005, 23:1257-1268), Gebauer and Skerra (Engineered protein scaffolds as next-generation antibody therapeutics. Curr Opin Chem Biol. 2009, 13:245-55), Gill and Damle (Biopharmaceutical drug discovery using novel protein scaffolds. Curr Opin Biotechnol 2006, 17:653-658), Skerra (Engineered protein scaffolds for molecular recognition. J Mol Recognit 2000, 13:167-187), and Skerra (Alternative non-antibody scaffolds for molecular recognition. Curr Opin Biotechnol 2007, 18:295-304), and include without limitation affibodies, based on the Z-domain of staphylococcal protein A, a three-helix bundle of 58 residues providing an interface on two of its alpha-helices (Nygren, Alternative binding proteins: Affibody binding proteins developed from a small three-helix bundle scaffold. FEBS J 2008, 275:2668-2676); engineered Kunitz domains based on a small (ca. 58 residues) and robust, disulphide-crosslinked serine protease inhibitor, typically of human origin (e.g. LACI-D1), which can be engineered for different protease specificities (Nixon and Wood, Engineered protein inhibitors of proteases. Curr Opin Drug Discov Dev 2006, 9:261-268); monobodies or adnectins based on the 10th extracellular domain of human fibronectin III (10Fn3), which adopts an Ig-like beta-sandwich fold (94 residues) with 2-3 exposed loops, but lacks the central disulphide bridge (Koide and Koide, Monobodies: antibody mimics based on the scaffold of the fibronectin type III domain. Methods Mol Biol 2007, 352:95-109); anticalins derived from the lipocalins, a diverse family of eight-stranded beta-barrel proteins (ca. 180 residues) that naturally form binding sites for small ligands by means of four structurally variable loops at the open end, which are abundant in humans, insects, and many other organisms (Skerra, Alternative binding proteins: Anticalins—harnessing the structural plasticity of the lipocalin ligand pocket to engineer novel binding activities. FEBS J 2008, 275:2677-2683); DARPins, designed ankyrin repeat domains (166 residues), which provide a rigid interface arising from typically three repeated beta-turns (Stumpp et al., DARPins: a new generation of protein therapeutics. Drug Discov Today 2008, 13:695-701); avimers (multimerized LDLR-A module) (Silverman et al., Multivalent avimer proteins evolved by exon shuffling of a family of human receptor domains. Nat Biotechnol 2005, 23:1556-1561); and cysteine-rich knottin peptides (Kolmar, Alternative binding proteins: biological activity and therapeutic potential of cystine-knot miniproteins. FEBS J 2008, 275:2684-2690).
“Specific binding” of an antibody means that the antibody exhibits appreciable affinity for a particular antigen or epitope and, generally, does not exhibit significant cross reactivity. “Appreciable” binding includes binding with an affinity of at least 25 μM. Antibodies with affinities greater than 1×10^7 M^−1 (or a dissociation coefficient of 1 μM or less or a dissociation coefficient of 1 nM or less) typically bind with correspondingly greater specificity. Values intermediate of those set forth herein are also intended to be within the scope of the present invention and antibodies of the invention bind with a range of affinities, for example, 100 nM or less, 75 nM or less, 50 nM or less, 25 nM or less, for example 10 nM or less, 5 nM or less, 1 nM or less, or in embodiments 500 pM or less, 100 pM or less, 50 pM or less or 25 pM or less. An antibody that “does not exhibit significant crossreactivity” is one that will not appreciably bind to an entity other than its target (e.g., a different epitope or a different molecule). For example, an antibody that specifically binds to a target molecule will appreciably bind the target molecule but will not significantly react with non-target molecules or peptides. An antibody specific for a particular epitope will, for example, not significantly crossreact with remote epitopes on the same protein or peptide. Specific binding can be determined according to any art-recognized means for determining such binding. Preferably, specific binding is determined according to Scatchard analysis and/or competitive binding assays.
As used herein, the term “affinity” refers to the strength of the binding of a single antigen-combining site with an antigenic determinant. Affinity depends on the closeness of stereochemical fit between antibody combining sites and antigen determinants, on the size of the area of contact between them, on the distribution of charged and hydrophobic groups, etc. Antibody affinity can be measured by equilibrium dialysis or by the kinetic BIACORE™ method. The dissociation constant, Kd, and the association constant, Ka, are quantitative measures of affinity.
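For illustration only, the relationship between the two constants can be sketched under the assumption of a simple 1:1 binding equilibrium, in which the dissociation constant is the reciprocal of the association constant (Kd = 1/Ka). The function names and the unit formatting helper are hypothetical and are introduced solely for this example.

```python
def kd_from_ka(ka_per_molar: float) -> float:
    """Dissociation constant (M) from association constant (M^-1), 1:1 binding assumed."""
    if ka_per_molar <= 0:
        raise ValueError("Ka must be positive")
    return 1.0 / ka_per_molar


def ka_from_kd(kd_molar: float) -> float:
    """Association constant (M^-1) from dissociation constant (M), 1:1 binding assumed."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return 1.0 / kd_molar


def format_kd(kd_molar: float) -> str:
    """Express a Kd given in mol/L with the largest unit that keeps the value >= 1."""
    for unit, scale in (("M", 1.0), ("mM", 1e-3), ("uM", 1e-6),
                        ("nM", 1e-9), ("pM", 1e-12)):
        if kd_molar >= scale:
            return f"{kd_molar / scale:.3g} {unit}"
    return f"{kd_molar / 1e-12:.3g} pM"


# An association constant of 1e7 M^-1 expressed as a dissociation constant.
kd_text = format_kd(kd_from_ka(1e7))
# A Kd of 25 pM expressed as an association constant (about 4e10 M^-1).
ka_value = ka_from_kd(25e-12)
```

Because the two constants are reciprocals under this assumption, a smaller Kd corresponds to a larger Ka and therefore to tighter binding.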
As used herein, the term “monoclonal antibody” refers to an antibody derived from a clonal population of antibody-producing cells (e.g., B lymphocytes or B cells) which is homogeneous in structure and antigen specificity. The term “polyclonal antibody” refers to a plurality of antibodies originating from different clonal populations of antibody-producing cells which are heterogeneous in their structure and epitope specificity but which recognize a common antigen. Monoclonal and polyclonal antibodies may exist within bodily fluids, as crude preparations, or may be purified, as described herein.
The term “binding portion” of an antibody (or “antibody portion”) includes one or more complete domains, e.g., a pair of complete domains, as well as fragments of an antibody that retain the ability to specifically bind to a target molecule. It has been shown that the binding function of an antibody can be performed by fragments of a full-length antibody. Binding fragments are produced by recombinant DNA techniques, or by enzymatic or chemical cleavage of intact immunoglobulins. Binding fragments include Fab, Fab′, F(ab′)2, Fabc, Fd, dAb, Fv, single chains, single-chain antibodies, e.g., scFv, and single domain antibodies.
“Humanized” forms of non-human (e.g., murine) antibodies are chimeric antibodies that contain minimal sequence derived from non-human immunoglobulin. For the most part, humanized antibodies are human immunoglobulins (recipient antibody) in which residues from a hypervariable region of the recipient are replaced by residues from a hypervariable region of a non-human species (donor antibody) such as mouse, rat, rabbit or nonhuman primate having the desired specificity, affinity, and capacity. In some instances, FR residues of the human immunoglobulin are replaced by corresponding non-human residues. Furthermore, humanized antibodies may comprise residues that are not found in the recipient antibody or in the donor antibody. These modifications are made to further refine antibody performance. In general, the humanized antibody will comprise substantially all of at least one, and typically two, variable domains, in which all or substantially all of the hypervariable regions correspond to those of a non-human immunoglobulin and all or substantially all of the FR regions are those of a human immunoglobulin sequence. The humanized antibody optionally also will comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin.
Examples of portions of antibodies or epitope-binding proteins encompassed by the present definition include: (i) the Fab fragment, having VL, CL, VH and CH1 domains; (ii) the Fab′ fragment, which is a Fab fragment having one or more cysteine residues at the C-terminus of the CH1 domain; (iii) the Fd fragment having VH and CH1 domains; (iv) the Fd′ fragment having VH and CH1 domains and one or more cysteine residues at the C-terminus of the CH1 domain; (v) the Fv fragment having the VL and VH domains of a single arm of an antibody; (vi) the dAb fragment (Ward et al., 341 Nature 544 (1989)) which consists of a VH domain or a VL domain that binds antigen; (vii) isolated CDR regions or isolated CDR regions presented in a functional framework; (viii) F(ab′)2 fragments which are bivalent fragments including two Fab′ fragments linked by a disulphide bridge at the hinge region; (ix) single chain antibody molecules (e.g., single chain Fv; scFv) (Bird et al., 242 Science 423 (1988); and Huston et al., 85 PNAS 5879 (1988)); (x) “diabodies” with two antigen binding sites, comprising a heavy chain variable domain (VH) connected to a light chain variable domain (VL) in the same polypeptide chain (see, e.g., EP 404,097; WO 93/11161; Hollinger et al., 90 PNAS 6444 (1993)); (xi) “linear antibodies” comprising a pair of tandem Fd segments (VH-CH1-VH-CH1) which, together with complementary light chain polypeptides, form a pair of antigen binding regions (Zapata et al., Protein Eng. 8(10):1057-62 (1995); and U.S. Pat. No. 5,641,870).
As used herein, a “blocking” antibody or an antibody “antagonist” is one which inhibits or reduces biological activity of the antigen(s) it binds. In certain embodiments, the blocking antibodies or antagonist antibodies or portions thereof described herein completely inhibit the biological activity of the antigen(s).
Antibodies may act as agonists or antagonists of the recognized polypeptides. For example, the present invention includes antibodies which disrupt receptor/ligand interactions either partially or fully. The invention features both receptor-specific antibodies and ligand-specific antibodies. The invention also features receptor-specific antibodies which do not prevent ligand binding but prevent receptor activation. Receptor activation (i.e., signaling) may be determined by techniques described herein or otherwise known in the art. For example, receptor activation can be determined by detecting the phosphorylation (e.g., tyrosine or serine/threonine) of the receptor or of one of its down-stream substrates by immunoprecipitation followed by western blot analysis. In specific embodiments, antibodies are provided that inhibit ligand activity or receptor activity by at least 95%, at least 90%, at least 85%, at least 80%, at least 75%, at least 70%, at least 60%, or at least 50% of the activity in absence of the antibody.
The invention also features receptor-specific antibodies which both prevent ligand binding and receptor activation as well as antibodies that recognize the receptor-ligand complex. Likewise, encompassed by the invention are neutralizing antibodies which bind the ligand and prevent binding of the ligand to the receptor, as well as antibodies which bind the ligand, thereby preventing receptor activation, but do not prevent the ligand from binding the receptor. Further included in the invention are antibodies which activate the receptor. These antibodies may act as receptor agonists, i.e., potentiate or activate either all or a subset of the biological activities of the ligand-mediated receptor activation, for example, by inducing dimerization of the receptor. The antibodies may be specified as agonists, antagonists or inverse agonists for biological activities comprising the specific biological activities of the peptides disclosed herein. The antibody agonists and antagonists can be made using methods known in the art. See, e.g., PCT publication WO 96/40281; U.S. Pat. No. 5,811,097; Deng et al., Blood 92(6):1981-1988 (1998); Chen et al., Cancer Res. 58(16):3668-3678 (1998); Harrop et al., J. Immunol. 161(4):1786-1794 (1998); Zhu et al., Cancer Res. 58(15):3209-3214 (1998); Yoon et al., J. Immunol. 160(7):3170-3179 (1998); Prat et al., J. Cell. Sci. 111(Pt 2):237-247 (1998); Pitard et al., J. Immunol. Methods 205(2):177-190 (1997); Liautard et al., Cytokine 9(4):233-241 (1997); Carlson et al., J. Biol. Chem. 272(17):11295-11301 (1997); Taryman et al., Neuron 14(4):755-762 (1995); Muller et al., Structure 6(9):1153-1167 (1998); Bartunek et al., Cytokine 8(1):14-20 (1996).
The antibodies as defined for the present invention include derivatives that are modified, i.e., by the covalent attachment of any type of molecule to the antibody such that covalent attachment does not prevent the antibody from generating an anti-idiotypic response. For example, but not by way of limitation, the antibody derivatives include antibodies that have been modified, e.g., by glycosylation, acetylation, pegylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, linkage to a cellular ligand or other protein, etc. Any of numerous chemical modifications may be carried out by known techniques, including, but not limited to specific chemical cleavage, acetylation, formylation, metabolic synthesis of tunicamycin, etc. Additionally, the derivative may contain one or more non-classical amino acids.
Simple binding assays can be used to screen for or detect agents that bind to a target protein, or disrupt the interaction between proteins (e.g., a receptor and a ligand). Because certain targets of the present invention are transmembrane proteins, assays that use the soluble forms of these proteins rather than full-length protein can be used, in some embodiments. Soluble forms include, for example, those lacking the transmembrane domain and/or those comprising the IgV domain or fragments thereof which retain their ability to bind their cognate binding partners. Further, agents that inhibit or enhance protein interactions for use in the compositions and methods described herein, can include recombinant peptido-mimetics.
Detection methods useful in screening assays include antibody-based methods, detection of a reporter moiety, detection of cytokines as described herein, and detection of a gene signature as described herein.
Another variation of assays to determine binding of a receptor protein to a ligand protein is through the use of affinity biosensor methods. Such methods may be based on the piezoelectric effect, electrochemistry, or optical methods, such as ellipsometry, optical wave guidance, and surface plasmon resonance (SPR).
In certain embodiments, the one or more agents is an aptamer. Nucleic acid aptamers are nucleic acid species that have been engineered through repeated rounds of in vitro selection or equivalently, SELEX (systematic evolution of ligands by exponential enrichment) to bind to various molecular targets such as small molecules, proteins, nucleic acids, cells, tissues and organisms. Nucleic acid aptamers have specific binding affinity to molecules through interactions other than classic Watson-Crick base pairing. Aptamers are useful in biotechnological and therapeutic applications as they offer molecular recognition properties similar to antibodies. In addition to their discriminate recognition, aptamers offer advantages over antibodies as they can be engineered completely in a test tube, are readily produced by chemical synthesis, possess desirable storage properties, and elicit little or no immunogenicity in therapeutic applications. In certain embodiments, RNA aptamers may be expressed from a DNA construct. In other embodiments, a nucleic acid aptamer may be linked to another polynucleotide sequence. The polynucleotide sequence may be a double stranded DNA polynucleotide sequence. The aptamer may be covalently linked to one strand of the polynucleotide sequence. The aptamer may be ligated to the polynucleotide sequence. The polynucleotide sequence may be configured, such that the polynucleotide sequence may be linked to a solid support or ligated to another polynucleotide sequence.
Aptamers, like peptides generated by phage display or monoclonal antibodies (“mAbs”), are capable of specifically binding to selected targets and modulating the target's activity, e.g., through binding, aptamers may block their target's ability to function. A typical aptamer is 10-15 kDa in size (30-45 nucleotides), binds its target with sub-nanomolar affinity, and discriminates against closely related targets (e.g., aptamers will typically not bind other proteins from the same gene family). Structural studies have shown that aptamers are capable of using the same types of binding interactions (e.g., hydrogen bonding, electrostatic complementarity, hydrophobic contacts, steric exclusion) that drive affinity and specificity in antibody-antigen complexes.
Aptamers have a number of desirable characteristics for use in research and as therapeutics and diagnostics including high specificity and affinity, biological efficacy, and excellent pharmacokinetic properties. In addition, they offer specific competitive advantages over antibodies and other protein biologics. Aptamers are chemically synthesized and are readily scaled as needed to meet production demand for research, diagnostic or therapeutic applications. Aptamers are chemically robust. They are intrinsically adapted to regain activity following exposure to factors such as heat and denaturants and can be stored for extended periods (>1 yr) at room temperature as lyophilized powders. Not being bound by a theory, aptamers bound to a solid support or beads may be stored for extended periods.
Oligonucleotides in their phosphodiester form may be quickly degraded by intracellular and extracellular enzymes such as endonucleases and exonucleases. Aptamers can include modified nucleotides conferring improved characteristics on the ligand, such as improved in vivo stability or improved delivery characteristics. Examples of such modifications include chemical substitutions at the ribose and/or phosphate and/or base positions. SELEX identified nucleic acid ligands containing modified nucleotides are described, e.g., in U.S. Pat. No. 5,660,985, which describes oligonucleotides containing nucleotide derivatives chemically modified at the 2′ position of ribose, 5 position of pyrimidines, and 8 position of purines, U.S. Pat. No. 5,756,703 which describes oligonucleotides containing various 2′-modified pyrimidines, and U.S. Pat. No. 5,580,737 which describes highly specific nucleic acid ligands containing one or more nucleotides modified with 2′-amino (2′-NH2), 2′-fluoro (2′-F), and/or 2′-O-methyl (2′-OMe) substituents. Modifications of aptamers may also include modifications at exocyclic amines, substitution of 4-thiouridine, substitution of 5-bromo or 5-iodo-uracil; backbone modifications, phosphorothioate or allyl phosphate modifications, methylations, and unusual base-pairing combinations such as the isobases isocytidine and isoguanosine. Modifications can also include 3′ and 5′ modifications such as capping. As used herein, the term phosphorothioate encompasses one or more non-bridging oxygen atoms in a phosphodiester bond replaced by one or more sulfur atoms. In further embodiments, the oligonucleotides comprise modified sugar groups, for example, one or more of the hydroxyl groups is replaced with halogen, aliphatic groups, or functionalized as ethers or amines. In one embodiment, the 2′-position of the furanose residue is substituted by any of an O-methyl, O-alkyl, O-allyl, S-alkyl, S-allyl, or halo group.
Methods of synthesis of 2′-modified sugars are described, e.g., in Sproat, et al., Nucl. Acid Res. 19:733-738 (1991); Cotten, et al, Nucl. Acid Res. 19:2629-2635 (1991); and Hobbs, et al, Biochemistry 12:5138-5145 (1973). Other modifications are known to one of ordinary skill in the art. In certain embodiments, aptamers include aptamers with improved off-rates as described in International Patent Publication No. WO 2009012418, “Method for generating aptamers with improved off-rates,” incorporated herein by reference in its entirety. In certain embodiments aptamers are chosen from a library of aptamers. Such libraries include, but are not limited to those described in Rohloff et al., “Nucleic Acid Ligands With Protein-like Side Chains: Modified Aptamers and Their Use as Diagnostic and Therapeutic Agents,” Molecular Therapy Nucleic Acids (2014) 3, e201. Aptamers are also commercially available (see, e.g., SomaLogic, Inc., Boulder, Colo.). In certain embodiments, the present invention may utilize any aptamer containing any modification as described herein.
In certain embodiments, modulation of T cell balance may be used to treat inflammatory diseases, disorders or aberrant autoimmune responses. Specific autoimmune responses resulting from an immunotherapy are described further herein. As used throughout the present specification, the terms “autoimmune disease” and “autoimmune disorder” are used interchangeably and refer to diseases or disorders caused by an immune response against a self-tissue or tissue component (self-antigen) and include a self-antibody response and/or cell-mediated response. The terms encompass organ-specific autoimmune diseases, in which an autoimmune response is directed against a single tissue, as well as non-organ specific autoimmune diseases, in which an autoimmune response is directed against a component present in two or more, several or many organs throughout the body.
Examples of autoimmune diseases include but are not limited to acute disseminated encephalomyelitis (ADEM); Addison's disease; ankylosing spondylitis; antiphospholipid antibody syndrome (APS); aplastic anemia; autoimmune gastritis; autoimmune hepatitis; autoimmune thrombocytopenia; Behcet's disease; coeliac disease; dermatomyositis; diabetes mellitus type I; Goodpasture's syndrome; Graves' disease; Guillain-Barré syndrome (GBS); Hashimoto's disease; idiopathic thrombocytopenic purpura; inflammatory bowel disease (IBD) including Crohn's disease and ulcerative colitis; mixed connective tissue disease; multiple sclerosis (MS); myasthenia gravis; opsoclonus myoclonus syndrome (OMS); optic neuritis; Ord's thyroiditis; pemphigus; pernicious anaemia; polyarteritis nodosa; polymyositis; primary biliary cirrhosis; primary myoxedema; psoriasis; rheumatic fever; rheumatoid arthritis; Reiter's syndrome; scleroderma; Sjögren's syndrome; systemic lupus erythematosus; Takayasu's arteritis; temporal arteritis; vitiligo; warm autoimmune hemolytic anemia; or Wegener's granulomatosis.
Examples of inflammatory diseases or disorders include, but are not limited to, asthma, allergy, allergic rhinitis, allergic airway inflammation, atopic dermatitis (AD), chronic obstructive pulmonary disease (COPD), inflammatory bowel disease (IBD), irritable bowel syndrome (IBS), multiple sclerosis, arthritis, psoriasis, eosinophilic esophagitis, eosinophilic pneumonia, eosinophilic psoriasis, hypereosinophilic syndrome, graft-versus-host disease, uveitis, cardiovascular disease, pain, lupus, vasculitis, chronic idiopathic urticaria and eosinophilic granulomatosis with polyangiitis (Churg-Strauss syndrome).
The asthma may be allergic asthma, non-allergic asthma, severe refractory asthma, asthma exacerbations, viral-induced asthma or viral-induced asthma exacerbations, steroid resistant asthma, steroid sensitive asthma, eosinophilic asthma or non-eosinophilic asthma and other related disorders characterized by airway inflammation or airway hyperresponsiveness (AHR).
The COPD may be a disease or disorder associated in part with, or caused by, cigarette smoke, air pollution, occupational chemicals, allergy or airway hyperresponsiveness.
The allergy may be associated with foods, pollen, mold, dust mites, animals, or animal dander.
The IBD may be ulcerative colitis (UC), Crohn's disease, collagenous colitis, lymphocytic colitis, ischemic colitis, diversion colitis, Behcet's syndrome, infective colitis, indeterminate colitis, or another disorder characterized by inflammation of the mucosal layer of the large intestine or colon.
The arthritis may be selected from the group consisting of osteoarthritis, rheumatoid arthritis and psoriatic arthritis.
Immunotherapy can include checkpoint blockers (CBP), chimeric antigen receptors (CARs), and adoptive T-cell therapy. Antibodies that block the activity of checkpoint receptors, including CTLA-4, PD-1, Tim-3, Lag-3, and TIGIT, either alone or in combination, have been associated with improved effector CD8+ T cell responses in multiple pre-clinical cancer models (Johnston et al., 2014. The immunoreceptor TIGIT regulates antitumor and antiviral CD8(+) T cell effector function. Cancer cell 26, 923-937; Ngiow et al., 2011. Anti-TIM3 antibody promotes T cell IFN-gamma-mediated antitumor immunity and suppresses established tumors. Cancer research 71, 3540-3551; Sakuishi et al., 2010. Targeting Tim-3 and PD-1 pathways to reverse T cell exhaustion and restore anti-tumor immunity. The Journal of experimental medicine 207, 2187-2194; and Woo et al., 2012. Immune inhibitory molecules LAG-3 and PD-1 synergistically regulate T-cell function to promote tumoral immune escape. Cancer research 72, 917-927). Similarly, blockade of CTLA-4 and PD-1 in patients (Brahmer et al., 2012. Safety and activity of anti-PD-L1 antibody in patients with advanced cancer. The New England journal of medicine 366, 2455-2465; Hodi et al., 2010. Improved survival with ipilimumab in patients with metastatic melanoma. The New England journal of medicine 363, 711-723; Schadendorf et al., 2015. Pooled Analysis of Long-Term Survival Data From Phase II and Phase III Trials of Ipilimumab in Unresectable or Metastatic Melanoma. Journal of clinical oncology: official journal of the American Society of Clinical Oncology 33, 1889-1894; Topalian et al., 2012. Safety, activity, and immune correlates of anti-PD-1 antibody in cancer. The New England journal of medicine 366, 2443-2454; and Wolchok et al., 2017. Overall Survival with Combined Nivolumab and Ipilimumab in Advanced Melanoma. 
The New England journal of medicine 377, 1345-1356) has shown increased frequencies of proliferating T cells, often with specificity for tumor antigens, as well as increased CD8+ T cell effector function (Ayers et al., 2017. IFN-gamma-related mRNA profile predicts clinical response to PD-1 blockade. The Journal of clinical investigation 127, 2930-2940; Das et al., 2015. Combination therapy with anti-CTLA-4 and anti-PD-1 leads to distinct immunologic changes in vivo. Journal of immunology 194, 950-959; Gubin et al., 2014. Checkpoint blockade cancer immunotherapy targets tumour-specific mutant antigens. Nature 515, 577-581; Huang et al., 2017. T-cell invigoration to tumour burden ratio associated with anti-PD-1 response. Nature 545, 60-65; Kamphorst et al., 2017. Proliferation of PD-1+CD8 T cells in peripheral blood after PD-1-targeted therapy in lung cancer patients. Proceedings of the National Academy of Sciences of the United States of America 114, 4993-4998; Kvistborg et al., 2014. Anti-CTLA-4 therapy broadens the melanoma-reactive CD8+ T cell response. Science translational medicine 6, 254ra128; van Rooij et al., 2013. Tumor exome analysis reveals neoantigen-specific T-cell reactivity in an ipilimumab-responsive melanoma. Journal of clinical oncology: official journal of the American Society of Clinical Oncology 31, e439-442; and Yuan et al., 2008. CTLA-4 blockade enhances polyfunctional NY-ESO-1 specific T cell responses in metastatic melanoma patients with clinical benefit. Proceedings of the National Academy of Sciences of the United States of America 105, 20410-20415). Accordingly, the success of checkpoint receptor blockade has been attributed to the binding of blocking antibodies to checkpoint receptors expressed on dysfunctional CD8+ T cells and restoring effector function in these cells. The check point blockade therapy may be an inhibitor of any check point protein described herein. 
The checkpoint blockade therapy may comprise anti-TIM3, anti-CTLA4, anti-PD-L1, anti-PD1, anti-TIGIT, anti-LAG3, or combinations thereof. Anti-PD1 antibodies are disclosed in U.S. Pat. No. 8,735,553. Antibodies to LAG-3 are disclosed in U.S. Pat. No. 9,132,281. Anti-CTLA4 antibodies are disclosed in U.S. Pat. Nos. 9,327,014, 9,320,811, and 9,062,111. Specific checkpoint inhibitors include, but are not limited to, anti-CTLA4 antibodies (e.g., Ipilimumab and tremelimumab), anti-PD-1 antibodies (e.g., Nivolumab, Pembrolizumab), and anti-PD-L1 antibodies (e.g., Atezolizumab).
In certain embodiments, immunotherapy leads to immune-related adverse events (irAEs) (see, e.g., Byun et al., (2017) Cancer immunotherapy—immune checkpoint blockade and associated endocrinopathies. Nat Rev Endocrinol. 2017 April; 13(4): 195-207; Abdel-Wahab et al., (2016) Adverse Events Associated with Immune Checkpoint Blockade in Patients with Cancer: A Systematic Review of Case Reports. PLoS ONE 11 (7): e0160221. doi:10.1371/journal.pone.0160221; and Gelao et al., Immune Checkpoint Blockade in Cancer Treatment: A Double-Edged Sword Cross-Targeting the Host as an “Innocent Bystander”, Toxins 2014, 6, 914-933; doi:10.3390/toxins6030914). Thus, patients receiving immunotherapy are at risk for adverse autoimmune responses.
In certain embodiments, irAEs are related to Th17 pathogenicity. In one study, serum IL-17 levels fluctuated in patients treated with ipilimumab and were significantly higher in patients with colitis than in patients with no irAEs (Callahan et al., (2011) Evaluation of serum IL-17 levels during ipilimumab therapy: Correlation with colitis. Journal of Clinical Oncology 29, no. 15 suppl 2505-2505).
In certain embodiments, the modulating agents described herein can be used to shift T cell balance away from Th17 autoimmune responses in patients treated with checkpoint blockade therapy. In certain embodiments, agents modulating the polyamine pathway or glycolysis pathway are used as part of a cancer therapy regimen.
In certain embodiments, T cells differentiated according to the present invention (e.g., treated with DFMO, a polyamine, or other polyamine analogue, polyamine pathway targeting drug, glycolysis targeting drug, or genetically modified) are used in adoptive cell transfer to treat an aberrant inflammatory response (e.g., autoimmune response). In certain embodiments, a modulating agent according to the present invention is used in combination with ACT to prevent an aberrant immune response.
As used herein, “ACT”, “adoptive cell therapy” and “adoptive cell transfer” may be used interchangeably. In certain embodiments, adoptive cell therapy (ACT) can refer to the transfer of cells to a patient with the goal of transferring the functionality and characteristics into the new host by engraftment of the cells (see, e.g., Mettananda et al., Editing an α-globin enhancer in primary human hematopoietic stem cells as a treatment for β-thalassemia, Nat Commun. 2017 Sep. 4; 8(1):424). As used herein, the term “engraft” or “engraftment” refers to the process of cell incorporation into a tissue of interest in vivo through contact with existing cells of the tissue. Adoptive cell therapy (ACT) can also refer to the transfer of cells, most commonly immune-derived cells, back into the same patient or into a new recipient host with the goal of transferring the immunologic functionality and characteristics into the new host. If possible, use of autologous cells helps the recipient by minimizing GVHD issues. The adoptive transfer of autologous tumor infiltrating lymphocytes (TIL) (Zacharakis et al., (2018) Nat Med. 2018 June; 24(6):724-730; Besser et al., (2010) Clin. Cancer Res 16 (9) 2646-55; Dudley et al., (2002) Science 298 (5594): 850-4; and Dudley et al., (2005) Journal of Clinical Oncology 23 (10): 2346-57) or genetically re-directed peripheral blood mononuclear cells (Johnson et al., (2009) Blood 114 (3): 535-46; and Morgan et al., (2006) Science 314(5796) 126-9) has been used to successfully treat patients with advanced solid tumors, including melanoma, metastatic breast cancer and colorectal carcinoma, as well as patients with CD19-expressing hematologic malignancies (Kalos et al., (2011) Science Translational Medicine 3 (95): 95ra73). In certain embodiments, allogenic immune cells are transferred (see, e.g., Ren et al., (2017) Clin Cancer Res 23 (9) 2255-2266).
As described further herein, allogenic cells can be edited to reduce alloreactivity and prevent graft-versus-host disease. Thus, use of allogenic cells allows for cells to be obtained from healthy donors and prepared for use in patients as opposed to preparing autologous cells from a patient after diagnosis.
Aspects of the invention involve the adoptive transfer of immune system cells, such as T cells, specific for selected antigens, such as tumor associated antigens or tumor specific neoantigens (see, e.g., Maus et al., 2014, Adoptive Immunotherapy for Cancer or Viruses, Annual Review of Immunology, Vol. 32: 189-225; Rosenberg and Restifo, 2015, Adoptive cell transfer as personalized immunotherapy for human cancer, Science Vol. 348 no. 6230 pp. 62-68; Restifo et al., 2015, Adoptive immunotherapy for cancer: harnessing the T cell response. Nat. Rev. Immunol. 12(4): 269-281; Jensen and Riddell, 2014, Design and implementation of adoptive therapy with chimeric antigen receptor-modified T cells. Immunol Rev. 257(1): 127-144; and Rajasagi et al., 2014, Systematic identification of personal tumor-specific neoantigens in chronic lymphocytic leukemia. Blood. 2014 Jul. 17; 124(3):453-62).
In certain embodiments, an antigen (such as a tumor antigen) to be targeted in adoptive cell therapy (such as particularly CAR or TCR T-cell therapy) of a disease (such as particularly of tumor or cancer) may be selected from a group consisting of B cell maturation antigen (BCMA) (see, e.g., Friedman et al., Effective Targeting of Multiple BCMA-Expressing Hematological Malignancies by Anti-BCMA CAR T Cells, Hum Gene Ther. 2018 Mar. 8; Berdeja J G, et al. Durable clinical responses in heavily pretreated patients with relapsed/refractory multiple myeloma: updated results from a multicenter study of bb2121 anti-BCMA CAR T cell therapy. Blood. 2017; 130:740; and Mouhieddine and Ghobrial, Immunotherapy in Multiple Myeloma: The Era of CAR T Cell Therapy, Hematologist, May-June 2018, Volume 15, issue 3); PSA (prostate-specific antigen); prostate-specific membrane antigen (PSMA); PSCA (Prostate stem cell antigen); Tyrosine-protein kinase transmembrane receptor ROR1; fibroblast activation protein (FAP); Tumor-associated glycoprotein 72 (TAG72); Carcinoembryonic antigen (CEA); Epithelial cell adhesion molecule (EPCAM); Mesothelin; Human Epidermal growth factor Receptor 2 (ERBB2 (Her2/neu)); Prostate; Prostatic acid phosphatase (PAP); elongation factor 2 mutant (ELF2M); Insulin-like growth factor 1 receptor (IGF-1R); gp100; BCR-ABL (breakpoint cluster region-Abelson); tyrosinase; New York esophageal squamous cell carcinoma 1 (NY-ESO-1); κ-light chain, LAGE (L antigen); MAGE (melanoma antigen); Melanoma-associated antigen 1 (MAGE-A1); MAGE A3; MAGE A6; legumain; Human papillomavirus (HPV) E6; HPV E7; prostein; survivin; PCTA1 (Galectin 8); Melan-A/MART-1; Ras mutant; TRP-1 (tyrosinase related protein 1, or gp75); Tyrosinase-related Protein 2 (TRP2); TRP-2/INT2 (TRP-2/intron 2); RAGE (renal antigen); receptor for advanced glycation end products 1 (RAGE1); Renal ubiquitous 1, 2 (RU1, RU2); intestinal carboxyl esterase (iCE); Heat shock protein 70-2 (HSP70-2) mutant; thyroid
stimulating hormone receptor (TSHR); CD123; CD171; CD19; CD20; CD22; CD26; CD30; CD33; CD44v7/8 (cluster of differentiation 44, exons 7/8); CD53; CD92; CD100; CD148; CD150; CD200; CD261; CD262; CD362; CS-1 (CD2 subset 1, CRACC, SLAMF7, CD319, and 19A24); C-type lectin-like molecule-1 (CLL-1); ganglioside GD3 (aNeu5Ac(2-8)aNeu5Ac(2-3)bDGalp(1-4)bDG1cp(1-1)Cer); Tn antigen (Tn Ag); Fms-Like Tyrosine Kinase 3 (FLT3); CD38; CD138; CD44v6; B7H3 (CD276); KIT (CD117); Interleukin-13 receptor subunit alpha-2 (IL-13Ra2); Interleukin 11 receptor alpha (IL-11Ra); prostate stem cell antigen (PSCA); Protease Serine 21 (PRSS21); vascular endothelial growth factor receptor 2 (VEGFR2); Lewis(Y) antigen; CD24; Platelet-derived growth factor receptor beta (PDGFR-beta); stage-specific embryonic antigen-4 (SSEA-4); Mucin 1, cell surface associated (MUC1); mucin 16 (MUC16); epidermal growth factor receptor (EGFR); epidermal growth factor receptor variant III (EGFRvIII); neural cell adhesion molecule (NCAM); carbonic anhydrase IX (CAIX); Proteasome (Prosome, Macropain) Subunit, Beta Type, 9 (LMP2); ephrin type-A receptor 2 (EphA2); Ephrin B2; Fucosyl GM1; sialyl Lewis adhesion molecule (sLe); ganglioside GM3 (aNeu5Ac(2-3)bDGalp(1-4)bDGlcp(1-1)Cer); TGS5; high molecular weight-melanoma-associated antigen (HMWMAA); o-acetyl-GD2 ganglioside (OAcGD2); Folate receptor alpha; Folate receptor beta; tumor endothelial marker 1 (TEM1/CD248); tumor endothelial marker 7-related (TEM7R); claudin 6 (CLDN6); G protein-coupled receptor class C group 5, member D (GPRC5D); chromosome X open reading frame 61 (CXORF61); CD97; CD179a; anaplastic lymphoma kinase (ALK); Polysialic acid; placenta-specific 1 (PLAC1); hexasaccharide portion of globoH glycoceramide (GloboH); mammary gland differentiation antigen (NY-BR-1); uroplakin 2 (UPK2); Hepatitis A virus cellular receptor 1 (HAVCR1); adrenoceptor beta 3 (ADRB3); pannexin 3 (PANX3); G protein-coupled receptor 20 (GPR20); lymphocyte antigen 6 complex, locus K 
9 (LY6K); Olfactory receptor 51E2 (OR51E2); TCR Gamma Alternate Reading Frame Protein (TARP); Wilms tumor protein (WT1); ETS translocation-variant gene 6, located on chromosome 12p (ETV6-AML); sperm protein 17 (SPA17); X Antigen Family, Member 1A (XAGE1); angiopoietin-binding cell surface receptor 2 (Tie 2); CT (cancer/testis (antigen)); melanoma cancer testis antigen-1 (MAD-CT-1); melanoma cancer testis antigen-2 (MAD-CT-2); Fos-related antigen 1; p53; p53 mutant; human Telomerase reverse transcriptase (hTERT); sarcoma translocation breakpoints; melanoma inhibitor of apoptosis (ML-IAP); ERG (transmembrane protease, serine 2 (TMPRSS2) ETS fusion gene); N-Acetyl glucosaminyl-transferase V (NA17); paired box protein Pax-3 (PAX3); Androgen receptor; Cyclin B1; Cyclin D1; v-myc avian myelocytomatosis viral oncogene neuroblastoma derived homolog (MYCN); Ras Homolog Family Member C (RhoC); Cytochrome P450 1B1 (CYP1B1); CCCTC-Binding Factor (Zinc Finger Protein)-Like (BORIS); Squamous Cell Carcinoma Antigen Recognized By T Cells-1 or 3 (SART1, SART3); Paired box protein Pax-5 (PAX5); proacrosin binding protein sp32 (OY-TES1); lymphocyte-specific protein tyrosine kinase (LCK); A kinase anchor protein 4 (AKAP-4); synovial sarcoma, X breakpoint-1, -2, -3 or -4 (SSX1, SSX2, SSX3, SSX4); CD79a; CD79b; CD72; Leukocyte-associated immunoglobulin-like receptor 1 (LAIR1); Fc fragment of IgA receptor (FCAR); Leukocyte immunoglobulin-like receptor subfamily A member 2 (LILRA2); CD300 molecule-like family member f (CD300LF); C-type lectin domain family 12 member A (CLEC12A); bone marrow stromal cell antigen 2 (BST2); EGF-like module-containing mucin-like hormone receptor-like 2 (EMR2); lymphocyte antigen 75 (LY75); Glypican-3 (GPC3); Fc receptor-like 5 (FCRL5); mouse double minute 2 homolog (MDM2); livin; alphafetoprotein (AFP); transmembrane activator and CAML Interactor (TACI); B-cell activating factor receptor (BAFF-R); V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS);
immunoglobulin lambda-like polypeptide 1 (IGLL1); 707-AP (707 alanine proline); ART-4 (adenocarcinoma antigen recognized by T4 cells); BAGE (B antigen); b-catenin/m (b-catenin/mutated); CAMEL (CTL-recognized antigen on melanoma); CAP1 (carcinoembryonic antigen peptide 1); CASP-8 (caspase-8); CDC27m (cell-division cycle 27 mutated); CDK4/m (cyclin-dependent kinase 4 mutated); Cyp-B (cyclophilin B); DAM (differentiation antigen melanoma); EGP-2 (epithelial glycoprotein 2); EGP-40 (epithelial glycoprotein 40); Erbb2, 3, 4 (erythroblastic leukemia viral oncogene homolog-2, -3, -4); FBP (folate binding protein); fAchR (Fetal acetylcholine receptor); G250 (glycoprotein 250); GAGE (G antigen); GnT-V (N-acetylglucosaminyltransferase V); HAGE (helicase antigen); HLA-A (human leukocyte antigen-A); HST2 (human signet ring tumor 2); KIAA0205; KDR (kinase insert domain receptor); LDLR/FUT (low density lipid receptor/GDP L-fucose: b-D-galactosidase 2-a-L fucosyltransferase); L1CAM (L1 cell adhesion molecule); MC1R (melanocortin 1 receptor); Myosin/m (myosin mutated); MUM-1, -2, -3 (melanoma ubiquitous mutated 1, 2, 3); NA88-A (NA cDNA clone of patient M88); NKG2D (Natural killer group 2, member D) ligands; oncofetal antigen (h5T4); p190 minor bcr-abl (protein of 190KD bcr-abl); Pml/RARa (promyelocytic leukaemia/retinoic acid receptor a); PRAME (preferentially expressed antigen of melanoma); SAGE (sarcoma antigen); TEL/AML1 (translocation Ets-family leukemia/acute myeloid leukemia 1); TPI/m (triosephosphate isomerase mutated); CD70; and any combination thereof.
In certain embodiments, an antigen to be targeted in adoptive cell therapy (such as particularly CAR or TCR T-cell therapy) of a disease (such as particularly of tumor or cancer) is a tumor-specific antigen (TSA).
In certain embodiments, an antigen to be targeted in adoptive cell therapy (such as particularly CAR or TCR T-cell therapy) of a disease (such as particularly of tumor or cancer) is a neoantigen.
In certain embodiments, an antigen to be targeted in adoptive cell therapy (such as particularly CAR or TCR T-cell therapy) of a disease (such as particularly of tumor or cancer) is a tumor-associated antigen (TAA).
In certain embodiments, an antigen to be targeted in adoptive cell therapy (such as particularly CAR or TCR T-cell therapy) of a disease (such as particularly of tumor or cancer) is a universal tumor antigen. In certain preferred embodiments, the universal tumor antigen is selected from the group consisting of: a human telomerase reverse transcriptase (hTERT), survivin, mouse double minute 2 homolog (MDM2), cytochrome P450 1B1 (CYP1B1), HER2/neu, Wilms' tumor gene 1 (WT1), livin, alphafetoprotein (AFP), carcinoembryonic antigen (CEA), mucin 16 (MUC16), MUC1, prostate-specific membrane antigen (PSMA), p53, cyclin D1, and any combinations thereof.
In certain embodiments, an antigen (such as a tumor antigen) to be targeted in adoptive cell therapy (such as particularly CAR or TCR T-cell therapy) of a disease (such as particularly of tumor or cancer) may be selected from a group consisting of: CD19, BCMA, CD70, CLL-1, MAGE A3, MAGE A6, HPV E6, HPV E7, WT1, CD22, CD171, ROR1, MUC16, and SSX2. In certain preferred embodiments, the antigen may be CD19. For example, CD19 may be targeted in hematologic malignancies, such as in lymphomas, more particularly in B-cell lymphomas, such as without limitation in diffuse large B-cell lymphoma, primary mediastinal B-cell lymphoma, transformed follicular lymphoma, marginal zone lymphoma, mantle cell lymphoma, acute lymphoblastic leukemia including adult and pediatric ALL, non-Hodgkin lymphoma, indolent non-Hodgkin lymphoma, or chronic lymphocytic leukemia. For example, BCMA may be targeted in multiple myeloma or plasma cell leukemia (see, e.g., 2018 American Association for Cancer Research (AACR) Annual meeting Poster: Allogeneic Chimeric Antigen Receptor T Cells Targeting B Cell Maturation Antigen). For example, CLL-1 may be targeted in acute myeloid leukemia. For example, MAGE A3, MAGE A6, SSX2, and/or KRAS may be targeted in solid tumors. For example, HPV E6 and/or HPV E7 may be targeted in cervical cancer or head and neck cancer. For example, WT1 may be targeted in acute myeloid leukemia (AML), myelodysplastic syndromes (MDS), chronic myeloid leukemia (CML), non-small cell lung cancer, breast, pancreatic, ovarian or colorectal cancers, or mesothelioma. For example, CD22 may be targeted in B cell malignancies, including non-Hodgkin lymphoma, diffuse large B-cell lymphoma, or acute lymphoblastic leukemia. For example, CD171 may be targeted in neuroblastoma, glioblastoma, or lung, pancreatic, or ovarian cancers.
For example, ROR1 may be targeted in ROR1+ malignancies, including non-small cell lung cancer, triple negative breast cancer, pancreatic cancer, prostate cancer, ALL, chronic lymphocytic leukemia, or mantle cell lymphoma. For example, MUC16 may be targeted in MUC16ecto+ epithelial ovarian, fallopian tube or primary peritoneal cancer. For example, CD70 may be targeted in both hematologic malignancies as well as in solid cancers such as renal cell carcinoma (RCC), gliomas (e.g., GBM), and head and neck cancers (HNSCC). CD70 is expressed in both hematologic malignancies as well as in solid cancers, while its expression in normal tissues is restricted to a subset of lymphoid cell types (see, e.g., 2018 American Association for Cancer Research (AACR) Annual meeting Poster: Allogeneic CRISPR Engineered Anti-CD70 CAR-T Cells Demonstrate Potent Preclinical Activity Against Both Solid and Hematological Cancer Cells).
Various strategies may, for example, be employed to genetically modify T cells by altering the specificity of the T cell receptor (TCR), for example by introducing new TCR α and β chains with selected peptide specificity (see U.S. Pat. No. 8,697,854; PCT Patent Publications: WO2003020763, WO2004033685, WO2004044004, WO2005114215, WO2006000830, WO2008038002, WO2008039818, WO2004074322, WO2005113595, WO2006125962, WO2013166321, WO2013039889, WO2014018863, WO2014083173; U.S. Pat. No. 8,088,379).
As an alternative to, or addition to, TCR modifications, chimeric antigen receptors (CARs) may be used in order to generate immunoresponsive cells, such as T cells, specific for selected targets, such as malignant cells, with a wide variety of receptor chimera constructs having been described (see U.S. Pat. Nos. 5,843,728, 5,851,828, 5,912,170, 6,004,811, 6,284,240, 6,392,013, 6,410,014, 6,753,162, 8,211,422, and International Patent Publication WO9215322).
In general, CARs are comprised of an extracellular domain, a transmembrane domain, and an intracellular domain, wherein the extracellular domain comprises an antigen-binding domain that is specific for a predetermined target. While the antigen-binding domain of a CAR is often an antibody or antibody fragment (e.g., a single chain variable fragment, scFv), the binding domain is not particularly limited so long as it results in specific recognition of a target. For example, in some embodiments, the antigen-binding domain may comprise a receptor, such that the CAR is capable of binding to the ligand of the receptor. Alternatively, the antigen-binding domain may comprise a ligand, such that the CAR is capable of binding the endogenous receptor of that ligand.
The antigen-binding domain of a CAR is generally separated from the transmembrane domain by a hinge or spacer. The spacer is also not particularly limited, and it is designed to provide the CAR with flexibility. For example, a spacer domain may comprise a portion of a human Fc domain, including a portion of the CH3 domain, or the hinge region of any immunoglobulin, such as IgA, IgD, IgE, IgG, or IgM, or variants thereof. Furthermore, the hinge region may be modified so as to prevent off-target binding by FcRs or other potential interfering objects. For example, the hinge may comprise an IgG4 Fc domain with or without a S228P, L235E, and/or N297Q mutation (according to Kabat numbering) in order to decrease binding to FcRs. Additional spacers/hinges include, but are not limited to, CD4, CD8, and CD28 hinge regions.
The transmembrane domain of a CAR may be derived either from a natural or from a synthetic source. Where the source is natural, the domain may be derived from any membrane bound or transmembrane protein. Transmembrane regions of particular use in this disclosure may be derived from CD8, CD28, CD3, CD45, CD4, CD5, CDS, CD9, CD16, CD22, CD33, CD37, CD64, CD80, CD86, CD134, CD137, CD154, TCR. Alternatively, the transmembrane domain may be synthetic, in which case it will comprise predominantly hydrophobic residues such as leucine and valine. Preferably a triplet of phenylalanine, tryptophan and valine will be found at each end of a synthetic transmembrane domain. Optionally, a short oligo- or polypeptide linker, preferably between 2 and 10 amino acids in length, may form the linkage between the transmembrane domain and the cytoplasmic signaling domain of the CAR. A glycine-serine doublet provides a particularly suitable linker.
Alternative CAR constructs may be characterized as belonging to successive generations. First-generation CARs typically consist of a single-chain variable fragment of an antibody specific for an antigen, for example comprising a VL linked to a VH of a specific antibody, linked by a flexible linker, for example by a CD8α hinge domain and a CD8α transmembrane domain, to the transmembrane and intracellular signaling domains of either CD3ζ or FcRγ (scFv-CD3ζ or scFv-FcRγ; see U.S. Pat. Nos. 7,741,465; 5,912,172; 5,906,936). Second-generation CARs incorporate the intracellular domains of one or more costimulatory molecules, such as CD28, OX40 (CD134), or 4-1BB (CD137) within the endodomain (for example scFv-CD28/OX40/4-1BB-CD3ζ; see U.S. Pat. Nos. 8,911,993; 8,916,381; 8,975,071; 9,101,584; 9,102,760; 9,102,761). Third-generation CARs include a combination of costimulatory endodomains, such as CD3ζ-chain, CD97, CD11a-CD18, CD2, ICOS, CD27, CD154, CDS, OX40, 4-1BB, CD2, CD7, LIGHT, LFA-1, NKG2C, B7-H3, CD30, CD40, PD-1, or CD28 signaling domains (for example scFv-CD28-4-1BB-CD3ζ or scFv-CD28-OX40-CD3ζ; see U.S. Pat. Nos. 8,906,682, 8,399,645, 5,686,281, and International Patent Publication Nos. WO2014134165 and WO2012079000). In certain embodiments, the primary signaling domain comprises a functional signaling domain of a protein selected from the group consisting of CD3 zeta, CD3 gamma, CD3 delta, CD3 epsilon, common FcR gamma (FCER1G), FcR beta (Fc Epsilon R1b), CD79a, CD79b, Fc gamma RIIa, DAP10, and DAP12. In certain preferred embodiments, the primary signaling domain comprises a functional signaling domain of CD3ζ or FcRγ.
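By way of illustration only, the modular organization of a CAR and its assignment to a generation may be sketched in code. The following is a minimal sketch and not part of any described construct: the class name `CAR`, its field names, and the rule that zero, one, or two or more costimulatory domains correspond to first, second, and third generation are all assumptions adopted for this example (the description above speaks of "one or more" costimulatory molecules for second-generation CARs, so the counting rule is a simplification).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CAR:
    """Hypothetical container for the modular parts of a CAR."""
    antigen_binding: str      # e.g. "scFv"
    hinge: str                # spacer between binding domain and membrane
    transmembrane: str        # transmembrane domain source
    primary_signaling: str    # e.g. "CD3ζ" or "FcRγ"
    costimulatory: List[str] = field(default_factory=list)

    def generation(self) -> int:
        """Assumed rule: 0 costimulatory domains -> 1, exactly 1 -> 2, 2+ -> 3."""
        count = len(self.costimulatory)
        if count == 0:
            return 1
        if count == 1:
            return 2
        return 3

    def shorthand(self) -> str:
        """Build a label in the style used above, e.g. scFv-CD28-CD3ζ."""
        parts = [self.antigen_binding, *self.costimulatory, self.primary_signaling]
        return "-".join(parts)


if __name__ == "__main__":
    examples = [
        CAR("scFv", "CD8α hinge", "CD8α", "CD3ζ"),
        CAR("scFv", "CD8α hinge", "CD8α", "CD3ζ", ["CD28"]),
        CAR("scFv", "CD8α hinge", "CD8α", "CD3ζ", ["CD28", "4-1BB"]),
    ]
    for car in examples:
        print(car.shorthand(), "generation", car.generation())
```

Under these assumptions, the three example objects correspond to the scFv-CD3ζ, scFv-CD28-CD3ζ, and scFv-CD28-4-1BB-CD3ζ layouts mentioned above.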
In certain embodiments, the one or more costimulatory signaling domains comprise a functional signaling domain of a protein selected, each independently, from the group consisting of CD27, CD28, 4-1BB (CD137), OX40, CD30, CD40, PD-1, ICOS, lymphocyte function-associated antigen-1 (LFA-1), CD2, CD7, LIGHT, NKG2C, B7-H3, a ligand that specifically binds with CD83, CDS, ICAM-1, GITR, BAFFR, HVEM (LIGHTR), SLAMF7, NKp80 (KLRF1), CD160, CD19, CD4, CD8 alpha, CD8 beta, IL2R beta, IL2R gamma, IL7R alpha, ITGA4, VLA1, CD49a, ITGA4, IA4, CD49D, ITGA6, VLA-6, CD49f, ITGAD, CD11d, ITGAE, CD103, ITGAL, CD11a, LFA-1, ITGAM, CD11b, ITGAX, CD11c, ITGB1, CD29, ITGB2, CD18, ITGB7, TNFR2, TRANCE/RANKL, DNAM1 (CD226), SLAMF4 (CD244, 2B4), CD84, CD96 (Tactile), CEACAM1, CRTAM, Ly9 (CD229), CD160 (BY55), PSGL1, CD100 (SEMA4D), CD69, SLAMF6 (NTB-A, Ly108), SLAM (SLAMF1, CD150, IPO-3), BLAME (SLAMF8), SELPLG (CD162), LTBR, LAT, GADS, SLP-76, PAG/Cbp, NKp44, NKp30, NKp46, and NKG2D. In certain embodiments, the one or more costimulatory signaling domains comprise a functional signaling domain of a protein selected, each independently, from the group consisting of: 4-1BB, CD27, and CD28. In certain embodiments, a chimeric antigen receptor may have the design as described in U.S. Pat. No. 7,446,190, comprising an intracellular domain of CD3 chain (such as amino acid residues 52-163 of the human CD3 zeta chain, as shown in SEQ ID NO:14 of U.S. Pat. No. 7,446,190), a signaling region from CD28 and an antigen-binding element (or portion or domain; such as scFv). The CD28 portion, when between the zeta chain portion and the antigen-binding element, may suitably include the transmembrane and signaling domains of CD28 (such as amino acid residues 114-220 of SEQ ID NO: 10, full sequence shown in SEQ ID NO:6 of U.S. Pat. No. 7,446,190; these can include the following portion of CD28 as set forth in Genbank identifier NM_006139 (sequence version 1, 2 or 3):
Alternatively, when the zeta sequence lies between the CD28 sequence and the antigen-binding element, the intracellular domain of CD28 can be used alone (such as the amino acid sequence set forth in SEQ ID NO:9 of U.S. Pat. No. 7,446,190). Hence, certain embodiments employ a CAR comprising (a) a zeta chain portion comprising the intracellular domain of human CD3ζ chain, (b) a costimulatory signaling region, and (c) an antigen-binding element (or portion or domain), wherein the costimulatory signaling region comprises the amino acid sequence encoded by SEQ ID NO:6 of U.S. Pat. No. 7,446,190.
Alternatively, costimulation may be orchestrated by expressing CARs in antigen-specific T cells, chosen so as to be activated and expanded following engagement of their native αβTCR, for example by antigen on professional antigen-presenting cells, with attendant costimulation. In addition, additional engineered receptors may be provided on the immunoresponsive cells, for example to improve targeting of a T-cell attack and/or minimize side effects
By means of an example and without limitation, Kochenderfer et al., (2009) J Immunother. 32 (7): 689-702 described anti-CD19 chimeric antigen receptors (CAR). FMC63-28Z CAR contained a single chain variable region moiety (scFv) recognizing CD19 derived from the FMC63 mouse hybridoma (described in Nicholson et al., (1997) Molecular Immunology 34: 1157-1165), a portion of the human CD28 molecule, and the intracellular component of the human TCR-ζ molecule. FMC63-CD828BBZ CAR contained the FMC63 scFv, the hinge and transmembrane regions of the CD8 molecule, the cytoplasmic portions of CD28 and 4-1BB, and the cytoplasmic component of the TCR-ζ molecule. The exact sequence of the CD28 molecule included in the FMC63-28Z CAR corresponded to Genbank identifier NM_006139; the sequence included all amino acids starting with the amino acid sequence IEVMYPPPY (SEQ ID NO:21) and continuing all the way to the carboxy-terminus of the protein. To encode the anti-CD19 scFv component of the vector, the authors designed a DNA sequence which was based on a portion of a previously published CAR (Cooper et al., (2003) Blood 101: 1637-1644). This sequence encoded the following components in frame from the 5′ end to the 3′ end: an XhoI site, the human granulocyte-macrophage colony-stimulating factor (GM-CSF) receptor α-chain signal sequence, the FMC63 light chain variable region (as in Nicholson et al., supra), a linker peptide (as in Cooper et al., supra), the FMC63 heavy chain variable region (as in Nicholson et al., supra), and a NotI site. A plasmid encoding this sequence was digested with XhoI and NotI.
To form the MSGV-FMC63-28Z retroviral vector, the XhoI and NotI-digested fragment encoding the FMC63 scFv was ligated into a second XhoI and NotI-digested fragment that encoded the MSGV retroviral backbone (as in Hughes et al., (2005) Human Gene Therapy 16: 457-472) as well as part of the extracellular portion of human CD28, the entire transmembrane and cytoplasmic portion of human CD28, and the cytoplasmic portion of the human TCR-ζ molecule (as in Maher et al., (2002) Nature Biotechnology 20: 70-75). The FMC63-28Z CAR is included in the KTE-C19 (axicabtagene ciloleucel) anti-CD19 CAR-T therapy product in development by Kite Pharma, Inc. for the treatment of inter alia patients with relapsed/refractory aggressive B-cell non-Hodgkin lymphoma (NHL). Accordingly, in certain embodiments, cells intended for adoptive cell therapies, more particularly immunoresponsive cells such as T cells, may express the FMC63-28Z CAR as described by Kochenderfer et al. (supra). Hence, in certain embodiments, cells intended for adoptive cell therapies, more particularly immunoresponsive cells such as T cells, may comprise a CAR comprising an extracellular antigen-binding element (or portion or domain; such as scFv) that specifically binds to an antigen, an intracellular signaling domain comprising an intracellular domain of a CD3ζ chain, and a costimulatory signaling region comprising a signaling domain of CD28. Preferably, the CD28 amino acid sequence is as set forth in Genbank identifier NM_006139 (sequence version 1, 2 or 3) starting with the amino acid sequence IEVMYPPPY (SEQ ID NO:21) and continuing all the way to the carboxy-terminus of the protein. The sequence is reproduced herein:
Preferably, the antigen is CD19, more preferably the antigen-binding element is an anti-CD19 scFv, even more preferably the anti-CD19 scFv as described by Kochenderfer et al. (supra).
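By way of illustration only, the definition of the CD28 portion used above (all amino acids starting with IEVMYPPPY and continuing to the carboxy-terminus) can be expressed as a simple string operation. The following is a minimal sketch; the function name and the toy protein string are hypothetical and chosen for illustration, and the toy string is not the CD28 sequence of Genbank identifier NM_006139.

```python
def from_motif_to_c_terminus(protein: str, motif: str) -> str:
    """Return the part of `protein` that starts at the first occurrence of
    `motif` and runs to the carboxy-terminus (the end of the string)."""
    start = protein.find(motif)
    if start == -1:
        raise ValueError(f"motif {motif!r} not found in protein sequence")
    return protein[start:]


# Hypothetical toy sequence for illustration, NOT the real CD28 protein.
# Only the motif IEVMYPPPY is taken from the text.
MOTIF = "IEVMYPPPY"
TOY_PROTEIN = "AAAA" + MOTIF + "GGSGGS"

fragment = from_motif_to_c_terminus(TOY_PROTEIN, MOTIF)
print(fragment)
```

In practice the input would be the translated protein sequence of the referenced Genbank record, and the returned fragment would span the motif through the final residue.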
Additional anti-CD19 CARs are further described in International Patent Publication WO2015187528. More particularly, Example 1 and Table 1 of WO2015187528, incorporated by reference herein, demonstrate the generation of anti-CD19 CARs based on a fully human anti-CD19 monoclonal antibody (47G4, as described in US20100104509) and murine anti-CD19 monoclonal antibody (as described in Nicholson et al. and explained above). Various combinations of a signal sequence (human CD8-alpha or GM-CSF receptor), extracellular and transmembrane regions (human CD8-alpha) and intracellular T-cell signaling domains (CD28-CD3ζ; 4-1BB-CD3ζ; CD27-CD3ζ; CD28-CD27-CD3ζ; 4-1BB-CD27-CD3ζ; CD27-4-1BB-CD3ζ; CD28-CD27-FcεRI gamma chain; or CD28-FcεRI gamma chain) were disclosed. Hence, in certain embodiments, cells intended for adoptive cell therapies, more particularly immunoresponsive cells such as T cells, may comprise a CAR comprising an extracellular antigen-binding element that specifically binds to an antigen, an extracellular and transmembrane region as set forth in Table 1 of WO2015187528 and an intracellular T-cell signaling domain as set forth in Table 1 of WO2015187528. Preferably, the antigen is CD19, more preferably the antigen-binding element is an anti-CD19 scFv, even more preferably the mouse or human anti-CD19 scFv as described in Example 1 of WO2015187528. In certain embodiments, the CAR comprises, consists essentially of or consists of an amino acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, or SEQ ID NO: 13 as set forth in Table 1 of WO2015187528.
By means of an example and without limitation, a chimeric antigen receptor that recognizes the CD70 antigen is described in WO2012058460A2 (see also, Park et al., CD70 as a target for chimeric antigen receptor T cells in head and neck squamous cell carcinoma, Oral Oncol. 2018 March; 78:145-150; and Jin et al., CD70, a novel target of CAR T-cell therapy for gliomas, Neuro Oncol. 2018 Jan. 10; 20(1):55-65). CD70 is expressed by diffuse large B-cell and follicular lymphoma and also by the malignant cells of Hodgkin's lymphoma, Waldenstrom's macroglobulinemia and multiple myeloma, and by HTLV-1- and EBV-associated malignancies. (Agathanggelou et al. Am. J. Pathol. 1995; 147: 1152-1160; Hunter et al., Blood 2004; 104:4881. 26; Lens et al., J Immunol. 2005; 174:6212-6219; Baba et al., J Virol. 2008; 82:3843-3852.) In addition, CD70 is expressed by non-hematological malignancies such as renal cell carcinoma and glioblastoma. (Junker et al., J Urol. 2005; 173:2150-2153; Chahlavi et al., Cancer Res 2005; 65:5428-5438.) Physiologically, CD70 expression is transient and restricted to a subset of highly activated T, B, and dendritic cells.
By means of an example and without limitation, a chimeric antigen receptor that recognizes BCMA has been described (see, e.g., US Patent Publication Nos. US 20160046724A1, US 20180085444 A1, and US 20170283504 A1, and International Patent Publication Nos. WO2016014789A2, WO2017211900A1, WO2015158671A1, WO2018028647A1, and WO2013154760A1).
In certain embodiments, the immune cell may, in addition to a CAR or exogenous TCR as described herein, further comprise a chimeric inhibitory receptor (inhibitory CAR) that specifically binds to a second target antigen and is capable of inducing an inhibitory or immunosuppressive or repressive signal to the cell upon recognition of the second target antigen. In certain embodiments, the chimeric inhibitory receptor comprises an extracellular antigen-binding element (or portion or domain) configured to specifically bind to a target antigen, a transmembrane domain, and an intracellular immunosuppressive or repressive signaling domain. In certain embodiments, the second target antigen is an antigen that is not expressed on the surface of a cancer cell or infected cell or the expression of which is downregulated on a cancer cell or an infected cell. In certain embodiments, the second target antigen is an MHC-class I molecule. In certain embodiments, the intracellular signaling domain comprises a functional signaling portion of an immune checkpoint molecule, such as for example PD-1 or CTLA4. Advantageously, the inclusion of such inhibitory CAR reduces the chance of the engineered immune cells attacking non-target (e.g., non-cancer) tissues.
Alternatively, T-cells expressing CARs may be further modified to reduce or eliminate expression of endogenous TCRs in order to reduce off-target effects. Reduction or elimination of endogenous TCRs can reduce off-target effects and increase the effectiveness of the T cells (U.S. Pat. No. 9,181,527). T cells stably lacking expression of a functional TCR may be produced using a variety of approaches. T cells internalize, sort, and degrade the entire T cell receptor as a complex, with a half-life of about 10 hours in resting T cells and 3 hours in stimulated T cells (von Essen, M. et al. 2004. J. Immunol. 173:384-393). Proper functioning of the TCR complex requires the proper stoichiometric ratio of the proteins that compose the TCR complex. TCR function also requires two functioning TCR zeta proteins with ITAM motifs. The activation of the TCR upon engagement of its MHC-peptide ligand requires the engagement of several TCRs on the same T cell, which all must signal properly. Thus, if a TCR complex is destabilized with proteins that do not associate properly or cannot signal optimally, the T cell will not become activated sufficiently to begin a cellular response.
Accordingly, in some embodiments, TCR expression may be eliminated using RNA interference (e.g., shRNA, siRNA, miRNA, etc.), CRISPR, or other methods that target the nucleic acids encoding specific TCRs (e.g., TCR-α and TCR-β) and/or CD3 chains in primary T cells. By blocking expression of one or more of these proteins, the T cell will no longer produce one or more of the key components of the TCR complex, thereby destabilizing the TCR complex and preventing cell surface expression of a functional TCR.
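As a rough illustration of the turnover figures quoted above (a half-life of about 10 hours in resting T cells and about 3 hours in stimulated T cells), the following sketch estimates the fraction of pre-existing TCR complex that would remain a given time after new synthesis is blocked. It assumes simple first-order decay with no resynthesis, which is a simplifying assumption made here and is not stated in the cited reference; the function and constant names are hypothetical.

```python
def fraction_remaining(hours: float, half_life_hours: float) -> float:
    """Fraction of the initial pool left after `hours`, assuming
    first-order (exponential) decay and no new synthesis."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (hours / half_life_hours)


# Half-lives taken from the text (von Essen et al. 2004); names are illustrative.
RESTING_HALF_LIFE_H = 10.0
STIMULATED_HALF_LIFE_H = 3.0

for hours in (3, 10, 24):
    resting = fraction_remaining(hours, RESTING_HALF_LIFE_H)
    stimulated = fraction_remaining(hours, STIMULATED_HALF_LIFE_H)
    print(f"{hours:>2} h: resting {resting:.3f}, stimulated {stimulated:.3f}")
```

Under this assumption, the pre-existing complex is lost considerably faster from stimulated cells than from resting cells.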
In some instances, a CAR may also comprise a switch mechanism for controlling expression and/or activation of the CAR. For example, a CAR may comprise an extracellular, transmembrane, and intracellular domain, in which the extracellular domain comprises a target-specific binding element that comprises a label, binding domain, or tag that is specific for a molecule other than the target antigen that is expressed on or by a target cell. In such embodiments, the specificity of the CAR is provided by a second construct that comprises a target antigen binding domain (e.g., an scFv or a bispecific antibody that is specific for both the target antigen and the label or tag on the CAR) and a domain that is recognized by or binds to the label, binding domain, or tag on the CAR. See, e.g., International Patent Publication Nos. WO 2013/044225, WO 2016/000304, WO 2015/057834, WO 2015/057852, and WO 2016/070061, U.S. Pat. No. 9,233,125, and US Patent Publication No. US 2016/0129109. In this way, a T-cell that expresses the CAR can be administered to a subject, but the CAR cannot bind its target antigen until the second composition comprising an antigen-specific binding domain is administered.
Alternative switch mechanisms include CARs that require multimerization in order to activate their signaling function (see, e.g., US Patent Publication Nos. US 2015/0368342, US 2016/0175359, and US 2015/0368360) and/or an exogenous signal, such as a small molecule drug (US 2016/0166613, Yung et al., Science, 2015), in order to elicit a T-cell response. Some CARs may also comprise a “suicide switch” to induce cell death of the CAR T-cells following treatment (Buddee et al., PLoS One, 2013) or to downregulate expression of the CAR following binding to the target antigen (WO 2016/011210).
Alternative techniques may be used to transform target immunoresponsive cells, such as protoplast fusion, lipofection, transfection or electroporation. A wide variety of vectors, such as retroviral vectors, lentiviral vectors, adenoviral vectors, adeno-associated viral vectors, plasmids or transposons, such as a Sleeping Beauty transposon (see U.S. Pat. Nos. 6,489,458; 7,148,203; 7,160,682; 7,985,739; 8,227,432), may be used to introduce CARs, for example using 2nd generation antigen-specific CARs signaling through CD3ζ and either CD28 or CD137. Viral vectors may for example include vectors based on HIV, SV40, EBV, HSV or BPV.
Cells that are targeted for transformation may for example include T cells, Natural Killer (NK) cells, cytotoxic T lymphocytes (CTL), regulatory T cells, human embryonic stem cells, tumor-infiltrating lymphocytes (TIL) or a pluripotent stem cell from which lymphoid cells may be differentiated. T cells expressing a desired CAR may for example be selected through co-culture with γ-irradiated activating and propagating cells (AaPC), which co-express the cancer antigen and co-stimulatory molecules. The engineered CAR T-cells may be expanded, for example by co-culture on AaPC in presence of soluble factors, such as IL-2 and IL-21. This expansion may for example be carried out so as to provide memory CAR+ T cells (which may for example be assayed by non-enzymatic digital array and/or multi-panel flow cytometry). In this way, CAR T cells may be provided that have specific cytotoxic activity against antigen-bearing tumors (optionally in conjunction with production of desired chemokines such as interferon-γ). CAR T cells of this kind may for example be used in animal models, for example to treat tumor xenografts.
In certain embodiments, ACT includes co-transferring CD4+Th1 cells and CD8+ CTLs to induce a synergistic antitumor response (see, e.g., Li et al., Adoptive cell therapy with CD4+T helper 1 cells and CD8+ cytotoxic T cells enhances complete rejection of an established tumor, leading to generation of endogenous memory responses to non-targeted tumor epitopes. Clin Transl Immunology. 2017 October; 6(10): e160).
In certain embodiments, Th17 cells are transferred to a subject in need thereof. Th17 cells have been reported to directly eradicate melanoma tumors in mice to a greater extent than Th1 cells (Muranski P, et al., Tumor-specific Th17-polarized cells eradicate large established melanoma. Blood. 2008 Jul 15; 112(2):362-73; and Martin-Orozco N, et al., T helper 17 cells promote cytotoxic T cell activation in tumor immunity. Immunity. 2009 Nov. 20; 31(5):787-98). Those studies involved an adoptive T cell transfer (ACT) therapy approach, which takes advantage of CD4+ T cells that express a TCR recognizing tyrosinase tumor antigen. Exploitation of the TCR leads to rapid expansion of Th17 populations to large numbers ex vivo for reinfusion into the autologous tumor-bearing hosts.
In certain embodiments, ACT may include autologous iPSC-based vaccines, such as irradiated iPSCs in autologous anti-tumor vaccines (see e.g., Kooreman, Nigel G. et al., Autologous iPSC-Based Vaccines Elicit Anti-tumor Responses In Vivo, Cell Stem Cell 22, 1-13, 2018, doi.org/10.1016/j.stem.2018.01.016).
Unlike T-cell receptors (TCRs) that are MHC restricted, CARs can potentially bind any cell surface-expressed antigen and can thus be more universally used to treat patients (see Irving et al., Engineering Chimeric Antigen Receptor T-Cells for Racing in Solid Tumors: Don't Forget the Fuel, Front. Immunol., 3 Apr. 2017, doi.org/10.3389/fimmu.2017.00267). In certain embodiments, in the absence of endogenous T-cell infiltrate (e.g., due to aberrant antigen processing and presentation), which precludes the use of TIL therapy and immune checkpoint blockade, the transfer of CAR T-cells may be used to treat patients (see, e.g., Hinrichs C S, Rosenberg S A. Exploiting the curative potential of adoptive T-cell therapy for cancer. Immunol Rev (2014) 257(1):56-71. doi:10.1111/imr.12132).
Approaches such as the foregoing may be adapted to provide methods of treating and/or increasing survival of a subject having a disease, such as a neoplasia, for example by administering an effective amount of an immunoresponsive cell comprising an antigen recognizing receptor that binds a selected antigen, wherein the binding activates the immunoresponsive cell, thereby treating or preventing the disease (such as a neoplasia, a pathogen infection, an autoimmune disorder, or an allogeneic transplant reaction).
In certain embodiments, the treatment can be administered after lymphodepleting pretreatment in the form of chemotherapy (typically a combination of cyclophosphamide and fludarabine) or radiation therapy. Initial studies in ACT had short-lived responses and the transferred cells did not persist in vivo for very long (Houot et al., T-cell-based immunotherapy: adoptive cell transfer and checkpoint inhibition. Cancer Immunol Res (2015) 3(10):1115-22; and Kamta et al., Advancing Cancer Therapy with Present and Emerging Immuno-Oncology Approaches. Front. Oncol. (2017) 7:64). Immune suppressor cells like Tregs and MDSCs may attenuate the activity of transferred cells by outcompeting them for the necessary cytokines. Not being bound by a theory, lymphodepleting pretreatment may eliminate the suppressor cells, allowing the TILs to persist.
In one embodiment, the treatment can be administered to patients undergoing an immunosuppressive treatment (e.g., glucocorticoid treatment). The cells or population of cells may be made resistant to at least one immunosuppressive agent due to the inactivation of a gene encoding a receptor for such immunosuppressive agent. In certain embodiments, the immunosuppressive treatment provides for the selection and expansion of the immunoresponsive T cells within the patient.
In certain embodiments, the treatment can be administered before primary treatment (e.g., surgery or radiation therapy) to shrink a tumor before the primary treatment. In another embodiment, the treatment can be administered after primary treatment to remove any remaining cancer cells.
In certain embodiments, immunometabolic barriers can be targeted therapeutically prior to and/or during ACT to enhance responses to ACT or CAR T-cell therapy and to support endogenous immunity (see, e.g., Irving et al., Engineering Chimeric Antigen Receptor T-Cells for Racing in Solid Tumors: Don't Forget the Fuel, Front. Immunol., 3 Apr. 2017, doi.org/10.3389/fimmu.2017.00267).
The administration of cells or population of cells, such as immune system cells or cell populations, such as more particularly immunoresponsive cells or cell populations, as disclosed herein may be carried out in any convenient manner, including by aerosol inhalation, injection, ingestion, transfusion, implantation or transplantation. The cells or population of cells may be administered to a patient subcutaneously, intradermally, intratumorally, intranodally, intramedullary, intramuscularly, intrathecally, by intravenous or intralymphatic injection, or intraperitoneally. In some embodiments, the disclosed CARs may be delivered or administered into a cavity formed by the resection of tumor tissue (i.e. intracavity delivery) or directly into a tumor prior to resection (i.e. intratumoral delivery). In one embodiment, the cell compositions of the present invention are preferably administered by intravenous injection.
The administration of the cells or population of cells can consist of the administration of 10^4 to 10^9 cells per kg body weight, preferably 10^5 to 10^6 cells/kg body weight, including all integer values of cell numbers within those ranges. Dosing in CAR T cell therapies may for example involve administration of from 10^6 to 10^9 cells/kg, with or without a course of lymphodepletion, for example with cyclophosphamide. The cells or population of cells can be administered in one or more doses. In another embodiment, the effective number of cells is administered as a single dose. In another embodiment, the effective number of cells is administered as more than one dose over a period of time. Timing of administration is within the judgment of the managing physician and depends on the clinical condition of the patient. The cells or population of cells may be obtained from any source, such as a blood bank or a donor. While individual needs vary, determination of optimal ranges of effective amounts of a given cell type for a particular disease or condition is within the skill of one in the art. An effective amount means an amount which provides a therapeutic or prophylactic benefit. The dosage administered will be dependent upon the age, health and weight of the recipient, kind of concurrent treatment, if any, frequency of treatment and the nature of the effect desired.
In another embodiment, the effective number of cells or composition comprising those cells is administered parenterally. The administration can be an intravenous administration. The administration can be done directly by injection within a tumor.
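Because the ranges above are stated per kg of body weight, the total number of cells for a given recipient follows by multiplication. The sketch below shows only that arithmetic, reading the preferred range as 10^5 to 10^6 cells/kg; the function names and the 70 kg body weight are hypothetical values chosen for illustration and do not represent a dosing recommendation.

```python
def total_cell_dose(cells_per_kg: float, body_weight_kg: float) -> float:
    """Total number of cells for a recipient of the given body weight."""
    if cells_per_kg < 0 or body_weight_kg <= 0:
        raise ValueError("dose must be non-negative and weight positive")
    return cells_per_kg * body_weight_kg


def total_dose_range(low_per_kg: float, high_per_kg: float,
                     body_weight_kg: float) -> tuple[float, float]:
    """Convert a per-kg dose range into a total-cell range."""
    return (total_cell_dose(low_per_kg, body_weight_kg),
            total_cell_dose(high_per_kg, body_weight_kg))


# Preferred range from the text; the 70 kg weight is an assumed example value.
low, high = total_dose_range(1e5, 1e6, body_weight_kg=70.0)
print(f"{low:.1e} to {high:.1e} cells in total")
```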
To guard against possible adverse reactions, engineered immunoresponsive cells may be equipped with a transgenic safety switch, in the form of a transgene that renders the cells vulnerable to exposure to a specific signal. For example, the herpes simplex viral thymidine kinase (TK) gene may be used in this way, for example by introduction into allogeneic T lymphocytes used as donor lymphocyte infusions following stem cell transplantation (Greco, et al., Improving the safety of cell therapy with the TK-suicide gene. Front. Pharmacol. 2015; 6: 95). In such cells, administration of a nucleoside prodrug such as ganciclovir or acyclovir causes cell death. Alternative safety switch constructs include inducible caspase 9, for example triggered by administration of a small-molecule dimerizer that brings together two nonfunctional icasp9 molecules to form the active enzyme. A wide variety of alternative approaches to implementing cellular proliferation controls have been described (see U.S. Patent Publication No. 20130071414; PCT Patent Publication WO2011146862; PCT Patent Publication WO2014011987; PCT Patent Publication WO2013040371; Zhou et al. BLOOD, 2014, 123/25:3895-3905; Di Stasi et al., The New England Journal of Medicine 2011; 365:1673-1683; Sadelain M, The New England Journal of Medicine 2011; 365:1735-173; Ramos et al., Stem Cells 28(6):1107-15 (2010)).
In a further refinement of adoptive therapies, genome editing may be used to tailor immunoresponsive cells to alternative implementations, for example providing edited CAR T cells (see Poirot et al., 2015, Multiplex genome edited T-cell manufacturing platform for “off-the-shelf” adoptive T-cell immunotherapies, Cancer Res 75 (18): 3853; Ren et al., 2017, Multiplex genome editing to generate universal CAR T cells resistant to PD1 inhibition, Clin Cancer Res. 2017 May 1; 23(9):2255-2266. doi: 10.1158/1078-0432.CCR-16-1300. Epub 2016 Nov. 4; Qasim et al., 2017, Molecular remission of infant B-ALL after infusion of universal TALEN gene-edited CAR T cells, Sci Transl Med. 2017 Jan. 25; 9 (374); Legut, et al., 2018, CRISPR-mediated TCR replacement generates superior anticancer transgenic T cells. Blood, 131(3), 311-322; and Georgiadis et al., Long Terminal Repeat CRISPR-CAR-Coupled “Universal” T Cells Mediate Potent Anti-leukemic Effects, Molecular Therapy, In Press, Corrected Proof, Available online 6 Mar. 2018). Cells may be edited using any CRISPR system and method of use thereof as described herein. CRISPR systems may be delivered to an immune cell by any method described herein. In preferred embodiments, cells are edited ex vivo and transferred to a subject in need thereof. Immunoresponsive cells, CAR T cells or any cells used for adoptive cell transfer may be edited. Editing may be performed for example to insert or knock-in an exogenous gene, such as an exogenous gene encoding a CAR or a TCR, at a preselected locus in a cell (e.g. TRAC locus); to eliminate potential alloreactive T-cell receptors (TCR) or to prevent inappropriate pairing between endogenous and exogenous TCR chains, such as to knock-out or knock-down expression of an endogenous TCR in a cell; to disrupt the target of a chemotherapeutic agent in a cell; to block an immune checkpoint, such as to knock-out or knock-down expression of an immune checkpoint protein or receptor in a cell; to knock-out or knock-down expression of other gene or genes in a cell, the reduced expression or lack of expression of which can enhance the efficacy of adoptive therapies using the cell; to knock-out or knock-down expression of an endogenous gene in a cell, said endogenous gene encoding an antigen targeted by an exogenous CAR or TCR; to knock-out or knock-down expression of one or more MHC constituent proteins in a cell; to activate a T cell; to modulate cells such that the cells are resistant to exhaustion or dysfunction; and/or increase the differentiation and/or proliferation of functionally exhausted or dysfunctional CD8+ T-cells (see International Patent Publications WO2013176915, WO2014059173, WO2014172606, WO2014184744, and WO2014191128).
In certain embodiments, editing may result in inactivation of a gene. By inactivating a gene, it is intended that the gene of interest is not expressed in a functional protein form. In a particular embodiment, the CRISPR system specifically catalyzes cleavage in one targeted gene thereby inactivating said targeted gene. The nucleic acid strand breaks caused are commonly repaired through the distinct mechanisms of homologous recombination or non-homologous end joining (NHEJ). However, NHEJ is an imperfect repair process that often results in changes to the DNA sequence at the site of the cleavage. Repair via NHEJ often results in small insertions or deletions (indels) and can be used for the creation of specific gene knockouts. Cells in which a cleavage-induced mutagenesis event has occurred can be identified and/or selected by methods well known in the art. In certain embodiments, homology directed repair (HDR) is used to concurrently inactivate a gene (e.g., TRAC) and insert an exogenous TCR or CAR into the inactivated locus.
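One common reason small insertions or deletions inactivate a gene is that a net length change that is not a multiple of three shifts the reading frame of the coding sequence downstream of the cleavage site. The following minimal sketch illustrates that rule only; the function name is hypothetical, and an in-frame change may still disrupt protein function, which this check does not assess.

```python
def is_frameshift(inserted_bases: int, deleted_bases: int) -> bool:
    """True if the net length change within a coding sequence is not a
    multiple of three, i.e. the downstream reading frame is shifted."""
    if inserted_bases < 0 or deleted_bases < 0:
        raise ValueError("base counts must be non-negative")
    net_change = inserted_bases - deleted_bases
    return net_change % 3 != 0


# Example repair outcomes as (inserted, deleted) base counts; illustrative only.
for inserted, deleted in [(1, 0), (0, 2), (0, 3), (2, 5)]:
    label = "frameshift" if is_frameshift(inserted, deleted) else "in-frame"
    print(f"+{inserted}/-{deleted}: {label}")
```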
Hence, in certain embodiments, editing of cells (such as by CRISPR/Cas), particularly cells intended for adoptive cell therapies, more particularly immunoresponsive cells such as T cells, may be performed to insert or knock-in an exogenous gene, such as an exogenous gene encoding a CAR or a TCR, at a preselected locus in a cell. Conventionally, nucleic acid molecules encoding CARs or TCRs are transfected or transduced to cells using randomly integrating vectors, which, depending on the site of integration, may lead to clonal expansion, oncogenic transformation, variegated transgene expression and/or transcriptional silencing of the transgene. Directing of transgene(s) to a specific locus in a cell can minimize or avoid such risks and advantageously provide for uniform expression of the transgene(s) by the cells. Without limitation, suitable ‘safe harbor’ loci for directed transgene integration include CCR5 or AAVS1. Homology-directed repair (HDR) strategies are known and described elsewhere in this specification allowing to insert transgenes into desired loci (e.g., TRAC locus).
Further suitable loci for insertion of transgenes, in particular CAR or exogenous TCR transgenes, include without limitation loci comprising genes coding for constituents of endogenous T-cell receptor, such as T-cell receptor alpha locus (TRA) or T-cell receptor beta locus (TRB), for example T-cell receptor alpha constant (TRAC) locus, T-cell receptor beta constant 1 (TRBC1) locus or T-cell receptor beta constant 2 (TRBC2) locus. Advantageously, insertion of a transgene into such locus can simultaneously achieve expression of the transgene, potentially controlled by the endogenous promoter, and knock-out expression of the endogenous TCR. This approach has been exemplified in Eyquem et al., (2017) Nature 543: 113-117, wherein the authors used CRISPR/Cas9 gene editing to knock-in a DNA molecule encoding a CD19-specific CAR into the TRAC locus downstream of the endogenous promoter; the CAR-T cells obtained by CRISPR were significantly superior in terms of reduced tonic CAR signaling and exhaustion.
T cell receptors (TCR) are cell surface receptors that participate in the activation of T cells in response to the presentation of antigen. The TCR is generally made from two chains, α and β, which assemble to form a heterodimer and associate with the CD3-transducing subunits to form the T cell receptor complex present on the cell surface. Each α and β chain of the TCR consists of an immunoglobulin-like N-terminal variable (V) and constant (C) region, a hydrophobic transmembrane domain, and a short cytoplasmic region. As for immunoglobulin molecules, the variable regions of the α and β chains are generated by V(D)J recombination, creating a large diversity of antigen specificities within the population of T cells. However, in contrast to immunoglobulins that recognize intact antigen, T cells are activated by processed peptide fragments in association with an MHC molecule, introducing an extra dimension to antigen recognition by T cells, known as MHC restriction. Recognition of MHC disparities between the donor and recipient through the T cell receptor leads to T cell proliferation and the potential development of graft versus host disease (GVHD). The inactivation of TCRα or TCRβ can result in the elimination of the TCR from the surface of T cells, preventing recognition of alloantigen and thus GVHD. However, TCR disruption generally results in the elimination of the CD3 signaling component and alters the means of further T cell expansion.
Hence, in certain embodiments, editing of cells (such as by CRISPR/Cas), particularly cells intended for adoptive cell therapies, more particularly immunoresponsive cells such as T cells, may be performed to knock-out or knock-down expression of an endogenous TCR in a cell. For example, NHEJ-based or HDR-based gene editing approaches can be employed to disrupt the endogenous TCR alpha and/or beta chain genes. For example, gene editing system or systems, such as CRISPR/Cas system or systems, can be designed to target a sequence found within the TCR beta chain conserved between the beta 1 and beta 2 constant region genes (TRBC1 and TRBC2) and/or to target the constant region of the TCR alpha chain (TRAC) gene.
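Targeting a sequence conserved between two genes, as described above for TRBC1 and TRBC2, amounts to finding a stretch of sequence present in both. The sketch below lists every k-mer shared by two DNA strings as a starting point for such a search. It is a simplified illustration under stated assumptions: the function name and the toy sequences are hypothetical (they are not TRBC1 or TRBC2 sequences), only one strand is compared, and no nuclease-specific requirements such as a protospacer-adjacent motif or off-target screening are considered.

```python
def shared_kmers(seq_a: str, seq_b: str, k: int) -> list[tuple[int, str]]:
    """Return (position in seq_a, k-mer) for every distinct k-mer of seq_a
    that also occurs somewhere in seq_b, in order of first appearance."""
    if k <= 0:
        raise ValueError("k must be positive")
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    seen: set[str] = set()
    hits: list[tuple[int, str]] = []
    for i in range(len(seq_a) - k + 1):
        kmer = seq_a[i:i + k]
        if kmer in kmers_b and kmer not in seen:
            seen.add(kmer)
            hits.append((i, kmer))
    return hits


# Hypothetical toy sequences for illustration only.
TOY_GENE_1 = "ACGTTGCAAGGCTTACGGATCC"
TOY_GENE_2 = "TTTTGCAAGGCTTACGGAAAA"

for position, kmer in shared_kmers(TOY_GENE_1, TOY_GENE_2, k=12):
    print(position, kmer)
```

A candidate found this way would occur in both genes, so a single targeting sequence could in principle address both; any real design would need the additional checks noted above.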
Allogeneic cells are rapidly rejected by the host immune system. It has been demonstrated that allogeneic leukocytes present in non-irradiated blood products will persist for no more than 5 to 6 days (Boni, Muranski et al. 2008 Blood 1; 112(12):4746-54). Thus, to prevent rejection of allogeneic cells, the host's immune system usually has to be suppressed to some extent. However, in the case of adoptive cell transfer, the use of immunosuppressive drugs also has a detrimental effect on the introduced therapeutic T cells. Therefore, to effectively use an adoptive immunotherapy approach in these conditions, the introduced cells would need to be resistant to the immunosuppressive treatment. Thus, in a particular embodiment, the present invention further comprises a step of modifying T cells to make them resistant to an immunosuppressive agent, preferably by inactivating at least one gene encoding a target for an immunosuppressive agent. An immunosuppressive agent is an agent that suppresses immune function by one of several mechanisms of action. An immunosuppressive agent can be, but is not limited to, a calcineurin inhibitor, a target of rapamycin, an interleukin-2 receptor α-chain blocker, an inhibitor of inosine monophosphate dehydrogenase, an inhibitor of dihydrofolic acid reductase, a corticosteroid or an immunosuppressive antimetabolite. The present invention allows conferring immunosuppressive resistance to T cells for immunotherapy by inactivating the target of the immunosuppressive agent in T cells. As non-limiting examples, targets for an immunosuppressive agent can be a receptor for an immunosuppressive agent such as: CD52, glucocorticoid receptor (GR), a FKBP family gene member and a cyclophilin family gene member.
In certain embodiments, editing of cells (such as by CRISPR/Cas), particularly cells intended for adoptive cell therapies, more particularly immunoresponsive cells such as T cells, may be performed to block an immune checkpoint, such as to knock-out or knock-down expression of an immune checkpoint protein or receptor in a cell. Immune checkpoints are inhibitory pathways that slow down or stop immune reactions and prevent excessive tissue damage from uncontrolled activity of immune cells. In certain embodiments, the immune checkpoint targeted is the programmed death-1 (PD-1 or CD279) gene (PDCD1). In other embodiments, the immune checkpoint targeted is cytotoxic T-lymphocyte-associated antigen (CTLA-4). In additional embodiments, the immune checkpoint targeted is another member of the CD28 and CTLA4 Ig superfamily such as BTLA, LAG3, ICOS, PDL1 or KIR. In further additional embodiments, the immune checkpoint targeted is a member of the TNFR superfamily such as CD40, OX40, CD137, GITR, CD27 or TIM-3.
Additional immune checkpoints include Src homology 2 domain-containing protein tyrosine phosphatase 1 (SHP-1) (Watson H A, et al., SHP-1: the next checkpoint target for cancer immunotherapy? Biochem Soc Trans. 2016 Apr 15; 44(2):356-62). SHP-1 is a widely expressed inhibitory protein tyrosine phosphatase (PTP). In T-cells, it is a negative regulator of antigen-dependent activation and proliferation. It is a cytosolic protein, and therefore not amenable to antibody-mediated therapies, but its role in activation and proliferation makes it an attractive target for genetic manipulation in adoptive transfer strategies, such as chimeric antigen receptor (CAR) T cells. Immune checkpoints may also include T cell immunoreceptor with Ig and ITIM domains (TIGIT/Vstm3/WUCAM/VSIG9) and VISTA (Le Mercier I, et al., (2015) Beyond CTLA-4 and PD-1, the generation Z of negative checkpoint regulators. Front. Immunol. 6:418).
International Patent Publication No. WO2014172606 relates to the use of MT1 and/or MT2 inhibitors to increase proliferation and/or activity of exhausted CD8+ T-cells and to decrease CD8+ T-cell exhaustion (e.g., decrease functionally exhausted or unresponsive CD8+ immune cells). In certain embodiments, metallothioneins are targeted by gene editing in adoptively transferred T cells.
In certain embodiments, targets of gene editing may be at least one targeted locus involved in the expression of an immune checkpoint protein. Such targets may include, but are not limited to, CTLA4, PPP2CA, PPP2CB, PTPN6, PTPN22, PDCD1, ICOS (CD278), PDL1, KIR, LAG3, HAVCR2, BTLA, CD160, TIGIT, CD96, CRTAM, LAIR1, SIGLEC7, SIGLEC9, CD244 (2B4), TNFRSF10B, TNFRSF10A, CASP8, CASP10, CASP3, CASP6, CASP7, FADD, FAS, TGFBRII, TGFBRI, SMAD2, SMAD3, SMAD4, SMAD10, SKI, SKIL, TGIF1, IL10RA, IL10RB, HMOX2, IL6R, IL6ST, EIF2AK4, CSK, PAG1, SIT1, FOXP3, PRDM1, BATF, VISTA, GUCY1A2, GUCY1A3, GUCY1B2, GUCY1B3, MT1, MT2, CD40, OX40, CD137, GITR, CD27, SHP-1, TIM-3, CEACAM-1, CEACAM-3, or CEACAM-5. In preferred embodiments, the gene locus involved in the expression of PD-1 or CTLA-4 genes is targeted. In other preferred embodiments, combinations of genes are targeted, such as but not limited to PD-1 and TIGIT.
By means of an example and without limitation, International Patent Publication No. WO2016196388 concerns an engineered T cell comprising (a) a genetically engineered antigen receptor that specifically binds to an antigen, which receptor may be a CAR; and (b) a disrupted gene encoding a PD-L1, an agent for disruption of a gene encoding a PD-L1, and/or disruption of a gene encoding PD-L1, wherein the disruption of the gene may be mediated by a gene editing nuclease, a zinc finger nuclease (ZFN), CRISPR/Cas9 and/or TALEN. International Patent Publication No. WO2015142675 relates to immune effector cells comprising a CAR in combination with an agent (such as CRISPR, TALEN or ZFN) that increases the efficacy of the immune effector cells in the treatment of cancer, wherein the agent may inhibit an immune inhibitory molecule, such as PD1, PD-L1, CTLA-4, TIM-3, LAG-3, VISTA, BTLA, TIGIT, LAIR1, CD160, 2B4, TGFR beta, CEACAM-1, CEACAM-3, or CEACAM-5. Ren et al., (2017) Clin Cancer Res 23 (9) 2255-2266 performed lentiviral delivery of CAR and electro-transfer of Cas9 mRNA and gRNAs targeting endogenous TCR, β-2 microglobulin (B2M) and PD1 simultaneously, to generate gene-disrupted allogeneic CAR T cells deficient of TCR, HLA class I molecule and PD1.
In certain embodiments, cells may be engineered to express a CAR, wherein expression and/or function of methylcytosine dioxygenase genes (TET1, TET2 and/or TET3) in the cells has been reduced or eliminated, such as by CRISPR, ZFN or TALEN (for example, as described in WO201704916).
In certain embodiments, editing of cells (such as by CRISPR/Cas), particularly cells intended for adoptive cell therapies, more particularly immunoresponsive cells such as T cells, may be performed to knock-out or knock-down expression of an endogenous gene in a cell, said endogenous gene encoding an antigen targeted by an exogenous CAR or TCR, thereby reducing the likelihood of targeting of the engineered cells. In certain embodiments, the targeted antigen may be one or more antigen selected from the group consisting of CD38, CD138, CS-1, CD33, CD26, CD30, CD53, CD92, CD100, CD148, CD150, CD200, CD261, CD262, CD362, human telomerase reverse transcriptase (hTERT), survivin, mouse double minute 2 homolog (MDM2), cytochrome P450 1B1 (CYP1B), HER2/neu, Wilms' tumor gene 1 (WT1), livin, alphafetoprotein (AFP), carcinoembryonic antigen (CEA), mucin 16 (MUC16), MUC1, prostate-specific membrane antigen (PSMA), p53, cyclin (D1), B cell maturation antigen (BCMA), transmembrane activator and CAML Interactor (TACI), and B-cell activating factor receptor (BAFF-R) (for example, as described in WO2016011210 and WO2017011804).
In certain embodiments, editing of cells (such as by CRISPR/Cas), particularly cells intended for adoptive cell therapies, more particularly immunoresponsive cells such as T cells, may be performed to knock-out or knock-down expression of one or more MHC constituent proteins, such as one or more HLA proteins and/or beta-2 microglobulin (B2M), in a cell, whereby rejection of non-autologous (e.g., allogeneic) cells by the recipient's immune system can be reduced or avoided. In preferred embodiments, one or more HLA class I proteins, such as HLA-A, B and/or C, and/or B2M may be knocked-out or knocked-down. Preferably, B2M may be knocked-out or knocked-down. By means of an example, Ren et al., (2017) Clin Cancer Res 23 (9) 2255-2266 performed lentiviral delivery of CAR and electro-transfer of Cas9 mRNA and gRNAs targeting endogenous TCR, β-2 microglobulin (B2M) and PD1 simultaneously, to generate gene-disrupted allogeneic CAR T cells deficient of TCR, HLA class I molecule and PD1.
In other embodiments, at least two genes are edited. Pairs of genes may include, but are not limited to PD1 and TCRα, PD1 and TCRβ, CTLA-4 and TCRα, CTLA-4 and TCRβ, LAG3 and TCRα, LAG3 and TCRβ, Tim3 and TCRα, Tim3 and TCRβ, BTLA and TCRα, BTLA and TCRβ, BY55 and TCRα, BY55 and TCRβ, TIGIT and TCRα, TIGIT and TCRβ, B7H5 and TCRα, B7H5 and TCRβ, LAIR1 and TCRα, LAIR1 and TCRβ, SIGLEC10 and TCRα, SIGLEC10 and TCRβ, 2B4 and TCRα, 2B4 and TCRβ, B2M and TCRα, B2M and TCRβ.
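The listed pairs follow a regular pattern: each of twelve genes is paired with TCRα and with TCRβ. A minimal sketch that enumerates them is given below; the split into two lists and the names `PARTNER_GENES`, `TCR_CHAINS` and `gene_pairs` are illustrative assumptions, not part of the disclosure.

```python
from itertools import product

# Gene names as written in the text; grouping them into two lists is
# an illustrative reading of the enumerated pairs.
PARTNER_GENES = ["PD1", "CTLA-4", "LAG3", "Tim3", "BTLA", "BY55",
                 "TIGIT", "B7H5", "LAIR1", "SIGLEC10", "2B4", "B2M"]
TCR_CHAINS = ["TCRα", "TCRβ"]


def gene_pairs():
    """Return every (partner gene, TCR chain) pair, in the order listed."""
    return list(product(PARTNER_GENES, TCR_CHAINS))


if __name__ == "__main__":
    for gene, chain in gene_pairs():
        print(f"{gene} and {chain}")
```

Because the pairs are described as non-limiting, the two lists could be extended without changing the enumeration logic.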
In certain embodiments, a cell may be multiply edited (multiplex genome editing) as taught herein to (1) knock-out or knock-down expression of an endogenous TCR (for example, TRBC1, TRBC2 and/or TRAC), (2) knock-out or knock-down expression of an immune checkpoint protein or receptor (for example PD1, PD-L1 and/or CTLA4); and (3) knock-out or knock-down expression of one or more MHC constituent proteins (for example, HLA-A, B and/or C, and/or B2M, preferably B2M).
Whether prior to or after genetic modification of the T cells, the T cells can be activated and expanded generally using methods as described, for example, in U.S. Pat. Nos. 6,352,694, 6,534,055, 6,905,680, 5,858,358, 6,887,466, 6,905,681, 7,144,575, 7,232,566, 7,175,843, 5,883,223, 6,905,874, 6,797,514, 6,867,041, and 7,572,631. T cells can be expanded in vitro or in vivo.
Immune cells may be obtained using any method known in the art. In one embodiment, allogeneic T cells may be obtained from healthy subjects. In one embodiment, T cells that have infiltrated a tumor are isolated. T cells may be removed during surgery. T cells may be isolated after removal of tumor tissue by biopsy. T cells may be isolated by any means known in the art. In one embodiment, T cells are obtained by apheresis. In one embodiment, the method may comprise obtaining a bulk population of T cells from a tumor sample by any suitable method known in the art. For example, a bulk population of T cells can be obtained from a tumor sample by dissociating the tumor sample into a cell suspension from which specific cell populations can be selected. Suitable methods of obtaining a bulk population of T cells may include, but are not limited to, any one or more of mechanically dissociating (e.g., mincing) the tumor, enzymatically dissociating (e.g., digesting) the tumor, and aspiration (e.g., as with a needle).
The bulk population of T cells obtained from a tumor sample may comprise any suitable type of T cell. Preferably, the bulk population of T cells obtained from a tumor sample comprises tumor infiltrating lymphocytes (TILs).
The tumor sample may be obtained from any mammal. Unless stated otherwise, as used herein, the term “mammal” refers to any mammal including, but not limited to, mammals of the order Lagomorpha, such as rabbits; the order Carnivora, including Felines (cats) and Canines (dogs); the order Artiodactyla, including Bovines (cows) and Swines (pigs); or of the order Perissodactyla, including Equines (horses). The mammals may be non-human primates, e.g., of the order Primates, Ceboids, or Simoids (monkeys) or of the order Anthropoids (humans and apes). In some embodiments, the mammal may be a mammal of the order Rodentia, such as mice and hamsters. Preferably, the mammal is a non-human primate or a human. An especially preferred mammal is the human.
T cells can be obtained from a number of sources, including peripheral blood mononuclear cells (PBMC), bone marrow, lymph node tissue, spleen tissue, and tumors. In certain embodiments of the present invention, T cells can be obtained from a unit of blood collected from a subject using any number of techniques known to the skilled artisan, such as Ficoll separation. In one preferred embodiment, cells from the circulating blood of an individual are obtained by apheresis or leukapheresis. The apheresis product typically contains lymphocytes, including T cells, monocytes, granulocytes, B cells, other nucleated white blood cells, red blood cells, and platelets. In one embodiment, the cells collected by apheresis may be washed to remove the plasma fraction and to place the cells in an appropriate buffer or media for subsequent processing steps. In one embodiment of the invention, the cells are washed with phosphate buffered saline (PBS). In an alternative embodiment, the wash solution lacks calcium and may lack magnesium or may lack many if not all divalent cations. Initial activation steps in the absence of calcium lead to magnified activation. As those of ordinary skill in the art would readily appreciate, a washing step may be accomplished by methods known to those in the art, such as by using a semi-automated “flow-through” centrifuge (for example, the Cobe 2991 cell processor) according to the manufacturer's instructions. After washing, the cells may be resuspended in a variety of biocompatible buffers, such as, for example, Ca-free, Mg-free PBS. Alternatively, the undesirable components of the apheresis sample may be removed and the cells directly resuspended in culture media.
In another embodiment, T cells are isolated from peripheral blood lymphocytes by lysing the red blood cells and depleting the monocytes, for example, by centrifugation through a PERCOLL™ gradient. A specific subpopulation of T cells, such as CD28+, CD4+, CD8+, CD45RA+, and CD45RO+ T cells, can be further isolated by positive or negative selection techniques. For example, in one preferred embodiment, T cells are isolated by incubation with anti-CD3/anti-CD28 (i.e., 3×28)-conjugated beads, such as DYNABEADS® M-450 CD3/CD28 T, or XCYTE DYNABEADS™ for a time period sufficient for positive selection of the desired T cells. In one embodiment, the time period is about 30 minutes. In a further embodiment, the time period ranges from 30 minutes to 36 hours or longer and all integer values there between. In a further embodiment, the time period is at least 1, 2, 3, 4, 5, or 6 hours. In yet another preferred embodiment, the time period is 10 to 24 hours. In one preferred embodiment, the incubation time period is 24 hours. For isolation of T cells from patients with leukemia, use of longer incubation times, such as 24 hours, can increase cell yield. Longer incubation times may be used to isolate T cells in any situation where there are few T cells as compared to other cell types, such as in isolating tumor infiltrating lymphocytes (TIL) from tumor tissue or from immunocompromised individuals. Further, use of longer incubation times can increase the efficiency of capture of CD8+ T cells.
Enrichment of a T cell population by negative selection can be accomplished with a combination of antibodies directed to surface markers unique to the negatively selected cells. A preferred method is cell sorting and/or selection via negative magnetic immunoadherence or flow cytometry that uses a cocktail of monoclonal antibodies directed to cell surface markers present on the cells negatively selected. For example, to enrich for CD4+ cells by negative selection, a monoclonal antibody cocktail typically includes antibodies to CD14, CD20, CD11b, CD16, HLA-DR, and CD8.
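The logic of negative selection can be sketched in a few lines: cells positive for any marker in the depletion cocktail are removed, and the untouched remainder is kept. The sketch below is illustrative only; the per-cell `"markers"` field and the function name `negatively_select` are hypothetical, and real sorting operates on antibody staining rather than on labeled records.

```python
# Markers named in the text for enriching CD4+ cells by negative selection.
DEPLETION_MARKERS = frozenset({"CD14", "CD20", "CD11b", "CD16", "HLA-DR", "CD8"})


def negatively_select(cells, depletion_markers=DEPLETION_MARKERS):
    """Keep only cells that express none of the depletion markers.

    `cells` is a list of dicts, each with a hypothetical "markers" set
    naming the surface markers that cell is positive for.
    """
    return [cell for cell in cells if not (cell["markers"] & depletion_markers)]


if __name__ == "__main__":
    sample = [
        {"id": "a", "markers": {"CD3", "CD4"}},   # retained
        {"id": "b", "markers": {"CD3", "CD8"}},   # depleted via CD8
        {"id": "c", "markers": {"CD14"}},         # depleted via CD14
    ]
    print([cell["id"] for cell in negatively_select(sample)])
```

The retained cells are never bound by a selection antibody, which is the practical point of selecting negatively rather than positively.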
Further, monocyte populations (i.e., CD14+ cells) may be depleted from blood preparations by a variety of methodologies, including anti-CD14 coated beads or columns, or utilization of the phagocytotic activity of these cells to facilitate removal. Accordingly, in one embodiment, the invention uses paramagnetic particles of a size sufficient to be engulfed by phagocytotic monocytes. In certain embodiments, the paramagnetic particles are commercially available beads, for example, those produced by Life Technologies under the trade name Dynabeads™. In one embodiment, other non-specific cells are removed by coating the paramagnetic particles with “irrelevant” proteins (e.g., serum proteins or antibodies). Irrelevant proteins and antibodies include those proteins and antibodies or fragments thereof that do not specifically target the T cells to be isolated. In certain embodiments, the irrelevant beads include beads coated with sheep anti-mouse antibodies, goat anti-mouse antibodies, and human serum albumin.
In brief, such depletion of monocytes is performed by preincubating T cells isolated from whole blood, apheresed peripheral blood, or tumors with one or more varieties of irrelevant or non-antibody coupled paramagnetic particles at any amount that allows for removal of monocytes (approximately a 20:1 bead:cell ratio) for about 30 minutes to 2 hours at 22 to 37 degrees C., followed by magnetic removal of cells which have attached to or engulfed the paramagnetic particles. Such separation can be performed using standard methods available in the art. For example, any magnetic separation methodology may be used, including a variety of methodologies that are commercially available (e.g., DYNAL® Magnetic Particle Concentrator (DYNAL MPC®)). Assurance of requisite depletion can be monitored by a variety of methodologies known to those of ordinary skill in the art, including flow cytometric analysis of CD14 positive cells, before and after depletion.
For isolation of a desired population of cells by positive or negative selection, the concentration of cells and surface (e.g., particles such as beads) can be varied. In certain embodiments, it may be desirable to significantly decrease the volume in which beads and cells are mixed together (i.e., increase the concentration of cells), to ensure maximum contact of cells and beads. For example, in one embodiment, a concentration of 2 billion cells/ml is used. In one embodiment, a concentration of 1 billion cells/ml is used. In a further embodiment, greater than 100 million cells/ml is used. In a further embodiment, a concentration of cells of 10, 15, 20, 25, 30, 35, 40, 45, or 50 million cells/ml is used. In yet another embodiment, a concentration of cells from 75, 80, 85, 90, 95, or 100 million cells/ml is used. In further embodiments, concentrations of 125 or 150 million cells/ml can be used. Using high concentrations can result in increased cell yield, cell activation, and cell expansion. Further, use of high cell concentrations allows more efficient capture of cells that may weakly express target antigens of interest, such as CD28-negative T cells, or from samples where there are many tumor cells present (i.e., leukemic blood, tumor tissue, etc). Such populations of cells may have therapeutic value and would be desirable to obtain. For example, using high concentration of cells allows more efficient selection of CD8+ T cells that normally have weaker CD28 expression.
In a related embodiment, it may be desirable to use lower concentrations of cells. By significantly diluting the mixture of T cells and surface (e.g., particles such as beads), interactions between the particles and cells are minimized. This selects for cells that express high amounts of desired antigens to be bound to the particles. For example, CD4+ T cells express higher levels of CD28 and are more efficiently captured than CD8+ T cells in dilute concentrations. In one embodiment, the concentration of cells used is 5×10⁶/ml. In other embodiments, the concentration used can be from about 1×10⁵/ml to 1×10⁶/ml, and any integer value in between.
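The cell concentrations discussed above translate directly into a mixing volume: volume equals total cell number divided by target concentration. The following sketch shows that arithmetic for a concentrated and a dilute case; the function name and the example cell count of 1 billion are assumptions chosen for illustration.

```python
def resuspension_volume_ml(total_cells, target_cells_per_ml):
    """Volume (ml) in which to mix cells and beads to reach a target concentration."""
    if total_cells <= 0 or target_cells_per_ml <= 0:
        raise ValueError("cell count and concentration must be positive")
    return total_cells / target_cells_per_ml


if __name__ == "__main__":
    cells = 1e9  # hypothetical starting population
    # Concentrated mixing, e.g. 100 million cells/ml.
    print(resuspension_volume_ml(cells, 100e6), "ml")
    # Dilute mixing, e.g. 5 x 10^6 cells/ml.
    print(resuspension_volume_ml(cells, 5e6), "ml")
```

For the same population, moving from 100 million cells/ml to 5×10^6 cells/ml increases the mixing volume twentyfold, which is what reduces bead-to-cell contact in the dilute case.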
T cells can also be frozen. Without wishing to be bound by theory, the freeze and subsequent thaw step provides a more uniform product by removing granulocytes and to some extent monocytes in the cell population. After a washing step to remove plasma and platelets, the cells may be suspended in a freezing solution. While many freezing solutions and parameters are known in the art and will be useful in this context, one method involves using PBS containing 20% DMSO and 8% human serum albumin, or other suitable cell freezing media; the cells are then frozen to −80° C. at a rate of 1° per minute and stored in the vapor phase of a liquid nitrogen storage tank. Other methods of controlled freezing may be used as well as uncontrolled freezing immediately at −20° C. or in liquid nitrogen.
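As a rough illustration of the controlled-rate freeze, the time to reach −80° C. at 1° per minute depends only on the starting temperature. The starting temperatures used below (4° C. and 20° C.) are assumptions for illustration; the text does not specify one.

```python
def controlled_freeze_minutes(start_temp_c, target_temp_c=-80.0, rate_c_per_min=1.0):
    """Minutes needed to cool linearly from start to target temperature."""
    if rate_c_per_min <= 0:
        raise ValueError("cooling rate must be positive")
    if target_temp_c > start_temp_c:
        raise ValueError("target temperature must not exceed the start temperature")
    return (start_temp_c - target_temp_c) / rate_c_per_min


if __name__ == "__main__":
    # Assumed starting temperatures; neither is stated in the text.
    for start in (4.0, 20.0):
        print(f"from {start} C: {controlled_freeze_minutes(start)} min")
```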
T cells for use in the present invention may also be antigen-specific T cells. For example, tumor-specific T cells can be used. In certain embodiments, antigen-specific T cells can be isolated from a patient of interest, such as a patient afflicted with a cancer or an infectious disease. In one embodiment, neoepitopes are determined for a subject and T cells specific to these antigens are isolated. Antigen-specific cells for use in expansion may also be generated in vitro using any number of methods known in the art, for example, as described in U.S. Patent Publication No. US 20040224402 entitled, Generation and Isolation of Antigen-Specific T Cells, or in U.S. Pat. No. 6,040,177. Antigen-specific cells for use in the present invention may also be generated using any number of methods known in the art, for example, as described in Current Protocols in Immunology, or Current Protocols in Cell Biology, both published by John Wiley & Sons, Inc., Boston, Mass.
In a related embodiment, it may be desirable to sort or otherwise positively select (e.g., via magnetic selection) the antigen specific cells prior to or following one or two rounds of expansion. Sorting or positively selecting antigen-specific cells can be carried out using peptide-MHC tetramers (Altman, et al., Science. 1996 Oct. 4; 274(5284):94-6). In another embodiment, the adaptable tetramer technology approach is used (Andersen et al., 2012 Nat Protoc. 7:891-902). Tetramers are limited by the need to utilize predicted binding peptides based on prior hypotheses, and the restriction to specific HLAs. Peptide-MHC tetramers can be generated using techniques known in the art and can be made with any MHC molecule of interest and any antigen of interest as described herein. Specific epitopes to be used in this context can be identified using numerous assays known in the art. For example, the ability of a polypeptide to bind to MHC class I may be evaluated indirectly by monitoring the ability to promote incorporation of 125I labeled β2-microglobulin (β2m) into MHC class I/β2m/peptide heterotrimeric complexes (see Parker et al., J. Immunol. 152:163, 1994).
In one embodiment cells are directly labeled with an epitope-specific reagent for isolation by flow cytometry followed by characterization of phenotype and TCRs. In one embodiment, T cells are isolated by contacting with T cell specific antibodies. Sorting of antigen-specific T cells, or generally any cells of the present invention, can be carried out using any of a variety of commercially available cell sorters, including, but not limited to, MoFlo sorter (DakoCytomation, Fort Collins, Colo.), FACSAria™, FACSArray™, FACSVantage™, BD™ LSR II, and FACSCalibur™ (BD Biosciences, San Jose, Calif.).
In a preferred embodiment, the method comprises selecting cells that also express CD3. The method may comprise specifically selecting the cells in any suitable manner. Preferably, the selecting is carried out using flow cytometry. The flow cytometry may be carried out using any suitable method known in the art. The flow cytometry may employ any suitable antibodies and stains. Preferably, the antibody is chosen such that it specifically recognizes and binds to the particular biomarker being selected. For example, the specific selection of CD3, CD8, TIM-3, LAG-3, 4-1BB, or PD-1 may be carried out using anti-CD3, anti-CD8, anti-TIM-3, anti-LAG-3, anti-4-1BB, or anti-PD-1 antibodies, respectively. The antibody or antibodies may be conjugated to a bead (e.g., a magnetic bead) or to a fluorochrome. Preferably, the flow cytometry is fluorescence-activated cell sorting (FACS). TCRs expressed on T cells can be selected based on reactivity to autologous tumors. Additionally, T cells that are reactive to tumors can be selected for based on markers using the methods described in International Patent Publication Nos. WO2014133567 and WO2014133568, herein incorporated by reference in their entirety. Additionally, activated T cells can be selected for based on surface expression of CD107a.
In one embodiment of the invention, the method further comprises expanding the numbers of T cells in the enriched cell population. Such methods are described in U.S. Pat. No. 8,637,307, which is herein incorporated by reference in its entirety. The numbers of T cells may be increased at least about 3-fold (or 4-, 5-, 6-, 7-, 8-, or 9-fold), more preferably at least about 10-fold (or 20-, 30-, 40-, 50-, 60-, 70-, 80-, or 90-fold), more preferably at least about 100-fold, more preferably at least about 1,000-fold, or most preferably at least about 100,000-fold. The numbers of T cells may be expanded using any suitable method known in the art. Exemplary methods of expanding the numbers of cells are described in International Patent Publication No. WO 2003057171, U.S. Pat. No. 8,034,334, and U.S. Patent Publication No. 2012/0244133, each of which is incorporated herein by reference.
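Fold expansion is the ratio of final to initial cell number, and the equivalent number of population doublings is its base-2 logarithm. The sketch below shows both calculations; the function names and example counts are illustrative assumptions.

```python
import math


def fold_expansion(initial_cells, final_cells):
    """Ratio of final to initial cell number."""
    if initial_cells <= 0 or final_cells <= 0:
        raise ValueError("cell counts must be positive")
    return final_cells / initial_cells


def population_doublings(initial_cells, final_cells):
    """Number of doublings equivalent to the observed fold expansion."""
    return math.log2(fold_expansion(initial_cells, final_cells))


if __name__ == "__main__":
    start, end = 1e6, 1e9  # hypothetical counts before and after expansion
    print(f"{fold_expansion(start, end):.0f}-fold, "
          f"{population_doublings(start, end):.1f} doublings")
```

On this arithmetic, a 1,000-fold expansion corresponds to roughly 10 doublings and a 100,000-fold expansion to roughly 17.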
In one embodiment, ex vivo T cell expansion can be performed by isolation of T cells and subsequent stimulation or activation followed by further expansion. In one embodiment of the invention, the T cells may be stimulated or activated by a single agent. In another embodiment, T cells are stimulated or activated with two agents, one that induces a primary signal and a second that is a co-stimulatory signal. Ligands useful for stimulating a single signal or stimulating a primary signal and an accessory molecule that stimulates a second signal may be used in soluble form. Ligands may be attached to the surface of a cell, to an Engineered Multivalent Signaling Platform (EMSP), or immobilized on a surface. In a preferred embodiment both primary and secondary agents are co-immobilized on a surface, for example a bead or a cell. In one embodiment, the molecule providing the primary activation signal may be a CD3 ligand, and the co-stimulatory molecule may be a CD28 ligand or 4-1BB ligand.
In certain embodiments, T cells comprising a CAR or an exogenous TCR may be manufactured as described in WO2015120096, by a method comprising: enriching a population of lymphocytes obtained from a donor subject; stimulating the population of lymphocytes with one or more T-cell stimulating agents to produce a population of activated T cells, wherein the stimulation is performed in a closed system using serum-free culture medium; transducing the population of activated T cells with a viral vector comprising a nucleic acid molecule which encodes the CAR or TCR, using a single cycle transduction to produce a population of transduced T cells, wherein the transduction is performed in a closed system using serum-free culture medium; and expanding the population of transduced T cells for a predetermined time to produce a population of engineered T cells, wherein the expansion is performed in a closed system using serum-free culture medium. In certain embodiments, T cells comprising a CAR or an exogenous TCR, may be manufactured as described in WO2015120096, by a method comprising: obtaining a population of lymphocytes; stimulating the population of lymphocytes with one or more stimulating agents to produce a population of activated T cells, wherein the stimulation is performed in a closed system using serum-free culture medium; transducing the population of activated T cells with a viral vector comprising a nucleic acid molecule which encodes the CAR or TCR, using at least one cycle transduction to produce a population of transduced T cells, wherein the transduction is performed in a closed system using serum-free culture medium; and expanding the population of transduced T cells to produce a population of engineered T cells, wherein the expansion is performed in a closed system using serum-free culture medium. The predetermined time for expanding the population of transduced T cells may be 3 days. 
The time from enriching the population of lymphocytes to producing the engineered T cells may be 6 days. The closed system may be a closed bag system. Further provided is a population of T cells comprising a CAR or an exogenous TCR obtainable or obtained by said method, and a pharmaceutical composition comprising such cells.
In certain embodiments, T cell maturation or differentiation in vitro may be delayed or inhibited by the method as described in WO2017070395, comprising contacting one or more T cells from a subject in need of a T cell therapy with an AKT inhibitor (such as, e.g., one or a combination of two or more AKT inhibitors disclosed in claim 8 of WO2017070395) and at least one of exogenous Interleukin-7 (IL-7) and exogenous Interleukin-15 (IL-15), wherein the resulting T cells exhibit delayed maturation or differentiation, and/or wherein the resulting T cells exhibit improved T cell function (such as, e.g., increased T cell proliferation; increased cytokine production; and/or increased cytolytic activity) relative to a T cell function of a T cell cultured in the absence of an AKT inhibitor.
In certain embodiments, a patient in need of a T cell therapy may be conditioned by a method as described in WO2016191756 comprising administering to the patient a dose of cyclophosphamide between 200 mg/m²/day and 2000 mg/m²/day and a dose of fludarabine between 20 mg/m²/day and 900 mg/m²/day.
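The stated dose ranges can be encoded as a simple lookup for checking whether a proposed dose falls within them. This is an illustrative sketch only and not clinical guidance; treating the range endpoints as inclusive is an assumption, as are the names `DOSE_RANGES` and `dose_in_range`.

```python
# Ranges as stated in the text, in mg per square meter per day.
DOSE_RANGES = {
    "cyclophosphamide": (200.0, 2000.0),
    "fludarabine": (20.0, 900.0),
}


def dose_in_range(agent, dose_mg_per_m2_per_day):
    """Return True if the dose lies within the stated range (endpoints assumed inclusive)."""
    low, high = DOSE_RANGES[agent]
    return low <= dose_mg_per_m2_per_day <= high


if __name__ == "__main__":
    print(dose_in_range("cyclophosphamide", 500.0))
    print(dose_in_range("fludarabine", 10.0))
```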
In certain embodiments, polyamines or enzymes of the polyamine pathway are used as biomarkers to detect an immune response (e.g., any disease or condition described herein). In certain embodiments, increased polyamines or specific enzymes (e.g., SAT1) indicate an inflammatory response, such as an autoimmune response. Detection of polyamines or enzymes of the polyamine pathway may be used in diagnosing, prognosing or monitoring a disease or an immune response.
The terms “diagnosis” and “monitoring” are commonplace and well-understood in medical practice. By means of further explanation and without limitation the term “diagnosis” generally refers to the process or act of recognizing, deciding on or concluding on a disease or condition in a subject on the basis of symptoms and signs and/or from results of various diagnostic procedures (such as, for example, from knowing the presence, absence and/or quantity of one or more biomarkers characteristic of the diagnosed disease or condition).
The term “monitoring” generally refers to the follow-up of a disease or a condition in a subject for any changes which may occur over time.
The terms “prognosing” or “prognosis” generally refer to an anticipation on the progression of a disease or condition and the prospect (e.g., the probability, duration, and/or extent) of recovery. A good prognosis of the diseases or conditions taught herein may generally encompass anticipation of a satisfactory partial or complete recovery from the diseases or conditions, preferably within an acceptable time period. A good prognosis of such may more commonly encompass anticipation of not further worsening or aggravating of such, preferably within a given time period. A poor prognosis of the diseases or conditions as taught herein may generally encompass anticipation of a substandard recovery and/or unsatisfactorily slow recovery, or to substantially no recovery or even further worsening of such.
The terms also encompass prediction of a disease. The terms “predicting” or “prediction” generally refer to an advance declaration, indication or foretelling of a disease or condition in a subject not (yet) having said disease or condition. For example, a prediction of a disease or condition in a subject may indicate a probability, chance or risk that the subject will develop said disease or condition, for example within a certain time period or by a certain age. Said probability, chance or risk may be indicated inter alia as an absolute value, range or statistics, or may be indicated relative to a suitable control subject or subject population (such as, e.g., relative to a general, normal or healthy subject or subject population). Hence, the probability, chance or risk that a subject will develop a disease or condition may be advantageously indicated as increased or decreased, or as fold-increased or fold-decreased relative to a suitable control subject or subject population. As used herein, the term “prediction” of the conditions or diseases as taught herein in a subject may also particularly mean that the subject has a ‘positive’ prediction of such, i.e., that the subject is at risk of having such (e.g., the risk is significantly increased vis-à-vis a control subject or subject population). The term “prediction of no” diseases or conditions as taught herein in a subject may particularly mean that the subject has a ‘negative’ prediction of such, i.e., that the subject's risk of having such is not significantly increased vis-à-vis a control subject or subject population.
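Expressing risk as fold-increased or fold-decreased relative to a control population amounts to a ratio of two proportions. A minimal sketch follows; the function name and the example counts are hypothetical, and the sketch does not address statistical significance.

```python
def fold_change_in_risk(cases_group, n_group, cases_control, n_control):
    """Risk in a group divided by risk in a control population.

    Values above 1 read as fold-increased risk, values below 1 as
    fold-decreased risk, relative to the control population.
    """
    if n_group <= 0 or n_control <= 0:
        raise ValueError("group sizes must be positive")
    if cases_control <= 0:
        raise ValueError("control risk must be non-zero to form a ratio")
    risk_group = cases_group / n_group
    risk_control = cases_control / n_control
    return risk_group / risk_control


if __name__ == "__main__":
    # Hypothetical counts: 20 of 100 in the group versus 10 of 100 controls.
    print(f"{fold_change_in_risk(20, 100, 10, 100):.1f}-fold")
```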
The term “biomarker” is widespread in the art and commonly broadly denotes a biological molecule, more particularly an endogenous biological molecule, and/or a detectable portion thereof, whose qualitative and/or quantitative evaluation in a tested object (e.g., in or on a cell, cell population, tissue, organ, or organism, e.g., in a biological sample of a subject) is predictive or informative with respect to one or more aspects of the tested object's phenotype and/or genotype (e.g., detecting polyamines). The terms “marker” and “biomarker” may be used interchangeably throughout this specification. Biomarkers as intended herein may be metabolites (e.g., polyamines), nucleic acid-based or peptide-, polypeptide- and/or protein-based. For example, a marker may be comprised of peptide(s), polypeptide(s) and/or protein(s) encoded by a given gene, or of detectable portions thereof. Further, whereas the term “nucleic acid” generally encompasses DNA, RNA and DNA/RNA hybrid molecules, in the context of markers the term may typically refer to heterogeneous nuclear RNA (hnRNA), pre-mRNA, messenger RNA (mRNA), or complementary DNA (cDNA), or detectable portions thereof. Such nucleic acid species are particularly useful as markers, since they contain qualitative and/or quantitative information about the expression of the gene. Particularly preferably, a nucleic acid-based marker may encompass mRNA of a given gene, or cDNA made of the mRNA, or detectable portions thereof. Any such nucleic acid(s), peptide(s), polypeptide(s) and/or protein(s) encoded by or produced from a given gene are encompassed by the term “gene product(s)”.
Preferably, markers as intended herein may be extracellular or cell surface markers (e.g., metabolites), as methods to measure extracellular or cell surface marker(s) need not disturb the integrity of the cell membrane and may not require fixation/permeabilization of the cells.
Unless otherwise apparent from the context, reference herein to any marker, such as a metabolite, peptide, polypeptide, protein, or nucleic acid, may generally also encompass modified forms of said marker, such as bearing post-expression modifications including, for example, phosphorylation, glycosylation, lipidation, methylation, cysteinylation, sulphonation, glutathionylation, acetylation, oxidation of methionine to methionine sulphoxide or methionine sulphone, and the like.
The term “peptide” as used throughout this specification preferably refers to a polypeptide as used herein consisting essentially of 50 amino acids or less, e.g., 45 amino acids or less, preferably 40 amino acids or less, e.g., 35 amino acids or less, more preferably 30 amino acids or less, e.g., 25 or less, 20 or less, 15 or less, 10 or less or 5 or less amino acids.
The term “polypeptide” as used throughout this specification generally encompasses polymeric chains of amino acid residues linked by peptide bonds. Hence, insofar a protein is only composed of a single polypeptide chain, the terms “protein” and “polypeptide” may be used interchangeably herein to denote such a protein. The term is not limited to any minimum length of the polypeptide chain. The term may encompass naturally, recombinantly, semi-synthetically or synthetically produced polypeptides. The term also encompasses polypeptides that carry one or more co- or post-expression-type modifications of the polypeptide chain, such as, without limitation, glycosylation, acetylation, phosphorylation, sulfonation, methylation, ubiquitination, signal peptide removal, N-terminal Met removal, conversion of pro-enzymes or pre-hormones into active forms, etc. The term further also includes polypeptide variants or mutants which carry amino acid sequence variations vis-à-vis a corresponding native polypeptide, such as, e.g., amino acid deletions, additions and/or substitutions. The term contemplates both full-length polypeptides and polypeptide parts or fragments, e.g., naturally-occurring polypeptide parts that ensue from processing of such full-length polypeptides.
The term “protein” as used throughout this specification generally encompasses macromolecules comprising one or more polypeptide chains, i.e., polymeric chains of amino acid residues linked by peptide bonds. The term may encompass naturally, recombinantly, semi-synthetically or synthetically produced proteins. The term also encompasses proteins that carry one or more co- or post-expression-type modifications of the polypeptide chain(s), such as, without limitation, glycosylation, acetylation, phosphorylation, sulfonation, methylation, ubiquitination, signal peptide removal, N-terminal Met removal, conversion of pro-enzymes or pre-hormones into active forms, etc. The term further also includes protein variants or mutants which carry amino acid sequence variations vis-à-vis a corresponding native protein, such as, e.g., amino acid deletions, additions and/or substitutions. The term contemplates both full-length proteins and protein parts or fragments, e.g., naturally-occurring protein parts that ensue from processing of such full-length proteins.
The reference to any marker, including any metabolite, peptide, polypeptide, protein, or nucleic acid, corresponds to the marker commonly known under the respective designations in the art. The terms encompass such markers of any organism where found, and particularly of animals, preferably warm-blooded animals, more preferably vertebrates, yet more preferably mammals, including humans and non-human mammals, still more preferably of humans.
The terms particularly encompass such markers, including any metabolites, peptides, polypeptides, proteins, or nucleic acids, with a native sequence, i.e., ones of which the primary sequence is the same as that of the markers found in or derived from nature. A skilled person understands that native sequences may differ between different species due to genetic divergence between such species. Moreover, native sequences may differ between or within different individuals of the same species due to normal genetic diversity (variation) within a given species. Also, native sequences may differ between or even within different individuals of the same species due to somatic mutations, or post-transcriptional or post-translational modifications. Any such variants or isoforms of markers are intended herein. Accordingly, all sequences of markers found in or derived from nature are considered “native”. The terms encompass the markers when forming a part of a living organism, organ, tissue or cell, when forming a part of a biological sample, as well as when at least partly isolated from such sources. The terms also encompass markers when produced by recombinant or synthetic means.
In certain embodiments, markers, including any metabolites, peptides, polypeptides, proteins, or nucleic acids, may be human, i.e., their primary sequence may be the same as a corresponding primary sequence of or present in a naturally occurring human marker. Hence, the qualifier “human” in this connection relates to the primary sequence of the respective markers, rather than to their origin or source. For example, such markers may be present in or isolated from samples of human subjects or may be obtained by other means (e.g., by recombinant expression, cell-free transcription or translation, or non-biological nucleic acid or peptide synthesis).
The reference herein to any marker, including any metabolite, peptide, polypeptide, protein, or nucleic acid, also encompasses fragments thereof. Hence, the reference herein to measuring (or measuring the quantity of) any one marker may encompass measuring the marker and/or measuring one or more fragments thereof.
For example, any marker and/or one or more fragments thereof may be measured collectively, such that the measured quantity corresponds to the sum amounts of the collectively measured species. In another example, any marker and/or one or more fragments thereof may be measured each individually. The terms encompass fragments arising by any mechanism, in vivo and/or in vitro, such as, without limitation, by alternative transcription or translation, exo- and/or endo-proteolysis, exo- and/or endo-nucleolysis, or degradation of the peptide, polypeptide, protein, or nucleic acid, such as, for example, by physical, chemical and/or enzymatic proteolysis or nucleolysis.
The term “fragment” as used throughout this specification with reference to a peptide, polypeptide, or protein generally denotes a portion of the peptide, polypeptide, or protein, such as typically an N- and/or C-terminally truncated form of the peptide, polypeptide, or protein. Preferably, a fragment may comprise at least about 30%, e.g., at least about 50% or at least about 70%, preferably at least about 80%, e.g., at least about 85%, more preferably at least about 90%, and yet more preferably at least about 95% or even about 99% of the amino acid sequence length of said peptide, polypeptide, or protein. For example, insofar not exceeding the length of the full-length peptide, polypeptide, or protein, a fragment may include a sequence of ≥5 consecutive amino acids, or ≥10 consecutive amino acids, or ≥20 consecutive amino acids, or ≥30 consecutive amino acids, e.g., ≥40 consecutive amino acids, such as for example ≥50 consecutive amino acids, e.g., ≥60, ≥70, ≥80, ≥90, ≥100, ≥200, ≥300, ≥400, ≥500 or ≥600 consecutive amino acids of the corresponding full-length peptide, polypeptide, or protein.
The term “fragment” as used throughout this specification with reference to a nucleic acid (polynucleotide) generally denotes a 5′- and/or 3′-truncated form of a nucleic acid. Preferably, a fragment may comprise at least about 30%, e.g., at least about 50% or at least about 70%, preferably at least about 80%, e.g., at least about 85%, more preferably at least about 90%, and yet more preferably at least about 95% or even about 99% of the nucleic acid sequence length of said nucleic acid. For example, insofar not exceeding the length of the full-length nucleic acid, a fragment may include a sequence of ≥5 consecutive nucleotides, or ≥10 consecutive nucleotides, or ≥20 consecutive nucleotides, or ≥30 consecutive nucleotides, e.g., ≥40 consecutive nucleotides, such as for example ≥50 consecutive nucleotides, e.g., ≥60, ≥70, ≥80, ≥90, ≥100, ≥200, ≥300, ≥400, ≥500 or ≥600 consecutive nucleotides of the corresponding full-length nucleic acid.
Cells such as immune cells as disclosed herein may in the context of the present specification be said to “comprise the expression” or conversely to “not express” one or more markers, such as one or more genes or gene products; or be described as “positive” or conversely as “negative” for one or more markers, such as one or more genes or gene products; or be said to “comprise” a defined “gene or gene product signature”.
Such terms are commonplace and well-understood by the skilled person when characterizing cell phenotypes. By means of additional guidance, when a cell is said to be positive for or to express or comprise expression of a given marker, such as a given gene or gene product, a skilled person would conclude the presence or evidence of a distinct signal for the marker when carrying out a measurement capable of detecting or quantifying the marker in or on the cell. Suitably, the presence or evidence of the distinct signal for the marker would be concluded based on a comparison of the measurement result obtained for the cell to a result of the same measurement carried out for a negative control (for example, a cell known to not express the marker) and/or a positive control (for example, a cell known to express the marker). Where the measurement method allows for a quantitative assessment of the marker, a positive cell may generate a signal for the marker that is at least 1.5-fold higher than a signal generated for the marker by a negative control cell or than an average signal generated for the marker by a population of negative control cells, e.g., at least 2-fold, at least 4-fold, at least 10-fold, at least 20-fold, at least 30-fold, at least 40-fold, at least 50-fold higher or even higher. Further, a positive cell may generate a signal for the marker that is 3.0 or more standard deviations, e.g., 3.5 or more, 4.0 or more, 4.5 or more, or 5.0 or more standard deviations, higher than an average signal generated for the marker by a population of negative control cells.
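By way of illustration only, the two quantitative criteria described above (a fold-difference over negative control cells, and a number of standard deviations above the negative control mean) can be sketched as follows. The function name, argument names, and default thresholds are hypothetical choices made for this sketch; the sample standard deviation is assumed, and the signal units are arbitrary.

```python
from statistics import mean, stdev


def is_marker_positive(signal, negative_controls, min_fold=1.5, min_sd=3.0):
    """Check a cell's marker signal against a population of negative control cells.

    Returns a pair (fold_ok, sd_ok):
      fold_ok: signal is at least `min_fold` times the mean negative control signal
      sd_ok:   signal is at least `min_sd` sample standard deviations above that mean
    The names and defaults are illustrative assumptions, not prescribed values.
    """
    neg_mean = mean(negative_controls)
    neg_sd = stdev(negative_controls)  # sample standard deviation (assumption)
    fold_ok = neg_mean > 0 and signal >= min_fold * neg_mean
    sd_ok = signal >= neg_mean + min_sd * neg_sd
    return fold_ok, sd_ok


# Hypothetical signal intensities for five negative control cells.
controls = [100, 110, 90, 105, 95]
print(is_marker_positive(160, controls))
print(is_marker_positive(120, controls))
```

The two criteria are returned separately because the passage presents them as alternatives that may be applied independently or together.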
A marker, for example a gene or gene product, for example a peptide, polypeptide, protein, or nucleic acid, or a group of two or more markers, is “detected” or “measured” in a tested object (e.g., in or on a cell, cell population, tissue, organ, or organism, e.g., in a biological sample of a subject) when the presence or absence and/or quantity of said marker or said group of markers is detected or determined in the tested object, preferably substantially to the exclusion of other molecules and analytes, e.g., other genes or gene products.
The terms “increased” or “increase” or “upregulated” or “upregulate” as used herein generally mean an increase by a statistically significant amount. For avoidance of doubt, “increased” means a statistically significant increase of at least 10% as compared to a reference level, including an increase of at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100% or more, including, for example at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 10-fold increase or greater as compared to a reference level, as that term is defined herein.
The term “reduced” or “reduce” or “decrease” or “decreased” or “downregulate” or “downregulated” as used herein generally means a decrease by a statistically significant amount relative to a reference. For avoidance of doubt, “reduced” means a statistically significant decrease of at least 10% as compared to a reference level, for example a decrease by at least 20%, at least 30%, at least 40%, at least 50%, or at least 60%, or at least 70%, or at least 80%, at least 90% or more, up to and including a 100% decrease (i.e., absent level as compared to a reference sample), or any decrease between 10-100% as compared to a reference level, as that term is defined herein.
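As a non-limiting sketch, the percentage criteria above can be expressed as a small classification routine. The function names and the `significant` flag are hypothetical; the statistical test that establishes significance is outside the scope of this sketch and is assumed to have been performed separately.

```python
def percent_change(value, reference):
    """Percentage change of `value` relative to a non-zero reference level."""
    if reference == 0:
        raise ValueError("reference level must be non-zero")
    return (value - reference) / reference * 100.0


def classify_change(value, reference, significant, min_percent=10.0):
    """Label a measurement as 'increased', 'reduced', or 'unchanged'.

    `significant` is assumed to be the outcome of a separate statistical test;
    `min_percent` mirrors the 10% minimum stated in the definitions.
    """
    if not significant:
        return "unchanged"
    change = percent_change(value, reference)
    if change >= min_percent:
        return "increased"
    if change <= -min_percent:
        return "reduced"
    return "unchanged"


print(classify_change(150, 100, significant=True))
print(classify_change(0, 100, significant=True))
```

A value of zero against a non-zero reference corresponds to the 100% decrease (absent level) mentioned in the definition of “reduced”.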
The terms “quantity”, “amount” and “level” are synonymous and generally well-understood in the art. The terms as used throughout this specification may particularly refer to an absolute quantification of a marker in a tested object (e.g., in or on a cell, cell population, tissue, organ, or organism, e.g., in a biological sample of a subject), or to a relative quantification of a marker in a tested object, i.e., relative to another value such as relative to a reference value, or to a range of values indicating a base-line of the marker. Such values or ranges may be obtained as conventionally known.
An absolute quantity of a marker may be advantageously expressed as weight or as molar amount, or more commonly as a concentration, e.g., weight per volume or mol per volume. A relative quantity of a marker may be advantageously expressed as an increase or decrease or as a fold-increase or fold-decrease relative to said another value, such as relative to a reference value. Performing a relative comparison between first and second variables (e.g., first and second quantities) may but need not require determining first the absolute values of said first and second variables. For example, a measurement method may produce quantifiable readouts (such as, e.g., signal intensities) for said first and second variables, wherein said readouts are a function of the value of said variables, and wherein said readouts may be directly compared to produce a relative value for the first variable vs. the second variable, without the actual need to first convert the readouts to absolute values of the respective variables.
Reference values may be established according to known procedures previously employed for other cell populations, biomarkers and gene or gene product signatures. For example, a reference value may be established in an individual or a population of individuals characterized by a particular diagnosis, prediction and/or prognosis of said disease or condition (i.e., for whom said diagnosis, prediction and/or prognosis of the disease or condition holds true). Such population may comprise without limitation 2 or more, 10 or more, 100 or more, or even several hundred or more individuals.
A “deviation” of a first value from a second value may generally encompass any direction (e.g., increase: first value>second value; or decrease: first value<second value) and any extent of alteration.
For example, a deviation may encompass a decrease in a first value by, without limitation, at least about 10% (about 0.9-fold or less), or by at least about 20% (about 0.8-fold or less), or by at least about 30% (about 0.7-fold or less), or by at least about 40% (about 0.6-fold or less), or by at least about 50% (about 0.5-fold or less), or by at least about 60% (about 0.4-fold or less), or by at least about 70% (about 0.3-fold or less), or by at least about 80% (about 0.2-fold or less), or by at least about 90% (about 0.1-fold or less), relative to a second value with which a comparison is being made.
For example, a deviation may encompass an increase of a first value by, without limitation, at least about 10% (about 1.1-fold or more), or by at least about 20% (about 1.2-fold or more), or by at least about 30% (about 1.3-fold or more), or by at least about 40% (about 1.4-fold or more), or by at least about 50% (about 1.5-fold or more), or by at least about 60% (about 1.6-fold or more), or by at least about 70% (about 1.7-fold or more), or by at least about 80% (about 1.8-fold or more), or by at least about 90% (about 1.9-fold or more), or by at least about 100% (about 2-fold or more), or by at least about 150% (about 2.5-fold or more), or by at least about 200% (about 3-fold or more), or by at least about 500% (about 6-fold or more), or by at least about 700% (about 8-fold or more), or the like, relative to a second value with which a comparison is being made.
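The correspondence between the percentage and fold values recited above follows from simple arithmetic, which may be sketched as below. The helper names are hypothetical and are used only for illustration.

```python
def increase_to_fold(percent):
    """Fold value of the first value after an increase of `percent` percent."""
    return 1 + percent / 100


def decrease_to_fold(percent):
    """Fold value of the first value after a decrease of `percent` percent."""
    return 1 - percent / 100


# An increase of 150% corresponds to 2.5-fold; a decrease of 50% to 0.5-fold.
print(increase_to_fold(150), decrease_to_fold(50))
```

For instance, an increase of 500% yields 1 + 5 = 6-fold, and a decrease of 90% yields about 0.1-fold, matching the parenthetical values in the text.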
Preferably, a deviation may refer to a statistically significant observed alteration. For example, a deviation may refer to an observed alteration which falls outside of error margins of reference values in a given population (as expressed, for example, by standard deviation or standard error, or by a predetermined multiple thereof, e.g., ±1×SD or ±2×SD or ±3×SD, or ±1×SE or ±2×SE or ±3×SE). Deviation may also refer to a value falling outside of a reference range defined by values in a given population (for example, outside of a range which comprises ≥40%, ≥50%, ≥60%, ≥70%, ≥75% or ≥80% or ≥85% or ≥90% or ≥95% or even ≥100% of values in said population).
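As an illustrative sketch only, the two population-based criteria above (falling outside a multiple of the standard deviation, or outside a reference range comprising a chosen fraction of population values) might be checked as follows. The function names are hypothetical, the reference range is assumed to be central (equal tails on both sides), and percentiles are computed by linear interpolation, which is one of several common conventions.

```python
from statistics import mean, stdev


def outside_sd_band(value, reference_values, k=2.0):
    """True if `value` lies more than k sample standard deviations from the mean."""
    m = mean(reference_values)
    s = stdev(reference_values)
    return abs(value - m) > k * s


def _percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1] (one common convention)."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def outside_central_range(value, reference_values, coverage=0.95):
    """True if `value` lies outside the central range holding `coverage` of values."""
    ordered = sorted(reference_values)
    tail = (1 - coverage) / 2
    low = _percentile(ordered, tail)
    high = _percentile(ordered, 1 - tail)
    return value < low or value > high


# Hypothetical reference population with values 1 through 100.
population = list(range(1, 101))
print(outside_sd_band(120, population, k=2.0))
print(outside_central_range(97, population, coverage=0.90))
```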
In a further embodiment, a deviation may be concluded if an observed alteration is beyond a given threshold or cut-off. Such threshold or cut-off may be selected as generally known in the art to provide for a chosen sensitivity and/or specificity of the prediction methods, e.g., sensitivity and/or specificity of at least 50%, or at least 60%, or at least 70%, or at least 80%, or at least 85%, or at least 90%, or at least 95%.
For example, receiver-operating characteristic (ROC) curve analysis can be used to select an optimal cut-off value of the quantity of a given immune cell population, biomarker or gene or gene product signatures, for clinical use of the present diagnostic tests, based on acceptable sensitivity and specificity, or related performance measures which are well-known per se, such as positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+), negative likelihood ratio (LR−), Youden index, or similar.
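By way of example, selection of a cut-off by maximizing the Youden index (sensitivity + specificity − 1) over candidate thresholds can be sketched as follows. This is a minimal illustration under stated assumptions: labels are 1 for subjects having the condition and 0 otherwise, higher marker quantities are assumed to indicate the condition, and the function name and the example data are hypothetical.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, youden_j, sensitivity, specificity) maximizing Youden's J.

    A measurement is called positive when value >= cutoff. Candidate cut-offs are
    the observed values; ties in J keep the lowest cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be represented")
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        sensitivity = tp / positives
        specificity = tn / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


# Hypothetical marker quantities and condition labels for eight subjects.
quantities = [1, 2, 3, 4, 5, 6, 7, 8]
condition = [0, 0, 0, 1, 0, 1, 1, 1]
print(youden_cutoff(quantities, condition))
```

Other performance measures named above, such as PPV or likelihood ratios, could be computed from the same true-positive and true-negative counts; the Youden index is used here only because it yields a single number to maximize.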
Detection of a biomarker may be by any means known in the art. Methods of detection include, but are not limited to, enzymatic assays, flow cytometry, mass cytometry, fluorescence activated cell sorting (FACS), fluorescence microscopy, affinity separation, magnetic cell separation, microfluidic separation, RNA-seq (e.g., bulk or single cell), quantitative PCR, MERFISH (multiplex (in situ) RNA FISH), immunological assay methods by specific binding between a separable, detectable and/or quantifiable immunological binding agent (antibody) and the marker, mass spectrometry analysis methods, chromatography methods and combinations thereof. Immunological assay methods include without limitation immunohistochemistry, immunocytochemistry, flow cytometry, mass cytometry, fluorescence activated cell sorting (FACS), fluorescence microscopy, fluorescence based cell sorting using microfluidic systems, immunoaffinity adsorption based techniques such as affinity chromatography, magnetic particle separation, magnetic activated cell sorting or bead based cell sorting using microfluidic systems, enzyme-linked immunosorbent assay (ELISA) and ELISPOT based techniques, radioimmunoassay (RIA), Western blot, etc. While particulars of chromatography are well known in the art, for further guidance see, e.g., Meyer M., 1998, ISBN: 047198373X, and “Practical HPLC Methodology and Applications”, Bidlingmeyer, B. A., John Wiley & Sons Inc., 1993.
Exemplary types of chromatography include, without limitation, high-performance liquid chromatography (HPLC), normal phase HPLC (NP-HPLC), reversed phase HPLC (RP-HPLC), ion exchange chromatography (IEC), such as cation or anion exchange chromatography, hydrophilic interaction chromatography (HILIC), hydrophobic interaction chromatography (HIC), size exclusion chromatography (SEC) including gel filtration chromatography or gel permeation chromatography, chromatofocusing, affinity chromatography such as immunoaffinity, immobilized metal affinity chromatography, and the like.
Biomarker detection may also be evaluated using mass spectrometry methods. A variety of configurations of mass spectrometers can be used to detect biomarker values. Several types of mass spectrometers are available or can be produced with various configurations. In general, a mass spectrometer has the following major components: a sample inlet, an ion source, a mass analyzer, a detector, a vacuum system, an instrument-control system, and a data system. Differences in the sample inlet, ion source, and mass analyzer generally define the type of instrument and its capabilities. For example, an inlet can be a capillary-column liquid chromatography source or can be a direct probe or stage such as used in matrix-assisted laser desorption. Common ion sources are, for example, electrospray, including nanospray and microspray, or matrix-assisted laser desorption. Common mass analyzers include a quadrupole mass filter, ion trap mass analyzer and time-of-flight mass analyzer. Additional mass spectrometry methods are well known in the art (see Burlingame et al., Anal. Chem. 70:647R-716R (1998); Kinter and Sherman, New York (2000)).
Protein biomarkers and biomarker values can be detected and measured by any of the following: electrospray ionization mass spectrometry (ESI-MS), ESI-MS/MS, ESI-MS/(MS)n, matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS), surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS), desorption/ionization on silicon (DIOS), secondary ion mass spectrometry (SIMS), quadrupole time-of-flight (Q-TOF), tandem time-of-flight (TOF/TOF) technology, called ultraflex III TOF/TOF, atmospheric pressure chemical ionization mass spectrometry (APCI-MS), APCI-MS/MS, APCI-(MS)n, atmospheric pressure photoionization mass spectrometry (APPI-MS), APPI-MS/MS, and APPI-(MS)n, quadrupole mass spectrometry, Fourier transform mass spectrometry (FTMS), quantitative mass spectrometry, and ion trap mass spectrometry.
Sample preparation strategies are used to label and enrich samples before mass spectroscopic characterization of protein biomarkers and determination of biomarker values. Labeling methods include but are not limited to isobaric tag for relative and absolute quantitation (iTRAQ) and stable isotope labeling with amino acids in cell culture (SILAC). Capture reagents used to selectively enrich samples for candidate biomarker proteins prior to mass spectroscopic analysis include but are not limited to aptamers, antibodies, nucleic acid probes, chimeras, small molecules, an F(ab′)2 fragment, a single chain antibody fragment, an Fv fragment, a single chain Fv fragment, a nucleic acid, a lectin, a ligand-binding receptor, affibodies, nanobodies, ankyrins, domain antibodies, alternative antibody scaffolds (e.g., diabodies etc.), imprinted polymers, avimers, peptidomimetics, peptoids, peptide nucleic acids, threose nucleic acid, a hormone receptor, a cytokine receptor, and synthetic receptors, and modifications and fragments of these.
Immunoassay methods are based on the reaction of an antibody to its corresponding target or analyte and can detect the analyte in a sample depending on the specific assay format. To improve specificity and sensitivity of an assay method based on immunoreactivity, monoclonal antibodies are often used because of their specific epitope recognition. Polyclonal antibodies have also been successfully used in various immunoassays because of their increased affinity for the target as compared to monoclonal antibodies. Immunoassays have been designed for use with a wide range of biological sample matrices. Immunoassay formats have been designed to provide qualitative, semi-quantitative, and quantitative results.
Quantitative results may be generated through the use of a standard curve created with known concentrations of the specific analyte to be detected. The response or signal from an unknown sample is plotted onto the standard curve, and a quantity or value corresponding to the target in the unknown sample is established.
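For illustration, reading an unknown sample off a standard curve can be sketched as a piecewise-linear interpolation between standards of known concentration. This is a simplifying assumption made for the sketch: in practice a curve model (for example, a four-parameter logistic fit) is often used for immunoassays. The function name, the example standards, and the units are hypothetical.

```python
def interpolate_concentration(signal, standards):
    """Estimate the concentration giving `signal` from (concentration, signal) standards.

    Assumes the signal rises monotonically with concentration and interpolates
    linearly between the two standards that bracket the measured signal.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return c0
            return c0 + (signal - s0) * (c1 - c0) / (s1 - s0)
    raise ValueError("signal outside the range of the standard curve")


# Hypothetical standards: (concentration, measured signal) pairs.
standard_curve = [(0, 0.05), (10, 0.25), (20, 0.45), (40, 0.85)]
print(interpolate_concentration(0.35, standard_curve))
```

Signals outside the range spanned by the standards raise an error rather than being extrapolated, since values beyond the calibrated range are generally not reliable.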
Numerous immunoassay formats have been designed. ELISA or EIA can be quantitative for the detection of an analyte/biomarker. This method relies on attachment of a label to either the analyte or the antibody and the label component includes, either directly or indirectly, an enzyme. ELISA tests may be formatted for direct, indirect, competitive, or sandwich detection of the analyte. Other methods rely on labels such as, for example, radioisotopes (I125) or fluorescence. Additional techniques include, for example, agglutination, nephelometry, turbidimetry, Western blot, immunoprecipitation, immunocytochemistry, immunohistochemistry, flow cytometry, Luminex assay, and others (see ImmunoAssay: A Practical Guide, edited by Brian Law, published by Taylor & Francis, Ltd., 2005 edition).
Exemplary assay formats include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay, fluorescent, chemiluminescence, and fluorescence resonance energy transfer (FRET) or time resolved-FRET (TR-FRET) immunoassays. Examples of procedures for detecting biomarkers include biomarker immunoprecipitation followed by quantitative methods that allow size and peptide level discrimination, such as gel electrophoresis, capillary electrophoresis, planar electrochromatography, and the like.
Methods of detecting and/or quantifying a detectable label or signal generating material depend on the nature of the label. The products of reactions catalyzed by appropriate enzymes (where the detectable label is an enzyme; see above) can be, without limitation, fluorescent, luminescent, or radioactive or they may absorb visible or ultraviolet light. Examples of detectors suitable for detecting such detectable labels include, without limitation, x-ray film, radioactivity counters, scintillation counters, spectrophotometers, colorimeters, fluorometers, luminometers, and densitometers.
Any of the methods for detection can be performed in any format that allows for any suitable preparation, processing, and analysis of the reactions. This can be, for example, in multi-well assay plates (e.g., 96 wells or 384 wells) or using any suitable array or microarray. Stock solutions for various agents can be made manually or robotically, and all subsequent pipetting, diluting, mixing, distribution, washing, incubating, sample readout, data collection and analysis can be done robotically using commercially available analysis software, robotics, and detection instrumentation capable of detecting a detectable label.
Such applications are hybridization assays in which a nucleic acid that displays “probe” nucleic acids for each of the genes to be assayed/profiled in the profile to be generated is employed. In these assays, a sample of target nucleic acids is first prepared from the initial nucleic acid sample being assayed, where preparation may include labeling of the target nucleic acids with a label, e.g., a member of a signal producing system. Following target nucleic acid sample preparation, the sample is contacted with the array under hybridization conditions, whereby complexes are formed between target nucleic acids that are complementary to probe sequences attached to the array surface. The presence of hybridized complexes is then detected, either qualitatively or quantitatively. Specific hybridization technology which may be practiced to generate the expression profiles employed in the subject methods includes the technology described in U.S. Pat. Nos. 5,143,854, 5,288,644, 5,324,633, 5,432,049, 5,470,710, 5,492,806, 5,503,980, 5,510,270, 5,525,464, 5,547,839, 5,580,732, 5,661,028, and 5,800,992, the disclosures of which are herein incorporated by reference, as well as International Patent Publication Nos. WO 95/21265, WO 96/31622, WO 97/10365, WO 97/27317, and EP 373203; and EP 785280. In these methods, an array of “probe” nucleic acids that includes a probe for each of the biomarkers whose expression is being assayed is contacted with target nucleic acids as described above. Contact is carried out under hybridization conditions, e.g., stringent hybridization conditions as described above, and unbound nucleic acid is then removed.
The resultant pattern of hybridized nucleic acids provides information regarding expression for each of the biomarkers that have been probed, where the expression information is in terms of whether or not the gene is expressed and, typically, at what level, where the expression data, i.e., expression profile, may be both qualitative and quantitative.
Optimal hybridization conditions will depend on the length (e.g., oligomer vs. polynucleotide greater than 200 bases) and type (e.g., RNA, DNA, PNA) of labeled probe and immobilized polynucleotide or oligonucleotide. General parameters for specific (i.e., stringent) hybridization conditions for nucleic acids are described in Sambrook et al., supra, and in Ausubel et al., “Current Protocols in Molecular Biology”, Greene Publishing and Wiley-Interscience, NY (1987), which is incorporated in its entirety for all purposes. When the cDNA microarrays are used, typical hybridization conditions are hybridization in 5×SSC plus 0.2% SDS at 65° C. for 4 hours followed by washes at 25° C. in low stringency wash buffer (1×SSC plus 0.2% SDS) followed by 10 minutes at 25° C. in high stringency wash buffer (0.1×SSC plus 0.2% SDS) (see Schena et al., Proc. Natl. Acad. Sci. USA, Vol. 93, p. 10614 (1996)). Useful hybridization conditions are also provided in, e.g., Tijssen, “Hybridization With Nucleic Acid Probes”, Elsevier Science Publishers B.V. (1993) and Kricka, “Nonisotopic DNA Probe Techniques”, Academic Press, San Diego, Calif. (1992).
In certain embodiments, the invention involves targeted nucleic acid profiling (e.g., sequencing, quantitative reverse transcription polymerase chain reaction, and the like) (see e.g., Geiss G K, et al., Direct multiplexed measurement of gene expression with color-coded probe pairs. Nat Biotechnol. 2008 March; 26(3):317-25). In certain embodiments, a target nucleic acid molecule (e.g., RNA molecule) may be sequenced by any method known in the art, for example, methods of high-throughput sequencing, also known as next generation sequencing or deep sequencing. A nucleic acid target molecule labeled with a barcode (for example, an origin-specific barcode) can be sequenced with the barcode to produce a single read and/or contig containing the sequence, or portions thereof, of both the target molecule and the barcode. Exemplary next generation sequencing technologies include, for example, Illumina sequencing, Ion Torrent sequencing, 454 sequencing, SOLiD sequencing, and nanopore sequencing amongst others.
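As a simplified illustration of how a single read containing both a barcode and a target sequence may be processed, the following sketch separates each read into its barcode and target portions and groups the target sequences by origin. The layout is an assumption made for this sketch only: the barcode is taken to be a fixed-length prefix at the start of the read and to match a known barcode exactly; real sequencing workflows typically also handle sequencing errors and other read layouts. All names and sequences are hypothetical.

```python
from collections import defaultdict


def group_reads_by_barcode(reads, known_barcodes, barcode_length=8):
    """Group target sequences by the origin assigned to each read's barcode.

    `known_barcodes` maps a barcode sequence to an origin label. Reads whose
    prefix does not exactly match a known barcode are skipped.
    """
    grouped = defaultdict(list)
    for read in reads:
        barcode, target = read[:barcode_length], read[barcode_length:]
        if barcode in known_barcodes:
            grouped[known_barcodes[barcode]].append(target)
    return dict(grouped)


# Hypothetical origin-specific barcodes and reads.
barcodes = {"ACGTACGT": "cell_1", "TTGGCCAA": "cell_2"}
reads = ["ACGTACGTGGGAAATTT", "TTGGCCAACCCTTT", "ACGTACGTCATCAT", "NNNNNNNNAAAA"]
print(group_reads_by_barcode(reads, barcodes))
```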
In certain embodiments, the invention involves single cell RNA sequencing (see, e.g., Kalisky, T., Blainey, P. & Quake, S. R. Genomic Analysis at the Single-Cell Level. Annual review of genetics 45, 431-445, (2011); Kalisky, T. & Quake, S. R. Single-cell genomics. Nature Methods 8, 311-314 (2011); Islam, S. et al. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Research, (2011); Tang, F. et al. RNA-Seq analysis to capture the transcriptome landscape of a single cell. Nature Protocols 5, 516-535, (2010); Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nature Methods 6, 377-382, (2009); Ramskold, D. et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nature Biotechnology 30, 777-782, (2012); and Hashimshony, T., Wagner, F., Sher, N. & Yanai, I. CEL-Seq: Single-Cell RNA-Seq by Multiplexed Linear Amplification. Cell Reports, Volume 2, Issue 3, p666-673, 2012).
In certain embodiments, the invention involves plate based single cell RNA sequencing (see, e.g., Picelli, S. et al., 2014, “Full-length RNA-seq from single cells using Smart-seq2” Nature protocols 9, 171-181, doi:10.1038/nprot.2014.006).
In certain embodiments, the invention involves high-throughput single-cell RNA-seq. In this regard reference is made to Macosko et al., 2015, “Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets” Cell 161, 1202-1214; International patent application number PCT/US2015/049178, published as WO2016/040476 on Mar. 17, 2016; Klein et al., 2015, “Droplet Barcoding for Single-Cell Transcriptomics Applied to Embryonic Stem Cells” Cell 161, 1187-1201; International patent application number PCT/US2016/027734, published as WO2016168584A1 on Oct. 20, 2016; Zheng, et al., 2016, “Haplotyping germline and cancer genomes with high-throughput linked-read sequencing” Nature Biotechnology 34, 303-311; Zheng, et al., 2017, “Massively parallel digital transcriptional profiling of single cells” Nat. Commun. 8, 14049 doi: 10.1038/ncomms14049; International patent publication number WO2014210353A2; Zilionis, et al., 2017, “Single-cell barcoding and sequencing using droplet microfluidics” Nat Protoc. Jan; 12(1):44-73; Cao et al., 2017, “Comprehensive single cell transcriptional profiling of a multicellular organism by combinatorial indexing” bioRxiv preprint first posted online Feb. 2, 2017, doi: dx.doi.org/10.1101/104844; Rosenberg et al., 2017, “Scaling single cell transcriptomics through split pool barcoding” bioRxiv preprint first posted online Feb. 2, 2017, doi: dx.doi.org/10.1101/105163; Rosenberg et al., “Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding” Science 15 Mar. 2018; Vitak, et al., “Sequencing thousands of single-cell genomes with combinatorial indexing” Nature Methods, 14(3):302-308, 2017; Cao, et al., “Comprehensive single-cell transcriptional profiling of a multicellular organism” Science, 357(6352):661-667, 2017; Gierahn et al., “Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput” Nature Methods 14, 395-398 (2017); and Hughes, et al., “Highly Efficient, Massively-Parallel Single-Cell RNA-Seq Reveals Cellular States and Molecular Features of Human Skin Pathology” bioRxiv 689273, all the contents and disclosure of each of which are herein incorporated by reference in their entirety.
In certain embodiments, the invention involves single nucleus RNA sequencing. In this regard reference is made to Swiech et al., 2014, “In vivo interrogation of gene function in the mammalian brain using CRISPR-Cas9” Nature Biotechnology Vol. 33, pp. 102-106; Habib et al., 2016, “Div-Seq: Single-nucleus RNA-Seq reveals dynamics of rare adult newborn neurons” Science, Vol. 353, Issue 6302, pp. 925-928; Habib et al., 2017, “Massively parallel single-nucleus RNA-seq with DroNc-seq” Nat Methods. 2017 October; 14(10):955-958; and International patent application number PCT/US2016/059239, published as WO2017164936 on Sep. 28, 2017, which are herein incorporated by reference in their entirety.
In certain embodiments, the invention involves the Assay for Transposase Accessible Chromatin using sequencing (ATAC-seq) as described (see, e.g., Buenrostro, et al., Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nature methods 2013; 10 (12): 1213-1218; Buenrostro et al., Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486-490 (2015); Cusanovich, D. A., Daza, R., Adey, A., Pliner, H., Christiansen, L., Gunderson, K. L., Steemers, F. J., Trapnell, C. & Shendure, J. Multiplex single-cell profiling of chromatin accessibility by combinatorial cellular indexing. Science. 2015 May 22; 348(6237):910-4. doi: 10.1126/science.aab1601. Epub 2015 May 7; US Patent Publication Nos. US20160208323A1 and US20160060691A1; and International Patent Publication No. WO2017156336A1).
A “pharmaceutical composition” refers to a composition that usually contains an excipient, such as a pharmaceutically acceptable carrier that is conventional in the art and that is suitable for administration to cells or to a subject.
The pharmaceutical composition according to the present invention can, in one alternative, include a prodrug. When a pharmaceutical composition according to the present invention includes a prodrug, prodrugs and active metabolites of a compound may be identified using routine techniques known in the art. (See, e.g., Bertolini et al., J. Med. Chem., 40, 2011-2016 (1997); Shan et al., J. Pharm. Sci., 86 (7), 765-767; Bagshawe, Drug Dev. Res., 34, 220-230 (1995); Bodor, Advances in Drug Res., 13, 224-331 (1984); Bundgaard, Design of Prodrugs (Elsevier Press 1985); Larsen, Design and Application of Prodrugs, Drug Design and Development (Krogsgaard-Larsen et al., eds., Harwood Academic Publishers, 1991); Dear et al., J. Chromatogr. B, 748, 281-293 (2000); Spraul et al., J. Pharmaceutical & Biomedical Analysis, 10, 601-605 (1992); and Prox et al., Xenobiol., 3, 103-112 (1992)).
The term “pharmaceutically acceptable” as used throughout this specification is consistent with the art and means compatible with the other ingredients of a pharmaceutical composition and not deleterious to the recipient thereof.
As used herein, “carrier” or “excipient” includes any and all solvents, diluents, buffers (such as, e.g., neutral buffered saline or phosphate buffered saline), solubilizers, colloids, dispersion media, vehicles, fillers, chelating agents (such as, e.g., EDTA or glutathione), amino acids (such as, e.g., glycine), proteins, disintegrants, binders, lubricants, wetting agents, emulsifiers, sweeteners, colorants, flavorings, aromatizers, thickeners, agents for achieving a depot effect, coatings, antifungal agents, preservatives, stabilizers, antioxidants, tonicity controlling agents, absorption delaying agents, and the like. The use of such media and agents for pharmaceutical active components is well known in the art. Such materials should be non-toxic and should not interfere with the activity of the cells or active components.
The precise nature of the carrier or excipient or other material will depend on the route of administration. For example, the composition may be in the form of a parenterally acceptable aqueous solution, which is pyrogen-free and has suitable pH, isotonicity and stability. For general principles in medicinal formulation, the reader is referred to Cell Therapy: Stem Cell Transplantation, Gene Therapy, and Cellular Immunotherapy, by G. Morstyn & W. Sheridan eds., Cambridge University Press, 1996; and Hematopoietic Stem Cell Therapy, E. D. Ball, J. Lister & P. Law, Churchill Livingstone, 2000.
The pharmaceutical composition can be applied parenterally, rectally, orally or topically. Preferably, the pharmaceutical composition may be used for intravenous, intramuscular, subcutaneous, peritoneal, peridural, rectal, nasal, pulmonary, mucosal, or oral application. In a preferred embodiment, the pharmaceutical composition according to the invention is intended to be used as an infusion. The skilled person will understand that compositions which are to be administered orally or topically will usually not comprise cells, although it may be envisioned for oral compositions to also comprise cells, for example when gastro-intestinal tract indications are treated. Each of the cells or active components (e.g., immunomodulants) as discussed herein may be administered by the same route or may be administered by a different route. By means of example, and without limitation, cells may be administered parenterally and other active components may be administered orally.
Liquid pharmaceutical compositions may generally include a liquid carrier such as water or a pharmaceutically acceptable aqueous solution. For example, physiological saline solution, tissue or cell culture media, dextrose or other saccharide solution or glycols such as ethylene glycol, propylene glycol or polyethylene glycol may be included.
The composition may include one or more cell protective molecules, cell regenerative molecules, growth factors, anti-apoptotic factors or factors that regulate gene expression in the cells. Such substances may render the cells independent of their environment.
Such pharmaceutical compositions may contain further components ensuring the viability of the cells therein. For example, the compositions may comprise a suitable buffer system (e.g., phosphate or carbonate buffer system) to achieve desirable pH, more usually near neutral pH, and may comprise sufficient salt to ensure isoosmotic conditions for the cells to prevent osmotic stress. For example, suitable solution for these purposes may be phosphate-buffered saline (PBS), sodium chloride solution, Ringer's Injection or Lactated Ringer's Injection, as known in the art. Further, the composition may comprise a carrier protein, e.g., albumin (e.g., bovine or human albumin), which may increase the viability of the cells.
Further suitable pharmaceutically acceptable carriers or additives are well known to those skilled in the art and for instance may be selected from proteins such as collagen or gelatine, carbohydrates such as starch, polysaccharides, sugars (dextrose, glucose and sucrose), cellulose derivatives like sodium or calcium carboxymethylcellulose, hydroxypropyl cellulose or hydroxypropylmethyl cellulose, pregelatinized starches, pectin, agar, carrageenan, clays, hydrophilic gums (acacia gum, guar gum, arabic gum and xanthan gum), alginic acid, alginates, hyaluronic acid, polyglycolic and polylactic acid, dextran, pectins, synthetic polymers such as water-soluble acrylic polymer or polyvinylpyrrolidone, proteoglycans, calcium phosphate and the like.
In certain embodiments, a pharmaceutical cell preparation as taught herein may be administered in the form of a liquid composition. In embodiments, the cells or a pharmaceutical composition comprising such cells can be administered systemically, topically, within an organ or at a site of organ dysfunction or lesion.
Preferably, the pharmaceutical compositions may comprise a therapeutically effective amount of the specified immune cells and/or other active components (e.g., immunomodulants). The term “therapeutically effective amount” refers to an amount which can elicit a biological or medicinal response in a tissue, system, animal or human that is being sought by a researcher, veterinarian, medical doctor or other clinician, and in particular can prevent or alleviate one or more of the local or systemic symptoms or features of a disease or condition being treated.
It will be appreciated that administration of therapeutic entities in accordance with the invention will be administered with suitable carriers, excipients, and other agents that are incorporated into formulations to provide improved transfer, delivery, tolerance, and the like. A multitude of appropriate formulations can be found in the formulary known to all pharmaceutical chemists: Remington's Pharmaceutical Sciences (15th ed, Mack Publishing Company, Easton, Pa. (1975)), particularly Chapter 87 by Blaug, Seymour, therein. These formulations include, for example, powders, pastes, ointments, jellies, waxes, oils, lipids, lipid (cationic or anionic) containing vesicles (such as Lipofectin™), DNA conjugates, anhydrous absorption pastes, oil-in-water and water-in-oil emulsions, emulsions carbowax (polyethylene glycols of various molecular weights), semi-solid gels, and semi-solid mixtures containing carbowax. Any of the foregoing mixtures may be appropriate in treatments and therapies in accordance with the present invention, provided that the active ingredient in the formulation is not inactivated by the formulation and the formulation is physiologically compatible and tolerable with the route of administration. See also Baldrick P. “Pharmaceutical excipient development: the need for preclinical guidance.” Regul. Toxicol Pharmacol. 32(2):210-8 (2000), Wang W. “Lyophilization and development of solid protein pharmaceuticals.” Int. J. Pharm. 203(1-2):1-60 (2000), Charman W N “Lipids, lipophilic drugs, and oral drug delivery-some emerging concepts.” J Pharm Sci. 89(8):967-78 (2000), Powell et al. “Compendium of excipients for parenteral formulations” PDA J Pharm Sci Technol. 52:238-311 (1998) and the citations therein for additional information related to formulations, excipients and carriers well known to pharmaceutical chemists.
The medicaments of the invention are prepared in a manner known to those skilled in the art, for example, by means of conventional dissolving, lyophilizing, mixing, granulating or confectioning processes. Methods well known in the art for making formulations are found, for example, in Remington: The Science and Practice of Pharmacy, 20th ed., ed. A. R. Gennaro, 2000, Lippincott Williams & Wilkins, Philadelphia, and Encyclopedia of Pharmaceutical Technology, eds. J. Swarbrick and J. C. Boylan, 1988-1999, Marcel Dekker, New York.
Administration of medicaments of the invention may be by any suitable means that results in a compound concentration that is effective for treating or inhibiting (e.g., by delaying) the development of a disease. The compound is admixed with a suitable carrier substance, e.g., a pharmaceutically acceptable excipient that preserves the therapeutic properties of the compound with which it is administered. One exemplary pharmaceutically acceptable excipient is physiological saline. The suitable carrier substance is generally present in an amount of 1-95% by weight of the total weight of the medicament. The medicament may be provided in a dosage form that is suitable for administration. Thus, the medicament may be in form of, e.g., tablets, capsules, pills, powders, granulates, suspensions, emulsions, solutions, gels including hydrogels, pastes, ointments, creams, plasters, drenches, delivery devices, injectables, implants, sprays, or aerosols.
Administration can be systemic or local. In addition, it may be advantageous to administer the composition into the central nervous system by any suitable route, including intraventricular and intrathecal injection. Pulmonary administration may also be employed by use of an inhaler or nebulizer, and formulation with an aerosolizing agent. It may also be desirable to administer the agent locally to the area in need of treatment; this may be achieved by, for example, and not by way of limitation, local infusion during surgery, topical application, by injection, by means of a catheter, by means of a suppository, or by means of an implant.
Various delivery systems are known and can be used to administer the pharmacological compositions including, but not limited to, encapsulation in liposomes, microparticles, microcapsules; minicells; polymers; capsules; tablets; and the like. In one embodiment, the agent may be delivered in a vesicle, in particular a liposome. In a liposome, the agent is combined, in addition to other pharmaceutically acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, sulfatides, lysolecithin, phospholipids, saponin, bile acids, and the like. Preparation of such liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. Pat. Nos. 4,837,028 and 4,737,323. In yet another embodiment, the pharmacological compositions can be delivered in a controlled release system including, but not limited to: a delivery pump (See, for example, Saudek, et al., New Engl. J. Med. 321: 574 (1989)) and a semi-permeable polymeric material (See, for example, Howard, et al., J. Neurosurg. 71: 105 (1989)). Additionally, the controlled release system can be placed in proximity of the therapeutic target (e.g., a tumor), thus requiring only a fraction of the systemic dose. See, for example, Goodson, In: Medical Applications of Controlled Release, 1984. (CRC Press, Boca Raton, Fla.).
The amount of the agents which will be effective in the treatment of a particular disorder or condition will depend on the nature of the disorder or condition, and may be determined by standard clinical techniques by those of skill within the art. In addition, in vitro assays may optionally be employed to help identify optimal dosage ranges. The precise dose to be employed in the formulation will also depend on the route of administration, and the overall seriousness of the disease or disorder, and should be decided according to the judgment of the practitioner and each patient's circumstances. Ultimately, the attending physician will decide the amount of the agent with which to treat each individual patient. In certain embodiments, the attending physician will administer low doses of the agent and observe the patient's response. Larger doses of the agent may be administered until the optimal therapeutic effect is obtained for the patient, and at that point the dosage is not increased further. Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems. Ultimately the attending physician will decide on the appropriate duration of therapy using compositions of the present invention. Dosage will also vary according to the age, weight and response of the individual patient.
There are a variety of techniques available for introducing nucleic acids into viable cells. The techniques vary depending upon whether the nucleic acid is transferred into cultured cells in vitro, or in vivo in the cells of the intended host. Techniques suitable for the transfer of nucleic acid into mammalian cells in vitro include the use of liposomes, electroporation, microinjection, cell fusion, DEAE-dextran, the calcium phosphate precipitation method, etc. The currently preferred in vivo gene transfer techniques include transfection with viral (typically retroviral) vectors and viral coat protein-liposome mediated transfection.
The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.
Applicants used COMPASS to characterize the metabolic heterogeneity in sorted Th17 cells and identified metabolic pathways associated with pathogenicity (
Applicants validated the association of the polyamine pathway with pathogenic Th17 cells using fluxomics and metabolomics analysis (
Inhibition of the polyamine pathway using 2-(difluoromethyl)ornithine (DFMO) (
Applicants show that suppression of IL-17 by DFMO is dependent on the timing of DFMO treatment (
Treatment of T cells with DFMO decreases expression of SAT1 (
Thus, Applicants found that perturbation of Sat1 partially mimics, and has an additive effect with, DFMO on Th17 cell function, and alleviates EAE.
Cellular metabolism is a mediator and modulator of immune cell differentiation and function. Applicants previously identified lipid biosynthesis as a key regulator of helper Th17 cell function by altering transcriptional activity of Rorgt [1], providing proof of concept that metabolic processes can be directly involved in gene regulation and in balancing proinflammatory and regulatory states of T cells. A full appreciation of metabolic circuitry and its connection with immune cell function is limited by available technology, which typically investigates the average metabolic state of a large population of cells. Applicants developed a novel algorithm (COMPASS) that allows prediction of metabolic fluxes of cells using transcriptome data at the single cell level, allowing comprehensive profiling of how metabolic pathways are interconnected within a cell. Combining this novel tool with metabolomics and functional biology, Applicants investigated Th17 cells in different functional states in the context of EAE and identified the polyamine pathway as a modulator of the epigenetic landscape and function of proinflammatory Th17 cells in autoimmune responses.
Polyamines are polycations, including putrescine (Put), spermidine (Spd) and spermine (Spm), mainly synthesized from ornithine/methionine via ornithine decarboxylase 1 (ODC1) and S-adenosylmethionine decarboxylase (AMD) [2]. Polyamines exist in all kingdoms of life, and single nucleotide polymorphisms resulting in alterations of polyamine metabolism have been implicated in a number of human diseases including mental illness and cancer [3, 4]. Polyamines appear to regulate gene expression, cell proliferation and stress responses due to their ability to bind to nucleic acids (both DNA and RNA), alter posttranslational modification and regulate ion channels [3, 4]. Numerous studies have alluded to a role of polyamines in regulating gene expression due to their polycationic nature and ability to function as a sink for S-adenosylmethionine and acetyl-CoA, both critical metabolites for histone modifications [5-8]. Intracellular polyamines and their analogues are also known to inhibit lysine-specific demethylases such as LSD1 [9]. Despite its relevance in human disease and extensive clinical interest [10, 11], the role of the intrinsic polyamine pathway in immune cells is largely unexplored. Recent work demonstrated that the polyamine pathway and its connection to hypusination can regulate OXPHOS in macrophages [12], suggesting the relevance of this pathway in immune cells. The current study showed that enzymes of the polyamine pathway are suppressed and cellular polyamine content is significantly lower in regulatory T cells and non-pathogenic Th17 cells (Th17n) as compared to Th17 cells in the proinflammatory state (Th17p) due to alternative fluxing. Perturbation of the polyamine pathway in Th17 cells suppressed canonical Th17 cytokines and promoted Foxp3 expression, shifting the Th17 cell transcriptome in favor of a Treg-like state. Applicants demonstrated that the polyamine pathway is critical in maintaining the Th17-specific chromatin landscape against the induction of a Treg-like program. Consistent with the cellular phenotype, chemical inhibition and genetic perturbation of the polyamine pathway in T cells restricted the development of autoimmune responses in the EAE model.
To better analyze the metabolic landscape of Th17 cells in association with their functional state, Applicants first used two approaches: untargeted metabolomics (
To circumvent challenges in identifying metabolic pathways using traditional approach, Applicants next investigated the metabolic circuitry of Th17 cells using COMPASS (
To investigate the polyamine metabolic process (
As the polyamine pathway is regulated beyond the transcriptional level similar to most metabolic pathways, next, Applicants directly measured total cellular polyamine content using an enzymatic assay (Material and Methods). Compared to Th17p cells, Applicants found that Tregs and Th17n have significantly reduced levels of total polyamines (
To further investigate the concentrations and activities of different polyamines in Th17 cells in different functional states, Applicants applied both targeted metabolomics and carbon tracing approaches. Th17n and Th17p cells were differentiated as previously described for 68 hours, and the amounts of polyamines and related precursors in cells and media were measured by LC/MS (
To directly investigate polyamine biosynthesis, Applicants cultured differentiated Th17n and Th17p cells in the presence of a low amount of 13C-labeled arginine, which can be used to synthesize ornithine, a precursor to the polyamine pathway (
To investigate the functional relevance, Applicants used inhibitors of the polyamine pathway and studied their effects on Th17 cells at different functional state differentiated in vitro. Applicants first used difluoromethylornithine (DFMO), a competitive inhibitor of ODC1 (
To determine whether DFMO inhibited Th17 cell differentiation, Applicants measured the expression and activity of transcription factors. Interestingly, DFMO did not consistently alter Rorgt expression (
To determine whether other enzymes of the polyamine pathway could play a similar role in regulating Th17 cell function, Applicants used inhibitors of spermidine synthase (SRM), spermine synthase (SMS), and SAT1 (
Applicants next confirmed that the effect of DFMO is through the inhibition of ODC1 as addition of putrescine to cells treated with DFMO completely reversed their phenotype (
DFMO Restricts Th17-Cell Transcriptome and Epigenome in Favor of Treg-Like State
To gain mechanistic insight into the effects of inhibiting polyamine biosynthesis in Th17 cells, Applicants performed RNAseq on Th17n and Th17p cells and compared them to iTregs, each treated with control or DFMO. DFMO has a profound impact on the transcriptome of all Th cell lineages, clearly driving cells towards Treg cells in principal component analysis (
As Applicants observed Foxp3 upregulation at least in Th17n cells (
The profound impact of DFMO on transcriptome prompted Applicants to investigate the mechanism by which the polyamine pathway regulates Th17 cell functions. As DFMO does not appear to consistently restrict phosphorylation of key Th17 cell regulators (
To directly test the hypothesis that the polyamine pathway may regulate Th17 function by altering histone modification and DNA accessibility, Applicants measured chromatin accessibility by performing ATACseq in Th17p, Th17n and iTreg cells treated with either control or DFMO (Material and Methods). Overall, Applicants observed significant changes in accessible peaks in all Th cells analyzed in response to DFMO treatment (
Next, Applicants asked whether the chromatin accessibility changes could be driving the transcriptome regulation. To this end, Applicants first examined Th17-specific and iTreg-specific genomic regions corresponding to Il17a-Il17f, Il23r and Foxp3 (
Upregulation of Foxp3 in DFMO Treatment is cMAF Dependent
To investigate what transcription factor network may be responsible for the suppression of Th17 specific program and upregulation of iTreg program, Applicants performed motif analysis in the ATACseq peaks using existing ChIPseq data (
Next, Applicants investigated the accessible regions that are iTreg-specific (
To investigate the relevance of the polyamine pathway in vivo (
Applicants first analyzed the role of ODC1 inhibition by adding DFMO in drinking water for mice immunized with MOG/CFA for the induction of EAE (Material and Methods). DFMO significantly delayed EAE onset and severity (
Next, Applicants studied the T cell intrinsic effect of the polyamine pathway in EAE development. Applicants generated mice with conditional deletion of SAT1 in T cells (SAT1fl/flCD4cre). Applicants confirmed that genetic deletion of SAT1 in T cells resulted in loss of polyamine acetylation as reflected in the reduced acetyl-putrescine and acetyl-spermidine (
Mice. C57BL/6 wildtype (WT) mice were obtained from Jackson Laboratory (Bar Harbor, Me.). CD4Cre SAT1flox mice were kindly provided by Dr. Soleimani. For experiments, mice were matched for sex and age, and most mice were 6-10 weeks old. For EAE experiments, littermate WT controls were used in comparison to CD4Cre SAT1flox mice in one experiment, which produced results similar to those obtained with WT mice from Jackson. All experiments were conducted in accordance with animal protocols approved by the Harvard Medical Area Standing Committee on Animals or BWH IACUC.
Single-cell RNAseq data acquisition and analysis. Applicants prepared single-cell mRNA SMART-Seq libraries using microfluidic chips (Fluidigm C1) for single-cell capture, lysis, reverse transcription, and PCR amplification, followed by transposon-based library construction. For quality assurance, Applicants also profiled corresponding population controls (>50,000 cells for in vitro samples; 2,000-20,000 cells for in vivo samples, as available), with at least two replicates for each condition. RNA-seq reads were aligned to the NCBI Build 37 (UCSC mm9) of the mouse genome using TopHat (Trapnell et al., 2009). The resulting alignments were processed by Cufflinks to evaluate the abundance (using FPKM) of transcripts from RefSeq (Pruitt et al., 2007). Applicants used log transform and quantile normalization to further normalize the expression values (FPKM) within each batch of samples (i.e., all single cells in a given run). To account for low (or zero) expression values, Applicants added a value of 1 prior to log transform. Applicants filtered the set of analyzed cells by a set of quality metrics (such as sequencing depth), and added an additional normalization step specifically controlling for these quantitative confounding factors as well as batch effects. The analysis is based on ~7,000 appreciably expressed genes (fragments per kilobase of exon per million (FPKM)>10 in at least 20% of cells in each sample) for in vitro experiments and ~4,000 for in vivo ones. Applicants also developed a strategy to account for expressed transcripts that are not detected (false negatives) due to the limitations of single-cell RNA-seq (Deng et al., 2014; Shalek et al., 2014). The analysis (e.g., computing signature scores and principal components) down-weighted the contribution of less reliably measured transcripts. The ranking of regulators shown in
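The expression filter, log transform and quantile normalization described above can be sketched as follows. This is an illustrative sketch only, not Applicants' pipeline: the function names are hypothetical, the logarithm base (base 2) is an assumption since the text does not state it, and tied values are ranked arbitrarily rather than averaged. The input is assumed to be a genes-by-cells FPKM matrix for one batch.

```python
import numpy as np


def filter_expressed_genes(fpkm, min_fpkm=10.0, min_fraction=0.2):
    """Return a boolean mask over genes (rows of a genes x cells matrix).

    A gene is kept when FPKM > min_fpkm in at least min_fraction of cells,
    mirroring the "FPKM>10 in at least 20% of cells" criterion in the text.
    """
    fraction_expressing = (fpkm > min_fpkm).mean(axis=1)
    return fraction_expressing >= min_fraction


def log_transform(fpkm):
    """Add 1 before taking the log so zero values map to zero.

    Base 2 is an assumption; the text only says "log transform".
    """
    return np.log2(fpkm + 1.0)


def quantile_normalize(values):
    """Give every cell (column) the same distribution of values.

    Each value is replaced by the mean, across cells, of the values
    that share its within-cell rank. Ties are broken arbitrarily.
    """
    order = np.argsort(values, axis=0)
    ranks = np.argsort(order, axis=0)
    mean_by_rank = np.sort(values, axis=0).mean(axis=1)
    return mean_by_rank[ranks]
```

A typical call would apply `log_transform` to the FPKM matrix of one batch, pass the result to `quantile_normalize`, and then subset rows with the mask from `filter_expressed_genes`. The additional normalization for quality metrics and batch effects, and the down-weighting of unreliable transcripts, are not reproduced in this sketch.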
T cell differentiation culture & Flow cytometry. Naïve CD4+CD44-CD62L+CD25- T cells were sorted using a BD FACSAria sorter and activated at a concentration of 5×10^5 cells/ml with plate-bound anti-CD3 (1 μg/ml) and anti-CD28 (2 μg/ml) antibodies in the presence of cytokines. For T cell differentiations the following combinations of cytokines were used: pathogenic Th17: 25 ng/ml rmIL-6, 20 ng/ml rmIL-1b (both Miltenyi Biotec) and 20 ng/ml rmIL-23 (R&D systems); non-pathogenic Th17: 25 ng/ml rmIL-6 and 2 ng/ml of rhTGFb1 (Miltenyi Biotec); iTreg: 2 ng/ml of rhTGFb1; Th1: 20 ng/ml rmIL-12 (R&D systems); Th2: 20 ng/ml rmIL-4 (Miltenyi Biotec). For differentiation experiments, cells were harvested at 72 hours; cultures were performed in the presence or absence of 200 mM DFMO or 2.5 mM putrescine (both Sigma) as indicated.
Intracellular cytokine staining was performed after incubation for 4-6 h with Cell Stimulation cocktail plus Golgi transport inhibitors (Thermo Fisher Scientific) using the BD Cytofix/Cytoperm buffer set (BD Biosciences) per manufacturer's instructions. Transcription factor staining was performed using the Foxp3/Transcription Factor Staining Buffer Set (eBioscience).
Proliferation was assessed by staining with CellTrace Violet (Thermo Fisher Scientific) per manufacturer's instructions. Apoptosis was assessed using Annexin V staining kit (BioLegend). Phosphorylation of proteins, as a readout of cell signaling, was assessed with the BD Phosflow buffer system (BD Biosciences) per manufacturer's instructions.
Legendplex. Cytokine concentrations in supernatants of in vitro cultures were analyzed by the LegendPlex Mouse Th Cytokine Panel (13-plex) (BioLegend) according to the manufacturer's instructions and analyzed on a FACS LSR II (BD Biosciences).
qPCR. RNA was isolated using RNeasy Plus Mini Kit (Qiagen) and reverse transcribed to cDNA with iScript cDNA Synthesis Kit (Bio-Rad). Gene expression was analyzed by quantitative real-time PCR on a ViiA7 System (Thermo Fisher Scientific) using TaqMan Fast Advanced Master Mix (Thermo Fisher Scientific) with the following primer/probe sets: Il-17a (Mm00439618_m1), Il-17f (Mm00521423_m1), Foxp3 (Mm00475162_m1), Tead1 (Mm00493507_m1), Taz (Mm00504978_m1), Sat1 (Mm00485911_g1) and Actb (Applied Biosystems). Expression values were calculated relative to Actb detected in the same sample by duplex qPCR.
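One common way to express a target gene relative to a reference gene such as Actb is the 2^(-ΔCt) calculation. The text does not state which formula was used, so the following is a sketch under that assumption, and it further assumes roughly 100% amplification efficiency for both assays. The function name `relative_expression` is hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression of a target gene relative to a reference gene.

    Uses 2 ** -(Ct_target - Ct_reference), which assumes that the
    amount of product doubles every cycle for both assays.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)
```

For example, a target detected five cycles later than Actb in the same well would be reported at 2^(-5), or about 3% of the Actb signal.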
Antibodies. All other flow cytometry antibodies were purchased from BioLegend.
Experimental Autoimmune Encephalomyelitis (EAE). For active EAE immunization, MOG35-55 peptide was emulsified in complete Freund's adjuvant (CFA). The equivalent of 40 μg MOG peptide was injected subcutaneously per mouse, followed by intravenous pertussis toxin injection on day 0 and day 2 of immunization. Mice were treated with 0.5% DFMO in drinking water for 10 days as indicated. DFMO was replenished every third day.
Suppression assay. Freshly isolated naïve CD4+ T cells (4×10^4) were stained with CellTrace Violet as described above and cultured with various differentiated T cells in a 1:1 or 2:1 ratio in 96-well round-bottom plates (Corning Inc.) in the presence of anti-CD3/CD28 beads (50×10^3 beads/well; Invitrogen). After 72 hrs, cells were analyzed to assess proliferation.
RNA-seq. For population (bulk) RNA-seq, in vitro differentiated T-cells were sorted for live cells and lysed with RLT Plus buffer and RNA was extracted using the RNeasy Plus Mini Kit (Qiagen). Full-length RNA-seq libraries were prepared as previously described [27] and paired-end sequenced (75 bp×2) with a 150 cycle Nextseq 500 high output V2 kit.
ATAC-seq. For population ATAC-seq, in vitro differentiated T-cells were sorted for live cells and frozen in Bambanker freezing media (Thermo Fisher Scientific).
Alignment of ATAC-Seq and Peak Calling. All ATAC-Seq reads were trimmed using Trimmomatic [28] to remove primer and low-quality bases. Reads <36 bp were dropped. Reads were then passed to FastQC [www.bioinformatics.babraham.ac.uk/projects/fastqc/] to check the quality of the trimmed reads. The paired-end reads were then aligned to the mm10 reference genome using bowtie2 [29], allowing maximum insert sizes of 2000 bp, with the “--no-mixed” and “--no-discordant” parameters added. Reads with a mapping quality (MAPQ) below 30 were removed. Duplicates were removed with PicardTools, and the reads mapping to the blacklist regions and mitochondrial DNA were also removed. Reads mapping to the positive strand were shifted +4 bp, and reads mapping to the negative strand were shifted −5 bp, following the procedure outlined in [30] to account for the binding of the Tn5 transposase.
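The Tn5 offset correction described above can be sketched in a few lines. This is only an illustration: the function name shift_tn5 and the (start, end, strand) tuple representation of a read are hypothetical and are not taken from the pipeline used here.

```python
def shift_tn5(start, end, strand):
    """Shift an aligned read to account for Tn5 binding.

    Positive-strand reads move +4 bp and negative-strand reads move -5 bp,
    as stated in the text. The (start, end, strand) layout is an assumption
    made for this sketch.
    """
    if strand == "+":
        offset = 4
    elif strand == "-":
        offset = -5
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    return start + offset, end + offset, strand


# Two hypothetical reads, one per strand.
reads = [(100, 150, "+"), (300, 350, "-")]
shifted = [shift_tn5(*read) for read in reads]
```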
Peaks were called using macs2 on the aligned fragments [31] with a q-value cutoff of 0.001, and overlapping peaks among replicates were merged.
Tests of Differential Accessibility in ChARs. Differential accessibility was assessed using DESeq2 [32], with a peaks-by-samples matrix replacing the genes-by-samples matrix. Counts of Tn5 cuts were used instead of gene expression values. Peaks were considered differentially accessible if they had an adjusted p-value <0.05.
Alignment of ChIP-Seq and Peak Calling. ChIP-Seq peaks from Xiao et al. 2014 [17] were transferred from mm9 to mm10 using the UCSC liftOver tool. Xiao et al. 2014, RORγt: www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1350476; Xiao et al. 2014, Foxp3: www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1350486
ChIP-Seq replicates from Ciofani et al. 2012 were downloaded and trimmed using Trimmomatic [28] to remove primer and low-quality bases. Reads were then passed to FastQC [www.bioinformatics.babraham.ac.uk/projects/fastqc/] to check the quality of the trimmed reads. These single-end reads were then aligned to the mm10 reference genome using bowtie2 [29], allowing maximum insert sizes of 2000 bp, with the “--no-mixed” and “--no-discordant” parameters added. Reads with a mapping quality (MAPQ) below 30 were removed. Duplicates were removed with PicardTools, and the reads mapping to the blacklist regions and mitochondrial DNA were also removed.
ChIP-Seq peaks were called in each replicate, versus a control sample, using macs2 [31] with a q-value cutoff of 0.05.
Statistical Analysis. Unless otherwise specified, all statistical analyses were performed using the two-tailed Student's t-test in GraphPad Prism software. A P value of less than 0.05 was considered significant (P<0.05=*; P<0.01=**; P<0.001=***) unless otherwise indicated.
Applicants used COMPASS to identify additional pathways associated with Th17 pathogenicity. COMPASS predicted reactions in the glycolysis pathway that were positively and negatively associated with Th17 pathogenicity.
Applicants validated four of the reactions associated with pathogenicity of Th17 cells in the glycolysis pathway using inhibitors for each.
Cellular metabolism is both a mediator and a regulator of cell functions. Metabolic alterations are key in healthy cellular processes, such as differentiation, but also in conditions such as cancer and ageing. Recently, the study of metabolism in immune cells (immunometabolism) has gained particular attention as the significance of intracellular metabolic phenotypes in viral-specific responses, autoimmunity, and cancer immunotherapy became evident [1-4].
The rapid advances in single-cell RNA-Sequencing (scRNA-Seq) enable novel ways of exploring the role of metabolism in health and disease, namely by studying cell-to-cell metabolic heterogeneity. In a recent review [5], Applicants suggested that a cell's molecular contents, as measured by scRNA-Seq, for example, are the product of the instantaneous intersection of multiple biological factors, or vectors, that affected the cell. Specialized computational methods are needed to glean the unique information that can be inferred from single-cell data, while overcoming its challenges, such as sparsity due to dropout.
Here, Applicants address this challenge in the realm of cellular metabolism. Applicants present Compass, a novel algorithm to characterize and interpret the metabolic heterogeneity among cells in a quantitative and unsupervised manner. Compass belongs to the family of Flux Balance Analysis (FBA) algorithms [6-8]. It leverages a priori knowledge of the metabolic network's topology and stoichiometry in combination with the single-cell resolution and statistical power afforded by scRNA-Seq to map cell-to-cell metabolic heterogeneity and discover metabolic correlates of phenotypes of interest.
To demonstrate Compass's utility, Applicants analyze data from murine T helper 17 (Th17) cells, which are a heterogeneous cell type. On the one hand, Th17 cells are potent inducers of tissue inflammation in autoimmune disorders, among which are multiple sclerosis (MS) and inflammatory bowel disease (IBD) [9,10]. On the other hand, they can play a protective role in promoting gut homeostasis and barrier functions [11,12]. Importantly, their effector functions, similar to those of other CD4+ and CD8+ T cell subsets, are tightly linked to their metabolic state [13-18]. For these reasons, Th17 metabolism presents compelling questions that can be addressed via scRNA-Seq and Compass analyses.
Applicants show that Compass predicts known and novel associations between Th17 cells' metabolic state and their pathogenicity. Th17 pathogenicity is the cells' capability to trigger autoimmune disease, which Applicants quantify with a transcriptomic (non-metabolic) signature [19]. Applicants demonstrate both inter-group and intra-group analysis, i.e., both a comparative analysis of differences between two Th17 differentiation protocols, and an association study within a seemingly homogenous group of cells that were all differentiated using the same protocol. Notably, while previous studies in Th17 cells [16,20] as well as in other immune cells [21-24] have all linked higher glycolytic activity with a pro-inflammatory cell state, Compass predicts, and subsequent assays validated, that phosphoglycerate mutase (PGAM), a central reaction on the glycolysis path, is negatively associated with Th17 pathogenicity.
Metabolites: G=glucose; G6P=glucose 6-phosphate; F6P=fructose 6-phosphate; F1,6BP=fructose 1,6-bisphosphate; GAP=glyceraldehyde 3-phosphate; DHAP=dihydroxyacetone phosphate (also called glycerone phosphate); 1,3 BPG=1,3-bisphosphoglycerate; 3PG=3-phosphoglycerate; 2PG=2-phosphoglycerate; PEP=phosphoenolpyruvate; P=pyruvate; Lac=lactate; AcCoA=acetyl-CoA; OA=oxaloacetate; TCA=tricarboxylic acid cycle; GL3P=sn-Glycerol 3-phosphate; GL=glycerol; DGL6P=D-glucono-1,5-lactone 6-phosphate; Ru5P=ribulose 5-phosphate; R5P=ribose-5-phosphate; X5P=xylulose 5-phosphate; S7P=sedoheptulose 7-phosphate.
Reactions: G6PD=glucose 6-phosphate dehydrogenase (EC 1.1.1.49); PKM=pyruvate kinase (EC 2.7.1.40); PGAM=phosphoglycerate mutase (EC 5.4.2.1); GK=glycerol kinase (EC 2.7.1.30); KYAT=kynurenine-oxoglutarate transaminase (EC 2.6.1.7); ALTA=alanine transaminase (EC 2.6.1.2); GDH=glycerol dehydrogenase (EC 1.1.1.72); SPT=serine-pyruvate transaminase (EC 2.6.1.51).
Inhibitors: DHEA=dehydroepiandrosterone; EGCG=epigallocatechin-3-gallate; DCA=3-dihydroxypropyl 2,2-dichloroacetate.
Compass integrates scRNA-Seq profiles with prior knowledge of the metabolic network to infer a cell's metabolic state.
More rigorously, Compass scores reflect the propensity of cells to use certain reactions. Advances in scRNA-Seq provide scalable methods to count transcripts comprehensively and at single-cell resolution [26,27], in ways that are not yet possible for other molecules, such as proteins. Therefore, studies often turn to gene expression in order to explore changes in cellular metabolic states. However, expression of a gene coding for a certain enzyme does not always correlate with actual reaction flux [28,29], e.g., due to post-transcriptional or post-translational modifications. Pathway-based analysis mitigates this concern by pooling information across genes and consequently enhancing robustness in the face of expression measurement noise, but it relies on a predetermined set of canonical metabolic pathways that do not fully capture the complexity of the metabolic network [30,31]. Compass bridges this gap by using in silico modeling that helps determine which reactions are most likely promoted by the entire metabolic transcriptome. Further, Compass does not rely on predetermined pathway definitions, but derives metabolic pathways based on the observed data in an unsupervised manner.
Compass belongs to the family of Flux Balance Analysis (FBA) algorithms that model metabolic fluxes, namely the rate by which the substrates of a chemical reaction are converted to the reaction's products [32]. Its definition relies on a choice of an arbitrarily large set of arbitrary FBA objectives. For simplicity, Applicants defer the general definition to the Methods section, and instead describe a useful special case in which the objectives represent single reactions. For each reaction, Compass determines the maximal flux it can carry, and then scores how well a cell's network-wide transcriptome is aligned with the objective of carrying that flux. Intuitively, Compass assumes that if the network-wide transcriptome of a particular cell supports carrying a large flux on a particular reaction, then this reaction is most likely active in the cell, even if its particular enzyme-coding gene is lowly expressed. Thus, a score reflects the propensity of a particular cell to use a particular reaction, which Applicants interpret as a proxy for the activity level of that reaction in that cell.
The framework allows formulating the aforementioned computation as a linear program and solving it efficiently. Like GIMME [33], Compass penalizes reactions inversely to the expression of mRNA associated with their enzymes (making the simplistic, yet common, modeling assumption [34] that mRNA levels correlate with enzymatic activity). The Compass score c_{r,i} of reaction r in cell i is the minimal network-level penalty subject to constraining the GEM of i to carry its maximal possible flux through r (up to a multiplicative slack factor). It therefore reflects how well the transcriptome of cell i is aligned with the objective of carrying high flux through r.
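The two-step linear program described above can be illustrated on a toy network. The following is a minimal sketch under stated assumptions, not the Compass implementation: the four-reaction network, the flux bounds, the expression values, the slack factor of 0.95 and the function name compass_penalty are all hypothetical, and scipy.optimize.linprog stands in for whatever solver is used in practice.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical unidirectional toy network: uptake -> A, A -> B via r1 or r2, B -> export.
# Rows are metabolites (A, B); columns are reactions (uptake, r1, r2, export).
S = np.array([
    [1.0, -1.0, -1.0, 0.0],
    [0.0, 1.0, 1.0, -1.0],
])
bounds = [(0.0, 10.0)] * S.shape[1]   # assumed flux bounds
steady_state = np.zeros(S.shape[0])   # S v = 0

# Assumed log-scale reaction expression for one cell, and penalties p(r) = 1 / (1 + r).
expression = np.array([1.0, 3.0, 0.0, 1.0])
penalty = 1.0 / (1.0 + expression)


def compass_penalty(target, omega=0.95):
    """Minimal total penalty while the target reaction carries >= omega * its maximal flux."""
    n = S.shape[1]
    # Step 1: maximal flux through the target reaction (linprog minimizes, so negate).
    objective = np.zeros(n)
    objective[target] = -1.0
    best = linprog(objective, A_eq=S, b_eq=steady_state, bounds=bounds)
    v_max = -best.fun
    # Step 2: minimize the penalty-weighted flux subject to v[target] >= omega * v_max.
    row = np.zeros((1, n))
    row[0, target] = -1.0
    result = linprog(penalty, A_ub=row, b_ub=[-omega * v_max],
                     A_eq=S, b_eq=steady_state, bounds=bounds)
    return result.fun


score_r1 = compass_penalty(target=1)  # branch with high expression
score_r2 = compass_penalty(target=2)  # branch with no expression
```

In this toy case the highly expressed branch r1 receives the lower penalty, which corresponds to a higher propensity once penalties are negated into scores.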
Compass leverages the statistical power afforded by the large number of observations (i.e., single cells) in a typical scRNA-Seq study. This power allows downstream analysis to gain biological insight despite the high dimension of the metabolic space in which Compass embeds cells. However, scRNA-Seq presents unique challenges due to the small quantity of RNA that can be extracted from a single cell [5]. Sampling bias and transcription stochasticity lead to an abundance of dropouts, i.e., false-negative gene detections, and to variance overestimation of lowly expressed genes, leading in turn to false-positive differential expression. Similar to other scRNA-Seq algorithms, Compass mitigates these effects with an information-sharing approach [35-37]. Instead of treating each cell in isolation, the flux vector for each cell is determined by balancing its own gene expression with that of its k-nearest neighbors based on similarity of their RNA profiles (Methods).
Th17 functional diversity can be studied in vitro by polarizing the cells with either IL-1β+IL-6+IL-23 or TGF-β1+IL-6, which upon adoptive transfer into wildtype mice lead to severe or mild-to-none experimental autoimmune encephalomyelitis (EAE), respectively [38,39]. Applicants name those states “pathogenic” (Th17p) and “non-pathogenic” (Th17n), respectively. As described previously [19], Applicants sequenced naïve CD4+ T cells 48 hrs post polarization under one of these conditions, ultimately retaining after quality tests 130 unsorted Th17n cells (henceforth Th17nu), 151 IL-17A/GFP+ Th17n cells, and 139 IL-17A/GFP+ Th17p cells. In this study, Applicants analyze the unsorted and sorted cells independently from one another. The unsorted cells are used to discover cell-to-cell heterogeneity within a seemingly homogenous population, whereas the sorted cells are used for a comparative study of the two polarizing conditions.
Beginning with the unsorted Th17nu population, Applicants computed the Compass score for each metabolic reaction in each of the cells (Methods), producing a Compass-score matrix of 6,563 reactions × 130 cells. Applicants hierarchically clustered the reactions (i.e., rows of the matrix) and merged reactions that were highly correlated across the entire dataset (Spearman rho≥0.98) into meta-reactions. This resulted in a Compass-score matrix of 1,730 meta-reactions × 130 cells, with 76% of the meta-reactions composed of 3 reactions or fewer.
Gene expression analysis treats cells as points in a high-dimensional vector space whose coordinates correspond to genes (or transcripts). Similarly, the Compass output allows studying the cells in a high-dimensional metabolic space whose coordinates correspond to meta-reactions. The first principal component (PC) of the metabolic (Compass) space corresponded to the cell's metabolic activity.
To predict metabolic regulators of Th17 pathogenicity, Applicants ranked metabolic reactions according to their correlation with a computational transcriptome signature of Th17 pathogenicity in Th17nu and Th17n.
Surprisingly, not all glycolytic reactions were correlated with Th17 pathogenicity. To test this prediction, Applicants picked the top two reactions that were most positively correlated and the top two that were most negatively correlated with the pathogenicity score.
Applicants proceeded to sequence RNA libraries from Th17n and Th17p cells under two inhibitors, DHEA and EGCG, whose corresponding reactions were predicted to be the most pro- and anti-pathogenic, and which were indeed found to significantly suppress or promote IL-17 expression, respectively. A PCA of gene expression confirmed the validity of the dataset.
In conclusion, Compass correctly predicted metabolic targets whose deletion affected Th17 pathogenicity. Importantly, it was able to pinpoint a glycolytic reaction that suppresses Th17 pathogenicity, which runs contrary to the ubiquitous observation that aerobic glycolysis is associated with an activated T cell state.
The division of the metabolic network into functional pathways, namely groups of topologically adjacent reactions thought to operate coherently, is indispensable in the study of metabolism. Nonetheless, the canonical textbook pathways may not translate between different cellular environments, between organisms, or between healthy and diseased (e.g., cancerous) cells. Carbon tracing studies have already observed metabolic flows that are contrary to long-standing beliefs [44], as well as inter-cellular division of labor across canonical pathways [45].
Applicants therefore suggest data-driven metabolic pathways as a valuable data-exploratory tool. Learning pathways from the data affords complementary strengths to the ubiquitously employed enrichment analysis versus collections of a priori defined pathways, such as KEGG or MetaCyc. Previous studies have suggested this concept [31,46] but did not benefit from the statistical power of scRNA-Seq available through Compass.
To find data-driven pathways in a given set of cells, Applicants define the distance between metabolic reactions based on cosine dissimilarity of their Compass profiles across the set, and use it to construct a k-nearest neighbor (kNN) graph over the set of metabolic reactions (Methods). Communities in the graph, found for example by the Louvain algorithm, are defined as the data-driven pathways.
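The graph-construction step described above can be sketched as follows. This is a hedged illustration only: the function name cosine_knn_graph, the toy profiles, the symmetrization of the graph and the choice of k are assumptions, and rows are assumed to have non-zero norm. Community detection (for example with a Louvain implementation from a graph library) would then be run on the resulting adjacency matrix; that step is not shown.

```python
import numpy as np


def cosine_knn_graph(C, k):
    """Symmetric kNN adjacency over reactions (rows of C) using cosine dissimilarity."""
    unit = C / np.linalg.norm(C, axis=1, keepdims=True)   # assumes no all-zero rows
    dissim = 1.0 - unit @ unit.T                          # cosine dissimilarity
    np.fill_diagonal(dissim, np.inf)                      # a reaction is not its own neighbor
    nbrs = np.argsort(dissim, axis=1)[:, :k]              # k closest reactions per row
    adj = np.zeros(dissim.shape, dtype=bool)
    rows = np.repeat(np.arange(C.shape[0]), k)
    adj[rows, nbrs.ravel()] = True
    return adj | adj.T                                    # symmetrize (an assumption)


# Hypothetical Compass profiles: 4 reactions across 3 cells.
profiles = np.array([
    [1.0, 0.0, 0.1],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
])
graph = cosine_knn_graph(profiles, k=1)
```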
Applicants applied this procedure separately to Th17n and Th17p cells to learn the difference in their metabolic rewiring in an unsupervised manner.
To study the inhibitors' effects in vivo, Applicants activated 2D2 TCR-transgenic Th17 cells in the presence of an inhibitor or vehicle and adoptively transferred them back to test animals. In accordance with computational predictions, EGCG-treated Th17n cells successfully induced EAE, whereas untreated cells failed to produce any consequential neuroinflammation.
Applicants presented Compass, a flux balance algorithm for the study of metabolic heterogeneity among cells based on single-cell transcriptome profiles. The algorithm is applicable to any cell type whose transcriptome can be sequenced. Applicants used it to analyze a Th17 dataset and look for metabolic correlates of a transcriptomic pathogenicity signature. Compass correctly predicted a glycolytic reaction that, contrary to common understanding, promotes a pro-regulatory rather than a pro-inflammatory phenotype, as well as a pro-inflammatory role for the polyamine pathway that is studied in depth in an accompanying manuscript.
For computational tractability, static FBA algorithms assume that the system operates in chemical steady state (Varma and Palsson 1994). Even under this assumption, there remains an infinite number of feasible flux distributions (assignments of an activity level to each reaction in the network) that satisfy the preset biochemical constraints (Methods). Therefore, most studies assume that the system (here, a cell) aims to optimize some metabolic function, usually production of biomass or ATP [47]. However, whereas such objectives may successfully predict unicellular organisms' phenotypes [48], they are ill-suited for studying mammalian cells [49]. To overcome this challenge, rather than optimizing a single metabolic objective function, Compass optimizes an arbitrarily large set of arbitrary objectives that together capture multiple facets of the cell's metabolic capabilities. The vector of optimal values obtainable in these objectives represents a cell as a point in a space whose dimension is the set's size, which Applicants denote the Compass space. A biological signal can be detected in this high-dimensional space owing to the statistical power afforded by the large number of sequenced libraries in a typical scRNA-Seq study. Nonetheless, there is no obstacle preventing one from running Compass on bulk RNA data (typically while setting the parameter lambda to 0 to prevent information sharing between RNA libraries) as an exploratory analysis method.
The metabolic reconstruction Applicants employed represents the overall metabolic capabilities of a human cell. As such, it contains reactions that may not be available to the studied cell type—a concern that can be remedied to some extent by procedures for deriving organ-specific metabolic models (Opdam et al. 2017). Moreover, Applicants used the network to study murine data because no recent and equally validated reconstruction exists for mouse. Last, the metabolic profile of a cell depends on the nutrients available in its environment, which are often poorly characterized. The computations are based on a rich in silico environment, and modifying the latter to better represent physiological conditions should increase the algorithm's predictive capabilities.
Software. Compass is available at github.com/YosefLab/Compass
The algorithm is highly parallelizable. It currently supports execution on multiple threads in a single machine, submission to a Torque queue, and execution on a single machine on Amazon Web Services (AWS).
Compass algorithm. Compass transforms a gene expression matrix G, where rows represent genes and columns represent RNA libraries (usually, single cells), into a matrix C of Compass scores where rows represent metabolic reactions, columns are the same RNA libraries as in the gene expression matrix, and each entry quantifies a proxy for the reaction's activity level. More precisely, the entry quantifies the propensity of the cell to use that reaction, as formalized below.
For clarity, Applicants provide here a high-level description, and defer the exact formulation to the Supplementary Methods. Applicants slightly abuse notation in writing ƒ(M) for a matrix M=(m_{i,j}) and a function ƒ: ℝ → ℝ to denote the entry-wise transformation ƒ(M) := (ƒ(m_{i,j})), where the intention is obvious from the context.
Select a genome-scale metabolic network (GEM) according to the dataset in question. Pick an arbitrary set of m linear objective functions over the space of reaction flux distributions. Let p(r) be a monotonically decreasing penalty function defined on [0, ∞). Applicants used p(r) := 1/(1+r). Let G be the input gene expression matrix. Only metabolic genes, i.e., ones annotated in the metabolic network, are used. The main steps are:
Preprocessing: for computational tractability, the number of cells in G can be reduced by downsampling or, preferably, micropooling (see below).
Postprocessing: Normalize Craw. Importantly, this step negates the matrix in order to transform the penalties into proxies for metabolic activity. It may also merge similar rows (objectives that resulted in similar profiles across the cells). The resulting m′×n (m′≤m) matrix C is the Compass matrix. C embeds the gene expression profiles in ℝ^{m′}.
Metabolic network and choice of objective functions. Applicants used the Recon2 GEM [25], which Applicants transformed to a unidirectional network by replacing bidirectional reactions with the respective pair of unidirectional reactions. Throughout this application, metabolic genes are defined as the set of genes annotated in Recon2.
The results of flux balance analysis significantly depend on the nutrients made available to the GEM, referred to as the in silico growth medium. Since exact medium composition is mostly unknown even for common in vitro protocols, and certainly unknown in vivo, Applicants chose a rich in silico medium where all nutrients are made available.
Applicants used an intuitive set of objective functions—for each reaction in the network, Applicants defined one objective function which is to maximize the flux it carries. This allows intuitive interpretation of the Compass scores as quantitative proxies to reaction activities. Some of these objective functions need not be computed in practice because their respective reactions are blocked, namely there exists no feasible solution in which they carry non-zero flux.
Preprocessing. For prohibitively large datasets, Applicants recommend using a micropooling approach [50] as implemented in the VISION R package (github.com/YosefLab/VISION), in which small numbers of similar cells are grouped together, their transcriptomic profiles are averaged, and the averaged profile is treated as a single cell in subsequent analysis. This was not necessary for the datasets presented in this application.
Smoothing and information sharing between cells. Applicants computed a k-nearest neighbors (kNN) graph (k=5) based on Euclidean cell-to-cell distances in a reduced-dimension representation (top 20 PCs) of the gene expression space. A cell's neighborhood for the purpose of computing R^N was its k neighbors. Compass also supports Gaussian kernel smoothing instead of the kNN approach (Supplementary Methods).
Note that Compass easily accommodates bulk RNA data (i.e., standard RNA-Seq where libraries represent many cells) and microarrays by setting λ=0. Applicants chose λ=0.25 for single-cell datasets and λ=0 for bulk RNA libraries.
Postprocessing. Using the objective functions defined above, every row in Craw represents a penalty for maximizing or minimizing the flux on a certain unidirectional metabolic reaction. Applicants hierarchically clustered the rows by Spearman distance (namely 1−ρ, with ρ being Spearman's correlation), and merged leaves whose rows were highly correlated (ρ≥0.98 for the Th17 dataset, as stated above) by averaging the respective rows. Applicants call the resulting clusters meta-reactions, and each represents a set of closely correlated metabolic reactions. Importantly, the division into meta-reactions is data-driven and does not rely on canonical metabolic pathway definitions.
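The merging of correlated rows into meta-reactions can be sketched as below. This is an illustration under assumptions rather than the procedure as implemented: complete linkage is an assumed choice (the linkage criterion is not stated), the function name merge_meta_reactions and the toy matrix are hypothetical, and rows are assumed to be non-constant so that Spearman's correlation is defined.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import rankdata


def merge_meta_reactions(C_raw, rho_min=0.98):
    """Cluster rows by Spearman distance (1 - rho) and average the rows of each cluster."""
    ranks = np.vstack([rankdata(row) for row in C_raw])  # rank each reaction across cells
    rho = np.corrcoef(ranks)                             # Spearman = Pearson on ranks
    dist = np.clip(1.0 - rho, 0.0, None)
    dist = (dist + dist.T) / 2.0                         # remove floating-point asymmetry
    np.fill_diagonal(dist, 0.0)
    # Complete linkage (an assumption) ensures every pair in a cluster has rho >= rho_min.
    tree = linkage(squareform(dist, checks=False), method="complete")
    labels = fcluster(tree, t=1.0 - rho_min, criterion="distance")
    meta = np.vstack([C_raw[labels == k].mean(axis=0) for k in np.unique(labels)])
    return meta, labels


# Hypothetical penalties: rows 0 and 1 are perfectly rank-correlated, row 2 is not.
C_toy = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [3.0, 5.0, 7.0, 9.0, 11.0],
    [5.0, 1.0, 4.0, 2.0, 3.0],
])
meta, labels = merge_meta_reactions(C_toy)
```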
Let Cmeta-raw be the result of the merging step. By definition, all its entries are non-negative. Normalize it as follows:
Bulk RNA-Seq and SMART-Seq2 analysis. Applicants aligned single-cell SMART-Seq2 libraries with Bowtie2, quantified TPM gene expression with RSEM, and performed QC as described in detail in a previous publication [51]. This computational pipeline is a massively revised and updated version of the one originally used to analyze these libraries [19]. Batch effects and other nuisance factors were normalized with a model chosen empirically with SCONE [52]. Bulk RNA-Seq libraries were processed with a modified variant of the same pipeline. In the absence of UMIs, differentially expressed genes were called through a linear model fitted to TPM values with the limma R package and with a mean-variance trend added to the empirical Bayes prior [53,54].
Mice. C57BL/6 wildtype mice (WT) were obtained from Jackson laboratory (Bar Harbor, Me.) (IL-17A.GFP, 2D2 mice PDK4). All experiments were approved by and carried out in accordance with guidelines of the Institutional Animal Care and Use Committee (IACUC) at Harvard Medical School.
T cell differentiation culture and Flow cytometry. Naïve CD4+CD44-CD62L+CD25− T cells with or without including IL-17A.GFP+ were sorted using a BD FACSAria sorter and activated with plate-bound anti-CD3 and anti-CD28 antibodies (each 1 μg/ml) at a density of 5×10⁵ cells/ml in the presence of cytokines. For T cell differentiations the following combinations of cytokines were used: pathogenic Th17 (Th17p): 25 ng/ml rmIL-6, 20 ng/ml rmIL-1b (both Miltenyi Biotec) and 20 ng/ml rmIL-23 (R&D systems); non-pathogenic Th17 (Th17n): 25 ng/ml rmIL-6 and 2 ng/ml of rhTGFb1 (Miltenyi Biotec). Differentiation experiments were performed in the presence or absence of 50 μM EGCG (Selleck Chemicals), 50 μM DHEA, 40 μM DCA, or 10 μM Shikonin (all Sigma) as indicated, and cells were harvested at 72 hours.
Intracellular cytokine staining was performed after incubation for 4-6 h with Cell Stimulation cocktail plus Golgi transport inhibitors (Thermo Fisher Scientific) using the BD Cytofix/Cytoperm buffer set (BD Biosciences) per manufacturer's instructions. Transcription factor staining was performed using the Foxp3/Transcription Factor Staining Buffer Set (eBioscience).
Proliferation was assessed by staining with CellTrace Violet (Thermo Fisher Scientific) per manufacturer's instructions. All stainings were acquired on a FACS LSR II (BD Biosciences) and analyzed using FlowJo software.
Legendplex. Cytokine concentrations in supernatants of in vitro cultures were analyzed by the LegendPlex Mouse Th Cytokine Panel (13-plex) (BioLegend) according to the manufacturer's instructions and analyzed on a FACS LSR II (BD Biosciences).
Antibodies. All other flow cytometry antibodies were purchased from BioLegend.
Experimental Autoimmune Encephalomyelitis (EAE). For active EAE immunization, MOG35-55 peptide was emulsified in complete Freund's adjuvant (CFA). The equivalent of 40 μg MOG peptide was injected subcutaneously per mouse, followed by intravenous pertussis toxin injection on day 0 and day 2 of immunization. Mice were monitored and assigned grades for clinical signs of EAE using the following scoring system: 0, healthy; 1, limp tail; 2, impaired righting reflex or ataxic gait; 3, hind limb paralysis; 4, total limb paralysis; 5, moribund or death. Mice with a score of >4 were euthanized. If mice died during the course of the experiment, their clinical score of 5 was included in the analysis for the remainder of the experiment.
For adoptive transfer EAE, naïve 2D2 transgenic T cells were sorted and differentiated into Th17n cells+/−EGCG or Th17p+/−DHEA as described for three days followed by a resting phase in the presence of IL-23 alone for 2 days. Cells were then harvested and restimulated with plate-bound anti-CD3 and anti-CD28 for 2 days prior to transfer. 2-8 million cells were transferred per mouse intravenously. EAE was scored as previously published (Jager et al., 2009) or as described above.
Metabolomics/Carbon tracing. For untargeted metabolomics, Th17 cells were differentiated as described. Culture media were snap frozen. Cells were harvested at 96 h. 10×10⁶ cells per sample were snap frozen and extracted in either 80% methanol (for fatty acids and oxylipids) or isopropanol (for polar and nonpolar lipids). Two liquid chromatography tandem mass spectrometry (LC-MS) methods were used to measure fatty acids and lipids in cell extracts.
For carbon tracing experiments Th17 cells were differentiated as described. Thereafter, cells were washed and cultured in media supplemented with 8 mM [U-13C]-glucose for 15 min or 3 hrs.
Let $|C_i|$ be the number of carbon atoms in metabolite i, and let $x_{c,i,j}$ be the measured signal of metabolite i in sample j (subsequent to all normalization and QC procedures) in which there are exactly c 13C atoms. The comparisons of carbon flows are based on the statistic
$$ y_{i,j} = \frac{\sum_{t=0}^{|C_i|} t \cdot x_{t,i,j}}{|C_i| \cdot \sum_{t=0}^{|C_i|} x_{t,i,j}} $$
which measures the 13C ratio of the total carbon contents for metabolite i in sample j.
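As a worked illustration of this statistic, the sketch below computes the 13C fraction of total carbon for one metabolite in one sample. It assumes the statistic is the labeled-carbon count divided by the total carbon count, as the description above indicates; the function name labeled_carbon_fraction and the example signals are hypothetical.

```python
import numpy as np


def labeled_carbon_fraction(isotopologues):
    """13C fraction of total carbon for one metabolite in one sample.

    isotopologues[c] is the signal with exactly c 13C atoms, for c = 0..n_carbons.
    """
    x = np.asarray(isotopologues, dtype=float)
    n_carbons = x.size - 1                 # entries cover 0..n_carbons labeled atoms
    labeled_atoms = np.arange(x.size)      # number of 13C atoms per isotopologue
    return float((labeled_atoms * x).sum() / (n_carbons * x.sum()))


# Hypothetical three-carbon metabolite: half unlabeled, half fully labeled.
y_half = labeled_carbon_fraction([10.0, 0.0, 0.0, 10.0])
y_none = labeled_carbon_fraction([5.0, 0.0, 0.0, 0.0])
y_full = labeled_carbon_fraction([0.0, 0.0, 0.0, 7.0])
```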
The Compass Algorithm
The first step in Compass is to create the R matrix, which assigns, for each cell, an expression value to each metabolic reaction. This is done using the Boolean gene-to-reaction mapping included in the selected GEM.
If a single gene with linear-scale expression x is associated with the reaction, then the reaction's expression will be log2(x+1). Units of x can be TPMs (as in this application), CPMs, or any other units chosen by the user.
Many reactions, however, are associated with multiple genes and this association is expressed as a Boolean relationship. For example, two genes which encode different subunits of a reaction's enzyme are associated using an AND relationship as both are required to be expressed for the reaction to be catalyzed. Alternately, if multiple enzymes can catalyze a reaction, the genes involved in each will be associated via an OR relationship. For reactions associated with multiple genes in this manner, the Boolean expression is evaluated by taking the sum or the mean of linear-scale expression values (e.g., TPMs) when genes are associated via an OR or AND relationship, respectively. This way, the full gene(s)-to-reaction associations are evaluated to arrive at a single, summary expression value for each reaction in the GEM.
The output of this procedure defines reaction expression as ri(c) for each reaction, i, and each cell, c. This defines the R matrix.
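The evaluation of gene-to-reaction rules can be sketched as follows. This is a minimal sketch under assumptions: rules are represented as nested ("and" | "or", [operands]) tuples, which is a hypothetical encoding rather than the GEM's own format; genes missing from the expression table are treated as zero; and the log2(x+1) transform stated for the single-gene case is assumed to apply to the aggregated linear-scale value as well.

```python
import math


def linear_reaction_expression(rule, tpm):
    """Evaluate a Boolean gene rule on linear-scale expression (OR = sum, AND = mean)."""
    if isinstance(rule, str):                      # a single gene identifier
        return tpm.get(rule, 0.0)                  # assumption: undetected genes count as 0
    operator, operands = rule
    values = [linear_reaction_expression(r, tpm) for r in operands]
    if operator == "or":
        return sum(values)
    if operator == "and":
        return sum(values) / len(values)
    raise ValueError(f"unknown operator: {operator!r}")


def reaction_expression(rule, tpm):
    """Summary reaction expression; log2(x + 1) on the aggregate is an assumption."""
    return math.log2(linear_reaction_expression(rule, tpm) + 1.0)


# Hypothetical TPM values and rules.
tpm = {"g1": 3.0, "g2": 5.0, "g3": 7.0}
single = reaction_expression("g1", tpm)
# (g1 AND g2) OR g3: two subunits of one enzyme, or an alternative enzyme.
combined = reaction_expression(("or", [("and", ["g1", "g2"]), "g3"]), tpm)
```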
To mitigate the sparseness and stochasticity inherent in single-cell measurements, Compass allows for a degree of information-sharing between cells with similar transcriptional profiles. To accomplish this, a neighborhood reaction expression is computed for each cell which represents a weighted average over expression measurements for similar cells in the data set. To compute this neighborhood reaction expression, two procedures are available to be selected at runtime: k-nearest neighbors (knn) or gaussian. Regardless of choice, first, the full gene expression matrix is reduced to a lower dimensional representation with PCA (20 components). Next, if the gaussian method is selected, a gaussian kernel is used to define cell-to-cell weights which describe the local neighborhood around each cell:
where $\Delta_{ij}$ represents the Euclidean distance between cell i and cell j in the reduced PCA space and $\sigma_i^2$ is computed for each cell using a supplied perplexity parameter and the method described in the tSNE algorithm [55]. The weights for each cell (rows of the w matrix) are then normalized to sum to 1. Alternately, if the knn method is selected, the weights $w_{ij}$ are defined as 1/k if cell j is one of the k-nearest-neighbors (in the reduced PCA space) of cell i, and zero otherwise. The number of neighbors (k) can be defined by the user at run-time, though Applicants recommend values in the range of 10-30. The weights resulting from either method are then used as mixing coefficients to arrive at neighborhood reaction expression values, $\bar{r}_i(c)$:
$$\bar{r}_i(c) = \sum_j w_{cj}\, r_i(j)$$
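The knn variant of the information-sharing step can be sketched as follows. This is a minimal illustration that assumes the PCA coordinates have already been computed, uses brute-force distances, and excludes each cell from its own neighbor set (the text does not state whether a cell counts as its own neighbor). The function names are hypothetical.

```python
import numpy as np


def knn_weights(pcs, k):
    """Return a (cells x cells) matrix with w[i, j] = 1/k if j is a k-nearest neighbor of i.

    `pcs` is a (cells x components) array of reduced PCA coordinates.
    """
    n = pcs.shape[0]
    if not 0 < k < n:
        raise ValueError("k must be between 1 and n_cells - 1")
    diff = pcs[:, None, :] - pcs[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)             # assumption: exclude the cell itself
    nearest = np.argsort(dist, axis=1)[:, :k]  # indices of the k closest cells
    w = np.zeros((n, n))
    w[np.arange(n)[:, None], nearest] = 1.0 / k
    return w


def neighborhood_expression(R, w):
    """Entry [i, c] of the result is sum_j w[c, j] * R[i, j] (R is reactions x cells)."""
    return R @ w.T


# Four cells on a line: cells 0/1 are close, cells 2/3 are close
pcs = np.array([[0.0], [1.0], [10.0], [11.0]])
w = knn_weights(pcs, k=1)
R = np.array([[1.0, 2.0, 3.0, 4.0]])           # one reaction, four cells
R_nbr = neighborhood_expression(R, w)
```

Because every row of w sums to 1, each neighborhood value is a weighted average of the reaction's expression over the neighboring cells.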
Let p(r) be a monotonically decreasing penalty function defined on [0, ∞). Applicants took $p(r) := 1/(1+r)$. The overall reaction penalty vector is a combination of the individual reaction penalties, $p(r_i(c))$, and the neighborhood reaction penalties, $p(\bar{r}_i(c))$, where $\bar{r}_i(c)$ denotes the neighborhood reaction expression, with the parameter 0≤λ≤1 used to define the mixing ratio:
$$\hat{p}_i(c) = (1-\lambda)\, p(r_i(c)) + \lambda\, p(\bar{r}_i(c))$$
The $\hat{p}_i(c)$ values define the $\hat{P}$ matrix.
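As a small numerical illustration of the penalty mixing, the sketch below applies p(r) = 1/(1+r) to a per-cell expression matrix and to a neighborhood expression matrix and blends them with λ. The function names and the example values are assumptions chosen for illustration.

```python
import numpy as np


def penalty(r):
    """Monotonically decreasing penalty p(r) = 1 / (1 + r) for r >= 0."""
    return 1.0 / (1.0 + np.asarray(r, dtype=float))


def mixed_penalty(R, R_nbr, lam):
    """Blend per-cell and neighborhood penalties with mixing ratio lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * penalty(R) + lam * penalty(R_nbr)


# One reaction in two cells (hypothetical values)
R = np.array([[0.0, 1.0]])       # per-cell reaction expression
R_nbr = np.array([[1.0, 3.0]])   # neighborhood reaction expression
P_hat = mixed_penalty(R, R_nbr, lam=0.25)
```

Setting λ = 0 uses only the cell's own expression, while λ = 1 uses only its neighborhood.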
The reaction penalties described up to this point only make use of the expression data associated with individual reactions. To integrate the full topology and stoichiometry of the GEM into the determination of the network's ability to carry flux through a reaction, Applicants make use of Flux Balance Analysis.
First, the GEM is transformed to be unidirectional. Each reaction is split into a pair of reactions proceeding in opposite directions and with added constraints only allowing positive reaction flux.
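One way to picture the unidirectional transformation is the sketch below, which appends a negated copy of every stoichiometric column and clips the flux bounds so that all fluxes are non-negative. The layout (the reverse of column j stored at column j + m) and the function name are assumptions for illustration only.

```python
import numpy as np


def make_unidirectional(S, lb, ub):
    """Split every reaction into a forward and a reverse copy with non-negative flux.

    S is (metabolites x m); lb and ub are the original flux bounds.
    Column j is the forward copy and column j + m is the reverse copy.
    """
    S = np.asarray(S, dtype=float)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    S_uni = np.hstack([S, -S])  # the reverse copy has negated stoichiometry
    lower = np.concatenate([np.maximum(lb, 0.0), np.maximum(-ub, 0.0)])
    upper = np.concatenate([np.maximum(ub, 0.0), np.maximum(-lb, 0.0)])
    return S_uni, lower, upper


# Two reactions acting on one metabolite: an irreversible one and a reversible one
S = [[1.0, -1.0]]
S_uni, lower, upper = make_unidirectional(S, lb=[0.0, -5.0], ub=[10.0, 5.0])
```

In this layout an irreversible reaction simply receives a reverse copy whose upper bound is zero.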
For simplicity of notation, Applicants define Compass below with the set of objective functions used in this application, namely m objectives, each of which is the maximization of flux through one of the m unidirectional reactions in the network. Applicants further ignore the presence of blocked reactions, which in practice can be excluded to speed the computation. One may supplement or replace these objectives with other linear functions that pertain to cellular metabolism, such as maximization of biomass or ATP production.
Let S be the stoichiometric matrix defined in the GEM, where rows represent metabolites, columns represent reactions, and entries are stoichiometric coefficients for the reactions comprising the metabolic network. Reactions for uptake and secretion of a metabolite are encoded as having only a coefficient of 1 and −1 in the metabolite's row entry, respectively, and 0 otherwise.
For each reaction r, a linear program computes the maximum amount of flux the network can produce, subject to steady-state and directionality constraints. rev(r) is the reverse unidirectional reaction of r, which has the same stoichiometry but proceeds in the opposite direction.
$$v_r^{\mathrm{opt}} = \max_{v \in \mathbb{R}^m} v_r$$
s.t. (i) $S \cdot v = 0$
(ii) $\alpha \le v \le \beta$
(iii) $v_{\mathrm{rev}(r)} = 0$
Constraint (i) constrains the system to steady state (Varma and Palsson 1994), and constraint (ii) is interpreted as ∀i=1, . . . , m: αi≤vi≤βi and encodes directionality and capacity limits for reactions, including uptake and secretion limits. Constraint (iii) ensures that when evaluating the maximum flux for each reaction, its reverse reaction carries no flux, which avoids the creation of a two-reaction futile cycle. This does not prevent futile cycles longer than 2 edges, which can be avoided only by more time-consuming computations (Schellenberger et al. 2011).
Note that this computation described thus far is independent of any expression values and can be computed in advance for each GEM and cached for computational efficiency.
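The first linear program can be sketched on a toy network with SciPy's `linprog`. The four-reaction network, its bounds, and the function name `max_flux` are hypothetical and serve only to show the role of constraint (iii). Because `linprog` minimizes, the objective is negated.

```python
import numpy as np
from scipy.optimize import linprog


def max_flux(S, lower, upper, r, rev_r=None):
    """Maximize v_r subject to S v = 0 and lower <= v <= upper.

    If rev_r is given, the reverse reaction is forced to carry zero flux.
    """
    m = S.shape[1]
    c = np.zeros(m)
    c[r] = -1.0                                # maximize v_r == minimize -v_r
    bounds = [[lo, hi] for lo, hi in zip(lower, upper)]
    if rev_r is not None:
        bounds[rev_r] = [0.0, 0.0]             # constraint (iii)
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun


# Toy unidirectional network (hypothetical):
# columns: uptake of A, A -> B, B -> A (reverse of column 1), secretion of B
S = np.array([[1.0, -1.0, 1.0, 0.0],    # metabolite A
              [0.0, 1.0, -1.0, -1.0]])  # metabolite B
lower = np.zeros(4)
upper = np.array([10.0, 100.0, 100.0, 100.0])

v_opt = max_flux(S, lower, upper, r=1, rev_r=2)   # limited by uptake of A
v_cycle = max_flux(S, lower, upper, r=1)          # inflated by the A <-> B cycle
```

In this toy case the constrained optimum is limited by the uptake bound, whereas dropping constraint (iii) lets the forward and reverse reactions cycle flux up to the capacity bound.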
Next, for each reaction, a second linear program is used to evaluate the ability of the network to produce flux near this optimal value, while penalizing flux through reactions with lower expression support. This program minimizes the dot product between the flux distribution vector and reaction penalties while constraining the flux through the current reaction to remain at least a fraction ω=0.95 of its optimum. The minimum penalty yr(c) for cell c and reaction r is:
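$$y_r(c) = \min_{v \in \mathbb{R}^m} \sum_i v_i\, \hat{p}_i(c)$$
s.t. (i) $S \cdot v = 0$
(ii) $\alpha \le v \le \beta$
(iii) $v_{\mathrm{rev}(r)} = 0$
(iv) $v_r \ge \omega \cdot v_r^{\mathrm{opt}}$
(recall that the GEM is unidirectional and therefore $v_i \ge 0$ for all i)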
A high penalty yr(c) indicates that cell c is unlikely, judged by transcriptomic evidence, to use reaction r. Cells whose transcriptomes are overall more aligned with an ability to carry flux through a reaction will be assigned a lower penalty for that reaction.
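The second linear program can be sketched in the same way. The toy network, the penalty values, and the function name `min_penalty` are hypothetical; the sketch assumes that the optimum flux for the reaction has already been computed by the first program.

```python
import numpy as np
from scipy.optimize import linprog


def min_penalty(S, lower, upper, penalties, r, rev_r, v_opt, omega=0.95):
    """Minimize sum_i v_i * penalties[i] while keeping v_r >= omega * v_opt."""
    bounds = [[lo, hi] for lo, hi in zip(lower, upper)]
    bounds[rev_r] = [0.0, 0.0]                         # constraint (iii)
    bounds[r][0] = max(bounds[r][0], omega * v_opt)    # constraint (iv)
    res = linprog(penalties, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun


# Toy unidirectional network (hypothetical):
# columns: uptake of A, A -> B, B -> A (reverse of column 1), secretion of B
S = np.array([[1.0, -1.0, 1.0, 0.0],
              [0.0, 1.0, -1.0, -1.0]])
lower = np.zeros(4)
upper = np.array([10.0, 100.0, 100.0, 100.0])

# Per-reaction penalties for one cell (hypothetical values)
p_hat = np.array([0.5, 0.2, 1.0, 0.1])

# The optimum for reaction 1 in this network is 10 (limited by uptake of A)
y = min_penalty(S, lower, upper, p_hat, r=1, rev_r=2, v_opt=10.0)
```

Here the cheapest flux distribution routes exactly 0.95 × 10 = 9.5 units through uptake, conversion, and secretion, so the penalty is 9.5 × (0.5 + 0.2 + 0.1) = 7.6.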
The minimum penalties yr(c) define the matrix Craw, which has only non-negative entries by definition. Applicants transform it into a non-negative matrix in which high scores indicate a high propensity to use a certain reaction by taking −log(1+Craw) and then subtracting the minimal value of the resulting matrix from all its entries.
The resulting scores are indicative of a cell's propensity to use a certain reaction. Applicants interpret it as a proxy for the activity level of the reaction in that cell.
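The final transformation from penalties to scores can be written in a few lines. The sketch assumes the natural logarithm, since the text does not specify a base, and the function name is hypothetical.

```python
import numpy as np


def compass_scores(C_raw):
    """Map non-negative penalties to non-negative scores (high = more likely used)."""
    C_raw = np.asarray(C_raw, dtype=float)
    if (C_raw < 0).any():
        raise ValueError("penalties must be non-negative")
    scores = -np.log(1.0 + C_raw)     # assumption: natural log
    return scores - scores.min()      # shift so the smallest score is zero


# Reactions x cells penalties chosen so that log(1 + C) is 0, 1, or 2
C_raw = np.array([[0.0, np.e - 1.0],
                  [np.e ** 2 - 1.0, 0.0]])
scores = compass_scores(C_raw)
```

The entry with the largest penalty receives a score of zero, and the entries with zero penalty receive the highest score.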
Applicants also implemented a second variant of the Compass procedure described above, where objective functions are based on the network's metabolites, rather than reactions. For every metabolite, Applicants define two objective functions—one to maximize its uptake, and one to maximize its secretion.
To assign uptake and secretion scores for a given metabolite, the procedure described above is used with a small modification. If an uptake or secretion reaction already exists in the GEM, it is evaluated in the same manner as other metabolic reactions and the resulting reaction score is used. Otherwise, an uptake/secretion reaction is added to the GEM and its resulting score is used.
Cellular metabolism can orchestrate immune cell function. Applicants previously demonstrated that lipid biosynthesis represents one such gatekeeper to Th17 cell functional state. Utilizing Compass, a transcriptome-based algorithm for prediction of metabolic flux, Applicants constructed a comprehensive metabolic circuitry for Th17 cell function and identified the polyamine pathway as a candidate metabolic node, the flux of which regulates the inflammatory function of T cells. Testing this prediction, Applicants found that expression and activities of enzymes of the polyamine pathway were enhanced in pathogenic Th17 cells and suppressed in regulatory T cells. Perturbation of the polyamine pathway in Th17 cells suppressed canonical Th17 cell cytokines and promoted the expression of Foxp3, accompanied by a dramatic shift in transcriptome and epigenome, transitioning Th17 cells into a Treg-like state. Genetic and chemical perturbation of the polyamine pathway resulted in attenuation of tissue inflammation in an autoimmune disease model of the central nervous system, with changes in T cell effector phenotype.
Th17 cells and FoxP3+ regulatory T cells (Tregs) play key roles in maintaining the balance between inflammatory and regulatory functions in the immune system. FoxP3+ Tregs play a critical role in maintaining immune tolerance, as highlighted by the fact that loss-of-function mutations in the human Foxp3 gene, which encodes the master regulator of Tregs, result in IPEX syndrome, in which patients develop a series of autoimmune pathologies (autoimmune enteropathy, type 1 diabetes, dermatitis) and die prematurely. In contrast, Th17 cells have been shown to be critical for the induction of a number of autoimmune diseases including psoriasis, psoriatic arthritis, ankylosing spondylitis, multiple sclerosis and inflammatory bowel disease [1, 2]. While TGFβ alone can induce FoxP3+ Tregs in vitro, the addition of the proinflammatory cytokine IL-6 suppresses the generation of FoxP3+ T cells and together with TGFβ induces generation of Th17 cells. This led to the hypothesis that proinflammatory Th17 cells and regulatory FoxP3+ Tregs are reciprocally regulated, further supported by experiments on the role of these two cytokines in the induction and differentiation of Th17 cells in vivo [3-6].
However, not all Th17 cells are pathogenic or disease inducing, and they also play a protective role in mucosal tissues, promoting tissue homeostasis, maintaining barrier function, and preventing invasion of microbiota at the mucosal sites [7-12]. Th17 cells that are induced by TGFb+IL-6 in vitro produce IL-17 but are not capable of inducing potent tissue inflammation/autoimmunity upon adoptive transfer [13-15]. Additional stimuli, such as IL-1b and IL-23, are needed to evoke pathogenic potential in these Th17 cells [13, 14, 16-20]. Therefore, there appear to be at least two different types of Th17 cells: Th17 cells that are present at homeostasis and do not promote tissue inflammation, which Applicants have termed nonpathogenic Th17 cells, and Th17 cells that produce IL-17 together with IFN-g and GM-CSF and induce tissue inflammation and autoimmunity [21]. Different types of Th17 cells have also been identified in humans, where Th17 cells akin to mouse pathogenic Th17 cells have been shown to be specific for Candida albicans and non-pathogenic Th17 cells have been shown to be similar to Th17 cells that have specificity for Staphylococcus aureus infection [22]. Thus, Treg, non-pathogenic Th17 cells and pathogenic Th17 cells represent a functional spectrum in tissue homeostasis, disease and infection and can be differentiated reciprocally with different cytokine cocktails in vitro. However, beyond cytokines, how these cells are generated in vivo and which factors trigger their development into different functional states have not been fully elucidated.
Cellular metabolism is a mediator and modulator of immune cell differentiation and function, which Applicants hypothesized may play a key role in this balance. In a previous study using scRNA-seq of Th17 cells, Applicants identified CD5L as a major regulator that co-varies in its expression with the pro-inflammatory gene module in Th17 cells. Loss of CD5L made Th17 cells highly pathogenic by altering lipid biosynthesis and transcriptional activity of RORγt, the master transcription factor critical for development and differentiation of Th17 cells [23]. This observation provided a proof of concept that metabolic processes can be directly involved in gene regulation and balancing proinflammatory and regulatory states of Th17 cells.
However, a full appreciation of metabolic circuitry and its connection with immune cell function has been limited by available technologies that typically define the average metabolic state of a large population of cells. Applicants have developed a flux balance analysis algorithm called Compass that allows prediction of the metabolic state of a cell using transcriptome data at the single cell level, allowing comprehensive profiling of metabolic pathways even in a smaller number of cells that could not be otherwise interrogated by traditional metabolomic techniques (Wagner et al., 2020). Here, Applicants used the Compass algorithm to interrogate the metabolic status of pathogenic and nonpathogenic Th17 cells using scRNA-seq datasets of Th17 cells. Applicants show that enzymes of the polyamine pathway are suppressed and cellular polyamine content is significantly lower in regulatory T cells and non-pathogenic Th17 cells (Th17n) as compared to pathogenic Th17 cells (Th17p) due to alternative fluxing. Perturbation of the polyamine pathway in Th17 cells suppressed canonical Th17 cytokines and promoted Foxp3 expression, shifting the Th17 cell transcriptome in favor of a Treg-like state. Applicants demonstrated that the polyamine pathway is critical in maintaining the Th17-specific chromatin landscape against the induction of a Treg-like program. Consistent with the cellular phenotype, chemical inhibition and genetic perturbation of the polyamine pathway in T cells restricted the development of autoimmune responses in the EAE model.
To better analyze the metabolic landscape of Th17 cells that may regulate their functional state, Applicants first used two approaches: untargeted metabolomics (
Next, to obtain a comprehensive view of the metabolic state of each cell despite the inability to measure single cell metabolomic profiles, Applicants investigated the metabolic circuitry of Th17 cells using Compass (see example 9, STAR Methods), with the scRNA-seq profiles from sorted IL-17-GFP+Th17 cells [24]. Briefly, Compass is a Flux Balance Analysis (FBA)-based algorithm [25, 26] and utilizes a comprehensive compendium of thousands of metabolic reactions, their stoichiometry, and the enzymes catalyzing them [27]. Compass models in silico the fluxes through the network of metabolic reactions, while accounting for the observed expression levels of enzyme-coding transcripts in each cell. It does so by optimizing a series of objective functions, each corresponding to an individual metabolic reaction (rather than a single FBA objective such as biomass production). The result of the optimization procedure is a score for each reaction in each cell, indicative of the potential of the cell to direct flux through that reaction, given the transcriptome of that cell. The Compass scores matrix is then subject to downstream analysis, while relying on the statistical power afforded by scRNA-Seq to derive biological insight from the high-dimensional matrix.
Analysis of the Compass scores for each reaction across all single cells in the data (
To investigate the polyamine metabolic process (
As the polyamine pathway, similar to most metabolic pathways, is regulated beyond the transcriptional level, Applicants next directly measured total cellular polyamine content using an enzymatic assay (STAR Methods). Compared to Th17p cells, Tregs and Th17n cells have significantly reduced levels of total polyamines (
To further investigate the concentrations and activities of different polyamines in Th17 cells at different functional states, Applicants applied both targeted metabolomics and carbon tracing. Applicants differentiated Th17n and Th17p cells for 68 hours (STAR Methods) and measured the amount of polyamines and related precursors in cell and media by LC/MS (FIGS. 34H and 38B). Consistent with Compass's predictions, there was higher creatine content in Th17n vs. Th17p cells. On the other hand, while the total amount of cellular ornithine, precursor to polyamines, was comparable between Th17n and Th17p cells, there was a significant increase of putrescine and acetyl-putrescine content in Th17p cells (
To directly investigate polyamine biosynthesis, Applicants cultured differentiated Th17n and Th17p cells in the presence of a low amount of carbon- or hydrogen-labeled arginine or citrulline, which can be used to synthesize ornithine, the precursor to the polyamine pathway (
To investigate the functional relevance of these metabolic changes, Applicants studied the effects of polyamine pathway inhibitors on differentiation of pathogenic and nonpathogenic Th17 cells in vitro, using previously defined culture conditions. Applicants first used difluoromethylornithine (DFMO), an irreversible inhibitor of ODC1 (
To determine whether DFMO inhibited Th17 cell differentiation, Applicants measured the expression and activity of key transcription factors. Interestingly, DFMO suppressed Rorgt and Tbet expression in Th17p but not Th17n cells (
To determine whether other enzymes of the polyamine pathway could play a similar role in regulating Th17 cell function, Applicants used inhibitors of spermidine synthase (SRM), spermine synthase (SMS), and SAT1 (
Finally, Applicants confirmed that the effect of DFMO is through the inhibition of ODC1, as addition of putrescine to cells treated with DFMO completely reversed their phenotype (
To further confirm the effects of chemical inhibition of polyamine pathway on Th17/Treg differentiation, Applicants tested the impact of genetic perturbation of ODC1 on the differentiation and functions of Th17 cells, using cells isolated from WT and ODC1−/− mice. Similar to DFMO treatment, there was complete inhibition of Th17 canonical cytokines, such as IL-17A, IL-17F and IL-22, but not IFNg, in ODC1−/− Th17 cells (
To gain mechanistic insight on the effects of inhibiting polyamine biosynthesis in Th17 cells, Applicants profiled by RNA-Seq Th17n, Th17p, and iTreg cells treated with DFMO or control. DFMO had a profound impact on the transcriptome of all Th cell lineages, driving Th17 cells towards Treg cell profiles in Principal Components Analysis (PCA) (
The profound impact of DFMO on the transcriptome prompted Applicants to investigate the mechanism by which the polyamine pathway regulates Th17 cell functions. As DFMO does not appear to consistently restrict phosphorylation of key Th17 cell regulators, particularly not in Th17n cells (
To test this hypothesis, Applicants measured chromatin accessibility by ATAC-seq in Th17n and iTreg cells treated with either control or DFMO (STAR Methods). Overall, DFMO treatment resulted in considerable changes in accessible peaks in both types of Th cells (
To investigate which transcription factors (TFs) may be responsible for the suppression of the Th17 specific program and upregulation of the iTreg program, Applicants looked for putative binding sites (based on DNA binding motifs or ChIP-seq data when available) that significantly overlap with regions whose accessibility is modulated by DFMO (
As JMJD3 is a known regulator of T cell plasticity [33], Applicants tested whether it also contributes to the genome-wide shifts induced by DFMO. Applicants analyzed the effect of DFMO on Th17 cells differentiated from naïve CD4 T cells isolated from control or JMJD3fl/flCD4cre mice (
To investigate the relevance of the polyamine pathway in vivo, Applicants took two approaches to perturbing it in the context of CNS autoimmune disease, EAE: chemical inhibition of ODC1 and T-cell specific genetic deletion of SAT1 (
Applicants first tested ODC1 inhibition by adding DFMO in the drinking water for mice immunized with MOG/CFA for the induction of EAE (STAR Methods). DFMO significantly delayed the onset and severity of EAE (
Since administering DFMO in the drinking water could affect multiple cell types, Applicants also genetically deleted SAT1, the rate limiting enzyme of the polyamine pathway, in CD4+ T cells (SAT1fl/flCD4cre). Applicants confirmed that genetic deletion of SAT1 in T cells resulted in loss of polyamine acetylation as reflected in reduced levels of acetyl-putrescine and acetyl-spermidine (
To understand the functional relevance of metabolic pathways in Th17 cells, Applicants utilized metabolomics, a novel computational algorithm (Compass, Wagner et al., 2020, Example 9) and chemical and genetic perturbation to investigate the functional metabolic networks that impact Th17 pathogenicity. In this study, Applicants investigated in depth the metabolic circuitry centered around the polyamine pathway. Applicants demonstrated that 1) at the transcriptome level, Compass points to the polyamine pathway as a top candidate associated with Th17 cell pathogenicity and implicates reactions upstream and downstream of putrescine as being associated with the functional phenotype of Th17 cells; 2) as predicted by Compass and measured by enzymatic assay and LC/MS metabolomics, Th17 cells in different functional states have alternative metabolic flux anchored around arginine and putrescine, the precursor to polyamines, and both regulatory T cells and non-pathogenic Th17 cells have reduced cellular content of polyamines; 3) chemical targeting of multiple enzymes in the polyamine pathway and genetic deletion of ODC1 resulted in suppression of the Th17 functional program and upregulation of Foxp3 in a putrescine-dependent manner; 4) inhibiting polyamine biosynthesis shifts Th17 cells in favor of a Treg-like transcriptome and epigenome; 5) targeting ODC1 and SAT1 both resulted in upregulation of Foxp3 in vivo, inhibition of effector Th17 cells, and regulation of EAE. Taken together, Applicants have provided evidence supporting a critical role of the polyamine pathway in suppressing the regulatory program in Th17 cells.
Th17 cells are critical in inducing autoimmune inflammation. In fact, loss of any of the components of the Th17 pathway, including TGF-b, IL-6, IL-1 or IL-23, results in inhibition of Th17 differentiation, upregulation of FoxP3+ Tregs and suppression of EAE. Because of the reciprocal generation of Tregs vs. Th17 cells, the effects observed with the inhibition of the polyamine pathway may be unique to the diseases where Th17 cells are the effector cells. Whether the effect of the polyamine pathway can be generalized to other autoimmune diseases (e.g. autoimmune colitis or type 1 diabetes), where Th1 or NK cells are the effectors, needs to be further evaluated. In fact, the effect of blocking the polyamine pathway in diverting Th17 differentiation to a Treg phenotype was much more profound in nonpathogenic Th17 cells (differentiation with TGFb) than in pathogenic Th17 cells (differentiation with IL-1b and IL-23). This observation suggests that inhibition of the pathway may have an effect that is unique to Th17-driven diseases.
The significance of the polyamine pathway in autoimmune diseases is further supported by anecdotal data that polyamine levels are increased in several autoimmune diseases [34, 35] and it is thought that aberrant polyamine metabolism can contribute to autoantigen stabilization [36]. Here Applicants present a potential mechanism of how the polyamine pathway can regulate Th17/Treg balance and impact development of autoimmunity. DFMO is an FDA-approved drug for cancer therapy. Applicants showed that DFMO has significant impact in curtailing EAE, providing the grounds/mechanism for drug repurposing. It should be noted that while targeting any enzyme in the polyamine pathway resulted in similar effects in Th17 cells in vitro, the effects of genetic manipulation of ODC1 and SAT1 are not identical in that while both ODC1 and SAT1 deletion promoted Foxp3 expression (
By studying the metabolic differences within the same lineage of effector Th17 cells, Applicants unexpectedly uncovered a central role of the polyamines in regulating Th17-Treg balance. This study suggests a functional role of metabolic pathways beyond energy production. One of the observations made in this study is the role that the polyamine pathway plays in shaping the epigenetic landscape of differentiating immune cells. In fact, the ATACseq and RNAseq profiles of Th17 cells activated in the presence of inhibitors of the polyamine pathway show profound global ATACseq changes concomitant with changes in transcription, differentiation and function. Polyamines appear to regulate gene expression, cell proliferation and stress responses due to their ability to bind to nucleic acids (both DNA and RNA), alter posttranslational modification and regulate ion channels [37, 38]. A number of studies have suggested a role of polyamines in regulating gene expression due to their polycationic nature and ability to function as a sink for S-adenosylmethionine and acetyl-CoA, both critical metabolites for histone modifications [29, 30, 39, 40]. Furthermore, intracellular polyamines and their analogues are also known to inhibit lysine-specific demethylases such as LSD1 [41], thereby changing the epigenetic landscape and affecting development and differentiation. Thus, it stands to reason that metabolic processes that impact polyamines will not only affect energetics but will also, more broadly, shape the epigenome and transcriptome through the metabolites that are produced during development or differentiation. In this vein, a number of developmental disorders (e.g., Snyder-Robinson syndrome) have been associated with maladapted polyamine metabolism [42].
It is clear that when immune cells take up residence in different tissues they also change their transcriptomes and attain specialized or different functions. Notable examples of this have been shown in tissue Tregs [43] and macrophages [44], where the cells look very different transcriptomically depending on the tissue of residence. Applicants and others have observed a similar situation in Th17 cells, which differ in their function depending on whether they are in lymph nodes, gut or CNS, as observed by scRNAseq analysis of Th17 cells [23, 24]. Based on the studies presented here, Applicants suggest that the metabolic activity of the cell within a defined tissue may have a profound impact on the epigenome and transcriptome, resulting in changed or specialized functions. With the emerging cell atlases that map the transcriptomes of tissue-resident immune cells at the single cell level, the Compass algorithm will provide a powerful tool for studying metabolic pathways across different cell types in different tissues, taking advantage of the wealth of single cell data sets that are being published.
In summary, this study highlights the advantage of utilizing single cell genomics and novel algorithms in studying cellular metabolism, providing roadmaps for studying metabolic pathways in immune cells across normal or diseased tissues. The study validates the predictions made by the algorithm, both in vitro and in vivo, and shows that interfering with the metabolic pathways identified by Compass has a profound effect on the function of the effector cells by regulating both the epigenome and the transcriptome of the Th17 cell.
Mice. C57BL/6 wildtype (WT) mice were obtained from The Jackson Laboratory (Bar Harbor, Me.). SAT1flox mice were kindly provided by Dr. Manoocher Soleimani (University of Cincinnati), and Applicants crossed them to CD4cre mice to generate conditional T cell deletion of SAT1. Note that only male mice were used in all experiments as SAT1 is an X chromosome gene and female mice have incomplete deletion due to random X-chromosome inactivation. ODC1fl/flCD4cre mice were gifted by Dr. Erika Pearce (Max Planck Institute). For experiments, mice were matched for sex and age, and most mice were 6-10 weeks old. Littermate WT or Cre− mice were used as controls. All experiments were conducted in accordance with animal protocols approved by the Harvard Medical Area Standing Committee on Animals or BWH IACUC.
T cell differentiation culture and flow cytometry. Naïve CD4+CD44−CD62L+CD25− T cells were sorted using a BD FACSAria sorter and activated at a concentration of 5×10^5 cells/ml with plate-bound anti-CD3 (1 μg/ml) and anti-CD28 antibodies (1 μg/ml) in the presence of cytokines. For T cell differentiations the following combinations of cytokines were used: pathogenic Th17: 25 ng/ml rmIL-6, 20 ng/ml rmIL-1b (both Miltenyi Biotec) and 20 ng/ml rmIL-23 (R&D systems); non-pathogenic Th17: 25 ng/ml rmIL-6 and 2 ng/ml of rhTGFb1 (Miltenyi Biotec); iTreg: 2 ng/ml of rhTGFb1; Th1: 20 ng/ml rmIL-12 (R&D systems); Th2: 20 ng/ml rmIL-4 (Miltenyi Biotec). Intracellular cytokine staining was performed after incubation for 4-6h with Cell Stimulation cocktail plus Golgi transport inhibitors (Thermo Fisher Scientific) using the BD Cytofix/Cytoperm buffer set (BD Biosciences) per manufacturer's instructions. Transcription factor staining was performed using the Foxp3/Transcription Factor Staining Buffer Set (eBioscience). Proliferation was assessed by staining with CellTrace Violet (Thermo Fisher Scientific) per manufacturer's instructions. Apoptosis was assessed using the Annexin V staining kit (BioLegend). Staining of phosphorylated proteins to assess cell signaling was performed with the BD Phosflow buffer system (BD Biosciences) as per manufacturer's instructions.
Inhibitors and metabolites. Differentiation experiments were performed in the presence or absence of 100-200 μM DFMO, 500 μM trans-4-Methylcyclohexylamine (MCHA, both Sigma), 500 μM N-(3-Aminopropyl)cyclohexylamine (APCHA, Santa Cruz Biotechnology), 50 μM Diminazene aceturate (Dize, Cayman Chemical) with or without 2.5 mM Putrescine (Sigma, P7505) as indicated, and cells were harvested at 72 hours.
Compass analysis. Compass is described in detail in Example 9. In the following, Applicants provide a high-level description of the algorithm.
Compass integrates scRNA-Seq profiles with prior knowledge of the metabolic network to infer a metabolic state of the cell. The metabolic network Applicants use here consists of 7,440 reactions and 2,626 metabolites (Recon2 database, [27]), along with reaction stoichiometry, gene-enzyme-reaction associations and biochemical constraints (such as reaction irreversibility and nutrient availability).
Compass builds on the paradigm of Flux Balance Analysis (FBA) to model metabolic fluxes, namely the rate by which chemical reactions convert substrates to products [25, 26, 45, 46] (Orth, Thiele, and Palsson 2010; O'Brien, Monk, and Palsson 2015; Lewis, Nagarajan, and Palsson 2012; Palsson 2015). The modeling is based on linear programming, maximizing a certain objective (here, flux through a given reaction), while using the metabolic network to pose constraints.
In its first step, Compass is agnostic to any measurement of gene expression levels and computes, for every metabolic reaction r, the maximal flux vropt it can carry without imposing any constraints on top of those imposed by stoichiometry and mass balance. Next, Compass assigns every reaction in every cell a penalty inversely proportional to the mRNA expression associated with its enzyme(s) in that cell. Compass then finds a flux distribution which minimizes the overall penalty incurred in any given cell i (summing over all reactions), while maintaining a flux of at least 0.95·vropt in r. The Compass score of reaction r in cell i is the negative of that minimal penalty (so that lower scores correspond to lower potential metabolic activity). Intuitively, these scores reflect how well each cell's transcriptome is adjusted to maintaining high flux through each reaction. To reduce the effects of data sparsity (characteristic of scRNA-Seq), Compass uses an information-sharing approach. Instead of treating each cell in isolation, the score vector for each cell is determined by a combined objective that balances the effects in the cell in question with those in its ten nearest neighbors (based on similarity of their RNA profiles).
After applying Compass to the scRNA-Seq of Th17 cells, Applicants aggregated reactions that were highly correlated across the entire dataset (Spearman rho>0.98) into meta-reactions (with median of two reactions per meta-reaction) for downstream analysis. For the ranking analysis in
qPCR. RNA was isolated using RNeasy Plus Mini Kit (Qiagen) and reverse transcribed to cDNA with iScript cDNA Synthesis Kit (Bio-Rad). Gene expression was analyzed by quantitative real-time PCR on a ViiA7 System (Thermo Fisher Scientific) using TaqMan Fast Advanced Master Mix (Thermo Fisher Scientific) with the following primer/probe sets: Ass1 (Mm00711256_m1), Odc1 (Mm02019269_g1), Sat1 (Mm00485911_g1), Srm (Mm00726089_s1), Sms (Mm00786246_s1), Il-17a (Mm00439618_m1), Il-17f (Mm00521423_m1), Foxp3 (Mm00475162_m1), Tead1 (Mm00493507_m1), Taz (Mm00504978_m1), and Actb (Applied Biosystems). Expression values were calculated relative to Actb detected in the same sample by duplex qPCR.
Polyamine ELISA. Cell pellets of in vitro differentiated cells were frozen down and further processed with the Total Polyamine Assay Kit (BioVision Inc.) according to the manufacturer's instructions.
Metabolomics/Carbon tracing. For untargeted metabolomics, Th17 cells were differentiated as described. Culture media were snap frozen. Cells were harvested at 96h. 1×10^6 cells per sample were snap frozen and extracted in either 80% methanol (for fatty acids and oxylipids) or isopropanol (for polar and nonpolar lipids). Two liquid chromatography tandem mass spectrometry (LC-MS) methods were used to measure fatty acids and lipids in cell extracts.
For carbon tracing experiments, Th17 cells were differentiated as described. At 48h, cells were washed and cultured in media supplemented with arginine (13C6, Sigma, Cat #643440) or aspartic acid (13C4, Sigma, Cat #604852) for 1, 5 and 24 hours.
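Labeling in tracing experiments of this kind is commonly summarized as the fraction of 13C out of the total carbon in each metabolite. The sketch below shows that arithmetic under stated assumptions: the input is a list of isotopologue intensities (M+0, M+1, ..., M+n for a metabolite with n carbons), and no correction for natural isotope abundance is applied. The function name is hypothetical and this is not presented as the processing pipeline actually used.

```python
def fractional_13c(isotopologue_intensities):
    """Fraction of carbon atoms that are 13C in one metabolite.

    isotopologue_intensities[i] is the signal of the M+i isotopologue,
    for i = 0..n where n is the number of carbons in the metabolite.
    No natural-abundance correction is applied (simplifying assumption).
    """
    intensities = [float(x) for x in isotopologue_intensities]
    n_carbons = len(intensities) - 1
    total = sum(intensities)
    if n_carbons < 1 or total <= 0:
        raise ValueError("need at least M+0 and M+1 and a positive total signal")

    # Each M+i molecule contributes i labeled carbons out of n_carbons.
    labeled_carbons = sum(i * x for i, x in enumerate(intensities))
    return labeled_carbons / (n_carbons * total)
```

For a three-carbon metabolite in which half of the molecules are fully labeled and half are unlabeled, the function returns 0.5.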
Legendplex. Cytokine concentrations in supernatants of in vitro cultures were analyzed by the LegendPlex Mouse Th Cytokine Panel (13-plex) (BioLegend) according to the manufacturer's instructions and analyzed on a FACS LSR II (BD Biosciences).
RNA-seq. For population (bulk) RNA-seq, in vitro differentiated T-cells were sorted for live cells and lysed with RLT Plus buffer and RNA was extracted using the RNeasy Plus Mini Kit (Qiagen). Full-length RNA-seq libraries were prepared as previously described [47] and paired-end sequenced (75 bp×2) with a 150 cycle Nextseq 500 high output V2 kit.
Bioinformatic analysis of RNA-seq data. Alignment, quantification, and computation of pathogenicity signatures based on single-cell transcriptomes were conducted as described in the cosubmitted manuscript (Wagner et al.). Briefly, raw scRNA-seq reads from Gaublomme et al. (2015) [24] (
To compute a pathogenicity score for each cell, Applicants used a scheme similar to that in [24]: for each cell, Applicants took the average z-scored normalized log expression of pro-pathogenic markers (CASP1, CCL3, CCL4, CCL5, CSF2, CXCL3, GZMB, ICOS, IL22, IL7R, LAG3, LGALS3, LRMP, STAT4, TBX21) and of pro-regulatory markers (AHR, IKZF3, IL10, IL1RN, IL6ST, IL9, MAF), with the latter multiplied by −1.
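The scoring scheme described above can be sketched as follows. The marker lists are taken from the text; everything else is an assumption made for illustration: the input is a cells-by-genes matrix of normalized log expression, z-scores are computed per gene across the cells provided, signature genes missing from the matrix are skipped, and genes with zero variance are left at a z-score of 0.

```python
import numpy as np

PRO_PATHOGENIC = ["CASP1", "CCL3", "CCL4", "CCL5", "CSF2", "CXCL3", "GZMB",
                  "ICOS", "IL22", "IL7R", "LAG3", "LGALS3", "LRMP", "STAT4",
                  "TBX21"]
PRO_REGULATORY = ["AHR", "IKZF3", "IL10", "IL1RN", "IL6ST", "IL9", "MAF"]


def pathogenicity_scores(log_expr, gene_names):
    """Return one pathogenicity score per cell.

    log_expr:   cells x genes array of normalized log expression.
    gene_names: gene symbol for each column of log_expr.
    """
    log_expr = np.asarray(log_expr, dtype=float)
    column = {gene: j for j, gene in enumerate(gene_names)}

    # Keep the signature genes that are present in the matrix (assumption).
    genes = [g for g in PRO_PATHOGENIC + PRO_REGULATORY if g in column]
    if not genes:
        raise ValueError("no signature genes found in gene_names")
    signs = np.array([1.0 if g in PRO_PATHOGENIC else -1.0 for g in genes])

    sub = log_expr[:, [column[g] for g in genes]]
    mean = sub.mean(axis=0)
    std = sub.std(axis=0)
    std[std == 0] = 1.0  # constant genes contribute a z-score of 0
    z = (sub - mean) / std

    # Pro-regulatory markers are multiplied by -1 before averaging.
    return (z * signs).mean(axis=1)
```

Higher scores indicate a more pathogenic-like expression profile relative to the other cells in the same matrix.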
Bulk RNA libraries from DFMO- or vehicle-treated Th17p, Th17n, or Treg were studied with 3 replicates per condition for a total of 18 libraries as shown in
ATAC-seq. For population ATAC-seq, in vitro differentiated T-cells were sorted for live cells and stored in Bambanker freezing media (Thermo Fisher Scientific) at −80° C. until further processing. Prior to library preparation, cells were thawed at 37° C. and washed with PBS. For ATAC-seq, cell pellets were lysed and tagmented in 1×TD Buffer, 0.2 ul TDE1 (Illumina), 0.01% digitonin, and 0.3×PBS in 40 ul reaction volume following the protocol described in [49]. Transposition reactions were incubated at 37° C. for 30 min at 300 rpm. The DNA was purified from the reaction using a MinElute PCR purification kit (QIAGEN). The whole resulting product was then PCR-amplified using indexed primers with NEBNext High-Fidelity 2X PCR Master Mix (NEB). First, Applicants performed 5 cycles of pre-amplification. Applicants sampled 10% of the pre-amplification reaction for SYBR Green quantitative PCR to assess the number of additional cycles needed for final amplification. After purifying the final library with the MinElute PCR purification kit (QIAGEN), the library was quantified for sequencing using qPCR and a Qubit dsDNA HS Assay kit (Invitrogen). Libraries were sequenced on an Illumina NextSeq 550 system with paired-end reads of 37 base pairs in length.
Alignment of ATAC-Seq and Peak Calling. All ATAC-Seq reads were trimmed using Trimmomatic [50] to remove primer and low-quality bases. Reads <36 bp were dropped. Reads were then passed to FastQC [www.bioinformatics.babraham.ac.uk/projects/fastqc/] to check the quality of the trimmed reads. The paired-end reads were then aligned to the mm10 reference genome using bowtie2 [51], allowing maximum insert sizes of 2000 bp, with the “--no-mixed” and “--no-discordant” parameters added. Reads with a mapping quality (MAPQ) below 30 were removed. Duplicates were removed with PicardTools, and the reads mapping to the blacklist regions and mitochondrial DNA were also removed. Reads mapping to the positive strand were moved +4 bp, and reads mapping to the negative strand were moved −5 bp following the procedure outlined in [52] to account for the binding of the Tn5 transposase.
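The filtering thresholds and the Tn5 offset described above can be illustrated with a small sketch. It is not the pipeline that was run: reads are represented by a hypothetical `AlignedRead` record, the length filter is applied to the aligned interval rather than to the trimmed read, the whole interval is shifted by the strand offset, the mitochondrial contig is assumed to be named `chrM`, and duplicate and blacklist removal are omitted.

```python
from dataclasses import dataclass
from typing import Iterable, List

MIN_LENGTH = 36        # reads shorter than 36 bp are dropped
MIN_MAPQ = 30          # reads with MAPQ below 30 are removed
MITO_CONTIG = "chrM"   # assumed name of the mitochondrial contig


@dataclass(frozen=True)
class AlignedRead:
    """Hypothetical minimal record for one aligned read."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"
    mapq: int


def tn5_shift(read: AlignedRead) -> AlignedRead:
    """Move positive-strand reads +4 bp and negative-strand reads -5 bp."""
    if read.strand == "+":
        offset = 4
    elif read.strand == "-":
        offset = -5
    else:
        raise ValueError(f"unknown strand: {read.strand!r}")
    return AlignedRead(read.chrom, read.start + offset, read.end + offset,
                       read.strand, read.mapq)


def filter_and_shift(reads: Iterable[AlignedRead]) -> List[AlignedRead]:
    """Apply the length, MAPQ and mitochondrial filters, then the Tn5 shift."""
    kept = []
    for read in reads:
        if read.end - read.start < MIN_LENGTH:
            continue
        if read.mapq < MIN_MAPQ:
            continue
        if read.chrom == MITO_CONTIG:
            continue
        kept.append(tn5_shift(read))
    return kept
```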
Peaks were called using macs2 on the aligned fragments [53] with a qvalue cutoff of 0.001 and overlapping peaks among replicates were merged.
Tests of Differential Accessibility. Differential accessibility was assessed using DESeq2 [54] on a peaks-by-samples matrix (with peaks merged across all samples). Similar to common practice in the analysis of differential gene expression, the analysis of differential accessibility was conducted using the number of observed Tn5 cuts (i.e., number of reads).
Peaks that are associated with a Th17 or Treg programs (orange and purple, respectively, in
Reprocessing of published ChIP-Seq data. ChIP-Seq peaks from Xiao et al 2014 [32] were transferred from mm9 to mm10 using the UCSC liftOver tool. ChIP-Seq replicates from Ciofani et al 2012 were downloaded and trimmed using Trimmomatic [26] to remove primer and low-quality bases. Reads were then passed to FastQC [www.bioinformatics.babraham.ac.uk/projects/fastqc/] to check the quality of the trimmed reads. These single-end reads were then aligned to the mm10 reference genome using bowtie2 [27], allowing maximum insert sizes of 2000 bp, with the “--no-mixed” and “--no-discordant” parameters added. Reads with a mapping quality (MAPQ) below 30 were removed. Duplicates were removed with PicardTools, and the reads mapping to the blacklist regions and mitochondrial DNA were also removed.
ChIP-Seq peaks were called in each replicate, versus a control sample, using macs2 [29] with a qvalue cutoff of 0.05.
Enrichment of motifs and ChIP-seq peaks in differentially accessible regions. Peaks were considered differentially accessible if they had a BH-adjusted p<0.05. Applicants calculated fold enrichment of various genomic features in these peaks (described below) versus a background set of peaks. q-values were estimated using the qvalue package [Storey J D, Bass A J, Dabney A, Robinson D. qvalue: Q-value estimation for false discovery rate control. github.com/jdstorey/qvalue].
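As a worked illustration of the two quantities named above, the sketch below implements the standard Benjamini-Hochberg (BH) adjustment and a simple fold-enrichment ratio. The BH procedure is standard; the fold-enrichment definition (fraction of differentially accessible peaks overlapping a feature divided by the same fraction in the background peaks) is an assumption, since the text does not spell out the formula. Function names are hypothetical.

```python
import numpy as np


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # p_(i) * n / i for the i-th smallest p-value.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity from the largest p-value downwards.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def fold_enrichment(da_overlaps_feature, background_overlaps_feature):
    """Ratio of feature-overlap frequencies: DA peaks versus background peaks.

    Both arguments are boolean sequences, one entry per peak, marking
    whether the peak overlaps the genomic feature of interest (assumption).
    """
    da_fraction = np.mean(np.asarray(da_overlaps_feature, dtype=bool))
    bg_fraction = np.mean(np.asarray(background_overlaps_feature, dtype=bool))
    if bg_fraction == 0:
        raise ValueError("feature is absent from the background set")
    return float(da_fraction / bg_fraction)
```

Peaks would then be called differentially accessible where `bh_adjust(p) < 0.05`.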
Motifs/Annotation Tracks. PWMs for motifs were downloaded from the 2018 release of JASPAR [55, 56]. Applicants used fimo [56] to identify motifs in mm10, and applied the default threshold of 1e-4. Applicants also included regulatory features from the ORegAnno database [57], conserved regions annotated by the multiz30way algorithm, and repeat regions annotated by RepeatMasker (www.repeatmasker.org).
GREAT Pathways/Genes. Loci were associated with pathways using GREAT[58], submitted with the rGREAT package (github.com/jokergoo/rGREAT). Applicants retrieved pathways found in the MSigDB Immunologic Signatures, MSigDB Pathway, and GO Biological Process databases. Loci were mapped to genes using GREAT.
Experimental Autoimmune Encephalomyelitis (EAE). For active EAE immunization, MOG35-55 peptide was emulsified in complete Freund's adjuvant (CFA). The equivalent of 40 μg MOG peptide was injected per mouse subcutaneously, followed by pertussis toxin injection intravenously on day 0 and day 2 of immunization. Mice were treated with 0.5% DFMO in drinking water for 10 days as indicated. DFMO was replenished every third day.
Statistical Analysis. Unless otherwise specified, all statistical analyses were performed using the two-tailed Student's t-test in GraphPad Prism software. A P value less than 0.05 was considered significant (P<0.05=*; P<0.01=**; P<0.001=***) unless otherwise indicated.
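The significance convention above can be reproduced outside GraphPad Prism with a short sketch. This is an illustration only, assuming SciPy's `ttest_ind` with its default settings (two-sided, equal variances, i.e., Student's t-test); the helper names are hypothetical.

```python
from scipy.stats import ttest_ind


def significance_stars(p_value):
    """Map a P value to the star notation used in the text."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


def compare_groups(group_a, group_b):
    """Two-tailed Student's t-test between two groups of measurements."""
    # Defaults: two-sided alternative, equal variances assumed.
    _, p_value = ttest_ind(group_a, group_b)
    return float(p_value), significance_stars(p_value)
```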
Cellular metabolism is a major regulator of immune response, but it is difficult to study the metabolic status of an individual immune cell using current technologies. Here, Applicants present Compass, an algorithm to characterize the metabolic landscape of single cells in silico based on single-cell RNA-Seq profiles and flux balance analysis. Applicants used Compass to study the landscape of metabolic heterogeneity in T helper 17 (Th17) cells and predict novel metabolic regulators of their inflammatory function. Compass recovered the known metabolic switch between glycolysis and fatty acid oxidation and further predicted novel regulators in amino-acid pathways, which Applicants validated through transcriptomic, metabolic and functional assays. Compass also predicted a particular glycolytic reaction (phosphoglycerate mutase—PGAM) that promotes an anti-inflammatory phenotype in Th17 cells, contrary to common immunometabolic understanding of a shift towards glycolysis as promoting pro-inflammatory phenotypes in Th17 and other immune cell types. Applicants validate this prediction and demonstrate that enzymatic inhibition of PGAM leads non-pathogenic Th17 cells to adopt a pro-inflammatory transcriptional program and develop an encephalitogenic phenotype with autoimmunity of the central nervous system upon adoptive transfer in vivo. As diverse cells are profiled by single-cell RNA-Seq in efforts such as the Human Cell Atlas, Compass offers the first broadly applicable tool to characterize metabolic state of individual cells, relate their metabolic state to the transcriptomes and cellular phenotypes, and correlate the state to drivers regulating the phenotype.
Cellular metabolism is both a mediator and a regulator of cellular functions. Metabolic activities are key in normal cellular processes such as activation, expansion and differentiation, but also play an important role in the pathogenesis of multiple disease conditions including autoimmunity, cancer, cardiovascular disease, neurodegeneration, and aging. Recently, the study of metabolism in immune cells (immunometabolism) has gained particular attention as a major regulator of almost all aspects of immune responses including anti-viral immunity, autoimmunity, and cancer (Hotamisligil 2017; O'Neill, Kishton, and Rathmell 2016; Geltink, Kyle, and Pearce 2018; Russell, Huang, and VanderVen 2019; Buck et al. 2017; Ho and Kaech 2017; Chapman, Boothby, and Chi 2019; Karmaus et al. 2019; Green, Galluzzi, and Kroemer 2014).
Due to the scale and complexity of the metabolic network, a metabolic perturbation may create cascading effects and eventually alter a seemingly distant part of the network, while cross-cutting traditional pathway definitions. Therefore, computational tools are needed to contextualize observations on specific reactions or enzymes into a systems-level understanding of metabolism and its dysregulation in disease. One successful framework has been Flux Balance Analysis (FBA), which translates curated knowledge on the network's topology and stoichiometry into mathematical objects and uses them to make in silico predictions on metabolic fluxes (Orth, Thiele, and Palsson 2010; O'Brien, Monk, and Palsson 2015; Lewis, Nagarajan, and Palsson 2012; Palsson 2015). FBA methods have proven particularly useful when contextualized with functional genomics data, including gene expression (Bordbar et al. 2014).
While such metabolic models aim to represent the behavior of individual cells, their contextualization has generally relied on information collected from bulk population data. However, the advent of single-cell RNA-Seq (scRNA-Seq) has highlighted the substantial extent of cell-to-cell diversity that is often missed by bulk profiles (A. Wagner, Regev, and Yosef 2016; Tanay and Regev 2017), and can be especially prominent in immune cells and associated with their functional diversity (Ben-Moshe et al. 2019; Vieira Braga et al. 2019; Karmaus et al. 2019; Azizi et al. 2018; Sade-Feldman et al. 2018; Keren-Shaul et al. 2017; Vento-Tormo et al. 2018; Paul et al. 2015; Soldatov et al. 2019; Miragaia et al. 2019; Zanini et al. 2018; Brown et al. 2019; Van Hove et al. 2019). One of the earliest examples has been the diversity among T helper 17 (Th17) cells (Gaublomme et al. 2015). On the one hand, IL-17 producing Th17 cells can be potent inducers of tissue inflammation in autoimmune disorders (Korn et al. 2009; Tesmer et al. 2008) but on the other hand, these cells are critical in host defense against pathogens (Gaffen, Hernandez-Santos, and Peterson 2011; Romani 2011) and can promote mucosal homeostasis and barrier functions (Lee et al. 2012; Stockinger and Omenetti 2017; X. Wu, Tian, and Wang 2018). Th17 cells with distinct effector functions can be found in patients and animal models and can also be generated in vitro with different combinations of differentiation cytokines, as Applicants have previously demonstrated (Lee et al. 2012). Applicants have previously shown that such functional diversity can be captured by studying transcriptional diversity at the single cell level with scRNA-Seq, and enabled the discovery of novel regulators that are otherwise difficult to detect in bulk RNA-Seq analysis (Gaublomme et al. 2015; Wang et al. 2015).
Applicants hypothesized that a similar spectrum of diversity may exist at the immunometabolic level and relate to cell function. However, most cellular assays, including metabolic assays, are normally done in a targeted manner and are difficult to undertake at the single-cell level. Furthermore, low cell numbers frequently prohibit direct metabolic assays, for example, in the study of immune cells that are present at tissue sites. In contrast, scRNA-Seq is broadly accessible and rapidly collected across the human body (Regev et al. 2017), and should allow, in principle, contextualization of metabolic models at the single-cell level. A computational method is thus required to systematically address the unique challenges of scRNA-Seq, such as data sparsity, and to capitalize on its opportunities, for example by treating cell populations as natural perturbation systems with a rapidly increasing scale (Svensson, Vento-Tormo, and Teichmann 2018).
Here, Applicants present Compass, an FBA algorithm to characterize and interpret the metabolic heterogeneity among cells, which uses available knowledge of the metabolic network in conjunction with RNA expression of metabolic enzymes. Compass uses single cell transcriptomic profiles to characterize cellular metabolic states at single-cell resolution and with network-wide comprehensiveness. It allows detection of metabolic targets across the entire metabolic network, agnostically of pre-defined metabolic pathway boundaries, and including ancillary pathways that are normally less studied, yet could play an important role in determining cell function (Puleston, Villa, and Pearce 2017). Applicants applied Compass to Th17 cells, uncovering substantial immunometabolic diversity associated with their inflammatory effector functions. In addition to the expected glycolytic shift, Applicants found diversity in amino acid metabolism, and highlighted a unique and surprising role for the glycolytic reaction catalyzed by phosphoglycerate mutase (PGAM) in promoting an anti-inflammatory phenotype in Th17 cells. Compass is a broadly applicable tool for studying metabolic diversity at the single cell level, and its relationship to the functional diversity between cells.
Applicants reasoned that even though the mRNA expression of individual enzymes does not necessarily provide an accurate proxy for their metabolic activity, a global analysis of the entire metabolic network (as enabled by RNA-Seq) in the context of a large sample set (as offered by single cell genomics), coupled with strict criteria for hypothesis testing, would provide an effective framework for predicting the metabolic status of the cell. This led Applicants to develop the Compass algorithm, which integrates scRNA-Seq profiles with prior knowledge of the metabolic network to infer a metabolic state of the cell (
The metabolic network is encoded in a Genome-Scale Metabolic Model (GSMM) that includes reaction stoichiometry, biochemical constraints such as reaction irreversibility and nutrient availability, and gene-enzyme-reaction associations. Here, Applicants use Recon2, which comprises 7,440 reactions and 2,626 unique metabolites (Thiele et al. 2013). To explore the metabolic capabilities of each cell, Compass solves a series of constraint-based optimization problems (formalized as linear programs) that produce a set of numeric scores, one per reaction (STAR Methods). Intuitively, the score of each reaction in each cell reflects how well the cell's overall transcriptome is adjusted to maintaining high flux through that reaction. Henceforth, Applicants refer to the scores as quantifying the “potential activity” of a metabolic reaction (or “activity” in short when it is clear from the context that Compass predictions are discussed).
Compass belongs to the family of Flux Balance Analysis (FBA) algorithms, which model metabolic fluxes, namely the rates at which chemical reactions convert substrates to products, and apply constrained optimization methods to find flux distributions that satisfy desired properties (a flux distribution is an assignment of a flux value to every reaction in the network) (Orth, Thiele, and Palsson 2010; O'Brien, Monk, and Palsson 2015; Lewis, Nagarajan, and Palsson 2012; Palsson 2015). In the first step, Compass is agnostic to any measurement of gene expression levels and computes, for every metabolic reaction r, the maximal flux vropt it can carry without imposing any constraints on top of those imposed by stoichiometry and mass balance. Next, Compass relies on the assumption that mRNA expression of an enzyme coding gene should preferably correlate with the flux through the metabolic reaction(s) it catalyzes. It thus assigns every reaction in every cell a penalty inversely proportional to the mRNA expression associated with its enzyme(s) in that cell. Compass then finds a flux distribution which minimizes the overall penalty incurred in any given cell i (summing over all reactions), while maintaining a flux of at least ω·vropt (here ω=0.95) in r. The Compass score of reaction r in cell i is the negative of that minimal penalty (so that lower scores correspond to lower potential metabolic activity).
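The steps just described can be sketched on a toy network with a generic linear-programming solver. The sketch is a minimal illustration, not the Compass implementation: the four-reaction network (uptake of A, two parallel routes from A to B, export of B), the flux bounds of 0 to 10, the penalty form 1/(1+expression), and the use of one expression value per reaction (standing in for the gene-enzyme-reaction mapping) are all assumptions introduced here. Only the overall structure follows the text: maximal flux per reaction, expression-derived penalties, penalty minimization subject to a flux of at least ω·vropt, and a score equal to the negative minimal penalty.

```python
import numpy as np
from scipy.optimize import linprog

# Toy stoichiometric matrix (hypothetical, not Recon2).
# Columns: R0 (-> A), R1 (A -> B), R2 (A -> B, alternative enzyme), R3 (B ->).
S = np.array([
    [1.0, -1.0, -1.0,  0.0],   # mass balance for metabolite A
    [0.0,  1.0,  1.0, -1.0],   # mass balance for metabolite B
])
BOUNDS = [(0.0, 10.0)] * S.shape[1]   # assumed flux bounds
OMEGA = 0.95                          # fraction of maximal flux to maintain
STEADY_STATE = np.zeros(S.shape[0])


def max_flux(r):
    """Step 1: maximal flux through reaction r, ignoring gene expression."""
    objective = np.zeros(S.shape[1])
    objective[r] = -1.0  # linprog minimizes, so negate to maximize v_r
    result = linprog(objective, A_eq=S, b_eq=STEADY_STATE,
                     bounds=BOUNDS, method="highs")
    return -result.fun


def compass_scores(reaction_expression):
    """Steps 2-3: one score per reaction for a single cell."""
    # Penalty inversely related to expression (assumed functional form).
    penalties = 1.0 / (1.0 + np.asarray(reaction_expression, dtype=float))
    scores = np.empty(S.shape[1])
    for r in range(S.shape[1]):
        bounds = list(BOUNDS)
        # Require a flux of at least OMEGA * maximal flux through reaction r.
        bounds[r] = (OMEGA * max_flux(r), BOUNDS[r][1])
        result = linprog(penalties, A_eq=S, b_eq=STEADY_STATE,
                         bounds=bounds, method="highs")
        scores[r] = -result.fun  # negative of the minimal total penalty
    return scores
```

In this toy setting, a cell that expresses the enzyme for R1 strongly and the enzyme for R2 weakly receives a higher score for R1 than for R2, because sustaining flux through R2 forces the use of a poorly supported reaction.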
Using a genome-scale metabolic network allows the entire metabolic transcriptome to impact the computed score for any particular reaction, rather than just the mRNA coding for the enzymes that catalyze it. Applicants reasoned that this helps reduce the effect of instances where mRNA expression does not correlate well with metabolic activity, for example due to post-transcriptional or post-translational modifications. This also mitigates the effects of data sparsity, which is characteristic of scRNA-Seq data. The low transcript signal in scRNA-Seq, which results in the extreme case in false-negative gene detections, magnifies the repercussions of sampling bias and transcription stochasticity, and leads to an overestimation of the variance of lowly expressed genes, which in turn leads to false-positive calling of differentially expressed genes (A. Wagner, Regev, and Yosef 2016). Compass further mitigates data sparsity effects with an information-sharing approach, similar to other scRNA-Seq algorithms (Vallejos, Marioni, and Richardson 2015; Satija et al. 2015; Lun, Bach, and Marioni 2016; Haghverdi et al. 2018; F. Wagner, Yan, and Yanai 2018; Huang et al. 2018; van Dijk et al. 2018; Baran et al. 2019; Grun 2019). Instead of treating each cell in isolation, the score vector for each cell is determined by a combined objective that balances the effects in the cell in question with those in its k-nearest neighbors (based on similarity of their RNA profiles; here, using k=10;
The output of Compass is a quantitative profile for the metabolic state of every cell, which is then subject to downstream analyses (
Th17 Cell Metabolic Diversity Reflects a Balance Between Glycolysis and Fatty Acid Oxidation, which is Associated with Pathogenicity
To demonstrate Compass, Applicants applied it to scRNA-Seq data from Th17 cells, differentiated in vitro from naïve CD4+ T cells into two extreme functional states (Ghoreschi et al. 2010; Lee et al. 2012) (
To investigate the main determinants of Th17 cell-to-cell metabolic heterogeneity, Applicants first analyzed the Compass output as a high dimensional representation of the cells which parallels the one produced by scRNA-Seq, but with features corresponding to metabolic meta-reactions rather than transcripts. Applicants performed principal component analysis (PCA) on the meta-reaction matrix, while restricting it to 784 meta-reactions (out of 1,911) associated with core metabolism (STAR Methods) that span conserved and well-studied pathways for generation of ATP and synthesis of key biomolecules.
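The meta-reaction features used here are groups of reactions whose Compass scores are highly correlated across cells (Spearman rho>0.98, as stated in the methods). One way such groups could be formed is sketched below. The grouping procedure itself is an assumption, since the text gives only the correlation threshold: the sketch uses complete-linkage hierarchical clustering on the distance 1 − rho, which guarantees that every pair of reactions in a group meets the threshold. It expects at least three reactions and no constant columns.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr


def meta_reaction_labels(scores, rho_threshold=0.98):
    """Assign each reaction to a meta-reaction.

    scores: cells x reactions array of Compass scores (>= 3 reactions).
    Returns an integer label per reaction; equal labels share a meta-reaction.
    """
    scores = np.asarray(scores, dtype=float)

    # Reaction-by-reaction Spearman correlation matrix across all cells.
    rho, _ = spearmanr(scores)

    # Convert to a symmetric distance with an exact zero diagonal.
    dist = np.clip(1.0 - rho, 0.0, None)
    dist = (dist + dist.T) / 2.0
    np.fill_diagonal(dist, 0.0)

    # Complete linkage: a cluster is cut only if all pairs are within range.
    tree = linkage(squareform(dist, checks=False), method="complete")
    return fcluster(tree, t=1.0 - rho_threshold, criterion="distance")
```

The resulting labels can be used to collapse the reaction matrix before PCA; how member reactions are combined into one value per meta-reaction is left open here.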
The first two principal components (PCs) of the core metabolism subspace were associated both with overall metabolic activity and T effector functions (
To directly search for metabolic targets that are associated with the pathogenic capacity of individual Th17 cells, Applicants searched for biochemical reactions with differential predicted activity between the Th17p and Th17n conditions (according to Wilcoxon's rank sum p value and Cohen's d effect size statistics) and defined pro-pathogenic and pro-regulatory reactions as ones that were significantly different in the Th17p or Th17n direction, respectively (
Metabolic reactions in both primary and ancillary pathways were associated with Th17 cell pathogenicity (1,213 or 3,362 reactions out of 6,563 reactions, Benjamini-Hochberg (BH) adjusted Wilcoxon p<0.001 or 0.1, respectively). Many of these reactions are also significantly correlated with the expression of signature genes for Th17 functional activity, which code cytokines and transcription factors (Lee et al. 2012) (
Compass highlighted distinctions in central carbon and fatty acid metabolism between the Th17p and Th17n states, which mirror those found between Th17 and Foxp3+ T regulatory (Treg) cells. In central carbon metabolism, Compass predicted that glycolytic reactions, ending with the conversion of pyruvate to lactate, are generally more active in the pro-inflammatory Th17p than in the Th17n state (
In fatty acid metabolism, Compass predicted that cytosolic acetyl-CoA carboxylase (ACC1), the committed step towards fatty acid synthesis, is upregulated in Th17p, whereas the first two steps of long-chain fatty acid oxidation (long chain fatty acyl-CoA synthetase and carnitine O-palmitoyltransferase (CPT)) were predicted to be significantly higher in Th17n. These predictions mirror a known metabolic difference between the Th17 and Treg lineages, where Th17 cells rely more on de novo fatty acid synthesis (Berod et al. 2014), whereas Tregs scavenge fatty acids from their environment and catabolize them to produce ATP through beta-oxidation (Michalek et al. 2011). Applicants note, however, that recent evidence suggests that CPT may be upregulated in Treg over Th17, but is not functionally indispensable for Treg cells to obtain their effector phenotypes (Raud et al. 2018).
Among ancillary metabolic pathways, Compass highlighted multiple reactions of amino-acid metabolism that are differentially active between Th17p and Th17n cells (
Applicants validated the Compass prediction that pathogenic and non-pathogenic Th17 functional states differ in their central carbon metabolism (
First, Applicants compared glycolysis and mitochondrial function of Th17p and Th17n cells. A Seahorse assay (which involves culturing cells with glucose-rich media) confirmed that Th17p cells caused significantly higher extracellular acidification (ECAR) than Th17n, indicating accumulation of lactic acid due to aerobic glycolysis (
Next, Applicants directly measured metabolites within the glycolysis pathway and TCA cycle using LC/MS based metabolomics. When pulsed with fresh media containing glucose (and rested for 15 minutes), there is a substantial increase in glycolytic metabolites in Th17p but less so in Th17n cells (
Interestingly, Compass predicted that two parts of the TCA cycle, but not the cycle as a whole, were upregulated in Th17p: the conversion of citrate to isocitrate and of alpha-ketoglutarate to succinate (mirroring previous findings in macrophages, see above and (Jha et al. 2015; E. L. Mills et al. 2016)). LC/MS Metabolomics analysis of cells at steady state revealed that TCA metabolites were generally more abundant in Th17p than in Th17n, apart from succinate (
To test whether not only absolute metabolite levels, but also the relative allocation of carbon into its possible fates differ between Th17p and Th17n cells, Applicants performed a carbon tracing assay with 13C-glucose. Applicants augmented fresh media with 13C-labeled glucose and computed the ratio of the 13C isotope out of the total carbon for each metabolite. Consistent with the predictions, Th17p had significantly higher relative abundance of 13C-labeled glycolytic metabolites than Th17n (
Applicants next validated that Th17n cells prefer beta oxidation as predicted by Compass. Metabolomics analysis shows that Th17n cells were enriched in acyl-carnitine metabolites (
Pyruvate dehydrogenase (PDH) is a critical metabolic juncture through which glycolysis-derived pyruvate enters the TCA cycle (
To determine whether increased glycolysis, regulated by PDH enzymes, in Th17p cells is important for their global metabolic phenotype, Applicants used PDK4−/− mice for perturbation. Despite the low expression of PDK4 mRNA in Th17 cells (
To further determine the global transcriptional and metabolic changes induced by PDK4 perturbation, Applicants profiled 146 WT and 132 PDK4−/− Th17p cells and 236 WT and 307 PDK4−/− Th17n cells by scRNA-seq using SMART-Seq2. Consistent with the size of the effects on lactate secretion (
Seeing that PDK4 deficiency had partially shifted Th17p central carbon metabolism towards the Th17n state in vitro, Applicants next tested the corresponding effects in vivo. To this end, Applicants studied the impact of PDK4 deficiency on the development of EAE, an autoimmune disease induced by pathogenic Th17 cells. Consistent with previous studies showing that glycolysis promotes inflammation (Gerriets et al. 2015; Gemta et al. 2019; Beckermann, Dudzinski, and Rathmell 2017; Rhoads, Major, and Rathmell 2017), mice with global knockout of PDK4 developed less severe disease as determined by the clinical disease scores (
Thus far, the analysis relied on an inter-population comparison between the extreme states of Th17n and Th17p cells. However, Applicants have previously shown that there is also considerable continuous variation in the transcriptomes of Th17n cells, which spans into pathogenic-like states (Gaublomme et al. 2015). To explore the relationship between metabolic heterogeneity and pathogenic potential within the Th17n subset, Applicants next performed an intra-population analysis of Th17n cells. This also demonstrates that Compass can be applied to scRNA-Seq data in cases where the states of interest (e.g., Th17n vs. Th17p) are either unknown or cannot be experimentally partitioned into discrete types. To perform an intra-population Compass analysis of single Th17n cells, Applicants correlated the Compass scores associated with each reaction with the pathogenicity gene signature scores of the respective cells (STAR Methods).
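The intra-population analysis amounts to correlating, across cells, each reaction's Compass score with the cells' pathogenicity signature score. A minimal sketch follows. The choice of Spearman correlation is an assumption (the passage says only that the scores were correlated), and the function name and return format are hypothetical.

```python
import numpy as np
from scipy.stats import spearmanr


def correlate_reactions_with_signature(compass_scores, signature_scores):
    """Rank reactions by correlation with a per-cell signature score.

    compass_scores:   cells x reactions array of Compass scores.
    signature_scores: one pathogenicity score per cell.
    Returns (reaction_index, rho, p_value) tuples, most positive rho first.
    """
    compass_scores = np.asarray(compass_scores, dtype=float)
    signature_scores = np.asarray(signature_scores, dtype=float)
    if compass_scores.shape[0] != signature_scores.shape[0]:
        raise ValueError("one signature score is required per cell")

    results = []
    for j in range(compass_scores.shape[1]):
        # Rank-based correlation (assumed choice of correlation measure).
        rho, p_value = spearmanr(compass_scores[:, j], signature_scores)
        results.append((j, float(rho), float(p_value)))

    return sorted(results, key=lambda item: item[1], reverse=True)
```

Reactions at the top of the ranking are candidate pro-pathogenic reactions within the population, and those at the bottom are candidate pro-regulatory reactions; p-values would still need multiple-testing correction.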
While the resulting correlations of individual reactions with the pathogenicity score were largely consistent with the results of the inter-population analysis (Th17p vs. Th17n) (
To functionally validate the glycolytic targets associated with Th17 cell pathogenicity by the intra-population analysis, Applicants used chemical inhibitors against enzymes driving the top two glycolytic reactions that were most positively correlated (regulated by pyruvate kinase muscle isozyme [PKM], and glucose-6-phosphate dehydrogenase [G6PD]) and top two that were most negatively correlated (phosphoglycerate mutase [PGAM], and glucokinase [GK]) with the pathogenicity score (
Applicants first analyzed the effects of inhibitors on Th17n and Th17p cell differentiation and function using flow cytometry (
To analyze the impact of perturbing glycolytic enzymes on the transcriptome, Applicants used bulk RNA-Seq to profile Th17n and Th17p cells grown in the presence of either the predicted pro-regulatory inhibitor DHEA (inhibiting G6PD) or the predicted pro-inflammatory inhibitor EGCG (inhibiting PGAM) (
To better interpret the drug-induced transcriptional changes, Applicants examined individual genes whose expression is associated with either Th17n or Th17p effector function as well as global transcriptomic shifts (
To verify that the effect of EGCG was mediated by a specific inhibition of PGAM (rather than an off-target effect) Applicants conducted a carbon tracing assay in which the cell's medium was supplemented with 13C-glucose (STAR Methods). PGAM inhibition with EGCG led to a sharp decrease (from 51% 13C ratio to 7% in Th17n and from 55% to 33% in Th17p) in 13C contents of 2PG (PGAM's product) but not 3PG (PGAM's substrate) or any other glycolytic metabolite that Applicants were able to measure (
As the serine biosynthesis pathway is more active in Th17p than in Th17n (
Taken together, an intra-population Compass analysis predicted that within the Th17n compartment, the glycolytic PGAM reaction inhibits, rather than promotes, pathogenicity. This prediction relied on heterogeneity within the Th17n population, yielding results that are contrary to those from inter-population comparisons of Th17 to Treg or Th17p to Th17n. EGCG specifically inhibited this reaction, and promoted a transcriptional state indicative of a more pro-inflammatory potential, as evidenced by a global shift in the transcriptome toward a Th17p-like profile. RNA-Seq further supported the hypothesis that EGCG mediates its effects by altering the cellular metabolic profile.
To test the functional relevance of the transcriptome shifts induced by EGCG and DHEA in vivo, Applicants used the adoptive T cell transfer system, so that the effect of inhibitors is limited to T cells rather than all cells in the host. Applicants generated Th17n and Th17p cells from naive CD4+ T cells isolated from 2D2 TCR-transgenic mice, with specificity for MOG 35-55, and transferred them into wildtype mice to induce EAE.
Consistent with Compass prediction, Th17p cells treated with DHEA reduced the severity of disease at peak of EAE in the recipients (
In conclusion, Compass correctly predicted metabolic targets including glycolytic pathways whose deletion affected Th17 function. Importantly, it was able to pinpoint a glycolytic reaction that suppresses Th17 pathogenicity, which runs contrary to the current understanding that aerobic glycolysis as a whole is associated with a pro-inflammatory phenotype in Th17 cells.
Applicants presented Compass—a flux balance analysis (FBA) algorithm for the study of metabolic heterogeneity among cells based on single-cell transcriptome profiles and validated a number of predictions by metabolome and functional analyses. Compass successfully predicted metabolic targets in both central and ancillary pathways based on its network approach. These results support the power of transcriptomic-based FBA to make valid predictions in a mammalian system.
Glycolysis is a central regulator of T cell function. Compass predicted an association between aerobic glycolysis and Th17 pathogenicity, which accords with multiple previous results tying elevated glycolysis with T cell inflammatory functions. However, a Compass-based data-driven analysis based on scRNA-Seq unexpectedly revealed that not all glycolytic reactions promote the pro-inflammatory phenotype in Th17 cells. This result was obtained via an intra-population analysis of individual cells. It serves as a further example of the power of studying single-cell heterogeneity within seemingly homogeneous populations (here, Th17n), which allowed Applicants to identify a novel regulator that would have otherwise been missed at a population level (here, a comparison of Th17p and Th17n). This result further demonstrates that despite the common assumption that glycolysis promotes inflammatory functions in Th17 cells and other immune compartments (E. L. Pearce et al. 2013; E. J. Pearce and Everts 2015; MacIver, Michalek, and Rathmell 2013; O'Neill and Pearce 2016), a more nuanced view is in order (Van den Bossche, O'Neill, and Menon 2017; Newton, Priyadharshini, and Turka 2016).
Static FBA algorithms assume that the system under consideration operates in chemical steady state (Varma and Palsson 1994). Even under this assumption, there remains an infinite number of feasible flux distributions that satisfy the preset biochemical constraints. Therefore, most studies assume that the system (here, a cell) aims to optimize some metabolic function, usually production of biomass or ATP (Damiani et al. 2019). However, whereas such objectives may successfully predict phenotypes of a unicellular organism (Lewis et al. 2010), they are ill-suited for studying mammalian cells (Adler et al. 2019). To overcome this challenge, rather than optimizing a single metabolic objective function, Compass optimizes a set of objective functions, each estimating the degree to which a cell's transcriptome supports carrying the maximal theoretical flux through a given reaction. The result is a high dimensional representation of the cell's metabolic potential (one number per reaction). A biological signal (e.g., differences in reaction potential) can be detected in this high-dimensional space owing to the statistical power afforded by the large number of cells in a typical scRNA-Seq dataset. Nonetheless, there is no inherent limitation preventing one from applying Compass to study bulk (i.e., non-single-cell) transcriptomic data.
The database of metabolic reactions Applicants used pertains to human cells, and as such the study does not address differences between human and mouse metabolism. In addition, the database provides a global view of the metabolic capabilities of a human cell, accrued from various sources and in diverse cell types. Not all reactions may be functional in a studied cell type, or under particular physiological conditions. This concern can be addressed to some extent by procedures for deriving organ-specific metabolic models (Opdam et al. 2017). Moreover, the metabolic state of a cell depends on the nutrients available in its environment, which are often poorly characterized. Here, the computations assume an environment rich with nutrients, which accords with the studied in vitro growth media. Modifying this to better represent physiological conditions should increase the algorithm's predictive capabilities, especially for cells derived in vivo, where nutrient scarcity may be a limiting factor, and nutrient availability may vary between tissues.
One of the outstanding challenges in the field of single cell genomics is translating the vast data sets presented in cell atlases into an actionable knowledge resource, i.e. using observed cell states to deduce molecular mechanisms and targets (Tanay and Regev 2017). Compass was designed with this challenge in mind, and addresses it in the metabolic cellular subsystem, which can be tractably modeled in silico. In light of the wide appreciation of cellular metabolism as a critical regulator of physiological processes in health and disease, Applicants expect Compass to be useful in predicting cell metabolic states, as well as actionable metabolic targets, in diverse physiological and pathologic contexts.
Naive CD4+CD44-CD62L+CD25− T cells were sorted using a BD FACSAria sorter and activated at a concentration of 0.5×10^6 cells/ml with plate-bound anti-CD3 and anti-CD28 antibodies (both at 1 mg/ml) in the presence of cytokines. For Th17 differentiation, 2 ng/ml of rhTGFb1, 25 ng/ml rmIL-6, 20 ng/ml rmIL-1b (all from Miltenyi Biotec) and 20 ng/ml rmIL-23 (R&D Systems) were used in various combinations as specified in the figures. For differentiation experiments, cells were harvested at 68 hours for RNA analysis and at 72-96 hours for flow cytometry analysis and the Seahorse assay.
The Seahorse assay was performed and Seahorse media was prepared following the manufacturer's instructions (Agilent). Cells were re-stimulated with PMA/ionomycin for four hours in the presence of brefeldin A and monensin before analysis for cytokines by intracellular cytokine staining. Cytokine concentrations in supernatants of in vitro cultures were measured with the LegendPlex Mouse Th Cytokine Panel (13-plex) (BioLegend) according to the manufacturer's instructions and analyzed on a FACS LSR II (BD Biosciences).
Full experimental details are given in (Gaublomme et al., 2015). Briefly, Applicants sequenced CD4+ naive T cells 48 hrs post polarization under one of these conditions, ultimately retaining after quality tests 130 Th17n cells unsorted for IL-17 (denoted Th17nu in the present manuscript), 151 IL-17A/GFP+ Th17n cells, and 139 IL-17A/GFP+ Th17p cells. Unlike (Gaublomme et al., 2015), in the present study Applicants analyzed the unsorted and sorted cells independently from one another. The sorted cells (Th17n and Th17p) were used for the inter-population analysis, and the unsorted cells (Th17nu) were used for the intra-population analysis.
1.3 Estimation of Transcript Abundance from RNA Sequencing
Applicants aligned single-cell SMART-Seq libraries with Bowtie2, quantified TPM gene expression with RSEM, and performed QC as Applicants described in detail in a previous publication (Fletcher et al., 2017). This computational pipeline is a substantially revised and updated version of the one originally used to analyze these libraries (Gaublomme et al., 2015). Batch effects and other nuisance factors were normalized with a model chosen empirically with SCONE (Cole et al., 2019). Bulk RNA-Seq libraries were processed with a modified variant of the same pipeline.
For the Smart-Seq libraries, due to the absence of UMIs in the dataset, differentially expressed genes were called through a linear model fitted to TPM values with the limma R package and with a mean-variance trend added to the empirical Bayes prior (Ritchie et al., 2015). For the bulk RNA libraries, differentially expressed genes were called with limma-trend or limma-voom (Law et al., 2014) depending on the variance of library sizes, as recommended in the limma package manual (Smyth, 2019).
All chemical inhibitors were purchased from Sigma, with the exception of EGCG (Selleck Chemicals), and tested over a wide range of doses (20 nM-200 μM) on Th17 cells. The lowest dose that resulted in minimal impact on cell viability was used for functional evaluation: EGCG, the inhibitor for PGAM1 (Li et al., 2017), was used at 20-50 μM; DHEA, inhibitor for G6PD (Schwartz and Pashko, 2004), was used at 50 μM; DCA, inhibitor for GK (Westergaard et al., 1998, Tisdale and Threadgill, 1984), was used at 40 μM; Shikonin, inhibitor for PKM2 (Zhao et al., 2018, Chen et al., 2011), was used at 10 μM; and PKUMDL-WQ-2101, inhibitor for PHGDH (Wang et al., 2017), was used at 12.5 μM.
C57BL/6 wildtype (WT) and PDK4−/− mice were obtained from Jackson Laboratory (Bar Harbor, Me.). WT 2D2 transgenic mice were bred in house. All experiments were performed in accordance with the guidelines outlined by the Harvard Medical Area Standing Committee on Animals at the Harvard Medical School (Boston, Mass.).
For adoptive transfer EAE, naive T cells (CD4+CD44-CD62L+CD25-) were isolated from 2D2 TCR-transgenic mice and activated with anti-CD3 (1 mg/ml) and anti-CD28 (1 mg/ml) in the presence of differentiation cytokines for 68h. Cells were rested for 2 days and restimulated with plate-bound anti-CD3 (0.5 mg/ml for the pathogenic condition; 1 mg/ml for the non-pathogenic condition) and anti-CD28 (1 mg/ml) for 2 days prior to transfer. Equal numbers of cells (2 to 8 million) were transferred per mouse intravenously. EAE was scored as previously published (Jager et al., 2009).
For untargeted metabolomics, Th17 cells were differentiated as described. Culture media were snap frozen. Cells were harvested at 96h. 1×10^6 cells per sample were snap frozen and extracted in either 80% methanol (for fatty acids and oxylipids) or isopropanol (for polar and nonpolar lipids). Two liquid chromatography tandem mass spectrometry (LC-MS) methods were used to measure fatty acids and lipids in cell extracts.
For carbon tracing experiments Th17 cells were differentiated as described. Thereafter, cells were washed and cultured in media supplemented with 8 mM [U-13C]-glucose for 15 min or 3 hrs.
Differentially abundant metabolites were found with Student's t-test and a significance threshold of BH-adjusted p<0.1.
To find metabolites with differential 13C relative abundance, Applicants computed the ratio yi,j of 13C out of the total carbon content for each metabolite i in sample j. Let |Ci| be the number of carbon atoms in metabolite i, and let xc,i,j be the measured signal of metabolite i in sample j (subsequent to all normalization and QC procedures) in which there are exactly c 13C atoms. Applicants define the 13C/C ratio:
$$y_{i,j} = \frac{\sum_{c=0}^{|C_i|} c \, x_{c,i,j}}{|C_i| \, \sum_{c=0}^{|C_i|} x_{c,i,j}}$$
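The ratio described above can be illustrated with a minimal Python sketch. It assumes that the signals of one metabolite in one sample are supplied as a list indexed by the number of 13C atoms; the function name `c13_fraction` is hypothetical and is not part of the original pipeline.

```python
def c13_fraction(isotopologue_signal):
    """Fraction of carbon atoms that are 13C for one metabolite in one sample.

    isotopologue_signal[c] is the measured signal of the species carrying
    exactly c 13C atoms, for c = 0 .. |C_i| (assumed input layout).
    """
    n_carbons = len(isotopologue_signal) - 1
    total_signal = sum(isotopologue_signal)
    if n_carbons <= 0 or total_signal <= 0:
        raise ValueError("need at least one carbon and a positive total signal")
    # Signal-weighted count of labeled carbons.
    labeled = sum(c * x for c, x in enumerate(isotopologue_signal))
    # Divide by the total carbon content (all carbons in all molecules).
    return labeled / (n_carbons * total_signal)


# A three-carbon metabolite that is unlabeled, fully labeled, or half-and-half.
unlabeled = c13_fraction([5.0, 0.0, 0.0, 0.0])
fully_labeled = c13_fraction([0.0, 0.0, 0.0, 5.0])
mixed = c13_fraction([1.0, 0.0, 1.0])
```

The ratio is 0 for an unlabeled pool and 1 for a fully labeled pool, which makes samples with different total signal directly comparable.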
In this application, Applicants define core metabolism based on reaction metadata included in the Recon2 database. Recon2 assigns each reaction a confidence score based on the level of evidence supporting it, between 1 (no evidence) and 4 (biochemical evidence), with 0 denoting reactions whose confidence was not evaluated. Since pathways generally considered part of primary metabolism are also the best studied ones, Applicants define a reaction as belonging to core metabolism if (a) its Recon2 confidence is either 0 or 4; and (b) it is annotated with an EC (Enzyme Commission) number. Applicants chose to label reactions with unevaluated confidence (i.e., Recon2 confidence score of 0) as part of core metabolism because some of them were found to be key reactions in primary metabolic pathways based on manual correction. The definition of core metabolism is equivalent to taking the set of all metabolic reactions in Recon2, but excluding reactions that either do not have an annotated EC number or for which the Recon2 curators explicitly specified they do not have direct biochemical support. Applicants define a meta-reaction as belonging to core metabolism if it contains at least one core reaction. Metabolic genes are defined as the set of genes annotated in Recon2 (Section 4.7).
2.2 Inter-Population Analysis: Finding Reactions with Differential Potential Activity
To test for differential potential activity of reactions based on Compass predictions, Applicants applied, for each meta-reaction M, the Wilcoxon rank-sum test to the Compass scores of M in the two populations of interest (here, Th17p and Th17n). Effect sizes were further assessed with Cohen's d statistic, defined as the difference between the sample means over the pooled sample standard deviation. Let n1, x̄1, s1 be the number of observations in population 1, and the sample mean and standard deviation of their scores in a given meta-reaction, respectively (with a similar notation for population 2). Then
$$d = \frac{\bar{x}_1 - \bar{x}_2}{s}, \qquad \text{with} \qquad s = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}$$
The resulting p values are adjusted with the Benjamini-Hochberg (BH) method. Note that so far, the computation was done for meta-reactions. Applicants assigned all reactions r∈M the Cohen's d and Wilcoxon's p value that were computed for M. Applicants call a reaction differentially active if its adjusted p is smaller than 0.1. The computation was done on all reactions in the network (namely, both core and non-core reactions).
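The effect-size and multiple-testing steps of this analysis can be sketched in plain Python. This is an illustration under stated assumptions rather than the implementation used for the reported results: the pooled standard deviation uses the common n1+n2−2 form, the rank-sum p values are assumed to have been computed elsewhere, and the names `cohens_d` and `bh_adjust` are hypothetical.

```python
import math
from statistics import mean, stdev


def cohens_d(scores_1, scores_2):
    """Difference of sample means over the pooled sample standard deviation."""
    n1, n2 = len(scores_1), len(scores_2)
    s1, s2 = stdev(scores_1), stdev(scores_2)
    pooled = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(scores_1) - mean(scores_2)) / pooled


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy Compass scores of one meta-reaction in two populations.
d = cohens_d([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
# Toy per-meta-reaction p values (assumed to come from a rank-sum test).
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.5])
differentially_active = [p < 0.1 for p in adjusted]
```

Every reaction in a meta-reaction would then inherit the d and adjusted p value of that meta-reaction.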
Applicants used a transcriptomic signature that Applicants have previously shown to capture a Th17 cell's pathogenic capacity (Gaublomme et al., 2015, Wang et al., 2015). Briefly, for each cell Applicants computed the average z-scored expression (log(1+TPM)) of pro-pathogenic markers (CASP1, CCL3, CCL4, CCL5, CSF2, CXCL3, GZMB, ICOS, IL22, IL7R, LAG3, LGALS3, LRMP, STAT4, TBX21) and pro-regulatory markers (AHR, IKZF3, IL10, IL1RN, IL6ST, IL9, MAF), with the latter group multiplied by −1.
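A signature score of this kind can be sketched as follows. The sketch assumes that z-scores are taken per gene across cells using the population standard deviation and that genes with constant expression are skipped; neither choice is stated in the text, and the function name `signature_scores` is hypothetical.

```python
import math
from statistics import mean, pstdev


def signature_scores(tpm, positive_genes, negative_genes=()):
    """Per-cell average of z-scored log(1+TPM), with negative genes sign-flipped.

    tpm maps a gene name to a list of TPM values, one entry per cell.
    """
    n_cells = len(next(iter(tpm.values())))
    signed_z = []
    for genes, sign in ((positive_genes, 1.0), (negative_genes, -1.0)):
        for gene in genes:
            logged = [math.log1p(x) for x in tpm[gene]]
            mu, sd = mean(logged), pstdev(logged)
            if sd == 0:
                continue  # assumption: a constant gene is uninformative, skip it
            signed_z.append([sign * (v - mu) / sd for v in logged])
    if not signed_z:
        raise ValueError("no informative signature genes")
    return [mean(z[j] for z in signed_z) for j in range(n_cells)]


# Two toy cells: the second expresses the pro-pathogenic marker only.
toy_tpm = {"GZMB": [0.0, 3.0], "IL10": [3.0, 0.0]}
scores = signature_scores(toy_tpm, ["GZMB"], ["IL10"])
```

In this toy input the second cell receives the higher score, since it expresses the positive marker and lacks the negative one.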
A compendium of T cell state transcriptomic signatures was described in (Gaublomme et al., 2015). Every signature consists of two gene subsets: a set of positively associated genes and an optionally empty set of negatively associated genes. A scalar signature value is computed for every cell based on its transcriptome profile as described above for pathogenicity. Signatures that are based on KEGG (Kanehisa et al., 2017) pathways or similar resources are constructed by defining the set of positively-associated genes as the ones belonging to the pathway and defining the set of negatively-associated genes as an empty set.
Applicants defined the total metabolic activity of a cell as the sum expression of metabolic enzyme coding genes over the sum expression of all protein coding genes in log-scale TPM (transcripts per million) units. Applicants computed the partial correlation between this quantity and cell PC1 coordinates, while controlling for the sum expression of all protein coding genes in the cells (the aforementioned divisor) to verify the correlation does not arise from the ratio of protein-coding to non-protein coding RNA in the RNA libraries. The correlation was more significant when not controlling for the covariate (Pearson rho=0.56, p<3·10^−16).
Applicants defined a transcriptomic signature for late-stage differentiation of Th17 cells based on microarray data from (Yosef et al., 2013). Applicants assigned microarrays to three differentiation stages as described in that paper, namely early (up to 4h), intermediate (6-16h) and late (20-72h), and fitted a linear model for the discrete 3-level stage covariate with the limma R package. Applicants called differentially expressed genes (BH-adjusted p<0.05 and log2 fold-change ≥3) and used them to define a transcriptomic signature as described above.
computed genes differentially expressed between the late
2.4 Intra-Population Analysis: Correlation with a Quantitative Cellular Trait
Here, the population was Th17n cells, in one of two biological replicates, and the quantitative trait was a transcriptomic pathogenicity signature. For every meta-reaction M, Applicants computed the Spearman correlation between its Compass scores and the pathogenicity scores across all cells. Applicants assigned all reactions r∈M the Spearman correlation and its statistical significance that were computed for M. Note that the division of reactions into meta-reactions is dataset-specific and therefore a reaction can belong to different meta-reactions in each of the replicates. So far, the computation was done independently for the two biological replicates. Applicants then computed for each reaction r the Fisher combined p value of the two p values corresponding to the statistical significance of its Spearman correlation with the pathogenicity scores in the two replicates. The combined Fisher p values were adjusted with the Benjamini-Hochberg (BH) method. Applicants call a reaction significantly correlated (or anti-correlated) with the pathogenicity score if its adjusted Fisher combined p is smaller than 0.1. The search space was limited to core reactions.
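The step that combines the two replicates can be sketched with the closed form of Fisher's method for exactly two p values. The sketch assumes the two p values are independent, uses the fact that the chi-square survival function with 4 degrees of freedom is exp(−X/2)(1 + X/2), and uses toy inputs; the function name `fisher_combined_p` is hypothetical.

```python
import math


def fisher_combined_p(p1, p2):
    """Fisher's combined p value for two independent p values.

    The statistic X = -2 * (ln p1 + ln p2) follows a chi-square distribution
    with 4 degrees of freedom under the null hypothesis.
    """
    for p in (p1, p2):
        if not 0.0 < p <= 1.0:
            raise ValueError("p values must lie in (0, 1]")
    half_stat = -(math.log(p1) + math.log(p2))  # equals X / 2
    return math.exp(-half_stat) * (1.0 + half_stat)


# Toy per-reaction p values of the Spearman correlation in each replicate.
replicate_1 = {"reaction_a": 0.01, "reaction_b": 0.60}
replicate_2 = {"reaction_a": 0.03, "reaction_b": 0.45}
combined = {
    r: fisher_combined_p(replicate_1[r], replicate_2[r]) for r in replicate_1
}
```

The combined values would then be BH-adjusted across reactions and compared with the 0.1 threshold.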
Applicants manually curated the significant predictions of the central carbon metabolism pathways discussed in the manuscript (glycolysis, TCA cycle, and fatty acid synthesis/oxidation). Recon2 takes account of metabolite localization, and reactions may be functional in more than one cellular compartment. For every reaction, Applicants picked the prediction corresponding to the pertinent cellular compartment (here, cytosol or mitochondria, as shown in
Compass is available at github.com/YosefLab/Compass. The algorithm is highly parallelizable. It currently supports execution on multiple threads in a single machine, submission to a Torque queue, and execution on a single machine on Amazon Web Services (AWS). The current implementation relies on the IBM ILOG CPLEX Optimization Studio, which is free for academic use.
The main input is a gene expression matrix G in which rows correspond to genes and columns to RNA libraries. Applicants assume that G is (i) already normalized to remove batch and other nuisance effects; (ii) scaled to CPMs or TPMs (in the present application Applicants used TPMs); and (iii) in linear (i.e., not log) scale.
The current manuscript presents the algorithm in the context of single cells, where Compass leverages the statistical power afforded by the large number of observations (cells). Nevertheless, there is no inherent limitation preventing one from applying Compass to study bulk (i.e., non-single-cell) transcriptomic data. In this case, Applicants recommend disabling the information-sharing feature by setting λ=0 in Algorithm 2. There is also no limitation preventing one from applying Compass to non-RNA-Seq transcriptomic data, such as microarrays.
For prohibitively large datasets, the number of cells (observations) can be reduced by partitioning the cells into small clusters and treating the average of each cluster as an observation in downstream analysis. Two implementations of this approach are micropools (DeTomaso et al., 2019), implemented in the VISION R package (github.com/YosefLab/VISION), and meta-cells (Baran et al., 2019) (tanaylab.github.io/metacell). No pooling was necessary for the analysis presented in this manuscript (i.e., the results are on a single cell level). If cell clusters are large enough, one may choose to skip the information-sharing procedure, which is equivalent to setting the parameter λ=0 in Algorithm 2.
In addition, the number of reactions in the GSMM can be reduced as well by not executing Algorithm 2 on blocked reactions (Section 4.5), non-core reactions (Section 2.1), or reactions outside a predetermined set of metabolic pathways that are of interest. Applicants note that Applicants do not suggest excluding non-blocked reactions from the network altogether (which would result in neglecting their effects on reactions that are of interest), but rather only excluding them from the R(G) matrices in Algorithm 2.
Applicants used the Recon2 GSMM (Thiele et al., 2013), which Applicants transformed into a unidirectional network by replacing bidirectional reactions with the respective pair of unidirectional reactions. Consequently, flux values are always nonnegative.
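The conversion into a unidirectional network can be illustrated with a small sketch. It assumes that each reaction is stored as a stoichiometry dictionary with lower and upper flux bounds; the `_fwd` and `_rev` suffixes and the function name `to_unidirectional` are hypothetical conventions chosen for this example, not names from Recon2 or Compass.

```python
def to_unidirectional(reactions):
    """Split reactions so that every flux is nonnegative.

    reactions maps an id to (stoichiometry, lower_bound, upper_bound), where
    stoichiometry maps metabolite -> coefficient (negative = consumed).
    """
    unidirectional = {}
    for rid, (stoich, lower, upper) in reactions.items():
        if upper > 0:
            # Forward part keeps the stoichiometry; bounds clipped at zero.
            unidirectional[rid + "_fwd"] = (dict(stoich), max(lower, 0.0), upper)
        if lower < 0:
            # Reverse part flips the stoichiometry and mirrors the bounds.
            flipped = {met: -coef for met, coef in stoich.items()}
            unidirectional[rid + "_rev"] = (flipped, max(-upper, 0.0), -lower)
    return unidirectional


toy_network = {
    "R1": ({"A": -1.0, "B": 1.0}, -10.0, 10.0),  # reversible A <-> B
    "R2": ({"B": -1.0}, 0.0, 5.0),               # irreversible secretion of B
}
split = to_unidirectional(toy_network)
```

A reversible reaction yields two entries whose net flux is the forward flux minus the reverse flux, while an irreversible reaction yields one.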
Throughout this application, metabolic genes are defined as the set of genes annotated in Recon2. Note that Compass uses only the expression of metabolic genes and ignores other transcripts.
The results of flux balance analysis depend significantly on the nutrients made available to the GSMM, referred to as the in silico growth medium. Since the exact medium composition is mostly unknown even for common in vitro protocols and in vivo models, Applicants chose a rich in silico medium in which all nutrients for which a transporter exists are made available in unlimited quantity.
In the following sections Applicants denote:
For a given GSMM (here, Recon2), Applicants run once a preparatory step that does not depend on transcriptome data and cache the results (Algorithm 1).
Constraint (i) constrains the system to steady state (Varma and Palsson 1994). Constraint (ii) is interpreted as ∀i: αi≤vi≤βi and encodes directionality and capacity limits for reactions, including uptake and secretion limits. Constraint (iii) ensures that when evaluating the maximum flux for each reaction, its reverse reaction carries no flux, to avoid the creation of a futile cycle. This does not prevent futile cycles longer than 2 edges, which can be avoided only by more time-consuming computations (Schellenberger, Lewis, and Palsson 2011).
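To make the three constraints concrete, the following sketch checks whether a candidate flux vector satisfies them on a toy four-reaction network. It is only a feasibility check, not the linear program itself; the representation of the stoichiometric matrix as a list of rows and the function name `satisfies_constraints` are assumptions made for illustration.

```python
def satisfies_constraints(S, v, lower, upper, reverse_index=None, tol=1e-9):
    """Check steady state, flux bounds, and a silenced reverse reaction."""
    # (i) steady state: every metabolite row of S must balance, S . v = 0
    for row in S:
        if abs(sum(s * x for s, x in zip(row, v))) > tol:
            return False
    # (ii) bounds: alpha_i <= v_i <= beta_i for every reaction
    for x, lo, hi in zip(v, lower, upper):
        if x < lo - tol or x > hi + tol:
            return False
    # (iii) the reverse of the reaction being maximized carries no flux
    if reverse_index is not None and abs(v[reverse_index]) > tol:
        return False
    return True


# Columns: uptake of A, A -> B, B -> A (reverse of column 1), secretion of B.
S = [
    [1.0, -1.0, 1.0, 0.0],   # metabolite A
    [0.0, 1.0, -1.0, -1.0],  # metabolite B
]
lower, upper = [0.0] * 4, [10.0] * 4

clean = satisfies_constraints(S, [2.0, 2.0, 0.0, 2.0], lower, upper, reverse_index=2)
futile = satisfies_constraints(S, [2.0, 5.0, 3.0, 2.0], lower, upper, reverse_index=2)
```

The second flux vector balances both metabolites and respects the bounds, but it cycles flux between A and B, so it is rejected once the reverse reaction is required to carry no flux.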
Note that the GSMM may contain blocked reactions (those with v_r^opt=0), which can be excluded from the next steps to speed up the computation.
4.6 From Gene Expression to Reaction Expression
By reaction expression, Applicants denote a matrix {R(G)}m×n that is conceptually similar to the gene expression matrix {G}g×n. The columns are the same RNA libraries (e.g., cells) as in G, but rows represent single metabolic reactions rather than transcripts. An entry Rr,j in the matrix R(G) is a quantitative proxy for the activity of reaction r in cell j. Applicants omit the dependence on gene expression matrix and denote simply R when G is obvious from the context.
The reaction expression matrix is created by using the Boolean gene-to-reaction mapping included in the GSMM, similar to the approach taken by (Becker and Palsson, 2008, Shlomi et al., 2008). Let G={xi,j} and consider a particular reaction r in a particular cell j. If a single gene with linear-scale expression x is associated with r, then the reaction's expression will be Rr,j=log2(x+1). If no genes are associated with r then Rr,j=0.
If the reaction is associated with more than one gene, then this association is expressed as a Boolean relationship. For example, two genes which encode different subunits of a reaction's enzyme are associated using an AND relationship, as both are required to be expressed for the reaction to be catalyzed. Alternatively, if multiple enzymes can catalyze a reaction, the genes involved in each will be associated via an OR relationship. For reactions associated with multiple genes in this manner, the Boolean expression is evaluated by taking the sum or the mean of linear-scale expression values x when genes are associated via an OR or AND relationship, respectively. This way, the full gene(s)-to-reaction association is evaluated to arrive at a single summary expression value for each reaction in the GSMM.
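The evaluation of a gene-to-reaction rule can be sketched recursively. The sketch assumes that a rule is stored as a nested tuple such as ("or", ("and", "g1", "g2"), "g3"), that the log2(x+1) transform is applied once to the value of the whole rule (the text states it explicitly only for the single-gene case), and that a missing gene counts as zero expression; the function names are hypothetical.

```python
import math
from statistics import mean


def rule_value(rule, expression):
    """Linear-scale value of a Boolean gene rule: OR -> sum, AND -> mean."""
    if isinstance(rule, str):  # a single gene
        return expression.get(rule, 0.0)
    operator, *operands = rule
    values = [rule_value(operand, expression) for operand in operands]
    if operator == "or":
        return sum(values)
    if operator == "and":
        return mean(values)
    raise ValueError("unknown operator: " + str(operator))


def reaction_expression(rule, expression):
    """Reaction expression for one cell; reactions without genes get 0."""
    if rule is None:
        return 0.0
    return math.log2(rule_value(rule, expression) + 1.0)


cell = {"g1": 3.0, "g2": 1.0, "g3": 4.0}  # toy linear-scale expression
single = reaction_expression("g1", cell)
complex_rule = reaction_expression(("or", ("and", "g1", "g2"), "g3"), cell)
orphan = reaction_expression(None, cell)
```

Here the two-subunit enzyme contributes the mean of its subunits, and the isoenzyme encoded by g3 adds to it.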
To mitigate the sparseness and stochasticity of single-cell measurements, Compass allows for a degree of information-sharing between cells with similar transcriptional profiles. Given a gene expression matrix G, Applicants compute a k-nearest neighbors (kNN) graph based on Euclidean distances in a reduced dimension, obtained by taking the top 20 principal components of G. The PCA is computed over all the genes in G, not only metabolic ones.
Let R(G)={r_{i,j}} and
then R^N(G)={r^N_{i,j}} where
Compass transforms a gene expression matrix {G}g×n, where rows represent genes and columns represent RNA libraries (usually, single cells, although bulk RNA can also be used as discussed below) into a matrix {C}m×n of scores where rows represent metabolic reactions, columns are the same RNA libraries as in the gene expression, and an entry quantifies a proxy for potential reaction activity. More precisely, the entry quantifies the propensity of the cell to use that reaction.
The algorithm is summarized in Algorithm 2. First, Applicants convert the gene expression matrix Gg×n into a reaction expression matrix Rm×n, which parallels the gene expression matrix but has rows representing single metabolic reactions rather than transcripts. Applicants convert R into a penalty matrix Pm×n by point-wise inversion. Whereas R represents gene expression support that a reaction is functional in the cell, P represents the lack thereof (which will be used in a linear program below). R and P are also computed for the neighborhood of each cell to smooth results and mitigate single-cell technical noise. Then, Applicants solve a linear program for every reaction r in every cell i to find the minimal resistance of cell i to carrying maximal flux through r. Last, Applicants scale the scores, which also entails negating them such that larger scores represent larger potential activities (instead of larger penalties, hence smaller potential activity). The final scores are indicative of a cell's propensity to use a certain reaction. Applicants interpret them as a proxy for the potential activity of the reaction in that cell.
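As a small illustration of the penalty step, the sketch below inverts reaction expression point-wise and blends a cell's own penalties with those of its neighborhood. The exact form 1/(1+R), the linear blend weighted by λ, and the function names are assumptions made for illustration, because the listing of Algorithm 2 is not reproduced in this text.

```python
def penalty(reaction_expression):
    """Point-wise inversion (assumed form 1 / (1 + R)): more support, less penalty."""
    return [[1.0 / (1.0 + r) for r in row] for row in reaction_expression]


def blend(own, neighborhood, lam):
    """Mix own and neighborhood penalties (assumed linear blend; lam=0 disables sharing)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [
        [(1.0 - lam) * p + lam * q for p, q in zip(own_row, nbr_row)]
        for own_row, nbr_row in zip(own, neighborhood)
    ]


# Rows are reactions, columns are cells (toy values).
R_own = [[0.0, 1.0], [3.0, 0.0]]
R_neighborhood = [[1.0, 1.0], [1.0, 1.0]]
P_own = penalty(R_own)
P_shared = blend(P_own, penalty(R_neighborhood), lam=0.5)
```

A reaction with no expression support receives the maximal penalty of 1 under this assumed form, and information sharing pulls it toward the penalty of similar cells.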
In step 10 of Algorithm 2, a high penalty yr,c indicates that cell c is unlikely, judging by transcriptomic evidence, to use reaction r. Cells whose transcriptomes are overall more aligned with an ability to carry flux through a reaction will be assigned a lower penalty yr,c. With regard to the correctness of the step, recall that the GSMM is unidirectional and therefore ∀i: vi≥0.
Rows in the Craw matrix that correspond to reactions that are topologically close in the metabolic network can be highly correlated. Applicants therefore hierarchically cluster Craw rows by Spearman distance. Applicants call the resulting clusters meta-reactions and each represents a set of closely correlated metabolic reactions. Note that the division into meta-reactions is data-driven and does not rely on canonical metabolic pathway definitions. Therefore, the division is dataset-dependent—for example, two reactions might be closely correlated and clustered in the same meta-reaction in one cell type, but not in another.
After computing the hierarchical clusters over rows of Craw, Applicants merged leaves whose Spearman similarity met a threshold (the clustering distance being 1−ρ, with ρ being the Spearman correlation) by averaging the respective rows. In the present application, Applicants used the threshold ρ=0.98. Applicants denote the row-merged matrix {Cmeta-raw}m′×n.
By definition, all entries in Cmeta-raw are non-negative. Applicants scale it in Algorithm 3 (the min in the second step denotes matrix-wide minimal entry)
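The scaling step can be sketched as a negating transform followed by a shift by the matrix-wide minimum. The listing of Algorithm 3 is not reproduced in this text, so the specific first step used here, a negative log(1+x), is an assumption; only the negation and the matrix-wide minimum are taken from the description. The function name `scale_scores` is hypothetical.

```python
import math


def scale_scores(c_meta_raw):
    """Turn nonnegative penalties into nonnegative scores (larger = more active)."""
    # Step 1 (assumed form): negate on a log scale so high penalties become low scores.
    negated = [[-math.log1p(x) for x in row] for row in c_meta_raw]
    # Step 2: subtract the matrix-wide minimal entry so that all scores are >= 0.
    lowest = min(min(row) for row in negated)
    return [[x - lowest for x in row] for row in negated]


# Rows are meta-reactions, columns are cells (toy penalties).
toy_penalties = [
    [0.0, math.e - 1.0],
    [math.e ** 2 - 1.0, 0.0],
]
scores = scale_scores(toy_penalties)
```

The entry with the largest raw penalty maps to a score of 0, and entries with zero penalty map to the largest score.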
One of the intuitions behind Compass is that the statistical power afforded by the number of observations (cells) in single-cell RNA-Seq allows increasing dimensionality by computing a new feature set based on the gene expression data and the GSMM. Here, Applicants used an intuitive set of objective functions: for each reaction in the network, Applicants defined one objective function, which is to maximize the flux it carries (recall that the network is unidirectional and therefore all reactions carry non-negative fluxes). This allows intuitive interpretation of the Compass scores as quantitative proxies for reaction activities. However, the algorithm can be generalized by using an arbitrary set of linear objective functions that pertain to cellular metabolism.
Table 2. Top ranking positive (Table 2A) and negative (Table 2B) correlating pathways from Table 1.
Various modifications and variations of the described methods, pharmaceutical compositions, and kits of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific embodiments, it will be understood that it is capable of further modifications and that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in the art are intended to be within the scope of the invention. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains and as may be applied to the essential features hereinbefore set forth.
This application claims the benefit of U.S. Provisional Application Nos. 62/820,208, filed Mar. 18, 2019, 62/866,547, filed Jun. 25, 2019, and 62/964,289, filed Jan. 22, 2020. The entire contents of the above-identified applications are hereby fully incorporated herein by reference.
This invention was made with government support under Grant No.(s) MH114821, NS045937, NS30843, AI144166, AI073748, AI039671 and AI056299 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2020/023399 | 3/18/2020 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62820208 | Mar 2019 | US | |
62866547 | Jun 2019 | US | |
62964289 | Jan 2020 | US |