Mammalian transcription produces diverse RNA species from regulatory elements and genes, and transcription of genes occurs in bursts of RNA synthesis. Transcription factors and coactivators recruit RNA polymerase II (Pol II) to enhancer and promoter elements, where short (20-400 bp) RNAs are bidirectionally transcribed before Pol II pauses. These RNA species are short-lived and are reported to have various regulatory roles, although there is not yet a consensus on their functions. Pol II pause release leads to processive elongation, which occurs in periodic bursts (~1-10 minutes in duration) in which multiple molecules of Pol II can be released from promoters within a short timeframe and produce multiple molecules of mRNA (~1-100 molecules per burst). How and whether the diverse RNA species produced during transcription, which differ in length, half-life, and number, impact or regulate transcription is currently unclear.
Regulation of transcription is a fundamental cellular process that often goes awry in diseases. Transcription is regulated in part through the concentration and compartmentalization of large numbers of transcription factors, cofactors, and RNA polymerase II in liquid-like condensates. The ability to control the formation and dissolution of these condensates would thereby provide a method to control native and dysregulated transcription. The inventors demonstrate herein that the low levels of RNA present at transcription initiation promote transcriptional condensate formation, while the high levels of RNA produced during elongation promote condensate dissolution. RNA modulates transcriptional condensates principally via regulation of electrostatic interactions in transcriptional condensates. These results provide a simple, general, and powerful mechanism for both positive and negative regulation of transcription by polynucleotides and provide methods for discovery and drug development in the areas of oncogenic noncoding RNAs and RNA therapeutics.
Some aspects of the present disclosure are directed to a method of modulating condensate dependent transcription of a gene comprising modulating an amount, effective charge, structure, or behavior of nucleic acid incorporated in the condensate. In some embodiments, transcription of the gene is increased by less than 2-fold. In some embodiments, transcription of the gene is decreased by less than 50%. In some embodiments, condensate dependent transcription occurs in the presence of one or more species of regulatory RNA (regRNA) in the condensate. In some embodiments, the regRNA is tissue specific regRNA. In some embodiments, the regRNA is a variant associated with a disease or condition. In some embodiments, transcription is modulated by modulating the amount, effective charge, structure, or behavior of the one or more species of regRNA. In some embodiments, the amount of one or more species of regRNA is modulated by modulating transcription of the regRNA.
In some embodiments, transcription is modulated by contacting the one or more species of regRNA with an agent capable of specifically binding to the one or more species of regRNA. In some embodiments, the agent comprises a nucleic acid having a sequence complementary to a sequence of the one or more species of regRNA. In some embodiments, the nucleic acid is an antisense oligonucleotide or antisense RNA. In some embodiments, the one or more species of regRNA comprises enhancer RNA (eRNA). In some embodiments, the eRNA is a sequence variant associated with a disease or condition.
In some embodiments, the amount, effective charge, structure, or behavior of nucleic acid incorporated into the condensate is modulated by contact with an agent specifically binding to a nucleic acid associated with the condensate. In some embodiments, the agent is an oligonucleotide specifically binding a messenger RNA or portion thereof. In some embodiments, the condensate dependent transcription of a gene occurs in a cell. In some embodiments, the condensate dependent transcription of a gene occurs in vivo in a subject.
In some embodiments, the modulating of condensate dependent transcription of a gene treats, prevents or reduces the likelihood of a disease or condition in the subject. In some embodiments, the disease or condition is associated with a haploinsufficiency. In some embodiments, the disease or condition is associated with a gene duplication. In some embodiments, the disease or condition is associated with an eRNA variant.
Some aspects of the present disclosure are directed to a method of treating, preventing or reducing the likelihood of a disease or condition associated with aberrant condensate dependent transcription of a gene in a subject comprising administering to the subject an agent that modulates an amount, effective charge, structure, or behavior of nucleic acid in the condensate and thereby modulates transcription of the gene.
In some embodiments, the condensate dependent transcription occurs in the presence of one or more species of regulatory RNA (regRNA) in the condensate. In some embodiments, the agent comprises a nucleic acid having a sequence complementary to a sequence of the one or more species of regRNA. In some embodiments, the regRNA is an enhancer RNA (eRNA). In some embodiments, the disease or condition is associated with a haploinsufficiency. In some embodiments, the disease or condition is associated with a gene duplication. In some embodiments, the disease or condition is associated with an eRNA variant.
Some aspects of the present disclosure are directed to a method of treating, preventing or reducing the likelihood of a disease or condition in a subject comprising administering to the subject an agent that modulates an amount, effective charge, structure, or behavior of nucleic acid in a transcriptional condensate in a cell of the subject and thereby modulates transcription of a gene and treats, prevents, or reduces the likelihood of the disease or condition. In some embodiments, the transcription of the gene is increased. In some embodiments, a gene product of the gene is increased in the subject. In some embodiments, the agent comprises an oligonucleotide that specifically binds to a nucleic acid associated with the condensate. In some embodiments, transcription of the gene is decreased.
Some aspects of the present disclosure are directed to a method of identifying an agent that modulates a condensate, comprising providing a condensate comprising a regulatory RNA (regRNA), contacting the condensate with a test agent, and assessing whether the test agent dissolves or modulates the size of the condensate. In some embodiments, the regRNA is an enhancer RNA (eRNA). In some embodiments, the eRNA is an eRNA variant associated with a disease or condition. In some embodiments, the test agent specifically binds the regRNA. In some embodiments, the test agent comprises an antisense oligonucleotide or an antisense RNA. In some embodiments, the condensate comprises a detectable label. In some embodiments, the condensate is an in vitro condensate (e.g., a synthetic condensate or a condensate isolated from a cell) or is contained in a cell.
Some aspects of the present disclosure are directed to a method of identifying an agent that increases condensate formation, comprising providing a composition comprising a regulatory RNA (regRNA) and a condensate component under conditions wherein the concentration of the regRNA or condensate component does not form a condensate, contacting the composition with a test agent, and assessing whether contact with the test agent causes formation of a condensate. In some embodiments, the regRNA is an enhancer RNA (eRNA). In some embodiments, the eRNA is an eRNA variant associated with a disease or condition. In some embodiments, the test agent specifically binds the regRNA. In some embodiments, the test agent comprises an antisense oligonucleotide or an antisense RNA. In some embodiments, the regRNA or the condensate component comprises a detectable label. In some embodiments, the condensate is an in vitro condensate or is contained in a cell.
Some aspects of the present disclosure are directed to a method of identifying an agent that modulates condensate dependent transcription of a gene, comprising providing an in vitro transcription assay with condensate dependent expression of a reporter gene, contacting the in vitro transcription assay with a test agent, and assessing expression of the reporter gene, wherein condensate dependent expression requires incorporation of a regulatory RNA (regRNA) in the condensate. In some embodiments, the regRNA is an enhancer RNA (eRNA). In some embodiments, the eRNA is an eRNA variant associated with a disease or condition. In some embodiments, the test agent specifically binds the regRNA. In some embodiments, the test agent comprises an antisense oligonucleotide or an antisense RNA. In some embodiments, the condensate comprises a detectable label.
Some aspects of the present disclosure are directed to a method of identifying an agent that modulates condensate dependent transcription of a gene, comprising providing a cell with condensate dependent expression of a heterologous reporter gene, contacting the cell with a test agent, and assessing expression of the heterologous reporter gene, wherein condensate dependent expression requires incorporation of a regulatory RNA (regRNA) in the condensate. In some embodiments, the regRNA is an enhancer RNA (eRNA). In some embodiments, the eRNA is an eRNA variant associated with a disease or condition. In some embodiments, the test agent specifically binds the regRNA. In some embodiments, the test agent comprises an antisense oligonucleotide or an antisense RNA. In some embodiments, the condensate comprises a detectable label.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided by the Office upon request and payment of the necessary fee.
The practice of the present invention will typically employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, recombinant nucleic acid (e.g., DNA) technology, immunology, and RNA interference (RNAi) which are within the skill of the art. Non-limiting descriptions of certain of these techniques are found in the following publications: Ausubel, F., et al., (eds.), Current Protocols in Molecular Biology, Current Protocols in Immunology, Current Protocols in Protein Science, and Current Protocols in Cell Biology, all John Wiley & Sons, N.Y., edition as of December 2008; Sambrook, Russell, and Sambrook, Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 2001; Harlow, E. and Lane, D., Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1988; Freshney, R.I., “Culture of Animal Cells, A Manual of Basic Technique”, 5th ed., John Wiley & Sons, Hoboken, NJ, 2005. Non-limiting information regarding therapeutic agents and human diseases is found in Goodman and Gilman's The Pharmacological Basis of Therapeutics, 11th Ed., McGraw Hill, 2005, Katzung, B. (ed.) Basic and Clinical Pharmacology, McGraw-Hill/Appleton & Lange; 10th ed. (2006) or 11th edition (July 2009). Non-limiting information regarding genes and genetic disorders is found in McKusick, V. A.: Mendelian Inheritance in Man. A Catalog of Human Genes and Genetic Disorders. Baltimore: Johns Hopkins University Press, 1998 (12th edition) or the more recent online database: Online Mendelian Inheritance in Man, OMIM™.
McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (Baltimore, MD) and National Center for Biotechnology Information, National Library of Medicine (Bethesda, MD), as of May 1, 2010, ncbi.nlm.nih.gov/omim/ and in Online Mendelian Inheritance in Animals (OMIA), a database of genes, inherited disorders and traits in animal species (other than human and mouse), at omia.angis.org.au/contact.shtml. All patents, patent applications, and other publications (e.g., scientific articles, books, websites, and databases) mentioned herein are incorporated by reference in their entirety. In case of a conflict between the specification and any of the incorporated references, the specification (including any amendments thereof, which may be based on an incorporated reference), shall control. Standard art-accepted meanings of terms are used herein unless indicated otherwise. Standard abbreviations for various terms are used herein.
It is shown herein that RNA acts as a powerful mediator of transcription via the control of transcriptional condensate formation and dissolution. Transcription becomes dysregulated in many diseases, most notably during oncogenesis, and the results shown herein provide a framework for new entry points in therapeutics that target transcriptional processes, which have been notoriously difficult to target. Described herein is a previously unknown feature of transcription, showing that bursts of RNA transcription can lead to the dissolution of transcriptional condensates. Further, the results herein show that noncoding RNA, such as enhancer RNA, may play a role in transcription regulation through modulating transcriptional condensate formation, dissolution and stability.
The ability to deploy RNA in order to control local electrostatic interactions within transcriptional condensates permits tuning of transcriptional output. In addition, various oncogenic noncoding RNAs and therapeutic RNAs, such as antisense oligonucleotides, are rapidly being developed and fit in the framework of RNA-mediated feedback control of transcription. These therapeutics provide opportunities for targeting transcriptional processes in disease.
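As a purely illustrative aid, the non-monotonic effect of RNA described above (low levels of RNA favoring condensate formation, high levels favoring dissolution) can be sketched with a toy charge-balance score. The sketch below is not part of the disclosed methods. The function names, the assumption of one negative charge per nucleotide, the zero score in the absence of RNA, and the scoring rule itself are hypothetical simplifications chosen only to show a score that peaks when negative RNA charge roughly balances positive protein charge.

```python
def charge_ratio(rna_nucleotides: float, protein_positive_charges: float) -> float:
    """Ratio of RNA negative charge to protein positive charge.

    Assumption (illustrative only): each nucleotide contributes one
    negative charge from its phosphate group.
    """
    if protein_positive_charges <= 0:
        raise ValueError("protein_positive_charges must be positive")
    return rna_nucleotides / protein_positive_charges


def condensate_propensity(rna_nucleotides: float, protein_positive_charges: float) -> float:
    """Toy score in [0, 1] that peaks at charge balance (ratio == 1).

    Hypothetical scoring rule: below balance the score rises with RNA,
    above balance it falls, mimicking formation followed by dissolution.
    The score is set to 0 when no RNA is present purely for simplicity.
    """
    ratio = charge_ratio(rna_nucleotides, protein_positive_charges)
    if ratio <= 0:
        return 0.0
    return min(ratio, 1.0 / ratio)


if __name__ == "__main__":
    # Sweep RNA amount against a fixed, arbitrary protein charge of 1000.
    for rna in (100, 500, 1000, 2000, 10000):
        print(rna, round(condensate_propensity(rna, 1000), 3))
```

In this sketch, adding RNA to a charge-poor condensate raises the score, while adding RNA beyond the balance point lowers it, which is the qualitative behavior attributed above to initiation-stage and elongation-stage RNA levels.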
Some aspects of the present disclosure are directed to a method of modulating condensate dependent transcription of a gene comprising modulating an amount, effective charge, structure, or behavior of nucleic acid incorporated in the condensate. The condensate may be a transcriptional condensate (in a cell or isolated from a cell) or a synthetic transcriptional condensate (sometimes referred to herein as a synthetic or artificial condensate).
Transcriptional condensates are phase-separated multi-molecular assemblies that occur at the sites of transcription and are high density cooperative assemblies of multiple components that can include transcription factors, co-factors, chromatin regulators, DNA, non-coding RNA, nascent RNA, and RNA polymerase II. In some instances, transcriptional condensates are formed by super-enhancer assemblies. Many diseases are caused by, or associated with, alteration in these nucleic acid and protein components, and therapeutic intervention may be afforded by altering transcriptional output of condensates. As used herein, a synthetic transcriptional condensate refers to a non-naturally occurring condensate comprising one or more transcriptional condensate components.
As used herein “modulating” (and verb forms thereof, such as “modulates”) means causing or facilitating a qualitative or quantitative change, alteration, or modification. Without limitation, such change may be an increase or decrease in a qualitative or quantitative aspect.
The terms “increased,” “increase” or “enhance” may be, for example, increase or enhancement by a statistically significant amount. In some instances, for example, an element can be increased or enhanced by at least about 10% as compared to a reference level (e.g., a control), at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100%, and these ranges will be understood to include any integer amount therein (e.g., 2%, 14%, 28%, etc.) which are not exhaustively listed for brevity. In other instances, an element can be increased or enhanced by at least about 2-fold, at least about 3-fold, at least about 4-fold, at least about 5-fold, at least about 10-fold or more as compared to a reference level.
The terms “decrease,” “reduce,” “reduced,” “reduction,” and “inhibit” may be, for example, a decrease or reduction by a statistically significant amount relative to a reference (e.g., a control). In some instances an element can be, for example, decreased or reduced by at least 10% as compared to a reference level, by at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, up to and including, for example, the complete absence of the element as compared to a reference level. These ranges will be understood to include any integer amount therein (e.g., 6%, 18%, 26%, etc.) which are not exhaustively listed for brevity.
For example, modulating transcription of a gene includes increasing or decreasing the rate or frequency of gene transcription; modulating an amount of nucleic acid incorporated in the condensate includes increasing or decreasing the amount of the nucleic acid incorporated in the condensate; modulating an effective charge of nucleic acid incorporated in the condensate includes increasing or decreasing the effective charge of the nucleic acid; modulating a shape of nucleic acid incorporated into the condensate includes modifying the shape of the nucleic acid including minor modifications in shape as well as significant modifications in shape; and modulating behavior of nucleic acid incorporated into the condensate includes modifying the binding, localization, and/or stability of the nucleic acid.
In some embodiments, transcription of the gene or a level of gene product is increased by 1.1-fold, 1.2-fold, 1.3-fold, 1.4-fold, 1.5-fold, 1.6-fold, 1.7-fold, 1.8-fold, 1.9-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 20-fold, 50-fold, 100-fold, or more as compared to a reference level (e.g., an untreated control cell or condensate). In some embodiments, transcription of the gene or a level of gene product is increased by less than 2-fold. In some embodiments, transcription of the gene or a level of gene product is increased by 1.1 to 1.5-fold. In some embodiments, transcription of the gene or a level of gene product is increased by 1.5 to 1.9-fold.
In some embodiments, transcription of the gene or a level of gene product is reduced by at least about 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99%, 99.5%, 99.9%, or more as compared to a reference level (e.g., an untreated control cell or condensate). In some embodiments, transcription of the gene or a level of gene product is decreased by less than 50%. In some embodiments, transcription of the gene or a level of gene product is decreased by between about 10% and 50%. In some embodiments, transcription of the gene or a level of gene product is decreased by between about 5% and 25%.
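By way of illustration only, the fold-increase and percent-decrease comparisons to a reference level described above can be computed as in the following sketch. The function names and the example values are hypothetical. The thresholds (less than 2-fold increase, less than 50% decrease) are taken from the embodiments above.

```python
def fold_change(level: float, reference: float) -> float:
    """Fold change of a measured level relative to a reference level."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return level / reference


def percent_decrease(level: float, reference: float) -> float:
    """Percent reduction relative to the reference (negative if increased)."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return 100.0 * (reference - level) / reference


def is_increase_below_2_fold(level: float, reference: float) -> bool:
    """True when the level is increased, but by less than 2-fold."""
    return 1.0 < fold_change(level, reference) < 2.0


def is_decrease_below_50_percent(level: float, reference: float) -> bool:
    """True when the level is decreased, but by less than 50%."""
    return 0.0 < percent_decrease(level, reference) < 50.0


if __name__ == "__main__":
    reference = 100.0  # e.g., an untreated control (arbitrary units)
    print(fold_change(150.0, reference))       # 1.5
    print(percent_decrease(75.0, reference))   # 25.0
```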
In some embodiments, the rate of gene transcription is modulated. In some embodiments, the amount of time during which transcription occurs is modulated. For example, for burst transcription, the period of time during which transcription is occurring during one or more of the burst transcription events can be increased by, e.g., increasing the stability of the condensate by modulating nucleic acid incorporated into the condensate. In other embodiments, the period of time during which transcription is occurring during one or more of the burst transcription events can be decreased by, e.g., decreasing the stability of the condensate by modulating nucleic acid incorporated into the condensate. In some embodiments, the period of time between burst transcription events can be modulated by, e.g., inhibiting or enhancing condensate formation by modulating an amount, effective charge, structure, or behavior of nucleic acid incorporated in the condensate.
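The effect of changing burst duration or the interval between bursts can be illustrated with simple arithmetic. The following sketch assumes, for illustration only, an idealized gene that produces RNA at a constant rate during each burst and none between bursts. The function name, the parameter names, and the numerical values are hypothetical and are not measurements.

```python
def transcripts_per_hour(rate_per_minute: float,
                         burst_minutes: float,
                         gap_minutes: float) -> float:
    """Average output of an idealized bursting gene.

    Assumptions (illustrative only): RNA is made at a constant
    rate_per_minute during each burst, no RNA is made between bursts,
    and bursts of burst_minutes alternate with gaps of gap_minutes.
    """
    if burst_minutes <= 0 or gap_minutes < 0:
        raise ValueError("burst_minutes must be > 0 and gap_minutes >= 0")
    cycle = burst_minutes + gap_minutes
    return 60.0 * rate_per_minute * burst_minutes / cycle


if __name__ == "__main__":
    baseline = transcripts_per_hour(2.0, 5.0, 15.0)       # 30.0
    longer_burst = transcripts_per_hour(2.0, 10.0, 15.0)  # 48.0
    shorter_gap = transcripts_per_hour(2.0, 5.0, 5.0)     # 60.0
    print(baseline, longer_burst, shorter_gap)
```

Under these assumptions, lengthening the burst (for example, by stabilizing the condensate) or shortening the interval between bursts (for example, by enhancing condensate formation) each raises the average output, while the opposite changes lower it.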
The gene for which condensate transcription is modulated is not limited. In some embodiments, the gene is an oncogene. Exemplary oncogenes include MYC, SRC, FOS, JUN, MYB, RAS, ABL, HOX11, HOX11L2, TAL1/SCL, LMO1, LMO2, EGFR, MYCN, MDM2, CDK4, GLI1, IGF2, activated EGFR, mutated genes, such as FLT3-ITD, mutated forms of TP53, PAX3, PAX7, BCR/ABL, HER2/NEU, FLT3R, FLT3-ITD, SRC, ABL, TAN1, PTC, B-RAF, PML-RAR-alpha, E2A-PBX1, and NPM-ALK, as well as fusions of members of the PAX and FKHR gene families. Other exemplary oncogenes are well known in the art. In some embodiments the oncogene is selected from the group consisting of c-MYC and IRF4. In some embodiments the gene encodes an oncogenic fusion protein, e.g., an MLL rearrangement, EWS-FLI, ETS fusion, BRD4-NUT, or NUP98 fusion.
In some embodiments, the gene is associated with a hallmark of a disease such as cancer (e.g., breast cancer). In some embodiments, the gene is associated with a disease associated DNA sequence variation such as a SNP. In some embodiments, the disease is Alzheimer's disease, and the gene is BIN1. In some embodiments, the disease is type 1 diabetes, and the gene is associated with a primary Th cell. In some embodiments, the disease is systemic lupus erythematosus, and the gene plays a key role in B cell biology. In some embodiments, the gene is associated with a disease or condition associated with a mutation in a gene encoding a nuclear receptor. In some embodiments, the gene is associated with a hallmark characteristic of the cell. In some embodiments, the gene is aberrantly expressed or is associated with a DNA variation such as a SNP. “Aberrantly expressed” is used to indicate that the gene expression in one or more cells or synthetic condensates is detectably different from a control level that is typical of that found in normal cells (e.g., normal cells of the same cell type or, for cultured cells, cultured cells under comparable conditions) or condensates not subject to a test treatment or condition (e.g., for condensates isolated from cells, isolated condensates from normal cells of the same cell type or, for cultured cells, cultured cells under comparable conditions). In some embodiments, the gene is associated with aberrant signaling in a cell (e.g. aberrant signaling associated with the WNT, TGF-β or JAK/STAT pathways). In some embodiments, the gene exhibits aberrant mRNA initiation or elongation (e.g., aberrant splicing). 
As used herein, “aberrant mRNA initiation or elongation” is detectably or significantly different than mRNA initiation or elongation in a control cell or subject (e.g., higher than or lower than in (increased or decreased as compared to) a healthy cell or subject, or cell or subject without a disease or condition characterized by atypical mRNA initiation or elongation). In some embodiments, the gene is associated with a disease or disorder associated with aberrant gene silencing (e.g., increased or decreased gene silencing as compared to gene silencing in a healthy cell or healthy subject (e.g., control cell or subject)). In some embodiments, the disease or disorder associated with aberrant gene silencing is Rett syndrome, MeCP2 over-expression syndrome or MeCP2 under-expression or activity. MeCP2 refers to methyl CpG binding protein 2 (Human UniProt ID: P51608).
In some embodiments, the gene is found in a mammalian cell, e.g., human cell; fetal cell; embryonic stem cell or embryonic stem cell-like cell, e.g., cell from the umbilical vein, e.g., endothelial cell from the umbilical vein; muscle, e.g., myotube, fetal muscle; blood cell, e.g., cancerous blood cell, fetal blood cell, monocyte; B cell, e.g., Pro-B cell; brain, e.g., astrocyte cell, angular gyrus of the brain, anterior caudate of the brain, cingulate gyrus of the brain, hippocampus of the brain, inferior temporal lobe of the brain, middle frontal lobe of the brain, brain cancer cell; T cell, e.g., naïve T cell, memory T cell; CD4 positive cell; CD25 positive cell; CD45RA positive cell; CD45RO positive cell; IL-17 positive cell; a cell that is stimulated with PMA; Th cell; Th17 cell; CD255 positive cell; CD127 positive cell; CD8 positive cell; CD34 positive cell; duodenum, e.g., smooth muscle tissue of the duodenum; skeletal muscle tissue; myoblast; stomach, e.g., smooth muscle tissue of the stomach, e.g., gastric cell; CD3 positive cell; CD14 positive cell; CD19 positive cell; CD20 positive cell; CD34 positive cell; CD56 positive cell; prostate, e.g., prostate cancer; colon, e.g., colorectal cancer cell; crypt cell, e.g., colon crypt cell; intestine, e.g., large intestine; e.g., fetal intestine; bone, e.g., osteoblast; pancreas, e.g., pancreatic cancer; adipose tissue; adrenal gland; bladder; esophagus; heart, e.g., left ventricle, right ventricle, left atrium, right atrium, aorta; lung, e.g., lung cancer cell; skin, e.g., fibroblast cell; ovary; psoas muscle; sigmoid colon; small intestine; spleen; thymus, e.g., fetal thymus; breast, e.g., breast cancer; cervix, e.g., cervical cancer; mammary epithelium; liver, e.g., liver cancer; DND41 cell; GM12878 cell; H1 cell; H2171 cell; HCC1954 cell; HCT-116 cell; HeLa cell; HepG2 cell; HMEC cell; HSMM tube cell; HUVEC cell; IMR90 cell; Jurkat cell; K562 cell; LNCaP cell; MCF-7 cell; MM1S cell; NHLF cell; NHDF-Ad cell; 
RPMI-8402 cell; U87 cell; VACO 9M cell; VACO 400 cell; or VACO 503 cell.
In some embodiments, the gene has a disease-associated variation related to rheumatoid arthritis, multiple sclerosis, systemic scleroderma, primary biliary cirrhosis, Crohn's disease, Graves' disease, vitiligo, or atrial fibrillation. In some embodiments, the gene is associated with a developmental disorder. In some embodiments, the gene is associated with a neurological disorder or developmental neurological disorder.
In some embodiments, the gene is considered cell type specific. A cell type specific gene need not be expressed only in a single cell type but may be expressed in one or several, e.g., up to about 5, or about 10 different cell types out of the approximately 200 commonly recognized (e.g., in standard histology textbooks) and/or most abundant cell types in an adult vertebrate, e.g., mammal, e.g., human. In some embodiments, a cell type specific gene is one whose expression level can be used to distinguish a cell, e.g., a cell as disclosed herein, such as a cell of one of the following types from cells of the other cell types: adipocyte (e.g., white fat cell or brown fat cell), cardiac myocyte, chondrocyte, endothelial cell, exocrine gland cell, fibroblast, glial cell, hepatocyte, keratinocyte, macrophage, monocyte, melanocyte, neuron, neutrophil, osteoblast, osteoclast, pancreatic islet cell (e.g., a beta cell), skeletal myocyte, smooth muscle cell, B cell, plasma cell, T cell (e.g., regulatory, cytotoxic, helper), or dendritic cell. In some embodiments a cell type specific gene is lineage specific, e.g., it is specific to a particular lineage (e.g., hematopoietic, neural, muscle, etc.) In some embodiments, a cell-type specific gene is a gene that is more highly expressed in a given cell type than in most (e.g., at least 80%, at least 90%) or all other cell types. Thus specificity may relate to level of expression, e.g., a gene that is widely expressed at low levels but is highly expressed in certain cell types could be considered cell type specific to those cell types in which it is highly expressed. In some embodiments, a cell-type specific gene is a gene that is less expressed, or not expressed, in a given cell type than in most (e.g., at least 80%, at least 90%) or all other cell types. 
Thus specificity may relate to level of expression, e.g., a gene that is widely expressed but is much less expressed in certain cell types could be considered cell type specific to those cell types in which it is less, or not at all, expressed. It will be understood that expression can be normalized based on total mRNA expression (optionally including miRNA transcripts, long non-coding RNA transcripts, and/or other RNA transcripts) and/or based on expression of a housekeeping gene in a cell. In some embodiments, a gene is considered cell type specific for a particular cell type if it is expressed at levels at least 2-fold, at least 5-fold, or at least 10-fold greater or less in that cell type than it is, on average, in at least 25%, at least 50%, at least 75%, at least 90% or more of the cell types of an adult of that species, or in a representative set of cell types. One of skill in the art will be aware of databases containing expression data for various cell types, which may be used to select cell type specific genes. In some embodiments a cell type specific gene is a transcription factor. In some embodiments, a cell type specific gene is associated with embryonic, fetal, or post-natal development.
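For illustration only, the fold-difference criterion for cell type specificity described above can be sketched as follows. The sketch adopts one possible reading of the criterion, in which the gene's normalized expression in the cell type of interest is compared with each other cell type individually and the fraction of cell types meeting the fold threshold is then tested. The function name, the parameter names, and the expression values are hypothetical.

```python
def is_cell_type_specific(expression: dict,
                          cell_type: str,
                          fold: float = 2.0,
                          min_fraction: float = 0.5) -> bool:
    """Check a fold-difference specificity criterion for one gene.

    expression maps cell type name -> normalized expression level of the
    gene (e.g., normalized to a housekeeping gene). The gene is called
    specific for cell_type if its level there is at least `fold` times
    greater, or at least `fold` times less, than its level in at least
    `min_fraction` of the other cell types.
    """
    target = expression[cell_type]
    others = [v for name, v in expression.items() if name != cell_type]
    if not others:
        raise ValueError("need at least one other cell type for comparison")
    n_higher = sum(1 for v in others if target >= fold * v)
    n_lower = sum(1 for v in others if target * fold <= v)
    return max(n_higher, n_lower) / len(others) >= min_fraction


if __name__ == "__main__":
    # Hypothetical normalized expression values for a single gene.
    levels = {"hepatocyte": 100.0, "neuron": 10.0,
              "fibroblast": 20.0, "T cell": 30.0}
    for name in levels:
        print(name, is_cell_type_specific(levels, name))
```

With these hypothetical values, the gene would be called specific for the hepatocyte (highly expressed) and for the neuron (much less expressed), consistent with the statement above that specificity may relate to either higher or lower expression.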
In some embodiments, the disease or condition is associated with aberrant expression or activity of one or more regulatory RNAs. For example, miRNA and lncRNA have been implicated in cancer, cardiovascular disease, and neurodegenerative disease (Parkinson's disease, Alzheimer's disease, Huntington's disease, spinal muscular atrophy, frontotemporal lobar degeneration, amyotrophic lateral sclerosis). See Lekka, E., & Hall, J. (2018) “Noncoding RNAs in disease,” FEBS Letters, 592(17), 2884-2900. In some embodiments, the disease or condition is associated with haploinsufficiency, gene duplication, or an enhancer RNA variant.
In some embodiments, condensate dependent transcription occurs in the presence of one or more species of regulatory RNA (regRNA) in the condensate. As used herein, “regulatory RNA” refers to functional RNA molecules that are not translated into proteins. Types of regulatory RNA include microRNAs (miRNA), which are RNA molecules about 22 nucleotides long that in animals regulate gene expression post-transcriptionally in a sequence-specific manner, by facilitating messenger RNA (mRNA) degradation or by controlling translation. Other small regRNAs include: PIWI-interacting RNA (piRNA), RNA molecules about 28 nucleotides long involved in transposon repression and DNA methylation; small nucleolar RNA (snoRNA), about 60-300 nucleotides long, components of small nucleolar ribonucleoproteins, which modulate biogenesis and activity of ribosomes by post-transcriptional modifications of ribosomal RNA (rRNA); and small nuclear RNA (snRNA), RNA molecules about 150 nucleotides long that facilitate mRNA splicing and regulate transcription factors. In some embodiments, regRNA include promoter-associated RNA, including but not limited to, a promoter upstream transcript (PROMPT), a promoter-associated long RNA (PALR), and a promoter-associated small RNA (PASR). In further embodiments, regRNAs may include but are not limited to transcription start site (TSS)-associated RNAs (TSSa-RNAs), transcription initiation RNAs (tiRNAs), and terminator-associated small RNAs (TASRs). In specific examples, the regRNA is an enhancer RNA (eRNA). As used herein, enhancer RNA (eRNA) refers to a class of relatively short non-coding RNA molecules (50-2000 nucleotides) transcribed from the DNA sequence of an enhancer region, such as a super-enhancer. In some embodiments, the eRNA is identified in the HACER database available on the world-wide web at bioinfo.vanderbilt.edu/AE/HACER/. In some embodiments, the eRNA is Trim28 eRNA, Pou5f1 eRNA, Oct4 eRNA, or Nanog eRNA.
In some embodiments, the eRNA can be found in a database of transcribed human enhancers available on the world-wide web at fantom.gsc.riken.jp/5/. In some embodiments, the eRNA can be found in Andersson, et al., Nature 507, 455-461.
In some embodiments, the regRNA is tissue, organ or cell specific regRNA. The tissue, organ, or cell type is not limited and may be any tissue, organ, or cell type mentioned herein with regard to genes specific to cells, tissues or organs. As is apparent to a person of skill in the art, the capability to modulate a widely expressed gene only in a specific cell, tissue or organ type by modulating regRNA (e.g., eRNA) uniquely expressed in the specific cell, tissue or organ type has wide applicability for therapy and research.
In some embodiments, the regRNA is a variant associated with a disease or condition. Recent genome-wide association studies (GWAS) have found >88% of disease-risk variants lie in non-coding regions (Hindorff et al., “Potential etiologic and functional implications of genome-wide association loci for human diseases and traits,” Proc. Natl. Acad. Sci. U.S.A. 2009; 106:9362-9367), especially enriched in enhancers (Corradin et al., “Enhancer variants: evaluating functions in common disease,” Genome Med. 2014; 6:85). In some embodiments, the regRNA is an eRNA associated with an enhancer region comprising a SNP correlated with a disease or condition in a GWAS. In some embodiments, the enhancer region is part of or associated with the Trim28, Pou5f1, Oct4, or Nanog gene.
In some embodiments, the regRNA (e.g., eRNA) is associated with or part of a gene described above (e.g., a gene associated with a disease or condition, a cell specific gene, or a tissue or organ specific gene).
In some embodiments, transcription is modulated by modulating the amount, effective charge, structure, or behavior of the one or more species of regRNA (e.g., eRNA). In some embodiments, the amount of one or more species of regRNA is modulated by modulating transcription of the regRNA. In some embodiments, transcription of the eRNA is increased. In some embodiments, transcription of the eRNA is decreased or prevented. In some embodiments, the transcription of the one or more species of regRNA (e.g., eRNA) is modulated by introducing a mutation in a DNA region (e.g., promoter) controlling transcription of the eRNA. In some embodiments, the effective charge, structure, or behavior of the one or more species of regRNA (e.g., eRNA) is modulated by introducing a mutation in the DNA transcribing the eRNA. In some embodiments, the introduced mutation changes the effective charge of the eRNA in the condensate. In some embodiments, the introduced mutation changes the shape of the eRNA in the condensate. In some embodiments, the introduced mutation changes the affinity of the eRNA for associating with the condensate or one or more components forming or associated with the condensate. In some embodiments, the mutation is introduced by contact with an agent as described herein (e.g., a targeting endonuclease).
As used herein, the phrase “a component associated with a condensate” or the like and the phrase “a condensate component” or the like refer to a peptide, protein, nucleic acid, signaling molecule, lipid, or the like that is part of a condensate or has the capability of being part of a condensate (e.g., transcriptional or synthetic condensate). In some embodiments, the component is within the condensate. In some embodiments, the component is on the surface of the condensate. In some embodiments, the component is necessary for condensate formation or stability. In some embodiments, the component is not necessary for condensate formation or stability. In some embodiments, the component is a protein or peptide and comprises one or more intrinsically disordered domains (e.g., an IDR of an activation domain of a transcription factor, an IDR that interacts with an IDR of an activation domain of a transcription factor, an IDR of a signaling factor, an IDR of a polymerase). In some embodiments, the component is a non-structural member of a condensate (e.g., not necessary for condensate integrity) and is sometimes referred to as a client component. In some embodiments, a condensate comprises, consists of, or consists essentially of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more components. In some embodiments, the component is a fragment of a protein or nucleic acid. In some embodiments, the transcriptional condensate components comprise transcription factors, co-factors, chromatin regulators, DNA, non-coding RNA, nascent RNA, RNA polymerase II, kinases, proteasomes, topoisomerase, and/or enhancers. In some embodiments, the transcription factor is, e.g., OCT4, p53, MYC, GCN4, NANOG, MyoD, KLF4, a SOX family transcription factor, a GATA family transcription factor, a nuclear receptor, or a fusion oncogenic transcription factor. In some embodiments, the component is mediator, a mediator component, MED1, BRD4, or POLII (i.e., POL2).
In some embodiments, the transcriptional condensate component is a fragment of a transcriptional condensate component described herein comprising an IDR. Regions of intrinsic disorder, also termed intrinsic (or intrinsically) disordered regions (IDRs) or intrinsic (or intrinsically) disordered domains, can be found in many protein condensate components. Each of these terms is used interchangeably throughout the disclosure. IDRs lack stable secondary and tertiary structure. In some embodiments, an IDR may be identified by the methods disclosed in Ali, M., & Ivarsson, Y. (2018). High-throughput discovery of functional disordered regions. Molecular Systems Biology, 14(5), e8377. IDRs are known in the art and any suitable method may be used to identify an IDR.
In some embodiments, transcription is modulated by contacting the one or more species of regRNA with an agent. The agent is not limited and may be any suitable agent.
“Agent” is used herein to refer to any substance, compound (e.g., molecule), supramolecular complex, material, or combination or mixture thereof. In some aspects, an agent can be represented by a chemical formula, chemical structure, or sequence. Examples of agents include, e.g., small molecules, polypeptides, nucleic acids (e.g., RNAi agents, antisense oligonucleotides, aptamers), lipids, polysaccharides, peptide mimetics, etc. In general, agents may be obtained using any suitable method known in the art. The ordinary skilled artisan will select an appropriate method based, e.g., on the nature of the agent. An agent may be at least partly purified or be substantially pure. In some embodiments an agent may be provided as part of a composition, which may contain, e.g., a counter-ion, aqueous or non-aqueous diluent or carrier, buffer, preservative, or other ingredient, in addition to the agent, in various embodiments. In some embodiments an agent may be provided as a salt, ester, hydrate, or solvate. In some embodiments an agent is cell-permeable, e.g., within the range of typical agents that are taken up by cells and act intracellularly, e.g., within mammalian cells. Certain compounds may exist in particular geometric or stereoisomeric forms. Such compounds, including cis- and trans-isomers, E- and Z-isomers, R- and S-enantiomers, diastereomers, (D)-isomers, (L)-isomers, (−)- and (+)-isomers, racemic mixtures thereof, and other mixtures thereof are encompassed by this disclosure in various embodiments unless otherwise indicated. Certain compounds may exist in a variety of protonation states, may have a variety of configurations, may exist as solvates (e.g., with water (i.e., hydrates) or common solvents) and/or may have different crystalline forms (e.g., polymorphs) or different tautomeric forms. Embodiments exhibiting such alternative protonation states, configurations, solvates, and forms are encompassed by the present disclosure where applicable.
An “analog” of a first agent refers to a second agent that is structurally and/or functionally similar to the first agent. A “structural analog” of a first agent is an analog that is structurally similar to the first agent. Unless otherwise specified, the term “analog” as used herein refers to a structural analog. A structural analog of an agent may have substantially similar physical, chemical, biological, and/or pharmacological propert(ies) as the agent or may differ in at least one physical, chemical, biological, or pharmacological property. In some embodiments at least one such property differs in a manner that renders the analog more suitable for a purpose of interest. In some embodiments a structural analog of an agent differs from the agent in that at least one atom, functional group, or substructure of the agent is replaced by a different atom, functional group, or substructure in the analog. In some embodiments, a structural analog of an agent differs from the agent in that at least one hydrogen or substituent present in the agent is replaced by a different moiety (e.g., a different substituent) in the analog.
In some embodiments, the agent is a nucleic acid. The term “nucleic acid” refers to polynucleotides such as deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). The terms “nucleic acid” and “polynucleotide” are used interchangeably herein and should be understood to include double-stranded polynucleotides, single-stranded (such as sense or antisense) polynucleotides, and partially double-stranded polynucleotides. A nucleic acid often comprises standard nucleotides typically found in naturally occurring DNA or RNA (which can include modifications such as methylated nucleobases), joined by phosphodiester bonds. In some embodiments a nucleic acid may comprise one or more non-standard nucleotides, which may be naturally occurring or non-naturally occurring (i.e., artificial; not found in nature) in various embodiments and/or may contain a modified sugar or modified backbone linkage. Nucleic acid modifications (e.g., base, sugar, and/or backbone modifications), non-standard nucleotides or nucleosides, etc., such as those known in the art as being useful in the context of RNA interference (RNAi), aptamer, CRISPR technology, polypeptide production, reprogramming, or antisense-based molecules for research or therapeutic purposes may be incorporated in various embodiments. Such modifications may, for example, increase stability (e.g., by reducing sensitivity to cleavage by nucleases), decrease clearance in vivo, increase cell uptake, or confer other properties that improve the translation, potency, efficacy, specificity, or otherwise render the nucleic acid more suitable for an intended use. Various non-limiting examples of nucleic acid modifications are described in, e.g., Deleavey G F, et al., Chemical modification of siRNA. Curr. Protoc. Nucleic Acid Chem. 2009; 39:16.3.1-16.3.22; Crooke, ST (ed.) Antisense drug technology: principles, strategies, and applications, Boca Raton: CRC Press, 2008; Kurreck, J. (ed.) Therapeutic oligonucleotides, RSC biomolecular sciences. Cambridge: Royal Society of Chemistry, 2008; U.S. Pat. Nos. 4,469,863; 5,536,821; 5,541,306; 5,637,683; 5,637,684; 5,700,922; 5,717,083; 5,719,262; 5,739,308; 5,773,601; 5,886,165; 5,929,226; 5,977,296; 6,140,482; 6,455,308 and/or in PCT application publications WO 00/56746 and WO 01/14398. Different modifications may be used in the two strands of a double-stranded nucleic acid. A nucleic acid may be modified uniformly or on only a portion thereof and/or may contain multiple different modifications. Where the length of a nucleic acid or nucleic acid region is given in terms of a number of nucleotides (nt) it should be understood that the number refers to the number of nucleotides in a single-stranded nucleic acid or in each strand of a double-stranded nucleic acid unless otherwise indicated. An “oligonucleotide” is a relatively short nucleic acid, typically between about 5 and about 100 nt long. In some embodiments, the nucleic acid agent codes for a gene product of the testis specific gene or X-linked homolog thereof.
“Nucleic acid construct” refers to a nucleic acid that is generated by man and is not identical to nucleic acids that occur in nature, i.e., it differs in sequence from naturally occurring nucleic acid molecules and/or comprises a modification that distinguishes it from nucleic acids found in nature. A nucleic acid construct may comprise two or more nucleic acids that are identical to nucleic acids found in nature, or portions thereof, but are not found as part of a single nucleic acid in nature.
In some embodiments, the agent is a small molecule. The term “small molecule” refers to an organic molecule that is less than about 2 kilodaltons (kDa) in mass. In some embodiments, the small molecule is less than about 1.5 kDa, or less than about 1 kDa. In some embodiments, the small molecule is less than about 800 daltons (Da), 600 Da, 500 Da, 400 Da, 300 Da, 200 Da, or 100 Da. Often, a small molecule has a mass of at least 50 Da. In some embodiments, a small molecule is non-polymeric. In some embodiments, a small molecule is not an amino acid. In some embodiments, a small molecule is not a nucleotide. In some embodiments, a small molecule is not a saccharide. In some embodiments, a small molecule contains multiple carbon-carbon bonds and can comprise one or more heteroatoms and/or one or more functional groups important for structural interaction with proteins (e.g., hydrogen bonding), e.g., an amine, carbonyl, hydroxyl, or carboxyl group, and in some embodiments at least two functional groups. Small molecules often comprise one or more cyclic carbon or heterocyclic structures and/or aromatic or polyaromatic structures, optionally substituted with one or more of the above functional groups.
In some embodiments, the agent is a protein or polypeptide. The term “polypeptide” refers to a polymer of amino acids linked by peptide bonds. A protein is a molecule comprising one or more polypeptides. A peptide is a relatively short polypeptide, typically between about 2 and 100 amino acids (aa) in length, e.g., between 4 and 60 aa; between 8 and 40 aa; between 10 and 30 aa. The terms “protein”, “polypeptide”, and “peptide” may be used interchangeably. In general, a polypeptide may contain only standard amino acids or may comprise one or more non-standard amino acids (which may be naturally occurring or non-naturally occurring amino acids) and/or amino acid analogs in various embodiments. A “standard amino acid” is any of the 20 L-amino acids that are commonly utilized in the synthesis of proteins by mammals and are encoded by the genetic code. A “non-standard amino acid” is an amino acid that is not commonly utilized in the synthesis of proteins by mammals. Non-standard amino acids include naturally occurring amino acids (other than the 20 standard amino acids) and non-naturally occurring amino acids. An amino acid, e.g., one or more of the amino acids in a polypeptide, may be modified, for example, by addition, e.g., covalent linkage, of a moiety such as an alkyl group, an alkanoyl group, a carbohydrate group, a phosphate group, a lipid, a polysaccharide, a halogen, a linker for conjugation, a protecting group, a small molecule (such as a fluorophore), etc.
In some embodiments, the agent is a peptide mimetic. The terms “mimetic,” “peptide mimetic” and “peptidomimetic” are used interchangeably herein, and generally refer to a peptide, partial peptide or non-peptide molecule that mimics the tertiary binding structure or activity of a selected native peptide or protein functional domain (e.g., binding motif or active site). These peptide mimetics include recombinantly or chemically modified peptides, as well as non-peptide agents such as small molecule drug mimetics.
In some embodiments, the agent is encoded by a synthetic RNA (e.g., modified mRNAs). The synthetic RNA can encode any suitable agent described herein. Synthetic RNAs, including modified RNAs are taught in WO 2017075406, which is herein incorporated by reference. In some embodiments, the agent is, or is encoded by, a synthetic RNA (e.g., modified mRNAs) conjugated to non-nucleic acid molecules. In some embodiments, the synthetic RNAs are conjugated to (or otherwise physically associated with) a moiety that promotes cellular uptake, nuclear entry, and/or nuclear retention (e.g., peptide transport moieties or the nucleic acids). In some embodiments, the synthetic RNA is conjugated to a peptide transporter moiety, for example a cell-penetrating peptide transport moiety, which is effective to enhance transport of the oligomer into cells.
In some embodiments, the agent is a targetable nuclease and, if appropriate, a guide molecule (e.g., one or more gRNA). In some embodiments, the agent is capable of making a mutation in a DNA region coding a regRNA (e.g., eRNA) or a DNA region controlling transcription of the regRNA. The term “targetable nuclease” refers to a nuclease that can be programmed to produce site-specific DNA breaks, e.g., double-stranded breaks (DSBs), at a selected site in DNA. Such a site may be referred to as a “target site”. The target site can be selected by appropriate design of the targetable nuclease or by providing a guide molecule (e.g., a guide RNA) that directs the nuclease to the target site. Examples of targetable nucleases include zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), RNA-guided nucleases (RGNs) such as the Cas proteins of the CRISPR/Cas Type II system, and engineered meganucleases. In some embodiments, the agent is an RNA-guided nuclease (RGN) (e.g., a Cas protein of the CRISPR/Cas Type II system) and an RNA template (e.g., gRNA) capable of making a mutation in a DNA region coding a regRNA (e.g., eRNA) or a DNA region controlling transcription of the regRNA that modulates transcription of the relevant gene.
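By way of non-limiting illustration, candidate target sites for an RNA-guided nuclease within a DNA region coding or controlling a regRNA may be enumerated computationally. The following minimal sketch assumes a Streptococcus pyogenes Cas9-type nuclease, which recognizes a 20-nt protospacer immediately followed by an NGG protospacer adjacent motif (PAM); the disclosure does not require this particular nuclease. The function name and parameters are hypothetical, and the sketch scans only the strand supplied.

```python
import re

def find_protospacers(dna, spacer_len=20):
    """Return (start, protospacer, pam) tuples for sites followed by an NGG PAM.

    Assumption: an SpCas9-like NGG PAM. Only the given strand is scanned;
    the reverse-complement strand would need a second pass.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]
```

Candidate sites returned by such a scan would still need to be evaluated for specificity and activity by any suitable method known in the art.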
In some embodiments, the agent comprises an “RNA interference” (RNAi) agent or an antisense oligonucleotide specifically binding with the regRNA.
The term “RNA interference” (RNAi) encompasses processes in which a molecular complex known as an RNA-induced silencing complex (RISC) is formed. RISC may incorporate a short nucleic acid strand (e.g., about 16-about 30 nucleotides (nt) in length) that pairs with RNA (e.g., regRNA, eRNA) to which the strand has complementarity. The short nucleic acid strand may be referred to as a “guide strand” or “antisense strand”. An RNA strand to which the guide strand has complementarity may be referred to as a “target RNA.” A guide strand may initially become associated with RISC components (in a complex sometimes termed the RISC loading complex) as part of a short double-stranded RNA (dsRNA), e.g., a short interfering RNA (siRNA). The other strand of the short dsRNA may be referred to as a “passenger strand” or “sense strand”. The complementarity of the structure formed by hybridization of a target RNA and the guide strand may be such that the strand can (i) guide cleavage of the target RNA in the RNA-induced silencing complex (RISC) and/or (ii) modulate the effective charge, structure, or behavior of the target RNA (e.g., regRNA, eRNA).
In some embodiments, the RNAi agent reduces the amount of regRNA in the condensate by increasing target regRNA cleavage in the RISC. In some embodiments, the RNAi agent modulates the amount of nucleic acid in the condensate. In some embodiments, the RNAi agent binds to and localizes with the target regRNA into the condensate, thereby increasing nucleic acid in the condensate. In some embodiments, the increase in nucleic acid accelerates condensate formation (e.g., increases transcription of the associated gene). In some embodiments, the increase in nucleic acid accelerates condensate dissolution (e.g., decreases transcription of associated gene). In some embodiments, the RNAi agent binds to and modulates the effective charge, structure, or behavior of the regRNA incorporated into the condensate. In some embodiments, the modulation of regRNA effective charge, structure, or behavior accelerates condensate formation (e.g., increases transcription of the associated gene). In some embodiments, the modulation of regRNA effective charge, structure, or behavior accelerates condensate dissolution (e.g., decreases transcription of the associated gene). In some embodiments, the modulation of regRNA effective charge, structure, or behavior stabilizes the condensate (e.g., increases transcription of the associated gene).
As known in the art, the complementarity between the guide strand and a target RNA need not be perfect (100%) but need only be sufficient to result in specific binding to the target RNA. For example, in some embodiments 1, 2, 3, 4, 5, or more nucleotides of a guide strand may not be matched to a target RNA. “Not matched” or “unmatched” refers to a nucleotide that is mismatched (not complementary to the nucleotide located opposite it in a duplex, i.e., wherein Watson-Crick base pairing does not take place) or forms at least part of a bulge. Examples of mismatches include, without limitation, an A opposite a G or A, a C opposite an A or C, a U opposite a C or U, a G opposite a G. A bulge refers to a sequence of one or more nucleotides in a strand within a generally duplex region that are not located opposite to nucleotide(s) in the other strand. “Partly complementary” refers to less than perfect complementarity. In some embodiments a guide strand has at least about 80%, 85%, or 90%, e.g., at least about 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence complementarity to a target RNA over a continuous stretch of at least about 15 nt, e.g., between 15 nt and 30 nt, between 17 nt and 29 nt, between 18 nt and 25 nt, between 19 nt and 23 nt, of the target RNA. In some embodiments at least the seed region of a guide strand (the nucleotides in positions 2-7 or 2-8 of the guide strand) is perfectly complementary to a target RNA. In some embodiments, a guide strand and a target RNA sequence may form a duplex that contains no more than 1, 2, 3, or 4 mismatched or bulging nucleotides over a continuous stretch of at least 10 nt, e.g., between 10-30 nt. In some embodiments a guide strand and a target RNA sequence may form a duplex that contains no more than 1, 2, 3, 4, 5, or 6 mismatched or bulging nucleotides over a continuous stretch of at least 12 nt, e.g., between 10-30 nt.
In some embodiments, a guide strand and a target RNA sequence may form a duplex that contains no more than 1, 2, 3, 4, 5, 6, 7, or 8 mismatched or bulging nts over a continuous stretch of at least 15 nt, e.g., between 10-30 nt. In some embodiments, a guide strand and a target RNA sequence may form a duplex that contains no mismatched or bulging nucleotides over a continuous stretch of at least 10 nt, e.g., between 10-30 nt. In some embodiments, between 10-30 nt is 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nt.
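By way of non-limiting illustration, the complementarity criteria described above may be evaluated as in the following minimal sketch. It assumes an ungapped, antiparallel alignment of a guide strand against a target site of equal length, counts only Watson-Crick pairs (G:U wobble pairs are treated as mismatches, which is an assumption of this sketch), and does not model bulges. The function names are hypothetical.

```python
# Watson-Crick RNA pairs only; G:U wobble is counted as a mismatch (assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def percent_complementarity(guide, target_site):
    """Percent of guide positions that pair with the target site.

    Both sequences are written 5'->3' and must be the same length; guide
    position i lies opposite target position len-1-i (antiparallel duplex).
    """
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have equal length")
    matched = sum((g, t) in PAIRS for g, t in zip(guide, reversed(target_site)))
    return 100.0 * matched / len(guide)

def seed_is_perfect(guide, target_site, start=2, end=8):
    """True if guide positions start..end (1-based, inclusive) are all paired."""
    rev_target = target_site[::-1]
    return all((guide[i], rev_target[i]) in PAIRS for i in range(start - 1, end))
```

For a 21-nt guide strand with a single mismatch outside positions 2-8, such a check would report 20 of 21 positions paired (about 95% complementarity) together with a perfectly complementary seed region.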
As used herein, the term “RNAi agent” encompasses nucleic acids that can be used to achieve RNAi in eukaryotic cells. Short interfering RNA (siRNA), short hairpin RNA (shRNA), and microRNA (miRNA) are examples of RNAi agents. siRNAs typically comprise two separate nucleic acid strands that are hybridized to each other to form a structure that contains a double stranded (duplex) portion at least 15 nt in length, e.g., about 15-about 30 nt long, e.g., between 17-27 nt long, e.g., between 18-25 nt long, e.g., between 19-23 nt long, e.g., 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides. In some embodiments the strands of an siRNA are perfectly complementary to each other within the duplex portion. In some embodiments the duplex portion may contain one or more unmatched nucleotides, e.g., one or more mismatched (non-complementary) nucleotide pairs or bulged nucleotides. In some embodiments either or both strands of an siRNA may contain up to about 1, 2, 3, or 4 unmatched nucleotides within the duplex portion. In some embodiments a strand may have a length of between 15-35 nt, e.g., between 17-29 nt, e.g., 19-25 nt, e.g., 21-23 nt. Strands may be equal in length or may have different lengths in various embodiments. In some embodiments, strands may differ by 1-10 nt in length. A strand may have a 5′ phosphate group and/or a 3′ hydroxyl (—OH) group. Either or both strands of an siRNA may comprise a 3′ overhang of, e.g., about 1-10 nt (e.g., 1-5 nt, e.g., 2 nt). Overhangs may be the same length or different in lengths in various embodiments. In some embodiments an overhang may comprise or consist of deoxyribonucleotides, ribonucleotides, or modified nucleotides or modified ribonucleotides such as 2′-O-methylated nucleotides, or 2′-O-methyl-uridine. An overhang may be perfectly complementary, partly complementary, or not complementary to a target RNA in a hybrid formed by the guide strand and the target RNA in various embodiments.
shRNAs are nucleic acid molecules that comprise a stem-loop structure and a length typically between about 40-150 nt, e.g., about 50-100 nt, e.g., about 60-80 nt. A “stem-loop structure” (also referred to as a “hairpin” structure) refers to a nucleic acid having a secondary structure that includes a region of nucleotides which are known or predicted to form a double strand (stem portion; duplex) that is linked on one side by a region of (usually) predominantly single-stranded nucleotides (loop portion). Such structures are well known in the art and the term is used consistently with its meaning in the art. A guide strand sequence may be positioned in either arm of the stem, i.e., 5′ with respect to the loop or 3′ with respect to the loop in various embodiments. As is known in the art, the stem structure does not require exact base-pairing (perfect complementarity). Thus, the stem may include one or more unmatched residues or the base-pairing may be exact, i.e., it may not include any mismatches or bulges. In some embodiments the stem is between 15-30 nt, e.g., between 17-29 nt, e.g., between 19-25 nt. In some embodiments the stem is between 15-19 nt. In some embodiments the stem is between 19-30 nt. The primary sequence and number of nucleotides within the loop may vary. Examples of loop sequences include, e.g., UGGU; ACUCGAGA; UUCAAGAGA. In some embodiments a loop sequence found in a naturally occurring miRNA precursor molecule (e.g., a pre-miRNA) may be used. In some embodiments a loop sequence may be absent (in which case the termini of the duplex portion may be directly linked). In some embodiments a loop sequence may be at least partly self-complementary. In some embodiments the loop is between 1 and 20 nt in length, e.g., 1-15 nt, e.g., 4-9 nt. The shRNA structure may comprise a 5′ or 3′ overhang.
As known in the art, an shRNA may undergo intracellular processing, e.g., by the ribonuclease (RNase) III family enzyme known as Dicer, to remove the loop and generate an siRNA.
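By way of non-limiting illustration, an shRNA sequence having the stem-loop arrangement described above may be assembled from a guide strand sequence as in the following minimal sketch. The sketch assumes a perfectly base-paired stem with no overhang, uses the example loop sequence UUCAAGAGA by default, and enforces the 15-30 nt stem range given above; the function names and the "3p"/"5p" arm labels are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(rna.upper()))

def build_shrna(guide, loop="UUCAAGAGA", guide_arm="3p"):
    """Assemble a stem-loop sequence with the guide in the 3' or 5' arm.

    Assumptions: exact base pairing in the stem, no 5' or 3' overhang.
    """
    guide = guide.upper()
    if not 15 <= len(guide) <= 30:
        raise ValueError("stem length outside the 15-30 nt range")
    passenger = reverse_complement(guide)
    if guide_arm == "3p":
        return passenger + loop + guide   # passenger - loop - guide
    if guide_arm == "5p":
        return guide + loop + passenger   # guide - loop - passenger
    raise ValueError("guide_arm must be '3p' or '5p'")
```

With a 21-nt guide strand and the 9-nt default loop, the assembled sequence is 51 nt long, which falls within the typical shRNA length range stated above.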
Mature endogenous miRNAs are short (typically 18-24 nt, e.g., about 22 nt), single-stranded RNAs that are generated by intracellular processing from larger, endogenously encoded precursor RNA molecules termed miRNA precursors (see, e.g., Bartel, D., Cell. 116(2):281-97 (2004); Bartel D P. Cell. 136(2):215-33 (2009); Winter, J., et al., Nature Cell Biology 11: 228-234 (2009)). Artificial miRNA may be designed to take advantage of the endogenous RNAi pathway in order to bind with and/or silence a target RNA of interest. The sequence of such artificial miRNA may be selected so that one or more bulges is present when the artificial miRNA is hybridized to its target sequence, mimicking the structure of naturally occurring miRNA:mRNA hybrids. Those of ordinary skill in the art are aware of how to design artificial miRNA.
In some embodiments an RNAi agent is a vector (e.g., an expression vector) suitable for causing intracellular expression of one or more transcripts that give rise to a siRNA, shRNA, or miRNA in the cell. Such a vector may be referred to as an “RNAi vector”. An RNAi vector may comprise a template that, when transcribed, yields transcripts that may form a siRNA (e.g., as two separate strands that hybridize to each other), shRNA, or miRNA precursor (e.g., pre-miRNA).
Antisense oligonucleotides (ASOs) are small sequences of DNA or RNA (e.g., about 8-50 nucleotides in length) able to target RNA transcripts (e.g., regRNA, eRNA) by Watson-Crick base pairing. In some embodiments, oligonucleotides are unmodified. In other embodiments oligonucleotides include one or more modifications, e.g., to improve solubility, binding, potency, and/or stability of the antisense oligonucleotide. Modified oligonucleotides may comprise at least one modification relative to unmodified RNA or DNA. In some embodiments, oligonucleotides are modified to include internucleoside linkage modifications, sugar modifications, and/or nucleobase modifications. Examples of such modifications are known to those of skill in the art.
In some embodiments the oligonucleotide is modified by the substitution of at least one nucleotide with a modified nucleotide, such that in vivo stability is enhanced as compared to a corresponding unmodified oligonucleotide. In some aspects, the modified nucleotide is a sugar-modified nucleotide. In another aspect, the modified nucleotide is a nucleobase-modified nucleotide.
In some embodiments, oligonucleotides may contain at least one modified nucleotide analogue. The nucleotide analogues may be located at positions where the target-specific activity, e.g., the splice site selection modulating activity, is not substantially affected, e.g., in a region at the 5′-end and/or the 3′-end of the oligonucleotide molecule. In some aspects, the ends may be stabilized by incorporating modified nucleotide analogues.
In some aspects preferred nucleotide analogues include sugar- and/or backbone-modified ribonucleotides (i.e., include modifications to the phosphate-sugar backbone). For example, the phosphodiester linkages of a ribonucleotide may be modified to include at least one of a nitrogen or sulfur heteroatom. In preferred backbone-modified ribonucleotides the phosphoester group connecting to adjacent ribonucleotides is replaced by a modified group, e.g., a phosphorothioate group. In preferred sugar-modified ribonucleotides, the 2′ OH-group is replaced by a group selected from H, OR, R, halo, SH, SR, NH2, NHR, NR2 or ON, wherein R is C1-C6 alkyl, alkenyl or alkynyl and halo is F, Cl, Br or I.
In some embodiments, modified oligonucleotides comprise one or more modified nucleosides comprising a modified sugar moiety.
Modified oligonucleotides may comprise one or more nucleosides comprising an unmodified nucleobase. In some embodiments modified oligonucleotides comprise one or more nucleosides comprising a modified nucleobase. In some embodiments, modified oligonucleotides comprise one or more nucleosides that does not comprise a nucleobase.
In certain embodiments, modified nucleobases are selected from: 5-substituted pyrimidines, 6-azapyrimidines, alkyl or alkynyl substituted pyrimidines, alkyl substituted purines, and N-2, N-6 and O-6 substituted purines. In certain embodiments, modified nucleobases are selected from: 2-aminopropyladenine, 5-hydroxymethyl cytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-N-methylguanine, 6-N-methyladenine, 2-propyladenine, 2-thiouracil, 2-thiothymine and 2-thiocytosine, 5-propynyl uracil, 5-propynylcytosine, 6-azouracil, 6-azocytosine, 6-azothymine, 5-ribosyluracil (pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl, 8-aza and other 8-substituted purines, 5-halo, particularly 5-bromo, 5-trifluoromethyl, 5-halouracil, and 5-halocytosine, 7-methylguanine, 7-methyladenine, 2-F-adenine, 2-aminoadenine, 7-deazaguanine, 7-deazaadenine, 3-deazaguanine, 3-deazaadenine, 6-N-benzoyladenine, 2-N-isobutyrylguanine, 4-N-benzoylcytosine, 4-N-benzoyluracil, 5-methyl 4-N-benzoylcytosine, 5-methyl 4-N-benzoyluracil, universal bases, hydrophobic bases, promiscuous bases, size-expanded bases, and fluorinated bases. Further modified nucleobases include tricyclic pyrimidines, such as 1,3-diazaphenoxazine-2-one, 1,3-diazaphenothiazine-2-one and 9-(2-aminoethoxy)-1,3-diazaphenoxazine-2-one (G-clamp). Modified nucleobases may also include those in which the purine or pyrimidine base is replaced with other heterocycles, for example 7-deaza-adenine, 7-deazaguanosine, 2-aminopyridine and 2-pyridone.
Also preferred are nucleobase-modified ribonucleotides, i.e., ribonucleotides, containing at least one non-naturally occurring nucleobase instead of a naturally occurring nucleobase. Examples of modified nucleobases include, but are not limited to, uridine and/or cytidine modifications at the 5-position, e.g., 5-(2-amino)propyl uridine, 5-bromo uridine; adenosine and/or guanosines modified at the 8 position, e.g., 8-bromo guanosine; deaza nucleotides, e.g., 7-deaza-adenosine; O- and N-alkylated nucleotides, e.g., N6-methyl adenosine. Oligonucleotide reagents of the invention also may be modified with chemical moieties that improve the in vivo pharmacological properties of the oligonucleotide reagents.
In some embodiments, nucleosides of modified oligonucleotides are linked together using any internucleoside linkage.
Additional modifications are known by those of skill in the art and examples can be found in WO 2019/241648, U.S. Pat. Nos. 10,307,434, 9,045,518, and 10,266,822, each of which is incorporated herein by reference.
Oligonucleotides may be of any size and/or chemical composition sufficient to target a regRNA. In some embodiments, an oligonucleotide is between about 5-300 nucleotides or modified nucleotides. In some aspects an oligonucleotide is between about 10-100, 15-85, 20-70, 25-55, or 30-40 nucleotides or modified nucleotides. In certain aspects an oligonucleotide is between about 15-35, 15-20, 20-25, 25-30, or 30-35 nucleotides or modified nucleotides.
In some embodiments, an oligonucleotide and the target RNA sequence (e.g., regRNA, eRNA) have 100% sequence complementarity. In some aspects, an oligonucleotide may comprise sequence variations, e.g., insertions, deletions, and single point mutations, relative to the target sequence. In some embodiments, an oligonucleotide has at least 70% sequence identity or complementarity to the target RNA. In certain embodiments, an oligonucleotide has at least 70%, 75%, 80%, 85%, 90%, 95%, 97%, 99%, or 100% sequence identity to the target sequence.
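As an illustrative sketch only, and not part of the disclosed methods, percent complementarity between an oligonucleotide and a target RNA site of equal length can be computed by an ungapped, position-by-position comparison against the perfect antisense sequence. The function names, the ungapped comparison, and the use of 70% as a default threshold are assumptions chosen for illustration; insertions and deletions would require an alignment step that is not shown here.

```python
# Hypothetical helper names; ungapped comparison is an assumption of this sketch.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def normalize(seq: str) -> str:
    """Upper-case the sequence and treat DNA thymine as uracil."""
    return seq.upper().replace("T", "U")


def reverse_complement(rna: str) -> str:
    """Return the perfect antisense sequence (5'->3') for an RNA site (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(normalize(rna)))


def percent_complementarity(oligo: str, target_site: str) -> float:
    """Percent of oligo positions that base-pair with the target site.

    Both sequences are written 5'->3' and must have equal length.
    """
    oligo = normalize(oligo)
    if len(oligo) != len(target_site) or not oligo:
        raise ValueError("oligo and target site must be non-empty and of equal length")
    perfect_antisense = reverse_complement(target_site)
    matches = sum(1 for a, b in zip(oligo, perfect_antisense) if a == b)
    return 100.0 * matches / len(oligo)


def meets_threshold(oligo: str, target_site: str, minimum: float = 70.0) -> bool:
    """True if complementarity is at least `minimum` percent."""
    return percent_complementarity(oligo, target_site) >= minimum


if __name__ == "__main__":
    site = "AAAA"
    print(percent_complementarity("UUUU", site))  # fully complementary
    print(percent_complementarity("UUUA", site))  # one mismatch in four positions
```

Under these assumptions, an oligonucleotide with one mismatch in four positions scores 75% and would satisfy an "at least 70%" criterion, whereas one with two mismatches in four positions would not.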
In some embodiments, the ASO agent reduces the amount of the target regRNA in the condensate by increasing target regRNA degradation. In some embodiments, the ASO agent modulates the amount of nucleic acid in the condensate. In some embodiments, the ASO agent binds to and localizes with the target regRNA into the condensate, thereby increasing nucleic acid in the condensate. In some embodiments, the increase in nucleic acid accelerates condensate formation (e.g., increases transcription of the associated gene). In some embodiments, the increase in nucleic acid accelerates condensate dissolution (e.g., decreases transcription of the associated gene). In some embodiments, the ASO agent binds to and modulates the effective charge, structure, or behavior of the regRNA incorporated into the condensate. In some embodiments, the modulation of regRNA effective charge, structure, or behavior accelerates condensate formation (e.g., increases transcription of the associated gene). In some embodiments, the modulation of regRNA effective charge, structure, or behavior accelerates condensate dissolution (e.g., decreases transcription of the associated gene). In some embodiments, the modulation of regRNA effective charge, structure, or behavior stabilizes the condensate (e.g., increases transcription of the associated gene).
In some embodiments, transcription is modulated by contacting the one or more species of regRNA with an agent (e.g., RNAi agent or ASO agent) capable of specifically binding to the one or more species of regRNA. In some embodiments, the one or more species of regRNA comprises enhancer RNA (eRNA). In some embodiments, the eRNA is a sequence variant associated with a disease or condition.
In some embodiments, the amount, effective charge, structure, or behavior of nucleic acid incorporated into the condensate is modulated by contact with an agent specifically binding to a nucleic acid associated with the condensate. The agent is not limited and may be any suitable agent described herein specifically binding to a nucleic acid associated with the condensate. In some embodiments, the agent is an oligonucleotide (e.g., RNAi agent or ASO agent) specifically binding a target messenger RNA or portion thereof. In some embodiments, the agent reduces the amount of oligonucleotide (e.g., mRNA) associated with the condensate by increasing target oligonucleotide (e.g., mRNA) degradation. In some embodiments, the agent modulates the amount of nucleic acid in the condensate. In some embodiments, the agent binds to and localizes with the target oligonucleotide (e.g., mRNA) into the condensate, thereby increasing nucleic acid in the condensate. In some embodiments, the increase in nucleic acid accelerates condensate formation (e.g., increases transcription of the associated gene). In some embodiments, the increase in nucleic acid accelerates condensate dissolution (e.g., decreases transcription of associated gene). In some embodiments, the ASO agent binds to and modulates the effective charge, structure, or behavior of the oligonucleotide (e.g., mRNA) incorporated into the condensate. In some embodiments, the modulation of oligonucleotide (e.g., mRNA) effective charge, structure, or behavior accelerates condensate formation (e.g., increases transcription of the associated gene). In some embodiments, the modulation of oligonucleotide (e.g., mRNA) effective charge, structure, or behavior accelerates condensate dissolution (e.g., decreases transcription of the associated gene). In some embodiments, the modulation of oligonucleotide (e.g., mRNA) effective charge, structure, or behavior stabilizes the condensate (e.g., increases transcription of the associated gene).
In some embodiments, the condensate dependent transcription of a gene occurs in a cell. The cell is not limited and may be any cell described herein. In some embodiments, the cell is a cancer cell. In some embodiments, the cell is a nerve cell. In some embodiments, the condensate dependent transcription of a gene occurs in vivo in a subject. The subject is not limited. The term “subject” encompasses any vertebrate including but not limited to mammals (e.g., rats, mice, rabbits, sheep, cats, dogs, cows, pigs, and non-human primates), reptiles, amphibians and fish. However, advantageously, the subject is a mammal such as a human, or other mammals such as a domesticated mammal, e.g. dog, cat, horse, and the like, or production mammal, e.g. cow, sheep, pig, and the like.
In some embodiments, modulating of condensate dependent transcription of a gene treats, prevents or reduces the likelihood of a disease or condition in the subject. The disease or condition is not limited and may be any disease or condition disclosed herein treatable by modulating condensate dependent transcription of a gene. In some embodiments, modulating of condensate dependent transcription of a gene treats, prevents or reduces the likelihood of a disease or condition associated with a haploinsufficiency. In some embodiments, the disease or condition associated with a haploinsufficiency is a cancer, 1q21.1 deletion syndrome, 5q-syndrome in myelodysplastic syndrome (MDS), 22q11.2 deletion syndrome, CHARGE syndrome, Cleidocranial dysostosis, Ehlers-Danlos syndrome, Frontotemporal dementia caused by mutations in progranulin, GLUT1 deficiency (DeVivo syndrome), Haploinsufficiency of A20, Holoprosencephaly caused by haploinsufficiency in the Sonic Hedgehog gene, Holt-Oram syndrome, Marfan syndrome, Phelan-McDermid syndrome, Polydactyly, or Dravet Syndrome. In some embodiments, modulating of condensate dependent transcription of a gene treats, prevents or reduces the likelihood of a disease or condition associated with gene duplication. In some embodiments, the disease or condition associated with gene duplication is a cancer with an oncogene duplication, Charcot-Marie-Tooth disease type I, or MECP2 duplication syndrome. In some embodiments, modulating of condensate dependent transcription of a gene treats, prevents or reduces the likelihood of a disease or condition associated with an eRNA variant (e.g., an eRNA comprising an SNP). In some embodiments, modulating of condensate dependent transcription of a gene treats, prevents or reduces the likelihood of a disease or condition associated with aberrant transcription (e.g., cancer).
“Treat” as used herein covers any treatment of a disease or condition of a mammal (e.g., cancer, a disease or condition associated with a haploinsufficiency, gene duplication, or an eRNA variant), particularly a human, and includes: (a) preventing symptoms of the disease or condition (e.g., cancer, a disease or condition associated with a haploinsufficiency, gene duplication, or an eRNA variant) from occurring in a subject which may be predisposed to the disease or condition but has not yet begun experiencing symptoms; (b) inhibiting the disease or condition (e.g., arresting its development); or (c) relieving the disease or condition (e.g., causing regression of the disease or condition, providing improvement in one or more symptoms). The method of administration is not limited and may be any suitable method of administration.
The agents may be administered in pharmaceutically acceptable solutions, which may routinely contain pharmaceutically acceptable concentrations of salt, buffering agents, preservatives, compatible carriers, adjuvants, and optionally other therapeutic ingredients.
The agents may be formulated into preparations in solid, semi-solid, liquid or gaseous forms such as tablets, capsules, powders, granules, ointments, solutions, suppositories, inhalants and injections, and in the usual ways for oral, parenteral or surgical administration. The invention also embraces pharmaceutical compositions which are formulated for local administration, such as by implants.
Compositions suitable for oral administration may be presented as discrete units, such as capsules, tablets, lozenges, each containing a predetermined amount of the active agent. Other compositions include suspensions in aqueous liquids or non-aqueous liquids such as a syrup, elixir or an emulsion.
In some embodiments, agents may be administered directly to a tissue. Direct tissue administration may be achieved by direct injection. The agents may be administered once, or alternatively they may be administered in a plurality of administrations. If administered multiple times, the agents may be administered via different routes. For example, the first (or the first few) administrations may be made directly into the affected tissue while later administrations may be systemic.
For oral administration, compositions can be formulated readily by combining the agent with pharmaceutically acceptable carriers well known in the art. Such carriers enable the agents to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a subject to be treated. Pharmaceutical preparations for oral use can be obtained using a solid excipient, optionally grinding a resulting mixture, and processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethyl cellulose, sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such as cross-linked polyvinylpyrrolidone, agar, or alginic acid or a salt thereof such as sodium alginate. Optionally, the oral formulations may also be formulated in saline or buffers for neutralizing internal acid conditions or may be administered without any carriers.
Dragee cores are provided with suitable coatings. For this purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be added to the tablets or dragee coatings for identification or to characterize different combinations of active compound doses.
Pharmaceutical preparations which can be used orally include push-fit capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, stabilizers may be added. Microspheres formulated for oral administration may also be used. Such microspheres have been well defined in the art. All formulations for oral administration should be in dosages suitable for such administration. For buccal administration, the compositions may take the form of tablets or lozenges formulated in a conventional manner.
The compounds, when it is desirable to deliver them systemically, may be formulated for parenteral administration by injection, e.g., by bolus injection or continuous infusion. Formulations for injection may be presented in unit dosage form, e.g., in ampoules or in multi-dose containers, with an added preservative. The compositions may take such forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing and/or dispersing agents.
Preparations for parenteral administration include sterile aqueous or non-aqueous solutions, suspensions, and emulsions. Examples of non-aqueous solvents are propylene glycol, polyethylene glycol, vegetable oils such as olive oil, and injectable organic esters such as ethyl oleate. Aqueous carriers include water, alcoholic/aqueous solutions, emulsions or suspensions, including saline and buffered media. Parenteral vehicles include sodium chloride solution, Ringer's dextrose, dextrose and sodium chloride, lactated Ringer's, or fixed oils. Intravenous vehicles include fluid and nutrient replenishers, electrolyte replenishers (such as those based on Ringer's dextrose), and the like. Preservatives and other additives may also be present such as, for example, antimicrobials, anti-oxidants, chelating agents, inert gases and the like. In the event that a response in a subject is insufficient at the initial doses applied, higher doses (or effectively higher doses by a different, more localized delivery route) may be employed to the extent that patient tolerance permits. Multiple doses per day are contemplated in some embodiments to achieve appropriate systemic levels of compounds. In some embodiments, the method further includes administering to the subject an effective amount of at least one chemotherapeutic agent. The chemotherapeutic agent is not limited and may be any suitable chemotherapeutic agent known in the art.
RNAi and ASO agents described herein may be formulated with one or more acceptable reagents, which provide a vehicle for delivering such agents to target cells. Appropriate reagents are generally selected with regard to a number of factors, which include, among other things, the biological or chemical properties of the RNAi and ASO agents, the intended route of administration, the anticipated biological environment to which such RNAi and ASO agents will be exposed and the specific properties of the intended target cells. In some embodiments, transfer vehicles, such as liposomes, encapsulate the RNAi and ASO agents without compromising biological activity. In some embodiments, the transfer vehicle demonstrates preferential and/or substantial binding to a target cell relative to non-target cells. In a preferred embodiment, the transfer vehicle delivers its contents to the target cell such that the RNAi and ASO agents are delivered to the appropriate subcellular compartment, such as the cytoplasm.
In some embodiments, the transfer vehicle in the compositions of the invention is a liposomal transfer vehicle, e.g., a lipid nanoparticle. In one embodiment, the transfer vehicle may be selected and/or prepared to optimize delivery of the nucleic acid to a target cell. For example, if the target cell is a hepatocyte, the properties of the transfer vehicle (e.g., size, charge and/or pH) may be optimized to effectively deliver such transfer vehicle to the target cell, reduce immune clearance and/or promote retention in that target cell. Alternatively, if the target cell is in the central nervous system (e.g., for the treatment of neurodegenerative diseases, where the transfer vehicle may specifically target brain or spinal tissue), selection and preparation of the transfer vehicle must consider penetration of, and retention within, the blood-brain barrier and/or the use of alternate means of directly delivering such transfer vehicle to such target cell.
The use of liposomal transfer vehicles to facilitate the delivery of nucleic acids to target cells is contemplated by the present disclosure. Liposomes (e.g., liposomal lipid nanoparticles) are generally useful in a variety of applications in research, industry, and medicine, particularly for their use as transfer vehicles of diagnostic or therapeutic compounds in vivo (Lasic, Trends Biotechnol., 16: 307-321, 1998; Drummond et al., Pharmacol. Rev., 51: 691-743, 1999) and are usually characterized as microscopic vesicles having an interior aqueous space sequestered from an outer medium by a membrane of one or more bilayers. Bilayer membranes of liposomes are typically formed by amphiphilic molecules, such as lipids of synthetic or natural origin that comprise spatially separated hydrophilic and hydrophobic domains (Lasic, Trends Biotechnol., 16: 307-321, 1998). Bilayer membranes of the liposomes can also be formed by amphiphilic polymers and surfactants (e.g., polymerosomes, niosomes, etc.).
In the context of the present disclosure, a liposomal transfer vehicle typically serves to transport the nucleic acid to the target cell. For the purposes of the present invention, the liposomal transfer vehicles are prepared to contain the desired nucleic acids. The process of incorporation of a desired RNAi or ASO agent into a liposome is often referred to as “loading” (Lasic, et al., FEBS Lett., 312: 255-258, 1992). The liposome-incorporated nucleic acids may be completely or partially located in the interior space of the liposome, within the bilayer membrane of the liposome, or associated with the exterior surface of the liposome membrane. The incorporation of a nucleic acid into liposomes is also referred to herein as “encapsulation” wherein the nucleic acid is entirely contained within the interior space of the liposome. The purpose of incorporating a nucleic acid into a transfer vehicle, such as a liposome, is often to protect the nucleic acid from an environment which may contain enzymes or chemicals that degrade nucleic acids and/or systems or receptors that cause the rapid excretion of the nucleic acids. Accordingly, in a preferred embodiment of the present invention, the selected transfer vehicle is capable of enhancing the stability of the nucleic acid contained therein. The liposome can allow the encapsulated nucleic acid to reach the target cell and/or may preferentially allow the encapsulated nucleic acid to reach the target cell, or alternatively limit the delivery of such nucleic acid to other sites or cells where the presence of the administered nucleic acid may be useless or undesirable. Furthermore, incorporating the RNAi and ASO agents into a transfer vehicle, such as for example, a cationic liposome, also facilitates the delivery of such into a target cell.
Liposomal transfer vehicles can be prepared to encapsulate one or more desired RNAi and ASO agents such that the compositions demonstrate a high transfection efficiency and enhanced stability. While liposomes can facilitate introduction of nucleic acids into target cells, the addition of polycations (e.g., poly L-lysine and protamine), as a copolymer can facilitate, and in some instances markedly enhance the transfection efficiency of several types of cationic liposomes by 2-28 fold in a number of cell lines both in vitro and in vivo. (See N. J. Caplen, et al., Gene Ther. 1995; 2: 603; S. Li, et al., Gene Ther. 1997; 4, 891.)
In some embodiments, the transfer vehicle is formulated as a lipid nanoparticle. As used herein, the phrase “lipid nanoparticle” refers to a transfer vehicle comprising one or more lipids (e.g., cationic lipids, non-cationic lipids, and PEG-modified lipids). Preferably, the lipid nanoparticles are formulated to deliver one or more RNAi or ASO agents to one or more target cells.
Examples of suitable lipids include, for example, the phosphatidyl compounds (e.g., phosphatidylglycerol, phosphatidylcholine, phosphatidylserine, phosphatidylethanolamine, sphingolipids, cerebrosides, and gangliosides). Also contemplated is the use of polymers as transfer vehicles, whether alone or in combination with other transfer vehicles. Suitable polymers may include, for example, polyacrylates, polyalkylcyanoacrylates, polylactide, polylactide-polyglycolide copolymers, polycaprolactones, dextran, albumin, gelatin, alginate, collagen, chitosan, cyclodextrins, dendrimers and polyethylenimine. In one embodiment, the transfer vehicle is selected based upon its ability to facilitate the transfection of a nucleic acid to a target cell.
The present disclosure contemplates the use of lipid nanoparticles as transfer vehicles comprising a cationic lipid to encapsulate and/or enhance the delivery of nucleic acid into the target cell. As used herein, the phrase “cationic lipid” refers to any of a number of lipid species that carry a net positive charge at a selected pH, such as physiological pH. The contemplated lipid nanoparticles may be prepared by including multi-component lipid mixtures of varying ratios employing one or more cationic lipids, non-cationic lipids and PEG-modified lipids. Several cationic lipids have been described in the literature, many of which are commercially available.
Suitable cationic lipids of use in the compositions and methods herein include those described in international patent publication WO 2010/053572, incorporated herein by reference, e.g., C12-200 described at paragraph of WO 2010/053572. In certain embodiments, the compositions and methods of the invention employ a lipid nanoparticle comprising an ionizable cationic lipid such as, e.g., (15Z,18Z)-N,N-dimethyl-6-((9Z,12Z)-octadeca-9,12-dien-1-yl)tetracosa-15,18-dien-1-amine (HGT5000), (15Z,18Z)-N,N-dimethyl-6-((9Z,12Z)-octadeca-9,12-dien-1-yl)tetracosa-4,15,18-trien-1-amine (HGT5001), and (15Z,18Z)-N,N-dimethyl-6-((9Z,12Z)-octadeca-9,12-dien-1-yl)tetracosa-5,15,18-trien-1-amine (HGT5002).
In some embodiments, the cationic lipid N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride or “DOTMA” is used (Felgner et al., Proc. Nat'l Acad. Sci. 84, 7413 (1987); U.S. Pat. No. 4,897,355). DOTMA can be formulated alone or can be combined with the neutral lipid, dioleoylphosphatidyl-ethanolamine or “DOPE” or other cationic or non-cationic lipids into a liposomal transfer vehicle or a lipid nanoparticle, and such liposomes can be used to enhance the delivery of nucleic acids into target cells. Other suitable cationic lipids include, for example, 5-carboxyspermylglycinedioctadecylamide or “DOGS,” 2,3-dioleyloxy-N-[2(spermine-carboxamido)ethyl]-N,N-dimethyl-1-propanaminium or “DOSPA” (Behr et al. Proc. Nat'l Acad. Sci. 86, 6982 (1989); U.S. Pat. Nos. 5,171,678; 5,334,761), 1,2-Dioleoyl-3-Dimethylammonium-Propane or “DODAP”, 1,2-Dioleoyl-3-Trimethylammonium-Propane or “DOTAP”. Contemplated cationic lipids also include 1,2-distearyloxy-N,N-dimethyl-3-aminopropane or “DSDMA”, 1,2-dioleyloxy-N,N-dimethyl-3-aminopropane or “DODMA”, 1,2-dilinoleyloxy-N,N-dimethyl-3-aminopropane or “DLinDMA”, 1,2-dilinolenyloxy-N,N-dimethyl-3-aminopropane or “DLenDMA”, N-dioleyl-N,N-dimethylammonium chloride or “DODAC”, N,N-distearyl-N,N-dimethylammonium bromide or “DDAB”, N-(1,2-dimyristyloxyprop-3-yl)-N,N-dimethyl-N-hydroxyethyl ammonium bromide or “DMRIE”, 3-dimethylamino-2-(cholest-5-en-3-beta-oxybutan-4-oxy)-1-(cis,cis-9,12-octadecadienoxy)propane or “CLinDMA”, 2-[5′-(cholest-5-en-3-beta-oxy)-3′-oxapentoxy)-3-dimethyl-1-(cis,cis-9′,1-2′-octadecadienoxy)propane or “CpLinDMA”, N,N-dimethyl-3,4-dioleyloxybenzylamine or “DMOBA”, 1,2-N,N′-dioleylcarbamyl-3-dimethylaminopropane or “DOcarbDAP”, 2,3-Dilinoleoyloxy-N,N-dimethylpropylamine or “DLinDAP”, 1,2-N,N′-Dilinoleylcarbamyl-3-dimethylaminopropane or “DLincarbDAP”, 1,2-Dilinoleoylcarbamyl-3-dimethylaminopropane or “DLinCDAP”, 2,2-dilinoleyl-4-dimethylaminomethyl-[1,3]-dioxolane or “DLin-K-DMA”, 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane or “DLin-K-XTC2-DMA”, and 2-(2,2-di((9Z,12Z)-octadeca-9,12-dien-1-yl)-1,3-dioxolan-4-yl)-N,N-dimethylethanamine (DLin-KC2-DMA) (See, WO 2010/042877; Semple et al., Nature Biotech. 28:172-176 (2010)), or mixtures thereof. (Heyes, J., et al., J Controlled Release 107: 276-287 (2005); Morrissey, D V., et al., Nat. Biotechnol. 23(8): 1003-1007 (2005); PCT Publication WO2005/121348A1).
The use of cholesterol-based cationic lipids is also contemplated by the present disclosure. Such cholesterol-based cationic lipids can be used, either alone or in combination with other cationic or non-cationic lipids. Suitable cholesterol-based cationic lipids include, for example, DC-Chol (N,N-dimethyl-N-ethylcarboxamidocholesterol), 1,4-bis(3-N-oleylamino-propyl)piperazine (Gao, et al. Biochem. Biophys. Res. Comm. 179, 280 (1991); Wolf et al. BioTechniques 23, 139 (1997); U.S. Pat. No. 5,744,335), or ICE.
The skilled artisan will appreciate that various reagents are commercially available to enhance transfection efficacy. Suitable examples include LIPOFECTIN (DOTMA:DOPE) (Invitrogen, Carlsbad, Calif.), LIPOFECTAMINE (DOSPA:DOPE) (Invitrogen), LIPOFECTAMINE 2000 (Invitrogen), FUGENE, TRANSFECTAM (DOGS), and EFFECTENE.
Also contemplated are cationic lipids such as the dialkylamino-based, imidazole-based, and guanidinium-based lipids. For example, certain embodiments are directed to a composition comprising one or more imidazole-based cationic lipids, for example, the imidazole cholesterol ester or “ICE” lipid (3S,10R,13R,17R)-10,13-dimethyl-17-((R)-6-methylheptan-2-yl)-2,3,4,7,8,9,10,11,12,13,14,15,16,17-tetradecahydro-1H-cyclopenta[a]phenanthren-3-yl 3-(1H-imidazol-4-yl)propanoate. In a preferred embodiment, a transfer vehicle for delivery of synthetic RNA (e.g., modified mRNA) may comprise one or more imidazole-based cationic lipids, for example, the imidazole cholesterol ester or “ICE” lipid (3S,10R,13R,17R)-10,13-dimethyl-17-((R)-6-methylheptan-2-yl)-2,3,4,7,8,9,10,11,12,13,14,15,16,17-tetradecahydro-1H-cyclopenta[a]phenanthren-3-yl 3-(1H-imidazol-4-yl)propanoate.
The imidazole-based cationic lipids are also characterized by their reduced toxicity relative to other cationic lipids. The imidazole-based cationic lipids (e.g., ICE) may be used as the sole cationic lipid in the lipid nanoparticle, or alternatively may be combined with traditional cationic lipids, non-cationic lipids, and PEG-modified lipids. The cationic lipid may comprise a molar ratio of about 1% to about 90%, about 2% to about 70%, about 5% to about 50%, about 10% to about 40% of the total lipid present in the transfer vehicle, or preferably about 20% to about 70% of the total lipid present in the transfer vehicle.
In some embodiments, the lipid nanoparticles comprise the HGT4003 cationic lipid 2-((2,3-Bis((9Z,12Z)-octadeca-9,12-dien-1-yloxy)propyl)disulfanyl)-N,N-dimethylethanamine, as further described in US Pub. No. 20140288160, the entire teachings of which are incorporated herein by reference.
In other embodiments the compositions and methods described herein are directed to lipid nanoparticles comprising one or more cleavable lipids, such as, for example, one or more cationic lipids or compounds that comprise a cleavable disulfide (S—S) functional group (e.g., HGT4001, HGT4002, HGT4003, HGT4004 and HGT4005), as further described in US Pub. No. 20140288160, the entire teachings of which are incorporated herein by reference in their entirety.
The use of polyethylene glycol (PEG)-modified phospholipids and derivatized lipids such as derivatized ceramides (PEG-CER), including N-Octanoyl-Sphingosine-1-[Succinyl(Methoxy Polyethylene Glycol)-2000] (C8 PEG-2000 ceramide) is also contemplated by the present invention, either alone or preferably in combination with other lipids which together comprise the transfer vehicle (e.g., a lipid nanoparticle). Contemplated PEG-modified lipids include, but are not limited to, a polyethylene glycol chain of up to 5 kDa in length covalently attached to a lipid with alkyl chain(s) of C6-C20 length. The addition of such components may prevent complex aggregation and may also provide a means for increasing circulation lifetime and increasing the delivery of the lipid-nucleic acid composition to the target cell (Klibanov et al. (1990) FEBS Letters, 268 (1): 235-237), or they may be selected to rapidly exchange out of the formulation in vivo (see U.S. Pat. No. 5,885,613). In some embodiments, exchangeable lipids comprise PEG-ceramides having shorter acyl chains (e.g., C14 or C18). The PEG-modified phospholipid and derivatized lipids of the present invention may comprise a molar ratio from about 0% to about 20%, about 0.5% to about 20%, about 1% to about 15%, about 4% to about 10%, or about 2% of the total lipid present in the liposomal transfer vehicle.
The present disclosure also contemplates the use of non-cationic lipids. As used herein, the phrase “non-cationic lipid” refers to any neutral, zwitterionic or anionic lipid. As used herein, the phrase “anionic lipid” refers to any of a number of lipid species that carry a net negative charge at a selected pH, such as physiological pH. Non-cationic lipids include, but are not limited to, distearoylphosphatidylcholine (DSPC), dioleoylphosphatidylcholine (DOPC), dipalmitoylphosphatidylcholine (DPPC), dioleoylphosphatidylglycerol (DOPG), dipalmitoylphosphatidylglycerol (DPPG), dioleoylphosphatidylethanolamine (DOPE), palmitoyloleoylphosphatidylcholine (POPC), palmitoyloleoyl-phosphatidylethanolamine (POPE), dioleoyl-phosphatidylethanolamine 4-(N-maleimidomethyl)-cyclohexane-1-carboxylate (DOPE-mal), dipalmitoyl phosphatidyl ethanolamine (DPPE), dimyristoylphosphoethanolamine (DMPE), distearoyl-phosphatidyl-ethanolamine (DSPE), 16-O-monomethyl PE, 16-O-dimethyl PE, 18-1-trans PE, 1-stearoyl-2-oleoyl-phosphatidylethanolamine (SOPE), cholesterol, or a mixture thereof. Such non-cationic lipids may be used alone, but are preferably used in combination with other excipients, for example, cationic lipids. When used in combination with a cationic lipid, the non-cationic lipid may comprise a molar ratio of about 5% to about 90%, or preferably about 10% to about 70% of the total lipid present in the transfer vehicle.
In some embodiments, the transfer vehicle (e.g., a lipid nanoparticle) is prepared by combining multiple lipid and/or polymer components. For example, a transfer vehicle may be prepared using C12-200, DOPE, chol, DMG-PEG2K at a molar ratio of 40:30:25:5, or DODAP, DOPE, cholesterol, DMG-PEG2K at a molar ratio of 18:56:20:6, or HGT5000, DOPE, chol, DMG-PEG2K at a molar ratio of 40:20:35:5, or HGT5001, DOPE, chol, DMG-PEG2K at a molar ratio of 40:20:35:5. The selection of cationic lipids, non-cationic lipids and/or PEG-modified lipids which comprise the lipid nanoparticle, as well as the relative molar ratio of such lipids to each other, is based upon the characteristics of the selected lipid(s), the nature of the intended target cells, the characteristics of the synthetic RNA (e.g., modified mRNA) to be delivered. Additional considerations include, for example, the saturation of the alkyl chain, as well as the size, charge, pH, pKa, fusogenicity and toxicity of the selected lipid(s). Thus the molar ratios may be adjusted accordingly. For example, in embodiments, the percentage of cationic lipid in the lipid nanoparticle may be greater than 10%, greater than 20%, greater than 30%, greater than 40%, greater than 50%, greater than 60%, or greater than 70%. The percentage of non-cationic lipid in the lipid nanoparticle may be greater than 5%, greater than 10%, greater than 20%, greater than 30%, or greater than 40%. The percentage of cholesterol in the lipid nanoparticle may be greater than 10%, greater than 20%, greater than 30%, or greater than 40%. The percentage of PEG-modified lipid in the lipid nanoparticle may be greater than 1%, greater than 2%, greater than 5%, greater than 10%, or greater than 20%.
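For illustration, the molar ratios recited above can be normalized to mole percentages and compared against a chosen threshold. This is a minimal sketch: the function names and the dictionary layout are hypothetical, the keys naming each formulation are labels chosen here, and the only data used are the ratios stated in the text.

```python
# Molar ratios as recited in the text; the dictionary keys are illustrative labels.
FORMULATIONS = {
    "C12-200": {"C12-200": 40, "DOPE": 30, "chol": 25, "DMG-PEG2K": 5},
    "DODAP": {"DODAP": 18, "DOPE": 56, "cholesterol": 20, "DMG-PEG2K": 6},
    "HGT5000": {"HGT5000": 40, "DOPE": 20, "chol": 35, "DMG-PEG2K": 5},
    "HGT5001": {"HGT5001": 40, "DOPE": 20, "chol": 35, "DMG-PEG2K": 5},
}


def mole_percent(ratio: dict) -> dict:
    """Normalize a molar ratio (arbitrary parts) to mole percent of total lipid."""
    total = sum(ratio.values())
    if total <= 0:
        raise ValueError("molar ratio must have a positive total")
    return {name: 100.0 * parts / total for name, parts in ratio.items()}


def exceeds(ratio: dict, component: str, threshold_percent: float) -> bool:
    """True if `component` makes up more than `threshold_percent` of total lipid."""
    return mole_percent(ratio)[component] > threshold_percent


if __name__ == "__main__":
    for label, ratio in FORMULATIONS.items():
        print(label, mole_percent(ratio))
```

Because each recited ratio already sums to 100 parts, the normalized values equal the stated numbers; the normalization step matters only when a ratio is given in other units (for example 8:4:7:1).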
In certain embodiments, the lipid nanoparticles of the present disclosure comprise at least one of the following cationic lipids: C12-200, DLin-KC2-DMA, DODAP, HGT4003, ICE, HGT5000, or HGT5001. In embodiments, the transfer vehicle comprises cholesterol and/or a PEG-modified lipid. In some embodiments, the transfer vehicle comprises DMG-PEG2K. In certain embodiments, the transfer vehicle comprises one of the following lipid formulations: C12-200, DOPE, chol, DMG-PEG2K; DODAP, DOPE, cholesterol, DMG-PEG2K; HGT5000, DOPE, chol, DMG-PEG2K; or HGT5001, DOPE, chol, DMG-PEG2K.
The liposomal transfer vehicles for use in the compositions of the disclosure can be prepared by various techniques which are presently known in the art. Multi-lamellar vesicles (MLV) may be prepared by conventional techniques, for example, by depositing a selected lipid on the inside wall of a suitable container or vessel by dissolving the lipid in an appropriate solvent, and then evaporating the solvent to leave a thin film on the inside of the vessel or by spray drying. An aqueous phase may then be added to the vessel with a vortexing motion which results in the formation of MLVs. Uni-lamellar vesicles (ULV) can then be formed by homogenization, sonication or extrusion of the multi-lamellar vesicles. In addition, unilamellar vesicles can be formed by detergent removal techniques.
In certain embodiments, the compositions of the present disclosure comprise a transfer vehicle wherein the RNAi or ASO agent is both associated with the surface of the transfer vehicle and encapsulated within the same transfer vehicle. For example, during preparation of the compositions of the present invention, cationic liposomal transfer vehicles may associate with the agent through electrostatic interactions.
In certain embodiments, the compositions of the invention may be loaded with diagnostic radionuclides, fluorescent materials or other materials that are detectable in both in vitro and in vivo applications. For example, suitable diagnostic materials for use in the present invention may include Rhodamine-dioleoylphosphatidylethanolamine (Rh-PE), Green Fluorescent Protein mRNA (GFP mRNA), Renilla Luciferase mRNA and Firefly Luciferase mRNA.
Selection of the appropriate size of a liposomal transfer vehicle may take into consideration the site of the target cell or tissue and to some extent the application for which the liposome is being made. In some embodiments, it may be desirable to limit transfection of the RNAi or ASO agent to certain cells or tissues. For example, to target hepatocytes a liposomal transfer vehicle may be sized such that its dimensions are smaller than the fenestrations of the endothelial layer lining hepatic sinusoids in the liver; accordingly the liposomal transfer vehicle can readily penetrate such endothelial fenestrations to reach the target hepatocytes. Alternatively, a liposomal transfer vehicle may be sized such that the dimensions of the liposome are of a sufficient diameter to limit or expressly avoid distribution into certain cells or tissues. For example, a liposomal transfer vehicle may be sized such that its dimensions are larger than the fenestrations of the endothelial layer lining hepatic sinusoids to thereby limit distribution of the liposomal transfer vehicle to hepatocytes. Generally, the size of the transfer vehicle is within the range of about 25 to 250 nm, preferably less than about 250 nm, 175 nm, 150 nm, 125 nm, 100 nm, 75 nm, 50 nm, 25 nm or 10 nm.
A variety of alternative methods known in the art are available for sizing of a population of liposomal transfer vehicles. One such sizing method is described in U.S. Pat. No. 4,737,323, incorporated herein by reference. Sonicating a liposome suspension either by bath or probe sonication produces a progressive size reduction down to small ULV less than about 0.05 microns in diameter. Homogenization is another method that relies on shearing energy to fragment large liposomes into smaller ones. In a typical homogenization procedure, MLV are recirculated through a standard emulsion homogenizer until selected liposome sizes, typically between about 0.1 and 0.5 microns, are observed. The size of the liposomal vesicles may be determined by quasi-elastic light scattering (QELS) as described in Bloomfield, Ann. Rev. Biophys. Bioeng., 10:421-450 (1981), incorporated herein by reference. Average liposome diameter may be reduced by sonication of formed liposomes. Intermittent sonication cycles may be alternated with QELS assessment to guide efficient liposome synthesis.
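By way of a hedged illustration, a QELS measurement yields a translational diffusion coefficient, which is commonly converted to a hydrodynamic diameter with the Stokes-Einstein relation, d = kT / (3πηD). The Python sketch below shows that conversion and a simple check against the roughly 0.05 micron (50 nm) small-ULV size mentioned above. The function names, the default temperature of 298.15 K, and the water viscosity of 8.9e-4 Pa·s are assumptions for this example and are not taken from the disclosure.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)


def hydrodynamic_diameter_nm(
    diffusion_m2_s: float,
    temperature_k: float = 298.15,   # assumed: 25 degrees C
    viscosity_pa_s: float = 8.9e-4,  # assumed: approximate viscosity of water at 25 degrees C
) -> float:
    """Stokes-Einstein: d = kT / (3 * pi * eta * D), returned in nanometers."""
    if diffusion_m2_s <= 0:
        raise ValueError("diffusion coefficient must be positive")
    diameter_m = (BOLTZMANN_J_PER_K * temperature_k) / (
        3.0 * math.pi * viscosity_pa_s * diffusion_m2_s
    )
    return diameter_m * 1e9


def below_target(diameter_nm: float, target_nm: float = 50.0) -> bool:
    """True once the measured diameter is under the target (default 50 nm = 0.05 micron)."""
    return diameter_nm < target_nm


if __name__ == "__main__":
    # Hypothetical diffusion coefficient, chosen only to exercise the functions.
    d_nm = hydrodynamic_diameter_nm(5.0e-12)
    print(f"hydrodynamic diameter: {d_nm:.1f} nm; below 50 nm: {below_target(d_nm)}")
```

In a workflow that alternates sonication with QELS assessment, a check such as `below_target` could serve as the stopping condition between sonication cycles.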
Some aspects of the present disclosure are directed to a method of treating, preventing or reducing the likelihood of a disease or condition associated with aberrant condensate dependent transcription of a gene in a subject comprising administering to the subject an agent that modulates an amount, effective charge, structure, or behavior of nucleic acid in the condensate and thereby modulates transcription of the gene. The subject is not limited and may be any subject disclosed herein. In some embodiments, the subject is human.
The disease or condition associated with aberrant condensate dependent transcription is not limited and may be any suitable disease or condition described herein. In some embodiments, modulating aberrant condensate dependent transcription of a gene treats, prevents or reduces the likelihood of a disease or condition associated with a haploinsufficiency. In some embodiments, the disease or condition associated with a haploinsufficiency is a cancer, 1q21.1 deletion syndrome, 5q-syndrome in myelodysplastic syndrome (MDS), 22q11.2 deletion syndrome, CHARGE syndrome, Cleidocranial dysostosis, Ehlers-Danlos syndrome, Frontotemporal dementia caused by mutations in progranulin, GLUT1 deficiency (De Vivo syndrome), Haploinsufficiency of A20, Holoprosencephaly caused by haploinsufficiency in the Sonic Hedgehog gene, Holt-Oram syndrome, Marfan syndrome, Phelan-McDermid syndrome, Polydactyly, or Dravet Syndrome. In some embodiments, modulating aberrant condensate dependent transcription of a gene treats, prevents or reduces the likelihood of a disease or condition associated with gene duplication. In some embodiments, the disease or condition associated with gene duplication is a cancer with an oncogene duplication, Charcot-Marie-Tooth disease type I, or MECP2 duplication syndrome. In some embodiments, the disease or condition is associated with an eRNA variant.
In some embodiments, the condensate dependent transcription occurs in the presence of one or more species of regulatory RNA (regRNA) in the condensate. The regRNA is not limited and may be any regRNA disclosed herein. In some embodiments, the regRNA is an eRNA. In some embodiments, the regRNA (e.g., eRNA) comprises an SNP. In some embodiments, the regRNA (e.g., eRNA) is associated with a gene described herein (e.g., a gene associated with a disease or condition described herein).
The agent is not limited and may be any agent described herein. In some embodiments, the agent comprises a nucleic acid having a sequence complementary to a sequence of the one or more species of regRNA. In some embodiments, the agent is an RNAi agent or ASO agent as described herein.
Some aspects of the present disclosure are directed towards a method of treating, preventing or reducing the likelihood of a disease or condition in a subject comprising administering to the subject an agent that modulates an amount, effective charge, structure, or behavior of nucleic acid in a transcriptional condensate in a cell of the subject and thereby modulates transcription of a gene and treats, prevents, or reduces the likelihood of the disease or condition. The subject is not limited and may be any subject disclosed herein. In some embodiments, the subject is human. In some embodiments, the method increases transcription of the gene or a level of gene product in the subject. In some embodiments, the transcription of the gene or a level of gene product is increased by less than 2-fold. In some embodiments, the method decreases transcription of the gene or a level of gene product. In some embodiments, transcription of the gene or a level of gene product is decreased by less than 50%.
The agent is not limited and may be any suitable agent described herein. In some embodiments, the agent specifically associates with or binds to a nucleic acid associated with the condensate. In some embodiments, the agent is an RNAi agent or ASO agent. The nucleic acid associated with the condensate is not limited. In some embodiments, the nucleic acid is a regRNA (e.g., eRNA). In some embodiments, the regRNA (e.g., eRNA) is associated with a gene described herein (e.g., a gene associated with a disease or condition described herein) or the regRNA has an SNP and is associated with a disease or condition. In some embodiments, the nucleic acid is an mRNA.
Some aspects of the present disclosure are related to a method of identifying an agent that modulates a condensate, comprising providing a condensate comprising a regulatory RNA (regRNA), contacting the condensate with a test agent, and assessing whether the test agent dissolves or modulates the size of the condensate.
The condensate is not limited and may be any suitable condensate described herein. In some embodiments, the condensate is in a cell (e.g., a transgenic cell). In some embodiments, the condensate is isolated from a cell. In some embodiments, the condensate is a synthetic condensate. In some embodiments, the condensate is a transcriptional condensate or a synthetic transcriptional condensate. In some embodiments, the condensate is capable of transcribing a reporter gene. The components of the condensate may be any condensate component described herein.
The regRNA is not limited and may be any regRNA described herein. In some embodiments, the regRNA is an enhancer RNA. In some embodiments, the regRNA (e.g., eRNA) comprises an SNP associated with a disease or condition. In some embodiments, the regRNA (e.g., eRNA) is associated with a disease or condition or associated with a gene associated with a disease or condition (e.g., as described herein). The disease or condition is not limited and may be any disease or condition described herein (e.g., cancer, a disease or condition associated with haploinsufficiency, gene duplication, or eRNA variant).
Some aspects of the present disclosure are related to a method of identifying an agent that increases condensate formation, comprising providing a composition comprising a regulatory RNA (regRNA) and a condensate component under conditions wherein the concentration of the regRNA or condensate component does not form a condensate, contacting the composition with a test agent, and assessing whether contact with the test agent causes formation of a condensate.
The condensate is not limited and may be any suitable condensate described herein. In some embodiments, the condensate is in a cell (e.g., a transgenic cell). In some embodiments, the condensate is isolated from a cell. In some embodiments, the condensate is a synthetic condensate. In some embodiments, the condensate is a transcriptional condensate or a synthetic transcriptional condensate. In some embodiments, the condensate is capable of transcribing a reporter gene. The components of the condensate may be any condensate component described herein.
The regRNA is not limited and may be any regRNA described herein. In some embodiments, the regRNA is an enhancer RNA. In some embodiments, the regRNA (e.g., eRNA) comprises an SNP associated with a disease or condition. In some embodiments, the regRNA (e.g., eRNA) is associated with a disease or condition or associated with a gene associated with a disease or condition. The disease or condition is not limited and may be any disease or condition described herein (e.g., cancer, a disease or condition associated with haploinsufficiency, gene duplication, or eRNA variant).
Some aspects of the present disclosure are directed to a method of identifying an agent that modulates condensate dependent transcription of a gene, comprising providing an in vitro transcription assay with condensate dependent expression of a reporter gene, contacting the in vitro transcription assay with a test agent, and assessing expression of the reporter gene, wherein condensate dependent expression requires incorporation of a regulatory RNA (regRNA) in the condensate.
The condensate is not limited and may be any suitable condensate described herein. In some embodiments, the condensate is isolated from a cell. In some embodiments, the condensate is a synthetic condensate. The components of the condensate may be any suitable condensate component described herein.
The regRNA is not limited and may be any regRNA described herein. In some embodiments, the regRNA is an enhancer RNA. In some embodiments, the regRNA (e.g., eRNA) comprises an SNP associated with a disease or condition. In some embodiments, the regRNA (e.g., eRNA) is associated with a disease or condition or associated with a gene associated with a disease or condition. The disease or condition is not limited and may be any disease or condition described herein (e.g., cancer, a disease or condition associated with haploinsufficiency, gene duplication, or eRNA variant).
Some aspects of the present disclosure are related to a method of identifying an agent that modulates condensate dependent transcription of a gene, comprising providing a cell with condensate dependent expression of a heterologous reporter gene, contacting the cell with a test agent, and assessing expression of the heterologous reporter gene, wherein condensate dependent expression requires incorporation of a regulatory RNA (regRNA) in the condensate. The cell is not limited and may be any cell described herein. In some embodiments, the cell is a cancer cell or a diseased cell (e.g., a cell exhibiting a disease described herein).
The regRNA is not limited and may be any regRNA described herein. In some embodiments, the regRNA is an enhancer RNA. In some embodiments, the regRNA (e.g., eRNA) comprises an SNP associated with a disease or condition. In some embodiments, the regRNA (e.g., eRNA) is associated with a disease or condition or associated with a gene associated with a disease or condition. The disease or condition is not limited and may be any disease or condition described herein (e.g., cancer, a disease or condition associated with haploinsufficiency, gene duplication, or eRNA variant).
In some embodiments of the screening methods disclosed herein, the condensate has a detectable tag. The detectable tag can be used to determine if contact with the test agent modulates formation, stability, or morphology of the condensate. In some embodiments, a cell is genetically engineered to express the detectable tag. In some embodiments, the detectable tag is incorporated into the condensate (e.g., a component of the condensate, the regRNA). The term “detectable tag” or “detectable label” as used herein includes, but is not limited to, detectable labels, such as fluorophores, radioisotopes, colorimetric substrates, or enzymes; heterologous epitopes for which specific antibodies are commercially available, e.g., FLAG-tag; heterologous amino acid sequences that are ligands for commercially available binding proteins, e.g., Strep-tag, biotin; fluorescence quenchers typically used in conjunction with a fluorescent tag on the other polypeptide; nucleic acid intercalating agents, and complementary bioluminescent or fluorescent polypeptide fragments. A tag that is a detectable label or a complementary bioluminescent or fluorescent polypeptide fragment may be measured directly (e.g., by measuring fluorescence or radioactivity of, or incubating with an appropriate substrate or enzyme to produce a spectrophotometrically detectable color change for the associated polypeptides as compared to the unassociated polypeptides). A tag that is a heterologous epitope or ligand is typically detected with a second component that binds thereto, e.g., an antibody or binding protein, wherein the second component is associated with a detectable label.
In some specific embodiments of the screening methods disclosed herein, the condensate comprises a mediator component or fragment thereof comprising an IDR. In some embodiments, the mediator component or fragment thereof comprising an IDR further comprises a label.
In some embodiments of the screening methods disclosed herein, “assessing” comprises measuring a physical property as compared to a control or reference. For example, assessing if the condensate is dissolved or the stability of a condensate is modulated may comprise measuring the period of time a condensate exists as compared to a control condensate not subject to a test condition or agent. Assessing if the shape or size of a condensate is modulated can comprise comparing the shape or size of a condensate to that of a control condensate not subject to a test condition or agent. In some embodiments, one or more properties of a condensate may be “assessed” to be modulated if they are changed by a statistically significant amount (e.g., at least 10%, at least 20%, at least 30%, at least 50%, at least 75%, or more).
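As a minimal sketch of the comparison described above, a measured condensate property (for example, lifetime or size) can be expressed as a percent change relative to a control and compared against one of the example thresholds. The function names and the sample measurements below are hypothetical and are provided only for illustration; the sketch applies the percentage threshold alone and does not perform a statistical significance test, which would be a separate step.

```python
from statistics import mean
from typing import Sequence


def percent_change(test: Sequence[float], control: Sequence[float]) -> float:
    """Percent change of the mean test measurement relative to the mean control."""
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    return 100.0 * (mean(test) - control_mean) / control_mean


def is_modulated(
    test: Sequence[float],
    control: Sequence[float],
    threshold_pct: float = 10.0,  # one of the example thresholds named in the text
) -> bool:
    """True if the property changed by at least threshold_pct in either direction."""
    return abs(percent_change(test, control)) >= threshold_pct


if __name__ == "__main__":
    # Hypothetical condensate lifetimes in seconds (made-up values).
    control_lifetimes = [100.0, 110.0, 90.0]
    treated_lifetimes = [60.0, 70.0, 80.0]
    change = percent_change(treated_lifetimes, control_lifetimes)
    print(f"change: {change:.1f}%; modulated: {is_modulated(treated_lifetimes, control_lifetimes)}")
```

Using the absolute value of the change treats both increases (e.g., stabilization) and decreases (e.g., dissolution) as modulation; a directional screen could instead test the sign of `percent_change`.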
In some embodiments of the screening methods disclosed herein, the step of determining if contact with the test agent modulates size, dissolution, formation, stability, or morphology of the condensate is performed using microscopy, which is not limited. In some embodiments, the microscopy is deconvolution microscopy, structured illumination microscopy, or interference microscopy. In some embodiments, the step of determining if contact with the test agent modulates formation, stability, or morphology of the condensate is performed using DNA-FISH, RNA-FISH, or a combination thereof.
In some embodiments of the screening methods disclosed herein, the cell or condensate does not express a reporter gene prior to contact with a test agent and expresses a reporter gene after contact with an agent that enhances condensate formation, stability, size, function, or morphology. In some embodiments, the cell does express a reporter gene prior to contact with a test agent and stops or reduces expression of the reporter gene after contact with an agent that dissolves condensates, reduces condensate stability, or prevents/suppresses condensate formation.
In some embodiments of the screening methods disclosed herein, a high throughput screen (HTS) is performed. A high throughput screen can utilize cell-free or cell-based assays (e.g., a condensate containing cell as described herein, an in vitro condensate, an isolated in vitro condensate). High throughput screens often involve testing large numbers of compounds with high efficiency, e.g., in parallel. For example, tens or hundreds of thousands of compounds can be routinely screened in short periods of time, e.g., hours to days. Often such screening is performed in multiwell plates containing at least 96 wells or other vessels in which multiple physically separated cavities or depressions are present in a substrate. High throughput screens often involve use of automation, e.g., for liquid handling, imaging, data acquisition and processing, etc. Certain general principles and techniques that may be applied in embodiments of a HTS of the present invention are described in Macarrón R & Hertzberg R P. Design and implementation of high-throughput screening assays. Methods Mol Biol., 565:1-32, 2009 and/or An W F & Tolliday N J., Introduction: cell-based assays for high-throughput screening. Methods Mol Biol. 486:1-12, 2009, and/or references in either of these. Useful methods are also disclosed in High Throughput Screening: Methods and Protocols (Methods in Molecular Biology) by William P. Janzen (2002) and High-Throughput Screening in Drug Discovery (Methods and Principles in Medicinal Chemistry) (2006) by Jorg Hüser.
The term “hit” generally refers to an agent that achieves an effect of interest in a screen or assay, e.g., an agent that has at least a predetermined level of modulating effect on cell survival, cell proliferation, gene expression, protein activity, or other parameter of interest being measured in the screen or assay. Test agents that are identified as hits in a screen may be selected for further testing, development, or modification. In some embodiments a test agent is retested using the same assay or different assays. For example, a candidate anticancer agent may be tested against multiple different cancer cell lines or in an in vivo tumor model to determine its effect on cancer cell survival or proliferation, tumor growth, etc. Additional amounts of the test agent may be synthesized or otherwise obtained, if desired. Physical testing or computational approaches can be used to determine or predict one or more physicochemical, pharmacokinetic and/or pharmacodynamic properties of compounds identified in a screen. For example, solubility, absorption, distribution, metabolism, and excretion (ADME) parameters can be experimentally determined or predicted. Such information can be used, e.g., to select hits for further testing, development, or modification. For example, small molecules having characteristics typical of “drug-like” molecules can be selected and/or small molecules having one or more unfavorable characteristics can be avoided or modified to reduce or eliminate such unfavorable characteristic(s).
In some embodiments structures of hit compounds are examined to identify a pharmacophore, which can be used to design additional compounds. An additional compound may, for example, have one or more altered, e.g., improved, physicochemical, pharmacokinetic (e.g., absorption, distribution, metabolism and/or excretion) and/or pharmacodynamic properties as compared with an initial hit or may have approximately the same properties but a different structure. An improved property is generally a property that renders a compound more readily usable or more useful for one or more intended uses. Improvement can be accomplished through empirical modification of the hit structure (e.g., synthesizing compounds with related structures and testing them in cell-free or cell-based assays or in non-human animals) and/or using computational approaches. Such modification can make use of established principles of medicinal chemistry to predictably alter one or more properties. In some embodiments a molecular target of a hit compound is identified or known. In some embodiments, additional compounds that act on the same molecular target may be identified empirically (e.g., through screening a compound library) or designed.
Data or results from testing an agent or performing a screen may be stored or electronically transmitted. Such information may be stored on a tangible medium, which may be a computer-readable medium, paper, etc. In some embodiments a method of identifying or testing an agent comprises storing and/or electronically transmitting information indicating that a test agent has one or more propert(ies) of interest or indicating that a test agent is a “hit” in a particular screen, or indicating the particular result achieved using a test agent. A list of hits from a screen may be generated and stored or transmitted. Hits may be ranked or divided into two or more groups based on activity, structural similarity, or other characteristics.
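For illustration, the ranking, grouping, and storage of hits described above might be sketched in Python as follows. The `ScreenResult` record, the function names, the threshold values, and the sample data are all hypothetical and were chosen only for this example; grouping here is by activity alone, although structural similarity or other characteristics could be used instead.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class ScreenResult:
    agent_id: str
    activity: float  # hypothetical: percent change in a measured parameter


def rank_hits(results: List[ScreenResult], hit_threshold: float) -> List[ScreenResult]:
    """Keep results meeting the predetermined effect level, strongest first."""
    hits = [r for r in results if abs(r.activity) >= hit_threshold]
    return sorted(hits, key=lambda r: abs(r.activity), reverse=True)


def group_hits(hits: List[ScreenResult], strong_threshold: float) -> Dict[str, List[ScreenResult]]:
    """Divide hits into two groups based on activity."""
    groups: Dict[str, List[ScreenResult]] = {"strong": [], "moderate": []}
    for hit in hits:
        key = "strong" if abs(hit.activity) >= strong_threshold else "moderate"
        groups[key].append(hit)
    return groups


def to_json(hits: List[ScreenResult]) -> str:
    """Serialize the hit list so it can be stored or electronically transmitted."""
    return json.dumps([asdict(h) for h in hits], indent=2)


if __name__ == "__main__":
    # Made-up screen results for demonstration only.
    results = [
        ScreenResult("agent-A", -55.0),
        ScreenResult("agent-B", 12.0),
        ScreenResult("agent-C", 30.0),
        ScreenResult("agent-D", 4.0),
    ]
    ranked = rank_hits(results, hit_threshold=10.0)
    print(to_json(ranked))
    print({name: [h.agent_id for h in members]
           for name, members in group_hits(ranked, strong_threshold=50.0).items()})
```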
Once a candidate agent is identified, additional agents, e.g., analogs, may be generated based on it. An additional agent may, for example, have increased cancer cell uptake, increased potency, increased stability, greater solubility, or any improved property. In some embodiments a labeled form of the agent is generated. The labeled agent may be used, e.g., to directly measure binding of an agent to a molecular target in a cell. In some embodiments, a molecular target of an agent identified as described herein may be identified. An agent may be used as an affinity reagent to isolate a molecular target. An assay to identify the molecular target, e.g., using methods such as mass spectrometry, may be performed. Once a molecular target is identified, one or more additional screens may be performed to identify agents that act specifically on that target.
The test agent for the screening methods disclosed herein is not limited and may be any type of agent described herein (e.g., a protein, nucleic acid, small molecule, etc.). In some embodiments of the screening methods disclosed herein, the test agent is designed to specifically bind the regRNA (e.g., an RNAi agent or ASO agent).
Agents can be obtained from natural sources or produced synthetically. Agents may be at least partially pure or may be present in extracts or other types of mixtures. Extracts or fractions thereof can be produced from, e.g., plants, animals, microorganisms, marine organisms, fermentation broths (e.g., soil, bacterial or fungal fermentation broths), etc. In some embodiments, a compound collection (“library”) is tested. A compound library may comprise natural products and/or compounds generated using non-directed or directed synthetic organic chemistry. In some embodiments a library is a small molecule library, peptide library, peptoid library, cDNA library, oligonucleotide library, or display library (e.g., a phage display library). In some embodiments a library comprises agents of two or more of the foregoing types. In some embodiments oligonucleotides in an oligonucleotide library comprise siRNAs, shRNAs, antisense oligonucleotides, aptamers, or random oligonucleotides.
A library may comprise, e.g., between 100 and 500,000 compounds, or more. In some embodiments a library comprises at least 10,000, at least 50,000, at least 100,000, or at least 250,000 compounds. In some embodiments compounds of a compound library are arrayed in multiwell plates. They may be dissolved in a solvent (e.g., DMSO) or provided in dry form, e.g., as a powder or solid. Collections of synthetic, semi-synthetic, and/or naturally occurring compounds may be tested. Compound libraries can comprise structurally related, structurally diverse, or structurally unrelated compounds. Compounds may be artificial (having a structure invented by man and not found in nature) or naturally occurring. In some embodiments compounds that have been identified as “hits” or “leads” in a drug discovery program and/or analogs thereof are tested. In some embodiments a library may be focused (e.g., composed primarily of compounds having the same core structure, derived from the same precursor, or having at least one biochemical activity in common). Compound libraries are available from a number of commercial vendors such as Tocris BioScience, Nanosyn, BioFocus, and from government entities such as the U.S. National Institutes of Health (NIH). In some embodiments a test agent is not an agent that is found in a cell culture medium known or used in the art, e.g., for culturing vertebrate, e.g., mammalian cells, e.g., an agent provided for purposes of culturing the cells. In some embodiments, if the agent is one that is found in a cell culture medium known or used in the art, the agent may be used at a different, e.g., higher, concentration when used as a test agent in a method or composition described herein.
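As a rough planning aid only, the number of multiwell plates needed to array a compound library of a given size can be estimated as sketched below. The function name `plates_needed` and the reservation of a number of control wells per plate are assumptions introduced for this example and are not recited in the disclosure; the library sizes and the 96-well format are among those mentioned herein.

```python
def plates_needed(n_compounds: int, wells_per_plate: int = 96, control_wells: int = 0) -> int:
    """Plates required when each compound occupies one well.

    control_wells is a hypothetical number of wells per plate reserved for controls.
    """
    usable = wells_per_plate - control_wells
    if n_compounds < 0:
        raise ValueError("n_compounds must be non-negative")
    if usable <= 0:
        raise ValueError("no usable wells remain after reserving controls")
    # Ceiling division using integers only.
    return (n_compounds + usable - 1) // usable


if __name__ == "__main__":
    for size in (100, 10_000, 500_000):
        print(size, "compounds:", plates_needed(size, wells_per_plate=96, control_wells=16), "plates")
```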
The description of embodiments of the disclosure is not intended to be exhaustive or to limit the disclosure to the precise form disclosed. While specific embodiments of, and examples for, the disclosure are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the disclosure, as those skilled in the relevant art will recognize. For example, while method steps or functions are presented in a given order, alternative embodiments may perform functions in a different order, or functions may be performed substantially concurrently. The teachings of the disclosure provided herein can be applied to other procedures or methods as appropriate. The various embodiments described herein can be combined to provide further embodiments. Aspects of the disclosure can be modified, if necessary, to employ the compositions, functions and concepts of the above references and application to provide yet further embodiments of the disclosure. These and other changes can be made to the disclosure in light of the detailed description.
Specific elements of any of the foregoing embodiments can be combined or substituted for elements in other embodiments. Furthermore, while advantages associated with certain embodiments of the disclosure have been described in the context of these embodiments, other embodiments may also exhibit such advantages, and not all embodiments need necessarily exhibit such advantages to fall within the scope of the disclosure.
All patents and other publications identified are expressly incorporated herein by reference for the purpose of describing and disclosing, for example, the methodologies described in such publications that might be used in connection with the present invention. These publications are provided solely for their disclosure prior to the filing date of the present application. Nothing in this regard should be construed as an admission that the inventors are not entitled to antedate such disclosure by virtue of prior invention or prior publication, or for any other reason. All statements as to the date or representations as to the contents of these documents are based on the information available to the applicants and do not constitute any admission as to the correctness of the dates or contents of these documents.
One skilled in the art readily appreciates that the present invention is well adapted to carry out the objects and obtain the ends and advantages mentioned, as well as those inherent therein. The details of the description and the examples herein are representative of certain embodiments, are exemplary, and are not intended as limitations on the scope of the invention. Modifications therein and other uses will occur to those skilled in the art. These modifications are encompassed within the spirit of the invention. It will be readily apparent to a person skilled in the art that varying substitutions and modifications may be made to the invention disclosed herein without departing from the scope and spirit of the invention.
The articles “a” and “an” as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to include the plural referents. Claims or descriptions that include “or” between one or more members of a group are considered satisfied if one, more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process unless indicated to the contrary or otherwise evident from the context. The invention includes embodiments in which exactly one member of the group is present in, employed in, or otherwise relevant to a given product or process. The invention also includes embodiments in which more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process. Furthermore, it is to be understood that the invention provides all variations, combinations, and permutations in which one or more limitations, elements, clauses, descriptive terms, etc., from one or more of the listed claims is introduced into another claim dependent on the same base claim (or, as relevant, any other claim) unless otherwise indicated or unless it would be evident to one of ordinary skill in the art that a contradiction or inconsistency would arise. It is contemplated that all embodiments described herein are applicable to all different aspects of the invention where appropriate. It is also contemplated that any of the embodiments or aspects can be freely combined with one or more other such embodiments or aspects whenever appropriate. Where elements are presented as lists, e.g., in Markush group or similar format, it is to be understood that each subgroup of the elements is also disclosed, and any element(s) can be removed from the group. 
It should be understood that, in general, where the invention, or aspects of the invention, is/are referred to as comprising particular elements, features, etc., certain embodiments of the invention or aspects of the invention consist, or consist essentially of, such elements, features, etc. For purposes of simplicity those embodiments have not in every case been specifically set forth in so many words herein. It should also be understood that any embodiment or aspect of the invention can be explicitly excluded from the claims, regardless of whether the specific exclusion is recited in the specification. For example, any one or more active agents, additives, ingredients, optional agents, types of organism, disorders, subjects, or combinations thereof, can be excluded.
Where the claims or description relate to a composition of matter, it is to be understood that methods of making or using the composition of matter according to any of the methods disclosed herein, and methods of using the composition of matter for any of the purposes disclosed herein are aspects of the invention, unless otherwise indicated or unless it would be evident to one of ordinary skill in the art that a contradiction or inconsistency would arise. Where the claims or description relate to a method, e.g., it is to be understood that methods of making compositions useful for performing the method, and products produced according to the method, are aspects of the invention, unless otherwise indicated or unless it would be evident to one of ordinary skill in the art that a contradiction or inconsistency would arise.
Where ranges are given herein, the invention includes embodiments in which the endpoints are included, embodiments in which both endpoints are excluded, and embodiments in which one endpoint is included and the other is excluded. It should be assumed that both endpoints are included unless indicated otherwise. Furthermore, it is to be understood that unless otherwise indicated or otherwise evident from the context and understanding of one of ordinary skill in the art, values that are expressed as ranges can assume any specific value or subrange within the stated ranges in different embodiments of the invention, to the tenth of the unit of the lower limit of the range, unless the context clearly dictates otherwise. It is also understood that where a series of numerical values is stated herein, the invention includes embodiments that relate analogously to any intervening value or range defined by any two values in the series, and that the lowest value may be taken as a minimum and the greatest value may be taken as a maximum. Numerical values, as used herein, include values expressed as percentages. For any embodiment of the invention in which a numerical value is prefaced by “about” or “approximately”, the invention includes an embodiment in which the exact value is recited. For any embodiment of the invention in which a numerical value is not prefaced by “about” or “approximately”, the invention includes an embodiment in which the value is prefaced by “about” or “approximately”.
“Approximately” or “about” generally includes numbers that fall within a range of 1% or in some embodiments within a range of 5% of a number or in some embodiments within a range of 10% of a number in either direction (greater than or less than the number) unless otherwise stated or otherwise evident from the context (except where such number would impermissibly exceed 100% of a possible value). It should be understood that, unless clearly indicated to the contrary, in any methods claimed herein that include more than one act, the order of the acts of the method is not necessarily limited to the order in which the acts of the method are recited, but the invention includes embodiments in which the order is so limited. It should also be understood that unless otherwise indicated or evident from the context, any product or composition described herein may be considered “isolated.”
Recent studies have shown that transcriptional condensates can compartmentalize and concentrate large numbers of transcription factors, cofactors and Pol II at super-enhancers, which are clusters of enhancers that regulate genes with prominent roles in cell identity (Boija et al., 2018; Cho et al., 2018; Cramer, 2019; Hnisz et al., 2017; Sabari et al., 2018). The component enhancer elements of such genes promote transcriptional condensate formation by crowding transcription factors and Mediator at densities above sharply defined thresholds for condensate formation (Shrinivas et al., 2019). Transcriptional condensates are highly dynamic and can be observed in live cells to form and dissolve at timescales ranging from seconds to minutes (Cho et al., 2018).
RNA molecules are components of, and play regulatory roles in, diverse biomolecular condensates. These include the nucleolus, nuclear speckles, paraspeckles, and stress granules (Fay and Anderson, 2018; Roden and Gladfelter, 2020; Sabari et al., 2020; Strom and Brangwynne, 2019). RNA has a high negative charge density due to its phosphate backbone, and the effective charge of a given RNA molecule is directly proportional to its length (Boeynaems et al., 2019). Condensates are thought to be formed by an ensemble of low-affinity molecular interactions, including electrostatic interactions, and RNA can be a powerful regulator of condensates that are formed and maintained by electrostatic forces (Banani et al., 2017; Maharana et al., 2018; Peran and Mittag, 2020; Shin and Brangwynne, 2017). Indeed, RNA has been shown to enter and modify the properties of simple condensates formed by polyelectrolyte-rich molecules (Drobot et al., 2018; Frankel et al., 2016; Mountain and Keating, 2020). In a phenomenon called complex coacervation, a type of liquid-liquid phase separation mediated by electrostatic interactions between oppositely charged polyelectrolytes, low levels of RNA can enhance condensate formation whereas high levels can cause their dissolution (Lin et al., 2019; Overbeek and Voorn, 1957; Sing, 2017; Srivastava and Tirrell, 2016). Condensate formation and subsequent dissolution with increasing RNA concentration is an example of reentrant phase behavior, which is driven by favorable opposite-charge interactions at low RNA concentrations (formation) and repulsive like-charge interactions at high RNA concentrations (dissolution) (Banerjee et al., 2017; Milin and Deniz, 2018). The inventors investigated whether such a reentrant equilibrium phase behavior coupled to the non-equilibrium processes that occur during transcription could regulate transcriptional output.
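The charge argument above can be illustrated with a small calculation. The following is a minimal sketch, not the inventors' model: it assumes one negative charge per nucleotide for RNA, and the function names, the protein net charge, and the copy numbers are hypothetical values chosen only for illustration.

```python
def rna_charge(length_nt: int, copies: float = 1.0) -> float:
    """Approximate total charge (units of e) carried by `copies` RNA molecules.

    Assumption: one negative charge per nucleotide (phosphate backbone),
    so the charge scales linearly with RNA length.
    """
    return -1.0 * length_nt * copies


def charge_ratio(rna_length_nt: int, rna_copies: float,
                 protein_net_charge: float, protein_copies: float) -> float:
    """Ratio of total RNA negative charge to total protein positive charge.

    A ratio near 1 corresponds to charge balance; a ratio well above 1
    corresponds to an excess of negative charge.
    """
    positive = protein_net_charge * protein_copies
    if positive <= 0:
        raise ValueError("protein pool must carry a net positive charge")
    return abs(rna_charge(rna_length_nt, rna_copies)) / positive


if __name__ == "__main__":
    # Hypothetical numbers: 1000 protein molecules with net charge +20 each.
    few_short = charge_ratio(100, 50, 20, 1000)    # few short RNAs
    many_long = charge_ratio(5000, 20, 20, 1000)   # many long RNAs
    print(f"few short RNAs: ratio = {few_short:.2f}")
    print(f"many long RNAs: ratio = {many_long:.2f}")
```

In this toy setting, a small number of short RNAs leaves the mixture below charge balance, whereas longer and more abundant RNAs push it well past balance, which is the regime in which like-charge repulsion is expected to favor dissolution.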
By combining physics-based modeling and experimental analysis, the inventors have proposed and tested a model whereby the products of transcription initiation stimulate condensate formation and those of a burst of elongation stimulate condensate dissolution. The inventors provide experimental evidence that physiological RNA levels can enhance or dissolve transcriptional condensates. These results provide a mechanism by which the products of transcription regulate condensate behaviors and thus transcription, and suggest that this non-equilibrium process provides negative feedback to dissolve the transcriptional condensates that support initiation and thereby arrest transcription.
To explore the potential role of RNA in regulating transcriptional condensates, the inventors sought to estimate the number and effective charge of RNA and protein molecules in a typical transcriptional condensate at different stages of transcription. In early stages of transcription, low levels of small noncoding RNAs are produced by Pol II at enhancers and promoter-proximal regions (
As an initial test of whether low levels of RNA stimulate transcriptional condensate formation while high levels of RNA favor condensate dissolution, the inventors used an in vitro droplet assay (
The inventors next sought to quantify how diverse RNAs regulate the reentrant phase behavior of transcriptional proteins. The inventors performed in vitro droplet assays (
The inventors sought to further test whether the RNA-mediated effects on MED1-IDR condensates fit a charge balance model (STAR Methods). MED1-IDR/RNA condensate formation should be enhanced when the protein and RNA polymers are balanced in charge, and they should be sensitive to disruption of this balance. To test this model, the inventors quantified the relative charge of RNA and MED1-IDR and computed the correlation with the partition ratio of MED1-IDR (STAR Methods). As expected, RNA-mediated effects on MED1-IDR condensates fit a charge balance model (
The inventors sought to investigate the functional consequence of the RNA-mediated reentrant phase behavior on transcription. Pol II-dependent transcription can be reconstituted in vitro with purified components (Roeder, 2019), so the inventors investigated whether droplets containing transcriptional components are formed in these assays and if conditions that alter droplet levels similarly alter transcriptional output. The inventors used a classical reconstituted mammalian transcription system with purified components, including Pol II, general transcription factors, Mediator and a transcriptional activator (Gal4), where addition of nucleotides permits transcription of a linear DNA template (
The inventors reasoned that if transcription and droplet formation are mutually dependent in the reconstituted system, then treatments that alter transcription should similarly impact condensate formation and vice versa. The addition to the reaction of various chemicals that are known to inhibit transcription (elevated concentrations of NTPs, NaCl, or heparin) (Carey et al., 2009; Reinberg and Roeder, 1987), caused reductions in droplet area, DNA partitioning and transcription (
An expectation of the RNA-feedback model is that droplets in the reconstituted system might ultimately produce enough RNA to cause a reduction in droplet size and transcriptional output. However, the low concentrations of RNA produced in these systems (3.5±0.5 pM, STAR Methods) are insufficient to dissolve the droplets. For this reason, the inventors tested whether purified RNA, added to the reaction, would similarly impact droplets and transcription. Indeed, addition of exogenous RNA reduced the number and size of the droplets (
The in vitro experiments, which provide evidence that key transcriptional proteins and RNA exhibit an electrostatics-driven, RNA-protein ratio dependent, reentrant phase transition, were performed under equilibrium conditions (
The inventors first developed a free-energy function to recapitulate the experimentally observed reentrant phase behavior of RNA-protein mixtures (
The inventors next developed a mathematical framework to study the temporal evolution of transcriptional condensates as transcription ensues. Most transcriptional proteins turn-over with a half-life of several hours (Cambridge et al., 2011; Chen et al., 2016), which is longer than time-scales of transcription-associated events, which range from seconds to minutes (Chen and Larson, 2016; Fukaya et al., 2016; Rodriguez and Larson, 2020). Hence, the overall amount of protein is conserved in the timescales of interest. Thus, the dynamics of the protein concentration (ϕp) are represented by standard Model B dynamics (
The inventors first sought to determine whether this model is consistent with previous studies (Cho et al., 2016, 2018). These studies have shown that transcriptional condensates at different genomic loci recruit a varying number of transcriptional proteins, which in turn, correlates with condensate lifetimes. To explore this phenomenon, the inventors numerically simulated Eq. 2 (
The inventors next investigated how the sizes and lifetimes of transcriptional condensates change as a function of the effective rate of RNA synthesis (kp), while keeping all other parameters fixed. In these simulations, the size of condensates initially increases and subsequently decreases with increasing effective rates of RNA synthesis (
The inventors then investigated the extent to which non-equilibrium effects underlying transcription regulate transcriptional condensate dynamics. RNA synthesis, degradation, and diffusion influence the spatial distribution of RNA, which in turn, may feedback on transcriptional condensates. To explore this, the inventors varied the diffusivity of RNA and the effective rates of RNA synthesis and degradation, while holding the ratio of synthesis and degradation rates constant. The latter constraint ensures that the overall RNA concentration is constant in the condensate as other parameters are varied, thus any effect on condensate dynamics arises from purely non-equilibrium effects. Varying the parameters that control RNA synthesis/degradation rates and diffusion changes the relative time-scales of these processes (tr and td, respectively) (STAR Methods), which in turn, influences the spatial distribution of RNA in the condensate. If diffusion is slower than synthesis/degradation (tr<td), then RNA will accumulate near transcription sites, leading to a higher local RNA concentration in the condensate. Conversely, if diffusion is faster than synthesis/degradation (tr>td), then RNA will diffuse away from transcription sites, leading to a lower uniform RNA concentration in the condensate. The spatial distribution of RNA will impact condensates according to local charge balance. To study how varying spatial distributions of RNA affect transcriptional condensates, the inventors simulated conditions where the overall RNA concentration was fixed close to the charge-balance condition, thus promoting condensate formation at equilibrium. In these simulations, condensates that are stable when synthesis/degradation is slower than diffusion (tr>td) dissolve when RNA synthesis/degradation is faster than diffusion (tr<td) (
The inventors sought to synthesize their results so far to explore the effect of non-equilibrium dynamics on regulating transcriptional condensates across transcription initiation and productive elongation. Simulations were started at a relatively low effective rate of RNA synthesis, mimicking initiation, followed by an increase to a relatively high effective rate of RNA synthesis, mimicking productive elongation. The simulations predict that low effective rates of RNA synthesis enhance condensate formation, and these condensates subsequently dissolve upon ensuing higher effective rates of RNA synthesis (
Transcriptional condensates in cells are highly dynamic, forming and dissolving at timescales ranging from seconds to minutes (Cho et al., 2018). The inventors previously showed that condensate formation is associated with transcription activation and initiation (Cho et al., 2018). Once transcriptional condensates are formed, the RNA-mediated condensate dissolution model predicts that inhibition of elongation should increase the size and lifetime of transcriptional condensates (
To experimentally test these predictions from the simulations, mESCs engineered with an endogenous, GFP-tagged subunit of Mediator (Med1-GFP) (Sabari et al., 2018) were treated for 30 minutes with Actinomycin-D or DRB (
The RNA-mediated feedback model suggests that modifying the concentration or size of RNA molecules should have a predictable effect on transcriptional output. The inventors developed complementary experimental and simulation approaches (
To test this prediction, the inventors investigated the effect of artificially increasing the levels of feedback RNAs on the transcription of an adjacent luciferase reporter gene in cells (
The results described here indicate that transcription is a non-equilibrium process that provides dynamic feedback through its RNA product. The results support a model whereby RNA provides both positive and negative feedback on transcription via the regulation of electrostatic interactions in transcriptional condensates. Transcriptional condensates, whose production involves crowding of transcription factors by enhancer DNA (Shrinivas et al., 2019) and electrostatic and other interactions between the IDRs of transcription factors and coactivators (Boija et al., 2018; Sabari et al., 2018), engage RNA to both promote and dissolve the condensates. In this RNA feedback model, low levels of short RNAs produced during transcription initiation promote formation of transcriptional condensates, while high levels of the longer RNAs produced during elongation can cause condensate dissolution (
RNA-mediated feedback regulation of transcription is the result of coupling between the non-equilibrium processes of RNA synthesis, degradation, and diffusion with an underlying equilibrium phase behavior of RNA-transcriptional protein mixtures that exhibit a reentrant phase transition. Such phase transitions have been observed in prior studies of complex coacervation in mixtures of oppositely charged polyelectrolyte solutions, including RNA-polyelectrolyte mixtures (Lin et al., 2019; Overbeek and Voorn, 1957; Sing, 2017; Srivastava and Tirrell, 2016). The inventors provided several lines of evidence to show that mixtures of RNA and transcriptional molecules undergo a reentrant phase transition at equilibrium. In droplet formation assays, low levels of RNA that occur at gene regulatory regions can stimulate condensate formation by the Mediator coactivator whereas high levels suppress condensates (
Using a physics-based model, the inventors then studied how non-equilibrium processes of RNA synthesis, degradation, and diffusion are linked to equilibrium reentrant phase behavior to regulate the size and dynamics of transcriptional condensates in vivo. In agreement with predictions from the model, the dependence of these quantities and transcriptional output on RNA synthesis rates and lengths were then positively tested in cell-free and in cellular systems (
An RNA-mediated feedback model for transcriptional regulation provides a potential explanation for the roles of enhancer and promoter-associated RNAs, which are evolutionarily conserved features of eukaryotes. These low-abundance short RNAs, transcribed bidirectionally from enhancers and promoters, have been reported to affect transcription from their associated genes through diverse postulated mechanisms (Andersson et al., 2014; Catarino and Stark, 2018; Core et al., 2014; Gardini and Shiekhattar, 2015; Henriques et al., 2018; Lai et al., 2013; Li et al., 2016; Mikhaylichenko et al., 2018; Nair et al., 2019; Pefanis et al., 2015; Rahnamoun et al., 2018; Schaukowitch et al., 2014; Scruggs et al., 2015; Sigova et al., 2015; Smith et al., 2019). The diversity of sequences present in these short RNA species has made it difficult to postulate a common molecular mechanism for their effects on transcription. In this context, a model for RNA-mediated feedback regulation of condensates is attractive for several reasons. RNA molecules are known components of other biomolecular condensates, including the nucleolus, nuclear speckles, paraspeckles and stress granules, where they are known to play regulatory roles (Fay and Anderson, 2018; Roden and Gladfelter, 2020). RNA is a powerful regulator of condensates that are formed by electrostatic forces because it has a high negative charge density due to its phosphate backbone (Drobot et al., 2018; Frankel et al., 2016), thus explaining why the effects of diverse RNAs on transcriptional condensates are sequence-independent.
Recent studies indicate that transcription occurs in periodic bursts (˜1-10 minutes in duration), where multiple molecules of Pol II can be released from promoters within a short timeframe and produce multiple molecules of mRNA (˜1-100 molecules per burst) (Cisse et al., 2013; Fukaya et al., 2016; Larsson et al., 2019). Multiple models explain such periodic bursts through stochastic gene activation events (Chen and Larson, 2016; Larsson et al., 2019; Raj et al., 2006; Rodriguez and Larson, 2020; Suter et al., 2011; Tunnacliffe and Chubb, 2020) but are often agnostic to the underlying mechanism or attribute these to rate-limiting transcription factor binding events. The inventors suggest that a rapid and spatially-localized change in charge balance, due to increased RNA synthesis at pause release of active Pol II, may contribute to dissolution of transcriptional condensates and thus dynamic loss of the pool of transcriptional apparatus in those condensates. This would provide a means to provide negative feedback to arrest transcription and a mechanism that may contribute to the dynamic bursty behavior observed for transcription.
The code generated during this study is available on the World Wide Web at github.com/krishna-shrinivas/2020_Henninger_Oksuz_Shrinivas_RNA_feedback.
The Jaenisch laboratory gifted the V6.5 mouse ES cells. ES cells were maintained at 37° C. with 5% CO2 in a humidified incubator on 0.2% gelatinized (Sigma, G1890) tissue-culture plates in 2i medium with LIF, which was made according to the following recipe: 960 mL DMEM/F12 (Life Technologies, 11320082), 5 mL N2 supplement (Life Technologies, 17502048; stock 100×), 10 mL B27 supplement (Life Technologies, 17504044; stock 50×), 5 mL additional L-glutamine (Gibco 25030-081; stock 200 mM), 10 mL MEM nonessential amino acids (Gibco 11140076; stock 100×), 10 mL penicillin-streptomycin (Life Technologies, 15140163; stock 10^4 U/mL), 333 μL BSA fraction V (Gibco 15260037; stock 7.50%), 7 μL β-mercaptoethanol (Sigma M6250; stock 14.3 M), 100 μL LIF (Chemico, ESG1107; stock 10^7 U/mL), 100 μL PD0325901 (Stemgent, 04-0006-10; stock 10 mM), and 300 μL CHIR99021 (Stemgent, 04-0004-10; stock 10 mM). For confocal and PALM imaging, cells were grown on glass coverslips (Carolina Biological Supply, 633029) that had been coated with the following: 5 μg/mL of poly-L-ornithine (Sigma P4957) at 37° C. for at least 30 minutes followed by 5 μg/mL of laminin (Corning, 354232) at 37° C. for at least 2 hours. Cells were passaged by washing once with 1×PBS (Life Technologies, AM9625) and incubating with TrypLE (Life Technologies, 12604021) for 3-5 minutes, then quenched with serum-containing media made by the following recipe: 500 mL DMEM KO (Gibco 10829-018), MEM nonessential amino acids (Gibco 11140076; stock 100×), penicillin-streptomycin (Life Technologies, 15140163; stock 10^4 U/mL), 5 mL L-glutamine (Gibco 25030-081; stock 100×), 4 μL β-mercaptoethanol (Sigma M6250; stock 14.3 M), 50 μL LIF (Chemico, ESG1107; stock 10^7 U/mL), and 75 mL of fetal bovine serum (Sigma, F4135). Cells were passaged every 2 days.
ChIP-seq browser tracks for MED1, Pol II, BRD4, and OCT4 were generated as described (Sabari et al., 2018; Whyte et al., 2013). Briefly, reads were aligned to NCBI37/mm9 using Bowtie with the following settings: “-p 4 --best -k 1 -m 1 --sam -l 40”. WIG files represent counts (in reads per million, floored at 0.1) of aligned reads within 50 bp bins. Each read was extended by 200 nt in the direction of the alignment.
(Source: www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE112808)
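The read extension and binning described above can be sketched as follows. This is an illustrative stand-in, not the pipeline used to generate the tracks: it assumes 0-based, half-open coordinates, counts a read in every bin that its extended interval overlaps, and does not apply the 0.1 floor because its exact handling is not specified in the text. The function names are hypothetical.

```python
from collections import Counter


def extend_read(start: int, end: int, strand: str, extension: int = 200):
    """Extend a read by `extension` nt in its alignment direction.

    Coordinates are assumed to be 0-based and half-open; minus-strand reads
    are extended toward lower coordinates and clamped at position 0.
    """
    if strand == "+":
        return start, end + extension
    return max(0, start - extension), end


def binned_rpm(reads, total_reads: int, bin_size: int = 50, extension: int = 200):
    """Count extended reads per bin and scale the counts to reads per million.

    `reads` is an iterable of (start, end, strand) tuples on one chromosome.
    Returns a dict mapping each bin start coordinate to its RPM value.
    """
    counts = Counter()
    for start, end, strand in reads:
        s, e = extend_read(start, end, strand, extension)
        first_bin = s // bin_size
        last_bin = (e - 1) // bin_size
        for b in range(first_bin, last_bin + 1):
            counts[b * bin_size] += 1
    scale = 1_000_000 / total_reads
    return {pos: n * scale for pos, n in sorted(counts.items())}


if __name__ == "__main__":
    demo = [(100, 140, "+"), (300, 340, "-")]
    for pos, rpm in binned_rpm(demo, total_reads=2_000_000).items():
        print(pos, rpm)
```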
For generation of the GRO-seq browser tracks, GRO-seq reads were processed as described in (Sigova et al., 2015). The GRO-seq .sra file corresponding to GEO accession number GSM1665566 (Sigova et al., 2015) was converted to .fastq using the SRA toolkit (Leinonen et al., 2011). Reads were aligned to the mouse genome (NCBI37/mm9) using Bowtie v1.2.2 (Langmead et al., 2009) with the following settings: “-e 70 -k 1 -m 10 -n 2 --best”. The reads corresponding to each one of the features (super-enhancers, typical enhancers, proximal promoter regions, genes) were counted using featureCounts v1.6.2 (Liao et al., 2014) with default settings. The coordinates for typical enhancers and super-enhancers in mouse embryonic stem cells (mESCs) were acquired from (Whyte et al., 2013). The coordinates for genes (transcription start and end sites) were acquired using the UCSC Table Browser (Karolchik et al., 2004). The upstream antisense promoter regions were defined as the genomic areas spanning 1 kb upstream of each TSS. Their coordinates were retrieved using BEDTools v.2.26.0 (Quinlan and Hall, 2010), with the TSS coordinates as input to the slop function. Read counts were normalized to the size of the feature to which the reads aligned.
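The definition of the 1 kb upstream region and the length normalization can be sketched in a few lines. This is a pure-Python illustration under stated assumptions, not the BEDTools or featureCounts invocation that was used: coordinates are taken as 0-based and half-open, the TSS is a single position, and the function names are hypothetical.

```python
def upstream_window(tss: int, strand: str, size: int = 1000):
    """Return the (start, end) interval covering `size` bp upstream of a TSS.

    Upstream means lower coordinates for plus-strand genes and higher
    coordinates for minus-strand genes. The interval excludes the TSS itself
    and is clamped at the start of the chromosome.
    """
    if strand == "+":
        return max(0, tss - size), tss
    return tss + 1, tss + 1 + size


def length_normalized(count: float, start: int, end: int) -> float:
    """Normalize a read count by the size (bp) of the feature it was assigned to."""
    length = end - start
    if length <= 0:
        raise ValueError("feature must have positive length")
    return count / length


if __name__ == "__main__":
    start, end = upstream_window(5000, "+")
    print(start, end, length_normalized(250, start, end))
```

Strand awareness matters here because a window taken toward lower coordinates for a minus-strand gene would cover the gene body rather than the upstream region.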
The RNA-seq .sra file corresponding to GEO accession number GSM2686137b (Chiu et al., 2018) was converted to .fastq using the SRA Toolkit. RNA-seq analysis was performed using the nf-core RNA-seq pipeline (v1.4.2) (Ewels et al., 2020) with default settings and NCBI37/mm9 as the reference genome. Nextflow v20.01.0 was used as a workflow tool on an LSF High-Performance Computing environment (Di Tommaso et al., 2017). STAR v2.6.1d (Dobin et al., 2013) was used for the alignment of reads. Aligned reads were assigned to the aforementioned intervals (typical enhancers, super-enhancers, proximal promoter regions and genes) using featureCounts v1.6.4 with the default settings.
Known concentrations of in vitro transcribed enhancer RNAs and pre-mRNAs from the Trim28 and Pou5f1 loci were used as standards to approximate the number of molecules in cells. These RNAs were converted to cDNAs by reverse transcription and mixed at equal concentrations. For each RNA species, a standard curve of qRT-PCR Ct value to RNA amount was generated using serial dilutions, with two different primer sets in technical duplicates. Next, qRT-PCR reactions using the same primer sets were performed for biological duplicates of mESCs. Actb-normalized Ct values were then used to determine the amount of each RNA species in the reaction based on the standard curves above. To calculate the number of RNA molecules per cell, the amount of RNA (g) was divided by the molar weight of each species (˜350 g mol^-1 nt^-1 × length of the in vitro transcribed RNA (nt)), multiplied by Avogadro's number (6.022×10^23 mol^-1), and divided by the approximate number of cells used in each reaction (10,000 cells). Melting curves were analyzed to confirm primer specificity. Non-reverse-transcribed (−RT) controls were included to rule out the amplification of genomic DNA. Primer sequences are indicated in Table S1.
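The conversion from RNA mass to molecules per cell described above can be written out as a short calculation. This is a minimal sketch that uses the constants given in the text (approximately 350 g per mole per nucleotide, Avogadro's number, 10,000 cells per reaction); the function name and the example input values are hypothetical.

```python
AVOGADRO = 6.022e23       # molecules per mole
MASS_PER_NT = 350.0       # approximate g per mole per nucleotide of RNA


def molecules_per_cell(rna_mass_g: float, length_nt: int,
                       n_cells: int = 10_000) -> float:
    """Convert an RNA mass (g) from a standard curve into molecules per cell.

    Steps: mass / molar weight gives moles, moles x Avogadro's number gives
    molecules, and dividing by the number of cells gives molecules per cell.
    """
    molar_weight = MASS_PER_NT * length_nt      # g per mole for this species
    moles = rna_mass_g / molar_weight
    return moles * AVOGADRO / n_cells


if __name__ == "__main__":
    # Hypothetical input: 1 fg of a 500 nt RNA measured in one reaction.
    print(f"{molecules_per_cell(1e-15, 500):.3f} molecules per cell")
```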
Recombinant GFP fusion proteins were concentrated to a desired protein concentration using Amicon Ultra centrifugal filters (30K MWCO, Millipore). Droplet reactions with the recombinant proteins were performed in 10 μL volumes in PCR tubes under the following buffer condition: 30 mM Tris-HCl pH 7.4, 100 mM NaCl, 2% glycerol and 1 mM DTT. The same buffer containing 55 mM NaCl was used for BRD4-IDR-GFP. Droplet reactions with the Mediator complex were performed under the following buffer condition: 30 mM HEPES pH 7.4, 65 mM NaCl, 2% glycerol and 1 mM DTT. For all droplet reactions, protein and buffer were mixed first, and RNA, ssDNA, or heparin (Sigma, H3393) was added afterwards. The reactions were incubated at room temperature for 1 hr without any shaking or rotating. The reactions were then individually transferred into a 384-well plate (Cellvis P384-1.5H-N) using a micropipette (2-20 μL) 5 minutes prior to imaging by confocal microscopy at 150× magnification or prior to turbidity measurements on a plate reader (Tecan) at 350 nm absorbance at room temperature (Banerjee et al., 2017). The concentrations of proteins and RNAs in the droplet reactions are indicated in the figure legends. For brightfield Mediator experiments (
Fluorescence Recovery after Photobleaching (FRAP)
FRAP was performed on an Andor Revolution Spinning Disk Confocal microscope with a 488-nm laser. Droplets were bleached using 30% laser power with a 20 μs dwell time for 5 pulses, and images were collected every second for 60 seconds. Fluorescence intensity at the bleached spot and at a control unbleached spot was measured using ImageJ. Values were normalized to the unbleached spot to control for photobleaching during image acquisition and then normalized to the intensity at the first time point.
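The two-step normalization described above can be sketched as follows. This is an illustration only (the analysis itself was performed with ImageJ measurements and MATLAB scripts); the function name and the example intensities are hypothetical.

```python
def normalize_frap(bleached, control):
    """Normalize a FRAP intensity trace in two steps.

    1. Divide the bleached-spot intensity by the unbleached control spot at
       each time point, which corrects for photobleaching during acquisition.
    2. Divide every corrected value by the corrected value at the first time
       point, so that the trace starts at 1.0.
    """
    if len(bleached) != len(control):
        raise ValueError("traces must have the same number of time points")
    if not bleached:
        raise ValueError("traces must not be empty")
    corrected = [b / c for b, c in zip(bleached, control)]
    first = corrected[0]
    return [value / first for value in corrected]


if __name__ == "__main__":
    # Hypothetical intensities: first time point, just after bleach, later.
    print(normalize_frap([50.0, 10.0, 20.0], [100.0, 100.0, 80.0]))
```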
MATLAB™ scripts were written to process the intensity data, and post bleach FRAP recovery data was normalized to pre-bleach intensity (FRAP(t)) and fit to:
where M (mobile fraction) and t (half-life of recovery) are inferred using built-in MATLAB functions. These values are inferred for each replicate and averaged to provide a range for the apparent diffusion coefficients, which are computed as:
To analyse in vitro droplet experiments, the inventors used a previously reported pipeline (Guo et al., 2019). The code for this analysis is available at the GitHub link in the Data/Code Availability section. Briefly, all droplets were segmented from average images of the captured channels based on the following criteria: (1) an intensity threshold that was three s.d. above the mean of the image; (2) a size threshold (20 pixel minimum droplet size); and (3) a minimum circularity (circularity=4π·(area)/(perimeter^2)) of 0.8 (1 being a perfect circle). After segmentation, the mean intensity for each droplet was calculated while excluding pixels near the phase interface. Hundreds of droplets identified in (typically) ten independent fields of view were quantified. The mean intensities within the droplets (C-in) and in the bulk (C-out) were calculated for each channel. The partition ratio was computed as (C-in)/(C-out).
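The size and circularity criteria and the partition ratio can be expressed compactly. The following is a minimal sketch of those formulas only, not the published pipeline: it assumes that the area and perimeter of each segmented object are already available in pixel units, and the function names are hypothetical.

```python
import math


def circularity(area: float, perimeter: float) -> float:
    """Circularity = 4 * pi * area / perimeter^2 (1.0 for a perfect circle)."""
    return 4.0 * math.pi * area / perimeter ** 2


def passes_filters(area: float, perimeter: float,
                   min_area: float = 20.0, min_circularity: float = 0.8) -> bool:
    """Apply the size (>= 20 pixels) and circularity (>= 0.8) criteria."""
    return area >= min_area and circularity(area, perimeter) >= min_circularity


def partition_ratio(c_in: float, c_out: float) -> float:
    """Mean intensity inside droplets divided by mean intensity in the bulk."""
    if c_out <= 0:
        raise ValueError("bulk intensity must be positive")
    return c_in / c_out


if __name__ == "__main__":
    r = 10.0
    print(passes_filters(math.pi * r ** 2, 2 * math.pi * r))  # circle
    print(passes_filters(100.0, 40.0))                        # 10 x 10 square
    print(partition_ratio(300.0, 100.0))
```

A square has a circularity of π/4 (about 0.785), so it falls just below the 0.8 cutoff; this gives a sense of how strict the criterion is.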
Droplet size, partition ratio, and condensed fraction measure distinct properties of droplet formation, and these three metrics show similar trends upon RNA-mediated reentrant phase transitions. When a protein or RNA is fluorescently labeled in these experiments, the inventors favor measuring the partition ratio. This is because the partition ratio can be measured on a per-droplet basis and, unlike the condensed fraction, which varies with the number of droplets per field, is largely independent of the field that is imaged.
For the size analysis of droplets formed in the reconstituted transcription assays (
Enhancer and promoter sequences for RNAs were obtained from super-enhancer-regulated genes Pou5f1, Nanog, and Trim28. For promoter sequences, the first 475-490 bp from the first exon were selected from mm10. For enhancer sequences, GROseq reads (Sigova et al., 2015) from both + and − strands aligned to mm9 were overlapped with called super-enhancers (Whyte et al., 2013). Contiguous regions of read density above background were manually selected (
Phusion polymerase (NEB) was used to amplify the products with the bacterial promoters, and the products were run on a 1% agarose gel, gel-purified using the Qiaquick Gel Extraction Kit (Qiagen), and eluted in 40 μL H2O. Templates were sequenced to verify their identity. A volume of 8 μL of each template (10-40 ng/μL) was transcribed using the MEGAscript T7 (Invitrogen; sense) or MEGAscript SP6 (Invitrogen; antisense) kits according to the manufacturer's instructions. For visualization of the RNA by microscopy, reactions included a Cy5-labeled UTP (Enzo LifeSciences ENZ-42506) at a ratio of 1:10 labeled UTP:unlabeled UTP. The in vitro transcription reaction was incubated overnight at 37° C., then 1 μL TURBO DNase (supplied in kit) was added, and the reaction was incubated for 15 minutes at 37° C. The MEGAclear Transcription Clean-Up Kit (Invitrogen) was used to purify the RNA following the manufacturer's instructions, with elution in 40 μL H2O. RNA was diluted to 2 μM and aliquoted to limit freeze/thaw cycles, and RNA was run on 1% agarose gels in TBE buffer to verify a single band of the correct size.
Recombinant protein purifications were performed as previously reported (Boija et al., 2018; Guo et al., 2019; Sabari et al., 2018; Shrinivas et al., 2019; Zamudio et al., 2019). Briefly, pET expression plasmids containing 6×HIS tag and genes of interest or their IDRs tagged with either mEGFP or mCherry were transformed into LOBSTR cells (gift of I. Cheeseman Lab). Expression of proteins was induced by addition of 1 mM IPTG either at 16° C. for 18 hours or at 37° C. for 5 hours. Extracts were prepared as previously described (Boija et al., 2018). Proteins were purified by Ni-NTA agarose beads (Invitrogen, R901-15), and eluted with 50 mM Tris pH 7.4, 500 mM NaCl, 250 mM imidazole buffer containing complete protease inhibitors (Roche, 11873580001). Proteins were dialyzed against 50 mM Tris pH 7.4, 125 mM NaCl, 10% glycerol and 1 mM DTT at 4° C. for BRD4-IDR-GFP, OCT4-GFP and GFP alone and the same buffer containing 500 mM NaCl for MED1-IDR-GFP.
Purification of Human Mediator Complex from HeLa Nuclear Extract.
HeLa nuclear protein extract (4 g) was prepared as described (Dignam et al., 1983). Nuclear extract was dialyzed against BC100: BC buffer, pH 7.5+100 mM KCl (20 mM Tris-HCl, 20 mM β-Mercaptoethanol, 0.2 mM PMSF, 0.2 mM EDTA, 10% glycerol (v/v) and 100 mM KCl). The extract was fractionated on a phosphocellulose column (P11) with BC buffer containing 0.1, 0.3, 0.5 and 1M KCl. The Mediator complex eluted in the 0.5M KCl (BC500) fraction. This fraction was dialyzed against BC100 and loaded on a DEAE Cellulose column and sequentially fractionated with BC buffer containing 0.1, 0.3 and 0.5M KCl. The Mediator did not bind the DEAE Cellulose resin and was collected in the flow through fraction 0.1M KCl (BC100). This fraction was then directly loaded onto a DEAE-5 PW column (TSK) and eluted with a linear KCl gradient from 0.1 to 1M KCl in BC buffer. The Mediator complex eluted between 0.4 and 0.6M KCl. The fractions containing Mediator were pooled and dialyzed against BD700: BD buffer (20 mM Hepes pH 7.5, 20 mM β-Mercaptoethanol, 0.2 mM PMSF, 0.2 mM EDTA, 10% glycerol, and 700 mM (NH4)2SO4). This fraction was then loaded onto a Phenyl-Sepharose Hydrophobic Interaction Chromatography (HIC) column and eluted with a linear reverse gradient from 0.7 to 0.025M (NH4)2SO4 in BD buffer. The Mediator complex eluted between 0.3 and 0.1M (NH4)2SO4. The Mediator-containing fractions were again pooled and dialyzed against BA100: BA buffer, pH 7.5+100 mM NaCl (20 mM Hepes, 20 mM β-Mercaptoethanol, 0.2 mM PMSF, 0.2 mM EDTA, 10% glycerol and 100 mM NaCl) and loaded onto a Heparin Agarose column. The column was washed with BA100 and step-eluted with BA buffer containing 0.25, 0.5, 1M and 1M NaCl. The Mediator complex eluted in the 0.5M NaCl (BA500) fraction. A portion of this fraction was then loaded on a Superose-6 (gel filtration column) that was equilibrated and run in BC100. The Mediator complex eluted from the gel filtration column with a mass range between 1-2 MDa.
The reconstituted in vitro transcription by RNA polymerase II was performed as previously described (Flores et al., 1992; LeRoy et al., 2008, 2019; Orphanides et al., 1998) with some modifications. A 1000 bp template DNA (unlabeled or Cy-3 labeled at 3′ end) containing adenovirus major late promoter, five Gal4 binding sites, TATA-box sequence and 561 bp from eGFP sequence was used. First, the pre-initiation complex was assembled at RT for 15 min by mixing the following components: 50 nM RNA polymerase II enriched for hypophosphorylated CTD, 50 nM general transcription factors (TFIIA-B-D-E-F-H), and 5.75 nM template DNA, in a buffer containing 10 mM HEPES pH 7.5, 65 mM NaCl, 6.25 mM MgCl2, and 6.25 mM sodium butyrate. Next, 10 nM Mediator complex and 10 nM GAL4 (Gal4 DNA binding domain fused to activation domain of VP16) were added to the reaction. Last, a nucleotide mix containing 0.375 mM ATP, CTP, UTP, GTP (Invitrogen), 0.01 U RNase Inhibitor (Invitrogen), and 1.25% PEG-8000 was added together with one of the following: a) various amounts of purified exogenous Pou5f1 RNA (0-500 nM) b) spermine (Sigma, S4264) c) extra NTPs (Invitrogen) d) extra NaCl e) heparin (Sigma, H3393). The reaction was incubated at 30° C. for 2 hr. RNA isolation was performed using RNeasy kit (Qiagen) by including a spike-in RNA control and an RNA carrier. Purified RNAs were treated with ezDNase (Invitrogen) for 30 min at 37° C. to eliminate the template DNA. Reverse transcription was performed using Superscript IV (Invitrogen) and qPCR was performed with SYBR Green Real Time PCR master mix (Invitrogen) to quantify the template derived transcriptional output. The Ct values of the reactions were normalized to the spike-in RNA control. The concentration of template derived transcriptional output was calculated by using a standard curve of qRT-PCR Ct values generated by known amounts of serially diluted GFP RNA. The sequences of the primers used for qRT-PCR are indicated in Table S1.
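By way of illustration only, the standard-curve quantification described above can be sketched as follows. The sketch assumes a linear relationship between Ct and log10 of the input amount, which is the usual form of a qRT-PCR standard curve; the function names, the dilution series, the Ct values, and the particular way the spike-in offset is applied are hypothetical and are not taken from the disclosure.

```python
import math


def fit_standard_curve(known_amounts, ct_values):
    """Least-squares fit of Ct = slope * log10(amount) + intercept."""
    xs = [math.log10(a) for a in known_amounts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_ct(ct, slope, intercept):
    """Invert the fitted curve to recover an amount from a measured Ct."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


def spike_in_corrected_ct(ct_sample, ct_spike_sample, ct_spike_reference):
    """ASSUMPTION: spike-in normalization applied as a simple Ct offset.

    The text only says Ct values were normalized to the spike-in control;
    the exact arithmetic used is not given.
    """
    return ct_sample - (ct_spike_sample - ct_spike_reference)


# Hypothetical 10-fold dilution series (arbitrary units) and Ct values.
STANDARD_AMOUNTS = [1.0, 10.0, 100.0, 1000.0]
STANDARD_CTS = [30.0, 26.7, 23.4, 20.1]

slope, intercept = fit_standard_curve(STANDARD_AMOUNTS, STANDARD_CTS)
estimated_amount = amount_from_ct(25.05, slope, intercept)
```

A fitted slope near -3.3 corresponds to roughly one doubling per cycle, which is a common sanity check before a curve is used for quantification.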
To visualize the droplets formed in the reconstituted transcription assay, 5 μL of each reaction was loaded with a micropipette (2-20 μL) onto a homemade chamber, which was prepared by attaching coverslips to a glass slide with parallel strips of double-sided tape (Sabari et al., 2018). After the droplets had settled on the glass coverslip, images were collected using an RPI Spinning Disk confocal microscope with a 100× objective. To account for camera artifacts in the images, brightfield images of droplets from reconstituted assays were subjected to a white tophat filter with a disk element radius of 21 using the MorphoLib plugin in ImageJ, then a Gaussian filter (sigma=1) was applied.
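The filtering step can be illustrated with a minimal pure-Python sketch. A white top-hat is the image minus its morphological opening (erosion followed by dilation), which removes structures broader than the structuring element while keeping small bright features. The sketch below is not the MorphoLib implementation: the disk definition, the border handling, the small radius, and the synthetic image are all assumptions chosen to keep the example short.

```python
import math


def disk_offsets(radius):
    """Pixel offsets inside a disk (assumed shape; plugins may differ)."""
    span = range(-radius, radius + 1)
    return [(dy, dx) for dy in span for dx in span
            if dy * dy + dx * dx <= radius * radius]


def _morph(image, offsets, pick):
    """Apply min (erosion) or max (dilation) over the in-bounds neighbourhood."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = pick(image[y + dy][x + dx] for dy, dx in offsets
                             if 0 <= y + dy < h and 0 <= x + dx < w)
    return out


def white_tophat(image, radius):
    """Image minus its opening (erosion followed by dilation)."""
    offsets = disk_offsets(radius)
    opened = _morph(_morph(image, offsets, min), offsets, max)
    return [[pix - op for pix, op in zip(row, op_row)]
            for row, op_row in zip(image, opened)]


def gaussian_filter(image, sigma=1.0):
    """Separable Gaussian blur, truncated at 3 sigma, edges clamped."""
    radius = int(3 * sigma + 0.5)
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma))
              for k in range(-radius, radius + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]

    def blur_rows(img):
        w = len(img[0])
        return [[sum(kernel[k + radius] * row[min(max(x + k, 0), w - 1)]
                     for k in range(-radius, radius + 1))
                 for x in range(w)] for row in img]

    def transpose(img):
        return [list(col) for col in zip(*img)]

    return transpose(blur_rows(transpose(blur_rows(image))))


# Synthetic 15x15 image: uneven background (10 on the left, 20 on the
# right) plus one bright pixel standing in for a droplet.
image = [[10.0 if x < 8 else 20.0 for x in range(15)] for _ in range(15)]
image[7][3] += 40.0

tophat = white_tophat(image, radius=2)      # the text uses radius 21
smoothed = gaussian_filter(tophat, sigma=1.0)
```

In this toy case the broad background step is removed entirely and only the small bright feature survives, which is the behaviour the top-hat step is meant to provide before smoothing.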
The goal in this section is to develop a simplified and coarse-grained model that captures the qualitative physics of RNA-protein mixtures. Based on phenomenological observations of transcriptional proteins and RNA (
Transcriptional proteins phase separate in the absence of RNA through other types of interactions, albeit at higher concentrations.
At fixed protein concentrations, addition of RNA initially promotes de-mixing and, at higher levels, drives a re-entry into the mixed phase.
Motivated by the evidence that transcriptional condensates recruit diverse coactivators, transcription factors, and other proteins of the transcriptional apparatus (Boija et al., 2018; Guo et al., 2019; Sabari et al., 2018; Shrinivas et al., 2019), an effective protein component P was defined that lumps together different transcriptional molecules. Similarly, while different species of RNA are likely present within these condensates, an effective RNA species (R) was defined.
First, this problem was approached by constructing a phenomenological free-energy with 2 order-parameters that represent scaled concentrations of protein ($\phi_p(\vec{r}, t)$) and RNA ($\phi_r(\vec{r}, t)$). The free-energy (normalized to $k_BT=1$) was defined as:
Here, $f_{dw}(\phi_p(\vec{r},t)) = \rho_s(\phi_p-\alpha)^2(\phi_p-\beta)^2$ is a standard double-well potential that ensures protein components phase separate without RNA, with co-existence concentrations specified by $\alpha$ and $\beta$. Choice of $\kappa>0$ ensures that there is finite surface tension for the protein condensate. The second-order term for RNA ($\rho_r>0$) states that within this model framework, RNA cannot phase-separate in the absence of protein. Given that electrostatic interactions at physiological salt conditions are fairly short-ranged (Debye length ~1 nm), the non-linear nature of RNA-protein interactions was captured in an effective interaction term $\chi_{\mathrm{eff}}$. This interaction term was defined in the spirit of the Landau-Ginzburg approach as an expansion in powers of the order parameters:
While symmetry arguments often dictate or exclude certain types of terms (odd powers in Ising models for example) in such an expansion, there are no obvious symmetry constraints for this system. Hence, this modeling approach is to minimize the number of higher-order terms that need to be included to recapitulate the experimentally observed reentrant phase transition. These experimental results suggest that low concentrations of RNA promote phase separation, and thus the lowest order term (−χϕpϕr, χ>0) lowers the free-energy. However, higher-order terms must counter this and below the inventors outline how it was determined which terms to include. In general, the stability of a mixture described by such a free-energy can be ascertained from the Jacobian matrix J. For this model, the elements of this 2×2 matrix are:
The mixed phase is no longer stable to perturbations when at least one eigenvalue of J becomes negative (spinodal instability). In the absence of RNA, the spinodal satisfies Jpp=0. If only the pair-wise interaction term (−χϕpϕr) is considered, the spinodal region broadens, i.e., phase separation is promoted at lower protein concentrations when RNA is present. The effect of an additional higher-order term (only one of a, b or c is non-zero) on the Jacobian matrix was next characterized. Briefly, it was ascertained that:
While cubic and higher-order terms are required to recapitulate complete phase-behavior, this model was explored with c>0, assuming the coefficients a, b are small. In the simulations reported in
github.com/krishna-shrinivas/2020_Henninger_Oksuz_Shrinivas_RNA_feedback.
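The stability analysis described for the phenomenological model can be illustrated numerically. Because the explicit expansion is not reproduced in this text, the sketch below assumes a bulk free-energy density of the form f = ρs(ϕp−α)²(ϕp−β)² + ρrϕr² − χϕpϕr + aϕpϕr² + bϕp²ϕr + cϕp²ϕr²; the assignment of a and b to the two cubic terms, all parameter values, and the function names are hypothetical. It is not the code from the repository referenced above.

```python
import math


def hessian(phi_p, phi_r, rho_s=1.0, alpha=0.1, beta=0.7,
            rho_r=1.0, chi=1.0, a=0.0, b=0.0, c=2.0):
    """Second derivatives (J_pp, J_pr, J_rr) of the ASSUMED free energy."""
    u, v = phi_p - alpha, phi_p - beta
    # d2/dphi_p2 of rho_s * u^2 * v^2 is 2 * rho_s * (u^2 + 4uv + v^2)
    j_pp = (2 * rho_s * (u * u + 4 * u * v + v * v)
            + 2 * b * phi_r + 2 * c * phi_r ** 2)
    j_pr = -chi + 2 * a * phi_r + 2 * b * phi_p + 4 * c * phi_p * phi_r
    j_rr = 2 * rho_r + 2 * a * phi_p + 2 * c * phi_p ** 2
    return j_pp, j_pr, j_rr


def min_eigenvalue(j_pp, j_pr, j_rr):
    """Smaller eigenvalue of the symmetric 2x2 matrix [[j_pp, j_pr], [j_pr, j_rr]]."""
    trace = j_pp + j_rr
    gap = math.sqrt((j_pp - j_rr) ** 2 + 4 * j_pr ** 2)
    return 0.5 * (trace - gap)


def protein_only_unstable(phi_p, **params):
    """Without RNA the criterion reduces to the sign of J_pp."""
    j_pp, _, _ = hessian(phi_p, 0.0, **params)
    return j_pp < 0


def mixture_unstable(phi_p, phi_r, **params):
    """Spinodal instability: at least one negative eigenvalue of J."""
    return min_eigenvalue(*hessian(phi_p, phi_r, **params)) < 0


# Scan RNA levels at a protein level that is stable on its own.
PHI_P = 0.2
scan = [(phi_r, mixture_unstable(PHI_P, phi_r))
        for phi_r in (0.05, 0.1, 0.2, 0.5, 1.0)]
```

With these illustrative parameters, a protein level that is stable on its own becomes unstable at low RNA and regains stability at high RNA, whereas setting c=0 leaves the mixture unstable at every RNA level tested. This is consistent with the need for a higher-order term to recover re-entrant behaviour, although the specific thresholds depend entirely on the assumed values.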
In this approach, rather than employing a phenomenological model, a microscopic model motivated by Flory-Huggins polymer-solution theory (Flory, 1942) was parameterized. The simplified F-H model contains 3 components: protein, RNA, and the solvent (s), whose volume fractions are defined as $\phi_p(\vec{r}, t)$, $\phi_r(\vec{r}, t)$, and $1-\phi_p(\vec{r}, t)-\phi_r(\vec{r}, t)$, respectively. The free-energy (normalized as before) is defined as:
Here, ri are the solvent-equivalent polymerization lengths of the RNA and protein (assumed to be equal for simplicity) and χij are the various pairwise interaction terms. As before, these interactions were assumed to be short-ranged at physiological salt levels. Choice of χpr>χps>0 and χrs<0 recapitulates the attractive contributions of protein-protein/protein-RNA interactions and repulsive RNA-RNA interactions. With these choices of constraints, the resulting free-energy looks similar to the phase diagram from the Landau approach with c>0 (
Numerical investigations of the coupled-equations outlined in
The chemical potential for the protein components is calculated as:
The radius of condensates was inferred from the volume of mesh regions where
The mobilities of RNA and protein were chosen to be 1.0 unless stated otherwise.
The inventors designed simulations (
In estimating the number and charge of transcriptional proteins (
As defined in the model (
over the range of parameters including diffusivity and radii of cluster.
Charge-balance calculations were performed (
The effective concentration of MED1-IDR in the assays was 1000 nM. The results were not quantitatively affected by inclusion/exclusion of the partial charge on histidine residues, partly due to their low frequency in the protein sequences. For heparin, a charge of roughly −3 per monomer was employed (Lin et al., 2020) and for single-stranded DNA, a charge of −1 per nt was employed. A comprehensive listing of the charges of the various species employed in this study is provided in Table S2. The fit was quantified by calculating the Pearson correlation coefficient (r) between the median droplet partition value at different concentrations and the relevant charge-balance ratios and reported in
github.com/krishna-shrinivas/2020_Henninger_Oksuz_Shrinivas_RNA_feedback.
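As a hedged illustration of the kind of bookkeeping involved in the charge-balance calculations, the sketch below counts charged residues in a protein sequence, assigns per-monomer charges to polyanions, forms a ratio of total negative to total positive charge, and computes a Pearson correlation. The definition of the ratio, the example sequence, and all function names are assumptions made for illustration; the per-monomer charges for heparin and single-stranded nucleic acid follow the values stated above, and the actual charges used are those listed in Table S2.

```python
import math


def protein_net_charge(sequence, histidine_charge=0.0):
    """Net charge from K/R (+1) and D/E (-1); histidine is optional."""
    charge = 0.0
    for residue in sequence.upper():
        if residue in ("K", "R"):
            charge += 1.0
        elif residue in ("D", "E"):
            charge -= 1.0
        elif residue == "H":
            charge += histidine_charge
    return charge


def polyanion_charge(n_monomers, charge_per_monomer=-1.0):
    """Total charge of a polyanion, e.g. -1 per nt or about -3 per heparin monomer."""
    return n_monomers * charge_per_monomer


def charge_balance_ratio(protein_conc, protein_charge, anion_conc, anion_charge):
    """ASSUMED definition: total negative charge over total positive charge."""
    return (anion_conc * abs(anion_charge)) / (protein_conc * protein_charge)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical example: a toy peptide and a 100-nt RNA.
toy_charge = protein_net_charge("KKRDEH")          # His ignored
toy_charge_with_his = protein_net_charge("KKRDEH", histidine_charge=0.1)
rna_charge = polyanion_charge(100)                 # -1 per nt
```

Keeping the histidine contribution as an explicit parameter makes it straightforward to repeat a calculation with and without it, mirroring the robustness check mentioned in the text.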
For small molecule inhibition experiments, cells were treated with 100 μM DRB (Sigma), or 1 μM Actinomycin-D (Sigma) in 2i media (detailed above) for 30 minutes, then imaged. For wash-out experiments, media was replaced with fresh 2i media and cells were allowed to recover for 1 hour, then the cells were imaged.
Cells with endogenously-tagged Med1-GFP (Sabari et al., 2018) were plated on glass-bottom dishes (Mattek) coated with poly-L-ornithine (Sigma) and laminin (ThermoFisher). Mock (DMSO) and treated cells were imaged on an LSM 880 Confocal Microscope with Airyscan to obtain super-resolution z-stacks for at least 8 different fields containing multiple cells. For quantification, a manual threshold was applied equally across all conditions to remove background, and the size of Med1-GFP puncta was quantified in 3D using the 3D object counter plugin (Fiji/ImageJ).
HaloTag was endogenously knocked into the 5′ end of Med19 via homology-directed repair (HDR) in mouse embryonic stem cells (R1 mESCs). Three single-guide RNAs (sgRNAs) targeting within ±100 bp of the start codon of the Med19 gene were designed using the web-based CRISPR Design tool (http://crispr.mit.edu) and integrated into a Streptococcus pyogenes Cas9 vector (Addgene #62988) for standard CRISPR/Cas9 editing. Single positive colonies were sorted by fluorescence-activated cell sorting (FACS) and validated under the microscope.
Cells were cultured in serum-free 2i medium on poly-L-ornithine (PLO) and laminin-coated flasks for more than two days and then were transferred onto coated imaging dishes for another day. Before imaging, cells were stained with 100 nM (PA)-JF549-HaloTag dye (a gift from Luke Lavis Lab, Janelia Research Campus) for 2 hours followed by a 60-minute wash in fresh 2i medium. Lastly, dishes were filled with 2 ml Leibovitz's L-15 Medium (no phenol red, Thermo Fisher) and brought to the microscope for imaging.
Photo-activation localization microscopy (PALM) imaging was performed using a Nikon Eclipse Ti microscope with a 100× oil immersion objective (NA 1.40) (Nikon, Tokyo, Japan). A 405 nm beam of 100 mW power (attenuated with 25% AOTF) and a 561 nm beam of 500 mW power were collimated and superposed to perform simultaneous activation and excitation. The combined beam was expanded and re-collimated with an achromatic beam expander (AC254-040-A and AC508-300-A, THORLABS) to improve the uniformity of illumination across the whole region of interest (ROI, 256² pixels). Images were acquired with an Andor iXon Ultra 897 EMCCD camera (gain 1000, exposure time 50 ms) interfaced through Micro Manager 1.4. For each imaging cycle, 2400 frames were acquired. The cells were maintained at 37° C. in a temperature-controlled platform (In Vivo Scientific, St. Louis, MO) on the microscope stage during image acquisition. Med19-Halo cluster lifetimes were calculated as previously described using the qSR software (dark time tolerance=20 frames, min cluster size=50) (Andrews et al., 2018), and a cumulative distribution was generated using Prism software (GraphPad).
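The lifetime and cumulative-distribution steps can be sketched in simplified form. The example below is not the qSR algorithm: it assumes that detections from one spatial cluster are grouped into temporal bursts whenever the number of dark frames between consecutive detections does not exceed the tolerance, that the minimum cluster size refers to the number of detections in a burst, and that a lifetime is the burst span multiplied by the 50 ms exposure time. The function names and the toy frame list are hypothetical.

```python
def burst_lifetimes(frames, dark_tolerance=20, min_size=50, exposure_s=0.05):
    """Group detection frame indices into bursts and return lifetimes in seconds.

    ASSUMPTION: a new burst starts when more than `dark_tolerance` dark
    frames separate consecutive detections; bursts with fewer than
    `min_size` detections are discarded.
    """
    lifetimes = []
    burst = []

    def close(current):
        if len(current) >= min_size:
            lifetimes.append((current[-1] - current[0] + 1) * exposure_s)

    for frame in sorted(frames):
        if burst and frame - burst[-1] - 1 > dark_tolerance:
            close(burst)
            burst = []
        burst.append(frame)
    if burst:
        close(burst)
    return lifetimes


def ecdf(values):
    """Empirical cumulative distribution as (value, fraction <= value) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    return [(value, (i + 1) / n) for i, value in enumerate(ordered)]


# Toy detections: two bursts and one isolated detection.
toy_frames = [1, 2, 3, 30, 31, 32, 33, 100]
toy_lifetimes = burst_lifetimes(toy_frames, dark_tolerance=20, min_size=2)
toy_cdf = ecdf(toy_lifetimes)
```

The small `min_size` in the toy call is used only so the short example produces output; the defaults mirror the parameter values quoted in the text.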
For the nascent RNA experiments in
For analysis of these images, nuclei were segmented using the Cellpose algorithm (Stringer et al., 2020) on the 405 Hoechst channel images. For average image analysis in
Vectors used in the reporter assay were modified from the pTETRIS-cargo vector, a gift from J. M. Calabrese (Kirk et al., 2018). A 6× STOP codon sequence was cloned into the NotI-digested pTETRIS-cargo vector using the Gibson cloning strategy by following the manufacturer's instructions (NEB). This vector is called pTETRIS-cargo-STOP. The feedback gene and the reporter gene have their own polyA termination signal (200-300 bp) to terminate transcription. There are 51 bp between these two polyA signals, which face each other. The reporter gene is regulated by a phosphoglycerate kinase (PGK) promoter. Various versions of pTETRIS-cargo-STOP were generated using the Gibson cloning strategy (NEB): i) the relative orientations of the feedback RNA and luciferase reporter were altered (tandem or divergent orientations) ii) feedback RNAs and luciferase reporter were cloned into separate vectors. Using the Gibson cloning strategy (NEB), various RNA sequences were cloned downstream of the 6× STOP sequence to prevent translation of these RNAs. Stable cell lines for individual RNAs were generated by transfecting Med1-GFP mESCs with the following vectors: 1.0 μg pTETRIS-cargo-STOP containing individual RNAs, 1.0 μg rTTA-cargo, a gift from J. M. Calabrese (Kirk et al., 2018), and 1 μg piggyBAC transposase (Systems Biosciences). Cells were selected on puromycin (2 μg/ml) and G418 (200 μg/ml) for 1 week for successful integrations. For luciferase assays, 1×10⁵ cells of each genotype were plated in triplicate on 0.2%-gelatin-coated 24-well plates and allowed to settle overnight. Cells were treated with doxycycline (Sigma) and harvested after 24 h either to measure luciferase activity or to purify RNA. Luciferase activity was measured using the Luciferase Assay System (Promega) according to manufacturer instructions. Luciferase signal was normalized to total protein content, measured by BCA protein assay kit (Invitrogen, #23227), and then normalized to a control not treated with doxycycline.
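The two-step luciferase normalization can be illustrated as follows. The sketch assumes that each replicate's luminescence is divided by its own protein measurement and that the result is then divided by the mean of the untreated (no doxycycline) replicates; whether the original analysis used the mean or paired controls is not stated, and the readings and function names are hypothetical.

```python
def per_protein(signals, protein_amounts):
    """Luciferase signal divided by total protein, replicate by replicate."""
    return [signal / protein for signal, protein in zip(signals, protein_amounts)]


def normalize_to_control(treated, control):
    """Express treated values relative to the mean of the control values.

    ASSUMPTION: the untreated control is summarized by its mean.
    """
    control_mean = sum(control) / len(control)
    return [value / control_mean for value in treated]


# Hypothetical triplicate readings (relative light units, protein in ug).
dox_rlu, dox_protein = [200.0, 220.0, 180.0], [2.0, 2.0, 2.0]
ctrl_rlu, ctrl_protein = [100.0, 100.0, 100.0], [1.0, 1.0, 1.0]

relative_luciferase = normalize_to_control(
    per_protein(dox_rlu, dox_protein),
    per_protein(ctrl_rlu, ctrl_protein),
)
```

Dividing by protein content first guards against differences in cell number between wells before the doxycycline effect is expressed as a fold change.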
To measure RNA expression, RNA was purified using the Qiagen RNeasy Mini kit (Qiagen) according to manufacturer instructions, cDNA was generated by Superscript III (Invitrogen) according to manufacturer instructions, and 10 ng of cDNA was used in a qRT-PCR SYBR-green reaction (Life Technologies) with primers specific to a common sequence shared across the vectors (qPCR_Tetris, Table S1). Ct values were normalized to a housekeeping gene (qPCR_mActb, Table S1) and a control condition with no doxycycline treatment.
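The normalization of Ct values to a housekeeping gene and to an untreated control is commonly carried out with the 2^(−ΔΔCt) calculation. The text does not name the method used, so the sketch below is offered only as one standard way to perform such a normalization; the function name and Ct values are hypothetical, and the calculation assumes roughly 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target_treated, ct_reference_treated,
                        ct_target_control, ct_reference_control):
    """Fold change by the 2^(-ddCt) method (ASSUMED, not stated in the text)."""
    delta_treated = ct_target_treated - ct_reference_treated
    delta_control = ct_target_control - ct_reference_control
    return 2.0 ** (-(delta_treated - delta_control))


# Hypothetical Ct values: target amplicon versus a housekeeping gene,
# with and without doxycycline.
fold_change = relative_expression(
    ct_target_treated=22.0, ct_reference_treated=18.0,
    ct_target_control=25.0, ct_reference_control=18.0,
)
```

In this toy example the target Ct drops by three cycles relative to the housekeeping gene upon treatment, which corresponds to an eight-fold increase.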
For the washout experiments in
For imaging experiments in
This application claims the benefit of U.S. Provisional Application No. 63/118,586, filed on Nov. 25, 2020. The entire teachings of the above application are incorporated herein by reference.
This invention was made with government support under Grant Nos. GM123511, CA155258 and 1F32CA254216-01 awarded by the National Institutes of Health, and Grant No. PHY-1743900 awarded by the National Science Foundation. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/060735 | 11/24/2021 | WO |
Number | Date | Country |
---|---|---|
63118586 | Nov 2020 | US |