The following description of the background of the present technology is provided simply as an aid in understanding the present technology and is not admitted to describe or constitute prior art to the present technology. Throughout and within this disclosure technical and patent literature is referenced by an Arabic numeral or an identifying citation. The complete bibliographic citation for the literature referenced by an Arabic numeral can be found immediately preceding the claims.
Epigenetics is the study of mitotically and/or meiotically stable but reversible modifications to nucleotides or higher order chromatin structure that can alter expression patterns of genes in the absence of changes to the underlying DNA sequence (1). These modifications occur on multiple levels, such as 5-methyl-cytosine (5-meC) DNA methylation, post-translational modifications of histones bound by protein domains that serve as epigenetic writers, readers and erasers, and noncoding RNAs that assist in the recruitment of chromatin modifying proteins to DNA (2). These epigenetic layers dynamically dictate the three-dimensional organization of the genome within the nuclear ultrastructure and orchestrate local accessibility for the eukaryotic transcriptional machinery (3). Because of this, epigenetic signatures play a crucial role in dictating cellular identity during development and throughout life in response to the environment (1), correlate with aging (4) and are linked to disease (5). Examples include Rett syndrome (RTT) and CDKL5 deficiency disorder (CDD), two rare X-linked developmental brain disorders associated with epigenetic modification. The neurodevelopmental disorder CDKL5 deficiency is caused by de novo mutations in the CDKL5 gene on the X chromosome (30). Due to random X-chromosome inactivation (XCI), females affected by the disorder form a mosaic of tissue with cells expressing either the mutant or wild type allele (31). Phenotypic variation observed between females in families with RTT is also ascribed to differences in X-inactivation patterns.
Accordingly, there is a need to improve our understanding of XCI and reactivation of X-linked genes, and a need for targeted approaches that result in specific gene reactivation. Targeted DNA demethylation of genes on the X chromosome would allow for a directed assessment of the causal role between DNA methylation and gene expression on the inactive X chromosome. Furthermore, the presence of coding SNPs that exist in clonally-derived female cell lines provides an allele-specific model to study escape from XCI induced by targeted epigenetic remodelling. There is also a need for potential therapeutic approaches that activate a silenced wild type allele of a gene such as CDKL5 in cells expressing the loss-of-function mutant allele.
This disclosure satisfies these needs and provides related advantages as well.
The process of XCI epigenetically regulates the amount of transcriptionally active X-chromatin in somatic tissue as a dosage compensation mechanism to ensure equal expression levels of X-linked genes in males and females (6). In female somatic cells, one X chromosome randomly becomes inactive and is cytologically manifested during interphase as a perinuclear heterochromatic Barr body, which is then clonally maintained through mitosis (7, 8). This mechanism is mediated by the long noncoding RNA X-inactive specific transcript (XIST) expressed from the inactive X chromosome in cis (9), which serves as a guiding factor to tether Polycomb proteins for gene silencing to target sites on the X-chromatin (10). XIST induces the formation of repressive heterochromatin through histone deacetylation (11), DNA methylation of CpG-island (CGI) promoters (12), di- and trimethylation of histone 3 at lysine 9 (H3K9me2/3) (13), the deposition and spreading of H3K27me3 across the inactive X-chromatin (14) and the H2A histone variant macroH2A (15).
Gene expression data suggests there is an estimated 15-30% of human X-linked genes that escape XCI (16) at an arbitrary transcriptional threshold of 10% of the active allele (17). The level of escape from XCI is variable between genes and individuals (16), demonstrates tissue heterogeneity (18) and increases with age (19). X-escapees have a distinct epigenetic signature from genes that are subject to XCI, including enrichment of active and depletion of repressive histone marks, and generally reduced levels of DNA methylation near regulatory elements (17). In particular, the degree of CGI promoter 5meC DNA methylation has been demonstrated to be highly correlative with XCI (12, 20).
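The 10% transcriptional threshold mentioned above can be illustrated with a minimal sketch. The function name, the use of a greater-than-or-equal comparison, and the example expression values are assumptions made for illustration only and are not taken from the cited studies.

```python
def escapes_xci(inactive_allele_expr: float,
                active_allele_expr: float,
                threshold: float = 0.10) -> bool:
    """Return True if expression from the inactive-X allele reaches the
    given fraction of the active-X allele (assumed rule: ratio >= threshold)."""
    if active_allele_expr <= 0:
        raise ValueError("active allele expression must be positive")
    ratio = inactive_allele_expr / active_allele_expr
    return ratio >= threshold


# Hypothetical allele-specific expression values (arbitrary units).
print(escapes_xci(12.0, 100.0))  # ratio 0.12, above the 10% threshold
print(escapes_xci(5.0, 100.0))   # ratio 0.05, below the 10% threshold
```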
In line with the idea that DNA methylation forms an epigenetic barrier on the inactive X chromosome, the most potent X-reactivation to date has been achieved by treatment with 5-azacytidine, a global DNA hypomethylating agent in combination with X-wide genetic ablation of XIST (21). In addition, pharmacological and genetic screens aiming to identify trans-acting factors promoting XCI have identified the maintenance DNA methyltransferase DNMT1 as a key player in XCI (22, 23). However, previous studies aiming to elucidate the mechanism of XCI-escape, such as the aforementioned small molecule approaches, utilized untargeted approaches. While these studies have provided a significant foundation of knowledge, in particular demonstrating the importance of DNA methylation in our understanding of X-reactivation, the global side-effects of these types of approaches limit the study of specific gene reactivation.
Until recently, the lack of targeted approaches by which epigenetics can be modified has limited the studies of XCI mechanisms. With the availability of the RNA-guided clustered regularly interspaced short palindromic repeats (CRISPR) system, catalytically inactive dCas9 fused to epigenetic effector domains has become the method of choice for targeted rewriting of the epigenome to further elucidate the causality between epigenetic marks and gene expression (24, 25). In particular, dCas9 fusions with the catalytic domain of ten-eleven translocation dioxygenase 1 (TET1) have gained prominence as a candidate to precisely demethylate gene promoters or enhancers for multiple gene targets (26-29). Synthetically inducing a gene to escape from XCI via DNA methylation editing of a gene promoter, using a dCas9 fusion protein for targeted DNA demethylation, has the potential to provide a much needed therapy for at least X-linked developmental brain disorders.
Building on these discoveries, Applicant provides the following aspects and disclosures.
In one aspect, the present disclosure provides a gene editing system comprising, or consisting essentially of or yet further consisting of: (i) a first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein, and (ii) a second nucleotide molecule encoding at least one single guide RNA (sgRNA), comprising, or consisting essentially of, or yet further consisting of a scaffold region and a spacer region; wherein the spacer region hybridizes to a nucleotide sequence complementary to a target sequence adjacent to a 5′-end of a protospacer adjacent motif (PAM); and wherein the target sequence and the PAM are located within about 1 kilobase (kb) of the transcriptional start site (TSS) of the cyclin dependent kinase-like 5 (CDKL5) gene.
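The positional requirements recited above (a target sequence immediately 5′ of a PAM, with both located within about 1 kb of the TSS) can be checked computationally. The following is a minimal sketch only. It assumes an NGG PAM as used by Streptococcus pyogenes Cas9, which this passage does not itself specify; it searches the forward strand only; and the function name, the coordinate convention, and the toy sequence are hypothetical. The toy sequence is not the CDKL5 promoter.

```python
import re


def find_target_sites(seq, target, tss_index, window=1000, pam_regex="[ACGT]GG"):
    """Return 0-based start positions where `target` occurs immediately 5' of
    a PAM and where both target and PAM lie within `window` nt of `tss_index`.
    Forward strand only; the NGG PAM pattern is an assumption."""
    seq = seq.upper()
    target = target.upper()
    sites = []
    start = seq.find(target)
    while start != -1:
        pam_start = start + len(target)
        pam = seq[pam_start:pam_start + 3]
        has_pam = re.fullmatch(pam_regex, pam) is not None
        pam_end = pam_start + 2  # index of the last PAM nucleotide
        within = (abs(start - tss_index) <= window
                  and abs(pam_end - tss_index) <= window)
        if has_pam and within:
            sites.append(start)
        start = seq.find(target, start + 1)
    return sites


# Toy example: one target sequence from this disclosure embedded in filler.
toy = "T" * 50 + "AGAGCATCGGACCGAAGCGG" + "TGG" + "T" * 50
print(find_target_sites(toy, "AGAGCATCGGACCGAAGCGG", tss_index=60))
```

A fuller implementation would also scan the reverse complement strand and take the TSS coordinate from a genome annotation.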
In some embodiments, the spacer region comprises, or consists essentially of, or yet further consists of a spacer sequence provided in Table 1.
In some embodiments, the gene editing system further comprises a third nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator.
In some embodiments, the at least one transcriptional activator fused to the dCas9 protein comprises, or consists essentially of, or consists of VP64 or a fragment thereof.
In some embodiments, the target sequence for the sgRNA comprises, or consists essentially of, or consists of one or more of AGAGCATCGGACCGAAGCGG, GGGGGAGAACATACTCGGGG, and/or CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the at least one sgRNA comprises a first sgRNA, a second sgRNA, and a third sgRNA, wherein the target sequence for the first sgRNA comprises or consists essentially of, or yet further consists of AGAGCATCGGACCGAAGCGG, wherein the target sequence for the second sgRNA comprises or consists essentially of, or yet further consists of GGGGGAGAACATACTCGGGG, and wherein the target sequence for the third sgRNA comprises or consists essentially of, or yet further consists of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the first nucleotide molecule, the second nucleotide molecule, and the third nucleotide molecule are integrated into one or more viral or plasmid vectors.
In some embodiments, the viral vector is selected from the group of a lentiviral vector, an adeno-associated viral (AAV) vector, or an adenoviral vector.
In one aspect, the disclosure provides a kit comprising the system as described herein and optional instructions for use in the methods as described herein.
In one aspect, the disclosure provides a host cell comprising the gene editing system.
In some embodiments, the host cell comprises a prokaryotic or a eukaryotic cell.
In some embodiments, the host cell comprises a mammalian or a human cell. In another aspect, the mammalian or host cell is a stem cell or progenitor cell, e.g., an iPSC, an embryonic stem cell or a stem cell with the capacity to differentiate into a specific lineage, e.g., neuronal lineage.
In some embodiments, the host cell as described herein has reduced CDKL5 gene expression and/or reduced DNA methylation in the CDKL5 promoter region.
In some embodiments, the host cell is a cultured cell or a primary cell.
In some embodiments, the host cell further comprises a therapeutic molecule.
In one aspect, the disclosure provides a pharmaceutical composition comprising the gene editing system, the vectors or the host cell comprising the gene editing system.
In some embodiments, the pharmaceutical composition comprises a carrier.
In some embodiments, the pharmaceutical composition comprises a pharmaceutically acceptable carrier or excipient.
In one aspect, the disclosure provides a method for increasing CDKL5 gene expression in a cell or subject in need thereof, the method comprising, or consisting essentially of, or yet further consisting of, administering to the subject the gene editing system or the pharmaceutical composition comprising, or consisting essentially of, or yet further consisting of, the gene editing system.
In some embodiments, a CDKL5 promoter region of the subject is methylated or hypermethylated, in one aspect as compared to a non-silenced X-chromosome.
In some embodiments, the CDKL5 promoter region is located on a silenced X-chromosomal allele of the subject.
In some embodiments, the subject has been diagnosed with CDKL5 deficiency disorder (CDD).
In some embodiments, a cell is isolated from a subject having been diagnosed with CDD.
In some embodiments, the cell is a neuronal cell.
In some embodiments, the gene editing system or the pharmaceutical composition is administered to the subject by one or more of: an intravenous route, a subcutaneous route, an intramuscular route, an intradermal route, an intranasal route, an oral route, an intracranial route, an intrathecal route, an ocular route, an otic route, a rectal route, a vaginal route, an optic route, or an intraperitoneal route.
In some embodiments, the subject to be treated is a mammal.
In some embodiments, the mammal is a non-human fetus, an infant, a juvenile, or an adult.
In some embodiments, a biological sample from the subject is analyzed for CDKL5 gene expression, prior to and/or after treatment.
In some embodiments, CDKL5 gene expression is analyzed by quantitative PCR using exon-spanning primers for CDKL5 and for the reference gene GAPDH. Exemplary primer oligonucleotides for analyzing CDKL5 gene expression are provided in Table 1.
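One common way to express quantitative PCR results relative to a reference gene is the comparative Ct (2^-ΔΔCt) calculation. The sketch below is illustrative only: this passage does not state which quantification method is used, the calculation assumes approximately 100% amplification efficiency for both primer pairs, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene (e.g., CDKL5) in treated versus control
    samples, normalized to a reference gene (e.g., GAPDH), by 2^-ddCt."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_treated - delta_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier after
# treatment while the reference gene is unchanged.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold)  # 4.0
```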
In one aspect, the disclosure provides a method for treating or preventing CDD in a subject in need thereof comprising administering to the subject the gene editing system or the pharmaceutical composition comprising the gene editing system. In one aspect, a biological system is analyzed for CDKL5 gene expression prior to or after treatment.
In some embodiments, DNA methylation in a CDKL5 promoter region of the subject is reduced, in one aspect, as compared to a wild-type gene.
In some embodiments, the CDKL5 promoter region is located on a silenced X-chromosomal allele of the subject.
In some embodiments, the gene editing system or pharmaceutical composition is administered to the subject by one or more of: an intravenous route, a subcutaneous route, an intramuscular route, an intradermal route, an intranasal route, an oral route, an intracranial route, an intrathecal route, an ocular route, an otic route, a rectal route, a vaginal route, an optic route, or an intraperitoneal route.
In some embodiments, the subject is a mammal. In some embodiments, the mammal is a non-human fetus, an infant, a juvenile, or an adult.
In some embodiments, genomic DNA isolated from the subject is analyzed for targeted DNA methylation.
In some embodiments, targeted DNA methylation is analyzed by bisulfite-sequencing PCR. Exemplary primers for bisulfite-sequencing PCR are provided in Table 1.
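In bisulfite sequencing, unmethylated cytosines are read as thymine after conversion and amplification, whereas methylated cytosines remain cytosine. The fraction of reads retaining a C at each CpG therefore estimates methylation at that site. The sketch below illustrates that calculation under simplifying assumptions: reads are already aligned to the top strand of the reference with no insertions or deletions, and the function name, reference, and reads are hypothetical toy data.

```python
def cpg_methylation(reference, reads):
    """Map each CpG cytosine position in `reference` (0-based) to the
    fraction of reads that retain a C there (i.e., were methylated)."""
    ref = reference.upper()
    cpg_positions = [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]
    result = {}
    for pos in cpg_positions:
        # Keep only informative calls: C (methylated) or T (converted).
        calls = [r[pos].upper() for r in reads
                 if len(r) > pos and r[pos].upper() in ("C", "T")]
        if calls:
            result[pos] = calls.count("C") / len(calls)
    return result


# Toy amplicon with CpG cytosines at positions 1 and 5.
reference = "ACGTTCGA"
reads = ["ACGTTTGA", "ATGTTTGA", "ACGTTCGA", "ACGTTTGA"]
print(cpg_methylation(reference, reads))  # {1: 0.75, 5: 0.25}
```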
In one aspect, the disclosure provides a vector encoding a sgRNA, wherein the sgRNA comprises a scaffold region and a spacer region, wherein the spacer region hybridizes to a nucleotide sequence complementary to a target sequence comprising, or consisting essentially of, or yet further consisting of one or more of AGAGCATCGGACCGAAGCGG, and/or GGGGGAGAACATACTCGGGG, and/or CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the spacer region comprises a spacer sequence provided in Table 1.
In some embodiments, the vector encodes a first sgRNA and a second sgRNA; wherein the first sgRNA and the second sgRNA each comprise (a) a scaffold region and (b) a spacer region that hybridizes to a nucleotide sequence complementary to a target sequence; and wherein: (i) the target sequence of the first sgRNA comprises or consists essentially of, or yet further consists of AGAGCATCGGACCGAAGCGG, and the target sequence of the second sgRNA comprises or consists essentially of, or yet further consists of GGGGGAGAACATACTCGGGG; (ii) the target sequence of the first sgRNA comprises or consists essentially of, or yet further consists of AGAGCATCGGACCGAAGCGG, and the target sequence of the second sgRNA comprises or consists essentially of, or yet further consists of CCCAGGTTGCTAGGGCTTGG; or (iii) the target sequence of the first sgRNA comprises or consists essentially of, or yet further consists of GGGGGAGAACATACTCGGGG, and the target sequence of the second sgRNA comprises or consists essentially of, or yet further consists of CCCAGGTTGCTAGGGCTTGG.
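The three pairings (i) to (iii) recited above are the unordered pairs that can be drawn from the three target sequences. A minimal sketch enumerating them follows; the variable names are hypothetical.

```python
from itertools import combinations

# The three target sequences recited in this disclosure.
targets = [
    "AGAGCATCGGACCGAAGCGG",
    "GGGGGAGAACATACTCGGGG",
    "CCCAGGTTGCTAGGGCTTGG",
]

# combinations() yields each unordered pair once, in list order, which
# corresponds to options (i), (ii), and (iii).
pairs = list(combinations(targets, 2))
for label, (first, second) in zip(("i", "ii", "iii"), pairs):
    print(f"({label}) first sgRNA target: {first}; second sgRNA target: {second}")
```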
In some embodiments, the vector encodes a first sgRNA, a second sgRNA, and a third sgRNA, wherein the first sgRNA, the second sgRNA, and the third sgRNA each comprise (a) a scaffold region and (b) a spacer region that hybridizes to a nucleotide sequence complementary to a target sequence, wherein the target sequence of the first sgRNA comprises or consists essentially of, or yet further consists of AGAGCATCGGACCGAAGCGG, wherein the target sequence of the second sgRNA comprises or consists essentially of, or yet further consists of GGGGGAGAACATACTCGGGG, and wherein the target sequence of the third sgRNA comprises or consists essentially of, or yet further consists of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the vector further comprises a nucleotide molecule encoding a dCas9-TET1CD fusion protein.
In some embodiments, the vector further comprises a nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator.
In some embodiments, the vector further comprises a first nucleotide molecule encoding a dCas9-TET1CD fusion protein and a second nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator.
In some embodiments, the transcriptional activator fused to the dCas9 protein comprises VP64 or a fragment thereof. In some embodiments, the vector is a viral vector or a plasmid vector.
In some embodiments, the viral vector is a lentiviral vector, an AAV vector, or an adenoviral vector.
In one aspect, the disclosure provides a host cell comprising the vector.
In one aspect, the disclosure provides a pharmaceutical composition comprising the vector or the host cell comprising the vector.
In some embodiments, the pharmaceutical composition comprises a carrier.
In some embodiments, the pharmaceutical composition comprises a pharmaceutically acceptable carrier or excipient.
Embodiments according to the present disclosure are described more fully hereinafter. Aspects of the disclosure may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. The terminology used in the description herein is for the purpose of describing particular embodiments only and is not intended to be limiting. Throughout and within this disclosure various technical and patent publications are referenced by a citation or an Arabic numeral. The full bibliographic citation for each reference identified by an Arabic numeral is found in the reference section, immediately preceding the claims.
It is to be appreciated that certain aspects, modes, embodiments, variations and features of the present methods are described below in various levels of detail in order to provide a substantial understanding of the present technology. The definitions of certain terms as used in the specification are provided below. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the present application and relevant art and should not be interpreted in an idealized or overly formal sense unless expressly so defined herein. While not explicitly defined below, such terms should be interpreted according to their common meaning.
The terminology used in the description herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. All publications, patent applications, patents and other references mentioned herein are incorporated by reference in their entirety.
Unless the context indicates otherwise, it is specifically intended that the various features of the invention described herein can be used in any combination. Moreover, the disclosure also contemplates that in some embodiments, any feature or combination of features set forth herein can be excluded or omitted. To illustrate, if the specification states that a complex comprises components A, B and C, it is specifically intended that any of A, B or C, or a combination thereof, can be omitted and disclaimed singularly or in any combination. The term “consisting essentially of” intends the recited elements and any additional elements that do not materially change the function of the recited element or elements.
Unless explicitly indicated otherwise, all specified embodiments, features, and terms intend to include both the recited embodiment, feature, or term and biological equivalents thereof.
All numerical designations, e.g., pH, temperature, time, concentration, and molecular weight, including ranges, are approximations which are varied (+) or (−) by increments of 1.0 or 0.1, as appropriate, or alternatively by a variation of +/−15%, or alternatively 10%, or alternatively 5%, or alternatively 2%. It is to be understood, although not always explicitly stated, that all numerical designations are preceded by the term “about”. It also is to be understood, although not always explicitly stated, that the reagents described herein are merely exemplary and that equivalents of such are known in the art.
The practice of the present technology employs, unless otherwise indicated, conventional techniques of tissue culture, immunology, molecular biology, microbiology, cell biology, and recombinant DNA, which are within the skill of the art. See, e.g., Green and Sambrook eds. (2012) Molecular Cloning: A Laboratory Manual, 4th edition; the series Ausubel et al. eds. (2015) Current Protocols in Molecular Biology; the series Methods in Enzymology (Academic Press, Inc., N.Y.); MacPherson et al. (2015) PCR 1: A Practical Approach (IRL Press at Oxford University Press); MacPherson et al. (1995) PCR 2: A Practical Approach; McPherson et al. (2006) PCR: The Basics (Garland Science); Harlow and Lane eds. (1999) Antibodies, A Laboratory Manual; Greenfield ed. (2014) Antibodies, A Laboratory Manual; Freshney (2010) Culture of Animal Cells: A Manual of Basic Technique, 6th edition; Gait ed. (1984) Oligonucleotide Synthesis; U.S. Pat. No. 4,683,195; Hames and Higgins eds. (1984) Nucleic Acid Hybridization; Anderson (1999) Nucleic Acid Hybridization; Herdewijn ed. (2005) Oligonucleotide Synthesis: Methods and Applications; Hames and Higgins eds. (1984) Transcription and Translation; Buzdin and Lukyanov eds. (2007) Nucleic Acids Hybridization: Modern Applications; Immobilized Cells and Enzymes (IRL Press (1986)); Grandi ed. (2007) In Vitro Transcription and Translation Protocols, 2nd edition; Guisan ed. (2006) Immobilization of Enzymes and Cells; Perbal (1988) A Practical Guide to Molecular Cloning, 2nd edition; Miller and Calos eds. (1987) Gene Transfer Vectors for Mammalian Cells (Cold Spring Harbor Laboratory); Makrides ed. (2003) Gene Transfer and Expression in Mammalian Cells; Mayer and Walker eds. (1987) Immunochemical Methods in Cell and Molecular Biology (Academic Press, London); Lundblad and Macdonald eds. (2010) Handbook of Biochemistry and Molecular Biology, 4th edition; Herzenberg et al. eds. (1996) Weir's Handbook of Experimental Immunology, 5th edition; and/or more recent editions thereof.
As used herein, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise.
As used herein, the term “about,” when referring to a measurable value such as an amount or concentration and the like, is meant to encompass variations of 20%, 10%, 5%, 1%, 0.5%, or even 0.1% of the specified amount.
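As a worked illustration of this definition, the range encompassed by “about” a given value can be computed for any of the listed percentages. The sketch below is illustrative only; the function name is hypothetical, and the example applies the 20% variation to a 1 kb (1000 base pair) distance.

```python
def about_range(value, percent):
    """Return the (low, high) bounds encompassed by 'about value' when a
    variation of +/- percent of the specified amount is allowed."""
    delta = abs(value) * percent / 100.0
    return (value - delta, value + delta)


# "About 1 kb" under each variation listed in the definition above.
for pct in (20, 10, 5, 1, 0.5, 0.1):
    low, high = about_range(1000, pct)
    print(f"+/-{pct}%: {low:.1f} to {high:.1f} bp")
```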
As used herein, the terms “acceptable,” “effective,” or “sufficient” when used to describe the selection of any components, ranges, dose forms, etc. disclosed herein intend that said component, range, dose form, etc. is suitable for the disclosed purpose.
As used herein, the term “adeno-associated virus” or “AAV” refers to a member of the class of viruses associated with this name and belonging to the genus dependoparvovirus, family Parvoviridae. Multiple serotypes of this virus are known to be suitable for gene delivery; all known serotypes can infect cells from various tissue types. At least 11 or 12, sequentially numbered, are disclosed in the prior art. Non-limiting exemplary serotypes useful in the gene editing systems, host cells, pharmaceutical compositions, vectors, and methods disclosed herein include any of the 11 or 12 serotypes, e.g., AAV2, AAV5, and AAV8, or variant serotypes, e.g. AAV-DJ. The AAV structural particle is composed of 60 protein molecules made up of VP1, VP2 and VP3. Each particle contains approximately 5 VP1 proteins, 5 VP2 proteins and 50 VP3 proteins ordered into an icosahedral structure.
As used herein, the term “administering” a compound or composition to a subject means delivering the compound to the subject. “Administering” includes prophylactic administration of the compound or composition (i.e., before the disease and/or one or more symptoms of the disease are detectable) and/or therapeutic administration of the composition (i.e., after the disease and/or one or more symptoms of the disease are detectable). The methods of the present technology include administering one or more compounds or agents.
If more than one compound is to be administered, the compounds may be administered together at substantially the same time, and/or administered at different times in any order.
Also, the compounds of the present technology may be administered before, concomitantly with, and/or after administration of another type of drug or therapeutic procedure (e.g., surgery).
As used herein, “ameliorate,” “ameliorating,” and the like refer to inhibiting, relieving, eliminating, or slowing progression of one or more symptoms.
As used herein, “and/or” refers to and encompasses any and all possible combinations of one or more of the associated listed items, as well as the lack of combinations when interpreted in the alternative (“or”).
As used herein, the term “aptamer” refers to single stranded DNA or RNA molecules that can bind to one or more selected targets with high affinity and specificity. Non-limiting exemplary targets include but are not limited to proteins or peptides.
As used herein, the term “Cas9” refers to a CRISPR-associated, RNA-guided endonuclease such as Streptococcus pyogenes Cas9 (spCas9) and orthologs and biological equivalents thereof. Biological equivalents of Cas9 include but are not limited to C2c1 from Alicyclobacillus acidoterrestris and Cpf1 (which performs cutting functions analogous to Cas9) from various bacterial species including Acidaminococcus spp. and Francisella novicida U112. Cas9 may refer to an endonuclease that causes double stranded breaks in DNA, a nickase variant such as a RuvC or HNH mutant that causes a single stranded break in DNA, as well as other variations such as dead Cas9 or dCas9, which lack endonuclease activity. Cas9 may also refer to “split-Cas9” in which Cas9 is split into two halves, C-Cas9 and N-Cas9, and fused with two intein moieties. See, e.g., U.S. Pat. No. 9,074,199 B1; Zetsche et al., Nat Biotechnol. 33(2):139-42 (2015); Wright et al., PNAS 112(10) 2984-89 (2015).
As used herein, the term “cell” or “host cell” may refer to either a prokaryotic or eukaryotic cell, optionally obtained from a subject or a commercially available source.
As used herein, the term “CRISPR” refers to Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR). CRISPR may also refer to a technique or system of sequence-specific genetic manipulation relying on the CRISPR pathway. A CRISPR recombinant expression system can be programmed to cleave a target polynucleotide using a CRISPR endonuclease and a guide RNA. A CRISPR system can be used to cause double stranded or single stranded breaks in a target polynucleotide. A CRISPR system can also be used to recruit proteins or label a target polynucleotide. In some aspects, CRISPR-mediated gene editing utilizes the pathways of nonhomologous end-joining (NHEJ) or homologous recombination to perform the edits. These applications of CRISPR technology are known and widely practiced in the art. See, e.g., U.S. Pat. No. 8,697,359, and Hsu et al., Cell 156(6): 1262-1278 (2014).
As used herein, the term “comprising” is intended to mean that the compositions and methods include the recited elements, but do not exclude others.
As used herein, the transitional phrase “consisting essentially of” (and grammatical variants) is to be interpreted as encompassing the recited materials or steps “and those that do not materially affect the basic and novel characteristic(s)” of the recited embodiment. Thus, the term “consisting essentially of” as used herein should not be interpreted as equivalent to “comprising.” “Consisting of” shall mean excluding more than trace elements of other ingredients and substantial method steps for administering the compositions disclosed herein. Aspects defined by each of these transition terms are within the scope of the present disclosure.
As used herein, the term “effective amount” or “therapeutically effective amount” refers to the amount of an agent that is sufficient to effect beneficial or desired results. The therapeutically effective amount may vary depending upon one or more of: the subject and disease condition being treated, the weight and age of the subject, the severity of the disease condition, the manner of administration and the like, which can readily be determined by one of ordinary skill in the art. The specific dose may vary depending on one or more of: the particular agent chosen, the dosing regimen to be followed, whether it is administered in combination with other compounds, timing of administration, the route of administration, and the physical delivery system in which it is carried.
In some embodiments, “effective amount” or “therapeutically effective amount” refers to a quantity sufficient to achieve a desired therapeutic and/or prophylactic effect, e.g., an amount which results in the full or partial amelioration of disease or disorders or symptoms associated with mitochondrial dysfunction, neurological disease, lack of energy, glycolytic process dysfunction or cellular respiration related dysfunction in a subject in need thereof. In the context of therapeutic or prophylactic applications, the amount of a composition administered to the subject will depend on the type and severity of the disease and on the characteristics of the individual, such as general health, age, sex, body weight and tolerance to drugs. It will also depend on the degree, severity and type of disease. A person of ordinary skill in the art will be able to determine appropriate dosages depending on these and other factors. The compositions can also be administered in combination with one or more additional compounds. Multiple doses may be administered. Additionally or alternatively, multiple therapeutic compositions or compounds may be administered. In the methods described herein, the compounds may be administered to a subject having one or more signs or symptoms of a disease or disorder described herein.
As used herein, the term “encode” as it is applied to nucleic acid sequences refers to a polynucleotide which is said to “encode” a polypeptide if, in its native state or when manipulated by methods well known to those skilled in the art, it can be transcribed and/or translated to produce the mRNA for the polypeptide and/or a fragment thereof. The antisense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.
As used herein, the term “endonuclease” refers to any suitable endonuclease enzyme protein or a variant thereof that will be specifically directed by the selected guide polynucleotide to enzymatically knock-out the target sequence of the guide polynucleotide.
As used herein, the term “variant thereof,” as used with respect to an endonuclease, refers to the referenced endonuclease in its enzymatically functional form expressed in any suitable host organism or expression system and/or including any modifications to enhance the enzymatic activity of the endonuclease.
In some embodiments of the present disclosure, a suitable endonuclease includes a CRISPR-associated sequence 9 (Cas9) endonuclease or a variant thereof, a CRISPR-associated sequence 13 (Cas13) endonuclease or a variant thereof, CRISPR-associated sequence 6 (Cas6) endonuclease or a variant thereof, a CRISPR from Prevotella and Francisella 1 (Cpf1) endonuclease or a variant thereof, or a CRISPR from Microgenomates and Smithella 1 (Cms1) endonuclease or a variant thereof. In some embodiments of the present disclosure, a suitable endonuclease includes a Streptococcus pyogenes Cas9 (SpCas9), a Staphylococcus aureus Cas9 (SaCas9), a Francisella novicida Cas9 (FnCas9), or a variant thereof. Variants may include a protospacer adjacent motif (PAM) SpCas9 (xCas9), high fidelity SpCas9 (SpCas9-FIF1), a high fidelity SaCas9, or a high fidelity FnCas9.
In some embodiments of the present disclosure, the endonuclease comprises a Cas fusion nuclease comprising a Cas9 protein or a variant thereof fused with a Fok1 nuclease or variant thereof. Variants of the Cas9 protein of this fusion nuclease include a catalytically inactive Cas9 (e.g., dead Cas9). In some embodiments of the present disclosure, the endonuclease may be a Cas9, Cas13, Cas6, Cpf1, Cms1 protein, or any variant thereof that is derived or expressed from Methanococcus maripaludis C7, Corynebacterium diphtheriae, Corynebacterium efficiens YS-314, Corynebacterium glutamicum (ATCC 13032), Corynebacterium glutamicum R, Corynebacterium kroppenstedtii (DSM 44385), Mycobacterium abscessus (ATCC 19977), Nocardia farcinica IFM 10152, Rhodococcus erythropolis PR4, Rhodococcus jostii RHA1, Rhodococcus opacus B4 (uid36573), Acidothermus cellulolyticus 11B, Arthrobacter chlorophenolicus A6, Kribbella flavida (DSM 17836, uid43465), Thermomonospora curvata (DSM 43183), Bifidobacterium dentium Bd1, Bifidobacterium longum DJO10A, Slackia heliotrinireducens (DSM 20476), Persephonella marina EX H1, Bacteroides fragilis NCTC 9434, Capnocytophaga ochracea (DSM 7271), Flavobacterium psychrophilum JIP02 86, Akkermansia muciniphila (ATCC BAA 835), Roseiflexus castenholzii (DSM 13941), Roseiflexus RS1, Synechocystis PCC6803, Elusimicrobium minutum Pei191, uncultured Termite group 1 bacterium phylotype Rs D17, Fibrobacter succinogenes S85, Bacillus cereus (ATCC 10987), Listeria innocua, Lactobacillus casei, Lactobacillus rhamnosus GG, Lactobacillus salivarius UCC118, Streptococcus agalactiae-5-A909, Streptococcus agalactiae NEM316, Streptococcus agalactiae 2603, Streptococcus dysgalactiae equisimilis GGS 124, Streptococcus equi zooepidemicus MGCS10565, Streptococcus gallolyticus UCN34 (uid46061), Streptococcus gordonii Challis subst CH1, Streptococcus mutans NN2025 (uid46353), Streptococcus mutans, Streptococcus pyogenes M1 GAS, Streptococcus pyogenes MGAS5005, Streptococcus pyogenes MGAS2096, Streptococcus pyogenes MGAS9429, Streptococcus pyogenes MGAS10270, Streptococcus pyogenes MGAS6180, Streptococcus pyogenes MGAS315, Streptococcus pyogenes SSI-1, Streptococcus pyogenes MGAS10750, Streptococcus pyogenes NZ131, Streptococcus thermophilus CNRZ1066, Streptococcus thermophilus LMD-9, Streptococcus thermophilus LMG 18311, Clostridium botulinum A3 Loch Maree, Clostridium botulinum B Eklund 17B, Clostridium botulinum Ba4 657, Clostridium botulinum F Langeland, Clostridium cellulolyticum H10, Finegoldia magna (ATCC 29328), Eubacterium rectale (ATCC 33656), Mycoplasma gallisepticum, Mycoplasma mobile 163K, Mycoplasma penetrans, Mycoplasma synoviae 53, Streptobacillus moniliformis (DSM 12112), Bradyrhizobium BTAi1, Nitrobacter hamburgensis X14, Rhodopseudomonas palustris BisB18, Rhodopseudomonas palustris BisB5, Parvibaculum lavamentivorans DS-1, Dinoroseobacter shibae DFL 12, Gluconacetobacter diazotrophicus Pal 5 FAPERJ, Gluconacetobacter diazotrophicus Pal 5 JGI, Azospirillum B510 (uid46085), Rhodospirillum rubrum (ATCC 11170), Diaphorobacter TPSY (uid29975), Verminephrobacter eiseniae EF01-2, Neisseria meningitidis 053442, Neisseria meningitidis alpha14, Neisseria meningitidis Z2491, Desulfovibrio salexigens DSM 2638, Campylobacter jejuni doylei 269 97, Campylobacter jejuni 81116, Campylobacter jejuni, Campylobacter lari RM2100, Helicobacter hepaticus, Wolinella succinogenes, Tolumonas auensis DSM 9187, Pseudoalteromonas atlantica T6c, Shewanella pealeana (ATCC 700345), Legionella pneumophila Paris, Actinobacillus succinogenes 130Z, Pasteurella multocida, Francisella tularensis novicida U112, Francisella tularensis holarctica, Francisella tularensis FSC 198, Francisella tularensis, Francisella tularensis WY96-3418, or Treponema denticola (ATCC 35405).
As used herein, the terms “equivalent” or “biological equivalent” are used interchangeably when referring to a particular molecule, biological, or cellular material and intend those having minimal homology while still maintaining desired structure or functionality.
As used herein, the term “expression” refers to the process by which polynucleotides are transcribed into mRNA and/or the process by which the transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins. If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell. The expression level of a gene may be determined by measuring the amount of mRNA or protein in a cell or tissue sample; further, the expression level of multiple genes can be determined to establish an expression profile for a particular sample.
As used herein, the term “functional” may be used to modify any molecule, biological, or cellular material to intend that it accomplishes a particular, specified effect.
As used herein, the term “guide polynucleotide” refers to a polynucleotide having a “synthetic sequence” capable of binding the corresponding endonuclease enzyme protein (e.g., Cas9) and a variable target sequence capable of binding the genomic target (e.g., a nucleotide sequence found in an exon of a target gene). In some embodiments of the present disclosure, a guide polynucleotide is a guide ribonucleic acid (gRNA). In some embodiments, the variable target sequence of the guide polynucleotide is any sequence within the target that is unique with respect to the rest of the genome and is immediately adjacent to a Protospacer Adjacent Motif (PAM). The exact sequence of the PAM sequence may vary as different endonucleases require different PAM sequences.
As used herein, “homology” or “identity” or “similarity” refers to sequence similarity between two peptides or between two nucleic acid molecules. Homology can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences. An “unrelated” or “non-homologous” sequence shares less than 40% identity, or alternatively less than 25% identity, with one of the sequences of the present invention.
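The position-by-position comparison described above can be illustrated with the following minimal Python sketch. It assumes the two sequences have already been aligned to equal length (with “-” marking gaps) and counts gap positions as mismatches; both choices, the function names, and the example sequences are assumptions made for illustration, and alignment programs may compute identity differently. The 40% default mirrors the “unrelated” threshold recited above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions occupied by the same base or amino acid.

    Assumes both inputs are pre-aligned and of equal length; a position
    where either sequence has a gap ('-') is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_unrelated(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """True if identity falls below the threshold (40% by default)."""
    return percent_identity(aligned_a, aligned_b) < threshold


# Illustrative sequences only: 8 of 10 aligned positions match.
print(percent_identity("ACGTACGTAC", "ACGTACGTTT"))
```

Passing `threshold=25.0` applies the alternative 25% cutoff.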
As used herein, “hybridization” or “hybridizes” refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. The complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction may constitute a step in a more extensive process, such as the initiation of a PCR reaction, or the enzymatic cleavage of a polynucleotide by a ribozyme.
Examples of stringent hybridization conditions include: incubation temperatures of about 25° C. to about 37° C.; hybridization buffer concentrations of about 6× saline-sodium citrate (“SSC”) to about 10×SSC; formamide concentrations of about 0% to about 25%; and wash solutions from about 4×SSC to about 8×SSC. Examples of moderate hybridization conditions include: incubation temperatures of about 40° C. to about 50° C.; buffer concentrations of about 9×SSC to about 2×SSC; formamide concentrations of about 30% to about 50%; and wash solutions of about 5×SSC to about 2×SSC. Examples of high stringency conditions include: incubation temperatures of about 55° C. to about 68° C.; buffer concentrations of about 1×SSC to about 0.1×SSC; formamide concentrations of about 55% to about 75%; and wash solutions of about 1×SSC, 0.1×SSC, or deionized water. In general, hybridization incubation times are from 5 minutes to 24 hours, with 1, 2, or more washing steps, and wash incubation times are about 1, 2, or 15 minutes. SSC is 0.15 M sodium chloride (“NaCl”) and 15 mM citrate buffer. It is understood that equivalents of SSC using other buffer systems can be employed.
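For illustration, the SSC arithmetic and the incubation temperature ranges recited above can be expressed as a short Python sketch. The function and constant names are hypothetical, and the temperature ranges are treated as hard bounds purely for simplicity, even though the text recites them as “about” values.

```python
from typing import List, Tuple

# 1x SSC composition as recited above: 0.15 M NaCl and 15 mM citrate.
SSC_1X_NACL_M = 0.15
SSC_1X_CITRATE_M = 0.015

# Incubation temperature ranges (degrees C) recited above, keyed by the
# labels used in the text. Treated as hard bounds here (a simplification).
INCUBATION_RANGES_C = {
    "stringent": (25.0, 37.0),
    "moderate": (40.0, 50.0),
    "high stringency": (55.0, 68.0),
}


def ssc_molarity(fold: float) -> Tuple[float, float]:
    """Return (NaCl molarity, citrate molarity) for an n-fold SSC solution."""
    if fold < 0:
        raise ValueError("fold concentration must be non-negative")
    return fold * SSC_1X_NACL_M, fold * SSC_1X_CITRATE_M


def regimes_for_temperature(temp_c: float) -> List[str]:
    """List the recited regimes whose incubation range contains temp_c."""
    return [
        name
        for name, (low, high) in INCUBATION_RANGES_C.items()
        if low <= temp_c <= high
    ]


nacl, citrate = ssc_molarity(6)
print(f"6x SSC: {nacl:.2f} M NaCl, {citrate:.3f} M citrate")
print(regimes_for_temperature(42.0))
```

A temperature that falls between the recited ranges (for example 38° C.) returns an empty list under this simplification.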
As used herein, the term “isolated” refers to molecules or biologicals or cellular materials being substantially free from other materials.
As used herein, the term “lentivirus” refers to a member of the class of viruses associated with this name and belonging to the genus Lentivirus, family Retroviridae. While some lentiviruses are known to cause diseases, other lentiviruses are known to be suitable for gene delivery. See, e.g., Tomas et al. (2013) Biochemistry, Genetics and Molecular Biology: “Gene Therapy—Tools and Potential Applications,” ISBN 978-953-51-1014-9, DOI: 10.5772/52534.
As used herein, the terms “nucleic acid sequence,” “nucleotide sequence,” and “polynucleotide” are used interchangeably to refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. Thus, this term includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.
As used herein, the term “organ” refers to a structure which is a specific portion of an individual organism, where a certain function or functions of the individual organism is locally performed and which is morphologically separate. Non-limiting examples of organs include the skin, blood vessels, cornea, thymus, kidney, heart, liver, umbilical cord, intestine, nerve, lung, placenta, pancreas, thyroid and brain.
As used herein, the term “ortholog” is used in reference of another gene or protein and intends a homolog of said gene or protein that evolved from the same ancestral source.
Orthologs may or may not retain the same function as the gene or protein to which they are orthologous. Non-limiting examples of Cas9 orthologs include S. aureus Cas9 (“SaCas9”), S. thermophilus Cas9, L. pneumophila Cas9, N. lactamica Cas9, N. meningitidis Cas9, B. longum Cas9, A. muciniphila Cas9, and O. laneus Cas9.
As used herein, “prevention,” “prevents,” or “preventing” of a disorder or condition refers to a compound that, in a statistical sample, reduces the occurrence of the disorder, symptom, or condition in the treated sample relative to a control subject, or delays the onset of one or more symptoms of the disorder or condition relative to the control subject.
As used herein, the term “promoter” refers to any sequence that regulates the expression of a coding sequence, such as a gene, including a region of DNA that initiates transcription of a particular gene. The promoter includes the core promoter, which is the minimal portion of the promoter required to properly initiate transcription and can also include regulatory elements such as transcription factor binding sites. The regulatory elements may promote transcription or inhibit transcription. Regulatory elements in the promoter can be binding sites for transcriptional activators or transcriptional repressors. A promoter can be constitutive or inducible. A constitutive promoter refers to one that is always active and/or constantly directs transcription of a gene above a basal level of transcription. An inducible promoter is one which is capable of being induced by a molecule or a factor added to the cell or expressed in the cell. An inducible promoter may still produce a basal level of transcription in the absence of induction, but induction typically leads to significantly more production of the protein. Promoters can also be tissue specific. A tissue specific promoter allows for the production of a protein in a certain population of cells that have the appropriate transcriptional factors to activate the promoter.
Promoters may be constitutive, inducible, repressible, or tissue-specific, for example. A “promoter” is a control sequence that is a region of a polynucleotide sequence at which initiation and rate of transcription are controlled. It may contain genetic elements at which regulatory proteins and molecules may bind such as RNA polymerase and other transcription factors. Non-limiting exemplary promoters include the CDKL5 promoter, SCML2 promoter, COL9A3 promoter, MECP2 promoter, CMV promoter, U6 promoter, and phosphoglycerate kinase 1 (PGK) promoter, as well as the SFFV, MNDU3, SV40, EF1a, UBC and CAGG promoters. Non-limiting exemplary promoter sequences are provided herein below:
CMV Promoter
ATACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGG GGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAA TGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACG TATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGT ATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTAC GCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTAC ATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTAT TACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTTTGAC TCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGC ACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGC AAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAG TGAACCGTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTGACCTCCATAGAAG ACACCGGGACCGATCCAGCCTCCGGACTCTAGAGGATCGAACCCTT, or a biological equivalent thereof.
U6 Promoter
GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCT GTTAGAGAGATAATTAGAATTAATTTGACTGTAAACACAAAGATATTAGTACAA AATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTAT GTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTG GCTTTATATATCTTGTGGAAAGGACGAAACACC, or a biological equivalent thereof.
A number of effector elements are disclosed herein for use in these vectors; e.g., a tetracycline response element (e.g., tetO), a tet-regulatable activator, T2A, VP64, RtA, KRAB, and a miRNA sensor circuit. The nature and function of these effector elements are commonly understood in the art and a number of these effector elements are commercially available. Non-limiting exemplary sequences thereof are disclosed herein and further description thereof is provided herein below.
As used herein, the term “protein”, “peptide” and “polypeptide” are used interchangeably and in their broadest sense to refer to a compound of two or more subunits of amino acids, amino acid analogs or peptidomimetics. The subunits may be linked by peptide bonds. In another aspect, the subunit may be linked by other bonds, e.g., ester, ether, etc. A protein or peptide must contain at least two amino acids and no limitation is placed on the maximum number of amino acids which may comprise a protein's or peptide's sequence. As used herein the term “amino acid” refers to either natural and/or unnatural or synthetic amino acids, including glycine and both the D and L optical isomers, amino acid analogs and peptidomimetics.
As used herein, “protospacer adjacent motif” (PAM) refers to a short nucleotide sequence adjacent to a target sequence (protospacer) that is recognized (targeted) by a sgRNA/Cas endonuclease system described herein. The sequence and length of a PAM herein can differ depending on the Cas protein or Cas protein complex used. The PAM sequence can be of any length but is typically 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides long. The PAM sequence plays a key role in target recognition by licensing sgRNA base pairing to the protospacer sequence (Szczelkun et al., Proc. Natl. Acad. Sci. U.S.A 111: 9798-803 (2014)).
As used herein, the term “recombinant expression system” refers to a genetic construct for the expression of certain genetic material formed by recombination.
As used herein, the term “sgRNA” or “single guide RNA” refers to the guide RNA sequences used to target specific genes for correction employing the CRISPR technique. Techniques of designing sgRNAs and donor therapeutic polynucleotides for target specificity are well known in the art. See, for example, Doench et al., Nature Biotechnology 32(12):1262-7 (2014), Mohr et al., FEBS J. 283: 3232-38 (2016), and Graham et al., Genome Biol. 16:260 (2015). sgRNA comprises or alternatively consists essentially of, or yet further consists of a fusion polynucleotide comprising CRISPR RNA (crRNA; i.e., a spacer region) and trans-activating CRISPR RNA (tracrRNA; i.e., a scaffold region); or a polynucleotide comprising crRNA (i.e., a spacer region) and tracrRNA (i.e., a scaffold region). In some aspects, a sgRNA is synthetic (Kelley et al., J of Biotechnology 233:74-83 (2016)).
As used herein, the terms “subject,” “individual,” or “patient” can be an individual organism, a vertebrate, a mammal, or a human. “Mammal” includes a human, non-human mammal, non-human primate, murine (e.g., mouse, rat, guinea pig, hamster), ovine, bovine, ruminant, lagomorph, porcine, caprine, equine, canine, feline, avis, etc. In any embodiment herein, the mammal is feline or canine. In any embodiment herein, the mammal is human.
As used herein, “target sequence” refers to a nucleotide sequence adjacent to a 5′-end of a protospacer adjacent motif (PAM). Being “adjacent” herein means being within 1 to 8 nucleotides of the site of reference, including being “immediately adjacent,” which means that there are no intervening nucleotides between the immediately adjacent nucleotide sequences and the immediately adjacent nucleotide sequences are within one nucleotide of each other.
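As a non-limiting illustration of the relationship between a target sequence and its PAM, the following Python sketch lists candidate target sequences that lie immediately adjacent to the 5′-end of a PAM on a single strand. The sketch assumes a 5′-NGG-3′ PAM and a 20-nucleotide target, values commonly associated with SpCas9 that are not recited in this definition; other endonucleases require different PAM sequences. The function name, parameters, and example sequence are hypothetical.

```python
import re
from typing import List, Tuple


def find_target_sequences(
    dna: str, pam_regex: str = "[ACGT]GG", target_len: int = 20
) -> List[Tuple[int, str, str]]:
    """Return (start, target, pam) for each target immediately 5' of a PAM.

    Only the strand given is scanned. The default NGG PAM and 20-nt target
    length are assumptions made for illustration.
    """
    dna = dna.upper()
    hits = []
    # A lookahead is used so that overlapping PAM matches are all reported.
    for match in re.finditer(f"(?=({pam_regex}))", dna):
        pam_start = match.start()
        start = pam_start - target_len
        if start >= 0:  # skip PAMs too close to the 5' end of the input
            hits.append((start, dna[start:pam_start], match.group(1)))
    return hits


# Illustrative sequence only.
for start, target, pam in find_target_sequences("GATTACAGATTACAGATTACATGGCC"):
    print(start, target, pam)
```

Scanning the reverse complement of the input in the same way would report candidates on the opposite strand.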
As used herein, “target site” refers to a site of the target sequence including both the target sequence and its complementary sequence, for example, in double stranded nucleotides. The target site described herein may mean a nucleotide sequence hybridizing to a sgRNA spacer region, a complementary nucleotide sequence of the nucleotide sequence hybridizing to a sgRNA spacer region, and/or a nucleotide sequence adjacent to the 5′-end of a PAM. Full complementarity of a sgRNA spacer region with a target site is not necessarily required, provided there is sufficient complementarity to cause hybridization and promote formation of a CRISPR complex. A target sequence or target site may comprise any polynucleotide, such as DNA or RNA polynucleotides. In some embodiments, a target sequence or target site is located in the nucleus or cytoplasm of a cell. In some embodiments, the target sequence or target site may be within an organelle of a eukaryotic cell, for example, mitochondrion or chloroplast.
As used herein, the term “tissue” refers to tissue of a living or deceased organism or any tissue derived from or designed to mimic a living or deceased organism. The tissue may be healthy, diseased, and/or have genetic mutations. The biological tissue may include any single tissue (e.g., a collection of cells that may be interconnected) or a group of tissues making up an organ or part or region of the body of an organism. The tissue may comprise a homogeneous cellular material or it may be a composite structure such as that found in regions of the body including the thorax which for instance can include lung tissue, skeletal tissue, and/or muscle tissue. Exemplary tissues include, but are not limited to those derived from liver, lung, thyroid, skin, pancreas, blood vessels, bladder, kidneys, brain, biliary tree, duodenum, abdominal aorta, iliac vein, heart and intestines, including any combination thereof.
As used herein, “treating” or “treatment” of a disease in a subject refers to (1) preventing the symptoms or disease from occurring in a subject that is predisposed or does not yet display symptoms of the disease; (2) inhibiting the disease or arresting its development; or (3) ameliorating or causing regression of the disease or the symptoms of the disease. As understood in the art, “treatment” is an approach for obtaining beneficial or desired results, including clinical results. For the purposes of the present technology, beneficial or desired results can include one or more, but are not limited to, alleviation or amelioration of one or more symptoms, diminishment of extent of a condition (including a disease), stabilized (i.e., not worsening) state of a condition (including disease), delay or slowing of condition (including disease) progression, amelioration or palliation of the condition (including disease) state, and remission (whether partial or total), whether detectable or undetectable. In one aspect, the term “treatment” excludes prevention or prophylaxis.
As used herein, “stem cell” defines a cell with the ability to divide for indefinite periods in culture and give rise to specialized cells. At this time and for convenience, stem cells are categorized as somatic (adult) or embryonic. A somatic stem cell is an undifferentiated cell found in a differentiated tissue that can renew itself (clonal) and (with certain limitations) differentiate to yield all the specialized cell types of the tissue from which it originated. An embryonic stem cell is a primitive (undifferentiated) cell from the embryo that has the potential to become a wide variety of specialized cell types. An embryonic stem cell is one that has been cultured under in vitro conditions that allow proliferation without differentiation for months to years. A clone is a line of cells that is genetically identical to the originating cell; in this case, a stem cell.
A population of cells intends a collection of more than one cell that is identical (clonal) or non-identical in phenotype and/or genotype. A substantially homogenous population of cells is a population having at least 70%, or alternatively at least 75%, or alternatively at least 80%, or alternatively at least 85%, or alternatively at least 90%, or alternatively at least 95%, or alternatively at least 98% identical phenotype, as measured by pre-selected markers.
As used herein, “embryonic stem cells” refers to stem cells derived from tissue formed after fertilization but before the end of gestation, including pre-embryonic tissue (such as, for example, a blastocyst), embryonic tissue, or fetal tissue taken any time during gestation, typically but not necessarily before approximately 10-12 weeks gestation. Most frequently, embryonic stem cells are pluripotent cells derived from the early embryo or blastocyst. Embryonic stem cells can be obtained directly from suitable tissue, including, but not limited to human tissue, or from established embryonic cell lines. “Embryonic-like stem cells” refer to cells that share one or more, but not all characteristics, of an embryonic stem cell.
A neural stem cell is a cell that can be isolated from the adult central nervous systems of mammals, including humans. They have been shown to generate neurons, migrate and send out axonal and dendritic projections and integrate into pre-existing neuronal circuits and contribute to normal brain function. Reviews of research in this area are found in Miller (2006) The Promise of Stem Cells for Neural Repair, Brain Res. Vol. 1091(1):258-264; Pluchino et al. (2005) Neural Stem Cells and Their Use as Therapeutic Tool in Neurological Disorders, Brain Res. Brain Res. Rev., Vol. 48(2):211-219; and Goh, et al. (2003) Adult Neural Stem Cells and Repair of the Adult Central Nervous System, J. Hematother. Stem Cell Res., Vol. 12(6):671-679.
As used herein, the term “differentiation” describes the process whereby an unspecialized cell acquires the features of a specialized cell such as a heart, liver, or muscle cell. “Directed differentiation” refers to the manipulation of stem cell culture conditions to induce differentiation into a particular cell type. “Dedifferentiated” defines a cell that reverts to a less committed position within the lineage of a cell. As used herein, the term “differentiates or differentiated” defines a cell that takes on a more committed (“differentiated”) position within the lineage of a cell. As used herein, “a cell that differentiates into a mesodermal (or ectodermal or endodermal) lineage” defines a cell that becomes committed to a specific mesodermal, ectodermal or endodermal lineage, respectively. Examples of cells that differentiate into a mesodermal lineage or give rise to specific mesodermal cells include, but are not limited to, cells that are adipogenic, leiomyogenic, chondrogenic, cardiogenic, dermatogenic, hematopoietic, hemangiogenic, myogenic, nephrogenic, urogenitogenic, osteogenic, pericardiogenic, or stromal.
Induced pluripotent stem cells are examples of dedifferentiated cells.
As used herein, the “lineage” of a cell defines the heredity of the cell, i.e. its predecessors and progeny. The lineage of a cell places the cell within a hereditary scheme of development and differentiation.
A “multi-lineage stem cell” or “multipotent stem cell” refers to a stem cell that reproduces itself and at least two further differentiated progeny cells from distinct developmental lineages. The lineages can be from the same germ layer (i.e. mesoderm, ectoderm or endoderm), or from different germ layers. An example of two progeny cells with distinct developmental lineages from differentiation of a multilineage stem cell is a myogenic cell and an adipogenic cell (both are of mesodermal origin, yet give rise to different tissues). Another example is a neurogenic cell (of ectodermal origin) and adipogenic cell (of mesodermal origin).
A “precursor” or “progenitor cell” intends to mean cells that have a capacity to differentiate into a specific type of cell. A progenitor cell may be a stem cell. A progenitor cell may also be more specific than a stem cell. A progenitor cell may be unipotent or multipotent. Compared to adult stem cells, a progenitor cell may be in a later stage of cell differentiation. An example of progenitor cell includes, without limitation, a progenitor nerve cell.
A “parthenogenetic stem cell” refers to a stem cell arising from parthenogenetic activation of an egg. Methods of creating a parthenogenetic stem cell are known in the art. See, for example, Cibelli et al. (2002) Science 295(5556):819 and Vrana et al. (2003) Proc. Natl. Acad. Sci. USA 100 (Suppl. 1) 11911-6.
As used herein, a “pluripotent cell” defines a less differentiated cell that can give rise to at least two distinct (genotypically and/or phenotypically) further differentiated progeny cells. In another aspect, a “pluripotent cell” includes an Induced Pluripotent Stem Cell (iPSC) which is an artificially derived stem cell from a non-pluripotent cell, typically an adult somatic cell, that has historically been produced by inducing expression of one or more stem cell specific genes. Such stem cell specific genes include, but are not limited to, the family of octamer transcription factors, i.e. Oct-3/4; the family of Sox genes, i.e., Sox1, Sox2, Sox3, Sox 15 and Sox 18; the family of Klf genes, i.e. Klf1, Klf2, Klf4 and Klf5; the family of Myc genes, i.e. c-myc and L-myc; the family of Nanog genes, i.e., OCT4, NANOG and REX1; or LIN28. Examples of iPSCs are described in Takahashi et al. (2007) Cell advance online publication 20 Nov. 2007; Takahashi & Yamanaka (2006) Cell 126:663-76; Okita et al. (2007) Nature 448:260-262; Yu et al. (2007) Science advance online publication 20 Nov. 2007; and Nakagawa et al. (2007) Nat. Biotechnol. Advance online publication 30 Nov. 2007.
As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. One type of vector is a “plasmid,” which refers to a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques.
Another type of vector is a viral vector, wherein virally-derived DNA or RNA sequences are present in the vector for packaging into a virus (e.g., retroviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, lentiviruses, replication defective lentiviruses, and adeno-associated viruses). Viral vectors also include polynucleotides carried by a virus for transfection into a host cell. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as “expression vectors.” Common expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. Recombinant expression vectors can comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory elements, which may be selected on the basis of the host cells to be used for expression, that is operatively-linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, “operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory element(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). 
Advantageous viral expression vectors include retroviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, lentiviruses, replication defective lentiviruses, and adeno-associated viruses.
It is to be inferred without explicit recitation and unless otherwise intended, that when the present disclosure relates to a polypeptide, protein, polynucleotide or antibody, a fragment, an equivalent, or a biological equivalent of such is intended within the scope of this disclosure. As used herein, the term “biological equivalent thereof” is intended to be synonymous with “equivalent thereof” when referring to a reference protein, antibody, polypeptide or nucleic acid, and intends those having minimal homology while still maintaining desired structure or functionality. Unless specifically recited herein, it is contemplated that any polynucleotide, polypeptide or protein mentioned herein also includes equivalents thereof. For example, an equivalent intends at least about 70% homology or identity, or alternatively at least 80%, or alternatively at least about 85%, or alternatively at least about 90%, or alternatively at least about 95%, or alternatively 98% homology or identity and exhibits substantially equivalent biological activity to the reference protein, polypeptide or nucleic acid. Alternatively, when referring to polynucleotides, an equivalent thereof is a polynucleotide that hybridizes under stringent conditions to the reference polynucleotide or its complement.
Applicants have provided herein the polypeptide and/or polynucleotide sequences for use in gene and protein transfer and expression techniques described below. It should be understood, although not always explicitly stated, that the sequences provided herein can be used to provide the expression product as well as substantially identical sequences that produce a protein that has the same biological properties. These “biologically equivalent” or “biologically active” polypeptides are encoded by equivalent polynucleotides as described herein. They may possess at least 60%, or alternatively, at least 65%, or alternatively, at least 70%, or alternatively, at least 75%, or alternatively, at least 80%, or alternatively at least 85%, or alternatively at least 90%, or alternatively at least 95% or alternatively at least 98%, identical primary amino acid sequence to the reference polypeptide when compared using sequence identity methods run under default conditions. Specific polypeptide sequences are provided as examples of particular embodiments. Modifications to the sequences include substitution of amino acids with alternate amino acids that have similar charge. Additionally, an equivalent polynucleotide is one that hybridizes under stringent conditions to the reference polynucleotide or its complement or, in reference to a polypeptide, a polypeptide encoded by a polynucleotide that hybridizes under stringent conditions to the reference encoding polynucleotide or its complementary strand. Alternatively, an equivalent polypeptide or protein is one that is expressed from an equivalent polynucleotide.
Pharmaceutically acceptable salts of compounds described herein are within the scope of the present technology and include acid or base addition salts which retain the desired pharmacological activity and are not biologically undesirable (e.g., the salt is not unduly toxic, allergenic, or irritating, and is bioavailable). When the compound of the present technology has a basic group, such as, for example, an amino group, pharmaceutically acceptable salts can be formed with inorganic acids (such as hydrochloric acid, hydroboric acid, nitric acid, sulfuric acid, and phosphoric acid), organic acids (e.g., alginate, formic acid, acetic acid, benzoic acid, gluconic acid, fumaric acid, oxalic acid, tartaric acid, lactic acid, maleic acid, citric acid, succinic acid, malic acid, methanesulfonic acid, benzenesulfonic acid, naphthalene sulfonic acid, and p-toluenesulfonic acid) or acidic amino acids (such as aspartic acid and glutamic acid). When the compound of the present technology has an acidic group, such as, for example, a carboxylic acid group or a hydroxyl group(s), it can form salts with metals, such as alkali and earth alkali metals (e.g., Na+, Li+, K+, Ca2+, Mg2+, Zn2+), ammonia or organic amines (e.g., dicyclohexylamine, trimethylamine, triethylamine, pyridine, picoline, ethanolamine, diethanolamine, triethanolamine) or basic amino acids (e.g., arginine, lysine and ornithine). Such salts can be prepared in situ during isolation and purification of the compounds or by separately reacting the purified compound in its free base or free acid form with a suitable acid or base, respectively, and isolating the salt thus formed.
Modes for Carrying Out the Disclosure
Gene Editing Systems
The disclosure provides a gene editing system comprising, or alternatively consisting essentially of, or yet further consisting of: (i) a first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein; and (ii) a second nucleotide molecule encoding at least one small guide RNA (sgRNA). In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) comprises, or consists essentially of, or consists of a scaffold region and a spacer region. In some embodiments, the scaffold region is a nucleotide sequence that is necessary for dCas9 binding to the gRNA (addgene.org/guides/crispr/). In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence adjacent to a 5′-end of a protospacer adjacent motif (PAM). In some embodiments, the target sequence and the PAM are located at least about 2 or about 1 kilobase (kb), at least about 1.5 kb, at least about 1 kb, at least about 0.9 kb, at least about 0.8 kb, at least about 0.7 kb, at least about 0.6 kb, at least about 0.5 kb, at least about 0.4 kb, at least about 0.3 kb, at least about 0.2 kb, or at least about 0.1 kb from the transcriptional start site (TSS) of the CDKL5 gene. While the target sequence and the PAM are in one aspect located at least about 1 kb from the transcriptional start site, it is apparent to the skilled artisan that other ranges are within the scope of this invention, e.g., the target sequence and the PAM are located from about 2 kb, or from about 1 kb, to about 0.1 kb.
In some embodiments, (i) the first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein; and (ii) the second nucleotide molecule encoding at least one small guide RNA (sgRNA) induce DNA demethylation of CpGs (GC islands or region) at positions of at least about −1500, at least about −1000, at least about −500, at least about −200, at least about −148, at least about −66, and at least about −19 relative to the transcription start site.
In some embodiments, the first nucleotide and second nucleotide molecules permit the transcriptional reprogramming of a gene promoter by precisely demethylating gene promoters or enhancers for desired gene targets. Thus, in one aspect, described herein is a method for transcriptionally reprogramming a gene promoter in a cell in need thereof, by inserting into the cell the system as disclosed herein. In some embodiments, DNA is methylated at 5-cytosine (5mC), and such methylation silences gene expression and is important for genomic imprinting, regulation of gene expression, chromatin architecture organization, and cell-fate determination. In some embodiments, gene demethylation is associated with gene activation and occurs either via passive demethylation or through the oxidation of the methyl group. In some embodiments, demethylation via oxidation is mediated by TET (ten-eleven translocation) dioxygenases that oxidize 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5-hmC), which is a critical step in the ultimate removal of the methyl group.
In some embodiments, the full-length TET1 protein comprises typical features of 2OG-Fe(II) oxygenases, including conservation of residues predicted to be important for coordination of the cofactors Fe(II) and 2OG. The full-length TET1 protein has 2136 amino acids, and comprises an N-terminal α helix followed by a continuous series of β strands, typical of the double-stranded β helix (DSBH) fold of the 2OG-Fe(II) oxygenases, a unique conserved cysteine-rich region (amino acids 1418-1610 of the full-length human TET1 protein; MIM:607790; ENSG00000138336) that is contiguous with the N terminus of the DSBH region (amino acids 1611-2074), a CXXC-type zinc-binding domain (amino acids 584-624 of the full-length human TET1 protein), a binuclear Zn-chelating domain, and three bipartite nuclear localization signals (NLS) (66, 68). In some embodiments, the TET1 catalytic domain (TET1CD) comprises, or consists essentially of, or consists of amino acids 1418 to 2136 of the full-length TET1 protein, and encompasses the conserved cysteine-rich region and the DSBH domain (68). In some embodiments, the DSBH domain of the catalytic domain construct comprises a nuclear localization signal (NLS) sequence. In some embodiments, the DSBH domain of the catalytic domain construct does not comprise an NLS sequence.
In some embodiments, the dCas9-TET1 fusion protein facilitates the targeted demethylation of gene targets (24-29). In particular, dCas9-TET1 facilitates the targeted demethylation of gene targets selected from the group consisting of CDKL5, SCML2 (Scm Polycomb Group Protein Like 2), COL9A3, and Methyl-CpG Binding Protein 2 (MECP2), as shown in the Examples below. In some embodiments, both (i) a first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein and (ii) a second nucleotide molecule encoding at least one small guide RNA (sgRNA) are required to target dCas9-TET1 to a specific locus to demethylate DNA without altering the DNA sequence.
In some embodiments, the dCas9 is a catalytically inactive Cas9 nuclease from the clustered regularly interspaced short palindromic repeats (CRISPR) system, a type II bacterial adaptive immune system, that has been modified to target the dCas9 to a desired genomic locus using sequence-specific guide RNAs for genome editing. In some embodiments, the desired genomic loci include any genes, optionally CDKL5, SCML2 (Scm Polycomb Group Protein Like 2), COL9A3, or Methyl-CpG Binding Protein 2 (MECP2). In some embodiments, CDKL5 sgRNA 20-bp spacer sequences are selected within at least about 1 kb or about 2 kb, at least about 1.5 kb, at least about 1 kb, at least about 0.9 kb, at least about 0.8 kb, at least about 0.7 kb, at least about 0.6 kb, at least about 0.5 kb, at least about 0.4 kb, at least about 0.3 kb, at least about 0.2 kb, or at least about 0.1 kb of the CDKL5 TSS (chrX:18,443,725, hg19) using the CRISPR/Cas9 and TALEN online tool for genome editing, CHOPCHOP. In some embodiments, guide RNAs (sgRNAs) span DNase I hypersensitive sites and H3K4me3 peaks of the CDKL5 promoter within a window of at least about 2 kb, at least about 1.5 kb, at least about 1 kb, at least about 0.9 kb, at least about 0.8 kb, at least about 0.7 kb, at least about 0.6 kb, at least about 0.5 kb, at least about 0.4 kb, at least about 0.3 kb, at least about 0.2 kb, or at least about 0.1 kb on either side of the CDKL5 transcriptional start site. In some embodiments, the second nucleotide molecules encoding at least one small guide RNA (sgRNA) used to create target-specific sgRNA expression vectors are listed in Table 1.
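As an illustration only, the window criterion around the CDKL5 TSS can be expressed as a simple coordinate check. The following Python sketch is not the CHOPCHOP tool and is not part of the disclosed method; it takes the TSS position chrX:18,443,725 (hg19) from the paragraph above, assumes for illustration that the gene is transcribed from the forward strand, and uses hypothetical function names and example coordinates.

```python
# TSS coordinate on chrX (hg19), as recited in the text above.
CDKL5_TSS_HG19 = 18_443_725


def offset_from_tss(position: int, tss: int = CDKL5_TSS_HG19, strand: str = "+") -> int:
    """Signed offset of a genomic position relative to the TSS.

    Negative values are upstream of the TSS in the direction of transcription.
    Assumption: the forward strand ("+") is the default orientation.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    delta = position - tss
    return delta if strand == "+" else -delta


def within_window(position: int, window_bp: int, tss: int = CDKL5_TSS_HG19) -> bool:
    """True if the position lies within `window_bp` on either side of the TSS."""
    if window_bp < 0:
        raise ValueError("window_bp must be non-negative")
    return abs(position - tss) <= window_bp


if __name__ == "__main__":
    # Hypothetical candidate positions, chosen only to exercise the check.
    for candidate in (18_443_577, 18_443_000, 18_441_000):
        print(candidate, offset_from_tss(candidate), within_window(candidate, 1_000))
```

The same signed offset can be used to express CpG positions such as −148 or −19 relative to the transcription start site.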
In some embodiments, the targeted sequence is a sequence in the gene promoter. The targeted sequence or a fragment thereof hybridizes to the corresponding gRNA. In one embodiment, the targeted sequence hybridizes to the corresponding gRNA without any mismatches. In another embodiment, the targeted sequence hybridizes to the corresponding gRNA with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more mismatches. Based on the targeted sequence, the gRNA sequence can be determined. In one embodiment, a gRNA comprises, or consists essentially of, or yet further consists of a sequence complement to a targeted sequence, such as those as disclosed herein, or an equivalent that is capable of binding to the same targeted sequence but comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more mismatches. In another embodiment, a gRNA comprises, or consists essentially of, or yet further consists of a sequence reverse-complement to a targeted sequence, such as those as disclosed herein, or an equivalent that is capable of binding to the same targeted sequence but comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more mismatches. In yet another embodiment, a gRNA comprises, or consists essentially of, or yet further consists of a sequence reverse to a targeted sequence, such as those as disclosed herein, or an equivalent that is capable of binding to the same targeted sequence but comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more mismatches.
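The complement, reverse-complement and mismatch relationships described above can be illustrated with a short Python sketch. It is offered only as an aid to understanding under stated assumptions: sequences are written as DNA using the letters A, C, G and T, mismatches are counted position by position between equal-length sequences, and the function names are hypothetical. It does not model hybridization thermodynamics or guide RNA activity.

```python
# Translation table for Watson-Crick complements of DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same orientation as the input."""
    return seq.upper().translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complement of the sequence read in the opposite orientation."""
    return complement(seq)[::-1]


def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


if __name__ == "__main__":
    target = "AGAGCATCGGACCGAAGCGG"  # one of the target sequences recited herein
    print(complement(target))
    print(reverse_complement(target))
    # A hypothetical variant with a single substitution at the first position.
    print(count_mismatches(target, "TGAGCATCGGACCGAAGCGG"))
```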
In one aspect, this disclosure provides a third nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator. In some embodiments, the fusion protein comprising the deactivated CRISPR-associated protein 9 (dCas9) with at least one tandem repeat of the transcriptional activator herpes simplex virus VP16 (i.e., VP64) induces transcriptional activation of an endogenous gene. In some embodiments, the at least one transcriptional activator comprises VP64 or a biologically active fragment of VP16. Transcription factors act through a DNA-binding domain that localizes a protein to a specific site within the genome and through accessory effector domains that either activate or repress transcription at or near that site. Effector domains, such as the activation domain of the herpes simplex virus VP16 (66) and the repression domain Kruppel-associated box (KRAB), are modular and retain their activity when they are fused to other DNA-binding proteins. In some embodiments, VP64 is the activation domain VP16. In some embodiments, VP64 is a recombinant tetrameric repeat comprising the minimal activation domain of VP16. In some embodiments, the activation domain of VP16 comprises amino acids 413-489 of the VP16 protein (66). In some embodiments, the recombinant tetrameric repeat of VP16's minimal activation domain comprises, or consists essentially of, or yet further consists of the amino acid sequence DALDDFDLDML (66). In some embodiments, the third nucleotide molecule encodes a dCas9 protein fused to at least one of VP64, a VP64-p65-Rta tripartite fusion (addgene.org/99670/), and/or SunTag. SunTag is a novel protein scaffold/tagging system with a repeating peptide array for signal amplification in gene expression.
In some embodiments, the dCas9-VP64 fusion protein upregulates genes in an unmethylated chromatin context. In some embodiments, the combination of the dCas9-VP64 fusion protein and dCas9-TET1CD shows a synergistic effect, resulting in greater than 60% expression of an inactive allele (i.e., a silenced allele). In some embodiments, expression of the dCas9-VP64 fusion protein alone does not significantly increase the reactivation levels of the inactive allele. In some embodiments, dual expression of the dCas9-VP64 fusion protein and dCas9-TET1CD results in the fewest number of differentially expressed genes in RNAseq analysis.
In some embodiments, gene activation requires several sgRNAs. In some embodiments, gene activation requires six sgRNAs. In some embodiments, gene activation requires at least about 1-10, 1-5, 1-6, 1-3, 3-6, or 4-6 sgRNAs. In some embodiments, the target sequence for the sgRNA comprises, or consists essentially of, or consists of one or more of: AGAGCATCGGACCGAAGCGG, GGGGGAGAACATACTCGGGG, CCCAGGTTGCTAGGGCTTGG, ATCGCCTGAAACTTGTCCGG, CGAAAGGGTGTGAAAGAGGG, and/or TGGGGAAGGTAAAGCGGCGA. In some embodiments, the target sequence for the sgRNA comprises, or consists essentially of, or consists of AGAGCATCGGACCGAAGC. In some embodiments, the target sequence for the sgRNA comprises, or consists essentially of, or consists of GGGGGAGAACATACTCGGGG.
In some embodiments, the target sequence for the sgRNA comprises, or consists essentially of, or consists of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) comprises, or consists essentially of, or consists of at least three sgRNAs.
In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) comprises a first sgRNA, a second sgRNA, and a third sgRNA. In some embodiments, the target sequence for the first sgRNA comprises, or consists essentially of, or consists of AGAGCATCGGACCGAAGCGG. In some embodiments, the target sequence for the second sgRNA comprises, or consists essentially of, or consists of GGGGGAGAACATACTCGGGG. In some embodiments, the target sequence for the third sgRNA comprises, or consists essentially of, or consists of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the target sequence for the first sgRNA comprises, or consists essentially of, or consists of one or more of AGAGCATCGGACCGAAGCGG, GGGGGAGAACATACTCGGGG, and/or CCCAGGTTGCTAGGGCTTGG.
In one aspect, the present disclosure provides a gene editing system comprising, or consisting essentially of, or yet further consisting of: a first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein, wherein the dCas9-TET1 fusion protein facilitates the targeted demethylation of a gene target selected from the group consisting of CDKL5, SCML2, COL9A3, or MECP2; and a second nucleotide molecule encoding at least one single guide RNA (sgRNA), comprising, or consisting essentially of, or yet further consisting of a scaffold region and a spacer region; wherein the spacer region hybridizes to a nucleotide sequence complementary to a target sequence adjacent to a 5′-end of a protospacer adjacent motif (PAM); and wherein the target sequence and the PAM are located within about 2 or about 1 kilobase (kb) and ranges as described herein of the transcriptional start site (TSS) of the cyclin dependent kinase-like 5 (CDKL5) gene, and wherein the target sequence for the first sgRNA comprises or consists essentially of AGAGCATCGGACCGAAGCGG, the target sequence for the second sgRNA comprises or consists essentially of or consists of GGGGGAGAACATACTCGGGG, and the target sequence for the third sgRNA comprises or consists essentially of or consists of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the spacer region comprises, or consists essentially of, or yet further consists of a spacer sequence provided in Table 1.
In some embodiments, the gene editing system further comprises a third nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator.
In some embodiments, the at least one transcriptional activator fused to the dCas9 protein comprises, or consists essentially of, or consists of VP64 or a fragment thereof.
In some embodiments, the target sequence for the sgRNA comprises, or consists essentially of, or consists of one or more of AGAGCATCGGACCGAAGCGG, GGGGGAGAACATACTCGGGG, and/or CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the at least one sgRNA comprises a first sgRNA, a second sgRNA, and a third sgRNA, wherein the target sequence for the first sgRNA comprises or consists essentially of, or yet further consists of AGAGCATCGGACCGAAGCGG, wherein the target sequence for the second sgRNA comprises or consists essentially of, or yet further consists of GGGGGAGAACATACTCGGGG, and wherein the target sequence for the third sgRNA comprises or consists essentially of, or yet further consists of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the first nucleotide molecule, the second nucleotide molecule, and the third nucleotide molecule are integrated into one or more viral or plasmid vectors.
In some embodiments, the viral vector is selected from the group of a lentiviral vector, an adeno-associated viral (AAV) vector, or an adenoviral vector.
In one aspect, the disclosure provides a kit comprising the system as described herein and optional instructions for use in the methods as described herein.
In one aspect, the disclosure provides a host cell comprising the gene editing system.
In one aspect, the disclosure provides a pharmaceutical composition comprising the gene editing system, the vectors or the host cell comprising the gene editing system.
In some embodiments, the pharmaceutical composition comprises a carrier.
In some embodiments, the pharmaceutical composition comprises a pharmaceutically acceptable carrier or excipient.
Vector Systems
In one aspect, the present disclosure provides a vector comprising, or alternatively consisting essentially of, or yet further consisting of one or more of the nucleotide molecule(s) as disclosed herein. In one embodiment, provided is a vector comprising, or consisting essentially of, or yet further consisting of a nucleotide molecule(s) as disclosed herein or its complement or an equivalent of each thereof. Such an equivalent hybridizes to the same targeted sequence or encodes the same protein. In one embodiment, a nucleotide molecule(s) or a vector as provided herein may further comprise another sequence, such as one or more of a sequence identified above and/or listed as a feature in the tables or figures.
In some embodiments, the first nucleotide molecule, the second nucleotide molecule, and the third nucleotide molecule are inserted into and comprised as part of one or more viral or plasmid vectors. In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) is inserted into, incorporated into, or cloned into a sgRNA expression vector. In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) is cloned into a viral vector. In some embodiments, the viral vector is selected from the group of retroviral vectors, adenovirus vectors, adeno-associated virus vectors, or alphavirus vectors. Infectious tobacco mosaic virus (TMV)-based vectors can be used to manufacture proteins and have been reported to express Griffithsin in tobacco leaves (O'Keefe et al. (2009) Proc. Nat. Acad. Sci. USA 106(15):6099-6104). Alphavirus vectors, such as Semliki Forest virus-based vectors and Sindbis virus-based vectors, have also been developed for use in gene therapy and immunotherapy. See, Schlesinger & Dubensky (1999) Curr. Opin. Biotechnol. 5:434-439 and Ying et al. (1999) Nat. Med. 5(7):823-827. In aspects where gene transfer is mediated by a retroviral vector, a vector construct refers to the polynucleotide comprising the retroviral genome or part thereof. Further details as to modern methods of vectors for use in gene transfer may be found in, for example, Kotterman et al. (2015) Viral Vectors for Gene Therapy: Translational and Clinical Outlook Annual Review of Biomedical Engineering 17. In some embodiments, the viral vector is selected from the group of a lentiviral vector, an adeno-associated viral (AAV) vector, or an adenoviral vector.
In some embodiments, the viral vector is a lentiviral vector. In some embodiments, the lentiviral vector is an optimized lentiviral sgRNA cloning vector with MS2 loops at tetraloop and stemloop 2 and an EF1a-puro resistance marker.
In one aspect, the present disclosure provides a vector encoding a sgRNA. In some embodiments, the sgRNA comprises, or consists essentially of, or yet further consists of a scaffold region and a spacer region. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising, or consisting essentially of, or yet further consisting of one or more of GGGGGAGAACATACTCGGGG, AGAGCATCGGACCGAAGCGG, CCCAGGTTGCTAGGGCTTGG, ATCGCCTGAAACTTGTCCGG, CGAAAGGGTGTGAAAGAGGG, and/or TGGGGAAGGTAAAGCGGCGA. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising, or consisting essentially of, or yet further consisting of GGGGGAGAACATACTCGGGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising, or consisting essentially of, or yet further consisting of AGAGCATCGGACCGAAGCGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising, or consisting essentially of, or yet further consisting of CCCAGGTTGCTAGGGCTTGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising, or consisting essentially of, or yet further consisting of ATCGCCTGAAACTTGTCCGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising, or consisting essentially of, or yet further consisting of CGAAAGGGTGTGAAAGAGGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising, or consisting essentially of, or yet further consisting of TGGGGAAGGTAAAGCGGCGA.
In one aspect, the present disclosure provides a vector encoding a first sgRNA and a second sgRNA. In some embodiments, the first sgRNA comprises, or consists essentially of, or yet further consists of a scaffold region and a spacer region, and the spacer region of the first sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising, or consisting essentially of, or yet further consisting of AGAGCATCGGACCGAAGCGG. In some embodiments, the second sgRNA comprises, or consists essentially of, or yet further consists of a scaffold region and a spacer region, and the spacer region of the second sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising, or consisting essentially of, or yet further consisting of GGGGGAGAACATACTCGGGG.
In some embodiments, the vector encodes a first sgRNA and a second sgRNA. In some embodiments, the first sgRNA comprises, or consists essentially of, or yet further consists of a scaffold region and a spacer region, and the spacer region of the first sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising, or consisting essentially of, or yet further consisting of AGAGCATCGGACCGAAGCGG. In some embodiments, the second sgRNA comprises, or consists essentially of, or consists of a scaffold region and a spacer region, and the spacer region of the second sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising, or consisting essentially of, or consisting of CCCAGGTTGCTAGGGCTTGG.
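As a minimal illustration, the six target sequences recited above can be collected and checked for basic consistency before being used to design spacers. The Python sketch below is an aid to understanding only, under the assumptions that each target is written as DNA with the letters A, C, G and T and that a simple length and GC-fraction summary is of interest; the names TARGET_SEQUENCES, is_valid_dna, gc_fraction and summarize are hypothetical and do not correspond to any tool named in this disclosure.

```python
# The six target sequences recited in the text above.
TARGET_SEQUENCES = [
    "GGGGGAGAACATACTCGGGG",
    "AGAGCATCGGACCGAAGCGG",
    "CCCAGGTTGCTAGGGCTTGG",
    "ATCGCCTGAAACTTGTCCGG",
    "CGAAAGGGTGTGAAAGAGGG",
    "TGGGGAAGGTAAAGCGGCGA",
]


def is_valid_dna(seq: str) -> bool:
    """True if the sequence is non-empty and uses only A, C, G and T."""
    return bool(seq) and set(seq.upper()) <= set("ACGT")


def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    if not is_valid_dna(seq):
        raise ValueError(f"not a valid DNA sequence: {seq!r}")
    upper = seq.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def summarize(sequences):
    """Return (sequence, length, GC fraction) for each input sequence."""
    return [(s, len(s), gc_fraction(s)) for s in sequences]


if __name__ == "__main__":
    for seq, length, gc in summarize(TARGET_SEQUENCES):
        print(f"{seq}\t{length} nt\tGC={gc:.2f}")
```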
In some embodiments, a vector encodes a first sgRNA and a second sgRNA. In some embodiments, the first sgRNA comprises or consists essentially of, or consists of a scaffold region and a spacer region and the spacer region of the first sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of, or consisting of GGGGGAGAACATACTCGGGG. In some embodiments, the second sgRNA comprises or consists essentially of, or consists of a scaffold region and a spacer region, and the spacer region of the second sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of, or consisting of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, a vector encodes a first sgRNA, a second sgRNA, and a third sgRNA. In some embodiments, the first sgRNA comprises or consists essentially of, or consists of a scaffold region and a spacer region and the spacer region of the first sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of, or consisting of AGAGCATCGGACCGAAGCGG. In some embodiments, the second sgRNA comprises or consists essentially of, or consists of a scaffold region and a spacer region, and the spacer region of the second sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of, or consisting of GGGGGAGAACATACTCGGGG. In some embodiments, the third sgRNA comprises or consists essentially of, or consists of a scaffold region and a spacer region, and the spacer region of the third sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of, or consisting of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, a vector encodes a sgRNA and the sgRNA comprises or consists essentially of, or consists of a scaffold region and a spacer region, and the spacer region hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of, or consisting of AGAGCATCGGACCGAAGCGG.
In one aspect, the present disclosure provides a vector that encodes a first sgRNA, a second sgRNA, and/or a third sgRNA and further comprises a nucleotide molecule encoding a dCas9-TET1CD fusion protein. In some embodiments, the vector encodes a first sgRNA, a second sgRNA, and/or a third sgRNA and further comprises, or consists essentially of, or consists of a nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator. In some embodiments, the transcriptional activator comprises VP64 or a biologically equivalent fragment thereof. In some embodiments, VP64 is the activation domain VP16. In some embodiments, VP64 is a recombinant tetrameric repeat comprising the minimal activation domain of VP16. In some embodiments, the activation domain of VP16 comprises amino acids 413-489 of the VP16 protein. In another embodiment, the recombinant tetrameric repeat of VP16's minimal activation domain comprises, or consists essentially of, or consists of the amino acid sequence DALDDFDLDML.
In some embodiments, the vector further comprises, or consists essentially of, or consists of a first nucleotide molecule encoding a dCas9-TET1CD fusion protein and a second nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator. In some embodiments, the transcriptional activator comprises VP64 or a fragment thereof. In some embodiments, the VP64 fragment comprises the activation domain VP16. In some embodiments, VP64 is a recombinant tetrameric repeat comprising, or consisting essentially of, or yet further consisting of the minimal activation domain of VP16. In some embodiments, the activation domain of VP16 comprises, or consists essentially of, or consists of amino acids 413-489 of the VP16 protein. In another embodiment, the recombinant tetrameric repeat of VP16's minimal activation domain comprises, or consists essentially of, or consists of the amino acid sequence DALDDFDLDML. In some embodiments, the vector is a viral vector or a plasmid vector. In some embodiments, the viral vector is a lentiviral vector, an AAV vector, or an adenoviral vector.
In a further aspect, the systems, nucleotides, nucleic acids or host cells as described herein are detectably labeled for research or other use. Detectable labels such as radionucleotides and fluorescent labels are commercially available and widely used.
Host Cells
The present disclosure provides an isolated or engineered host cell comprising any one or more of the polynucleotides, gene editing systems and/or any one or more of the vectors as disclosed herein. In some embodiments, the host cell produces the gene editing system, the nucleotide molecule(s) and/or the vector(s). Additionally or alternatively, the host cell is an insect cell, a mammalian cell, or a bacterial cell. In some embodiments, the host cell is selected from a stem cell, an embryonic stem cell (that in one aspect is from an established cultured cell line), a progenitor cell, an iPSC, a neuronal progenitor cell, a neuronal stem cell or a stem or progenitor cell with the ability to differentiate into a neuron. The host cell can also be an egg, a sperm, a zygote, or a germline cell. In yet a further embodiment, the host cell is a mammalian cell. In one aspect, the cell is a cultured or primary cell from a non-human host or subject. In one aspect, the cell is a cell in need of genetic correction, e.g., a cell with inactive gene expression, as described herein. In a further aspect, the cell is a neuronal cell with dysfunctional gene expression. The cells are useful in cell assay systems and therapies as described herein.
In some embodiments, the nucleotide molecule is engineered to one or more of the chromosome(s) or chromosome sites of the host cell. In some embodiments, the host cell comprises homozygous polynucleotides. In another embodiment, the host cell comprises a heterozygous polynucleotide. In some aspects and/or embodiments of the disclosure herein, the nucleotide molecule is engineered to one or more of the chromosome(s) or chromosome site(s) of the mammalian cell.
In some embodiments, the host cell comprises gene editing systems comprising, or alternatively consisting essentially of, or yet further consisting of: (i) a first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein; and (ii) a second nucleotide molecule encoding at least one small guide RNA (sgRNA). In some embodiment, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) that comprises a scaffold region and a spacer region. In some embodiment, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence adjacent to a 5′-end of a protospacer adjacent motif (PAM). In some embodiments, the target sequence and the PAM are located at least 1 kilobase (kb) from the transcriptional start site (TSS) of the CDKL5 gene. In some embodiments, the first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein; and (ii) the second nucleotide molecule encoding at least one small guide RNA (sgRNA) induce DNA demethylation of CpGs (GC islands or region) at positions of at least about −1500, at least about −1000, at least about −500, at least about −200, at least about −148, at least about −66 and, at least about −19 relative to transcription start site.
In one aspect, the present disclosure provides a host cell expressing a third nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator. In some embodiments, the fusion protein comprising the deactivated CRISPR-associated protein 9 (dCas9) with at least one tandem repeat of the transcriptional activator herpes simplex virus VP16 (i.e. VP64) induces transcriptional activation of endogenous of an endogenous gene. In some embodiments, the at least one transcriptional activator comprises VP64 or a fragment thereof. Transcription factors act through a DNA-binding domain that localizes a protein to a specific site within the genome and through accessory effector domains that either activate or repress transcription at or near that site. Effector domains, such as the activation domain the herpes simplex virus VP16 (4) and the repression domain Kruppel-associated box (KRAB), are modular and retain their activity when they are fused to other DNA-binding proteins. In some embodiments, VP64 is the activation domain VP16. In some embodiments, VP64 is a recombinant tetrameric repeat of comprising the mimimal activation domain VP64. In some embodiments, the activation domain of VP16 comprises amino acids 413-489 of the VP16 protein. In another embodiments, the recombinant tetrameric repeat of VP16's minimal activation domain comprises the amino acid sequence DALDDFDLDML
In some embodiments, the dCas9-VP64 fusion protein upregulates genes in the host cell in an unmethylated chromatin context. In some embodiments, the combination of the dCas9-VP64 fusion protein and dCas9-TET1CD shows a synergistic effect, resulting in greater than 60% expression of an inactive allele (i.e., a silenced allele) in the host cell. In some embodiments, expression of the dCas9-VP64 fusion protein alone does not significantly increase the reactivation levels of the inactive allele. In some embodiments, dual expression of the dCas9-VP64 fusion protein and dCas9-TET1CD resulted in the fewest differentially expressed genes in RNA-seq analysis.
In some embodiments, the host cell further expresses several sgRNAs. In some embodiments, the host cell expresses six sgRNAs. In some embodiments, the host cell expresses at least about 1-10, 1-5, 1-6, 1-3, 3-6, or 4-6 sgRNAs. In some embodiments, the host cell expresses a target sequence that is complementary to a sgRNA sequence selected from the group consisting of AGAGCATCGGACCGAAGCGG, GGGGGAGAACATACTCGGGG, CCCAGGTTGCTAGGGCTTGG, ATCGCCTGAAACTTGTCCGG, CGAAAGGGTGTGAAAGAGGG, and TGGGGAAGGTAAAGCGGCGA. In some embodiments, the target sequence for the sgRNA comprises AGAGCATCGGACCGAAGC. In some embodiments, the target sequence for the sgRNA comprises GGGGGAGAACATACTCGGGG. In some embodiments, the target sequence for the sgRNA comprises CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the host cell expresses a second nucleotide molecule encoding at least one small guide RNA (sgRNA) that comprises at least three sgRNAs. In some embodiments, the host cell expresses a second nucleotide molecule encoding at least one small guide RNA (sgRNA) that comprises a first sgRNA, a second sgRNA, and a third sgRNA. In some embodiments, the target sequence for the first sgRNA comprises AGAGCATCGGACCGAAGCGG. In some embodiments, the target sequence for the second sgRNA comprises GGGGGAGAACATACTCGGGG. In some embodiments, the target sequence for the third sgRNA comprises CCCAGGTTGCTAGGGCTTGG. In some embodiments, the target sequence for the first sgRNA comprises AGAGCATCGGACCGAAGCGG, the target sequence for the second sgRNA comprises GGGGGAGAACATACTCGGGG, and the target sequence for the third sgRNA comprises CCCAGGTTGCTAGGGCTTGG.
In one aspect, the present disclosure provides a host cell genetically engineered to express a vector comprising, or alternatively consisting essentially of, or yet further consisting of one or more of the nucleotide molecule(s) as disclosed herein. In one embodiment, provided is a vector comprising, or consisting essentially of, or yet further consisting of a nucleotide molecule(s) as disclosed herein or its complement or an equivalent of each thereof. Such an equivalent hybridizes to the same targeted sequence or encodes the same protein. In one embodiment, a nucleotide molecule(s) or a vector as provided herein may further comprise another sequence, such as one or more of the sequences listed as a feature in the drawings.
In some embodiments, the host cell expresses the first nucleotide molecule, the second nucleotide molecule, and the third nucleotide molecule, which are cloned into one or more viral or plasmid vectors. In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) is cloned into a sgRNA expression vector. In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) is cloned into a viral vector. In some embodiments, the viral vector is selected from the group consisting of retroviral vectors, adenovirus vectors, adeno-associated virus vectors, and alphavirus vectors. Infectious tobacco mosaic virus (TMV)-based vectors can be used to manufacture proteins and have been reported to express Griffithsin in tobacco leaves (O'Keefe et al. (2009) Proc. Nat. Acad. Sci. USA 106(15):6099-6104). Alphavirus vectors, such as Semliki Forest virus-based vectors and Sindbis virus-based vectors, have also been developed for use in gene therapy and immunotherapy. See Schlesinger & Dubensky (1999) Curr. Opin. Biotechnol. 5:434-439 and Ying et al. (1999) Nat. Med. 5(7):823-827. In aspects where gene transfer is mediated by a retroviral vector, a vector construct refers to the polynucleotide comprising the retroviral genome or part thereof, and a therapeutic gene. Further details as to modern methods of vectors for use in gene transfer may be found in, for example, Kotterman et al. (2015) Viral Vectors for Gene Therapy: Translational and Clinical Outlook, Annual Review of Biomedical Engineering 17. In some embodiments, the viral vector is selected from the group of a lentiviral vector, an adeno-associated viral (AAV) vector, or an adenoviral vector, e.g., an Addgene plasmid available under 73797 or an equivalent thereof. In some embodiments, the viral vector is a lentiviral vector.
In some embodiments, the lentiviral vector is an optimized lentiviral sgRNA cloning vector with MS2 loops at the tetraloop and stem loop 2 and an EF1a-puro resistance marker.
In one aspect, the present disclosure provides a host cell engineered to express a vector encoding a sgRNA. In some embodiments, the sgRNA comprises a scaffold region and a spacer region. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of GGGGGAGAACATACTCGGGG, AGAGCATCGGACCGAAGCGG, CCCAGGTTGCTAGGGCTTGG, ATCGCCTGAAACTTGTCCGG, CGAAAGGGTGTGAAAGAGGG, or TGGGGAAGGTAAAGCGGCGA. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of GGGGGAGAACATACTCGGGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of AGAGCATCGGACCGAAGCGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of CCCAGGTTGCTAGGGCTTGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising ATCGCCTGAAACTTGTCCGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of CGAAAGGGTGTGAAAGAGGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of TGGGGAAGGTAAAGCGGCGA.
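The relationship between a listed target sequence, the complementary strand to which the spacer hybridizes, and the spacer RNA itself can be sketched in Python as follows. This is an illustrative check only, not part of the disclosed system: the function names are hypothetical, and the sketch assumes the spacer has the same sequence as the target with uracil in place of thymine.

```python
# The six target sequences recited above (5' to 3').
TARGETS = [
    "AGAGCATCGGACCGAAGCGG",
    "GGGGGAGAACATACTCGGGG",
    "CCCAGGTTGCTAGGGCTTGG",
    "ATCGCCTGAAACTTGTCCGG",
    "CGAAAGGGTGTGAAAGAGGG",
    "TGGGGAAGGTAAAGCGGCGA",
]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary DNA strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def spacer_rna(target):
    """Hypothetical helper: spacer RNA assumed identical to the target, with U for T."""
    return target.replace("T", "U")


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for target in TARGETS:
        print(target, len(target), reverse_complement(target),
              spacer_rna(target), round(gc_fraction(target), 2))
```

Each listed target sequence is 20 nucleotides long, and applying `reverse_complement` twice returns the original sequence, which is a quick sanity check when transcribing sequences into a cloning workflow.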
In one aspect, the present disclosure provides a host cell engineered to express a vector encoding a first sgRNA and a second sgRNA. In some embodiments, the host cell expresses a first sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the first sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of AGAGCATCGGACCGAAGCGG. In some embodiments, the host cell expresses a second sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the second sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of GGGGGAGAACATACTCGGGG.
In some embodiments, the host cell expresses a vector that encodes a first sgRNA and a second sgRNA. In some embodiments, the host cell expresses a first sgRNA that comprises a scaffold region and a spacer region and the spacer region of the first sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of AGAGCATCGGACCGAAGCGG. In some embodiments, the host cell expresses a second sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the second sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the host cell expresses a vector that encodes a first sgRNA and a second sgRNA. In some embodiments, the host cell expresses a first sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the first sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of GGGGGAGAACATACTCGGGG. In some embodiments, the host cell expresses a second sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the second sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the host cell expresses a vector that encodes a first sgRNA, a second sgRNA, and a third sgRNA. In some embodiments, the first sgRNA comprises a scaffold region and a spacer region, and the spacer region of the first sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of AGAGCATCGGACCGAAGCGG. In some embodiments, the host cell expresses a second sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the second sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of GGGGGAGAACATACTCGGGG. In some embodiments, the host cell expresses a third sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the third sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the host cell expresses a vector that encodes a sgRNA and the sgRNA comprises a scaffold region and a spacer region, and the spacer region hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of AGAGCATCGGACCGAAGCGG.
In one aspect, the present disclosure provides a host cell engineered to express a vector encoding a first sgRNA, a second sgRNA, and/or a third sgRNA, wherein the host cell further comprises or consists essentially of or consists of a nucleotide molecule encoding a dCas9-TET1CD fusion protein. In some embodiments, the host cell expresses a vector encoding a first sgRNA, a second sgRNA, and/or a third sgRNA and further comprises a nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator. In some embodiments, the transcriptional activator comprises VP64 or a biologically active fragment thereof. In some embodiments, the biologically active fragment of VP64 is the activation domain VP16. In some embodiments, VP64 is a recombinant tetrameric repeat comprising the minimal activation domain of VP16. In some embodiments, the activation domain of VP16 comprises or consists essentially of or consists of amino acids 413-489 of the VP16 protein. In other embodiments, the recombinant tetrameric repeat of VP16's minimal activation domain comprises or consists essentially of or consists of the amino acid sequence DALDDFDLDML.
In some embodiments, the host cell expresses a vector that further comprises or consists essentially of or consists of a first nucleotide molecule encoding a dCas9-TET1CD fusion protein and a second nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator. In some embodiments, the transcriptional activator comprises or consists essentially of or consists of VP64 or a biologically active fragment thereof. In some embodiments, the biologically active fragment of VP64 is the activation domain VP16. In some embodiments, VP64 is a recombinant tetrameric repeat comprising or consisting essentially of or consisting of the minimal activation domain of VP16. In some embodiments, the activation domain of VP16 comprises or consists essentially of or consists of amino acids 413-489 of the VP16 protein. In other embodiments, the recombinant tetrameric repeat of VP16's minimal activation domain comprises or consists essentially of or consists of the amino acid sequence DALDDFDLDML. In some embodiments, the vector is a viral vector or a plasmid vector. In some embodiments, the viral vector is a lentiviral vector, an AAV vector, or an adenoviral vector.
In one aspect, the present disclosure provides for a pharmaceutical composition comprising an isolated or engineered host cell comprising any one or more of the polynucleotides, systems, vectors or host cells alone or in combination with each other and optionally additional therapeutic agents, and a carrier, optionally a pharmaceutically acceptable carrier or excipient. In some embodiments, the host cell produces the gene editing system, the nucleotide molecule(s) and/or the vector(s).
In some embodiments, the pharmaceutical composition comprises a host cell comprising a gene editing system comprising, or alternatively consisting essentially of, or yet further consisting of: (i) a first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein; and (ii) a second nucleotide molecule encoding at least one small guide RNA (sgRNA). In some embodiments, the at least one sgRNA encoded by the second nucleotide molecule comprises a scaffold region and a spacer region. In some embodiments, the composition comprises a carrier, optionally a pharmaceutically acceptable carrier or excipient. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence adjacent to a 5′-end of a protospacer adjacent motif (PAM). In some embodiments, the target sequence and the PAM are located at least 1 kilobase (kb) from the transcriptional start site (TSS) of the CDKL5 gene. In some embodiments, (i) the first nucleotide molecule encoding the dCas9-TET1CD fusion protein and (ii) the second nucleotide molecule encoding at least one sgRNA induce DNA demethylation of CpGs (CpG islands or regions) at positions of at least about −1500, at least about −1000, at least about −500, at least about −200, at least about −148, at least about −66, and at least about −19 relative to the transcription start site.
In one aspect, the present disclosure provides a composition comprising a host cell as described herein and expressing a third nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator. In some embodiments, the fusion protein comprising the deactivated CRISPR-associated protein 9 (dCas9) with at least one tandem repeat of the transcriptional activator herpes simplex virus VP16 (i.e., VP64) induces transcriptional activation of an endogenous gene. In some embodiments, the at least one transcriptional activator comprises VP64 or a biologically active fragment thereof. Transcription factors act through a DNA-binding domain that localizes a protein to a specific site within the genome and through accessory effector domains that either activate or repress transcription at or near that site. Effector domains, such as the activation domain of the herpes simplex virus VP16 (4) and the repression domain Kruppel-associated box (KRAB), are modular and retain their activity when they are fused to other DNA-binding proteins. In some embodiments, VP64 is the activation domain VP16. In some embodiments, VP64 is a recombinant tetrameric repeat comprising the minimal activation domain of VP16. In some embodiments, the activation domain of VP16 comprises amino acids 413-489 of the VP16 protein. In other embodiments, the recombinant tetrameric repeat of VP16's minimal activation domain comprises the amino acid sequence DALDDFDLDML.
In some embodiments, the composition comprises a host cell as described herein that further expresses several sgRNAs, also as described herein. In some embodiments, the host cell expresses six sgRNAs. In some embodiments, the host cell expresses at least about 1-10, 1-5, 1-6, 1-3, 3-6, or 4-6 sgRNAs. In some embodiments, the host cell expresses a target sequence that is complementary to a sgRNA sequence selected from the group of AGAGCATCGGACCGAAGCGG, GGGGGAGAACATACTCGGGG, CCCAGGTTGCTAGGGCTTGG, ATCGCCTGAAACTTGTCCGG, CGAAAGGGTGTGAAAGAGGG, or TGGGGAAGGTAAAGCGGCGA. In some embodiments, the target sequence for the sgRNA comprises or consists essentially of or consists of AGAGCATCGGACCGAAGC. In some embodiments, the target sequence for the sgRNA comprises or consists essentially of or consists of GGGGGAGAACATACTCGGGG. In some embodiments, the target sequence for the sgRNA comprises or consists essentially of or consists of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the pharmaceutical composition comprises a host cell as described herein that expresses a second nucleotide molecule encoding at least one small guide RNA (sgRNA) that comprises or consists essentially of or consists of at least three sgRNAs. In some embodiments, the host cell expresses a second nucleotide molecule encoding at least one small guide RNA (sgRNA) that comprises a first sgRNA, a second sgRNA, and a third sgRNA. In some embodiments, the target sequence for the first sgRNA comprises or consists essentially of or consists of AGAGCATCGGACCGAAGCGG. In some embodiments, the target sequence for the second sgRNA comprises or consists essentially of or consists of GGGGGAGAACATACTCGGGG. In some embodiments, the target sequence for the third sgRNA comprises or consists essentially of or consists of CCCAGGTTGCTAGGGCTTGG. In some embodiments, the target sequence for the first sgRNA comprises or consists essentially of or consists of AGAGCATCGGACCGAAGCGG, the target sequence for the second sgRNA comprises or consists essentially of or consists of GGGGGAGAACATACTCGGGG, and the target sequence for the third sgRNA comprises or consists essentially of or consists of CCCAGGTTGCTAGGGCTTGG.
In one aspect, the present disclosure provides a composition comprising a host cell as described herein genetically engineered to express a vector comprising, or alternatively consisting essentially of, or yet further consisting of one or more of the nucleotide molecule(s) as disclosed herein. In one embodiment, provided is a vector comprising, or consisting essentially of, or yet further consisting of a nucleotide molecule(s) as disclosed herein or its complement or an equivalent of each thereof. Such an equivalent hybridizes to the same targeted sequence or encodes the same protein. In one embodiment, a nucleotide molecule(s) or a vector as provided herein may further comprise another sequence, such as one or more of the sequences listed as a feature in the drawings.
In some embodiments, the composition comprises a host cell that expresses the first nucleotide molecule, the second nucleotide molecule, and the third nucleotide molecule, which are cloned into one or more viral or plasmid vectors. In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) is cloned into a sgRNA expression vector. In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) is cloned into a viral vector. In some embodiments, the viral vector is selected from the group consisting of retroviral vectors, adenovirus vectors, adeno-associated virus vectors, and alphavirus vectors.
In one aspect, the present disclosure provides a composition comprising a host cell engineered to express a vector encoding a sgRNA. In some embodiments, the sgRNA comprises a scaffold region and a spacer region. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of GGGGGAGAACATACTCGGGG, AGAGCATCGGACCGAAGCGG, CCCAGGTTGCTAGGGCTTGG, ATCGCCTGAAACTTGTCCGG, CGAAAGGGTGTGAAAGAGGG, or TGGGGAAGGTAAAGCGGCGA. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of GGGGGAGAACATACTCGGGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of AGAGCATCGGACCGAAGCGG.
In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of CCCAGGTTGCTAGGGCTTGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of ATCGCCTGAAACTTGTCCGG. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of CGAAAGGGTGTGAAAGAGGG.
In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence comprising or consisting essentially of or consisting of TGGGGAAGGTAAAGCGGCGA.
In one aspect, the present disclosure provides a composition comprising a host cell engineered to express a vector encoding a first sgRNA and a second sgRNA. In some embodiments, the host cell expresses a first sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the first sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of AGAGCATCGGACCGAAGCGG. In some embodiments, the host cell expresses a second sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the second sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of GGGGGAGAACATACTCGGGG.
In some embodiments, the composition comprises a host cell that expresses a vector that encodes a first sgRNA and a second sgRNA. In some embodiments, the host cell expresses a first sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the first sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of AGAGCATCGGACCGAAGCGG. In some embodiments, the host cell expresses a second sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the second sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the composition comprises a host cell that expresses a vector that encodes a first sgRNA and a second sgRNA. In some embodiments, the host cell expresses a first sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the first sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of GGGGGAGAACATACTCGGGG. In some embodiments, the host cell expresses a second sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the second sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the composition comprises a host cell that expresses a vector that encodes a first sgRNA, a second sgRNA, and a third sgRNA. In some embodiments, the first sgRNA comprises a scaffold region and a spacer region, and the spacer region of the first sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of AGAGCATCGGACCGAAGCGG. In some embodiments, the host cell expresses a second sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the second sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of GGGGGAGAACATACTCGGGG. In some embodiments, the host cell expresses a third sgRNA that comprises a scaffold region and a spacer region, and the spacer region of the third sgRNA hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the composition comprises a host cell that expresses a vector that encodes a sgRNA, wherein the sgRNA comprises a scaffold region and a spacer region, and the spacer region hybridizes to a nucleotide sequence complementary to a target sequence comprising or consisting essentially of or consisting of AGAGCATCGGACCGAAGCGG.
In one aspect, the present disclosure provides a composition comprising a host cell engineered to express a vector encoding a first sgRNA, a second sgRNA, and/or a third sgRNA, wherein the host cell further comprises a nucleotide molecule encoding a dCas9-TET1CD fusion protein. In some embodiments, the host cell expresses a vector encoding a first sgRNA, a second sgRNA, and/or a third sgRNA and further comprises a nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator. In some embodiments, the transcriptional activator comprises VP64 or a biologically active fragment thereof. In some embodiments, the biologically active fragment of VP64 is the activation domain VP16. In some embodiments, VP64 is a recombinant tetrameric repeat comprising the minimal activation domain of VP16. In some embodiments, the activation domain of VP16 comprises or consists essentially of or consists of amino acids 413-489 of the VP16 protein. In other embodiments, the recombinant tetrameric repeat of VP16's minimal activation domain comprises or consists essentially of or consists of the amino acid sequence DALDDFDLDML.
In some embodiments, the composition comprises a host cell that expresses a vector that further comprises a first nucleotide molecule encoding a dCas9-TET1CD fusion protein and a second nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator. In some embodiments, the transcriptional activator comprises or consists essentially of or consists of VP64 or a biologically active fragment thereof. In some embodiments, the VP64 fragment comprises, or consists essentially of, or consists of the activation domain VP16. In some embodiments, VP64 is a recombinant tetrameric repeat comprising the minimal activation domain of VP16. In some embodiments, the activation domain of VP16 comprises amino acids 413-489 of the VP16 protein. In other embodiments, the recombinant tetrameric repeat of VP16's minimal activation domain comprises or consists essentially of or consists of the amino acid sequence DALDDFDLDML. In some embodiments, the vector is a viral vector or a plasmid vector.
In some embodiments, the viral vector is a lentiviral vector, an AAV vector, or an adenoviral vector. In some embodiments, the composition comprises a carrier, optionally a pharmaceutically acceptable carrier or excipient.
Cell Assay Systems
The vectors, gene editing systems and host cells can be used as in vitro assay systems to test new therapies or additions to the vectors, gene editing systems or host cells as described herein. Thus, in one aspect, provided herein is a method for increasing gene expression, such as CDKL5 gene expression, in a cell, comprising inserting into the cell the vectors and/or gene editing systems as described above. In one aspect, the gene expression is lower than wildtype expression due to reduced DNA methylation of the CDKL5 promoter region. Although CDKL5 is used as an example of such a system, one of skill in the art can apply the principles of this system to other genes wherein DNA methylation is reduced, and/or the promoter region is located on a silenced X-chromosomal allele of the cell. The cells can be samples isolated from subjects suspected of containing defective gene expression and/or a commercially available or laboratory generated cell line. The host cell can be a prokaryotic or a eukaryotic cell, non-limiting examples of which include an insect cell, a mammalian cell, or a bacterial cell. In some embodiments, the host cell is selected from an egg, a sperm, a zygote, or a germline cell. In yet a further embodiment, the host cell is a mammalian cell. In one aspect, the cell is a cell in need of genetic correction, e.g., a cell with dysfunctional gene expression, as described herein. In a further aspect, the cell is a neuronal cell with dysfunctional gene expression.
One of skill in the art can generate the host cell system with a cell or cells from a subject to determine if the therapy is useful for the subject. Additionally or alternatively, additional therapies can be tested for combination therapy.
The insertion of the vectors and/or gene editing system can be in vitro, ex vivo or in vivo. When used in an animal, it can serve as an animal model to assay for combination therapies.
Therapeutic and Diagnostic Methods
The present disclosure provides a gene editing system comprising a first nucleotide encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein and a second nucleotide encoding at least one small guide RNA (sgRNA) for targeting a complementary nucleotide sequence located within about 1 kilobase of the transcriptional start site (TSS) of the CDKL5 gene. Additional alterations and modifications to the systems are provided herein.
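Positions stated relative to the TSS, such as the window of about 1 kilobase, can be computed from genomic coordinates as in the following Python sketch. It is a minimal illustration under stated assumptions: the function names are hypothetical, the strand is left as a parameter rather than asserted for CDKL5, and the coordinates in the usage note are placeholders, not CDKL5 coordinates.

```python
def tss_offset(position, tss, strand="+"):
    """Signed distance of a genomic coordinate from the TSS.

    Negative values lie upstream of the TSS in the direction of
    transcription; positive values lie downstream.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    delta = position - tss
    return delta if strand == "+" else -delta


def within_window(position, tss, strand="+", window=1000):
    """True if the position lies within `window` bases of the TSS."""
    return abs(tss_offset(position, tss, strand)) <= window


if __name__ == "__main__":
    # Placeholder coordinates chosen only to illustrate the arithmetic.
    print(tss_offset(9852, 10000))      # upstream of a plus-strand TSS
    print(within_window(8500, 10000))   # 1500 bases upstream
```

With a placeholder TSS at coordinate 10000 on the plus strand, coordinate 9852 maps to position −148, which falls inside a 1 kb window, whereas coordinate 8500 maps to −1500 and falls outside it.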
A significant number of X-linked genes escape from X chromosome inactivation and are associated with a distinct epigenetic signature. One epigenetic modification that strongly correlates with X-escape is reduced DNA methylation in promoter regions. Applicant created an artificial escape system by editing DNA methylation on the promoter of CDKL5, a gene causative for an infantile epilepsy, from the silenced X-chromosomal allele in human neuronal-like cells. The artificial system comprises a fusion of the catalytic domain of TET1 to dCas9 that is targeted to the CDKL5 promoter using three guide RNAs. This artificial system caused significant reactivation of the inactive CDKL5 allele in combination with removal of methyl groups from CpG dinucleotides. The artificial system was further enhanced by co-expression of the dCas9-TET1 fusion protein and a fusion protein comprising dCas9 and the VP64 transactivator. Together, these two dCas9 fusion proteins exhibited a synergistic effect on the reactivation of the inactive allele to levels above 60% of the active allele (
In particular, defects in epigenetic modification of ion channels in the nervous system are linked to Rett syndrome (RTT) and cyclin-dependent kinase-like 5 (CDKL5) deficiency disorder (CDD). RTT and CDKL5 deficiency disorder are two X-linked developmental brain disorders with overlapping but distinct phenotypic features. Mutations in the X-linked gene encoding methyl-CpG-binding protein 2 (MECP2) account for 90-95% of the cases of classic Rett syndrome, and mutations in the X-linked gene encoding CDKL5 account for some cases of atypical RTT that manifest with early refractory epilepsy.
The neurodevelopmental disorder CDKL5 deficiency is caused by de novo mutations in the CDKL5 gene on the X chromosome (30). Due to random XCI, females affected by the disorder form a mosaic of tissue with cells expressing either the mutant or wild type allele (31). A potential therapeutic approach might be to activate the silenced wild type CDKL5 allele in cells expressing the loss-of-function mutant allele. Applicants synthetically induced escape of CDKL5 from the inactive X chromosome in the neuronal-like cell line SH-SY5Y via DNA methylation editing of the CDKL5 promoter using a dCas9-TET1 fusion protein for targeted DNA demethylation. This artificial system, i.e., synthetic induction of CDKL5 escape from XCI, resulted in a significant increase in allele-specific expression of the inactive CDKL5 allele and correlated with a significant reduction in methylated CpG dinucleotides in the CGI core promoter.
The present disclosure demonstrates that dCas9-TET1 has a synergistic effect with dCas9-VP64, thereby further increasing transcript levels from the inactive allele. The disclosure also describes whole-transcriptomic and genome-wide methylation data that illustrate the specificity of the novel artificial system for one target gene (CDKL5). As such, the disclosure demonstrates that loss of DNA methylation is crucial for inducing escape from the inactive X chromosome, and illustrates a novel therapeutic avenue for treating subjects suffering from X-linked disorders generally.
In one aspect, the present disclosure provides a method for increasing CDKL5 gene expression in a cell or subject in need thereof comprising or consisting essentially of or consisting of administering to the cell or subject a gene editing system comprising, or alternatively consisting essentially of, or yet further consisting of: (i) a first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein; and (ii) a second nucleotide molecule encoding at least one small guide RNA (sgRNA). In some embodiments, the second nucleotide molecule encodes at least one small guide RNA (sgRNA) that comprises a scaffold region and a spacer region. In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence adjacent to a 5′-end of a protospacer adjacent motif (PAM). In some embodiments, the target sequence and the PAM are located at least 1 kilobase (kb) from the transcriptional start site (TSS) of the CDKL5 gene.
In one aspect, the present disclosure provides a method for increasing CDKL5 gene expression in a cell or subject in need thereof comprising or consisting essentially of or consisting of administering to the cell or subject a gene editing system further comprising a third nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator. In some embodiments, the transcriptional activator comprises VP64 or a biologically active fragment thereof. In some embodiments, VP64 is the activation domain VP16. In some embodiments, VP64 is a recombinant tetrameric repeat comprising or consisting essentially of or consisting of the minimal activation domain of VP16. In some embodiments, the activation domain of VP16 comprises or consists essentially of or consists of amino acids 413-489 of the VP16 protein. In other embodiments, the recombinant tetrameric repeat of VP16's minimal activation domain comprises or consists essentially of or consists of the amino acid sequence DALDDFDLDML
In some embodiments, the method for increasing CDKL5 gene expression in a cell or subject in need thereof comprises or consists essentially of or consists of administering to the cell or subject a gene editing system further comprising a sgRNA. In some embodiments, the gene editing system comprises at least about 1-10, 1-5, 1-6, 1-3, 3-6, or 4-6 sgRNAs. In some embodiments, the gene editing system comprises or consists essentially of or consists of a sgRNA selected from the group of AGAGCATCGGACCGAAGCGG, GGGGGAGAACATACTCGGGG, CCCAGGTTGCTAGGGCTTGG, ATCGCCTGAAACTTGTCCGG, CGAAAGGGTGTGAAAGAGGG, or TGGGGAAGGTAAAGCGGCGA. In some embodiments, the gene editing system comprises or consists essentially of or consists of a sgRNA with a sequence set forth as AGAGCATCGGACCGAAGC. In some embodiments, the gene editing system comprises or consists essentially of or consists of a sgRNA with a sequence set forth as GGGGGAGAACATACTCGGGG. In some embodiments, the gene editing system comprises a sgRNA with a sequence set forth as CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) comprises at least three sgRNAs. In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) comprises a first sgRNA, a second sgRNA, and a third sgRNA. In some embodiments, the target sequence for the first sgRNA comprises or consists essentially of or consists of AGAGCATCGGACCGAAGCGG. In some embodiments, the target sequence for the second sgRNA comprises or consists essentially of or consists of GGGGGAGAACATACTCGGGG. In some embodiments, the target sequence for the third sgRNA comprises or consists essentially of or consists of CCCAGGTTGCTAGGGCTTGG.
In some embodiments, the target sequence for the first sgRNA comprises or consists essentially of or consists of AGAGCATCGGACCGAAGCGG, the target sequence for the second sgRNA comprises or consists essentially of or consists of GGGGGAGAACATACTCGGGG, and the target sequence for the third sgRNA comprises or consists essentially of or consists of CCCAGGTTGCTAGGGCTTGG.
In one aspect, the present disclosure provides a method for increasing CDKL5 gene expression in a cell or a subject in need thereof comprising administering to the cell or subject a pharmaceutical composition of the present disclosure. In some embodiments, administering to a subject a gene editing system or the pharmaceutical composition of the present invention reduces DNA methylation in a CDKL5 promoter region of the subject. In some embodiments, the CDKL5 promoter region is located on a silenced X-chromosomal allele of the subject. In some embodiments, the subject in need for increasing CDKL5 gene expression has been diagnosed with CDKL5 deficiency disorder (CDD). In some embodiments, the subject is a mammal or mammalian cell. In some embodiments, the mammal is a non-human fetus, an infant, a juvenile, or an adult.
In some embodiments, the system or pharmaceutical composition is administered to the subject by one or more of: an intravenous route, a subcutaneous route, an intramuscular route, an intradermal route, an intranasal route, an oral route, an intracranial route, an intrathecal route, an ocular route, an otic route, a rectal route, a vaginal route, an optic route, or an intraperitoneal route.
In one aspect, the present disclosure provides a method for treating or preventing CDD in a cell or subject in need thereof comprising administering to the cell or subject a gene editing system or the pharmaceutical composition of the present invention. In some embodiments, administering to a subject a gene editing system or the pharmaceutical composition of the present invention reduces DNA methylation in a CDKL5 promoter region of the subject. In some embodiments, the CDKL5 promoter region is located on a silenced X-chromosomal allele of the subject. In some embodiments, the subject is a mammal. In some embodiments, the mammal is a non-human fetus, an infant, a juvenile, or an adult. In some embodiments, the system or pharmaceutical composition is administered to the subject by one or more of: an intravenous route, a subcutaneous route, an intramuscular route, an intradermal route, an intranasal route, an oral route, an intracranial route, an intrathecal route, an ocular route, an otic route, a rectal route, a vaginal route, an optic route, or an intraperitoneal route.
Kits
In one aspect, the present invention provides a kit comprising or consisting essentially of, or yet further consisting of any one or more of the gene editing system, the vector, the host cell or the compositions and an optional instruction for use in activating a silenced X-chromosomal allele in a subject in need thereof. In some embodiments, the kit is used for increasing CDKL5 gene expression in a subject in need thereof. In some embodiments, the kit is used for treating or preventing CDD in a subject in need thereof. In some embodiments, the kit comprises the gene editing system of the present invention and optional instructions for use as described herein.
The following examples are provided to illustrate but not limit the aspects of this disclosure.
The examples herein are provided to illustrate advantages of the present technology and to further assist a person of ordinary skill in the art with preparing and/or using the compounds of the present technology. The examples herein are also presented in order to more fully illustrate the preferred aspects of the present technology. The examples should in no way be construed as limiting the scope of the present technology. The examples can include or incorporate any of the variations, aspects, or embodiments of the present technology described above. The variations, aspects, or embodiments described above may also further each include or incorporate the variations of any or all other variations, aspects or embodiments of the present technology.
Cloning of sgRNAs. For the cloning of CDKL5 sgRNAs, 20-bp spacer sequences were selected within ±1 kb of the CDKL5 TSS (chrX:18,443,725, hg19) using the online tool CHOPCHOP (Montague et al., Nucleic Acids Res. 42:W401-7 (2014)). For transient transfection experiments, sgRNAs were cloned into a sgRNA expression vector (Addgene plasmid #73797) following a previously published protocol (Mali et al., Science 339:823-826 (2013)). For transductions, sgRNAs were cloned into a lentiviral expression vector (Addgene plasmid #73797) as previously described (Joung et al., Nat. Protoc. 12:828-863 (2017)). Spacer sequences used to create target-specific sgRNA expression vectors are listed in Table 1. All constructs were confirmed by Sanger sequencing (Genewiz, Inc, South Plainfield, N.J., USA) and chromatograms were analysed using SnapGene software (from GSL Biotech; available at snapgene.com).
Transient transfection experiments. U87MG (ATCC, Manassas, Va.) and Lenti-X 293T (Takara Bio USA, Inc., Mountain View, Calif.) were grown in media containing high-glucose DMEM supplemented with 1% L-glutamine (Thermo Fisher Scientific, Waltham, Mass.) and 10% HyClone heat-inactivated FBS (Thermo Fisher Scientific). BE(2)C (ATCC) cells were grown in DME/F12 (Thermo Fisher Scientific) supplemented with 1% L-glutamine and 10% HyClone heat-inactivated FBS. For gene expression modulation experiments, cells per well were grown to 80% confluency and transfected within 24 hours of plating using Lipofectamine 3000 (Life Technologies) following the manufacturer's instructions with 3 μl of Lipofectamine 3000 reagent diluted in 500 μl Opti-MEM reduced serum media (Thermo Fisher Scientific). Transfections were performed in 12-well plates using either a mock-treatment (diluted transfection reagent) or 700 ng dCas9 expression vector (Fuw-dCas9-Tet1CD-P2A-BFP, Addgene plasmid #108245; Fuw-dCas9-Tet1CD_IM, Addgene plasmid #84479; pLV hUbC-dCas9-T2A-GFP, Addgene plasmid #53191; pLV hUbC-dCas9 VP64-T2A-GFP, Addgene plasmid #53192) and 300 ng of equimolar pooled sgRNA expression vectors. Transfection medium was replaced 24 hours post-transfection with complete growth medium.
48 hours post-transfection, cells were rinsed in 1×DPBS (Thermo) and lysed in the well using TriZol (Ambion, Austin, Tex.). Total RNA was extracted using the Direct-zol RNA Miniprep kit (Zymo Research, Irvine, Calif.) and 500 ng RNA was reverse transcribed using RevertAid First Strand cDNA Synthesis Kit according to the manufacturer's instructions using random hexamer primers. Real-time PCR was performed in triplicate with 20 ng of cDNA per reaction and PowerUp SYBR Green Master Mix (Thermo Fisher Scientific) using the StepOne Plus Real Time PCR system (Thermo Fisher Scientific) and the StepOne Plus software was used to extract raw CT values. Gene expression analysis was performed with GAPDH as a reference gene in three biological replicates using exon-spanning primers for CDKL5 and GAPDH. All primer oligonucleotides used in this study are listed in Supplementary Table 1. Fold change of gene expression was calculated as the delta delta CT between GAPDH and CDKL5 transcript levels normalized to Mock-treated relative CDKL5 transcript levels as the reference.
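The fold-change calculation described above can be illustrated with a minimal sketch. It assumes the commonly used 2^(−ΔΔCT) form with approximately 100% amplification efficiency for both primer pairs; the function name and the CT values in the usage note are hypothetical and are not taken from the present disclosure.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_mock, ct_ref_mock):
    """Relative expression by the 2^(-ddCT) method.

    Assumption (not stated in the source): both primer pairs amplify
    with roughly 100% efficiency, so one cycle equals a 2-fold change.
    """
    # dCT: target gene (e.g. CDKL5) minus reference gene (e.g. GAPDH)
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_mock = ct_target_mock - ct_ref_mock
    # ddCT: treated sample relative to the mock-treated reference sample
    dd_ct = d_ct_treated - d_ct_mock
    return 2.0 ** (-dd_ct)


def mean_ct(replicates):
    """Average the technical replicate CT values of one reaction."""
    return sum(replicates) / len(replicates)
```

For example, with hypothetical mean CT values of 24 (CDKL5) and 18 (GAPDH) in treated cells and 25 and 18 in mock-treated cells, the sketch returns a fold change of 2.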
Integrative XCI status analysis of CDKL5. In order to determine the XCI status of CDKL5, publicly available data from GTEx (gtexportal.org) were used to determine sex-biased expression across 27 GTEx v6p tissues and blood dendritic cells from a female of Asian ancestry (24A) (16). Publicly available microarray data were also used to identify a single nucleotide polymorphism (SNP) in the CDKL5 gene of SH-SY5Y (Krishna et al., BMC Genomics 15:1154 (2014)). Genomic DNA was isolated from SH-SY5Y using the Quick-gDNA MiniPrep kit (Zymo Research). Total RNA was extracted using the Direct-zol RNA Miniprep kit (Zymo Research) and 500 ng RNA was reverse transcribed using RevertAid First Strand cDNA Synthesis Kit (Thermo Fisher Scientific). The presence of the coding SNPs rs34567810 in CDKL5 and rs1808 in the escape gene CA5B was confirmed via Sanger sequencing (Genewiz, Inc) of both genomic DNA and RNA. Chromatograms were analysed using SnapGene software (GSL Biotech).
Lentivirus production and purification. To produce lentiviral particles as described before (Pollock et al., Mol. Ther. 24:965-977 (2016)), a total of 50 million Lenti-X 293T cells were seeded into two T-225 flasks per viral packaging the day before transfection in high glucose DMEM supplemented with 10% fetal bovine serum and 1% L-glutamine. For each flask, 25 μg of dC, dCV, dCT or sgRNA expression vector, 5 μg of pMD2.G (envelope, Addgene plasmid #12259), and 25 μg of psPAX2 (gag/pol, Addgene plasmid #12260) were complexed with 140 μl of TransIT-293 (Mirus, Madison, Wis.) in Opti-MEM according to the manufacturer's recommendation. 48 hours after transfection, media was changed to 15 mL of UltraCULTURE medium (Lonza, Basel, Switzerland).
Vector supernatants were collected 72 hours post-transfection. Supernatant was initially centrifuged at 1500 rpm to clarify the media and then concentrated by centrifugation at 3,000 rpm using Centricon-Plus-70-Centrifugal-Filter-Units (MilliPoreSigma, Burlington, Mass.). Viral aliquots were stored at −80° C. Virus for the expression of dCas9 effectors was titered by transduction of Lenti-X 293T cells and analysed by flow cytometry for expression of GFP and BFP. All flow cytometry analyses were performed on the BD Fortessa at the UC Davis Flow Cytometry Shared Resource Core. Viral titers for the expression of sgRNAs were determined by using the qPCR lentivirus titration kit (Applied Biological Materials Inc., Richmond, BC). SH-SY5Y (ATCC) cells were grown in DME/F12 media containing 20% FBS and 1% L-glutamine. SH-SY5Y cells were seeded on 6-well plates at a density of 300,000 cells per well and co-transduced with a volume of dCas9 lentivirus equivalent to one Lenti-X 293T transducing unit and 5×10⁷ IU of each sgRNA expression vector in combination with 2.5 μg/ml protamine sulfate (Fresenius Kabi, Lake Zurich, Ill.). For cells co-transduced with dCas9-VP64 and dCas9-TET1CD, lentiviral volumes equivalent to 0.5 Lenti-X 293T transducing units each were used. Cells were sorted 5 days post-transduction at passage 11 for expression of GFP and/or BFP using the Influx cell sorter at the UC Davis Flow Cytometry Shared Resource Core (Sacramento, Calif.) and further expanded for 3-4 passages for subsequent analysis.
Targeted X-reactivation analysis. SH-SY5Y cells from each FACS-isolated treatment group and unsorted cells were seeded at a density of 300,000 cells per well in 6-well plates and allowed to grow until approximately 70% confluency. Cells were then rinsed in 1× DPBS and lysed in the well using TriZol (Ambion). Total RNA was extracted using the Direct-zol RNA Miniprep kit (Zymo Research) and 500 ng RNA was reverse transcribed using RevertAid First Strand cDNA Synthesis Kit (Thermo Fisher Scientific). For X-reactivation analysis, 100 ng of cDNA from stable SH-SY5Y lines was used for PCR amplification using Phusion High Fidelity Mastermix (New England Biolabs, Ipswich, Mass.). Each forward primer contained a unique 5-bp barcode sequence at the 5′ end for multiplexing (Supplementary Table 1). All amplicons were gel extracted and purified using the Zymo Gel DNA Recovery kit (Zymo Research) and pooled at equal concentrations for Illumina sequencing.
Amplicon sequencing was performed by the CCIB DNA Core Facility at Massachusetts General Hospital (Cambridge, Mass.). Forward and reverse reads of raw sequencing data were merged into a single long read using FLASH2 and barcodes were demultiplexed using FASTX at the beginning or end of the sequence read, allowing for a single mismatch each, yielding a mean read depth of >10,000 reads per sample. Processed FASTQ files were then analysed for frequency of reads containing the reactivated C allele for the coding SNP rs35478150 identified in exon 16 of the CDKL5 gene with the grep function over the total number of matched reads, yielding the reactivation frequency. Allele-specific RT-qPCR was performed as described above using a common forward primer and allele-specific reverse primers for the same coding SNP as analysed by amplicon sequencing (Table 1). Reactivation percentage was calculated as the percentage of relative Xi CDKL5 expression over relative Xa CDKL5 expression from mock-treated cells, normalized to GAPDH.
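By way of illustration only, the read-counting step and the reactivation percentage described above may be sketched as follows. The sketch replaces the grep-based count with an equivalent substring search, and it assumes that "matched reads" means reads carrying either allele of the SNP. The motif strings in the usage note are hypothetical placeholders and are not the actual sequence flanking the SNP.

```python
def reactivation_frequency(reads, xi_allele_motif, xa_allele_motif):
    """Fraction of allele-informative reads that carry the Xi allele.

    Each motif is a short sequence spanning the SNP with one allele at
    the variable position, so a read can contain at most one of them.
    """
    xi_reads = sum(1 for read in reads if xi_allele_motif in read)
    xa_reads = sum(1 for read in reads if xa_allele_motif in read)
    matched = xi_reads + xa_reads  # reads uninformative for the SNP are ignored
    if matched == 0:
        raise ValueError("no read matched either allele motif")
    return xi_reads / matched


def reactivation_percentage(rel_xi_expression, rel_xa_mock_expression):
    """Xi expression as a percentage of Xa expression in mock-treated
    cells, both already normalized to the reference gene."""
    return 100.0 * rel_xi_expression / rel_xa_mock_expression
```

For instance, calling reactivation_frequency with the hypothetical motifs "ACGTCAGGT" (reactivated allele) and "ACGTTAGGT" (active allele) on a set of merged reads returns the share of informative reads that carry the reactivated allele.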
Targeted DNA demethylation analysis. Genomic DNA from transduced and mock-treated cells was isolated using the Quick-gDNA MiniPrep kit (Zymo Research). Bisulfite conversion was performed using the EZ DNA Methylation-Lightning Kit (Zymo Research) following the manufacturer's instructions. Primers for bisulfite-sequencing PCR were designed using MethPrimer with default settings (Li and Dahiya, Bioinformatics 18:1427-1431 (2002)) and unique 5-bp barcode sequences were added at the 5′ end for multiplexing (Table 1). 100 ng of bisulfite converted DNA was used for PCR amplification with ZymoTaq polymerase (Zymo Research) and the 238-bp amplicon was purified with the QIAquick PCR Purification Kit (Qiagen, Hilden, Germany) and submitted for amplicon sequencing. Amplicon sequencing was performed by the CCIB DNA Core Facility at Massachusetts General Hospital (Cambridge, Mass.) and further processed as described above. Alignment of processed FASTQ files and read mapping to a 238 bp reference amplicon was performed using Bismark with default settings (Krueger et al., Bioinformatics 27:1571-1572 (2011)). Further analysis and methylation calling of sorted BAM files was performed using CGMapTools (Guo et al., Bioinformatics 34:381-387 (2018)).
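The principle behind methylation calling from bisulfite amplicon reads can be sketched in a few lines. This is not the Bismark or CGMapTools implementation; it is a simplified illustration that assumes every read is already aligned without gaps to the top strand of the reference amplicon and has the same length as the reference. After bisulfite conversion, a cytosine that remains C at a CpG position is counted as methylated and a T as unmethylated.

```python
def cpg_methylation(reference, reads):
    """Per-CpG methylation fraction from gapless, top-strand reads.

    Returns {position_of_C_in_CpG: fraction methylated}, or None for a
    CpG that no read covers with a C or a T.
    """
    cpg_positions = [i for i in range(len(reference) - 1)
                     if reference[i:i + 2] == "CG"]
    # Simplifying assumption: discard reads that do not span the amplicon.
    usable = [r for r in reads if len(r) == len(reference)]
    levels = {}
    for pos in cpg_positions:
        methylated = sum(1 for r in usable if r[pos] == "C")
        unmethylated = sum(1 for r in usable if r[pos] == "T")
        total = methylated + unmethylated
        levels[pos] = methylated / total if total else None
    return levels
```

Comparing the per-CpG fractions between treated and mock-treated samples then gives the change in methylation at each position of the amplicon.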
Chromatin immunoprecipitation (ChIP) and ChIP-qPCR. ChIP was performed as previously described (O'Geen et al., Epigenetics Chromatin 12:26 (2019)). Mock-treated and transduced cells were cross-linked 3-4 passages after FACS as described above in 1% formaldehyde for 10 min at room temperature and the reaction was stopped with 0.125 M glycine. Cross-linked cells were lysed with ChIP lysis buffer (5 mM PIPES pH8, 85 mM KCl, 1% Igepal) with a protease inhibitor (PI) cocktail (Roche). Nuclei were collected by centrifugation at 2000 rpm for 5 min at 4° C. and lysed in nuclei lysis buffer (50 mM Tris pH8, 10 mM EDTA, 1% SDS) supplemented with PI cocktail. Chromatin was fragmented using the Bioruptor Pico (Diagenode, Denville, N.J.) and diluted with 5 volumes RIPA buffer (50 mM Tris pH 7.6, 150 mM NaCl, 1 mM EDTA pH8, 1% Igepal, 0.25% deoxycholic acid).
ChIP enrichment was performed by incubation with 3 μg H3K27me3 antibody (ab6002, Abcam, Cambridge, UK) or 2 μg normal rabbit IgG (ab46540, Abcam) for 16 h at 4° C. Immune complexes were bound to 20 μl magnetic protein A/G beads (Biorad, Hercules, Calif.) for 2 h at 4° C. Beads were washed 2× with RIPA (Thermo Fisher Scientific) and 3× with ChIP wash buffer (100 mM Tris pH8, 500 mM LiCl, 1% deoxycholic acid). The final wash was performed in ChIP wash buffer with 150 mM NaCl. Cross-links were then reversed by heating beads in 100 μl ChIP elution buffer (50 mM NaHCO3, 1% SDS) overnight at 65° C., and DNA was purified using the QIAquick PCR Purification Kit (Qiagen, Hilden, Germany). ChIP-qPCR was performed with PowerUp SYBR Green Master Mix (Thermo Fisher Scientific) using the StepOne Plus Real Time PCR system (Thermo Fisher Scientific) and the StepOne Plus software was used to extract raw CT values. ChIP enrichment was calculated relative to input samples using the delta CT method.
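One common way to express ChIP enrichment relative to input by the delta CT method is the percent-input calculation sketched below. The source does not state the fraction of chromatin saved as input or the exact formula used, so the input_fraction parameter and the example values are assumptions made for illustration.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction):
    """ChIP signal as a percentage of input chromatin.

    input_fraction is the share of chromatin set aside as input
    (e.g. 0.01 for 1%); this value is an assumption, not from the source.
    """
    # Adjust the input CT to what 100% of the chromatin would have given.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    # Each cycle of difference corresponds to a 2-fold difference in template.
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)
```

With a hypothetical 1% input at CT 25 and an immunoprecipitated sample at CT 28, the sketch returns 0.125% of input. The same function applied to the IgG control gives the background level for comparison.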
Whole-genome methylation analysis by Infinium MethylationEPIC array. Whole genome methylation analysis was performed following (O'Geen et al., supra). Briefly, 300,000 cells for each treatment group were seeded in 6-well plates and allowed to grow to approximately 70% confluency. Genomic DNA from transduced and mock-treated cells in biological duplicates was isolated using the Quick-gDNA MiniPrep kit (Zymo Research) and 500 ng submitted for bisulfite conversion and Illumina's Infinium MethylationEPIC BeadChip array by the Vincent J. Coates Genomics Sequencing Laboratory (Berkeley, Calif.). The minfi package (Aryee et al., Bioinformatics 30:1363-1369 (2014); Fortin et al., Bioinformatics 33:558-560 (2017)) was used to extract two channel raw data (RGChannelSet) from the IDAT files at the probe level for all 850,000 probes. The RGChannelSet was used for background subtraction using preprocessNoob (Triche et al., Nucleic Acids Res. 41:e90 (2013)) followed by preprocessFunnorm (Fortin et al., Genome Biol. 15:503 (2014)) to normalize the samples. Beta values for each site (beta=M/(M+U), where M and U denote the methylated and unmethylated signals) were extracted from the GenomicRatioSet, which is the data organized by the CpG locus level mapped to the genome. The ChAMP package (Tian et al., Bioinformatics 33:3982-3984 (2017)) was used to filter probes using default settings with filterXY set to false. The limma function within ChAMP was then used (Smyth et al., Stat. Appl. Genet. Mol. Biol. 3, Article3 (2004), Wettenhall et al., Bioinformatics, 20:3705-3706 (2004)) to detect differentially methylated positions at default settings and merged the output file with the individual FunNorm beta values. In order to determine differentially methylated promoter regions, CpG sites were selected for cgi.feat. TSS200-island and TSS1500-island and a mean difference in beta value of ±0.05. 
Differentially methylated genes were defined as genes with at least 3 differentially methylated positions in the promoter. Venn diagrams were generated using bioinformatics.psb.ugent.be/.
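The beta-value definition and the gene-level criterion above can be illustrated with a short sketch. It interprets "a mean difference in beta value of ±0.05" as an absolute difference of at least 0.05, which is an assumption, and it omits the limma significance test and the probe annotation filter. The dictionary keys and gene names are hypothetical.

```python
from collections import Counter


def beta_value(methylated_signal, unmethylated_signal):
    """beta = M / (M + U), as defined in the text (no offset term)."""
    return methylated_signal / (methylated_signal + unmethylated_signal)


def differentially_methylated_genes(probes, min_abs_delta=0.05, min_positions=3):
    """Genes with at least min_positions promoter probes whose beta value
    differs between treated and control by at least min_abs_delta."""
    counts = Counter()
    for probe in probes:
        delta = probe["beta_treated"] - probe["beta_control"]
        if abs(delta) >= min_abs_delta:
            counts[probe["gene"]] += 1
    return sorted(gene for gene, n in counts.items() if n >= min_positions)
```

In use, each entry of probes would describe one promoter CpG probe with its gene annotation and its normalized beta values in the two groups being compared.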
RNA-Seq Library Preparation and Analysis. Global changes to transcription were assessed using RNA-Seq. Briefly, 300,000 cells for each treatment group were seeded in 6-well plates and allowed to grow to approximately 70% confluency. Cells were then rinsed in 1×DPBS and lysed in the well using TriZol (Ambion). Total RNA was extracted using the Direct-zol RNA Miniprep kit (Zymo Research). RNA was quantified with Nanodrop and 1 μg of RNA was used for each library. RNA libraries were generated using the NEBNext Ultra II RNA Library Prep kit (NEB) following the manufacturer's instructions. Libraries were multiplexed and pooled for a single lane of sequencing on a HiSeq4000. Sequencing reads were de-multiplexed and aligned to the Hg38 reference genome with STAR Universal Aligner version 2.5.3a using the following settings: Indexed Reference Genome: Ensembl reference genome and annotation files for Hg38 release 77 were downloaded and compiled into a single file. The genome was indexed using the following arguments: “STAR --runMode genomeGenerate --runThreadN 12 --genomeDir /STAR_INDEX_HG38 --genomeFastaFiles GRCh38_r77.all.fa --sjdbGTFfile Homo_sapiens.GRCh38.77.gtf --sjdbOverhang 149”; Sample Read Alignment: alignment of each sample's reads was performed with the following arguments: “STAR --runThreadN 24 --genomeDir /STAR_INDEX_HG38 --outFileNamePrefix /STAR/SampleName_ --outSAMtype BAM SortedByCoordinate --outWigType bedGraph --quantMode TranscriptomeSAM GeneCounts --readFilesCommand zcat --readFilesIn Sample-R1.fastq.gz Sample-R2.fastq.gz”. Differential Expression (DE) analysis was performed with DESeq2 (Love et al., Genome Biol. 15:550 (2014)) software in R Studio. First, gene count files were combined into a single file. Then, normalization and DE analysis were performed using a dCas9 control. DE gene lists from pairwise comparisons were exported into .csv files and utilized for GO term analysis using DAVID (david.ncifcrf.gov). Volcano plots were generated using ggplot2 software in R Studio.
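The step of combining per-sample gene count files into a single table can be sketched as below. The sketch assumes the tab-separated per-gene count output that STAR writes when run with --quantMode GeneCounts, in which summary rows begin with "N_" and the second column holds unstranded counts. The choice of the unstranded column is an assumption, since the source does not state library strandedness, and the sample and gene names are hypothetical.

```python
def parse_reads_per_gene(text, column=1):
    """Parse one per-gene count table into {gene_id: count}.

    column=1 selects the unstranded counts (an assumption); the summary
    rows such as N_unmapped and N_multimapping are skipped.
    """
    counts = {}
    for line in text.strip().splitlines():
        fields = line.split("\t")
        if fields[0].startswith("N_"):
            continue
        counts[fields[0]] = int(fields[column])
    return counts


def merge_counts(per_sample_text):
    """Combine {sample_name: file_text} into rows of a count matrix,
    with a header row followed by one row per gene."""
    samples = sorted(per_sample_text)
    parsed = {s: parse_reads_per_gene(per_sample_text[s]) for s in samples}
    genes = sorted(set().union(*(parsed[s].keys() for s in samples)))
    rows = [["gene"] + samples]
    for gene in genes:
        rows.append([gene] + [parsed[s].get(gene, 0) for s in samples])
    return rows
```

The resulting rows can be written to a delimited file and used as the count matrix for a differential expression package.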
Off-target analysis. Off-target analysis of CRISPR sgRNAs was performed using the Cas-OFFinder tool (www.rgenome.net/cas-offinder/) (Bae et al., Bioinformatics 30:1473-1475 (2014)). Briefly, 20 bp spacer sequences for the three top sgRNA candidates without PAM sequences were used as the query using hg38 as the reference genome for canonical SpCas9 PAM sites. The algorithm was executed allowing 3 or fewer mismatches and DNA and RNA bulge sizes of 1. In order to extend the list from off-target sites to potential off-target genes, genes in a ±5 kb window were included using the Table Browser function of the UCSC Genome Browser. The list of off-target genes was then overlapped with all differentially expressed genes from the three conditions as well as differentially methylated probes from the comparison of dCas9-TET1CD with dCas9 fused to catalytically inactive TET1.
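The window extension and overlap steps described above can be illustrated with a simple interval comparison. This sketch stands in for the Table Browser query and is not the tool's implementation; the coordinates, gene names, and function names are hypothetical.

```python
def genes_near_sites(sites, genes, flank=5000):
    """Names of genes overlapping a +/- flank window around each site.

    sites: iterable of (chromosome, position)
    genes: iterable of (name, chromosome, start, end)
    """
    hits = set()
    for chrom, pos in sites:
        window_start, window_end = pos - flank, pos + flank
        for name, gene_chrom, gene_start, gene_end in genes:
            # Two intervals overlap when each starts before the other ends.
            if (gene_chrom == chrom and gene_start <= window_end
                    and gene_end >= window_start):
                hits.add(name)
    return sorted(hits)


def overlap_with_de_genes(off_target_genes, de_genes):
    """Predicted off-target genes that are also differentially expressed."""
    return sorted(set(off_target_genes) & set(de_genes))
```

The same overlap function can be applied to a list of genes carrying differentially methylated probes in place of the differentially expressed gene list.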
Statistical analysis. Statistical analyses were performed in Prism 8 (GraphPad Software, San Diego, Calif.) and in R Studio 3.6.0. Statistics are presented as the mean ± SD. Targeted assessments were performed in biological triplicates. Genome-wide assessments were performed in triplicates unless otherwise noted. Between-group differences were analysed using a one-way analysis of variance (ANOVA). When appropriate, a Tukey's post hoc test was performed. Statistical differences between the means of two groups were determined using an independent samples t test. The p value cut-off for all targeted analyses was set at 0.05. Statistical analyses of differentially methylated sites were performed using the limma function embedded in ChAMP in R Studio 3.6.0. The null hypothesis was rejected for tests with FDR <5%. Statistical analyses of differentially expressed genes were performed using DESeq2 in R Studio 3.6.0. The null hypothesis was rejected for tests with FDR <1%.
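The FDR thresholds above imply a multiple-testing adjustment of the raw p values. The source does not name the procedure, so the sketch below assumes the Benjamini-Hochberg method, which is widely used for this purpose; the function names and example p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def significant(pvalues, fdr):
    """Indices of tests whose adjusted p value is below the FDR cut-off."""
    return [i for i, q in enumerate(benjamini_hochberg(pvalues)) if q < fdr]
```

Under this assumption, a cut-off of fdr=0.05 would correspond to the threshold given for differentially methylated sites and fdr=0.01 to the threshold given for differentially expressed genes.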
To investigate whether the CDKL5 gene is amenable to transcriptional reprogramming via dCas9 effector domains, U87MG cells were transiently co-transfected with dCas9 constructs and gRNA expression vectors. In particular, a dCas9-VP64 expression plasmid (dC-V) was used for the co-transfection. A plasmid expressing dCas9 without an effector domain was used as a control (dC). Six individual guide RNAs were designed to span DNase I hypersensitive sites and H3K4me3 peaks of the CDKL5 promoter within a ±1 kb window on either side of the CDKL5 transcriptional start site (
Due to the lack of informative allele-specific polymorphisms in the U87MG, BE2C, and HEK cell lines, bi-allelic mRNA activation in female SH-SY5Y cells was examined in order to assess whether the increase in gene expression was due to superactivation of the active CDKL5 allele, reactivation of the silenced allele, or a combination of both. Comparative analysis across several GTEx tissues demonstrated that CDKL5 did not display female-biased expression, which served as a proxy for X-chromosome inactivation (XCI) status when compared to the known escape gene CA5B (
Due to the importance of methylated CGI promoters in XCI, the role of a dCas9-TET1CD fusion protein for DNA methylation editing (dC-T) was investigated. In order to determine X chromosome reactivation efficiency, allele-specific activation facilitated by dC-V or dC-T was evaluated (
To determine reactivation of the silenced CDKL5 allele with high sensitivity, amplicon-based targeted RNA-sequencing was performed. Targeting of dC to CDKL5 was sufficient to significantly reactivate expression of the silenced allele by greater than 11-fold to 8% of total allelic reads compared to mock-treated cells (p<0.0001;
Due to the fact that the observed allelic reads via amplicon sequencing are a ratio of active versus silenced CDKL5 expression, allele-specific RT-qPCR was performed in order to compare reactivation levels from the inactive allele to the active allele baseline expression in SH-SY5Y (
For the expression of the active CDKL5 allele, no significant difference between dC and mock-treated cells was observed (
The status of XCI highly correlates to promoter CGI methylation. Due to the differences in targeted reactivation between effector domains, targeted bisulfite amplicon sequencing was performed in the CDKL5 core promoter region in order to identify the role of differential DNA methylation in X-reactivation between groups (
Genes that escape from XCI show a specific epigenetic signature, such as the depletion of the repressive histone mark H3K27me3. Therefore, an investigation as to whether targeted reactivation of the observed allele coincided with a remodelling of heterochromatin was conducted. ChIP-qPCR was used to test three different regions within a 1-kb fragment upstream of the transcriptional start site for changes in the H3K27me3 mark that have strong signal enrichment in brain tissue as determined by ENCODE and overlap the guide RNA target sites (
To determine on- and off-target effects of dC-T on the DNA methylome in stably transduced SH-SY5Y cells, the Illumina Infinium HumanMethylationEPIC (EPIC) array was used to interrogate 764,090 CpG sites genome-wide (Tables 2-3,
To evaluate the effect of targeting CDKL5 with dCas9 effector fusions on global gene expression, RNA-seq was performed in stably transduced SH-SY5Y. As shown in
Four genes containing heterozygous SNPs in the coding region within a ±2 Mb range of the CDKL5 target site were identified (MAP3K15, RAI2, NHS and BEND2, Tables 4-7). However, mean read counts for these genes were generally unchanged from the mock-treated group, although 3 out of 4 genes were lowly expressed or not expressed. Regardless, since the mean read counts for these genes were not significantly altered, the results showed that these X-chromosomal genes were not reactivated. In total, 274 differentially expressed (DE) genes in dC-V (100 up- and 174 downregulated genes), 84 DE genes in dC-T (29 up- and 55 downregulated genes) and 43 DE genes in dC-V+dC-T (13 up- and 30 downregulated genes) were identified. In general, a greater number of differentially downregulated genes in transduced cells was observed, which was attributed to off-target binding of the constructs, as both effector domains conferred transcriptional activation to direct targets, not repression.
Although CDKL5 sgRNA sequences were designed to target a unique site in the human genome, it was possible that the sgRNAs could tolerate mismatches leading to off-target binding. To address this issue, a search for potential off-target (OT) sites with up to 3 mismatches within the sgRNA sequences was conducted using Cas-OFFinder. Cas-OFFinder scans for both nucleotide mismatches and bulges in the sequence, thereby making it a comprehensive in silico prediction tool for OT analysis. To include OT sites that fell within intergenic regions, the search window was extended by ±5 kb from the predicted OT site to include neighboring transcripts, and a total of 30 predicted OT genes were identified (Tables 4-7).
The majority of OT sites required at least 2 mismatches, with sgRNA 2 only being permissive for OT sites with 3 mismatches in the sequence. Out of 30 OT genes, a single target, CNTNAP2, was identified that was downregulated in all three conditions (dC-V, dC-T and dC-V+dC-T). While the predicted OT site for CNTNAP2 falls within an intronic sequence of the gene, the possibility that the differential expression was a consequence of off-target binding of the dCas9 effector domain could not be excluded. Cells transduced with dC-V showed the highest number of unique differentially expressed transcripts (n=223), followed by dC-T (n=58) and dC-V+dC-T (n=10). As shown in
To assess whether the observed global changes in DNA methylation in cells transduced with dC-T were associated with altered transcript levels, the overlap between all 81 differentially hypomethylated genes in CGI promoter regions with greater than 3 DM positions was investigated. As shown in
A significant number of X-linked genes escape XCI and are expressed from the inactive X chromosome (16). Whether the epigenetic signature associated with these escapees is a cause or merely a consequence of expression from otherwise transcriptionally inert X-chromatin remains to be elucidated (17). The present disclosure demonstrates, for one such epigenetic barrier in a specific gene context, that removal of CGI methylation from the promoter of the X-chromosomal gene CDKL5 by directing a fusion of the catalytic domain of TET1 to dCas9 results in reactivation of gene expression in a targeted manner. In addition, employment of a strong transcriptional activator further increased the degree of escape in a synergistic fashion, resulting in expression from the inactive allele in excess of 60% of that from the active allele.
The present disclosure further demonstrates that programmable transcription using a transactivator achieved a moderate but significant CDKL5 upregulation across several cell lines. However, the effect of the VP64 transactivator was mainly due to superactivation of the already active allele, demonstrating that the epigenetic landscape of active X-chromatin presented a chromatin state more permissive for programmable transcription. Unexpectedly, the present invention identified that binding of dCas9 with no effector was capable of reactivating CDKL5 expression from the silent allele. This may be due to the large dCas9 protein serving as a pioneer factor when constitutively expressed and targeted to transcriptionally inactive X-chromatin, thereby causing limited gene reactivation on its own. In contrast to previous studies (52, 53), the present invention did not show any hindrance of dCas9 binding to regions largely embedded in CpG-dense hypermethylated CGI promoters. However, binding of an sgRNA outside of the methylated region on the inactive X chromosome could, at least in part, be causative for the observed effect. The limited but significant reactivation was associated with the loss of the repressive histone mark H3K27me3 in the core promoter of CDKL5. While the direct relationship between dCas9 binding and depletion of the histone mark is not well understood, it is possible that binding of dCas9 causes displacement of the nucleosome, resulting in the loss of H3K27me3 and enhanced chromatin accessibility (54, 55). In the present disclosure, H3K27me3 was assessed due to its role in XCI. However, future studies to investigate the effect of dCas9 on nucleosome rearrangement would require the assessment of multiple histone subunits. In line with previous findings (2), a spread of heterochromatin loss to the nearest neighboring gene was not observed, which may suggest a targeted effect of dCas9 binding.
Previous studies suggested that nucleosome occupancy strongly impeded binding of (d)Cas9 (56, 57). However, considering that the disclosed sgRNA design takes DNase hypersensitive sites into account, and considering the finding that the inactive X-allele is only approximately 1.2-fold more compact than the active allele (58), the present disclosure demonstrates that a promoter of a gene on the inactive X-chromatin is generally targetable by dCas9. In addition, the accessibility of CDKL5 can be further attributed to the location of the gene on a chromosomal segment that is part of a younger evolutionary stratum of the X chromosome (17). Indeed, the majority of facultative and constitutive escape genes are located on the short arm of the X chromosome (17). Therefore, the chromosomal location of CDKL5 might be favorable for inducing an artificial escape. The fusion of VP64 to dCas9 did not further increase the observed reactivation, further supporting a steric effect primarily attributed to the large size of dCas9 that is not augmented by the addition of a small transactivator. The failure of indirect recruitment of transcription factors by VP64 to produce higher reactivation levels may be due to the chromatin microenvironment, specifically the presence of DNA methylation as an epigenetic barrier that does not permit abundant transcription via VP64.
Changing the chromatin microenvironment via the introduction of TET1CD resulted in decreased DNA methylation of around 15% in the CDKL5 core promoter and significantly reactivated XCI-silenced CDKL5, thereby creating an artificial escape gene, as previously defined by expression levels of at least 10% of the active allele (17). Likely due to the depletion of 5-methylcytosine substrate in the promoter of the active allele, recruitment of TET1CD to this region did not result in superactivation of the allele on the active X chromosome. Due to a lack of polymorphisms in the CDKL5 promoter of SH-SY5Y cells, the working model system of the present disclosure did not allow for testing for allele-specific changes of the epigenetic signature. Rather, the working model was reliant on the assessment of total changes in DNA methylation in light of the fact that CGI methylation is highly correlative with the inactive X allele. Furthermore, a recent genome-wide assessment revealed global DNA hypomethylation of CGI promoters following TET1CD overexpression via lentiviral integration (29). However, the genome-wide assessment of promoter regions did not identify a strong correlation between reduced methylation of CpG sites and changes in transcription by RNA-seq. This is likely because the vast majority of genes identified contain only a single differentially methylated site, indicative of one CpG site, and because of the generally small effect size of the measured changes in DNA methylation. A change at a single CpG site in a promoter that typically contains multiple CpG sites likely would not be biologically significant, which would explain the lack of correlation with transcriptional activation.
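The escape-gene criterion referenced above (expression from the inactive allele of at least 10% of the active allele) can be expressed as a short Python sketch. The function names and the example expression values are hypothetical and are not measurements from the present disclosure.

```python
def inactive_to_active_ratio(inactive_expr: float, active_expr: float) -> float:
    """Return expression of the inactive allele relative to the active allele."""
    if active_expr <= 0:
        raise ValueError("active-allele expression must be positive")
    return inactive_expr / active_expr


def is_escape(inactive_expr: float, active_expr: float,
              threshold: float = 0.10) -> bool:
    """Classify a gene as an escape gene using the 10%-of-active-allele cutoff."""
    return inactive_to_active_ratio(inactive_expr, active_expr) >= threshold


# Hypothetical allele-specific expression values (arbitrary units).
print(is_escape(inactive_expr=12.0, active_expr=100.0))  # ratio 0.12
print(is_escape(inactive_expr=5.0, active_expr=100.0))   # ratio 0.05
```

Expressing the criterion as a ratio keeps it independent of the units used to quantify allele-specific expression.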
Since CGI promoters on the inactive X allele frequently show higher methylation levels than on the active X allele, targeted reduction of CpG methylation is directed to a single allele, unlike the case for autosomal genes. For example, in an autosomal setting, directed epigenetic editing may confer small changes to methylation levels of both alleles. These small changes do not necessarily translate to an additive effect on transcription if neither of the alleles reaches a threshold of biological significance. However, targeting a single X-chromosomal allele of a gene has the potential to concentrate the effects of epigenetic editing that would otherwise be divided over two alleles, increasing its potential to pass this arbitrary biological threshold. Thus, a decrease of DNA methylation on the inactive X chromosome can have a broader implication for regulation of gene expression. In future studies, it will be crucial to test whether a more transient delivery of TET1CD impacts the amount of observed methylation changes. While it was suggested that the effects of dCas9-TET1CD are specific (26), the present disclosure demonstrates global DNA methylome changes (29), likely attributable to the high abundance of methylated cytosine substrate available to constitutively expressed TET1CD. Similar genome-wide DNA methylation changes have been demonstrated with fusions of the DNA methyltransferase DNMT3A to dCas9 (59). This highlights the need to assess transient exposure of dCas9-effectors to the CDKL5 promoter in order to reduce potential off-target effects in future studies.
Due to the strong effect of VP64 on upregulating genes in an unmethylated chromatin context, a combination of TET1CD and VP64 targeted to CDKL5 via dCas9 was assessed. A synergistic effect between removal of DNA methylation and strong transcriptional activation, resulting in greater than 60% expression from the inactive allele, was observed. Since the employment of VP64 alone did not significantly increase reactivation levels, it is most likely that the introduction of dCas9-TET1CD causes a dynamic reprogramming in which methyl groups are removed from CpG dinucleotides, thus allowing for further binding of transcription factors to the inactive chromatin via indirect recruitment by VP64. The present disclosure supports a synergistic effect between TET1CD and transactivators, which has recently been supported by others (28, 29). In an alternative aspect, the effect of improved transcriptional activators, such as the VP64-p65-Rta tripartite fusion (60) or the SunTag system (61), can be harnessed to further potentiate the expression of XCI-silenced CDKL5 in combination with TET1CD.
Interestingly, dual expression of VP64 and TET1CD resulted in the fewest DE genes in the RNA-seq analysis. In silico analysis provided a predicted list of potential off-target genes arising from either base-pair mismatches or bulges in the gRNA. Only a single gene from the predicted off-target list, CNTNAP2, a gene implicated in autism-spectrum disorders (68), demonstrated differential expression in genome-wide transcriptomics. Novel methodologies have been proposed to alter the binding specificities of sgRNAs in order to reduce off-target binding, such as engineering a hairpin secondary structure onto the sgRNA spacer region (66), and will be explored in future studies.
Until recently, technical hurdles have hampered the assessment of the role of epigenetic heterogeneity in biological systems. One question that remains is whether the observed reactivation levels of CDKL5 are due to a limited or partial reactivation at the population-wide level or whether the observed effects are specific to a fully reactivated subgroup of cells. Recent evidence suggests that there are specific populations of cells that are more responsive to targeted effects, which will then drive the phenotype at the bulk level (28). It is possible that there are different kinds of responders to the epigenetic edits in the tested culture system. Future studies will need to address this mechanistic question, most likely at the single-cell level.
Reactivation strategies hold great promise for individuals suffering from X-linked disorders. In contrast to pharmacological inhibition of DNMT1, which requires mitosis, TET1CD might be a promising tool for demethylation in quiescent tissues that have traditionally been more difficult to target, such as the brain (27). In addition, superactivation by VP64 of the already active CDKL5 allele needs to be carefully assessed because Xp22 duplications containing the CDKL5 gene have been described as pathogenic variants (67). Interestingly, Applicant identified that epigenetic editing by dCas9-TET1 does not drive expression of an X-linked target gene to supraphysiological levels, further making this approach favorable in light of a dosage-sensitive gene.
The present technology is not to be limited in terms of the particular embodiments described in this application, which are intended as single illustrations of individual aspects of the present technology. Many modifications and variations of this present technology can be made without departing from its spirit and scope, as will be apparent to those skilled in the art. Functionally equivalent methods and apparatuses within the scope of the present technology, in addition to those enumerated herein, will be apparent to those skilled in the art from the foregoing descriptions. Such modifications and variations are intended to fall within the scope of the present technology. It is to be understood that this present technology is not limited to particular methods, reagents, compounds, compositions or biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting.
The inventions illustratively described herein may suitably be practiced in the absence of any element or elements, limitation or limitations, not specifically disclosed herein. Thus, for example, the terms “comprising,” “including,” “containing,” etc. shall be read expansively and without limitation. Additionally, the terms and expressions employed herein have been used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed.
Thus, it should be understood that the materials, methods, and examples provided here are representative of preferred embodiments, are exemplary, and are not intended as limitations on the scope of the invention.
The invention has been described broadly and generically herein. Each of the narrower species and sub-generic groupings falling within the generic disclosure also form part of the invention. This includes the generic description of the invention with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein.
In addition, where features or aspects of the disclosure are described in terms of Markush groups, those skilled in the art will recognize that the disclosure is also thereby described in terms of any individual member or subgroup of members of the Markush group.
As will be understood by one skilled in the art, for any and all purposes, particularly in terms of providing a written description, all ranges disclosed herein also encompass any and all possible subranges and combinations of subranges thereof. Any listed range can be easily recognized as sufficiently describing and enabling the same range being broken down into at least equal halves, thirds, quarters, fifths, tenths, etc. As a non-limiting example, each range discussed herein can be readily broken down into a lower third, middle third and upper third, etc. As will also be understood by one skilled in the art, all language such as “up to,” “at least,” “greater than,” “less than,” and the like includes the number recited and refers to ranges which can be subsequently broken down into subranges as discussed above. Finally, as will be understood by one skilled in the art, a range includes each individual member. Thus, for example, a group having 1-3 cells refers to groups having 1, 2, or 3 cells. Similarly, a group having 1-5 cells refers to groups having 1, 2, 3, 4, or 5 cells, and so forth.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
All patents, patent applications, provisional applications, and publications referred to or cited herein are incorporated by reference in their entirety, including all figures and tables, to the extent they are not inconsistent with the explicit teachings of this specification. In case of conflict, the present specification, including definitions, will control.
This application claims priority to U.S. Provisional Application No. 62/924,141, filed Oct. 21, 2019, and U.S. Provisional Application No. 62/925,731, filed Oct. 24, 2019, the entire contents of each of which are incorporated herein by reference.
This invention was made with government support under Grant No. P30 CA093373 awarded by the National Cancer Institute; and under Grant Nos. NCRR C06-RR12088, S10 OD018223, S10 RR12964, S10 RR026825, and 1S10OD010786-01 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2020/056726 | 10/21/2020 | WO | |

Number | Date | Country | |
---|---|---|---|
62924141 | Oct 2019 | US | |
62925731 | Oct 2019 | US | |