COMPOSITIONS AND METHODS FOR EDITING OF THE CDKL5 GENE

BACKGROUND

The following description of the background of the present technology is provided simply as an aid in understanding the present technology and is not admitted to describe or constitute prior art to the present technology. Throughout and within this disclosure technical and patent literature is referenced by an Arabic numeral or an identifying citation. The complete bibliographic citation for the literature referenced by an Arabic numeral can be found immediately preceding the claims.

Epigenetics is the study of mitotically and/or meiotically stable but reversible modifications to nucleotides or higher order chromatin structure that can alter expression patterns of genes in the absence of changes to the underlying DNA sequence (1). These modifications occur on multiple levels, such as 5-methyl-cytosine (5-meC) DNA methylation, post-translational modifications of histones bound by protein domains that serve as epigenetic writers, readers and erasers, and noncoding RNAs that assist in the recruitment of chromatin modifying proteins to DNA (2). These epigenetic layers dynamically dictate the three-dimensional organization of the genome within the nuclear ultrastructure and orchestrate local accessibility for the eukaryotic transcriptional machinery (3). Because of this, epigenetic signatures play a crucial role in dictating cellular identity during development and throughout life in response to the environment (1), correlate with aging (4) and are linked to disease (5), for instance, Rett syndrome (RTT) and CDKL5 deficiency disorder (CDD), two rare X-linked developmental brain disorders associated with epigenetic modification. The neurodevelopmental disorder CDKL5 deficiency is caused by de novo mutations in the CDKL5 gene on the X chromosome (30). Due to random X-chromosome inactivation (XCI), females affected by the disorder form a mosaic of tissue with cells expressing either the mutant or wild type allele (31). Phenotypic variation observed between females in families with RTT is also ascribed to differences in X-inactivation patterns.

Accordingly, there is a need to improve our understanding of XCI and reactivation of X-linked genes, and a need for targeted approaches that result in specific gene reactivation. Targeted DNA demethylation of genes on the X chromosome would allow for a directed assessment of the causal role between DNA methylation and gene expression on the inactive X chromosome. Furthermore, the presence of coding SNPs that exist in clonally-derived female cell lines provides an allele-specific model to study escape from XCI induced by targeted epigenetic remodelling. There is also a need for potential therapeutic approaches that activate a silenced wild type allele of a gene such as CDKL5 in cells expressing the loss-of-function mutant allele.

This disclosure satisfies these needs and provides related advantages as well.

SUMMARY OF THE DISCLOSURE

The process of XCI epigenetically regulates the amount of transcriptionally active X-chromatin in somatic tissue as a dosage compensation mechanism to ensure equal expression levels of X-linked genes in males and females (6). In female somatic cells, one X chromosome randomly becomes inactive and is cytologically manifested during interphase as a perinuclear heterochromatic Barr body, which is then clonally maintained through mitosis (7, 8). This mechanism is mediated by the long noncoding RNA X-inactive specific transcript (XIST) expressed from the inactive X chromosome in cis (9), which serves as a guiding factor to tether Polycomb proteins for gene silencing to target sites on the X-chromatin (10). XIST induces the formation of repressive heterochromatin through histone deacetylation (11), DNA methylation of CpG-island (CGI) promoters (12), di- and trimethylation of histone 3 at lysine 9 (H3K9me2/3) (13), the deposition and spreading of H3K27me3 across the inactive X-chromatin (14) and the H2A histone variant macroH2A (15).

Gene expression data suggest that an estimated 15-30% of human X-linked genes escape XCI (16) at an arbitrary transcriptional threshold of 10% of the active allele (17). The level of escape from XCI is variable between genes and individuals (16), demonstrates tissue heterogeneity (18) and increases with age (19). X-escapees have a distinct epigenetic signature from genes that are subject to XCI, including enrichment of active and depletion of repressive histone marks, and generally reduced levels of DNA methylation near regulatory elements (17). In particular, the degree of CGI promoter 5meC DNA methylation has been demonstrated to be highly correlative with XCI (12, 20).
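By way of non-limiting illustration only, the 10% transcriptional threshold described above can be sketched as a simple ratio test. The function name, the parameter names, and the choice to treat a ratio equal to the threshold as escape are assumptions made for this sketch and are not taken from the cited literature.

```python
def escapes_xci(xi_expression, xa_expression, threshold=0.10):
    """Return True when expression from the inactive X (Xi) allele reaches
    the given fraction of expression from the active X (Xa) allele.

    The default threshold of 0.10 mirrors the arbitrary 10% cut-off named
    in the text; the function itself is a hypothetical helper.
    """
    if xa_expression <= 0:
        raise ValueError("active-allele expression must be positive")
    return (xi_expression / xa_expression) >= threshold


# Example: Xi expression at 15% of Xa is classified as escape,
# whereas Xi expression at 5% of Xa is not.
print(escapes_xci(15.0, 100.0))
print(escapes_xci(5.0, 100.0))
```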

In line with the idea that DNA methylation forms an epigenetic barrier on the inactive X chromosome, the most potent X-reactivation to date has been achieved by treatment with 5-azacytidine, a global DNA hypomethylating agent, in combination with X-wide genetic ablation of XIST (21). In addition, pharmacological and genetic screens aiming to identify trans-acting factors promoting XCI have identified the maintenance DNA methyltransferase DNMT1 as a key player in XCI (22, 23). However, previous studies aiming to elucidate the mechanism of XCI-escape, such as the aforementioned small molecule approaches, utilized untargeted approaches. While these studies have provided a significant foundation of knowledge, in particular demonstrating the importance of DNA methylation in our understanding of X-reactivation, the global side-effects of these types of approaches limit the study of specific gene reactivation.

Until recently, the lack of targeted approaches by which epigenetics can be modified has limited the studies of XCI mechanisms. With the availability of the RNA-guided clustered regularly interspaced short palindromic repeats (CRISPR) system, catalytically inactive dCas9 fused to epigenetic effector domains has become the method of choice for targeted rewriting of the epigenome to further elucidate the causality between epigenetic marks and gene expression (24, 25). In particular, dCas9 fusions with the catalytic domain of ten-eleven translocation dioxygenase 1 (TET1) have gained prominence as a candidate to precisely demethylate gene promoters or enhancers for multiple gene targets (26-29). Synthetically inducing a gene to escape from XCI via DNA methylation editing of a gene promoter, using a dCas9 fusion protein for targeted DNA demethylation, has the potential to provide a much needed therapy for at least X-linked developmental brain disorders.

Building on these discoveries, Applicant provides the following aspects and disclosures.

In one aspect, the present disclosure provides a gene editing system comprising, or consisting essentially of or yet further consisting of: (i) a first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein, and (ii) a second nucleotide molecule encoding at least one single guide RNA (sgRNA), comprising, or consisting essentially of, or yet further consisting of a scaffold region and a spacer region; wherein the spacer region hybridizes to a nucleotide sequence complementary to a target sequence adjacent to a 5′-end of a protospacer adjacent motif (PAM); and wherein the target sequence and the PAM are located within about 1 kilobase (kb) of the transcriptional start site (TSS) of the cyclin dependent kinase-like 5 (CDKL5) gene.
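By way of non-limiting illustration only, the positional relationship recited above (a target sequence immediately 5′ of a PAM, with both located within about 1 kb of the TSS) can be sketched as a sequence scan. The sketch assumes a 20-nucleotide target sequence and an NGG PAM of the type used by Streptococcus pyogenes Cas9; neither assumption is a limitation stated above, and the function names and parameters are hypothetical.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_targets(seq, tss_index, window=1000, target_len=20):
    """List (strand, start, target) tuples for candidate target sites.

    A site is a target_len-nt sequence immediately 5' of an NGG PAM
    (assumed PAM). Only sites whose target and PAM both lie within
    `window` bases of `tss_index` are returned. `start` is the 0-based
    plus-strand coordinate of the leftmost base of target plus PAM.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % target_len)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # The lookahead lets overlapping candidate sites be reported.
        for m in pattern.finditer(s):
            start = m.start()
            end = start + target_len + 3  # exclusive end, PAM included
            if strand == "+":
                g_start, g_end = start, end
            else:
                # Map minus-strand coordinates back to the plus strand.
                g_start, g_end = len(seq) - end, len(seq) - start
            if g_start >= tss_index - window and g_end <= tss_index + window:
                hits.append((strand, g_start, m.group(1)))
    return hits


# Toy example: one of the target sequences named herein, placed next to
# an artificial TGG PAM in a short made-up flanking context.
toy = "TTTTT" + "AGAGCATCGGACCGAAGCGG" + "TGG" + "TTTTT"
for hit in find_targets(toy, tss_index=15):
    print(hit)
```

The toy sequence is artificial and does not represent the genomic context of the CDKL5 promoter.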

In some embodiments, the spacer region comprises, or consists essentially of, or yet further consists of a spacer sequence provided in Table 1.

In some embodiments, the gene editing system further comprises a third nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator.

In some embodiments, the at least one transcriptional activator fused to the dCas9 protein comprises, or consists essentially of, or consists of VP64 or a fragment thereof.

In some embodiments, the target sequence for the sgRNA comprises, or consists essentially of, or consists of one or more of AGAGCATCGGACCGAAGCGG, GGGGGAGAACATACTCGGGG, and/or CCCAGGTTGCTAGGGCTTGG.

In some embodiments, the at least one sgRNA comprises a first sgRNA, a second sgRNA, and a third sgRNA, wherein the target sequence for the first sgRNA comprises or consists essentially of, or yet further consists of AGAGCATCGGACCGAAGCGG, wherein the target sequence for the second sgRNA comprises or consists essentially of, or yet further consists of GGGGGAGAACATACTCGGGG, and wherein the target sequence for the third sgRNA comprises or consists essentially of, or yet further consists of CCCAGGTTGCTAGGGCTTGG.

In some embodiments, the first nucleotide molecule, the second nucleotide molecule, and the third nucleotide molecule are integrated into one or more viral or plasmid vectors.

In some embodiments, the viral vector is selected from the group of a lentiviral vector, an adeno-associated viral (AAV) vector, or an adenoviral vector.

In one aspect, the disclosure provides a kit comprising the system as described herein and optional instructions for use in the methods as described herein.

In one aspect, the disclosure provides a host cell comprising the gene editing system.

In some embodiments, the host cell comprises a prokaryotic or a eukaryotic cell.

In some embodiments, the host cell comprises a mammalian or a human cell. In another aspect, the mammalian or host cell is a stem cell or progenitor cell, e.g., an iPSC, an embryonic stem cell or a stem cell with the capacity to differentiate into a specific lineage, e.g., a neuronal lineage.

In some embodiments, the host cell as described herein has reduced CDKL5 gene expression and/or reduced DNA methylation in the CDKL5 promoter region.

In some embodiments, the host cell is a cultured cell or a primary cell.

In some embodiments, the host cell further comprises a therapeutic molecule.

In one aspect, the disclosure provides a pharmaceutical composition comprising the gene editing system, the vectors or the host cell comprising the gene editing system.

In some embodiments, the pharmaceutical composition comprises a carrier.

In some embodiments, the pharmaceutical composition comprises a pharmaceutically acceptable carrier or excipient.

In one aspect, the disclosure provides a method for increasing CDKL5 gene expression in a cell or subject in need thereof comprising, or consisting essentially of, or yet further consisting of administering to the subject the gene editing system or the pharmaceutical composition comprising, or consisting essentially of, or yet further consisting of the gene editing system.

In some embodiments, a CDKL5 promoter region of the subject is methylated or hypermethylated, in one aspect as compared to a non-silenced X-chromosome.

In some embodiments, the CDKL5 promoter region is located on a silenced X-chromosomal allele of the subject.

In some embodiments, the subject has been diagnosed with CDKL5 deficiency disorder (CDD).

In some embodiments, a cell is isolated from a subject having been diagnosed with CDD.

In some embodiments, the cell is a neuronal cell.

In some embodiments, the gene editing system or the pharmaceutical composition is administered to the subject by one or more of: an intravenous route, a subcutaneous route, an intramuscular route, an intradermal route, an intranasal route, an oral route, an intracranial route, an intrathecal route, an ocular route, an otic route, a rectal route, a vaginal route, an optic route, or an intraperitoneal route.

In some embodiments, the subject to be treated is a mammal.

In some embodiments, the mammal is a non-human fetus, an infant, a juvenile, or an adult.

In some embodiments, a biological sample from the subject is analyzed for CDKL5 gene expression, prior to and/or after treatment.

In some embodiments, CDKL5 gene expression is analyzed by quantitative PCR using exon-spanning primers for CDKL5 and for the reference gene GAPDH. Exemplary primer oligonucleotides for analyzing CDKL5 gene expression are provided in Table 1.
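By way of non-limiting illustration only, a relative expression value can be derived from quantitative PCR threshold cycle (Ct) values for CDKL5 and the GAPDH reference gene as sketched below. The sketch assumes the widely used comparative Ct (2^-ΔΔCt) calculation with approximately 100% amplification efficiency for both primer pairs; the disclosure above does not specify the calculation, and the function and parameter names are hypothetical.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct (2^-ddCt) calculation.

    Assumes both primer pairs amplify with about 100% efficiency.
    'target' would be CDKL5 and 'ref' would be GAPDH in this context.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Made-up Ct values: the target crosses threshold two cycles earlier in
# treated cells while the reference is unchanged.
print(fold_change(24.0, 18.0, 26.0, 18.0))
```

The Ct values shown are invented for the example and are not experimental data.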

In one aspect, the disclosure provides a method for treating or preventing CDD in a subject in need thereof comprising administering to the subject the gene editing system or the pharmaceutical composition comprising the gene editing system. In one aspect, a biological sample is analyzed for CDKL5 gene expression prior to or after treatment.

In some embodiments, DNA methylation in a CDKL5 promoter region of the subject is reduced, in one aspect, as compared to a wild-type gene.

In some embodiments, the CDKL5 promoter region is located on a silenced X-chromosomal allele of the subject.

In some embodiments, the gene editing system or pharmaceutical composition is administered to the subject by one or more of: an intravenous route, a subcutaneous route, an intramuscular route, an intradermal route, an intranasal route, an oral route, an intracranial route, an intrathecal route, an ocular route, an otic route, a rectal route, a vaginal route, an optic route, or an intraperitoneal route.

In some embodiments, the subject is a mammal. In some embodiments, the mammal is a non-human fetus, an infant, a juvenile, or an adult.

In some embodiments, genomic DNA isolated from the subject is analyzed for targeted DNA methylation.

In some embodiments, targeted DNA methylation is analyzed by bisulfite-sequencing PCR. Exemplary primers for bisulfite-sequencing PCR are provided in Table 1.
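By way of non-limiting illustration only, per-CpG methylation levels can be summarized from bisulfite-sequencing read counts as sketched below. The sketch relies on the general principle that unmethylated cytosines are read as thymine after bisulfite conversion while methylated cytosines are read as cytosine; the function names and the input layout are hypothetical.

```python
def methylation_fraction(c_reads, t_reads):
    """Fraction of reads supporting a methylated cytosine at one CpG.

    After bisulfite conversion, reads showing C indicate methylation and
    reads showing T indicate no methylation. Returns None when the CpG
    has no coverage.
    """
    total = c_reads + t_reads
    if total == 0:
        return None
    return c_reads / total


def mean_methylation(counts):
    """Mean methylation over CpG positions given (c_reads, t_reads) pairs.

    Positions without coverage are skipped.
    """
    fractions = [
        f for f in (methylation_fraction(c, t) for c, t in counts)
        if f is not None
    ]
    if not fractions:
        raise ValueError("no CpG position has read coverage")
    return sum(fractions) / len(fractions)


# Made-up read counts for three CpG positions, one without coverage.
print(mean_methylation([(50, 50), (100, 0), (0, 0)]))
```

The read counts shown are invented for the example and are not experimental data.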

In one aspect, the disclosure provides a vector encoding a sgRNA, wherein the sgRNA comprises a scaffold region and a spacer region, wherein the spacer region hybridizes to a nucleotide sequence complementary to a target sequence comprising, or consisting essentially of, or yet further consisting of one or more of AGAGCATCGGACCGAAGCGG, and/or GGGGGAGAACATACTCGGGG, and/or CCCAGGTTGCTAGGGCTTGG.

In some embodiments, the spacer region comprises a spacer sequence provided in Table 1.

In some embodiments, the vector encodes a first sgRNA and a second sgRNA; wherein the first sgRNA and the second sgRNA each comprise (a) a scaffold region and (b) a spacer region that hybridizes to a nucleotide sequence complementary to a target sequence; and wherein: (i) the target sequence of the first sgRNA comprises or consists essentially of, or yet further consists of AGAGCATCGGACCGAAGCGG, and the target sequence of the second sgRNA comprises or consists essentially of, or yet further consists of GGGGGAGAACATACTCGGGG; (ii) the target sequence of the first sgRNA comprises or consists essentially of, or yet further consists of AGAGCATCGGACCGAAGCGG, and the target sequence of the second sgRNA comprises or consists essentially of, or yet further consists of CCCAGGTTGCTAGGGCTTGG; or (iii) the target sequence of the first sgRNA comprises or consists essentially of, or yet further consists of GGGGGAGAACATACTCGGGG, and the target sequence of the second sgRNA comprises or consists essentially of, or yet further consists of CCCAGGTTGCTAGGGCTTGG.

In some embodiments, the vector encodes a first sgRNA, a second sgRNA, and a third sgRNA, wherein the first sgRNA, the second sgRNA, and the third sgRNA each comprise (a) a scaffold region and (b) a spacer region that hybridizes to a nucleotide sequence complementary to a target sequence, wherein the target sequence of the first sgRNA comprises or consists essentially of, or yet further consists of AGAGCATCGGACCGAAGCGG, wherein the target sequence of the second sgRNA comprises or consists essentially of, or yet further consists of GGGGGAGAACATACTCGGGG, and wherein the target sequence of the third sgRNA comprises or consists essentially of, or yet further consists of CCCAGGTTGCTAGGGCTTGG.

In some embodiments, the vector further comprises a nucleotide molecule encoding a dCas9-TET1CD fusion protein.

In some embodiments, the vector further comprises a nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator.

In some embodiments, the vector further comprises a first nucleotide molecule encoding a dCas9-TET1CD fusion protein and a second nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator.

In some embodiments, the transcriptional activator fused to the dCas9 protein comprises VP64 or a fragment thereof. In some embodiments, the vector is a viral vector or a plasmid vector.

In some embodiments, the viral vector is a lentiviral vector, an AAV vector, or an adenoviral vector.

In one aspect, the disclosure provides a host cell comprising the vector.

In one aspect, the disclosure provides a pharmaceutical composition comprising the vector or the host cell comprising the vector.

In some embodiments, the pharmaceutical composition comprises a carrier.

In some embodiments, the pharmaceutical composition comprises a pharmaceutically acceptable carrier or excipient.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-G show the programmable transcription of the CDKL5 gene.

FIG. 1A shows a schematic illustrating the University of California Santa Cruz (UCSC) genome browser snapshot of the target sites of six sgRNAs directed against the CDKL5 promoter on the X-chromosome (Xp22.13). FIG. 1A further shows DNase hypersensitive sites and H3K4me3 marks derived from ENCODE, which are often found near promoters. Sense sgRNAs are 2 and 6, and antisense sgRNAs are 1, 3, 4, and 5.

FIG. 1B shows a bar graph illustrating CDKL5 mRNA fold change relative to mock-treated cells in U87MG cells determined by RT-qPCR resulting from programmable transcription using a dCas9-no effector (dC) or dCas9-VP64 (dC-V) in combination with different pools of three to six sgRNAs targeted to the CDKL5 promoter 48 hours after transient transfection. #Significantly different from dCas9 sgRNAs 1-3, n=3 independent experiments, Tukey's HSD, p<0.05.

FIG. 1C shows a bar graph illustrating CDKL5 mRNA fold change relative to mock-treated cells in BE2C cells determined by RT-qPCR resulting from programmable transcription using dCas9-no effector or dCas9-VP64 co-expressed with sgRNAs 1-3 48 hours after transient transfection.

FIG. 1D shows a bar graph illustrating CDKL5 mRNA fold change relative to mock-treated cells in Lenti-X 293T determined by RT-qPCR resulting from programmable transcription using dCas9-no effector or dCas9-VP64 co-expressed with sgRNAs 1-3 48 hours after transient transfection. #Significantly different from dCas9 sgRNAs 1-3, n=3 independent experiments, Student t-test p<0.05.

FIG. 1E shows a bar graph illustrating male-female expression differences in CDKL5 compared to a known X chromosome inactivation (XCI) escape gene, CA5B, across 27 GTEx tissues.

FIG. 1F shows a bar graph illustrating the analysis of XCI status of CDKL5 compared to genes showing variable expression from the inactive X-chromosome using scRNA-seq from previously published data. #Significantly different from CA5B, p<0.05.

FIG. 1G shows Sanger sequencing of genomic DNA and cDNA from SH-SY5Y cells illustrating that CDKL5 showed mono-allelic expression of a SNP, in contrast to an escape gene, CA5B, which showed expression from the escape allele.

FIGS. 2A-E show targeted reactivation of CDKL5 from the inactive X allele.

FIG. 2A shows a schematic illustrating the targeted reactivation of CDKL5 on the X-chromosome using a coding SNP in the CDKL5 gene.

FIG. 2B shows graphs illustrating a flow sort of cells purified to stably express dCas9 or dCas9-VP64 fused to a GFP via a T2A peptide or dCas9-TET1CD-P2A-BFP.

FIG. 2C shows a bar graph illustrating allele specific read counts for the mRNA expression of the active (Xa) or inactive (Xi) CDKL5 allele of mock-treated SH-SY5Y or after constitutive expression of dCas9 effector domains dCas9 (dC), dCas9-VP64 (dC-V), dCas9-TET1CD (dC-T) or a combination of dCas9-VP64 and dCas9-TET1CD (dC-V+dC-T) and sgRNAs 1-3 after 21 days post-transduction. #Significantly different from mock-treated, ‡significantly different from dCas9, n=3 independent experiments, Tukey's HSD, p<0.05.

FIG. 2D shows a bar graph illustrating the relative Xi CDKL5 mRNA expression of mock-treated or stably transduced SH-SY5Y relative to CDKL5 Xa mRNA expression of mock-treated cells as determined by allele-specific RT-qPCR after 21 days post-transduction. #Significantly different from dC, ‡significantly different from dC-V, †significantly different from dC-T, n=3 independent experiments, Tukey's HSD, p<0.05.

FIG. 2E shows a bar graph illustrating the relative Xa CDKL5 mRNA expression in mock-treated and stably transduced SH-SY5Y cells determined by allele-specific RT-qPCR after 21 days post-transduction. #Significantly different from mock-treated, †significantly different from dCas9, n=3 independent experiments, Tukey's HSD, all p<0.05.

FIGS. 3A-G show that dCas9-TET1CD caused removal of DNA methylation from the CDKL5 CGI promoter.

FIG. 3A shows a schematic illustrating a UCSC genome browser snapshot of the target sites of sgRNAs 1-3 directed against the CDKL5 promoter on Xp22.13 and a large CpG Island (>1 kb) spanning the transcriptional start site of CDKL5. The black box represents a >200 bp region assessed for targeted DNA methylation changes containing 24 individual CpG dinucleotides (drawn to scale).

FIG. 3B shows a scatter plot illustrating 5-methylcytosine levels in a CpG context (5meCG) over total CpG context as assessed by targeted bisulfite sequencing across 11 CpG dinucleotides in mock-treated cells or cells transduced to constitutively express dCas9-no effector (dC) or dCas9 fused to either VP64 (dC-V) or TET1CD (dC-T), a combination thereof (dC-V+dC-T) or a catalytically inactive TET1CD (dC-dT). X-axis depicts the individual CpG position relative to the amplicon (not drawn to scale).

FIG. 3C shows a bar graph illustrating the mean 5-methylcytosine levels in a CpG context over all 11 CpG dinucleotides in all treatment groups. #Significantly different from mock-treated cells, ‡significantly different from dCas9, †significantly different from dC-dT, ¥significantly different from dC-T, n=3 independent experiments, Tukey's HSD, all p<0.05.

FIG. 3D shows a scatter plot illustrating 5-methylcytosine levels in a CpG context (5meCG) over total CpG context as assessed by targeted bisulfite sequencing across CpG dinucleotides 12-24 in mock-treated cells or cells transduced to constitutively express dCas9-no effector (dC) or dCas9 fused to either VP64 (dC-V) or TET1CD (dC-T), a combination thereof (dC-V+dC-T) or a catalytically inactive TET1CD (dC-dT). X-axis depicts the individual CpG position relative to the amplicon (not drawn to scale).

FIG. 3E shows a bar graph illustrating the mean 5-methylcytosine levels in a CpG context over all 12 CpG dinucleotides in all treatment groups from FIG. 3D, n=3 independent experiments.

FIG. 3F shows a scatter plot of the combination of data from FIG. 3B and FIG. 3D, illustrating 5-methylcytosine levels in a CpG context (5meCG) over total CpG context as assessed by targeted bisulfite sequencing across CpG dinucleotides 1-24 in mock-treated cells or cells transduced to constitutively express dCas9-no effector (dC) or dCas9 fused to either VP64 (dC-V) or TET1CD (dC-T), a combination thereof (dC-V+dC-T) or a catalytically inactive TET1CD (dC-dT). X-axis depicts the individual CpG position relative to the amplicon (not drawn to scale).

FIG. 3G shows a bar graph illustrating the mean 5-methylcytosine levels in a CpG context over all 24 CpG dinucleotides in all treatment groups from FIG. 3F. #Significantly different from mock-treated cells, ‡significantly different from dCas9, †significantly different from dC-dT, ¥significantly different from dC-T, n=3 independent experiments, Tukey's HSD, all p<0.05.

FIGS. 4A-F show the depletion of the XCI hallmark histone modification H3K27me3.

FIG. 4A shows a University of California Santa Cruz (UCSC) genome browser snapshot of the target sites of sgRNAs 1-3 directed against the CDKL5 promoter on Xp22.13 and H3K27me3 peaks derived from ENCODE. Black boxes show the regions assessed by ChIP-qPCR.

FIG. 4B shows a bar graph illustrating input-normalized H3K27me3 enrichment levels determined by ChIP-qPCR in region A of the CDKL5 promoter in mock-treated cells or cells transduced to constitutively express dCas9-no effector (dC) or dCas9 fused to either VP64 (dC-V) or TET1CD (dC-T).

FIG. 4C shows a bar graph illustrating input-normalized H3K27me3 enrichment levels determined by ChIP-qPCR in region B of the CDKL5 promoter.

FIG. 4D shows a bar graph illustrating input-normalized H3K27me3 enrichment levels determined by ChIP-qPCR in region C of the CDKL5 promoter.

FIG. 4E shows a bar graph illustrating input-normalized H3K27me3 enrichment levels determined by ChIP-qPCR in the promoter of the nearest neighboring gene, SCML2.

FIG. 4F shows a bar graph illustrating input-normalized H3K27me3 enrichment levels determined by ChIP-qPCR in the promoter of a distal gene, MECP2, that serves as a negative control. #Significantly different from mock-treated cells, n=3 independent experiments, p<0.05.
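By way of non-limiting illustration only, input normalization of ChIP-qPCR data of the kind shown in FIGS. 4B-F can be sketched with the commonly used percent-of-input calculation. The description above does not state which normalization formula was applied, so the formula, the assumption of approximately 100% amplification efficiency, and all names and values below are assumptions made for this sketch.

```python
def percent_input(ct_ip, ct_input, input_fraction):
    """Percent-of-input enrichment for one ChIP-qPCR amplicon.

    ct_ip          -- Ct of the immunoprecipitated sample
    ct_input       -- Ct of the input sample
    input_fraction -- fraction of chromatin saved as input (e.g. 0.01)

    Assumes about 100% amplification efficiency, so that one cycle
    corresponds to a two-fold difference in template.
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    return 100.0 * input_fraction * 2.0 ** (ct_input - ct_ip)


# Made-up values: 1% input, IP sample three cycles later than input.
print(percent_input(ct_ip=27.0, ct_input=24.0, input_fraction=0.01))
```

The Ct values shown are invented for the example and are not experimental data.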

FIGS. 5A-K show global DNA hypomethylation due to constitutive dCas9-TET1CD expression.

FIG. 5A shows a scatter plot in which 32 CpG positions from the 850K MethylationEPIC array across the CDKL5 promoter, shown with their respective locations on the X-chromosome (hg19), were used to assess gene-wide changes in DNA methylation levels, represented as changes in the beta value of the TSS200, TSS1500, 5′UTR and gene body of CDKL5. In particular, after transduction with dCas9-no effector (dC), dCas9-TET1CD (dC-T) or a catalytically inactive TET1CD (dC-dT), FIG. 5A shows reduced DNA methylation levels in the TSS1500 and TSS200 regions of cells transduced with dCas9-TET1CD. The line above TSS1500 indicates the sgRNA binding sites in the CDKL5 promoter. *Indicates significantly differentially methylated positions for further assessment.

FIG. 5B shows a bar graph illustrating side-by-side assessment of significantly differentially methylated positions in the CDKL5 promoter with a mean difference in beta value of <0.05. #Significantly different from dC, †significantly different from dC-dT, n=2 independent experiments, FDR <5%.

FIG. 5C shows a histogram illustrating the number of genes by the number of significantly hypomethylated sites in dCas9-TET1CD transduced cells when compared to dCas9 or a catalytically inactive TET1 fused to dCas9, demonstrating that the majority of genes show only a single probe falling within the respective promoter region.

FIG. 5D shows a bar graph illustrating side-by-side assessment of significantly differentially methylated positions in the COL9A3 promoter with a mean difference in beta value of <0.05. #Significantly different from dC, †significantly different from dC-dT, n=2 independent experiments, FDR <5%.

FIG. 5E shows a Venn diagram illustrating shared genes between dCas9-TET1CD and dCas9 or a catalytically inactive TET1CD mutant, and shows an overlap of 48 genes between the two groups.

FIG. 5F shows a flow chart diagram representing the analysis pipeline for genome-wide methylation effects of dCas9-TET1CD, starting from a total number of probes, down to significantly differentially methylated sites and ultimately differentially methylated genes.

FIG. 5G shows QC analysis of 850K MethylationEPIC data illustrating a dendrogram demonstrating that biological replicates clustered together and controls showed different hierarchies than dCas9-TET1CD.

FIG. 5H shows density plots of beta value distribution before and after normalization with preprocessNoob and preprocessFunnorm of the data of FIG. 5G.

FIG. 5I shows the total probe statistics by feature of the data of FIG. 5G.

FIG. 5J shows total number of hypermethylated differentially methylated positions by feature of the data of FIG. 5G.

FIG. 5K shows the total number of hypomethylated differentially methylated positions by feature of the data of FIG. 5G.

FIGS. 6A-E show off-target analysis of CRISPR/dCas9 effectors by RNA-seq.

FIG. 6A shows a volcano plot illustrating significance (FDR adjusted p value) versus fold change for differential DESeq2 expression analysis of mock-treated, dCas9-VP64 (dC-V), dCas9-TET1CD (dC-T) or dCas9-VP64 and dCas9-TET1CD (dC-V+dC-T) guided by sgRNAs 1-3 to the CDKL5 promoter compared to a dCas9-no effector control (dC). Differentially expressed genes are illustrated by black dots (FDR <1%, log fold change >1), predicted CRISPR off-target sites are highlighted in blue and the CDKL5 target gene is highlighted in green. The number of downregulated genes is shown in the upper left of each panel, and the number of upregulated genes is shown in the upper right of each panel.
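By way of non-limiting illustration only, the thresholds named for FIG. 6A (FDR <1%, log fold change >1) can be applied to a table of differential expression results as sketched below. The sketch assumes that the fold-change cut-off is applied to the absolute value of a log2 fold change, so that both upregulated and downregulated genes are captured; the gene labels and numerical values are invented placeholders and not experimental data.

```python
def classify_genes(results, fdr_cutoff=0.01, lfc_cutoff=1.0):
    """Split genes into upregulated and downregulated lists.

    results -- iterable of (gene, log2_fold_change, adjusted_p) tuples;
               adjusted_p may be None when no adjusted p value was
               reported for that gene.
    A gene is called differentially expressed when adjusted_p is below
    fdr_cutoff and the absolute log2 fold change exceeds lfc_cutoff
    (the absolute-value interpretation is an assumption).
    """
    up, down = [], []
    for gene, lfc, padj in results:
        if padj is None or padj >= fdr_cutoff:
            continue
        if lfc > lfc_cutoff:
            up.append(gene)
        elif lfc < -lfc_cutoff:
            down.append(gene)
    return up, down


# Placeholder rows with made-up values.
rows = [
    ("geneA", 2.3, 0.0001),   # passes both cut-offs, upregulated
    ("geneB", -1.8, 0.004),   # passes both cut-offs, downregulated
    ("geneC", 0.4, 0.0005),   # significant but small change
    ("geneD", 3.0, 0.2),      # large change but not significant
    ("geneE", 1.5, None),     # no adjusted p value available
]
up, down = classify_genes(rows)
print(up, down)
```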

FIG. 6B shows a Venn diagram illustrating the overlap of differentially expressed genes between all conditions and the putative off-target list, and shows that a single gene, CNTNAP2, was shared between all four groups as a putative off-target.

FIG. 6C shows a Venn diagram illustrating the overlap between differentially expressed genes and differentially methylated positions identified in a comparison between dCas9-TET1CD and dCas9 and potential CRISPR off-targets.

FIG. 6D shows a bar graph illustrating the validation of the differentially expressed gene, CNTNAP2, by RT-qPCR, and shows the relative CNTNAP2 mRNA levels in SH-SY5Y determined by RT-qPCR after constitutive expression of dCas9 (dC), dCas9-VP64 (dC-V), dCas9-TET1CD (dC-T) or a combination of dCas9-VP64 and dCas9-TET1CD (dC-V+dC-T) and sgRNAs 1-3 at 21 days post-transduction. #Significantly different from dCas9, n=3 independent experiments, Tukey's HSD, p<0.05. #Significantly different from dCas9 sgRNAs 1-3, n=3 independent experiments, Student's t-test, p<0.05.

FIG. 6E shows a bar graph illustrating the validation of the differentially expressed gene, HHIPL1, by RT-qPCR, and shows the relative HHIPL1 mRNA levels in SH-SY5Y determined by RT-qPCR after constitutive expression of dCas9 (dC) or dCas9-TET1CD (dC-T) and sgRNAs 1-3 at 21 days post-transduction. #Significantly different from dCas9 sgRNAs 1-3, n=3 independent experiments, Student's t-test, p<0.05.

FIG. 7 shows a schematic illustrating a model of the programmable transcription of the CDKL5 gene using dCas9 fused to epigenetic effector domains from VP64 or ten-eleven translocation dioxygenase 1 (TET1). In particular, DNA methylation editing of the CDKL5 promoter using a dCas9-TET1 fusion protein for targeted DNA demethylation resulted in a significant increase in allele-specific expression of the inactive allele of CDKL5 and a significant reduction in methylated CpG dinucleotides in the CGI core promoter. Moreover, while the dCas9-VP64 fusion protein had no effect alone, co-expression of dCas9-TET1 and a dCas9-VP64 transactivator had a synergistic effect on the reactivation of the inactive CDKL5 allele to levels above 60% of the active allele.

DETAILED DESCRIPTION

Embodiments according to the present disclosure are described more fully hereinafter. Aspects of the disclosure may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. The terminology used in the description herein is for the purpose of describing particular embodiments only and is not intended to be limiting. Throughout and within this disclosure various technical and patent publications are referenced by a citation or an Arabic numeral. The full bibliographic citation for each reference identified by an Arabic numeral is found in the reference section, immediately preceding the claims.

It is to be appreciated that certain aspects, modes, embodiments, variations and features of the present methods are described below in various levels of detail in order to provide a substantial understanding of the present technology. The definitions of certain terms as used in the specification are provided below. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the present application and relevant art and should not be interpreted in an idealized or overly formal sense unless expressly so defined herein. While not explicitly defined below, such terms should be interpreted according to their common meaning.

The terminology used in the description herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. All publications, patent applications, patents and other references mentioned herein are incorporated by reference in their entirety.

The practice of the present technology will employ, unless otherwise indicated, conventional techniques of tissue culture, immunology, molecular biology, microbiology, cell biology, and recombinant DNA, which are within the skill of the art.

Unless the context indicates otherwise, it is specifically intended that the various features of the invention described herein can be used in any combination. Moreover, the disclosure also contemplates that in some embodiments, any feature or combination of features set forth herein can be excluded or omitted. To illustrate, if the specification states that a complex comprises components A, B and C, it is specifically intended that any of A, B or C, or a combination thereof, can be omitted and disclaimed singularly or in any combination. The term “consisting of” intends the recited elements and any additional elements that do not materially change the function of the recited element or elements.

Unless explicitly indicated otherwise, all specified embodiments, features, and terms intend to include both the recited embodiment, feature, or term and biological equivalents thereof.

All numerical designations, e.g., pH, temperature, time, concentration, and molecular weight, including ranges, are approximations which are varied (+) or (−) by increments of 1.0 or 0.1, as appropriate, or alternatively by a variation of +/−15%, or alternatively 10%, or alternatively 5%, or alternatively 2%. It is to be understood, although not always explicitly stated, that all numerical designations are preceded by the term “about”. It also is to be understood, although not always explicitly stated, that the reagents described herein are merely exemplary and that equivalents of such are known in the art.

The practice of the present technology employs, unless otherwise indicated, conventional techniques of tissue culture, immunology, molecular biology, microbiology, cell biology, and recombinant DNA, which are within the skill of the art. See, e.g., Green and Sambrook eds. (2012) Molecular Cloning: A Laboratory Manual, 4th edition; the series Ausubel et al. eds. (2015) Current Protocols in Molecular Biology; the series Methods in Enzymology (Academic Press, Inc., N.Y.); MacPherson et al. (2015) PCR 1: A Practical Approach (IRL Press at Oxford University Press); MacPherson et al. (1995) PCR 2: A Practical Approach; McPherson et al. (2006) PCR: The Basics (Garland Science); Harlow and Lane eds. (1999) Antibodies, A Laboratory Manual; Greenfield ed. (2014) Antibodies, A Laboratory Manual; Freshney (2010) Culture of Animal Cells: A Manual of Basic Technique, 6th edition; Gait ed. (1984) Oligonucleotide Synthesis; U.S. Pat. No. 4,683,195; Hames and Higgins eds. (1984) Nucleic Acid Hybridization; Anderson (1999) Nucleic Acid Hybridization; Herdewijn ed. (2005) Oligonucleotide Synthesis: Methods and Applications; Hames and Higgins eds. (1984) Transcription and Translation; Buzdin and Lukyanov ed. (2007) Nucleic Acids Hybridization: Modern Applications; Immobilized Cells and Enzymes (IRL Press (1986)); Grandi ed. (2007) In Vitro Transcription and Translation Protocols, 2nd edition; Guisan ed. (2006) Immobilization of Enzymes and Cells; Perbal (1988) A Practical Guide to Molecular Cloning, 2nd edition; Miller and Calos eds. (1987) Gene Transfer Vectors for Mammalian Cells (Cold Spring Harbor Laboratory); Makrides ed. (2003) Gene Transfer and Expression in Mammalian Cells; Mayer and Walker eds. (1987) Immunochemical Methods in Cell and Molecular Biology (Academic Press, London); Lundblad and Macdonald eds. (2010) Handbook of Biochemistry and Molecular Biology, 4th edition; Herzenberg et al. eds. (1996) Weir's Handbook of Experimental Immunology, 5th edition; and/or more recent editions thereof.

Definitions

As used herein, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise.

As used herein, the term “about,” when referring to a measurable value such as an amount or concentration and the like, is meant to encompass variations of 20%, 10%, 5%, 1%, 0.5%, or even 0.1% of the specified amount.

As used herein, the terms “acceptable,” “effective,” or “sufficient” when used to describe the selection of any components, ranges, dose forms, etc. disclosed herein intend that said component, range, dose form, etc. is suitable for the disclosed purpose.

As used herein, the term “adeno-associated virus” or “AAV” refers to a member of the class of viruses associated with this name and belonging to the genus Dependoparvovirus, family Parvoviridae. Multiple serotypes of this virus are known to be suitable for gene delivery; all known serotypes can infect cells from various tissue types. At least 11 or 12 serotypes, sequentially numbered, are disclosed in the prior art. Non-limiting exemplary serotypes useful in the gene editing systems, host cells, pharmaceutical compositions, vectors, and methods disclosed herein include any of the 11 or 12 serotypes, e.g., AAV2, AAV5, and AAV8, or variant serotypes, e.g., AAV-DJ. The AAV structural particle is composed of 60 protein molecules made up of VP1, VP2 and VP3. Each particle contains approximately 5 VP1 proteins, 5 VP2 proteins and 50 VP3 proteins ordered into an icosahedral structure.

As used herein, the term “administering” a compound or composition to a subject means delivering the compound to the subject. “Administering” includes prophylactic administration of the compound or composition (i.e., before the disease and/or one or more symptoms of the disease are detectable) and/or therapeutic administration of the composition (i.e., after the disease and/or one or more symptoms of the disease are detectable). The methods of the present technology include administering one or more compounds or agents.

If more than one compound is to be administered, the compounds may be administered together at substantially the same time, and/or administered at different times in any order.

Also, the compounds of the present technology may be administered before, concomitantly with, and/or after administration of another type of drug or therapeutic procedure (e.g., surgery).

As used herein, “ameliorate,” “ameliorating,” and the like refer to inhibiting, relieving, eliminating, or slowing progression of one or more symptoms.

As used herein, “and/or” refers to and encompasses any and all possible combinations of one or more of the associated listed items, as well as the lack of combinations when interpreted in the alternative (“or”).

As used herein, the term “aptamer” refers to single stranded DNA or RNA molecules that can bind to one or more selected targets with high affinity and specificity. Non-limiting exemplary targets include but are not limited to proteins or peptides.

As used herein, the term “Cas9” refers to a CRISPR-associated, RNA-guided endonuclease such as Streptococcus pyogenes Cas9 (spCas9) and orthologs and biological equivalents thereof. Biological equivalents of Cas9 include but are not limited to C2c1 from Alicyclobacillus acidoterrestris and Cpf1 (which performs cutting functions analogous to Cas9) from various bacterial species including Acidaminococcus spp. and Francisella novicida U112. Cas9 may refer to an endonuclease that causes double stranded breaks in DNA, a nickase variant such as a RuvC or HNH mutant that causes a single stranded break in DNA, as well as other variations such as dead Cas9 or dCas9, which lack endonuclease activity. Cas9 may also refer to “split-Cas9,” in which Cas9 is split into two halves, C-Cas9 and N-Cas9, and fused with two intein moieties. See, e.g., U.S. Pat. No. 9,074,199 B1; Zetsche et al., Nat Biotechnol. 33(2):139-42 (2015); Wright et al., PNAS 112(10) 2984-89 (2015).

As used herein, the term “cell” or “host cell” may refer to either a prokaryotic or eukaryotic cell, optionally obtained from a subject or a commercially available source.

As used herein, the term “CRISPR” refers to Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR). CRISPR may also refer to a technique or system of sequence-specific genetic manipulation relying on the CRISPR pathway. A CRISPR recombinant expression system can be programmed to cleave a target polynucleotide using a CRISPR endonuclease and a guide RNA. A CRISPR system can be used to cause double stranded or single stranded breaks in a target polynucleotide. A CRISPR system can also be used to recruit proteins or label a target polynucleotide. In some aspects, CRISPR-mediated gene editing utilizes the pathways of nonhomologous end-joining (NHEJ) or homologous recombination to perform the edits. These applications of CRISPR technology are known and widely practiced in the art. See, e.g., U.S. Pat. No. 8,697,359, and Hsu et al., Cell 156(6): 1262-1278 (2014).

As used herein, the term “comprising” is intended to mean that the compositions and methods include the recited elements, but do not exclude others.

As used herein, the transitional phrase “consisting essentially of” (and grammatical variants) is to be interpreted as encompassing the recited materials or steps “and those that do not materially affect the basic and novel characteristic(s)” of the recited embodiment. Thus, the term “consisting essentially of” as used herein should not be interpreted as equivalent to “comprising.” “Consisting of” shall mean excluding more than trace elements of other ingredients and substantial method steps for administering the compositions disclosed herein. Aspects defined by each of these transition terms are within the scope of the present disclosure.

As used herein, the term “effective amount” or “therapeutically effective amount” refers to the amount of an agent that is sufficient to effect beneficial or desired results. The therapeutically effective amount may vary depending upon one or more of: the subject and disease condition being treated, the weight and age of the subject, the severity of the disease condition, the manner of administration and the like, which can readily be determined by one of ordinary skill in the art. The specific dose may vary depending on one or more of: the particular agent chosen, the dosing regimen to be followed, whether it is administered in combination with other compounds, timing of administration, the route of administration, and the physical delivery system in which it is carried.

In some embodiments, “effective amount” or “therapeutically effective amount” refers to a quantity sufficient to achieve a desired therapeutic and/or prophylactic effect, e.g., an amount which results in the full or partial amelioration of disease or disorders or symptoms associated with mitochondrial dysfunction, neurological disease, lack of energy, glycolytic process dysfunction or cellular respiration related dysfunction in a subject in need thereof. In the context of therapeutic or prophylactic applications, the amount of a composition administered to the subject will depend on the type and severity of the disease and on the characteristics of the individual, such as general health, age, sex, body weight and tolerance to drugs. It will also depend on the degree, severity and type of disease. A person of ordinary skill in the art will be able to determine appropriate dosages depending on these and other factors. The compositions can also be administered in combination with one or more additional compounds. Multiple doses may be administered. Additionally or alternatively, multiple therapeutic compositions or compounds may be administered. In the methods described herein, the compounds may be administered to a subject having one or more signs or symptoms of a disease or disorder described herein.

As used herein, the term “encode” as it is applied to nucleic acid sequences refers to a polynucleotide which is said to “encode” a polypeptide if, in its native state or when manipulated by methods well known to those skilled in the art, it can be transcribed and/or translated to produce the mRNA for the polypeptide and/or a fragment thereof. The antisense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.
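By way of non-limiting illustration, the statement that the encoding sequence can be deduced from the antisense strand may be sketched in Python as follows. The example sequence and the function names `reverse_complement` and `transcribe` are hypothetical and are not part of the disclosure; the sketch assumes unambiguous DNA bases and strands written 5′ to 3′.

```python
# Minimal sketch: deduce the coding (sense) strand from an antisense strand.
# Assumes plain A/C/G/T input; ambiguity codes are not handled.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


antisense = "TTACATGCCAT"  # hypothetical antisense strand, 5'->3'
coding = reverse_complement(antisense)  # deduced encoding sequence
mrna = transcribe(coding)
print(coding, mrna)
```

Because complementation is symmetric, applying `reverse_complement` to the deduced coding strand returns the original antisense strand.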

As used herein, the term “endonuclease” refers to any suitable endonuclease enzyme protein or a variant thereof that will be specifically directed by the selected guide polynucleotide to enzymatically knock-out the target sequence of the guide polynucleotide.

As used herein, the term “variant thereof,” as used with respect to an endonuclease, refers to the referenced endonuclease in its enzymatically functional form expressed in any suitable host organism or expression system and/or including any modifications to enhance the enzymatic activity of the endonuclease.

In some embodiments of the present disclosure, a suitable endonuclease includes a CRISPR-associated sequence 9 (Cas9) endonuclease or a variant thereof, a CRISPR-associated sequence 13 (Cas13) endonuclease or a variant thereof, a CRISPR-associated sequence 6 (Cas6) endonuclease or a variant thereof, a CRISPR from Prevotella and Francisella 1 (Cpf1) endonuclease or a variant thereof, or a CRISPR from Microgenomates and Smithella 1 (Cms1) endonuclease or a variant thereof. In some embodiments of the present disclosure, a suitable endonuclease includes a Streptococcus pyogenes Cas9 (SpCas9), a Staphylococcus aureus Cas9 (SaCas9), a Francisella novicida Cas9 (FnCas9), or a variant thereof. Variants may include a protospacer adjacent motif (PAM) SpCas9 (xCas9), a high fidelity SpCas9 (SpCas9-HF1), a high fidelity SaCas9, or a high fidelity FnCas9.

In some embodiments of the present disclosure, the endonuclease comprises a Cas fusion nuclease comprising a Cas9 protein or a variant thereof fused with a Fok1 nuclease or variant thereof. Variants of the Cas9 protein of this fusion nuclease include a catalytically inactive Cas9 (e.g., dead Cas9). In some embodiments of the present disclosure, the endonuclease may be a Cas9, Cas13, Cas6, Cpf1, Cms1 protein, or any variant thereof that is derived or expressed from Methanococcus maripaludis C7, Corynebacterium diphtheriae, Corynebacterium efficiens YS-314, Corynebacterium glutamicum (ATCC 13032), Corynebacterium glutamicum R, Corynebacterium kroppenstedtii (DSM 44385), Mycobacterium abscessus (ATCC 19977), Nocardia farcinica IFM 10152, Rhodococcus erythropolis PR4, Rhodococcus jostii RHA1, Rhodococcus opacus B4 (uid36573), Acidothermus cellulolyticus 11B, Arthrobacter chlorophenolicus A6, Kribbella flavida (DSM 17836, uid43465), Thermomonospora curvata (DSM 43183), Bifidobacterium dentium Bd1, Bifidobacterium longum DJO10A, Slackia heliotrinireducens (DSM 20476), Persephonella marina EX H1, Bacteroides fragilis NCTC 9434, Capnocytophaga ochracea (DSM 7271), Flavobacterium psychrophilum JIP02 86, Akkermansia muciniphila (ATCC BAA 835), Roseiflexus castenholzii (DSM 13941), Roseiflexus RS1, Synechocystis PCC6803, Elusimicrobium minutum Pei191, uncultured Termite group 1 bacterium phylotype Rs D 17, Fibrobacter succinogenes S85, Bacillus cereus (ATCC 10987), Listeria innocua, Lactobacillus casei, Lactobacillus rhamnosus GG, Lactobacillus salivarius UCC118, Streptococcus agalactiae-5-A909, Streptococcus agalactiae NEM316, Streptococcus agalactiae 2603, Streptococcus dysgalactiae equisimilis GGS 124, Streptococcus equi zooepidemicus MGCS10565, Streptococcus gallolyticus UCN34 (uid46061), Streptococcus gordonii Challis subst CH1, Streptococcus mutans NN2025 (uid46353), Streptococcus mutans, Streptococcus pyogenes M1 GAS, Streptococcus pyogenes MGAS5005, Streptococcus pyogenes MGAS2096, Streptococcus pyogenes MGAS9429, Streptococcus pyogenes MGAS10270, Streptococcus pyogenes MGAS6180, Streptococcus pyogenes MGAS315, Streptococcus pyogenes SSI-1, Streptococcus pyogenes MGAS10750, Streptococcus pyogenes NZ131, Streptococcus thermophilus CNRZ1066, Streptococcus thermophilus LMD-9, Streptococcus thermophilus LMG 18311, Clostridium botulinum A3 Loch Maree, Clostridium botulinum B Eklund 17B, Clostridium botulinum Ba4 657, Clostridium botulinum F Langeland, Clostridium cellulolyticum H10, Finegoldia magna (ATCC 29328), Eubacterium rectale (ATCC 33656), Mycoplasma gallisepticum, Mycoplasma mobile 163K, Mycoplasma penetrans, Mycoplasma synoviae 53, Streptobacillus moniliformis (DSM 12112), Bradyrhizobium BTAi1, Nitrobacter hamburgensis X14, Rhodopseudomonas palustris BisB18, Rhodopseudomonas palustris BisB5, Parvibaculum lavamentivorans DS-1, Dinoroseobacter shibae DFL 12, Gluconacetobacter diazotrophicus Pal 5 FAPERJ, Gluconacetobacter diazotrophicus Pal 5 JGI, Azospirillum B510 (uid46085), Rhodospirillum rubrum (ATCC 11170), Diaphorobacter TPSY (uid29975), Verminephrobacter eiseniae EF01-2, Neisseria meningitidis 053442, Neisseria meningitidis alpha14, Neisseria meningitidis Z2491, Desulfovibrio salexigens DSM 2638, Campylobacter jejuni doylei 269 97, Campylobacter jejuni 81116, Campylobacter jejuni, Campylobacter lari RM2100, Helicobacter hepaticus, Wolinella succinogenes, Tolumonas auensis DSM 9187, Pseudoalteromonas atlantica T6c, Shewanella pealeana (ATCC 700345), Legionella pneumophila Paris, Actinobacillus succinogenes 130Z, Pasteurella multocida, Francisella tularensis novicida U112, Francisella tularensis holarctica, Francisella tularensis FSC 198, Francisella tularensis, Francisella tularensis WY96-3418, or Treponema denticola (ATCC 35405).

As used herein, the terms “equivalent” or “biological equivalent” are used interchangeably when referring to a particular molecule, biological, or cellular material and intend those having minimal homology while still maintaining desired structure or functionality.

As used herein, the term “expression” refers to the process by which polynucleotides are transcribed into mRNA and/or the process by which the transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins. If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell. The expression level of a gene may be determined by measuring the amount of mRNA or protein in a cell or tissue sample; further, the expression level of multiple genes can be determined to establish an expression profile for a particular sample.

As used herein, the term “functional” may be used to modify any molecule, biological, or cellular material to intend that it accomplishes a particular, specified effect.

As used herein, the term “guide polynucleotide” refers to a polynucleotide having a “synthetic sequence” capable of binding the corresponding endonuclease enzyme protein (e.g., Cas9) and a variable target sequence capable of binding the genomic target (e.g., a nucleotide sequence found in an exon of a target gene). In some embodiments of the present disclosure, a guide polynucleotide is a guide ribonucleic acid (gRNA). In some embodiments, the variable target sequence of the guide polynucleotide is any sequence within the target that is unique with respect to the rest of the genome and is immediately adjacent to a Protospacer Adjacent Motif (PAM). The exact sequence of the PAM sequence may vary as different endonucleases require different PAM sequences.

As used herein, “homology” or “identity” or “similarity” refers to sequence similarity between two peptides or between two nucleic acid molecules. Homology can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences. An “unrelated” or “non-homologous” sequence shares less than 40% identity, or alternatively less than 25% identity, with one of the sequences of the present invention.
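By way of non-limiting illustration, the position-by-position comparison described above may be sketched in Python as follows. The sketch assumes that the two sequences have already been aligned to equal length, and it treats a gap character as a mismatch; both choices, the example sequences, and the function names are assumptions made for illustration only. The 40% cutoff is one of the two thresholds recited above.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions occupied by the same base or amino acid.

    Assumes `a` and `b` are pre-aligned and of equal length; a gap ('-')
    in either sequence is counted as a non-matching position.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, equal-length, non-empty")
    matches = sum(
        1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-"
    )
    return 100.0 * matches / len(a)


def is_unrelated(a: str, b: str, threshold: float = 40.0) -> bool:
    """True when identity falls below the stated threshold (40% or 25%)."""
    return percent_identity(a, b) < threshold


reference = "ACGTACGTAC"  # hypothetical aligned sequences
close = "ACGTTCGTAC"
distant = "TTTTTTTTTT"
print(percent_identity(reference, close))    # 9 of 10 positions match
print(is_unrelated(reference, distant))      # 2 of 10 positions match
```

In practice, identity is usually computed after alignment with an established alignment program; the sketch covers only the counting step.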

As used herein, “hybridization” or “hybridizes” refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. The complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction may constitute a step in a more extensive process, such as the initiation of a PCR reaction, or the enzymatic cleavage of a polynucleotide by a ribozyme.

Examples of stringent hybridization conditions include: incubation temperatures of about 25° C. to about 37° C.; hybridization buffer concentrations of about 6× saline-sodium citrate (“SSC”) to about 10×SSC; formamide concentrations of about 0% to about 25%; and wash solutions from about 4×SSC to about 8×SSC. Examples of moderate hybridization conditions include: incubation temperatures of about 40° C. to about 50° C.; buffer concentrations of about 9×SSC to about 2×SSC; formamide concentrations of about 30% to about 50%; and wash solutions of about 5×SSC to about 2×SSC. Examples of high stringency conditions include: incubation temperatures of about 55° C. to about 68° C.; buffer concentrations of about 1×SSC to about 0.1×SSC; formamide concentrations of about 55% to about 75%; and wash solutions of about 1×SSC, 0.1×SSC, or deionized water. In general, hybridization incubation times are from 5 minutes to 24 hours, with 1, 2, or more washing steps, and wash incubation times are about 1, 2, or 15 minutes. SSC is 0.15 M sodium chloride (“NaCl”) and 15 mM citrate buffer. It is understood that equivalents of SSC using other buffer systems can be employed.
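By way of non-limiting illustration, the buffer strengths recited above can be converted into molar concentrations using the stated composition of 1×SSC (0.15 M NaCl and 15 mM citrate). The Python sketch below assumes simple linear scaling; the 20× stock used in the dilution example and the function names are hypothetical and are not taken from the disclosure.

```python
from typing import Tuple

# Composition of 1x SSC as stated in the text.
SSC_1X_NACL_M = 0.15
SSC_1X_CITRATE_MM = 15.0


def ssc_composition(fold: float) -> Tuple[float, float]:
    """Return (NaCl in M, citrate in mM) for an n-fold SSC buffer."""
    return fold * SSC_1X_NACL_M, fold * SSC_1X_CITRATE_MM


def dilution_volumes(
    stock_fold: float, target_fold: float, final_ml: float
) -> Tuple[float, float]:
    """Return (stock mL, water mL) needed to dilute a stock to target strength."""
    if target_fold > stock_fold:
        raise ValueError("target strength cannot exceed stock strength")
    stock_ml = final_ml * target_fold / stock_fold
    return stock_ml, final_ml - stock_ml


nacl_m, citrate_mm = ssc_composition(6)      # 6x SSC hybridization buffer
stock_ml, water_ml = dilution_volumes(20, 2, 500)  # 2x wash from a 20x stock
print(nacl_m, citrate_mm, stock_ml, water_ml)
```

For example, a 0.1×SSC wash corresponds to approximately 0.015 M NaCl and 1.5 mM citrate under this scaling.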

As used herein, the term “isolated” refers to molecules or biologicals or cellular materials being substantially free from other materials.

As used herein, the term “lentivirus” refers to a member of the class of viruses associated with this name and belonging to the genus Lentivirus, family Retroviridae. While some lentiviruses are known to cause diseases, other lentiviruses are known to be suitable for gene delivery. See, e.g., Tomas et al. (2013) Biochemistry, Genetics and Molecular Biology: “Gene Therapy - Tools and Potential Applications,” ISBN 978-953-51-1014-9, DOI: 10.5772/52534.

As used herein, the terms “nucleic acid sequence,” “nucleotide sequence,” and “polynucleotide” are used interchangeably to refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. Thus, this term includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.

As used herein, the term “organ” refers to a structure which is a specific portion of an individual organism, where a certain function or functions of the individual organism is locally performed and which is morphologically separate. Non-limiting examples of organs include the skin, blood vessels, cornea, thymus, kidney, heart, liver, umbilical cord, intestine, nerve, lung, placenta, pancreas, thyroid and brain.

As used herein, the term “ortholog” is used in reference to another gene or protein and intends a homolog of said gene or protein that evolved from the same ancestral source.

Orthologs may or may not retain the same function as the gene or protein to which they are orthologous. Non-limiting examples of Cas9 orthologs include S. aureus Cas9 (“SaCas9”), S. thermophilus Cas9, L. pneumophila Cas9, N. lactamica Cas9, N. meningitidis Cas9, B. longum Cas9, A. muciniphila Cas9, and O. laneus Cas9.

As used herein, “prevention,” “prevents,” or “preventing” of a disorder or condition refers to a compound that, in a statistical sample, reduces the occurrence of the disorder, symptom, or condition in the treated sample relative to a control subject, or delays the onset of one or more symptoms of the disorder or condition relative to the control subject.

As used herein, the term “promoter” refers to any sequence that regulates the expression of a coding sequence, such as a gene, for example a region of DNA that initiates transcription of a particular gene. The promoter includes the core promoter, which is the minimal portion of the promoter required to properly initiate transcription and can also include regulatory elements such as transcription factor binding sites. The regulatory elements may promote transcription or inhibit transcription. Regulatory elements in the promoter can be binding sites for transcriptional activators or transcriptional repressors. A promoter can be constitutive or inducible. A constitutive promoter refers to one that is always active and/or constantly directs transcription of a gene above a basal level of transcription. An inducible promoter is one which is capable of being induced by a molecule or a factor added to the cell or expressed in the cell. An inducible promoter may still produce a basal level of transcription in the absence of induction, but induction typically leads to significantly more production of the protein. Promoters can also be tissue specific. A tissue specific promoter allows for the production of a protein in a certain population of cells that have the appropriate transcriptional factors to activate the promoter.

Promoters may be constitutive, inducible, repressible, or tissue-specific, for example. A “promoter” is a control sequence that is a region of a polynucleotide sequence at which initiation and rate of transcription are controlled. It may contain genetic elements at which regulatory proteins and molecules, such as RNA polymerase and other transcription factors, may bind. Non-limiting exemplary promoters include the CDKL5 promoter, SCML2 promoter, COL9A3 promoter, MECP2 promoter, CMV promoter, U6 promoter, and phosphoglycerate kinase 1 (PGK) promoter, as well as SFFV, MNDU3, SV40, EF1a, UBC and CAGG. Non-limiting exemplary promoter sequences are provided herein below:

CMV Promoter

ATACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGG GGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAA TGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACG TATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGT ATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTAC GCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTAC ATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTAT TACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTTTGAC TCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGC ACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGC AAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAG TGAACCGTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTGACCTCCATAGAAG ACACCGGGACCGATCCAGCCTCCGGACTCTAGAGGATCGAACCCTT, or a biological equivalent thereof.

U6 Promoter

GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCT GTTAGAGAGATAATTAGAATTAATTTGACTGTAAACACAAAGATATTAGTACAA AATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTAT GTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTG GCTTTATATATCTTGTGGAAAGGACGAAACACC, or a biological equivalent thereof.

A number of effector elements are disclosed herein for use in these vectors; e.g., a tetracycline response element (e.g., tetO), a tet-regulatable activator, T2A, VP64, RtA, KRAB, and a miRNA sensor circuit. The nature and function of these effector elements are commonly understood in the art and a number of these effector elements are commercially available. Non-limiting exemplary sequences thereof are disclosed herein and further description thereof is provided herein below.

As used herein, the terms “protein,” “peptide” and “polypeptide” are used interchangeably and in their broadest sense to refer to a compound of two or more subunits of amino acids, amino acid analogs or peptidomimetics. The subunits may be linked by peptide bonds. In another aspect, the subunits may be linked by other bonds, e.g., ester, ether, etc. A protein or peptide must contain at least two amino acids and no limitation is placed on the maximum number of amino acids which may comprise a protein's or peptide's sequence. As used herein the term “amino acid” refers to either natural and/or unnatural or synthetic amino acids, including glycine and both the D and L optical isomers, amino acid analogs and peptidomimetics.

As used herein, “protospacer adjacent motif” (PAM) refers to a short nucleotide sequence adjacent to a target sequence (protospacer) that is recognized (targeted) by a sgRNA/Cas endonuclease system described herein. The sequence and length of a PAM herein can differ depending on the Cas protein or Cas protein complex used. The PAM sequence can be of any length but is typically 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides long. The PAM sequence plays a key role in target recognition by licensing sgRNA base pairing to the protospacer sequence (Szczelkun et al., Proc. Natl. Acad. Sci. U.S.A 111: 9798-803 (2014)).

As used herein, the term “recombinant expression system” refers to a genetic construct for the expression of certain genetic material formed by recombination.

As used herein, the term “sgRNA” or “single guide RNA” refers to the guide RNA sequences used to target specific genes for correction employing the CRISPR technique. Techniques of designing sgRNAs and donor therapeutic polynucleotides for target specificity are well known in the art. See, for example, Doench et al., Nature Biotechnology 32(12):1262-7 (2014), Mohr et al., FEBS J. 283: 3232-38 (2016), and Graham et al., Genome Biol. 16:260 (2015). sgRNA comprises or alternatively consists essentially of, or yet further consists of a fusion polynucleotide comprising CRISPR RNA (crRNA; i.e., a spacer region) and trans-activating CRISPR RNA (tracrRNA; i.e., a scaffold region); or a polynucleotide comprising crRNA (i.e., a spacer region) and tracrRNA (i.e., a scaffold region). In some aspects, a sgRNA is synthetic (Kelley et al., J of Biotechnology 233:74-83 (2016)).

As used herein, the terms “subject,” “individual,” or “patient” can be an individual organism, a vertebrate, a mammal, or a human. “Mammal” includes a human, non-human mammal, non-human primate, murine (e.g., mouse, rat, guinea pig, hamster), ovine, bovine, ruminant, lagomorph, porcine, caprine, equine, canine, feline, avis, etc. In any embodiment herein, the mammal is feline or canine. In any embodiment herein, the mammal is human.

As used herein, “target sequence” refers to a nucleotide sequence adjacent to a 5′-end of a protospacer adjacent motif (PAM). Being “adjacent” herein means being within 1 to 8 nucleotides of the site of reference, including being “immediately adjacent,” which means that there are no intervening nucleotides between the immediately adjacent nucleotide sequences and the immediately adjacent nucleotide sequences are within one nucleotide of each other.
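By way of non-limiting illustration, candidate target sequences lying immediately adjacent to the 5′-end of a PAM may be located with a short Python sketch such as the following. The sketch assumes the NGG PAM commonly associated with SpCas9 and a 20-nucleotide target length, neither of which is required by the definition above; it scans only the strand supplied and reports only immediately adjacent sites, not sites within the broader 1 to 8 nucleotide window. The function name and example sequence are hypothetical.

```python
import re
from typing import List, Tuple


def find_target_sequences(
    dna: str, target_len: int = 20, pam_regex: str = "[ACGT]GG"
) -> List[Tuple[int, str, str]]:
    """Return (start, target, PAM) for each target immediately 5' of a PAM.

    Assumptions (not from the disclosure): NGG PAM, 20-nt target,
    forward strand only. Overlapping sites are all reported.
    """
    dna = dna.upper()
    # A lookahead keeps the match zero-width so overlapping sites are found.
    pattern = re.compile(rf"(?=([ACGT]{{{target_len}}})({pam_regex}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


example = "ACGTACGTACGTACGTACGT" + "TGG" + "A"  # hypothetical sequence
hits = find_target_sequences(example)
for start, target, pam in hits:
    print(start, target, pam)
```

Sites on the opposite strand can be found by applying the same function to the reverse complement of the input, and a different endonuclease can be accommodated by supplying its PAM as `pam_regex`.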

As used herein, “target site” refers to a site of the target sequence including both the target sequence and its complementary sequence, for example, in double stranded nucleotides. The target site described herein may mean a nucleotide sequence hybridizing to a sgRNA spacer region, a complementary nucleotide sequence of the nucleotide sequence hybridizing to a sgRNA spacer region, and/or a nucleotide sequence adjacent to the 5′-end of a PAM. Full complementarity of a sgRNA spacer region with a target site is not necessarily required, provided there is sufficient complementarity to cause hybridization and promote formation of a CRISPR complex. A target sequence or target site may comprise any polynucleotide, such as DNA or RNA polynucleotides. In some embodiments, a target sequence or target site is located in the nucleus or cytoplasm of a cell. In some embodiments, the target sequence or target site may be within an organelle of a eukaryotic cell, for example, mitochondrion or chloroplast.

As used herein, the term “tissue” refers to tissue of a living or deceased organism or any tissue derived from or designed to mimic a living or deceased organism. The tissue may be healthy, diseased, and/or have genetic mutations. The biological tissue may include any single tissue (e.g., a collection of cells that may be interconnected) or a group of tissues making up an organ or part or region of the body of an organism. The tissue may comprise a homogeneous cellular material or it may be a composite structure such as that found in regions of the body including the thorax which for instance can include lung tissue, skeletal tissue, and/or muscle tissue. Exemplary tissues include, but are not limited to those derived from liver, lung, thyroid, skin, pancreas, blood vessels, bladder, kidneys, brain, biliary tree, duodenum, abdominal aorta, iliac vein, heart and intestines, including any combination thereof.

As used herein, “treating” or “treatment” of a disease in a subject refers to (1) preventing the symptoms or disease from occurring in a subject that is predisposed or does not yet display symptoms of the disease; (2) inhibiting the disease or arresting its development; or (3) ameliorating or causing regression of the disease or the symptoms of the disease. As understood in the art, “treatment” is an approach for obtaining beneficial or desired results, including clinical results. For the purposes of the present technology, beneficial or desired results can include one or more of, but are not limited to, alleviation or amelioration of one or more symptoms, diminishment of extent of a condition (including a disease), stabilized (i.e., not worsening) state of a condition (including disease), delay or slowing of progression of a condition (including disease), amelioration or palliation of the condition (including disease) state, and remission (whether partial or total), whether detectable or undetectable. In one aspect, the term “treatment” excludes prevention or prophylaxis.

As used herein, “stem cell” defines a cell with the ability to divide for indefinite periods in culture and give rise to specialized cells. At this time and for convenience, stem cells are categorized as somatic (adult) or embryonic. A somatic stem cell is an undifferentiated cell found in a differentiated tissue that can renew itself (clonal) and (with certain limitations) differentiate to yield all the specialized cell types of the tissue from which it originated. An embryonic stem cell is a primitive (undifferentiated) cell from the embryo that has the potential to become a wide variety of specialized cell types. An embryonic stem cell is one that has been cultured under in vitro conditions that allow proliferation without differentiation for months to years. A clone is a line of cells that is genetically identical to the originating cell; in this case, a stem cell.

A population of cells intends a collection of more than one cell that is identical (clonal) or non-identical in phenotype and/or genotype. A substantially homogenous population of cells is a population having at least 70%, or alternatively at least 75%, or alternatively at least 80%, or alternatively at least 85%, or alternatively at least 90%, or alternatively at least 95%, or alternatively at least 98% identical phenotype, as measured by pre-selected markers.

As used herein, “embryonic stem cells” refers to stem cells derived from tissue formed after fertilization but before the end of gestation, including pre-embryonic tissue (such as, for example, a blastocyst), embryonic tissue, or fetal tissue taken any time during gestation, typically but not necessarily before approximately 10-12 weeks gestation. Most frequently, embryonic stem cells are pluripotent cells derived from the early embryo or blastocyst. Embryonic stem cells can be obtained directly from suitable tissue, including, but not limited to human tissue, or from established embryonic cell lines. “Embryonic-like stem cells” refer to cells that share one or more, but not all characteristics, of an embryonic stem cell.

A neural stem cell is a cell that can be isolated from the adult central nervous system of mammals, including humans. Such cells have been shown to generate neurons, migrate and send out axonal and dendritic projections, integrate into pre-existing neuronal circuits, and contribute to normal brain function. Reviews of research in this area are found in Miller (2006) The Promise of Stem Cells for Neural Repair, Brain Res. Vol. 1091(1):258-264; Pluchino et al. (2005) Neural Stem Cells and Their Use as Therapeutic Tool in Neurological Disorders, Brain Res. Brain Res. Rev., Vol. 48(2):211-219; and Goh, et al. (2003) Adult Neural Stem Cells and Repair of the Adult Central Nervous System, J. Hematother. Stem Cell Res., Vol. 12(6):671-679.

As used herein, the term “differentiation” describes the process whereby an unspecialized cell acquires the features of a specialized cell such as a heart, liver, or muscle cell. “Directed differentiation” refers to the manipulation of stem cell culture conditions to induce differentiation into a particular cell type. “Dedifferentiated” defines a cell that reverts to a less committed position within the lineage of a cell. As used herein, the term “differentiates or differentiated” defines a cell that takes on a more committed (“differentiated”) position within the lineage of a cell. As used herein, “a cell that differentiates into a mesodermal (or ectodermal or endodermal) lineage” defines a cell that becomes committed to a specific mesodermal, ectodermal or endodermal lineage, respectively. Examples of cells that differentiate into a mesodermal lineage or give rise to specific mesodermal cells include, but are not limited to, cells that are adipogenic, leiomyogenic, chondrogenic, cardiogenic, dermatogenic, hematopoietic, hemangiogenic, myogenic, nephrogenic, urogenitogenic, osteogenic, pericardiogenic, or stromal.

Induced pluripotent stem cells are examples of dedifferentiated cells.

As used herein, the “lineage” of a cell defines the heredity of the cell, i.e. its predecessors and progeny. The lineage of a cell places the cell within a hereditary scheme of development and differentiation.

A “multi-lineage stem cell” or “multipotent stem cell” refers to a stem cell that reproduces itself and at least two further differentiated progeny cells from distinct developmental lineages. The lineages can be from the same germ layer (i.e. mesoderm, ectoderm or endoderm), or from different germ layers. An example of two progeny cells with distinct developmental lineages from differentiation of a multilineage stem cell is a myogenic cell and an adipogenic cell (both are of mesodermal origin, yet give rise to different tissues). Another example is a neurogenic cell (of ectodermal origin) and adipogenic cell (of mesodermal origin).

A “precursor” or “progenitor cell” intends to mean cells that have a capacity to differentiate into a specific type of cell. A progenitor cell may be a stem cell. A progenitor cell may also be more specific than a stem cell. A progenitor cell may be unipotent or multipotent. Compared to adult stem cells, a progenitor cell may be in a later stage of cell differentiation. An example of a progenitor cell includes, without limitation, a progenitor nerve cell.

A “parthenogenetic stem cell” refers to a stem cell arising from parthenogenetic activation of an egg. Methods of creating a parthenogenetic stem cell are known in the art. See, for example, Cibelli et al. (2002) Science 295(5556):819 and Vrana et al. (2003) Proc. Natl. Acad. Sci. USA 100 (Suppl. 1) 11911-6.

As used herein, a “pluripotent cell” defines a less differentiated cell that can give rise to at least two distinct (genotypically and/or phenotypically) further differentiated progeny cells. In another aspect, a “pluripotent cell” includes an Induced Pluripotent Stem Cell (iPSC) which is an artificially derived stem cell from a non-pluripotent cell, typically an adult somatic cell, that has historically been produced by inducing expression of one or more stem cell specific genes. Such stem cell specific genes include, but are not limited to, the family of octamer transcription factors, i.e. Oct-3/4; the family of Sox genes, i.e., Sox1, Sox2, Sox3, Sox 15 and Sox 18; the family of Klf genes, i.e. Klf1, Klf2, Klf4 and Klf5; the family of Myc genes, i.e. c-myc and L-myc; the family of Nanog genes, i.e., OCT4, NANOG and REX1; or LIN28. Examples of iPSCs are described in Takahashi et al. (2007) Cell advance online publication 20 Nov. 2007; Takahashi & Yamanaka (2006) Cell 126:663-76; Okita et al. (2007) Nature 448:260-262; Yu et al. (2007) Science advance online publication 20 Nov. 2007; and Nakagawa et al. (2007) Nat. Biotechnol. Advance online publication 30 Nov. 2007.

As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. One type of vector is a “plasmid,” which refers to a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques.

Another type of vector is a viral vector, wherein virally-derived DNA or RNA sequences are present in the vector for packaging into a virus (e.g., retroviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, lentiviruses, replication defective lentiviruses, and adeno-associated viruses). Viral vectors also include polynucleotides carried by a virus for transfection into a host cell. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as “expression vectors.” Common expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. Recombinant expression vectors can comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory elements, which may be selected on the basis of the host cells to be used for expression, that are operatively linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, “operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory element(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell).
Advantageous viral expression vectors include retroviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, lentiviruses, replication defective lentiviruses, and adeno-associated viruses.

It is to be inferred without explicit recitation and unless otherwise intended, that when the present disclosure relates to a polypeptide, protein, polynucleotide or antibody, a fragment, an equivalent, or a biological equivalent of such is intended within the scope of this disclosure. As used herein, the term “biological equivalent thereof” is intended to be synonymous with “equivalent thereof” when referring to a reference protein, antibody, polypeptide or nucleic acid, and intends those having minimal homology while still maintaining desired structure or functionality. Unless specifically recited herein, it is contemplated that any polynucleotide, polypeptide or protein mentioned herein also includes equivalents thereof. For example, an equivalent intends at least about 70% homology or identity, or at least 80% homology or identity, or alternatively at least about 85%, or alternatively at least about 90%, or alternatively at least about 95%, or alternatively 98% homology or identity, and exhibits substantially equivalent biological activity to the reference protein, polypeptide or nucleic acid. Alternatively, when referring to polynucleotides, an equivalent thereof is a polynucleotide that hybridizes under stringent conditions to the reference polynucleotide or its complement.

Applicants have provided herein the polypeptide and/or polynucleotide sequences for use in gene and protein transfer and expression techniques described below. It should be understood, although not always explicitly stated, that the sequences provided herein can be used to provide the expression product as well as substantially identical sequences that produce a protein that has the same biological properties. These “biologically equivalent” or “biologically active” polypeptides are encoded by equivalent polynucleotides as described herein. They may possess at least 60%, or alternatively, at least 65%, or alternatively, at least 70%, or alternatively, at least 75%, or alternatively, at least 80%, or alternatively at least 85%, or alternatively at least 90%, or alternatively at least 95% or alternatively at least 98%, identical primary amino acid sequence to the reference polypeptide when compared using sequence identity methods run under default conditions. Specific polypeptide sequences are provided as examples of particular embodiments. Modifications to the sequences include substitution of amino acids with alternate amino acids that have similar charge. Additionally, an equivalent polynucleotide is one that hybridizes under stringent conditions to the reference polynucleotide or its complement or, in reference to a polypeptide, a polypeptide encoded by a polynucleotide that hybridizes to the reference encoding polynucleotide under stringent conditions or its complementary strand. Alternatively, an equivalent polypeptide or protein is one that is expressed from an equivalent polynucleotide.
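For illustration only, the percent-identity thresholds recited above can be evaluated with a sketch such as the following. It assumes that the two sequences have already been aligned to equal length by a separate alignment program (gaps written as `-`), and it counts identity over the full alignment length; the function names and this counting convention are assumptions, and established alignment tools run under their default parameters may compute identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same non-gap residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    # A column counts as a match only if both residues agree and are not gaps.
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the pair reaches the chosen percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, meets_threshold("ACGTACGTAC", "ACGTACGTAA", 95.0))
```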

Pharmaceutically acceptable salts of compounds described herein are within the scope of the present technology and include acid or base addition salts which retain the desired pharmacological activity and are not biologically undesirable (e.g., the salt is not unduly toxic, allergenic, or irritating, and is bioavailable). When the compound of the present technology has a basic group, such as, for example, an amino group, pharmaceutically acceptable salts can be formed with inorganic acids (such as hydrochloric acid, hydroboric acid, nitric acid, sulfuric acid, and phosphoric acid), organic acids (e.g. alginate, formic acid, acetic acid, benzoic acid, gluconic acid, fumaric acid, oxalic acid, tartaric acid, lactic acid, maleic acid, citric acid, succinic acid, malic acid, methanesulfonic acid, benzenesulfonic acid, naphthalene sulfonic acid, and p-toluenesulfonic acid) or acidic amino acids (such as aspartic acid and glutamic acid). When the compound of the present technology has an acidic group, such as for example, a carboxylic acid group, or a hydroxyl group(s), it can form salts with metals, such as alkali and alkaline earth metals (e.g. Na⁺, Li⁺, K⁺, Ca²⁺, Mg²⁺, Zn²⁺), ammonia or organic amines (e.g. dicyclohexylamine, trimethylamine, triethylamine, pyridine, picoline, ethanolamine, diethanolamine, triethanolamine) or basic amino acids (e.g. arginine, lysine and ornithine). Such salts can be prepared in situ during isolation and purification of the compounds or by separately reacting the purified compound in its free base or free acid form with a suitable acid or base, respectively, and isolating the salt thus formed.

Modes for Carrying Out the Disclosure

Gene Editing Systems

The disclosure provides a gene editing system comprising, or alternatively consisting essentially of, or yet further consisting of: (i) a first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein; and (ii) a second nucleotide molecule encoding at least one small guide RNA (sgRNA). In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) comprises, or consists essentially of, or consists of a scaffold region and a spacer region. In some embodiments, the scaffold region is a nucleotide sequence that is necessary for dCas9 binding to the gRNA (addgene.org/guides/crispr/). In some embodiments, the spacer region hybridizes to a nucleotide sequence that is complementary to a target sequence adjacent to a 5′-end of a protospacer adjacent motif (PAM). In some embodiments, the target sequence and the PAM are located at least about 2 kilobases (kb) or about 1 kb, at least about 1.5 kb, at least about 1 kb, at least about 0.9 kb, at least about 0.8 kb, at least about 0.7 kb, at least about 0.6 kb, at least about 0.5 kb, at least about 0.4 kb, at least about 0.3 kb, at least about 0.2 kb, or at least about 0.1 kb from the transcriptional start site (TSS) of the CDKL5 gene. While the target sequence and the PAM can, in one aspect, be located at least about 1 kb from the transcriptional start site, it is apparent to the skilled artisan that other ranges are within the scope of this invention, e.g., the target sequence and the PAM are located from about 2 kb, or from about 1 kb, to about 0.1 kb from the transcriptional start site.

In some embodiments, (i) the first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein and (ii) the second nucleotide molecule encoding at least one small guide RNA (sgRNA) induce DNA demethylation of CpGs (CpG islands or regions) at positions of at least about −1500, at least about −1000, at least about −500, at least about −200, at least about −148, at least about −66, and at least about −19 relative to the transcription start site.
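To illustrate how CpG positions may be expressed relative to a transcription start site, the following sketch lists the CpG dinucleotides in a sequence and converts their indices into TSS-relative coordinates. The numbering convention (TSS at +1, the base immediately upstream at −1, no position 0), the function name, and the toy sequence are assumptions for illustration and are not taken from the CDKL5 locus.

```python
from typing import List


def cpg_positions_relative_to_tss(seq: str, tss_index: int) -> List[int]:
    """Return the position of the C of each CpG relative to the TSS.

    tss_index is the 0-based index of the TSS base within seq.
    """
    seq = seq.upper()
    positions = []
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == "CG":
            offset = i - tss_index
            # Assumed convention: TSS is +1 and there is no position 0.
            positions.append(offset + 1 if offset >= 0 else offset)
    return positions


toy_promoter = "ACGTTCGAATGC"  # toy sequence, TSS assumed at index 8
cpgs = cpg_positions_relative_to_tss(toy_promoter, tss_index=8)
print(cpgs)
```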

In some embodiments, the first nucleotide and second nucleotide molecules permit the transcriptional reprogramming of a gene promoter by precisely demethylating gene promoters or enhancers for desired gene targets. Thus, in one aspect, described herein is a method for transcriptionally reprogramming a gene promoter in a cell in need thereof, by inserting into the cell the system as disclosed herein. In some embodiments, DNA is methylated at the 5-position of cytosine (5mC), and such methylation silences gene expression and is important for genomic imprinting, regulation of gene expression, chromatin architecture organization, and cell-fate determination. In some embodiments, gene demethylation is associated with gene activation and occurs either via passive demethylation or through the oxidation of the methyl group. In some embodiments, demethylation via oxidation is mediated by TET (ten-eleven translocation) dioxygenases that oxidize 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5-hmC), which is a critical step in the ultimate removal of the methyl group.

In some embodiments, the full-length TET1 protein comprises typical features of 2OG-Fe(II) oxygenases, including conservation of residues predicted to be important for coordination of the cofactors Fe(II) and 2OG. The full-length TET1 protein has 2136 amino acids, and comprises an N-terminal α helix followed by a continuous series of β strands, typical of the double-stranded β helix (DSBH) fold of the 2OG-Fe(II) oxygenases, a unique conserved cysteine-rich region (amino acids 1418-1610 of the full-length human TET1 protein; MIM:607790; ENSG00000138336) that is contiguous with the N terminus of the DSBH region (amino acids 1611-2074), a CXXC-type zinc-binding domain (amino acids 584-624 of the full-length human TET1 protein), a binuclear Zn-chelating domain, and three bipartite nuclear localization signals (NLS) (66, 68). In some embodiments, the TET1 catalytic domain (TET1CD) comprises, or consists essentially of, or consists of amino acids 1418 to 2136 of the full-length TET1 protein, and encompasses the conserved cysteine-rich region and the DSBH domain (68). In some embodiments, the DSBH domain of the catalytic domain construct comprises a nuclear localization signal (NLS) sequence. In some embodiments, the DSBH domain of the catalytic domain construct does not comprise a NLS sequence.
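The residue ranges recited above can be checked for internal consistency with a short sketch such as the following. The coordinates are those stated in the text (1-based, inclusive); the dictionary keys, helper names, and the dummy protein string are hypothetical and serve only to show that the catalytic domain spans 719 residues and encloses both the cysteine-rich region and the DSBH region.

```python
TET1_LENGTH = 2136  # full-length protein, as stated in the text

# 1-based inclusive residue ranges taken from the description above.
DOMAINS = {
    "CXXC": (584, 624),
    "cysteine_rich": (1418, 1610),
    "DSBH": (1611, 2074),
    "TET1CD": (1418, 2136),
}


def domain_length(name: str) -> int:
    start, end = DOMAINS[name]
    return end - start + 1


def contains(outer: str, inner: str) -> bool:
    """True if the outer range fully encloses the inner range."""
    o_start, o_end = DOMAINS[outer]
    i_start, i_end = DOMAINS[inner]
    return o_start <= i_start and i_end <= o_end


def extract(protein_seq: str, name: str) -> str:
    """Slice a domain out of a protein string using 1-based coordinates."""
    start, end = DOMAINS[name]
    return protein_seq[start - 1:end]


dummy_protein = "X" * TET1_LENGTH  # stand-in; not the real TET1 sequence
print(domain_length("TET1CD"), contains("TET1CD", "DSBH"))
```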

In some embodiments, the dCas9-TET1 fusion protein facilitates the targeted demethylation of gene targets (24-29). In particular, dCas9-TET1 facilitates the targeted demethylation of gene targets selected from the group consisting of CDKL5, SCML2 (Scm Polycomb Group Protein Like 2), COL9A3, and Methyl-CpG Binding Protein 2 (MECP2) as shown in the Examples below. In some embodiments, both (i) a first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein and (ii) a second nucleotide molecule encoding at least one small guide RNA (sgRNA) are required to target dCas9-TET1 to a specific locus to demethylate DNA without altering the DNA sequence.

In some embodiments, the dCas9 is a catalytically inactive Cas9 nuclease from the clustered regularly interspaced short palindromic repeats (CRISPR) system, a type II bacterial adaptive immune system, that has been modified to target the dCas9 to desired genomic loci using sequence-specific guide RNAs for genome editing. In some embodiments, the desired genomic loci include any genes, optionally CDKL5, SCML2 (Scm Polycomb Group Protein Like 2), COL9A3, or Methyl-CpG Binding Protein 2 (MECP2). In some embodiments, the 20-bp spacer sequences of the CDKL5 sgRNAs are selected within at least about 1 kb or about 2 kb, at least about 1.5 kb, at least about 1 kb, at least about 0.9 kb, at least about 0.8 kb, at least about 0.7 kb, at least about 0.6 kb, at least about 0.5 kb, at least about 0.4 kb, at least about 0.3 kb, at least about 0.2 kb, or at least about 0.1 kb of the CDKL5 TSS (chrX:18,443,725, hg19) using the CRISPR/Cas9 and TALEN online tool for genome editing, CHOPCHOP. In some embodiments, guide RNAs (sgRNAs) span DNase I hypersensitive sites and H3K4me3 peaks of the CDKL5 promoter within a window of at least about 2 kb, at least about 1.5 kb, at least about 1 kb, at least about 0.9 kb, at least about 0.8 kb, at least about 0.7 kb, at least about 0.6 kb, at least about 0.5 kb, at least about 0.4 kb, at least about 0.3 kb, at least about 0.2 kb, or at least about 0.1 kb on either side of the CDKL5 transcriptional start site. In some embodiments, the second nucleotide molecules encoding at least one small guide RNA (sgRNA) used to create target-specific sgRNA expression vectors are listed in Table 1.
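As a hedged sketch of the window selection described above, the following example measures the distance between a candidate target interval and the CDKL5 TSS coordinate given in the text (chrX:18,443,725, hg19) and tests whether the interval falls within a chosen window. The candidate coordinates, the function names, and the use of an unsigned distance (which ignores strand and upstream/downstream direction) are assumptions for illustration.

```python
CDKL5_TSS = 18_443_725  # chrX position (hg19) as stated in the text


def distance_from_tss(target_start: int, target_end: int, tss: int = CDKL5_TSS) -> int:
    """Smallest distance in bp between the TSS and a target interval.

    Returns 0 if the interval overlaps the TSS.
    """
    if target_start > target_end:
        raise ValueError("target_start must not exceed target_end")
    if target_start <= tss <= target_end:
        return 0
    return min(abs(target_start - tss), abs(target_end - tss))


def within_window(target_start: int, target_end: int, window_bp: int = 1000,
                  tss: int = CDKL5_TSS) -> bool:
    """True if the target interval lies within window_bp of the TSS."""
    return distance_from_tss(target_start, target_end, tss) <= window_bp


# Hypothetical 23-bp target-plus-PAM interval, for illustration only.
hypothetical_start, hypothetical_end = 18_443_225, 18_443_247
print(distance_from_tss(hypothetical_start, hypothetical_end))
print(within_window(hypothetical_start, hypothetical_end, window_bp=1000))
```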

In some embodiments, the targeted sequence is a sequence in the gene promoter. The targeted sequence or a fragment thereof hybridizes to the corresponding gRNA. In one embodiment, the targeted sequence hybridizes to the corresponding gRNA without any mismatches. In another embodiment, the targeted sequence hybridizes to the corresponding gRNA with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more mismatches. Based on the targeted sequence, the gRNA sequence can be determined. In one embodiment, a gRNA comprises, or consists essentially of, or yet further consists of a sequence complementary to a targeted sequence, such as those as disclosed herein, or an equivalent that is capable of binding to the same targeted sequence but comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more mismatches. In another embodiment, a gRNA comprises, or consists essentially of, or yet further consists of a sequence that is the reverse complement of a targeted sequence, such as those as disclosed herein, or an equivalent that is capable of binding to the same targeted sequence but comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more mismatches. In yet another embodiment, a gRNA comprises, or consists essentially of, or yet further consists of a sequence that is the reverse of a targeted sequence, such as those as disclosed herein, or an equivalent that is capable of binding to the same targeted sequence but comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more mismatches.
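The complement, reverse-complement, and mismatch relationships described above can be illustrated with the following sketch. The helper names and the simple position-by-position mismatch count are assumptions; in practice, tolerance for mismatches may depend on their position and not only on their number, which this sketch does not model.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, same orientation."""
    return seq.upper().translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complement read in the opposite orientation."""
    return complement(seq)[::-1]


def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


def within_tolerance(guide: str, target: str, max_mismatches: int = 0) -> bool:
    """True if guide and target differ at no more than max_mismatches positions."""
    return count_mismatches(guide, target) <= max_mismatches


target = "AGAGCATCGGACCGAAGCGG"
variant = "AGAGCATCGGACCGAAGCGA"  # hypothetical single-mismatch variant
print(reverse_complement(target), count_mismatches(target, variant))
```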

In one aspect, this disclosure provides a third nucleotide molecule encoding a dCas9 protein fused to at least one transcriptional activator. In some embodiments, the fusion protein comprising the deactivated CRISPR-associated protein 9 (dCas9) with at least one tandem repeat of the transcriptional activator herpes simplex virus VP16 (i.e., VP64) induces transcriptional activation of an endogenous gene. In some embodiments, the at least one transcriptional activator comprises VP64 or a biologically active fragment of VP16. Transcription factors act through a DNA-binding domain that localizes a protein to a specific site within the genome and through accessory effector domains that either activate or repress transcription at or near that site. Effector domains, such as the activation domain of the herpes simplex virus VP16 (66) and the repression domain Kruppel-associated box (KRAB), are modular and retain their activity when they are fused to other DNA-binding proteins. In some embodiments, VP64 comprises the activation domain of VP16. In some embodiments, VP64 is a recombinant tetrameric repeat of the minimal activation domain of VP16. In some embodiments, the activation domain of VP16 comprises amino acids 413-489 of the VP16 protein (66). In some embodiments, the recombinant tetrameric repeat of VP16's minimal activation domain comprises, or consists essentially of, or yet further consists of the amino acid sequence DAL DDFDLDMIL (66). In some embodiments, the third nucleotide molecule encodes a dCas9 protein fused to at least one of VP64, a VP64-p65-Rta tripartite fusion (addgene.org/99670/), and/or SunTag. SunTag is a novel protein scaffold/tagging system with a repeating peptide array for signal amplification in gene expression.

In some embodiments, the dCas9-VP64 fusion protein upregulates genes in an unmethylated chromatin context. In some embodiments, the combination of dCas9-VP64 fusion protein and dCas9-TET1CD shows a synergistic effect resulting in greater than 60% expression of an inactive allele (i.e., a silenced allele). In some embodiments, expression of dCas9-VP64 fusion protein alone does not significantly increase the reactivation levels of the inactive allele. In some embodiments, dual expression of dCas9-VP64 fusion protein and dCas9-TET1CD results in the fewest differentially expressed genes in RNAseq analysis.

In some embodiments, gene activation requires several sgRNAs. In some embodiments, gene activation requires six sgRNAs. In some embodiments, gene activation requires at least about 1-10, 1-5, 1-6, 1-3, 3-6, or 4-6 sgRNAs. In some embodiments, the target sequence for the sgRNA comprises or consists essentially of or consists of one or more of: AGAGCATCGGACCGAAGCGG, GGGGGAGAACATACTCGGGG, CCCAGGTTGCTAGGGCTTGG, ATCGCCTGAAACTTGTCCGG, CGAAAGGGTGTGAAAGAGGG, and/or TGGGGAAGGTAAAGCGGCGA. In some embodiments, the target sequence for the sgRNA comprises or consists essentially of or consists of AGAGCATCGGACCGAAGC. In some embodiments, the target sequence for the sgRNA comprises or consists essentially of or consists of GGGGGAGAACATACTCGGGG.

In some embodiments, the target sequence for the sgRNA comprises or consists essentially of or consists of CCCAGGTTGCTAGGGCTTGG.

In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) comprises or consists essentially of or consists of at least three sgRNAs.

In some embodiments, the second nucleotide molecule encoding at least one small guide RNA (sgRNA) comprises a first sgRNA, a second sgRNA, and a third sgRNA. In some embodiments, the target sequence for the first sgRNA comprises or consists essentially of or consists of AGAGCATCGGACCGAAGCGG. In some embodiments, the target sequence for the second sgRNA comprises or consists essentially of or consists of GGGGGAGAACATACTCGGGG. In some embodiments, the target sequence for the third sgRNA comprises or consists essentially of or consists of CCCAGGTTGCTAGGGCTTGG.

In some embodiments, the target sequence for the first sgRNA comprises or consists essentially of or consists of one or more of AGAGCATCGGACCGAAGCGG, GGGGGAGAACATACTCGGGG, and/or CCCAGGTTGCTAGGGCTTGG.
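For illustration, the six target sequences recited above can be collected and subjected to basic checks (alphabet, length, and GC fraction) with a sketch such as the following. The labels `sg1` to `sg6`, the helper names, and the choice of checks are hypothetical conveniences and do not correspond to any naming or selection criteria used in the disclosure.

```python
# Target sequences as recited in the text; the sg1..sg6 labels are hypothetical.
TARGET_SEQUENCES = {
    "sg1": "AGAGCATCGGACCGAAGCGG",
    "sg2": "GGGGGAGAACATACTCGGGG",
    "sg3": "CCCAGGTTGCTAGGGCTTGG",
    "sg4": "ATCGCCTGAAACTTGTCCGG",
    "sg5": "CGAAAGGGTGTGAAAGAGGG",
    "sg6": "TGGGGAAGGTAAAGCGGCGA",
}


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return sum(1 for base in seq if base in "GC") / len(seq)


def is_valid_target(seq: str, expected_len: int = 20) -> bool:
    """True if the sequence uses only A, C, G, T and has the expected length."""
    return len(seq) == expected_len and set(seq.upper()) <= set("ACGT")


# Example of selecting the three-guide combination described above.
three_guide_set = [TARGET_SEQUENCES[name] for name in ("sg1", "sg2", "sg3")]

for name, seq in TARGET_SEQUENCES.items():
    print(name, seq, is_valid_target(seq), round(gc_fraction(seq), 2))
```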

In one aspect, the present disclosure provides a gene editing system comprising, or consisting essentially of, or yet further consisting of: a first nucleotide molecule encoding a dCas9-Ten-Eleven Translocation methylcytosine dioxygenase 1 catalytic domain (TET1CD) fusion protein, wherein the dCas9-TET1 fusion protein facilitates the targeted demethylation of a gene target selected from the group consisting of CDKL5, SCML2, COL9A3, and MECP2; and a second nucleotide molecule encoding at least one single guide RNA (sgRNA), comprising, or consisting essentially of, or yet further consisting of a scaffold region and a spacer region; wherein the spacer region hybridizes to a nucleotide sequence complementary to a target sequence adjacent to a 5′-end of a protospacer adjacent motif (PAM); and wherein the target sequence and the PAM are located within about 2 or about 1 kilobase (kb), and ranges as described herein, of the transcriptional start site (TSS) of the cyclin dependent kinase-like 5 (CDKL5) gene, and wherein the target sequence for the first sgRNA comprises or consists essentially of AGAGCATCGGACCGAAGCGG, the target sequence for the second sgRNA comprises or consists essentially of or consists of GGGGGAGAACATACTCGGGG, and the target sequence for the third sgRNA comprises or consists essentially of or consists of CCCAGGTTGCTAGGGCTTGG.