The content of the following submission on ASCII text file is incorporated herein by reference in its entirety: a computer readable form (CRF) of the Sequence Listing (file name: 165392000441SEQLIST.TXT, date recorded: May 24, 2022, size: 75,959 bytes).
The present application relates to compositions, systems and methods for editing RNA via targeted pseudouridylation using DKC1.
Pseudouridine (Ψ) is the most abundant post-transcriptionally modified nucleotide in stable RNAs, including tRNA, rRNA, snRNA and mRNA, constituting approximately 5% of total ribonucleotides. The conversion of uridine to Ψ (pseudouridylation) requires two distinct chemical reactions: the breaking of the C1′-N1 glycosidic bond and the making of a new carbon-glycosidic (C1′-C5) bond that relinks the base to the sugar. Pseudouridylation is a true isomerization reaction, which creates an extra hydrogen bond donor and influences a wide variety of functional aspects, such as protein synthesis and increased stop-codon read-through, depending on the type of RNA that carries the Ψ and the position within the RNA sequence (Yu and Meier, 2014, RNA Biology 11:1483-1494). Many of the mRNA Ψs reside in coding regions, and the majority of them respond to environmental stress, indicating functional significance (Carlile et al. 2014, Nature 515:143).
In eukaryotes and archaea, pseudouridylation can be introduced by box H/ACA ribonucleoproteins (RNPs), each of which contains a unique small RNA (box H/ACA RNA, one of the two major classes of small nucleolar RNAs, or ‘snoRNAs’) and four core proteins (dyskerin (DKC1), NHP2, NOP10 and GAR1). Dyskerin (DKC1; also known as NAP57/CBF5) is a highly conserved multifunctional protein that acts as an RNA-guided pseudouridine synthase, directing the enzymatic conversion of specific uridines to pseudouridines. It concentrates in the nucleoli and the Cajal bodies (CBs) where, in association with three other highly conserved proteins (Nop10, Nhp2 and Gar1), it forms a tetramer that is incorporated into different nuclear RNPs with key biological functions. Within the nucleolus, the tetramer associates with H/ACA small nucleolar RNAs (snoRNAs) to form the H/ACA snoRNPs, which regulate rRNA processing and pseudouridylate RNA targets through snoRNA-guided base complementarity. Within the CBs, it associates with CB-specific small RNAs (scaRNAs) to form the scaRNPs, which direct pseudouridylation of spliceosomal snRNAs. NAP57/dyskerin (DKC1)/CBF5 catalyzes the chemical reactions, converting the target uridine to Ψ. The RNA component serves as a guide that specifies, through base-pairing interaction with its substrate RNA, the target uridine for pseudouridylation (Ge and Yu, 2013, Trends Biochem Sci 38(4):210-218). Based on this guide-substrate base-pairing scheme, Karijolich and Yu (2011, Nature 474:395-398) designed an artificial box H/ACA RNA to introduce Ψ into mRNA at a premature termination codon (PTC) in S. cerevisiae. They demonstrated that Ψ was indeed incorporated into TRM4 mRNA at the PTC. Pseudouridylated PTC promoted nonsense suppression by altering ribosome decoding (Fernandez et al. 2013, Nature 500:107-110; Wu et al. 2015, Methods in Enzymology 560:187-217; U.S. Pat. No. 8,603,457).
Using a similar strategy, others showed that artificial H/ACA RNAs could site-specifically pseudouridylate pre-mRNA after microinjection into Xenopus oocytes (Chen et al. 2010, Mol Cell Biol 30:4108-4119). In both examples, the artificial H/ACA RNAs were modified to alter the loops that serve as the guide sequence, but otherwise these snoRNAs were unaltered.
Although site-specific pseudouridylation of target RNAs is a potentially powerful technique, the methods available thus far have resulted in low editing efficiency of target RNAs. Accordingly, there is a need in the art for optimized gsnoRNAs, gsnoRNA-based gene editing systems, and methods of editing a target RNA by pseudouridylation.
The present application provides methods for editing target RNAs in host cells using gsnoRNA and DKC1 protein. Embodiments of the methods are also referred to herein as the “RESTART” method, which can be used to allow read-through of RNA transcripts having a premature termination codon (PTC).
In some aspects, the present application provides a method for editing a target RNA in a host cell, comprising introducing a guide small nucleolar RNA (gsnoRNA) and a nucleic acid molecule encoding a DKC1 protein into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, and wherein the gsnoRNA recruits the DKC1 protein to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from a wildtype H/ACA-snoRNA selected from the group consisting of ACA19, ACA2b, ACA36, ACA44, ACA27, E2, ACA3, and ACA17.
In some aspects, provided herein is a method for editing a target RNA in a host cell, comprising introducing an engineered gsnoRNA into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, wherein the gsnoRNA comprises a scaffold sequence derived from wildtype ACA2b, ACA36, ACA44, ACA27, E2, ACA3, or ACA17, and wherein the gsnoRNA recruits a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the DKC1 protein is an endogenous DKC1 protein of the host cell. In some embodiments, the method further comprises introducing a nucleic acid encoding the DKC1 protein into the host cell.
In some aspects, provided herein is a method for editing a target RNA in a host cell, comprising introducing an engineered gsnoRNA into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, wherein the gsnoRNA comprises a nucleotide sequence selected from the group consisting of the nucleotide sequences provided in Table 2, Table 3, or Table 4, and wherein the gsnoRNA recruits a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA.
In some aspects, provided herein is a method for editing a target RNA in a host cell, comprising introducing an engineered gsnoRNA into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, wherein the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 4-6, 9-12, 15-19, 22-36, and 177-179, and wherein the gsnoRNA recruits a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA.
In some aspects, provided herein is a method for editing a target RNA in a host cell, comprising introducing an engineered gsnoRNA into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, wherein the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 20-21 and 145-150, and wherein the gsnoRNA recruits a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the DKC1 protein is an endogenous DKC1 protein of the host cell. In some embodiments, the method further comprises introducing a nucleic acid encoding the DKC1 protein into the host cell.
In some embodiments according to any of the methods described above, the DKC1 protein has cytoplasmic localization in the host cell.
In some embodiments according to any of the methods described above, the DKC1 protein comprises a DKC1 protein fragment corresponding to amino acid residues 41 to 420 of a human DKC1 isoform 3 protein, wherein the amino acid numbering is according to SEQ ID NO: 2.
In some embodiments according to any of the methods described above, the DKC1 protein comprises an amino acid sequence having at least 85% (e.g., at least about any one of 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more) identity to SEQ ID NO: 88. In some embodiments, the DKC1 protein comprises the amino acid sequence of SEQ ID NO: 88. In some embodiments according to any of the methods described above, the DKC1 protein comprises a naturally occurring DKC1 isoform with cytoplasmic localization in the host cell.
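By way of illustration only, a percent-identity threshold such as the 85% recited above could be checked as in the following minimal sketch. The sketch assumes that the two amino acid sequences have already been aligned (equal length, with "-" marking gaps) and that identity is counted over all alignment columns; the function names are hypothetical, and other identity conventions would give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: inputs are pre-aligned (equal length, '-' for gaps) and
    identity is computed over every column that is not a gap in both rows.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no columns")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 85.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

In practice the alignment itself would be produced by a sequence alignment tool; the sketch only shows the arithmetic applied to its output.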
In some aspects, provided herein is a method for editing a target RNA in a host cell, comprising introducing an engineered gsnoRNA into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, wherein the host cell expresses a DKC1 isoform with cytoplasmic localization, and wherein the gsnoRNA recruits the DKC1 isoform to modify the target uridine residue into a pseudouridine residue in the target RNA.
In some aspects, provided herein is a method for editing a target RNA in a host cell, comprising introducing (a) an engineered gsnoRNA and (b) a splice-switching antisense oligonucleotide (ASO) into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, wherein the ASO enhances expression of a DKC1 protein that is an endogenous DKC1 isoform with cytoplasmic localization in the host cell, and wherein the gsnoRNA recruits the DKC1 protein to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from a wildtype H/ACA-snoRNA selected from the group consisting of ACA19, ACA2b, ACA36, ACA44, ACA27, E2, ACA3, and ACA17.
In some embodiments according to any of the methods described above, the DKC1 isoform corresponds to isoform 3 of human DKC1 protein.
In some embodiments according to any of the methods described above, the DKC1 protein comprises an amino acid sequence having at least 85% (e.g., at least about any one of 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more) identity to SEQ ID NO: 2. In some embodiments, the DKC1 protein comprises the amino acid sequence of SEQ ID NO: 2.
In some embodiments according to any of the methods described above, the DKC1 protein comprises a DKC1 protein fragment corresponding to amino acid residues 1 to 419 of a full-length human DKC1 isoform 1 protein, wherein the amino acid numbering is according to SEQ ID NO: 1.
In some embodiments according to any of the methods described above, the gsnoRNA comprises a scaffold sequence derived from ACA2b. In some embodiments according to any of the methods described above, the gsnoRNA comprises a scaffold sequence derived from ACA36. In some embodiments, the gsnoRNA comprises a mutation in the 3′ hairpin of the ACA36 scaffold.
In some embodiments according to any of the methods described above, the gsnoRNA comprises a scaffold sequence derived from ACA19.
In some embodiments according to any of the methods described above, the gsnoRNA comprises one or more guide sequences each located in a region corresponding to a hairpin structure of the wildtype H/ACA-snoRNA. In some embodiments, at least one of the one or more guide sequences is located in a hairpin structure at the 3′ terminal part (also referred to herein as “3′ hairpin structure”) of the wildtype H/ACA-snoRNA. In some embodiments, at least one of the one or more guide sequences is located in a hairpin structure at the 5′ terminal part (also referred to herein as “5′ hairpin structure”) of the wildtype H/ACA-snoRNA. In some embodiments, the gsnoRNA comprises a single guide sequence. In some embodiments, the gsnoRNA comprises two or more (e.g., 2, 3, 4, 5, 6, or more) guide sequences.
In some embodiments according to any of the methods described above, the gsnoRNA comprises one or more mutations (e.g., substitution, insertion and/or deletion) in one or more hairpin structures (e.g., the 3′ and/or 5′ hairpin structures) of the wildtype ACA19.
In some embodiments according to any of the methods described above, the engineered gsnoRNA comprises one or more substitution mutations in nucleotides of a polyU sequence in the wildtype H/ACA-snoRNA, wherein the polyU sequence comprises at least 4 consecutive U residues.
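As a non-limiting illustration, candidate polyU sequences of at least 4 consecutive U residues could be located in a snoRNA sequence as sketched below. The function name and the example sequence are hypothetical and are not taken from the Sequence Listing; positions are reported with 1-based, inclusive numbering.

```python
import re


def find_polyu_runs(rna: str, min_len: int = 4):
    """Return (start, end, run) for each run of >= min_len consecutive U.

    Positions are 1-based and inclusive. DNA-style input is accepted by
    treating T as U (an assumption made for convenience).
    """
    seq = rna.upper().replace("T", "U")
    pattern = re.compile("U{%d,}" % min_len)
    return [(m.start() + 1, m.end(), m.group()) for m in pattern.finditer(seq)]


# Hypothetical example: one run of four U residues; the run of three is ignored.
print(find_polyu_runs("GCAUUUUGCAUUUGC"))
```

Each reported run marks residues at which a substitution mutation of the kind described above could be considered.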
In some embodiments according to any of the methods described above, the engineered gsnoRNA comprises one or more insertion or deletion mutations positioned between the nucleotide residue in the guide region that hybridizes to the target uridine and an H/ACA box of the wildtype H/ACA snoRNA, whereby the engineered gsnoRNA comprises 14 or 15 nucleotides between the nucleotide residue in the guide region that hybridizes to the target uridine and the H/ACA box.
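For illustration, the spacing described above can be expressed as simple position arithmetic. The sketch below assumes that the relevant box lies 3′ of the guide-region residue and that both positions are supplied by the user with 1-based numbering; the function names are hypothetical, and the sketch does not locate the box or the guide residue itself.

```python
def spacing_to_box(guide_pos: int, box_start: int) -> int:
    """Number of nucleotides strictly between the guide-region residue
    and the first nucleotide of the H or ACA box (1-based positions).

    Assumption: the box lies 3' of the guide-region residue.
    """
    if box_start <= guide_pos:
        raise ValueError("box is assumed to lie 3' of the guide residue")
    return box_start - guide_pos - 1


def indel_to_reach_spacing(spacing: int, allowed=(14, 15)) -> int:
    """Signed number of nucleotides to insert (+) or delete (-) so that
    the spacing reaches the nearest allowed value; 0 if already allowed."""
    if spacing in allowed:
        return 0
    nearest = min(allowed, key=lambda s: abs(s - spacing))
    return nearest - spacing
```

For example, a residue at position 100 with a box starting at position 115 has 14 intervening nucleotides and needs no change, whereas a spacing of 13 would call for a single-nucleotide insertion.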
In some embodiments, the one or more mutations are selected from the group consisting of substitution of residues 26-29 with UUCU, substitution of residues 26-29 with UGUU, addition of G to the 3′ hairpin structure after residue 115, and addition of a dinucleotide sequence (XX, e.g., CU) to the 5′ hairpin after residue 8, wherein X is a nucleotide selected from A, U, C, and G, and wherein the numbering is according to SEQ ID NO: 37. In some embodiments, the dinucleotide sequence is part of the guide RNA designed to hybridize to the target RNA.
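The position-based mutations listed above can be illustrated with the following minimal sketch, which applies substitutions and insertions using 1-based numbering relative to a reference sequence. The reference used here is a placeholder string, not SEQ ID NO: 37, and the helper names are hypothetical. Combining the three edits in one call is for illustration only. Edits are assumed not to overlap and are applied from the 3′-most position to the 5′-most so that the reference numbering stays valid for every edit.

```python
def substitute(seq: str, start: int, end: int, replacement: str) -> str:
    """Replace residues start..end (1-based, inclusive) with replacement."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("substitution range out of bounds")
    return seq[:start - 1] + replacement + seq[end:]


def insert_after(seq: str, position: int, insertion: str) -> str:
    """Insert nucleotides immediately after the given 1-based residue."""
    if not (0 <= position <= len(seq)):
        raise ValueError("insertion position out of bounds")
    return seq[:position] + insertion + seq[position:]


def apply_edits(reference: str, edits) -> str:
    """Apply ("sub", start, end, repl) and ("ins", pos, nts) edits.

    All positions refer to the unmodified reference. Sorting by position
    in descending order keeps upstream numbering unchanged while editing.
    """
    result = reference
    for edit in sorted(edits, key=lambda e: e[1], reverse=True):
        if edit[0] == "sub":
            _, start, end, repl = edit
            result = substitute(result, start, end, repl)
        elif edit[0] == "ins":
            _, pos, nts = edit
            result = insert_after(result, pos, nts)
        else:
            raise ValueError("unknown edit type: %r" % (edit[0],))
    return result


# Placeholder reference of 130 residues (hypothetical, for numbering only).
placeholder = "A" * 130
edited = apply_edits(placeholder, [
    ("sub", 26, 29, "UUCU"),  # substitution of residues 26-29
    ("ins", 115, "G"),        # addition of G after residue 115
    ("ins", 8, "CU"),         # addition of a dinucleotide after residue 8
])
```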
In some embodiments according to any of the methods described above, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 3-12, 15-19, 22-36, and 177-179. In some embodiments, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 15-19.
In some embodiments according to any of the methods described above, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 20-21 and 145-150.
In some embodiments according to any of the methods described above, the method comprises introducing a nucleic acid molecule encoding the gsnoRNA into the host cell. In some embodiments, the nucleic acid molecule encoding the gsnoRNA is under a small RNA promoter. In some embodiments, the nucleic acid molecule encoding the gsnoRNA is under a promoter selected from the group consisting of U6 (e.g., transcribed by Polymerase III) and U1 (e.g., transcribed by Polymerase II) promoters. In some embodiments, the nucleic acid molecule encoding the gsnoRNA is embedded in an intron sequence located between a first exon sequence and a second exon sequence, and wherein the first exon sequence, the intron sequence and the second exon sequence are derived from a naturally-occurring gene. In some embodiments, the intron sequence is from an intron of an endogenous gene in the host cell, wherein the gene is selected from the group consisting of EIF3A, SNHG12, RPL21, and RPSA. In some embodiments, the intron sequence is from an intron of an exogenous gene, such as HBB. In some embodiments, the nucleic acid encoding the gsnoRNA is not embedded in an intron sequence.
In some embodiments according to any of the methods described above, the nucleic acid molecule encoding the DKC1 protein is present in a viral vector. In some embodiments, the nucleic acid molecule encoding the gsnoRNA is present in a viral vector.
In some embodiments according to any of the methods described above, the method comprises introducing into the host cell a vector comprising a first nucleic acid sequence encoding the DKC1 protein and a second nucleic acid sequence encoding the gsnoRNA. In some embodiments, the nucleic acid molecule encoding the DKC1 protein and the nucleic acid molecule encoding the gsnoRNA are present in separate vectors. In some embodiments, the vector is a viral vector. In some embodiments, the vector is an adeno-associated viral (AAV) vector.
In some embodiments according to any of the methods described above, the gsnoRNA comprises one or more chemically modified nucleosides and/or inter-nucleosidic linkages. In some embodiments, the gsnoRNA comprises one or more nucleosides having 2′-OMe or 2′-MOE modifications. In some embodiments, the gsnoRNA comprises no more than 10, no more than 8, no more than 6, or no more than 4 chemically modified nucleosides. In some embodiments, the gsnoRNA comprises one or more phosphorothioate inter-nucleosidic linkages. In some embodiments, the gsnoRNA comprises no more than 10, no more than 9, no more than 8, or no more than 6 phosphorothioate inter-nucleosidic linkages. In some embodiments, the gsnoRNA comprises a 5′ cap modification. In some embodiments, the 5′ cap modification is a 7-methylguanosine (m7G) cap. In some embodiments, the gsnoRNA does not comprise one or more chemically modified nucleosides or inter-nucleosidic linkages.
In some embodiments according to any of the methods described above, efficiency of editing the target RNA is at least 10% (e.g., at least about any one of 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 60%, or higher).
In some embodiments according to any of the methods described above, wherein the sequence comprising the target uridine in the target RNA is a premature termination codon in a sequence encoding a protein, the method results in expression of the full-length protein in the host cell of at least 4% (e.g., at least 5%, at least 6%, at least 7%, at least 8%, at least 9% or at least 10%) of the expression level of the full-length protein without a premature termination codon.
In some embodiments according to any of the methods described above, wherein the sequence comprising the target uridine in the target RNA is a premature termination codon in a sequence encoding a protein, the method results in expression of the full-length protein, wherein the expression of the protein is detectable without enrichment (e.g., without enrichment by immunoprecipitation). In some embodiments, the protein is detected via a tag (e.g., via a fluorescent tag). In some embodiments, the protein is detected by immunostaining according to methods known in the art.
In some embodiments according to any of the methods described above, wherein the sequence comprising the target uridine in the target RNA is a premature termination codon in a sequence encoding a protein, the method results in expression of the full-length protein in at least 20% of host cells (e.g., at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, or at least 50% of host cells).
In some embodiments according to any of the methods described above, the target RNA is not a ribosomal RNA (rRNA) such as an endogenous rRNA of the host cell.
In some embodiments according to any of the methods described above, the target RNA is a messenger RNA (mRNA). In some embodiments, the sequence comprising the target uridine in the target RNA is a stop codon, and modification of the target uridine to pseudouridine causes the stop codon to be translated as a coding codon. In some embodiments, the stop codon is a premature termination codon (PTC). In some embodiments, the PTC is associated with a genetic disease or condition. In some embodiments, the sequence comprising the target uridine in the target RNA is a stop codon, and modification of the target uridine to pseudouridine reduces or prevents nonsense-mediated decay (NMD).
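As a non-limiting illustration of how a candidate target uridine in a PTC could be identified, the following sketch scans a coding sequence in frame for stop codons (UAA, UAG, UGA) located before the final codon. Because each of these stop codons begins with U, the uridine position is reported as the first nucleotide of the codon. The function name and example sequence are hypothetical, and the sketch assumes the input starts at the first nucleotide of the reading frame.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_premature_stops(cds: str):
    """Return (codon_number, uridine_position, codon) for each in-frame
    stop codon located before the final codon of the coding sequence.

    Numbering is 1-based. Assumptions: the input begins in frame, and the
    last complete codon is the normal termination codon, so it is skipped.
    """
    seq = cds.upper().replace("T", "U")
    n_codons = len(seq) // 3
    hits = []
    for i in range(n_codons - 1):
        codon = seq[3 * i:3 * i + 3]
        if codon in STOP_CODONS:
            # Every stop codon starts with U, so the uridine is at 3*i + 1.
            hits.append((i + 1, 3 * i + 1, codon))
    return hits


# Hypothetical example: codons AUG GCU UAG GCA UAA; UAG is the premature stop.
print(find_premature_stops("AUGGCUUAGGCAUAA"))
```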
In some embodiments according to any of the methods described above, the DKC1 protein is part of a ribonucleoprotein (RNP) complex that associates with the gsnoRNA. In some embodiments, the RNP complex comprises NOP10, GAR1, and NHP2.
In some embodiments according to any of the methods described above, the host cell is an archaeal cell. In some embodiments, the host cell is a eukaryotic cell. In some embodiments, the host cell is a mammalian cell. In some embodiments, the host cell is a human cell.
In some embodiments according to any of the methods described above, the method is carried out in vivo. In some embodiments, the method is carried out ex vivo.
In some aspects, provided herein is a method of treating a disease or condition associated with a PTC in a target RNA in a subject, comprising editing the target RNA in a cell of the subject using any of the methods described above, wherein the gsnoRNA comprises a guide sequence that hybridizes to the PTC in the target RNA, and wherein modification of the uridine residue in the PTC to a pseudouridine residue causes translation read-through of the PTC in the target RNA, thereby treating the disease or condition in the subject.
In some embodiments, the disease or condition is selected from the group consisting of Cystic fibrosis, Hurler Syndrome, alpha-1-antitrypsin (A1AT) deficiency, Parkinson's disease, Alzheimer's disease, albinism, Amyotrophic lateral sclerosis, Asthma, β-thalassemia, Cadasil syndrome, Charcot-Marie-Tooth disease, Chronic Obstructive Pulmonary Disease (COPD), Distal Spinal Muscular Atrophy (DSMA), Duchenne/Becker muscular dystrophy, Dystrophic Epidermolysis bullosa, Epidermolysis bullosa, Fabry disease, Factor V Leiden associated disorders, Familial Adenomatous Polyposis, Galactosemia, Gaucher's Disease, Glucose-6-phosphate dehydrogenase, Haemophilia, Hereditary Hemochromatosis, Hunter Syndrome, Huntington's disease, Inflammatory Bowel Disease (IBD), Inherited polyagglutination syndrome, Leber congenital amaurosis, Lesch-Nyhan syndrome, Lynch syndrome, Marfan syndrome, Mucopolysaccharidosis, Muscular Dystrophy, Myotonic dystrophy types I and II, neurofibromatosis, Niemann-Pick disease type A, B and C, NY-eso1 related cancer, Peutz-Jeghers Syndrome, Phenylketonuria, Pompe's disease, Primary Ciliary Disease, Prothrombin mutation related disorders, such as the Prothrombin G20210A mutation, Pulmonary Hypertension, (autosomal dominant) Retinitis Pigmentosa, Sandhoff Disease, Severe Combined Immune Deficiency Syndrome (SCID), Sickle Cell Anemia, Spinal Muscular Atrophy, Stargardt's Disease, Tay-Sachs Disease, Usher syndrome, X-linked immunodeficiency, Sturge-Weber Syndrome, and cancer.
In some aspects, provided herein is an engineered gsnoRNA comprising a guide sequence that hybridizes to a sequence comprising a target uridine residue in a target RNA in a host cell. In some embodiments, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 4-6, 9-12, 15-19, 22-36, and 177-179, and the gsnoRNA is capable of recruiting a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 20-21 and 145-150, and the gsnoRNA is capable of recruiting a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA.
In some aspects, provided herein is an engineered gsnoRNA comprising a guide sequence that hybridizes to a sequence comprising a target uridine residue in a target RNA in a host cell, wherein the gsnoRNA comprises a scaffold sequence derived from a wildtype H/ACA-snoRNA selected from the group consisting of ACA2b, ACA36, ACA44, ACA27, E2, ACA3, and ACA17, and wherein the gsnoRNA is capable of recruiting a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA.
In some embodiments according to any of the engineered gsnoRNAs described above, the gsnoRNA comprises a 5′ cap modification. In some embodiments, the 5′ cap modification is a 7-methylguanosine (m7G) cap. In some embodiments according to any of the engineered gsnoRNAs described above, the gsnoRNA comprises one or more chemically modified nucleosides and/or inter-nucleosidic linkages. In some embodiments according to any one of the engineered gsnoRNAs described above, the gsnoRNA comprises one or more nucleosides having 2′-OMe or 2′-MOE modifications. In some embodiments, the gsnoRNA comprises no more than 10, no more than 8, no more than 6, or no more than 4 chemically modified nucleosides. In some embodiments, the gsnoRNA comprises one or more phosphorothioate inter-nucleosidic linkages. In some embodiments, the gsnoRNA comprises no more than 10, no more than 9, no more than 8, or no more than 6 phosphorothioate inter-nucleosidic linkages.
In some embodiments according to any of the engineered gsnoRNAs described above, the gsnoRNA comprises a scaffold sequence derived from ACA2b. In some embodiments according to any of the engineered gsnoRNAs described above, the gsnoRNA comprises a scaffold sequence derived from ACA36. In some embodiments, the gsnoRNA comprises a mutation in the 3′ hairpin of the ACA36 scaffold.
In some embodiments according to any of the engineered gsnoRNAs described above, the gsnoRNA comprises a scaffold sequence derived from ACA19.
In some embodiments according to any of the engineered gsnoRNAs described above, the gsnoRNA comprises one or more guide sequences each located in a region corresponding to a hairpin structure of the wildtype H/ACA-snoRNA. In some embodiments, at least one of the one or more guide sequences is located in a hairpin structure at the 3′ terminal part of the wildtype H/ACA-snoRNA. In some embodiments, at least one of the one or more guide sequences is located in a hairpin structure at the 5′ terminal part of the wildtype H/ACA-snoRNA. In some embodiments, the gsnoRNA comprises a single guide sequence. In some embodiments, the gsnoRNA comprises two or more (e.g., 2, 3, 4, 5, 6, or more) guide sequences.
In some embodiments according to any of the engineered gsnoRNAs described above, the gsnoRNA comprises one or more mutations (e.g., substitution, insertion and/or deletion) in one or more hairpin structures (e.g., the 3′ and/or 5′ hairpin structures) of the wildtype ACA19.
In some embodiments according to any of the engineered gsnoRNAs described above, the engineered gsnoRNA comprises one or more substitution mutations in nucleotides of a polyU sequence in the wildtype H/ACA-snoRNA, wherein the polyU sequence comprises at least 4 consecutive U residues.
In some embodiments according to any of the engineered gsnoRNAs described above, the engineered gsnoRNA comprises one or more insertion or deletion mutations positioned between the nucleotide residue in the guide region that hybridizes to the target uridine and an H/ACA box of the wildtype H/ACA snoRNA, whereby the engineered gsnoRNA comprises 14 or 15 nucleotides between the nucleotide residue in the guide region that hybridizes to the target uridine and the H/ACA box.
In some embodiments, the one or more mutations are selected from the group consisting of substitution of residues 26-29 with UUCU, substitution of residues 26-29 with UGUU, addition of G to the 3′ hairpin structure after residue 115, and addition of a dinucleotide sequence (XX, e.g., CU) to the 5′ hairpin after residue 8, wherein X is a nucleotide selected from A, U, C, and G, and wherein the numbering is according to SEQ ID NO: 37. In some embodiments, the engineered gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 15-19.
In some aspects, provided herein is an engineered gsnoRNA comprising a guide sequence that hybridizes to a sequence comprising a target uridine residue in a target RNA in a host cell, wherein the gsnoRNA comprises a scaffold sequence derived from a wildtype ACA19, wherein the engineered gsnoRNA comprises one or more insertion or deletion mutations positioned between the nucleotide residue in the guide region that hybridizes to the target uridine and an H/ACA box of the wildtype ACA19, wherein the engineered gsnoRNA comprises 14 or 15 nucleotides between the nucleotide residue in the guide region that hybridizes to the target uridine and the H/ACA box, and wherein the gsnoRNA is capable of recruiting a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the one or more mutations are selected from the group consisting of substitution of residues 26-29 with UUCU, substitution of residues 26-29 with UGUU, addition of G to the 3′ hairpin structure after residue 115, and addition of a dinucleotide sequence (XX, e.g., CU) to the 5′ hairpin after residue 8, wherein X is a nucleotide selected from A, U, C, and G, and wherein the numbering is according to SEQ ID NO: 37. In some embodiments, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 15-19.
In some aspects, provided herein is an isolated nucleic acid molecule comprising a sequence encoding the engineered gsnoRNA of any of the preceding embodiments. In some embodiments, provided herein is a vector (e.g., viral vector) comprising the nucleic acid molecule.
In some aspects, provided herein is an engineered RNA-editing system comprising: (a) a gsnoRNA comprising a guide sequence that hybridizes to a sequence comprising a target uridine residue in a target RNA in a host cell, or a nucleic acid molecule encoding the gsnoRNA; and (b) a DKC1 protein, or a nucleic acid molecule encoding the DKC1 protein, wherein the gsnoRNA is capable of recruiting the DKC1 protein to modify the target uridine residue into a pseudouridine residue in the target RNA.
In some embodiments, the DKC1 protein has cytoplasmic localization in the host cell. In some embodiments according to any of the engineered RNA-editing systems described above, the DKC1 protein comprises a DKC1 protein fragment corresponding to amino acid residues 41 to 420 of a human DKC1 isoform 3 protein, wherein the amino acid numbering is according to SEQ ID NO: 2.
In some embodiments according to any of the engineered RNA-editing systems described above, the DKC1 protein comprises an amino acid sequence having at least 85% (e.g., at least about any one of 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more) identity to SEQ ID NO: 88. In some embodiments, the DKC1 protein comprises the amino acid sequence of SEQ ID NO: 88. In some embodiments according to any of the engineered RNA-editing systems described above, the DKC1 protein comprises a naturally occurring DKC1 isoform with cytoplasmic localization in the host cell.
In some embodiments according to any of the engineered RNA-editing systems described above, the DKC1 isoform corresponds to isoform 3 of human DKC1 protein.
In some embodiments according to any of the engineered RNA-editing systems described above, the DKC1 protein comprises an amino acid sequence having at least 85% (e.g., at least about any one of 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more) identity to SEQ ID NO: 2. In some embodiments, the DKC1 protein comprises the amino acid sequence of SEQ ID NO: 2.
In some embodiments according to any of the engineered RNA-editing systems described above, the DKC1 protein comprises a DKC1 protein fragment corresponding to amino acid residues 1 to 419 of a full-length human DKC1 isoform 1 protein, wherein the amino acid numbering is according to SEQ ID NO: 1.
In some embodiments according to any of the engineered RNA-editing systems described above, the gsnoRNA comprises a scaffold sequence derived from ACA2b. In some embodiments according to any of the engineered RNA-editing systems described above, the gsnoRNA comprises a scaffold sequence derived from ACA36. In some embodiments, the gsnoRNA comprises a mutation in the 3′ hairpin of the ACA36 scaffold.
In some embodiments according to any of the engineered RNA-editing systems described above, the gsnoRNA comprises a scaffold sequence derived from ACA19.
In some embodiments according to any of the engineered RNA-editing systems described above, the gsnoRNA comprises one or more guide sequences each located in a region corresponding to a hairpin structure of the wildtype H/ACA-snoRNA. In some embodiments, at least one of the one or more guide sequences is located in a hairpin structure at the 3′ terminal part of the wildtype H/ACA-snoRNA. In some embodiments, at least one of the one or more guide sequences is located in a hairpin structure at the 5′ terminal part of the wildtype H/ACA-snoRNA. In some embodiments, the gsnoRNA comprises a single guide sequence. In some embodiments, the gsnoRNA comprises two or more (e.g., 2, 3, 4, 5, 6, or more) guide sequences.
In some embodiments according to any of the engineered RNA-editing systems described above, the gsnoRNA comprises one or more mutations (e.g., substitution, insertion and/or deletion) in one or more hairpin structures (e.g., the 3′ and/or 5′ hairpin structures) of the wildtype ACA19.
In some embodiments according to any of the engineered RNA-editing systems described above, the engineered gsnoRNA comprises one or more substitution mutations in nucleotides of a polyU sequence in the wildtype H/ACA-snoRNA, wherein the polyU sequence comprises at least 4 consecutive U residues.
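The polyU criterion above (at least 4 consecutive U residues) can be illustrated with a minimal Python sketch. The sequence `toy` is a hypothetical placeholder and is not taken from the Sequence Listing; the helper names `find_poly_u_runs` and `substitute` are likewise assumptions introduced only for illustration.

```python
import re

# A polyU sequence as described above: at least 4 consecutive U residues.
POLY_U = re.compile(r"U{4,}")


def find_poly_u_runs(rna: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of each run of >= 4 U."""
    return [(m.start() + 1, m.end()) for m in POLY_U.finditer(rna.upper())]


def substitute(rna: str, start: int, replacement: str) -> str:
    """Overwrite residues beginning at 1-based `start` with `replacement`."""
    i = start - 1
    return rna[:i] + replacement + rna[i + len(replacement):]


# Hypothetical toy sequence (NOT a sequence from the Sequence Listing).
toy = "GGCAUUUUGCCAUUUACG"
runs = find_poly_u_runs(toy)                     # one run, residues 5-8
edited = substitute(toy, runs[0][0], "UUCU")     # break the run by substitution
```

In this sketch a single substitution within the run is enough for the edited sequence to contain no remaining stretch of 4 or more consecutive U residues.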
In some embodiments according to any of the engineered RNA-editing systems described above, the engineered gsnoRNA comprises one or more insertion or deletion mutations positioned between the nucleotide residue in the guide region that hybridizes to the target uridine and an H/ACA box of the wildtype H/ACA snoRNA, whereby the engineered gsnoRNA comprises 14 or 15 nucleotides between the nucleotide residue in the guide region that hybridizes to the target uridine and the H/ACA box.
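As a rough illustration of the spacing requirement above, the following Python sketch counts the nucleotides lying between two positions and reports how many nucleotides would have to be inserted or deleted to reach a spacer of 14 or 15 nucleotides. The function names and the example positions are hypothetical; how the guide residue and the box are located in a real scaffold is not addressed here.

```python
def spacer_length(guide_pos: int, box_start: int) -> int:
    """Count nucleotides strictly between two 1-based positions.

    guide_pos: position of the guide residue that hybridizes to the target uridine.
    box_start: position of the first nucleotide of the H or ACA box.
    """
    if box_start <= guide_pos:
        raise ValueError("the box must lie 3' of the guide residue")
    return box_start - guide_pos - 1


def indel_to_reach(spacer: int, allowed: tuple[int, int] = (14, 15)) -> int:
    """Signed number of nucleotides to insert (+) or delete (-) so that the
    spacer falls within the allowed range; 0 if it already does."""
    low, high = min(allowed), max(allowed)
    if spacer < low:
        return low - spacer
    if spacer > high:
        return high - spacer
    return 0


# Hypothetical positions chosen only to exercise the arithmetic.
current = spacer_length(guide_pos=40, box_start=57)   # 16 nucleotides in between
change = indel_to_reach(current)                      # -1: delete one nucleotide
```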
In some embodiments of the engineered RNA-editing system, the DKC1 protein is part of a ribonucleoprotein (RNP) complex that associates with the gsnoRNA. In some embodiments, the ribonucleoprotein complex comprises NOP10, GAR1, and/or NHP2.
In some embodiments, the one or more mutations are selected from the group consisting of substitution of residues 26-29 with UUCU, substitution of residues 26-29 with UGUU, addition of G to the 3′ hairpin structure after residue 115, and addition of a dinucleotide sequence (XX, e.g., CU) to the 5′ hairpin after residue 8, wherein X is a nucleotide selected from A, U, C, and G, and wherein the numbering is according to SEQ ID NO: 37.
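Because the positions of these mutations are all numbered against the same reference sequence, combining several of them requires care that an upstream insertion does not shift the numbering of a downstream edit. A minimal Python sketch of one way to handle this is shown below. The scaffold `placeholder` is a hypothetical 120-nucleotide string, not SEQ ID NO: 37, and the tuple encoding of mutations is an assumption made for illustration.

```python
def substitute_range(seq: str, start: int, end: int, new: str) -> str:
    """Replace 1-based inclusive residues start..end with `new`."""
    return seq[:start - 1] + new + seq[end:]


def insert_after(seq: str, pos: int, new: str) -> str:
    """Insert `new` immediately after 1-based residue `pos`."""
    return seq[:pos] + new + seq[pos:]


def apply_mutations(seq: str, mutations: list[tuple]) -> str:
    """Apply mutations numbered against the ORIGINAL sequence.

    Edits are applied from the 3'-most position to the 5'-most position so
    that earlier insertions never shift the coordinates of later edits.
    """
    for mutation in sorted(mutations, key=lambda m: m[1], reverse=True):
        kind, *args = mutation
        if kind == "sub":
            start, end, new = args
            seq = substitute_range(seq, start, end, new)
        elif kind == "ins":
            pos, new = args
            seq = insert_after(seq, pos, new)
        else:
            raise ValueError(f"unknown mutation type: {kind}")
    return seq


# Hypothetical placeholder scaffold (NOT SEQ ID NO: 37).
placeholder = "ACGU" * 30
mutated = apply_mutations(
    placeholder,
    [("sub", 26, 29, "UUCU"), ("ins", 115, "G"), ("ins", 8, "CU")],
)
```

The example combines one of the two alternative substitutions at residues 26-29 with both insertions; any other subset could be passed in the same way.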
In some embodiments according to any of the engineered RNA-editing systems described above, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 3-12, 15-19, 22-36, and 177-179. In some embodiments, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 15-19.
In some embodiments according to any of the engineered RNA-editing systems described above, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 20-21 and 145-150.
In some aspects, provided herein is a pharmaceutical composition comprising any of the gsnoRNA, the nucleic acid molecules, or the engineered RNA-editing systems described above, and a pharmaceutically acceptable carrier.
In some aspects, provided herein is a host cell comprising any of the gsnoRNA, the nucleic acid molecules, or the engineered RNA-editing systems described above.
In some aspects, provided herein is a kit for editing a target RNA in a host cell, comprising any of the gsnoRNA, the nucleic acid molecules, or the engineered RNA-editing systems described above.
Also provided are compositions, kits and articles of manufacture for use in any one of the methods described above.
The present application provides methods and compositions for editing a target RNA in a host cell, comprising introducing an engineered guide small nucleolar RNA (gsnoRNA) into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, and wherein the gsnoRNA recruits a DKC1 protein to modify the target uridine residue into a pseudouridine residue in the target RNA. In some aspects, the gsnoRNA is an engineered gsnoRNA comprising one or more mutations compared to a wildtype H/ACA scaffold. In some embodiments, the one or more mutations increase the editing efficiency of the gsnoRNA. In some aspects, the method further comprises increasing the cellular levels of a DKC1 protein with cytoplasmic localization, whereby the editing efficiency of the gsnoRNA/DKC1 protein complex is increased. In some aspects, the methods and compositions provided herein can be used to edit a premature termination codon (PTC) in a target gene mRNA, thereby suppressing nonsense-mediated decay of the mRNA and promoting translation of the full-length protein. In some embodiments, the methods disclosed herein can be used to treat a disease associated with a PTC in a target gene.
In some aspects, the present disclosure provides engineered gsnoRNAs and gsnoRNA scaffolds, or nucleic acid molecules encoding the gsnoRNAs. In some embodiments, the engineered gsnoRNA scaffolds are based on wildtype H/ACA snoRNA scaffolds identified by the present inventors as having higher editing efficiency compared to other scaffolds. In some embodiments, the engineered gsnoRNA scaffolds comprise mutations that increase their editing efficiency.
The methods and compositions described in the present application are based at least in part on the unexpected discovery that expression of an isoform of DKC1 with cytoplasmic localization (e.g., isoform 3 of human DKC1) significantly increases the editing efficiency of a target RNA using a gsnoRNA/DKC1 system. In one aspect, the present inventors realized that by introducing an exogenous DKC1 isoform with cytoplasmic localization, the editing efficiency of a gsnoRNA could be increased. In another aspect, the present inventors identified truncation and deletion variants of the DKC1 protein that can be used to increase the editing efficiency of a gsnoRNA.
In some aspects, provided herein are nucleic acid constructs encoding gsnoRNA for use according to the methods described herein. In some embodiments, the present inventors identified promoters and construct configurations for gsnoRNA expression that provide increased editing efficiency of the gsnoRNA.
Terms are used herein as generally used in the art, unless otherwise defined as follows.
The terms “polynucleotide,” “nucleic acid,” “nucleotide sequence,” and “nucleic acid sequence” are used interchangeably. They refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof.
As used herein, “complementarity” refers to the ability of a nucleic acid to form hydrogen bond(s) with another nucleic acid by traditional Watson-Crick and Wobble base-pairing. A percent complementarity indicates the percentage of residues in a nucleic acid molecule which can form hydrogen bonds (i.e., Watson-Crick and Wobble base pairing) with a second nucleic acid (e.g., about 5, 6, 7, 8, 9, 10 out of 10, being about 50%, 60%, 70%, 80%, 90%, and 100% complementary respectively). “Perfectly complementary” means that all the contiguous residues of a nucleic acid sequence form hydrogen bonds with the same number of contiguous residues in a second nucleic acid sequence. “Substantially complementary” as used herein refers to a degree of complementarity that is at least about any one of 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of about 40, 50, 60, 70, 80, 100, 150, 200, 250 or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
References to “hybridization” typically refer to specific hybridization, and exclude non-specific hybridization. Specific hybridization can occur under experimental conditions chosen, using techniques well known in the art, to ensure that the majority of stable interactions between probe and target are where the probe and target have at least 70%, preferably at least 80%, more preferably at least 90% sequence identity.
The term “mismatch” is used herein to refer to opposing nucleotides in a double stranded RNA complex which do not form perfect base pairs according to the Watson-Crick and Wobble base pairing rules. Mismatching nucleotides are G-A, C-A, U-C, A-A, G-G, C-C, U-U pairs. Wobble base pairs are: G-U, I-U, I-A, and I-C base pairs.
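The definitions of complementarity and mismatch given above can be expressed as a short Python sketch. It assumes two strands of equal length, both written 5′ to 3′ and paired in antiparallel orientation without gaps; the function names are hypothetical, and any pair not listed above as Watson-Crick or wobble is treated as a mismatch.

```python
# Pair sets taken from the definitions above (I = inosine).
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU"), frozenset("IU"), frozenset("IA"), frozenset("IC")}


def classify_pair(x: str, y: str) -> str:
    """Classify two opposing nucleotides as Watson-Crick, wobble, or mismatch."""
    pair = frozenset((x.upper(), y.upper()))
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


def percent_complementarity(strand_a: str, strand_b: str) -> float:
    """Percentage of residues in strand_a that pair with strand_b.

    Both strands are given 5'->3'; strand_b is reversed so that the two
    strands are compared in antiparallel orientation.
    """
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        classify_pair(a, b) != "mismatch"
        for a, b in zip(strand_a, reversed(strand_b))
    )
    return 100.0 * paired / len(strand_a)


# Hypothetical 10-nt strands: 9 of 10 positions pair, i.e. 90% complementary.
example = percent_complementarity("GGAUCCAUGC", "GCAUGGAUCA")
```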
The present disclosure provides several types of compositions that are polynucleotide or polypeptide based, including variants and derivatives. These include, for example, substitutional, insertional, deletion and covalent variants and derivatives. The term “derivative” is synonymous with the term “variant” and generally refers to a molecule that has been modified and/or changed in any way relative to a reference molecule or a starting molecule.
As such, polynucleotides encoding peptides or polypeptides containing substitutions, insertions and/or additions, deletions and covalent modifications with respect to reference sequences, in particular, the polypeptide sequences disclosed herein, are included within the scope of this disclosure. For example, sequence tags or amino acids, such as one or more lysines, can be added to peptide sequences (e.g., at the N-terminal or C-terminal ends). Sequence tags can be used for peptide detection, purification or localization. Lysines can be used to increase peptide solubility or to allow for biotinylation. Alternatively, amino acid residues located at the carboxy and amino terminal regions of the amino acid sequence of a peptide or protein may optionally be deleted providing for truncated sequences. Certain amino acids (e.g., C-terminal residues or N-terminal residues) alternatively may be deleted depending on the use of the sequence, as for example, expression of the sequence as part of a larger sequence that is soluble, or linked to a solid support.
The term “identity” refers to the overall relatedness between polymeric molecules, for example, between polynucleotide molecules (e.g. DNA molecules and/or RNA molecules) and/or between polypeptide molecules. Calculation of the percent identity of two polynucleic acid sequences, for example, can be performed by aligning the two sequences for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second nucleic acid sequences for optimal alignment and non-identical sequences can be disregarded for comparison purposes). In certain embodiments, the length of a sequence aligned for comparison purposes is at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of the length of the reference sequence. The nucleotides at corresponding nucleotide positions are then compared. When a position in the first sequence is occupied by the same nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which needs to be introduced for optimal alignment of the two sequences. The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. For example, the percent identity between two nucleic acid sequences can be determined using methods such as those described in Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; each of which is incorporated herein by reference. For example, the percent identity between two nucleic acid sequences can be determined using the algorithm of Meyers and Miller (CABIOS, 1989, 4:11-17), which has been incorporated into the ALIGN program (version 2.0) using a PAM 120 weight residue table, a gap length penalty of 12 and a gap penalty of 4. The percent identity between two nucleic acid sequences can, alternatively, be determined using the GAP program in the GCG software package using an NWSgapdna.CMP matrix. Methods commonly employed to determine percent identity between sequences include, but are not limited to, those disclosed in Carrillo, H., and Lipman, D., SIAM J Applied Math., 48:1073 (1988); incorporated herein by reference. Techniques for determining identity are codified in publicly available computer programs. Exemplary computer software to determine homology between two sequences includes, but is not limited to, the GCG program package (Devereux, J., et al., Nucleic Acids Research, 12(1), 387 (1984)), BLASTP, BLASTN, and FASTA (Altschul, S. F. et al., J. Molec. Biol., 215, 403 (1990)).
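Once two sequences have been aligned by any of the programs mentioned above, the final counting step can be sketched in a few lines of Python. The sketch below assumes the input is already aligned, with `-` marking gaps, and it divides the number of identical positions by the total number of alignment columns, so that every gap position lowers the score. This is only one possible convention; the programs cited above may normalize differently, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Identical positions are columns where both sequences carry the same
    residue. Columns in which both sequences have a gap are ignored; all
    other columns, including single-gap columns, count in the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no comparable positions")
    identical = sum(a == b and a != gap for a, b in columns)
    return 100.0 * identical / len(columns)


# Hypothetical toy alignment: 7 identical positions over 9 columns.
score = percent_identity("ACGU-ACGU", "ACGUUACGA")
```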
“Percent (%) amino acid sequence identity” with respect to the polypeptide sequences identified herein is defined as the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in the polypeptide being compared, after aligning the sequences considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, Megalign (DNASTAR), or MUSCLE software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared. For purposes herein, however, % amino acid sequence identity values are generated using the sequence comparison computer program MUSCLE (Edgar, R. C., Nucleic Acids Research 32(5): 1792-1797, 2004; Edgar, R. C., BMC Bioinformatics 5(1): 113, 2004, each of which is incorporated herein by reference in its entirety for all purposes).
The terms “non-naturally occurring” and “engineered” are used interchangeably and indicate the involvement of the hand of man. The terms, when referring to nucleic acid molecules or polypeptides, mean that the nucleic acid molecule or the polypeptide comprises at least one modification (e.g., at least one mutation, such as a substitution, insertion, or deletion, or at least one non-naturally occurring chemical modification) compared to a naturally-occurring nucleic acid molecule or polypeptide, or is at least substantially free from at least one other component with which it is naturally associated in nature and as found in nature.
The term “wildtype” as used herein in reference to an ACA scaffold sequence refers to the sequence of a naturally occurring box H/ACA small nucleolar RNA.
As used herein, “expression” refers to the process by which a polynucleotide is transcribed from a DNA template (such as into an mRNA or other RNA transcript) and/or the process by which a transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins. Transcripts and encoded polypeptides may be collectively referred to as “gene product.” If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell.
The terms “polypeptide” or “peptide” are used herein to encompass all kinds of naturally occurring and synthetic proteins, including protein fragments of all lengths, fusion proteins and modified proteins, including without limitation, glycoproteins, as well as all other types of modified proteins (e.g., proteins resulting from phosphorylation, acetylation, myristoylation, palmitoylation, glycosylation, oxidation, formylation, amidation, polyglutamylation, ADP-ribosylation, pegylation, biotinylation, etc.).
The term “pharmaceutical composition” refers to a preparation which is in such form as to permit the biological activity of an active ingredient contained therein to be effective, and which contains no additional components which are unacceptably toxic to a subject to which the formulation would be administered.
A “pharmaceutically acceptable carrier” refers to one or more ingredients in a pharmaceutical formulation, other than an active ingredient, which is nontoxic to a subject. A pharmaceutically acceptable carrier includes, but is not limited to, a buffer, excipient, stabilizer, cryoprotectant, tonicity agent, preservative, and combinations thereof. Pharmaceutically acceptable carriers or excipients have preferably met the required standards of toxicological and manufacturing testing and/or are included on the Inactive Ingredient Guide prepared by the U.S. Food and Drug administration or other state/federal government or listed in the U.S. Pharmacopeia or other generally recognized pharmacopeia for use in mammals, and more particularly in humans.
The term “package insert” is used to refer to instructions customarily included in commercial packages of therapeutic products, that contain information about the indications, usage, dosage, administration, combination therapy, contraindications and/or warnings concerning the use of such therapeutic products.
An “article of manufacture” is any manufacture (e.g., a package or container) or kit comprising at least one reagent, e.g., a medicament for treatment of a disease or condition (e.g., coronavirus infection), or a probe for specifically detecting a biomarker described herein. In certain embodiments, the manufacture or kit is promoted, distributed, or sold as a unit for performing the methods described herein.
It is understood that embodiments described herein include “consisting of” and/or “consisting essentially of” embodiments.
Reference to “about” a value or parameter herein includes (and describes) variations that are directed to that value or parameter per se. For example, description referring to “about X” includes description of “X”.
As used herein, reference to “not” a value or parameter generally means and describes “other than” a value or parameter. For example, the method is not used to treat disease of type X means the method is used to treat disease of types other than X.
The term “about X-Y” used herein has the same meaning as “about X to about Y.”
As used herein and in the appended claims, the singular forms “a,” “an,” or “the” include plural referents unless the context clearly dictates otherwise.
The term “and/or” as used herein in a phrase such as “A and/or B” is intended to include both A and B; A or B; A (alone); and B (alone). Likewise, the term “and/or” as used herein in a phrase such as “A, B, and/or C” is intended to encompass each of the following embodiments: A, B, and C; A, B, or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone).
In some aspects, provided herein is an engineered gsnoRNA comprising a guide sequence that hybridizes to a sequence comprising a target uridine residue in a target RNA in a host cell, wherein the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 4-6, 9-12, 15-19, 22-36, and 177-179, and wherein the gsnoRNA is capable of recruiting a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the gsnoRNA comprises one or more nucleosides having 2′-OMe or 2′-MOE modifications. In some embodiments, the engineered gsnoRNA comprises no more than 10, no more than 8, no more than 6, or no more than 4 chemically modified nucleosides. In some embodiments, the engineered gsnoRNA comprises one or more phosphorothioate inter-nucleosidic linkages. In some embodiments, the gsnoRNA comprises no more than 10, no more than 9, no more than 8, or no more than 6 phosphorothioate inter-nucleosidic linkages. In some embodiments, the gsnoRNA comprises a 5′ cap modification (e.g., a 7-methylguanosine (m7G) cap modification). In some embodiments, the 5′ cap modification is introduced by in vitro transcription using an m7G(5′)ppp(5′)G cap analog.
In some embodiments, the engineered gsnoRNA is produced by in vitro transcription. In some embodiments, the engineered gsnoRNA produced by in vitro transcription is a full-length gsnoRNA (e.g., comprising a 3′ hairpin, a 5′ hairpin, an H box, and an ACA box). In some embodiments, the engineered gsnoRNA produced by in vitro transcription comprises a 5′ cap modification (e.g., a 7-methylguanosine (m7G) cap modification).
In some embodiments, the engineered gsnoRNA comprises a single hairpin and an H box, but does not comprise an ACA box. In some embodiments, the engineered gsnoRNA comprises the sequence of SEQ ID NO: 179. In some embodiments, the engineered gsnoRNA comprises a single hairpin and an ACA box, but does not comprise an H box. In some embodiments, the engineered gsnoRNA comprises the sequence of SEQ ID NO: 180. In some embodiments, the gsnoRNA comprises one or more nucleosides having 2′-OMe or 2′-MOE modifications. In some embodiments, the engineered gsnoRNA comprises no more than 10, no more than 8, no more than 6, or no more than 4 chemically modified nucleosides. In some embodiments, the engineered gsnoRNA comprises one or more phosphorothioate inter-nucleosidic linkages. In some embodiments, the gsnoRNA comprises no more than 10, no more than 9, no more than 8, or no more than 6 phosphorothioate inter-nucleosidic linkages.
In some aspects, provided herein is an engineered gsnoRNA comprising a guide sequence that hybridizes to a sequence comprising a target uridine residue in a target RNA in a host cell, wherein the gsnoRNA comprises a scaffold sequence derived from wildtype ACA2b or ACA36, and wherein the gsnoRNA is capable of recruiting a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from the sequence of SEQ ID NOs: 11 or 12. In some embodiments, the gsnoRNA comprises one, two, three, or four substitution, deletion, and/or insertion mutations compared to SEQ ID NOs: 11 or 12. In some embodiments, the gsnoRNA comprises one or more nucleosides having 2′-OMe or 2′-MOE modifications. In some embodiments, the engineered gsnoRNA comprises no more than 10, no more than 8, no more than 6, or no more than 4 chemically modified nucleosides. In some embodiments, the engineered gsnoRNA comprises one or more phosphorothioate inter-nucleosidic linkages. In some embodiments, the gsnoRNA comprises no more than 10, no more than 9, no more than 8, or no more than 6 phosphorothioate inter-nucleosidic linkages. In some embodiments, the gsnoRNA comprises a 5′ cap modification (e.g., a 7-methylguanosine (m7G) cap modification). In some embodiments, the 5′ cap modification is introduced by in vitro transcription using an m7G(5′)ppp(5′)G cap analog.
In some aspects, provided herein is an isolated nucleic acid molecule comprising a sequence encoding a gsnoRNA provided herein. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from a wildtype H/ACA-snoRNA selected from the group consisting of ACA19, ACA2b, ACA36, ACA44, ACA27, E2, ACA3, and ACA17. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from a wildtype H/ACA-snoRNA selected from the group consisting of ACA2b, ACA36, ACA44, ACA27, E2, ACA3, and ACA17. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from ACA36. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from ACA2b. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from ACA19. In some embodiments, the nucleic acid molecule further comprises a sequence encoding an agent that promotes expression of isoform 3 of a DKC1 protein (e.g., a splice-switching antisense oligonucleotide (ASO), wherein the ASO enhances expression of a DKC1 protein that is an endogenous DKC1 isoform with cytoplasmic localization in the host cell). In some embodiments, the nucleic acid molecule further comprises a sequence encoding a DKC1 isoform or DKC1 protein variant, wherein the isoform or variant has cytoplasmic localization. Exemplary DKC1 proteins are described in Section II A below.
In some aspects, provided herein is an engineered RNA-editing system comprising: (a) a gsnoRNA (such as any one of the gsnoRNAs described in Section II B below) comprising a guide sequence that hybridizes to a sequence comprising a target uridine residue in a target RNA in a host cell, or a nucleic acid molecule encoding the gsnoRNA; and (b) a DKC1 protein (such as any one of the DKC1 proteins described in Section II A below), or a nucleic acid molecule encoding the DKC1 protein, wherein the gsnoRNA is capable of recruiting the DKC1 protein to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the DKC1 protein is a DKC1 isoform with cytoplasmic localization.
In some aspects, provided herein is a host cell comprising any of the gsnoRNAs, nucleic acid constructs/molecules, or engineered RNA-editing systems described herein.
In some aspects, provided herein is a kit for editing a target RNA in a host cell, comprising any of the gsnoRNAs, nucleic acid constructs/molecules, or engineered RNA-editing systems described herein.
The present application in some embodiments provides engineered DKC1 proteins or nucleic acid constructs encoding a DKC1 protein.
Dyskerin (DKC1) is a highly conserved multifunctional protein that acts as an RNA-guided pseudouridine synthase, directing the enzymatic conversion of specific uridines to pseudouridines. It concentrates in the nucleoli and the Cajal bodies (CBs) where, in association with three other highly conserved proteins (Nop10, Nhp2, Gar1), DKC1 forms a tetramer that is incorporated into different nuclear RNPs with key biological functions. Within the nucleolus, the tetramer associates with H/ACA small nucleolar RNAs (snoRNAs) to form the H/ACA snoRNPs, which regulate rRNA processing and pseudouridylate RNA targets by snoRNA-guided base complementarity. Within the CBs, it associates with CB-specific small RNAs (scaRNAs) to form the scaRNPs, which direct pseudouridylation of spliceosomal snRNAs.
There are two DKC1 isoforms in human cells: DKC1 isoform 1 is the canonical DKC1 form containing the bipartite N- and C-terminal nuclear localization signals (NLSs); DKC1 isoform 3 is an alternative splicing variant, which is produced by retention of the intron 12 and lacks C-terminal NLS (
In some aspects, compositions of the present disclosure comprise nucleic acid constructs for expression of a DKC1 protein. In some aspects, compositions of the present disclosure comprise a DKC1 protein (e.g., a DKC1 protein in complex with a gsnoRNA). In some embodiments, the DKC1 protein is isoform 3 of a mammalian DKC1 protein. In some embodiments, the DKC1 protein is homologous to isoform 3 of a human DKC1 protein. In some embodiments, the DKC1 protein has at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to isoform 3 of human DKC1 protein. In some embodiments, the DKC1 protein is isoform 3 of human DKC1 protein. In some embodiments, the DKC1 protein comprises an amino acid sequence having at least 85% (e.g., at least any of 90%, 95%, 96%, 97%, 98%, 99%, or 100%) sequence identity to SEQ ID NO: 2. The sequences of full-length DKC1 (isoform 1) and isoform 3 DKC1 are shown in Table 1 below.
In some embodiments, the DKC1 protein is part of a ribonucleoprotein (RNP) complex that associates with the gsnoRNA. In some embodiments, the ribonucleoprotein complex comprises NOP10, GAR1, and/or NHP2.
In some aspects, provided herein are truncated DKC1 protein variants and nucleic acid constructs encoding the same. In some embodiments, the DKC1 protein comprises a deletion of a nuclear localization signal (NLS) relative to a wildtype DKC1 protein of the same species. In some embodiments, the DKC1 protein comprises a deletion of amino acid residues 9-21 of DKC1 isoform 3, wherein the amino acid numbering is based on SEQ ID NO: 2. In some embodiments, the DKC1 protein comprises amino acid residues 22-420 of DKC1 isoform 3, wherein the amino acid numbering is based on SEQ ID NO: 2. In some embodiments, the DKC1 protein comprises amino acid residues 35-420 of DKC1 isoform 3, wherein the amino acid numbering is based on SEQ ID NO: 2. In some embodiments, the DKC1 protein comprises amino acid residues 41-420 of DKC1 isoform 3, wherein the amino acid numbering is based on SEQ ID NO: 2. Although the DKC1 sequence in SEQ ID NO: 2 is isoform 3 of human DKC1, the person of ordinary skill in the art will understand how to generate corresponding truncation and deletion variants of homologous DKC1 proteins based on sequence alignments (e.g., corresponding deletion/truncation variants of DKC1 proteins from other mammalian species).
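The residue ranges recited above use 1-based, inclusive numbering, which differs from the 0-based, half-open slicing used by most programming languages. The Python sketch below shows how the deletion and truncation variants could be derived from a reference sequence under that numbering. The string `placeholder` is a hypothetical 420-residue sequence used only to make the ranges valid; it is not SEQ ID NO: 2, and the helper names are assumptions.

```python
def residues(seq: str, start: int, end: int) -> str:
    """Return 1-based inclusive residues start..end of `seq`."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("residue range outside the reference sequence")
    return seq[start - 1:end]


def delete_residues(seq: str, start: int, end: int) -> str:
    """Return `seq` with 1-based inclusive residues start..end removed."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("residue range outside the reference sequence")
    return seq[:start - 1] + seq[end:]


# Hypothetical 420-residue placeholder (NOT SEQ ID NO: 2).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
placeholder = "".join(AMINO_ACIDS[i % 20] for i in range(420))

# Variants recited above, numbered against the reference sequence.
variants = {
    "del_9_21": delete_residues(placeholder, 9, 21),
    "res_22_420": residues(placeholder, 22, 420),
    "res_35_420": residues(placeholder, 35, 420),
    "res_41_420": residues(placeholder, 41, 420),
}
```

For a homologous DKC1 protein from another species, the same helpers could be applied after mapping the residue numbers through a sequence alignment, as noted above.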
In some embodiments, the DKC1 protein comprises an amino acid sequence having at least 85% (e.g., at least any of 90%, 95%, 96%, 97%, 98%, 99%, or 100%) sequence identity to SEQ ID NO: 85. In some embodiments, the DKC1 protein comprises an amino acid sequence having at least 85% (e.g., at least any of 90%, 95%, 96%, 97%, 98%, 99%, or 100%) sequence identity to SEQ ID NO: 86. In some embodiments, the DKC1 protein comprises an amino acid sequence having at least 85% (e.g., at least any of 90%, 95%, 96%, 97%, 98%, 99%, or 100%) sequence identity to SEQ ID NO: 87. In some embodiments, the DKC1 protein comprises an amino acid sequence having at least 85% (e.g., at least any of 90%, 95%, 96%, 97%, 98%, 99%, or 100%) sequence identity to SEQ ID NO: 88.
In some embodiments, amino acid sequence variants of the DKC1 proteins provided herein are contemplated. For example, it may be desirable to improve the stability and/or other biological properties of DKC1 (e.g., of the catalytic domain of DKC1) or of its interaction with other proteins in a ribonucleoprotein complex. Structures of DKC1 and other proteins in the ribonucleoprotein complex have been described, for example in Rashid et al. (Molecular Cell (2006) 21(2): 249-260) and Czekay et al. (Front. Microbiol. (2021) 12:654370), the contents of which are herein incorporated by reference in their entirety. Amino acid sequence variants of a DKC1 protein may be prepared by introducing appropriate modifications into the nucleotide sequence encoding the DKC1 protein, or by peptide synthesis. Such modifications include, for example, deletions from, and/or insertions into and/or substitutions of residues within the amino acid sequences of the DKC1 protein. Any combination of deletion, insertion, and substitution can be made to arrive at the final construct, provided that the final construct possesses the desired characteristics.
In some embodiments, DKC1 protein variants having one or more amino acid substitutions are provided. Amino acid substitutions may be introduced into a DKC1 protein and the products screened for a desired activity.
Conservative substitutions are shown in Table A below.
Amino acids may be grouped into different classes according to common side-chain properties:
Non-conservative substitutions will entail exchanging a member of one of these classes for another class.
Also contemplated are fusion proteins comprising a fragment of a naturally occurring DKC1 protein or a functional variant thereof and a heterologous amino acid sequence, e.g., at the N-terminus, the C-terminus, or an internal location of the DKC1 fragment.
B. Nucleic Acid Constructs and Engineered gsnoRNA
In some aspects, provided herein are engineered gsnoRNAs based on H/ACA snoRNAs. In some embodiments, the gsnoRNA comprises a single guide sequence. In some embodiments, the gsnoRNA comprises two guide sequences. In some embodiments, the engineered gsnoRNA comprises more than two (e.g., 3, 4, 5, 6, or more) guide sequences. For example, H/ACA snoRNAs contain two hairpins followed by the H and ACA box motifs. In some embodiments, both hairpins of the engineered gsnoRNAs provided herein contain guide sequences that are capable of targeting the target pseudouridylation site. In other embodiments, only one hairpin of an engineered gsnoRNA contains a guide sequence capable of targeting the target pseudouridylation site. Exemplary engineered gsnoRNA sequences are provided in Tables 2 and 3 below.
In some aspects, gsnoRNAs disclosed herein are synthetic oligonucleotides, which can be synthesized according to methods known in the art. In some embodiments, gsnoRNAs according to the present disclosure are oligoribonucleotides (full RNA). However, in some embodiments, gsnoRNAs of the present disclosure may comprise DNA. In some embodiments, especially when exclusively consisting of nucleotides or linkages that can be expressed in a biological system, gsnoRNAs may be expressed in situ, e.g. from a plasmid or a viral vector.
In some aspects, the gsnoRNA comprises a scaffold sequence derived from a wildtype H/ACA-snoRNA selected from the group consisting of ACA2b and ACA36. In some aspects, the editing efficiency of a gsnoRNA derived from a wildtype H/ACA scaffold is at least 5% (e.g., between or between about 5%-15% or 5-10%) in mammalian cells (e.g., in human cells such as HEK293T cells). In some embodiments, the gsnoRNA comprises a scaffold sequence derived from ACA2b. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from ACA36. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from ACA19.
In some aspects, disclosed herein are engineered gsnoRNA and engineered gsnoRNA scaffolds derived from wildtype H/ACA-snoRNA (e.g., from ACA2b, ACA36, or ACA19), wherein the gsnoRNA are capable of modifying a PTC in an RNA encoding a protein, wherein said modification results in expression of the full-length protein. In some embodiments, the engineered gsnoRNA is capable of causing expression of the full-length protein in the host cell of at least 4% (e.g., at least 5%, at least 6%, at least 7%, at least 8%, at least 9% or at least 10%) of the expression level of the full-length protein without a premature termination codon. In some embodiments, the engineered gsnoRNA is capable of causing expression of the full-length protein, wherein the expression of the protein is detectable without enrichment (e.g., without enrichment by immunoprecipitation). In some embodiments, the protein is detected via a tag (e.g., via a fluorescent tag). In some embodiments, the protein is detected by immunostaining according to methods known in the art. In some embodiments, the engineered gsnoRNA is capable of causing expression of the full-length protein in at least 20% of host cells (e.g., at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, or at least 50% of host cells).
In some embodiments, the gsnoRNA comprises one or more guide sequences each located in a region corresponding to a hairpin structure of the wildtype H/ACA-snoRNA. In some embodiments, the gsnoRNA comprises one or more guide sequences located in a hairpin structure at the 3′ terminal part of the wildtype H/ACA-snoRNA. In some embodiments, the gsnoRNA comprises one or more guide sequences located in a hairpin structure at the 5′ terminal part of the wildtype H/ACA-snoRNA.
In some embodiments, the gsnoRNA comprises one or more mutations (e.g., substitution, insertion and/or deletion) in one or more hairpin structures (e.g., the 3′ and/or 5′ hairpin structures) of the wildtype ACA19. In some embodiments, the gsnoRNA comprises one or more mutations that alter the distance between the nucleotide residue in the guide region that hybridizes to the target uridine and the H/ACA box compared to a wildtype scaffold. In some embodiments, the one or more mutations comprise insertion or deletion of one or more nucleotide residues. In some embodiments, the engineered gsnoRNA comprises 14 nucleotides between the nucleotide residue in the guide region that hybridizes to the target uridine and the H/ACA box. In some embodiments, the engineered gsnoRNA comprises 15 nucleotides between the nucleotide residue in the guide region that hybridizes to the target uridine and the H/ACA box. In some embodiments, said mutations increase the efficiency of pseudouridylation (e.g., the efficiency of PTC-readthrough) by at least 1.2-, 1.3-, 1.4-, 1.5-, or 1.6-fold compared to the wildtype scaffold.
In some embodiments, the one or more mutations comprise substitutions in a small polyU sequence (e.g., a sequence of 4 or more, or 5 or more consecutive uridine (U) residues). In some embodiments, the one or more mutations comprise altering a small polyU sequence so that it comprises no more than two consecutive U residues. In some embodiments, the one or more mutations comprise a single base mutation in a “UUUU” sequence. In some embodiments, the mutation is a “UUCU” mutation or a “UGUU” mutation. In some embodiments, the mutated polyU sequence is located in a loop region of the gsnoRNA scaffold. In some embodiments, the engineered gsnoRNA comprises the sequence of SEQ ID NO. 49 or 50. In some embodiments, the engineered gsnoRNA comprises the sequence of SEQ ID NO. 15 or 16. In some embodiments, said mutations increase the efficiency of pseudouridylation (e.g., the efficiency of PTC-readthrough) by at least 1.2-, 1.3-, 1.4-, 1.5-, or 1.6-fold compared to the wildtype scaffold.
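The polyU substitutions described above can be illustrated with a minimal Python sketch. The toy sequence and the helper names (`find_poly_u_runs`, `substitute`, `max_u_run`) are hypothetical and are not taken from the Sequence Listing; the sketch only shows how a run of four or more consecutive U residues might be located and replaced by "UUCU" or "UGUU" so that no more than two consecutive U residues remain.

```python
import re

def find_poly_u_runs(seq, min_len=4):
    """Return (start, end) 1-based inclusive positions of runs of >= min_len U."""
    seq = seq.upper().replace("T", "U")
    pattern = r"U{%d,}" % min_len
    return [(m.start() + 1, m.end()) for m in re.finditer(pattern, seq)]

def substitute(seq, start, replacement):
    """Replace len(replacement) residues beginning at 1-based position start."""
    i = start - 1
    return seq[:i] + replacement + seq[i + len(replacement):]

def max_u_run(seq):
    """Length of the longest stretch of consecutive U residues."""
    runs = re.findall(r"U+", seq.upper())
    return max((len(r) for r in runs), default=0)

# Hypothetical toy fragment containing a "UUUU" stretch (not a real scaffold).
toy = "GCAGGAUUUUCGAC"
runs = find_poly_u_runs(toy)
start = runs[0][0]
variant_uucu = substitute(toy, start, "UUCU")
variant_uguu = substitute(toy, start, "UGUU")
print(runs, variant_uucu, variant_uguu)
```

Whether a given substitution leaves at most two consecutive U residues also depends on the flanking nucleotides, which is why `max_u_run` is evaluated on the whole sequence rather than on the replaced window alone.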
In some embodiments, the one or more mutations comprise mutations that increase the openness of a guide region compared to the guide region of a wildtype scaffold. In some embodiments, the one or more mutations reduce the base-pairing probability of one or more residues within a guide region of the gsnoRNA scaffold (e.g., the 5′ guide region of the gACA19 scaffold). In some embodiments, the one or more mutations comprise insertion of one or more nucleotides. In some embodiments, the one or more mutations comprise the addition of CU after residue 8, wherein the numbering is according to SEQ ID NO: 37. In some embodiments, the engineered gsnoRNA is the gsnoRNA of SEQ ID NO: 53. The predicted secondary structure of gACA19-5addCU (SEQ ID NO: 53) is shown in
In some embodiments, the one or more mutations are selected from the group consisting of substitution of residues 26-29 with UUCU, substitution of residues 26-29 with UGUU, addition of G to the 3′ hairpin structure after residue 115, and addition of a dinucleotide sequence (XX, e.g., CU) to the 5′ hairpin after residue 8, wherein X is a nucleotide selected from A, U, C, and G, and wherein the numbering is according to SEQ ID NO: 37.
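The position-numbered mutations listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the 120-nucleotide toy sequence stands in for the reference scaffold (the actual SEQ ID NO: 37 is not reproduced here), and the `apply_edits` helper and its edit tuples are hypothetical. Edits are expressed in the 1-based numbering of the unmodified reference and applied from the highest position to the lowest, so that an earlier insertion does not shift the numbering of a later edit.

```python
def apply_edits(seq, edits):
    """Apply non-overlapping edits given in 1-based reference numbering.

    Each edit is either ("sub", start, end, new) to replace residues
    start..end inclusive, or ("ins", after, new) to insert new after
    residue `after`.
    """
    out = seq
    # Work from the 3' side toward the 5' side to keep reference numbering valid.
    for edit in sorted(edits, key=lambda e: e[1], reverse=True):
        if edit[0] == "sub":
            _, start, end, new = edit
            out = out[:start - 1] + new + out[end:]
        elif edit[0] == "ins":
            _, after, new = edit
            out = out[:after] + new + out[after:]
        else:
            raise ValueError("unknown edit type: %r" % (edit[0],))
    return out

# Hypothetical 120-nt placeholder, NOT the sequence of SEQ ID NO: 37.
reference = "ACGU" * 30

edited = apply_edits(reference, [
    ("sub", 26, 29, "UUCU"),   # substitution of residues 26-29
    ("ins", 115, "G"),         # addition of G after residue 115
    ("ins", 8, "CU"),          # addition of a dinucleotide after residue 8
])
print(len(reference), len(edited))
```

Any subset of the edits can be passed to the same helper, which mirrors the "one or more mutations" wording of the embodiments.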
In some embodiments, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 17-19 and 22-29.
In some embodiments, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 4-6, 9-12, 15-19, 22-36, and 177-179.
In some embodiments, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 15-19.
In some embodiments, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 20-21 and 145-150.
In some embodiments, the gsnoRNA is a disease-targeting gsnoRNA (e.g., any of the gsnoRNA sequences provided in Table 4).
In some embodiments, the gsnoRNA comprises one or more chemically modified nucleosides and/or inter-nucleosidic linkages. In some embodiments, the gsnoRNA comprises one or more nucleosides having 2′ O-methyl (2′-OMe) or 2′-O-methoxyethyl (2′-MOE) modifications. In some embodiments, a gsnoRNA according to the present disclosure may be chemically modified almost in its entirety, for example by providing nucleotides with a 2′-O-methylated sugar moiety (2′-OMe) and/or with a 2′-O-methoxyethyl sugar moiety (2′-MOE). In some embodiments, the gsnoRNA comprises no more than 20, no more than 15, no more than 10, no more than 8, no more than 6, or no more than 4 2′-OMe or 2′-MOE modifications. In some embodiments, the gsnoRNA comprises between about 2 and about 6 2′-OMe or 2′-MOE modifications. In some embodiments, the gsnoRNA comprises about 4 2′-OMe or 2′-MOE modifications. In some embodiments, the gsnoRNA comprises no more than 5 modified sugars. In some embodiments, the gsnoRNA comprises two nucleosides comprising modified sugar moieties (e.g., 2′-OMe) at the 5′ end and two nucleosides comprising modified sugar moieties (e.g., 2′-OMe) at the 3′ end of the gsnoRNA. In some embodiments, the gsnoRNA comprises no more than four, three, or two nucleosides comprising modified sugar moieties (e.g., 2′-OMe) at the 5′ end and no more than four, three, or two nucleosides comprising modified sugar moieties (e.g., 2′-OMe) at the 3′ end of the gsnoRNA. In some embodiments, the gsnoRNA comprises one or more phosphorothioate inter-nucleosidic linkages. In some embodiments, the gsnoRNA comprises no more than 20, no more than 15, no more than 10, no more than 8, or no more than 6 phosphorothioate linkages. In some embodiments, the gsnoRNA comprises between about 2 and about 10 phosphorothioate linkages. In some embodiments, the gsnoRNA comprises about 6 phosphorothioate linkages. 
In some embodiments, the gsnoRNA comprises about three phosphorothioate linkages at the 5′ end and about three phosphorothioate linkages at the 3′ end of the gsnoRNA. In some embodiments, the gsnoRNA comprises no more than five, four, or three phosphorothioate linkages at the 5′ end and no more than five, four, or three phosphorothioate linkages at the 3′ end of the gsnoRNA. Example 7 provides results demonstrating that a limited number of modifications is sufficient for stability and function of gsnoRNA oligonucleotides.
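As a rough illustration of the terminal modification pattern described above (two 2′-OMe nucleosides and about three phosphorothioate linkages at each end), the following Python sketch annotates an oligonucleotide string. The notation is an assumption made for this example only: `m` marks a 2′-OMe nucleoside, `r` an unmodified ribonucleoside, and `*` a phosphorothioate linkage. The function name `annotate` and the toy sequence are hypothetical.

```python
def annotate(seq, ome_per_end=2, ps_per_end=3):
    """Return an annotated oligo string with terminal 2'-OMe and PS marks."""
    n = len(seq)
    if n < 2 * max(ome_per_end, ps_per_end + 1):
        raise ValueError("sequence too short for non-overlapping terminal modifications")
    parts = []
    for i, base in enumerate(seq):
        # Sugar: 2'-OMe on the first and last `ome_per_end` nucleosides.
        modified = i < ome_per_end or i >= n - ome_per_end
        parts.append(("m" if modified else "r") + base)
        if i < n - 1:
            # Linkage i joins nucleosides i and i+1; there are n-1 linkages.
            is_ps = i < ps_per_end or i >= (n - 1) - ps_per_end
            parts.append("*" if is_ps else "")
    return "".join(parts)

# Hypothetical 10-nt toy oligonucleotide.
annotated = annotate("ACGUACGUAC")
print(annotated)
```

With the default arguments the annotated string carries four 2′-OMe nucleosides and six phosphorothioate linkages in total, and three phosphorothioate linkages at one end connect the four terminal nucleotides.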
In some embodiments, the gsnoRNA comprises a 5′ hairpin, an H box (consensus sequence ANANNA), a 3′ hairpin, and an ACA box (consensus sequence ANA). In some embodiments, the gsnoRNA comprises a single hairpin and an H box (referred to herein as a gH5 or rH5 for 5′ half gsnoRNA encoding sequence or gsnoRNA oligonucleotide, respectively), and lacks an ACA box. In some embodiments, the gsnoRNA comprises a single hairpin and an ACA box (referred to herein as a gH3 or rH3 for 3′ half gsnoRNA encoding sequence or gsnoRNA oligonucleotide, respectively), and lacks an H box. In some embodiments, the gsnoRNA comprising a single hairpin is between 60 and 70 nucleotides in length. In some embodiments, the gsnoRNA comprising a single hairpin is about 65 nucleotides in length.
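The box consensus sequences given above (H box ANANNA, ACA box ANA) can be searched for with a short Python sketch. The helper names and the toy sequence are hypothetical. Because ANA is only three nucleotides long it matches at many positions in any RNA, so this sketch checks the ACA box only at a fixed offset from the 3′ end; the default offset of three nucleotides is an assumption based on general descriptions of H/ACA snoRNAs, not a value stated in this disclosure.

```python
import re

# Lookahead so that overlapping candidate H boxes are all reported.
H_BOX = re.compile(r"(?=(A[ACGU]A[ACGU]{2}A))")
ACA_BOX = re.compile(r"A[ACGU]A")

def find_h_boxes(seq):
    """Return (1-based start, motif) for every ANANNA match."""
    return [(m.start() + 1, m.group(1)) for m in H_BOX.finditer(seq)]

def has_terminal_aca_box(seq, tail=3):
    """True if an ANA triplet sits `tail` nucleotides before the 3' end (assumed offset)."""
    box = seq[-(tail + 3):-tail] if tail else seq[-3:]
    return ACA_BOX.fullmatch(box) is not None

# Hypothetical toy layout: stem / H box / stem / ACA box / 3-nt tail.
toy = "GGCC" + "AGAGUA" + "GGCC" + "ACA" + "UUU"
print(find_h_boxes(toy), has_terminal_aca_box(toy))
```

A motif hit alone does not establish that a sequence folds into a functional hairpin-box arrangement; the search is only a first screen.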
In some embodiments, the gsnoRNA is prepared by in vitro transcription. In some embodiments, the gsnoRNA prepared by in vitro transcription comprises the sequence of any one of SEQ ID NOs: 4-6, 9-12, 15-19, 22-36. In some embodiments, the gsnoRNA prepared by in vitro transcription comprises a 5′ cap modification or a 5′ hairpin (e.g., of a U6+U27 expression cassette). In some embodiments, the gsnoRNA prepared by in vitro transcription comprises a 5′ cap modification. In some embodiments, the 5′ cap modification is an m7G modification (e.g., a cap 0, cap 1, or cap 2 modification) or an m6Am modification. Suitable methods for adding a 5′ cap to an RNA oligonucleotide have been described, for example, in U.S. Pat. No. 10,494,399, the contents of which are herein incorporated by reference in their entirety. In some embodiments, the gsnoRNA further comprises a 3′ hairpin (e.g., the gsnoRNA comprises the sequence of any one of SEQ ID NOs: 4-6, 9-12, 15-19 and 22-36 and a 3′ hairpin). In some embodiments, the gsnoRNA comprises a 5′ cap modification and does not comprise a 3′ hairpin (e.g., as shown in
Various chemistries and modifications are known in the field of oligonucleotides that can be readily used in accordance with the disclosure. The regular internucleosidic linkages between the nucleotides may be altered by mono- or di-thioation of the phosphodiester bonds to yield phosphorothioate esters or phosphorodithioate esters, respectively. Other modifications of the internucleosidic linkages are possible, including amidation and peptide linkers. In a preferred aspect the gsnoRNAs of the present disclosure have one, two, three, four, five, six or more phosphorothioate linkages between the most terminal nucleotides of the gsnoRNA (hence, preferably at both the 5′ and 3′ end), which means that in the case of three phosphorothioate linkages, the ultimate four nucleotides are linked accordingly. It will be understood by the skilled person that the number of such linkages may vary on each end, depending on the target sequence, or based on other aspects, such as toxicity. However, in some embodiments of the disclosure, the gsnoRNA comprises one or more PS linkages between any positions within its terminal seven nucleotides.
The ribose sugar may be modified by substitution of the 2′-O moiety with a lower alkyl (C1-4, such as 2′-OMe), alkenyl (C2-4), alkynyl (C2-4), methoxyethyl (2′-methoxyethoxy; or 2′-O-methoxyethyl; or 2′-MOE), or other substituent. In some embodiments, substituents of the 2′ OH group are a methyl, methoxyethyl or 3,3′-dimethylallyl group. The latter is known for its property to inhibit nuclease sensitivity due to its bulkiness, while improving efficiency of hybridization. Alternatively, locked nucleic acid sequences (LNAs), comprising a 2′-4′ intramolecular bridge (usually a methylene bridge between the 2′ oxygen and 4′ carbon) linkage inside the ribose ring, may be applied. Purine nucleobases and/or pyrimidine nucleobases may be modified to alter their properties, for example by amination or deamination of the heterocyclic rings. Other modifications that may be present in the gsnoRNAs of the present disclosure are 2′-F modified sugars, BNA and cEt. The exact chemistries and formats may vary from oligonucleotide construct to oligonucleotide construct and from application to application, and may be worked out in accordance with the wishes and preferences of those of skill in the art.
Examples of chemical modifications in the gsnoRNAs of the present disclosure are modifications of the sugar moiety, including by cross-linking substituents within the sugar (ribose) moiety (e.g. as in LNA or locked nucleic acids, BNA, cEt and the like), by substitution of the 2′-O atom with alkyl (e.g. 2′-O-methyl), alkynyl (2′-O-alkynyl), alkenyl (2′-O-alkenyl), alkoxyalkyl (e.g. 2′-O-methoxyethyl, 2′-MOE) groups, having a length as specified above, and the like. In the context of the present disclosure, a sugar ‘modification’ also comprises 2′ deoxyribose (as in DNA). In addition, the phosphodiester group of the backbone may be modified by thioation, dithioation, amidation and the like to yield phosphorothioate, phosphorodithioate, phosphoramidate, etc., internucleosidic linkages. The internucleosidic linkages may be replaced in full or in part by peptidic linkages to yield peptidonucleic acid sequences and the like. Alternatively, or in addition, the nucleobases may be modified by (de)amination, to yield inosine or 2,6-diaminopurines and the like. A further modification may be methylation of the C5 in the cytidine moiety of the nucleotide, to reduce potential immunogenic properties known to be associated with CpG sequences.
In some embodiments, the gsnoRNA does not comprise one or more chemically modified nucleosides and/or inter-nucleosidic linkages. In some embodiments, the gsnoRNA does not comprise any non-natural inter-nucleosidic linkages.
Mammalian H/ACA snoRNAs are generally embedded (positioned) within pre-mRNA intronic regions of protein-coding genes. During transcription elongation, several proteins with a functional role in pseudouridylation, such as NOP10, dyskerin (DKC1) or NHP2, bind to the nascent H/ACA snoRNA sequences. Following splicing, the guide RNAs are processed through debranching and exonucleolytic processing, resulting in an RNA-protein complex called a ‘small nucleolar ribonucleoprotein’ (snoRNP, or snoRNP complex). Box H/ACA snoRNAs have no preference for localization relative to the 5′ or 3′ ends of the intron and can be present in small or very large introns, as opposed to box C/D snoRNAs, which are usually localized 60-90 nucleotides upstream of the 3′-splice site and are encoded in relatively small introns. It has been suggested by Kiss and Filipowicz (1995, Genes Dev 9(11):1411-1424) that a given snoRNA sequence could be excised and fully processed from an intronic region of any given actively spliced mRNA. To show the feasibility of this snoRNA processing independently from the host intron context, Kiss and Filipowicz artificially embedded several snoRNAs (U17a, U17b and U19) into the second intron of the human β-globin gene and expressed the resulting vector in fibroblast-like cells. After transfection, they found that the artificial, intronically delivered snoRNAs were properly processed from the human β-globin intron and the β-globin pre-mRNA was correctly spliced. Darzacq et al. (2002, EMBO J 21(11):2746-2756) corroborated that other guide RNAs could be inserted into the second intron of the human β-globin gene using an expression vector under the control of the cytomegalovirus (CMV) promoter and be delivered to mammalian cells via transfection.
The inventors of the present application unexpectedly identified divergent host intron context-dependent effects on the pseudouridylation editing efficiency of different gsnoRNAs (as discussed in Example 1). For example, the present inventors tested the PTC readthrough efficiency of gsnoRNAs based on wildtype ACA19 (embedded in the host intron of EIF3A), ACA-44 (embedded in the host intron of SNHG12), ACA27 (embedded in the host intron of RPL21), and E2 (embedded in the host intron of RPSA) host genes, and embedded in a non-host intron of the HBB gene (
In some aspects, provided herein is a nucleic acid construct encoding the gsnoRNA. In some embodiments of the methods described herein, the method comprises introducing a nucleic acid molecule encoding the gsnoRNA into the host cell. In some embodiments, the nucleic acid molecule encoding the gsnoRNA is under the control of a small RNA promoter. In some embodiments, the small RNA promoter is a U6 (transcribed by Polymerase III) or U1 (transcribed by Polymerase II) promoter. In some embodiments, the expression of the gsnoRNA from the small RNA promoter according to the methods disclosed herein provides an increased pseudouridylation efficiency (e.g., an increased PTC-read-through efficiency) compared to the same gsnoRNA embedded in a host intron sequence or other intron sequence. In some embodiments, the pseudouridylation efficiency of the gsnoRNA expressed from a nucleic acid under the control of the small RNA promoter is 1.2-, 1.3-, 1.4-, 1.5-, 1.6-, 1.7-, 1.8-, 1.9- or 2-fold higher compared to the same gsnoRNA embedded in a host intron. Example 1 (
In some embodiments, the nucleic acid molecule encoding the gsnoRNA is embedded in an intron sequence located between a first exon sequence and a second exon sequence. In some embodiments, the first exon sequence, the intron sequence and the second exon sequence are derived from a naturally-occurring gene. In some embodiments, the intron may comprise (besides the nucleic acid molecule of the present disclosure, comprising the guide region) additional nucleotides. Since the guide region is expressed from the intron sequence, such additional nucleotides may be selected to render the most efficient expression from the intron. In some embodiments, the exon A/intron/exon B sequence is present in a vector, such as a plasmid or a viral vector. Such a vector can be used to deliver the exon-intron-exon sequence to the cell. Additional introns and exons may be present in such a vector. In some embodiments, the exon A sequence (upstream of the intron that carries the nucleic acid encoding the gsnoRNA (which is expressed after transcription)) comprises or consists of exon 1 of the human β-globin gene, and the exon B sequence (downstream of the intron that carries the nucleic acid encoding the gsnoRNA (which is expressed after transcription)) comprises or consists of exon 2 of the human β-globin gene. In some embodiments, the exon A sequence (upstream of the intron that carries the nucleic acid encoding the gsnoRNA (which is expressed after transcription)) comprises or consists of exon 2 of the human Hemoglobin subunit β (HBB) gene, and the exon B sequence (downstream of the intron that carries the nucleic acid encoding the gsnoRNA (which is expressed after transcription)) comprises or consists of exon 3 of the human Hemoglobin subunit β (HBB) gene. 
In some embodiments, the nucleic acid molecule encoding the gsnoRNA is embedded in an intron sequence between a first exon sequence and a second exon sequence, wherein the intron sequence, first exon sequence, and second exon sequence correspond to the sequences of a naturally-occurring snoRNA-carrying host gene. In some embodiments, the construct comprising the intron-embedded gsnoRNA encoding sequence is under the control of a CMV promoter.
In some aspects, provided herein are engineered gsnoRNA targeting disease-associated PTCs. In some embodiments, the engineered gsnoRNA targeting disease-associated PTCs comprise one or more mutations to enhance the editing efficiency and/or expression of the gsnoRNAs. In some embodiments, the engineered gsnoRNA targeting disease-associated PTCs are selected from SEQ ID NOs: 71-84 (shown in
In some embodiments, the gsnoRNA may be administered in a free form (or ‘naked’, without the context of a vector), or delivered to a cell by other means, such as liposomes, or nanoparticles, or by using iontophoresis. In some embodiments, the gsnoRNA can be administered in a ribonucleoprotein complex (e.g., in a complex comprising DKC1, NHP2, NOP10, and/or GAR1). In some embodiments, the free gsnoRNA comprises one or more chemically modified nucleosides and/or inter-nucleosidic linkages as described above.
In some aspects, provided herein is a nucleic acid construct encoding DKC1 (e.g., any of the DKC1 proteins described in Section II A above). In some embodiments of the methods described herein, the method comprises introducing a nucleic acid molecule encoding the DKC1 protein into the host cell. In some embodiments, the nucleic acid molecule comprises a promoter operably linked to a nucleotide sequence encoding the DKC1. In some embodiments, the promoter is a Pol II promoter. In some embodiments, the promoter is a CMV promoter.
As disclosed herein, vectors may carry DNA or RNA, and are generally used to express the gsnoRNA and/or DKC1 protein constructs of the present disclosure after the vector is processed in the cell in which it is introduced. Such is generally through transcription of the DNA or RNA present in the vector. In some embodiments, vectors are viral vectors (that may be used to infect target cells to be treated), or plasmids, that may be introduced into the cell in a variety of ways, known to the person skilled in the art.
In some embodiments, the nucleic acid molecule encoding the DKC1 protein and/or the nucleic acid molecule encoding the gsnoRNA are present in a viral vector. In some embodiments, the method comprises introducing into the host cell a vector (e.g., a plasmid or viral vector) comprising a first nucleic acid sequence encoding the DKC1 protein and a second nucleic acid sequence encoding the gsnoRNA. In some embodiments, the vector is an adeno-associated viral (AAV) vector.
Exemplary engineered ACA scaffold sequences are shown in Table 2 below. The guide sequence is indicated as (Xn) and underlined, wherein Xn is a sequence of X nucleotides of length n, wherein X is any of A, U, G, or C and n is 4, 5, 6, 7, 8, 9, 10, 11, or 12. The guide sequence (Xn) can be modified to target the gsnoRNA to the desired target site, as will be understood by one of ordinary skill in the art. In some embodiments, n is an integer of a suitable length for the guide region. In some embodiments, n is 4, 5, 6, 7, 8, or 9.
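The substitution of a chosen guide sequence for the (Xn) placeholder can be sketched in Python as follows. The scaffold template, the guide sequences and the `fill_scaffold` helper are hypothetical and serve only to illustrate the bookkeeping; the length check uses the range of n from 4 to 12 stated above. Designing a guide that actually base-pairs with a given target is outside the scope of this sketch.

```python
import re

PLACEHOLDER = "(Xn)"
GUIDE_PATTERN = re.compile(r"[ACGU]{4,12}")

def fill_scaffold(template, guides):
    """Replace each (Xn) placeholder, in 5'-to-3' order, with one guide sequence."""
    n_slots = template.count(PLACEHOLDER)
    if n_slots != len(guides):
        raise ValueError("expected %d guide(s), got %d" % (n_slots, len(guides)))
    for guide in guides:
        if GUIDE_PATTERN.fullmatch(guide) is None:
            raise ValueError("guide must be 4-12 nt of A, C, G, U: %r" % guide)
    out = template
    for guide in guides:
        # Guides contain only A/C/G/U, so they can never recreate a placeholder.
        out = out.replace(PLACEHOLDER, guide, 1)
    return out

# Hypothetical template with two guide slots (not a scaffold from Table 2).
template = "GG(Xn)ACGU(Xn)CC"
filled = fill_scaffold(template, ["AUGCA", "CCGGUU"])
print(filled)
```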
Exemplary engineered gsnoRNA sequences, including exemplary guide sequences are shown in Table 3 below.
Exemplary engineered gsnoRNA sequences targeting exemplary disease-associated PTCs are shown in Table 4 below.
(Xn)GAGGAAACAAAU
(Xn)CCUUCAGACAAAA
GUGAUGUGCUAUACAAAUAAUUGAAGGCGAUCCCGCAGUAUAACUAU
GUGGCAUGCAAGAGCAACCUGGAAAGAAUCCCACAGCGCAGGUCAGU
GGUAGUGGGGCAAAGGAAAUAUCCUUUGAUCCCUCAGGCAAACUGGG
CACACCACAAGGGUCUCUGGCCCAAUGAGUGGAGUUUGAUCCGGAUU
AUGUGCUAUACAAAUAAUUGAAGGCGAUCCCGCAGUAUAACUAUAAA
CGGUGACCUCGAAGAGUAACUGCUGACUGAUCCCACUGGCUGUGGGC
CGGUGAGUGGCAGAGAUUAGAGAGGCUAUGUUGAUCCCCAAGCGUUC
GCAGGAGCCUAAAGAAUUGUCUUUCUAUGAUCCCUUGGCCAUUUCAU
GGUGAUUUGGGACAUUAAAAUGGGCUAAGGGAGGAUCCCGGGUAGAA
UUAAUGUGCUAUACAAAUAAUUGAAGGCGAUCCCGCAGUAUAACUAU
GUGAUGUGCUAUACAAAUAAUUGAAGGCGACCGGGCAGUAUAACUAU
GUGAUGUGCUAUACAAAUAAUUGAAGGCGAUCCCGCAGUAUAACUAU
GUGAUGUGCUAUACAAAUAAUUGAAGGCGAUCCCGCAGUAUAACUAU
GUGAUGUGCUAUACAAAUAAUUGAAGGCGAUCCCGCAGUAUAACUAU
GUGAUGUGCUAUACAAAUAAUUGAAGGCUGAUCCCGCAGUAUAACUA
CGGUGAUGUGCUAUACAAAUAAUUGAAGGCGAUCCCGCAGUAUAACU
AUUGAUUUGGGACAUUAAAAUGGGCUAAGGGAGGAUCCCGGGUAGAA
GGUGAUUUGGGACAUUAAAAUGGGCUAAGGGAGGUAAGAGGGUAGAA
GGUGAUUUGGGACAUUAAAAUGGGCUAAGGGAGGAUCCCGGGUAGAA
GGUGAUUUGGGACAUUAAAAUGGGCUAAGGGAGGAUCCCGGGUAGAA
GGUGACUUGGGACAUUAAAAUGGGCUAAGGGAGGAUCCCGGGUAGAA
GGUGAUUUGGGACAUUAAAAUGGGCUAAGGGAUGAUCCCGGGUAGAA
GGUGAUUUGGGACAUUAAAAUGGGCUGGGAUGAUCCCGGGUAGAAAG
GGUGAUUUGGGACAUUAAAAUGGGCUAGGGAUGAUCCCGGGUAGAAA
GGUGAUUUGGGACAUUAAAAUGGGCUAAGGGAGGAUCCCGGGUAGAA
GGUGAUUUGGGACAUUAAAAUGGGCUGGGAUGAUCCCGGGUAGAAAG
GGUGAUUUGGGACAUUAAAAUGGCUAAGGGAGGAUCCCGGGUAGAAA
GGUGAUUUGGGACAUUAAAAUGGGCUAAGGGAGGAUCCCGGGUAGAA
GGUGAUUUGGGACAUUAAAAUGGGCUAAGGGAUGAUCCCGGGUAGAA
GUGAUGUGCUAUACAAAUAAUUGAAGGCGAUCCCGCAGUAUAACUAU
CGGUGAUGUGCUAUACAAAUAAUUGAAGGCGAUCCCGCAGUAUAACU
CGGUGAUGUGCUAUACAAAUAAUUGAAGGCGAUCCCGCAGUAUAACU
CGGUGAUGUGCUAUACAAAUAAUUGAAGGCGAUCCCGCAGUAUAACU
UCAGGAGCCUAAAGAAUUGUCUUUCUAUGAUCCCUUGGCCAUUUCAU
GCAGGAGCCUAAAGAAUUGUCUUUCUAUGUAGGAUUGGCCAUUUCAU
GGCAGGAGCCUAAAGAAUUGUCUUUCUAUGAUCCCUUGGCCAUUUCA
GCAGGAGCCUAAAGAAUUGUCUUUCUAUGAUCCCUUGGCCAUUUCAU
GUGGGAGCCUAAAGAAUUGUCUUUCUAUGAUCCCUUGGCCAUUUCAU
GGUGGGAGCCUAAAGAAUUGUCUUUCUAUGAUCCCUUGGCCAUUUCA
GUGGGAGCCUAAAGAAUUGUCUUUCUAUGAUCCCUUGGCCAUUUCAU
GCAGGAGCCUAAAGAAUUGUCUUUCUUGAUCCCUUGGCCAUUUCAUA
GCAGGAGCCUAAAGAAUUGUCGUUCUUGAUCCCUUGGCCAUUUCAUA
GCAGGAGCCUAAAGAAUUGUCUUUCUAUGAUCCCGUGGCCAUUUCAU
CGGUGAUGUGCUAUACAAAUAAUUGAAGGCGAUCCUGCAGUAUAACU
CGGUGAUGUGCUAUACAAAUAAUUGAAGGCGAUCCUGCAGUAUAACU
GGUCCUUCAGACAAAAUCUAGAGCGGACUUCGGUCCGCUUUU
CGGUGAUGUGCUAUACAA
CCGGUCCUUCAGACAAAA
UcccaAUGUGCUAUACAAAUAAUUGAAGGCgcacgUGCAGUAUAACU
ccUaaUUUGGGACAUUAAAAUCGGCUGGUGagcacgUGGACUAAGAA
gagUaAUGUGCUAUACAAAUAAUUGAAGGCgagUUUGCAGUAUAACU
agUagUUUGGGACAUUAAAAUCGGCUGGUGagagUUUGGACUAAGAA
UaUacAUGUGCUAUACAAAUAAUUGAAGGagUUcUUGCAGUAUAACU
acacAUUUGGGACAUUAAAAUGGGCUAAGGGUagUUcUUGGGUAGAA
cgggGAUGUGCUAUACAAAUAAUUGAAGGgaUUcUUGCAGUAUAACU
gggacUUUGGGACAUUAAAAUCGGCUGGUGgaUUcUUGGACUAAGAA
UgUcCAUGUGCUAUACAAAUAAUUGAAGGUagUgUUGCAGUAUAACU
gUcgAAUUGGGACAUUAAAAUCGGCUGGUgUagUgUUGGACUAAGAA
UagUgaUGUGCUAUACAAAUAAUUGAAGUUcggccUGCAGUAUAACU
agUUCUUUGGAACAUUAAAAUCGGCUGGAAUcggccUGGACUAAGAA
UaUUGAUGUGCUAUACAAAUAAUUGAAGGUggUUgUGCAGUAUAACU
aUUUcUUUGGGACAUUAAAAUCGGCUGGUcUggUUgUGGACUAAGAA
UCUCGAUGUGCUAUACAAAUAAUUGAAGGCAUUAGUGGAGUAUAACU
CUCAUCCUGGGACAUUAAAAUCGGCUGGUCCAUUGGUUGACUAAGAA
CUGAGAUGUGCUAUACAAAUAAUUGAAGCGAGUAGCGCAGUAUAACU
UGAAAUUUGGGACAUUAAAAUCGGCUGGUUGAGUAGCGGACUAAGAA
AAAUGAUGUGCUAUACAAAUAAUUGAAGCUAGAUAUGCAGUAUAACU
AAGGAUUUGGGACAUUAAAAUCGGCUGGUGUAGAUAUGGACUAAGAA
CUGCGAUGUGCUAUACAAAUAAUUGAAGCUAUCCUUGCAGUAUAACU
UGCAGUUUGGGACAUUAAAAUCGGCUGGUGUAUCCUUGGACUAAGAA
GGUUGAUGUGCUAUACAAAUAAUUGAAGGCUUGUGCGCAGUAUAACU
GUAACUCUGGGACAUUAAAAUCGGCUGGUGUUUGUGCGGACUAAGAA
GGAGAUAUGUGCUAUACAAAUAAUUGAAGGUUUACAUGCAGUAUAAC
GGCGAUUUGGGACAUUAAAAUCGGCUGGUGUUUACAUGGACUAAGAA
In one aspect, the present inventors discovered that the editing efficiency of a gsnoRNA was surprisingly higher when the gsnoRNA was encoded in tandem with its target RNA. Example 3 provides results demonstrating the increased editing efficiency using a reporter construct encoding a gsnoRNA and a target RNA in tandem.
Thus, in some aspects, provided herein is a nucleic acid molecule comprising a nucleotide sequence encoding the guide small nucleolar RNA (gsnoRNA) in tandem with a nucleotide sequence encoding the target RNA. In some embodiments, the nucleotide sequence encoding the gsnoRNA is driven by a U6 or U1 promoter. In some embodiments, the nucleotide sequence encoding the target RNA is driven by the same or a different promoter. In some embodiments, the gsnoRNA encoded in tandem with a nucleotide sequence encoding the target RNA provides at least 1.1-, 1.2-, 1.3-, 1.4-, 1.5-, 1.6-, 1.7-, 1.8-, 1.9-, or 2-fold greater editing efficiency of the target RNA compared to the same gsnoRNA encoded in a separate nucleic acid molecule from the target RNA.
In some embodiments, provided herein is a method for editing a target RNA in a host cell, comprising introducing an engineered guide small nucleolar RNA (gsnoRNA) and a nucleic acid molecule encoding a DKC1 protein into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, and wherein the gsnoRNA recruits the DKC1 protein to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the DKC1 protein is a DKC1 isoform (e.g., isoform 3) with cytoplasmic localization. In some embodiments, the DKC1 protein is part of a ribonucleoprotein (RNP) complex that associates with the gsnoRNA. In some embodiments, the DKC1 protein comprises a deletion of a nuclear localization signal (NLS) relative to a wildtype DKC1 protein of the same species. In some embodiments, the DKC1 protein is a truncated DKC1 variant or a DKC1 variant comprising a deletion, such as any of the truncation or deletion variants described in Section II A above. In some embodiments, the gsnoRNA recruits NOP10, GAR1, and NHP2 in the host cell.
In some embodiments, the method comprises introducing a nucleic acid (e.g., a nucleic acid vector) encoding the gsnoRNA into the cell. In other embodiments, the method comprises introducing a gsnoRNA oligonucleotide into the cell. In some embodiments, the gsnoRNA comprises a first hairpin and H box and a second hairpin and ACA box. In some embodiments, the gsnoRNA is prepared by in vitro transcription. In some embodiments, the gsnoRNA prepared by in vitro transcription comprises the sequence of any one of SEQ ID NOs: 4-6, 9-12, 15-19 and 22-36. In some embodiments, the gsnoRNA prepared by in vitro transcription comprises a 5′ cap modification or a 5′ hairpin (e.g., of a U6+U27 expression cassette). In some embodiments, the gsnoRNA prepared by in vitro transcription comprises a 5′ cap modification. In some embodiments, the 5′ cap modification is an m7G modification (e.g., a cap 0, cap 1, or cap 2 modification) or an m6Am modification. Suitable methods for adding a 5′ cap to an RNA oligonucleotide have been described, for example, in U.S. Pat. No. 10,494,399, the contents of which are herein incorporated by reference in their entirety. In some embodiments, the gsnoRNA further comprises a 3′ hairpin (e.g., the gsnoRNA comprises the sequence of any one of SEQ ID NOs: 4-6, 9-12, 15-19 and 22-36 and a 3′ hairpin). In some embodiments, the gsnoRNA comprises a 5′ cap modification and does not comprise a 3′ hairpin (e.g., as shown in
In some embodiments, the method comprises introducing a nucleic acid (e.g., a nucleic acid vector) encoding a gsnoRNA that is a half gsnoRNA (e.g., comprising a single hairpin and an H box, or comprising a single hairpin and an ACA box) into a cell. In other embodiments, the method comprises introducing a gsnoRNA that is a half gsnoRNA (e.g., comprising a single hairpin and an H box, or comprising a single hairpin and an ACA box) into a cell. In some embodiments, the gsnoRNA comprises no more than 20, no more than 15, no more than 10, no more than 8, no more than 6, or no more than 4 2′-OMe or 2′-MOE modifications. In some embodiments, the gsnoRNA comprises between about 2 and about 6 2′-OMe or 2′-MOE modifications. In some embodiments, the gsnoRNA comprises about 4 2′-OMe or 2′-MOE modifications. In some embodiments, the gsnoRNA comprises no more than 5 modified sugars. In some embodiments, the gsnoRNA comprises two nucleosides comprising modified sugar moieties (e.g., 2′-OMe) at the 5′ end and two nucleosides comprising modified sugar moieties (e.g., 2′-OMe) at the 3′ end of the gsnoRNA. In some embodiments, the gsnoRNA comprises no more than four, three, or two nucleosides comprising modified sugar moieties (e.g., 2′-OMe) at the 5′ end and no more than four, three, or two nucleosides comprising modified sugar moieties (e.g., 2′-OMe) at the 3′ end of the gsnoRNA. In some embodiments, the gsnoRNA comprises one or more phosphorothioate inter-nucleosidic linkages. In some embodiments, the gsnoRNA comprises no more than 20, no more than 15, no more than 10, no more than 8, or no more than 6 phosphorothioate linkages. In some embodiments, the gsnoRNA comprises between about 2 and about 10 phosphorothioate linkages. In some embodiments, the gsnoRNA comprises about 6 phosphorothioate linkages. In some embodiments, the gsnoRNA comprises about three phosphorothioate linkages at the 5′ end and about three phosphorothioate linkages at the 3′ end of the gsnoRNA. 
In some embodiments, the gsnoRNA comprises no more than five, four, or three phosphorothioate linkages at the 5′ end and no more than five, four, or three phosphorothioate linkages at the 3′ end of the gsnoRNA. In some embodiments, the gsnoRNA comprises a 5′ hairpin, an H box (consensus sequence ANANNA), a 3′ hairpin, and an ACA box (consensus sequence ANA). In some embodiments, the gsnoRNA comprises a single hairpin and an H box (referred to herein as a gH5 or H5 for 5′ half gsnoRNA encoding sequence or gsnoRNA oligonucleotide, respectively), and lacks an ACA box. In some embodiments, the gsnoRNA comprises a single hairpin and an ACA box (referred to herein as a gH3 or rH3 for 3′ half gsnoRNA encoding sequence or gsnoRNA oligonucleotide, respectively), and lacks an H box. In some embodiments, the gsnoRNA comprising a single hairpin is between 60 and 70 nucleotides in length. In some embodiments, the gsnoRNA comprising a single hairpin is about 65 nucleotides in length.
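By way of illustration only, the consensus motifs recited above (H box ANANNA; ACA box ANA) can be located in a candidate gsnoRNA sequence with a simple pattern search. The following is a minimal sketch and not part of the disclosed methods; the function names are hypothetical, and the default placement of the ACA box three nucleotides upstream of the 3′ end is an assumption based on the typical arrangement of natural box H/ACA RNAs, not a requirement stated herein.

```python
import re

# Consensus motifs as recited in the text; N stands for any nucleotide.
# The lookahead lets overlapping H-box candidates be reported.
H_BOX = re.compile(r"(?=(A[ACGU]A[ACGU]{2}A))")
ACA_BOX = re.compile(r"A[ACGU]A")


def normalize(seq):
    """Upper-case the sequence and convert a DNA-style T to U."""
    return seq.upper().replace("T", "U")


def find_h_boxes(seq):
    """Return 0-based start positions of every ANANNA match."""
    return [m.start() for m in H_BOX.finditer(normalize(seq))]


def has_3prime_aca_box(seq, tail=3):
    """True if an ANA motif ends `tail` nt before the 3' end.

    tail=3 is an assumption (typical of natural box H/ACA RNAs).
    """
    rna = normalize(seq)
    end = len(rna) - tail
    if end < 3:
        return False
    return ACA_BOX.fullmatch(rna[end - 3:end]) is not None
```

Because ANA is a permissive motif, the sketch checks it only at a fixed position rather than scanning the whole sequence; a half gsnoRNA lacking an ACA box would be expected to fail that positional check.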
The present disclosure is exemplified by, but not limited to, reversing the effect of nonsense stop mutations that usually lead to translation termination and mRNA degradation (via Nonsense-Mediated Decay, see below). In another aspect, targeted pseudouridylation can be used to recode uridine-containing codons as a means to modulate protein function via amino acid substitution, for instance in crucial protein regions such as protein kinase active centers.
One of the consequences of mutations leading to PTCs in the coding sequence of a gene is a decrease in mRNA levels. This is due to a mechanism known as Nonsense-Mediated Decay (NMD), which is a cellular surveillance mechanism that degrades aberrant mRNA transcripts, preventing transcripts that were not correctly processed from being translated. It is estimated that one-third of genetic disorders are a result of a mutation leading to a PTC (such as for instance in CF, retinitis pigmentosa (RP), and beta-thalassemia). In a normal scenario, exon-junction complexes (EJCs) are formed during splicing. Then, during the first translation round, ribosomes displace these EJCs. On the other hand, when a PTC is located more than 50-54 nucleotides upstream of the last EJC, the NMD pathway is triggered by formation of a termination complex consisting of EJC-associated NMD factors. When this happens during the first pioneer round of translation and the ribosomes co-exist with at least one EJC downstream of their location, this triggers decapping and 5′-to-3′ exonuclease activity, as well as deadenylation of the tail and 3′-to-5′ exonuclease-mediated transcript decay. In order to tackle the aforementioned genetic disorders, or any disorder that is due to a similar mutation, inhibition of this pathway in a gene-specific and sequence-specific manner is therefore crucial.
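For illustration, the positional rule described above (a PTC located more than 50-54 nucleotides upstream of the last EJC) can be expressed as a simple distance check. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, positions are 0-based coordinates on the spliced mRNA, and the default threshold of 50 nucleotides is merely the lower bound of the range given in the text. It is a rule-of-thumb predictor, not a model of the NMD pathway.

```python
def ptc_predicted_to_trigger_nmd(ptc_pos, last_ejc_pos, threshold=50):
    """Apply the positional NMD rule described in the text.

    ptc_pos      -- position of the first nucleotide of the PTC (hypothetical
                    0-based coordinate on the spliced mRNA)
    last_ejc_pos -- position of the last EJC, or None if the transcript
                    carries no EJC
    threshold    -- distance in nucleotides; the text gives a range of
                    50-54, and 50 is used here as an assumed default
    """
    if last_ejc_pos is None:
        # No EJC remains downstream, so the rule cannot be satisfied.
        return False
    distance_upstream = last_ejc_pos - ptc_pos
    return distance_upstream > threshold
```

For example, a PTC at position 100 with the last EJC at position 200 lies 100 nucleotides upstream and would be predicted to trigger NMD, whereas a PTC at position 180 lies only 20 nucleotides upstream and would not.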
In some aspects, provided herein are methods for recoding a PTC, which results in an increase of mRNA levels, and in translational read-through of the recoded mRNA into a full-length protein. In some embodiments, the methods and compositions provided herein allow for PTC read-through of more than 4%, more than 5%, more than 10%, more than 12%, more than 15%, more than 20%, or more than 30%. In some embodiments, the methods and compositions herein allow for suppression of nonsense-mediated decay (NMD) by more than 10%, more than 12%, more than 15%, more than 20%, or more than 30%. PTC read-through can be assayed by evaluating protein levels, either by directly quantifying the protein expression or by assaying an activity of the expressed protein. Methods for assessing NMD suppression are also known in the art. For example, to assess NMD suppression, a known NMD-inhibition reporter assay (Zhang et al. 1998, RNA 4(7):801-815) can be used, and translational read-through of a gene carrying a PTC can also be assessed. As exemplified herein, fluorescent reporter genes carrying nonsense mutations were used as the target sequence. Without correction, this nonsense mutation leads to a lower abundance of mRNA (as a result of NMD) as well as to a truncated protein, resulting in the absence of fluorescent signal. As shown herein, correction of the mutation via targeted pseudouridylation allows the full-length protein to be translated from the mRNA. The skilled person understands that the PTC region of the fluorescent reporter constructs described herein can be exchanged for any other model or therapeutically relevant target RNA of interest.
In some embodiments, provided herein are methods for recoding a PTC in an RNA encoding a protein, wherein the method results in expression of the full-length protein in the host cell of at least 4% (e.g., at least 5%, at least 6%, at least 7%, at least 8%, at least 9% or at least 10%) of the expression level of the full-length protein without a premature termination codon. In some embodiments, the method results in expression of the full-length protein, wherein the expression of the protein is detectable without enrichment (e.g., without enrichment by immunoprecipitation). In some embodiments, the protein is detected via a tag (e.g., via a fluorescent tag). In some embodiments, the protein is detected by immunostaining according to methods known in the art. In some embodiments, the method results in expression of the full-length protein in at least 20% of host cells (e.g., at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, or at least 50% of host cells).
In some aspects, provided herein are methods for treating, preventing, and/or blocking nonsense-mediated RNA decay of a target mRNA, the methods comprising introducing a guide small nucleolar RNA (gsnoRNA) and a nucleic acid molecule encoding a DKC1 protein into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a premature termination codon (PTC) sequence comprising a target uridine residue in the target mRNA, and wherein the gsnoRNA recruits the DKC1 protein to modify the target uridine residue into a pseudouridine residue in the target RNA, whereby the pseudouridylation of the target uridine promotes read-through of the PTC. In some embodiments, the DKC1 protein is a DKC1 isoform with cytoplasmic localization. In some embodiments, the DKC1 protein is part of a ribonucleoprotein (RNP) complex that associates with the gsnoRNA. In some embodiments, the DKC1 protein comprises a deletion of a nuclear localization signal (NLS) relative to a wildtype DKC1 protein of the same species. In some embodiments, the DKC1 protein is a truncated DKC1 variant or a DKC1 variant comprising a deletion, such as any of the truncation or deletion variants described in Section II A above.
In some embodiments, the DKC1 protein is an endogenous protein of the host cell. In some embodiments, the DKC1 protein is an endogenous, naturally expressed DKC1 isoform of the host cell, wherein the DKC1 isoform has cytoplasmic localization in the host cell. In some embodiments, the DKC1 protein corresponds to isoform 2 of a human DKC1 protein.
In some embodiments, the DKC1 and snoRNA can be delivered into the cell together (e.g., as part of a ribonucleoprotein (RNP) complex). In some embodiments, the snoRNP comprises the gsnoRNA and DKC1, NHP2, GAR1, and/or NOP10.
In some embodiments, provided herein is a method for editing a target RNA in a host cell, comprising introducing an engineered gsnoRNA into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA (e.g., mRNA), wherein the host cell expresses a DKC1 isoform with cytoplasmic localization, and wherein the gsnoRNA recruits the DKC1 isoform to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the method comprises introducing a splice-switching antisense oligonucleotide (ASO) into the host cell, wherein the ASO enhances expression of a DKC1 protein that is an endogenous DKC1 isoform with cytoplasmic localization in the host cell.
Splice-switching antisense oligonucleotides (ASOs) alter splicing by directing splice site selection. Splice-switching ASOs can modulate pre-mRNA splicing by binding to target pre-mRNAs and blocking access of the splicing machinery to a particular splice site, and can be used to produce novel splice variants, correct aberrant splicing or manipulate alternative splicing. Methods for the design and delivery of splice-switching antisense oligonucleotides to cells have been described, for example in U.S. Patent Publications US20180334677 and US20120040917, U.S. Pat. No. 10,190,117, and Disterer et al. Hum Gene Ther. 2014 Jul; 25(7):587-98, the contents of which are herein incorporated by reference in their entirety.
In some embodiments, the splice-switching ASO binds to a pre-mRNA of a DKC1 gene and directs splicing of DKC1 isoform 3. In some embodiments, introducing the splice-switching ASO increases expression of a DKC1 protein that is an endogenous DKC1 isoform with cytoplasmic localization in the host cell by at least 1.1-, 1.2-, 1.3-, 1.4-, 1.5-, 1.6-, 1.7-, 1.8-, 1.9-, 2-, 3-, 4-, 5-, or 10-fold compared to the expression of the same isoform in the host cell in the absence of the ASO. In some embodiments, administering the splice-switching ASO increases expression of DKC1 isoform 3 by at least 1.1-, 1.2-, 1.3-, 1.4-, 1.5-, 1.6-, 1.7-, 1.8-, 1.9-, 2-, 3-, 4-, 5-, or 10-fold compared to the expression of DKC1 isoform 3 in the host cell in the absence of the ASO.
In some embodiments, the splice-switching ASOs may be delivered via aptamers, Inverse Molecular Sentinel nanoprobes, ASO-encapsulated liposome-DNA-polycation or ASO-encapsulated liposome-protamine-hyaluronic acid nanoparticles and the like. Suitable methods of delivering aptamers can be found in Kotula, J. W., et al., Aptamer-mediated delivery of splice-switching oligonucleotides to the nuclei of cancer cells. Nucleic Acid Ther, 2012. 22(3): p. 187-95, the contents of which are incorporated by reference in their entirety.
In some embodiments, provided herein is a method for editing a target RNA in a host cell, comprising introducing an engineered gsnoRNA into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, wherein the gsnoRNA comprises a scaffold sequence derived from wildtype ACA19, ACA44, ACA27, E2, ACA3, ACA17, ACA2b or ACA36, and wherein the gsnoRNA recruits a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA (e.g., mRNA). In some embodiments, the gsnoRNA comprises a scaffold sequence derived from wildtype ACA2b or ACA36. In some embodiments, the DKC1 protein is a DKC1 isoform with cytoplasmic localization. In some embodiments, the DKC1 protein is part of a ribonucleoprotein (RNP) complex that associates with the gsnoRNA.
In some embodiments, provided herein is a method for editing a target RNA in a host cell, comprising introducing an engineered gsnoRNA into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, wherein the gsnoRNA comprises a scaffold sequence derived from wildtype ACA19, ACA44, ACA27, E2, ACA3, ACA17, ACA2b or ACA36, and wherein the gsnoRNA recruits a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA (e.g., mRNA). The engineered gsnoRNA can be any one of the engineered gsnoRNAs described in Section II B. In some embodiments, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 4-6, 9-12, 15-19, 22-36, and 177-179. In some embodiments, the DKC1 protein is a DKC1 isoform with cytoplasmic localization. In some embodiments, the DKC1 protein is part of a ribonucleoprotein (RNP) complex that associates with the gsnoRNA. In some embodiments, the DKC1 protein comprises a deletion of a nuclear localization signal (NLS) relative to a wildtype DKC1 protein of the same species. In some embodiments, the DKC1 protein is a truncated DKC1 variant or a DKC1 variant comprising a deletion, such as any of the truncation or deletion variants described in Section II A above.
In some embodiments, provided herein is a method for editing a target RNA in a host cell, comprising introducing an engineered gsnoRNA into the host cell, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, wherein the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 4-6, 9-12, 15-19, 22-36, and 177-179, and wherein the gsnoRNA recruits a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the DKC1 protein is a DKC1 isoform with cytoplasmic localization. In some embodiments, the DKC1 protein comprises a DKC1 protein fragment corresponding to amino acid residues 1 to 419 of a full-length human DKC1 protein, wherein the amino acid numbering is according to SEQ ID NO: 1. In some embodiments, the DKC1 protein is part of a ribonucleoprotein (RNP) complex that associates with the gsnoRNA. In some embodiments, the DKC1 protein comprises a deletion of a nuclear localization signal (NLS) relative to a wildtype DKC1 protein of the same species. In some embodiments, the DKC1 protein is a truncated DKC1 variant or a DKC1 variant comprising a deletion, such as any of the truncation or deletion variants described in Section II A above.
In some embodiments, the methods provided herein comprise introducing a nucleic acid molecule comprising a nucleotide sequence encoding the guide small nucleolar RNA (gsnoRNA) in tandem with a nucleotide sequence encoding the target RNA into a host cell. In some embodiments, the nucleotide sequence encoding the gsnoRNA is driven by a U6 or U1 promoter. In some embodiments, the nucleotide sequence encoding the target RNA is driven by the same or a different promoter. In some embodiments, the gsnoRNA encoded in tandem with a nucleotide sequence encoding the target RNA provides at least 1.1-, 1.2-, 1.3-, 1.4-, 1.5-, 1.6-, 1.7-, 1.8-, 1.9-, or 2-fold greater editing efficiency of the target RNA compared to the same gsnoRNA encoded in a separate nucleic acid molecule from the target RNA.
In some embodiments, the methods provided herein comprise introducing the guide small nucleolar RNA (gsnoRNA) to an endogenous nucleic acid molecule of a host cell, wherein the endogenous nucleic acid molecule comprises a nucleotide sequence encoding the target RNA. In some embodiments, the introducing comprises inserting a nucleotide sequence encoding the gsnoRNA into a region of the endogenous nucleic acid molecule that is directly or indirectly adjacent to the region encoding the target RNA. In some embodiments, the nucleotide sequence encoding the gsnoRNA is driven by a U6 or U1 promoter. Methods for inserting nucleotide sequences into endogenous nucleic acid molecules are known in the art, such as guided-nuclease (e.g., CRISPR/Cas) editing and homology-directed repair. In some embodiments, the gsnoRNA inserted into a region of an endogenous nucleic acid molecule that is directly or indirectly adjacent to a nucleotide sequence encoding the target RNA provides at least 1.1-, 1.2-, 1.3-, 1.4-, 1.5-, 1.6-, 1.7-, 1.8-, 1.9-, or 2-fold greater editing efficiency of the target RNA compared to the same gsnoRNA encoded in a separate nucleic acid molecule from the target RNA.
In some embodiments, the degree of recruiting and redirecting the pseudouridylation entities resident in the cell may be regulated by the dosing and the dosing regimen of the gsnoRNA. This is something to be determined by the experimenter (e.g., in vitro) or the clinician, usually in phase I and/or II clinical trials.
In some embodiments, the methods provided herein comprise modification of target RNA (e.g., mRNA) sequences in eukaryotic cells (e.g., metazoan or mammalian cells, such as human cells). In some aspects, the methods and compositions provided herein can be used with cells from any organ, e.g., skin, lung, heart, kidney, liver, pancreas, gut, muscle, gland, eye, brain, blood and the like. The cell can be located in vitro or in vivo. One advantage of the methods, compositions, systems, kits, and articles of manufacture of the present disclosure is that they can be used with cells in situ in a living organism, but can also be used with cells in culture. In some embodiments, cells are treated ex vivo and are then introduced into a living organism (e.g., re-introduced into an organism from whom they were originally derived). The methods, compositions, systems, kits, and articles of manufacture of the present disclosure can also be used to edit target RNA sequences in cells within a so-called organoid. Organoids can be thought of as three-dimensional in vitro-derived tissues but are driven using specific conditions to generate individual, isolated tissues (e.g., see Lancaster and Knoblich. 2014, Science 345 (6194):1247125). In a therapeutic setting they are useful because they can be derived in vitro from a patient's cells, and the organoids can then be re-introduced to the patient as autologous material, which is less likely to be rejected than a normal transplant. The cell to be treated will generally have a genetic mutation. The mutation may be heterozygous or homozygous. In some embodiments, the methods and compositions provided herein can be used to modify point mutations. In some embodiments, the methods and compositions provided herein are suitable for modifying sequences in cells, tissues or organs implicated in a diseased state of a subject (e.g., a human subject), for instance when the human subject suffers from a disease associated with a PTC.
The present disclosure provides methods that can be used to make a change (pseudouridylation) in a target RNA sequence in a eukaryotic cell through the use of an oligonucleotide (e.g., any of the gsnoRNAs described in Section II B above, or any gsnoRNA based on the engineered scaffolds described in Section II B above) that is capable of targeting a site to be edited and recruiting RNA editing proteins (e.g., DKC1) to bring about the editing reaction(s). In some embodiments, the DKC1 is endogenous DKC1. In some embodiments, the DKC1 is exogenously delivered. In some embodiments, the method comprises increasing the relative proportion of DKC1 isoform 3 or a DKC1 protein with cytoplasmic localization. The target RNA sequence may comprise a mutation that one may wish to correct or alter, such as a point mutation (a transition or a transversion). The target RNA may be any cellular or viral RNA sequence, but is more usually a pre-mRNA or an mRNA with a protein coding function. In some embodiments, the target sequence is endogenous to the eukaryotic (e.g., mammalian, e.g., human) cell.
In some embodiments, the methods provided herein are suitable for promoting read-through of a PTC, wherein the PTC is an opal codon (UGA), an amber codon (UAG), or an ochre codon (UAA). In some embodiments, the PTC is an opal codon (UGA), and the method results in at least 10%, at least 15%, at least 20%, or at least 25% read-through efficiency, wherein read-through efficiency is assayed as the percent of protein expression or activity (e.g., fluorescent intensity) compared to a control that lacks the PTC. In some embodiments, the PTC is an amber codon (UAG), and the method results in at least 2%, at least 5%, at least 10%, at least 12%, or at least 14% read-through efficiency, wherein read-through efficiency is assayed as the percent of protein expression or activity (e.g., fluorescent intensity) compared to a control that lacks the PTC. In some embodiments, the method results in at least 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, or 70% of cells expressing detectable levels of the full-length protein encoded by a target gene comprising the PTC.
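By way of example only, read-through efficiency as defined above (the percent of protein expression or activity relative to a control that lacks the PTC) can be computed as follows. This is a minimal sketch; the function and parameter names are hypothetical, and the optional background subtraction is an assumption about how a fluorescence readout might be handled, not a step recited herein.

```python
def read_through_efficiency(treated_signal, ptc_free_control_signal,
                            background=0.0):
    """Return read-through efficiency as a percentage.

    treated_signal          -- expression or activity readout (e.g.,
                               fluorescent intensity) of the PTC-containing
                               target after treatment
    ptc_free_control_signal -- the same readout for a control lacking the PTC
    background              -- assumed optional blank value subtracted from
                               both readouts (hypothetical parameter)
    """
    reference = ptc_free_control_signal - background
    if reference <= 0:
        raise ValueError("control signal must exceed background")
    # Clamp at zero so a treated sample below background reports 0%.
    signal = max(treated_signal - background, 0.0)
    return 100.0 * signal / reference
```

Under this definition, a treated readout of 150 against a PTC-free control readout of 1000 corresponds to 15% read-through efficiency.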
In some embodiments, the target uridine in the target RNA is in a premature termination codon in a sequence encoding a protein, wherein the method results in expression of the full-length protein in the host cell at a level of at least 4% (e.g., at least 5%, at least 6%, at least 7%, at least 8%, at least 9%, at least 10%, or higher) of the expression level of the full-length protein without a premature termination codon.
In some embodiments, the target uridine in the target RNA is in a premature termination codon in a sequence encoding a protein, wherein the method results in expression of the full-length protein, and wherein the expression of the protein is detectable without enrichment (e.g., without enrichment by immunoprecipitation). In some embodiments, the protein is detected via a tag (e.g., via a fluorescent tag). In some embodiments, the protein is detected by immunostaining according to methods known in the art.
In some embodiments, the target uridine in the target RNA is in a premature termination codon in a sequence encoding a protein, wherein the method results in expression of the full-length protein in at least 20% of host cells, e.g., at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, or a higher percentage of host cells.
Also provided are engineered gsnoRNA compositions or engineered RNA editing systems described herein for use in any one of the methods described herein, such as a method of editing a target RNA or a method of treatment. Also provided is the use of any one of the engineered gsnoRNA compositions or engineered RNA editing systems described herein in the preparation of a medicament for treating a disease or condition.
In some aspects, the methods provided herein comprise modifying a target RNA using a gsnoRNA that recruits a DKC1 protein to modify the target RNA. In some embodiments, the gsnoRNA hybridizes to a target sequence comprising a target uridine residue, and the modification of the RNA comprises modification of the target uridine to pseudouridine.
In some embodiments, the target RNA is an endogenous RNA of a cell (e.g., a eukaryotic cell such as a mammalian or human cell). In some embodiments, the target RNA is an endogenously transcribed RNA of the cell (e.g., transcribed from an endogenous nucleic acid sequence of the cell). In some embodiments, the target RNA is transcribed from a nucleic acid sequence that has been introduced into the cell (e.g., an RNA transcribed from an exogenously added nucleic acid molecule). In some embodiments, the target RNA is a ribosomal RNA. In some embodiments, the target RNA is a messenger RNA (mRNA).
In some embodiments, the sequence comprising the target uridine in the target RNA is a stop codon, and modification of the target uridine to pseudouridine causes the stop codon to be translated as a coding codon. In some embodiments, the stop codon is a premature termination codon (PTC). In some embodiments, the PTC is associated with a genetic disease or condition. Converting the target uridine in such a PTC to a pseudouridine, by using the means and methods of the present disclosure, then results in proper read-through of the reading frame during translation, thereby providing a (partly or fully) functional full length protein.
In some embodiments, provided herein is a method of treating a disease or condition associated with a PTC in a target RNA in a subject, comprising editing the target RNA in a cell of the subject using any of the RNA editing methods described herein, wherein the gsnoRNA comprises a guide sequence that hybridizes to the PTC in the target RNA, and wherein modification of the uridine residue in the PTC to a pseudouridine residue causes translation read-through of the PTC in the target RNA, thereby treating the disease or condition in the subject.
In some embodiments, the method of treating a disease or condition associated with a PTC in a target RNA in a subject comprises introducing an engineered gsnoRNA into a host cell of the subject, wherein the gsnoRNA comprises a guide sequence that hybridizes to the PTC comprising the uridine residue in the target RNA, wherein the gsnoRNA comprises a scaffold sequence derived from wildtype ACA2b, ACA36, ACA44, ACA27, E2, ACA3, or ACA17, and wherein the gsnoRNA recruits a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the DKC1 protein is an endogenous DKC1 protein of the host cell. In some embodiments, the method further comprises introducing a nucleic acid encoding the DKC1 protein into the host cell. In some embodiments, the DKC1 protein has cytoplasmic localization in the host cell. In some embodiments, the DKC1 protein comprises a DKC1 protein fragment corresponding to amino acid residues 41 to 420 of a human DKC1 isoform 3 protein, wherein the amino acid numbering is according to SEQ ID NO: 2. In some embodiments, the DKC1 protein comprises an amino acid sequence having at least 85% (e.g., at least about any one of 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more) identity to SEQ ID NO: 88. In some embodiments, the DKC1 protein comprises the amino acid sequence of SEQ ID NO: 88.
In some embodiments, the method of treating a disease or condition associated with a PTC in a target RNA in a subject comprises introducing an engineered gsnoRNA into a host cell of the subject, wherein the gsnoRNA comprises a guide sequence that hybridizes to the PTC comprising the target uridine residue in the target RNA, wherein the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 4-6, 9-12, 15-19, 22-36, and 177-179, and wherein the gsnoRNA recruits a DKC1 protein in the host cell to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the DKC1 protein is an endogenous DKC1 protein of the host cell. In some embodiments, the method further comprises introducing a nucleic acid encoding the DKC1 protein into the host cell. In some embodiments, the DKC1 protein has cytoplasmic localization in the host cell. In some embodiments, the DKC1 protein comprises a DKC1 protein fragment corresponding to amino acid residues 41 to 420 of a human DKC1 isoform 3 protein, wherein the amino acid numbering is according to SEQ ID NO: 2. In some embodiments, the DKC1 protein comprises an amino acid sequence having at least 85% (e.g., at least about any one of 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more) identity to SEQ ID NO: 88. In some embodiments, the DKC1 protein comprises the amino acid sequence of SEQ ID NO: 88.
In some embodiments, the gsnoRNA is a half gsnoRNA, e.g., comprising a single hairpin and an H box, or a single hairpin and an ACA box. In some embodiments, the gsnoRNA comprises or consists of any one of the sequences set forth in SEQ ID NOS: 89-100 and 113-128, as shown in
In some embodiments, the method of treating a disease or condition associated with a PTC in a target RNA in a subject comprises introducing an engineered gsnoRNA into a host cell of the subject, wherein the gsnoRNA comprises a sequence selected from SEQ ID NOs: 71-84. Sequences of exemplary engineered gsnoRNAs targeting uridine residues of exemplary disease-associated PTCs are shown in Table 4.
In some embodiments, the method of treating a disease or condition associated with a PTC in a target RNA in a subject comprises introducing (a) an engineered gsnoRNA and (b) a splice-switching antisense oligonucleotide (ASO) into a host cell of the subject, wherein the gsnoRNA comprises a guide sequence that hybridizes to a sequence comprising a target uridine residue in the target RNA, wherein the ASO enhances expression of a DKC1 protein that is an endogenous DKC1 isoform with cytoplasmic localization in the host cell, and wherein the gsnoRNA recruits the DKC1 protein to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the splice-switching ASO binds to a pre-mRNA of a DKC1 gene and directs splicing of DKC1 isoform 3. In some embodiments, introducing the splice-switching ASO increases expression of a DKC1 protein that is an endogenous DKC1 isoform with cytoplasmic localization in the host cell by at least 1.1-, 1.2-, 1.3-, 1.4-, 1.5-, 1.6-, 1.7-, 1.8-, 1.9-, 2-, 3-, 4-, 5-, or 10-fold compared to the expression of the same isoform in the host cell in the absence of the ASO. In some embodiments, administering the splice-switching ASO increases expression of DKC1 isoform 3 by at least 1.1-, 1.2-, 1.3-, 1.4-, 1.5-, 1.6-, 1.7-, 1.8-, 1.9-, 2-, 3-, 4-, 5-, or 10-fold compared to the expression of DKC1 isoform 3 in the host cell in the absence of the ASO. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from a wildtype H/ACA-snoRNA selected from the group consisting of ACA19, ACA2b, ACA36, ACA44, ACA27, E2, ACA3, and ACA17. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from ACA2b. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from ACA36. In some embodiments, the gsnoRNA comprises a scaffold sequence derived from ACA19.
In some embodiments, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 3-12, 15-19, 22-36, and 177-179. In some embodiments, the gsnoRNA comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 15-19.
In some embodiments, the disease or condition is selected from the group consisting of Cystic fibrosis, Hurler Syndrome, alpha-1-antitrypsin (A1AT) deficiency, Parkinson's disease, Alzheimer's disease, albinism, Amyotrophic lateral sclerosis, Asthma, β-thalassemia, CADASIL syndrome, Charcot-Marie-Tooth disease, Chronic Obstructive Pulmonary Disease (COPD), Distal Spinal Muscular Atrophy (DSMA), Duchenne/Becker muscular dystrophy, Dystrophic Epidermolysis bullosa, Epidermolysis bullosa, Fabry disease, Factor V Leiden associated disorders, Familial Adenomatous Polyposis, Galactosemia, Gaucher's Disease, Glucose-6-phosphate dehydrogenase, Haemophilia, Hereditary Hemochromatosis, Hunter Syndrome, Huntington's disease, Inflammatory Bowel Disease (IBD), Inherited polyagglutination syndrome, Leber congenital amaurosis, Lesch-Nyhan syndrome, Lynch syndrome, Marfan syndrome, Mucopolysaccharidosis, Muscular Dystrophy, Myotonic dystrophy types I and II, neurofibromatosis, Niemann-Pick disease type A, B and C, NY-eso1 related cancer, Peutz-Jeghers Syndrome, Phenylketonuria, Pompe's disease, Primary Ciliary Disease, Prothrombin mutation related disorders, such as the Prothrombin G20210A mutation, Pulmonary Hypertension, (autosomal dominant) Retinitis Pigmentosa, Sandhoff Disease, Severe Combined Immune Deficiency Syndrome (SCID), Sickle Cell Anemia, Spinal Muscular Atrophy, Stargardt's Disease, Tay-Sachs Disease, Usher syndrome, X-linked immunodeficiency, Sturge-Weber Syndrome, and cancer. Exemplary diseases or conditions associated with PTCs in target RNAs are listed in the Human Gene Mutation Database (HGMD®, available at hgmd.cf.ac.uk) and ClinVar database (see Landrum et al. ClinVar: improvements to accessing data. Nucleic Acids Res. 2020; 48(D1): D835-D844; available at ncbi.nlm.nih.gov/clinvar/intro). In some embodiments, threonine or serine is incorporated at the ΨAA and ΨAG codons, and phenylalanine or tyrosine at the ΨGA codons.
In some embodiments, the present disclosure provides for the use of a nucleic acid molecule (encoding an engineered gsnoRNA as described herein) in the manufacture of a medicament for the treatment of one or more of the diseases listed herein. In some embodiments, provided herein are engineered gsnoRNA for use in the treatment of cystic fibrosis (CF). Exemplary PTCs associated with CF are known in the art, for example as described in international patent publication WO2019191232, the contents of which are herein incorporated by reference in their entirety. Exemplary cystic fibrosis-associated PTC mutations include, but are not limited to, G542X (UGA), W1282X (UGA), R553X (UGA), R1162X (UGA), Y122X (UAA), W1089X, W846X, and W401X mutations, which can be modified through pseudouridylation to amino acid encoding codons, thereby allowing the translation to full length proteins. It has for instance been well established in the art that ΨAA and ΨAG codons are both translated to serine or threonine, whereas a ΨGA is translated to tyrosine or phenylalanine, instead of being seen as a stop codon (Karijolich and Yu, 2011). In some embodiments, the host cell is an archaeal or eukaryotic cell. In some embodiments, the host cell is a mammalian cell. In some embodiments, the host cell is a human cell. In some embodiments, the method is carried out in vivo. In other embodiments, the method is carried out ex vivo.
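For illustration, the recoding outcomes recited above (ΨAA and ΨAG translated to serine or threonine; ΨGA translated to tyrosine or phenylalanine) can be captured in a small lookup. This is a minimal sketch; the table and function names are hypothetical, and the lookup merely restates the outcomes described in the text rather than predicting which of the two residues is incorporated in a given context.

```python
# Residues reported in the text for pseudouridylated stop codons,
# keyed by the unmodified codon (the first U is the target uridine).
PSI_RECODING = {
    "UAA": ("Ser", "Thr"),  # ochre, read as ΨAA
    "UAG": ("Ser", "Thr"),  # amber, read as ΨAG
    "UGA": ("Tyr", "Phe"),  # opal, read as ΨGA
}


def recoded_residues(stop_codon):
    """Return the candidate residues for a pseudouridylated stop codon.

    Accepts RNA (UGA), DNA-style (TGA), or Ψ notation (ΨGA).
    """
    codon = stop_codon.upper().replace("T", "U").replace("Ψ", "U")
    if codon not in PSI_RECODING:
        raise ValueError(f"{stop_codon!r} is not a stop codon")
    return PSI_RECODING[codon]
```

For example, the opal (UGA) PTC of the G542X mutation listed above maps to tyrosine or phenylalanine under this lookup, whereas the ochre (UAA) PTC of the Y122X mutation maps to serine or threonine.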
The methods of the present disclosure can be applied to suppress NMD and/or promote PTC read-through of a disease-associated PTC for a wide range of known disease-associated PTCs. There are a large number of human diseases that result from nonsense mutations in the respective disease genes. For instance, Usher syndrome is an inherited retinal dystrophy (IRD) that is the principal cause of combined deafness and blindness. Nonsense mutations occur in 12% of Usher syndrome patients and have been described in different genes, such as the USH2A gene. Some Hurler Syndrome patients, suffering from skeletal abnormalities and cognitive impairment, carry a nonsense mutation in the IDUA gene that prevents the production of a functional full-length IDUA protein in these patients. A substantial fraction of cystic fibrosis (CF) cases, a chronic disease affecting the lungs and the digestive system, is due to nonsense mutations in the CFTR gene. The PTCs resulting from these nonsense mutations are identified in the coding region at several different sites, each of which leads to total lack of functional full-length CFTR protein. Nonsense mutations are also found in some relevant oncogenes of many cancer patients, resulting in complete lack of full-length protein products. Given the deleterious role of nonsense mutations in gene expression and disease, nonsense suppression becomes an attractive strategy and the ultimate goal in combating these diseases.
In some aspects, the methods provided herein comprise delivering (e.g., administering) a gsnoRNA and/or DKC1 protein, or a nucleic acid encoding the gsnoRNA and/or DKC1 protein, to a host cell comprising the target RNA. The amount of nucleic acid encoding a gsnoRNA and/or DKC1 protein to be administered, the dosage and the dosing regimen can vary from cell type to cell type, the disease to be treated, the target population, the mode of administration (e.g. systemic versus local), the severity of disease and the acceptable level of side activity, but these can and should be assessed by trial and error during in vitro research, in pre-clinical and clinical trials. The trials are particularly straightforward when the modified sequence leads to an easily-detected phenotypic change.
In some embodiments, the method comprises delivering one or more nucleic acids (e.g., a gsnoRNA or a nucleic acid encoding the gsnoRNA and/or DKC1 protein) and/or a pre-formed gsnoRNA protein complex (which may comprise the gsnoRNA, DKC1 protein, NOP10 protein, GAR1 protein, and/or NHP2 protein) to a cell (e.g., a mammalian or human cell). Exemplary intracellular delivery methods include, but are not limited to: viruses or virus-like agents; chemical-based transfection methods, such as those using calcium phosphate, dendrimers, liposomes, or cationic polymers (e.g., DEAE-dextran or polyethylenimine); non-chemical methods, such as microinjection, electroporation, cell squeezing, sonoporation, optical transfection, impalefection, protoplast fusion, bacterial conjugation, delivery of plasmids or transposons; particle-based methods, such as using a gene gun, magnetofection or magnet assisted transfection, particle bombardment; and hybrid methods, such as nucleofection. In some embodiments, the present application further provides cells produced by such methods, and organisms (e.g., non-human mammals) comprising or produced from such cells.
Methods of non-viral delivery of nucleic acids include lipofection, nucleofection, microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid:nucleic acid conjugates, naked DNA, artificial virions, and agent-enhanced uptake of DNA. Lipofection is described in e.g., U.S. Pat. Nos. 5,049,386, 4,946,787, and 4,897,355, and lipofection reagents are sold commercially (e.g., TRANSFECTAMINE™ and LIPOFECTAMINE®). In some embodiments, LIPOFECTAMINE® 2000 is used to transfect a nucleic acid encoding the gsnoRNA and/or the DKC1 protein (e.g., a nucleic acid vector encoding the gsnoRNA and/or the DKC1 protein).
One suitable trial technique involves delivering the nucleic acid molecule according to the present disclosure to cell extracts, cell lines, or a test organism and then taking biopsy samples at various time points thereafter. The sequence of the target RNA can be assessed in the biopsy sample and the proportion of cells having the modification can easily be followed. After this trial has been performed once then the knowledge can be retained and future delivery can be performed without needing to take biopsy samples. A method of the present disclosure can thus include a step of identifying the presence of the desired change in the cell's target RNA sequence, thereby verifying that the target RNA sequence has been modified. The change may be assessed on the level of the protein (length, glycosylation, function or the like), or by some functional read-out, such as a(n) (inducible) current, when the protein encoded by the target RNA sequence is an ion channel, for example. In the case of CFTR function, an Ussing chamber assay or an NPD test in a mammal, including humans, are well known to a person skilled in the art to assess restoration or gain of function.
After pseudouridylation has occurred in a cell, the modified RNA can become diluted over time, for example due to cell division, limited half-life of the edited RNAs, etc. Thus, in practical therapeutic terms a method of the present disclosure may involve repeated delivery of an oligonucleotide until enough target RNAs have been modified to provide a tangible benefit to the patient and/or to maintain the benefits over time.
In some embodiments, gsnoRNAs can be delivered to cells in the form of a naked nucleic acid. One other way by which such constructs (a gsnoRNA and/or DKC1 protein, or a nucleic acid encoding the gsnoRNA and/or DKC1 protein) can be delivered to the cell (either in vitro, ex vivo or in vivo) is by using a delivery vehicle such as a viral vector.
Conventional viral based systems for nucleic acid delivery include retroviral, lentivirus, adenoviral, adeno-associated and herpes simplex virus vectors. Integration in the host genome is possible with the retrovirus, lentivirus, and adeno-associated virus methods, often resulting in long term expression of the inserted transgene. Additionally, high transduction efficiencies have been observed in many different cell types. The tropism of a retrovirus can be altered by incorporating foreign envelope proteins, expanding the potential target population of target cells. Lentiviral vectors are retroviral vectors that are able to transduce or infect non-dividing cells and typically produce high viral titers. Retroviral vectors are comprised of cis-acting long terminal repeats with packaging capacity for up to 6-10 kb of foreign sequence. The minimum cis-acting LTRs are sufficient for replication and packaging of the vectors, which are then used to integrate the nucleic acids into the target cell to provide permanent transgene expression. Widely used retroviral vectors include those based upon murine leukemia virus (MuLV), gibbon ape leukemia virus (GaLV), simian immunodeficiency virus (SIV), human immunodeficiency virus (HIV), and combinations thereof. In applications where transient expression is preferred, adenoviral based systems may be used. Adenoviral based vectors are capable of very high transduction efficiency in many cell types and do not require cell division. With such vectors, high titer and levels of expression have been obtained. This vector can be produced in large quantities in a relatively simple system.
Packaging cells are typically used to form virus particles that are capable of infecting a host cell. Such cells include 293T cells, which package adenovirus, and ψ2 cells or PA317 cells, which package retrovirus. Viral vectors are usually generated by producing a cell line that packages a nucleic acid vector into a viral particle. The vectors typically contain the minimal viral sequences required for packaging and subsequent integration into a host, other viral sequences being replaced by an expression cassette for the polynucleotide(s) to be expressed. The missing viral functions are typically supplied in trans by the packaging cell line. For example, AAV vectors used in gene therapy typically only possess ITR sequences from the AAV genome which are required for packaging and integration into the host genome. Viral DNA is packaged in a cell line, which contains a helper plasmid encoding the other AAV genes, namely rep and cap, but lacking ITR sequences. The cell line may also be infected with adenovirus as a helper. The helper virus promotes replication of the AAV vector and expression of AAV genes from the helper plasmid. The helper plasmid is not packaged in significant amounts due to a lack of ITR sequences. Contamination with adenovirus can be reduced by, e.g., heat treatment to which adenovirus is more sensitive than AAV.
In some embodiments, the viral vector is based on Adeno-Associated Virus (AAV). In some embodiments, the viral vector is for instance a retroviral vector such as a lentivirus vector and the like. Also, plasmids, artificial chromosomes, and plasmids usable for targeted homologous recombination and integration in the human genome of cells may be suitably applied for delivery of a gsnoRNA as described herein. In some embodiments, when the gsnoRNA is delivered by a viral vector, it is in the form of an RNA transcript that comprises the sequence of an oligonucleotide according to the present disclosure in a part of the transcript. In some embodiments, an AAV vector according to the present disclosure is a recombinant AAV vector and refers to an AAV vector comprising part of an AAV genome comprising an exon-intron-exon sequence according to the present disclosure encapsidated in a protein shell of capsid protein derived from an AAV serotype. Part of an AAV genome may contain the inverted terminal repeats (ITR) derived from an adeno-associated virus serotype, such as AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9 and others. A protein shell comprised of capsid protein may be derived from an AAV serotype such as AAV1, 2, 3, 4, 5, 6, 7, 8, 9 and others. A protein shell may also be named a capsid protein shell. An AAV vector may have one or all wild type AAV genes deleted, but may still comprise functional ITR nucleic acid sequences. Functional ITR sequences are necessary for the replication, rescue and packaging of AAV virions. The ITR sequences may be wild type sequences or may have at least 80%, 85%, 90%, 95%, or 100% sequence identity with wild type sequences or may be altered by, for example, insertion, mutation, deletion or substitution of nucleotides, as long as they remain functional.
In this context, functionality refers to the ability to direct packaging of the genome into the capsid shell and then allow for expression in the host cell to be infected or target cell. In the context of the present disclosure a capsid protein shell may be of a different serotype than the AAV vector genome ITR. An AAV vector according to the present disclosure may thus be composed of a capsid protein shell, i.e. the icosahedral capsid, which comprises capsid proteins (VP1, VP2, and/or VP3) of one AAV serotype, e.g. AAV serotype 2, whereas the ITR sequences contained in that AAV2 vector may be any of the AAV serotypes described above, including an AAV2 vector. An “AAV2 vector” thus comprises a capsid protein shell of AAV serotype 2, while e.g. an “AAV5 vector” comprises a capsid protein shell of AAV serotype 5, whereby either may encapsidate any AAV vector genome ITR according to the present disclosure. In some embodiments, a recombinant AAV vector according to the present disclosure comprises a capsid protein shell of AAV serotype 2, 5, 8 or AAV serotype 9 wherein the AAV genome or ITRs present in said AAV vector are derived from AAV serotype 2, 5, 8 or AAV serotype 9; such AAV vector is referred to as an AAV2/2, AAV2/5, AAV2/8, AAV2/9, AAV5/2, AAV5/5, AAV5/8, AAV5/9, AAV8/2, AAV8/5, AAV8/8, AAV8/9, AAV9/2, AAV9/5, AAV9/8, or an AAV9/9 vector.
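The sixteen vector designations listed above follow from pairing each capsid serotype with each genome or ITR serotype. The following minimal sketch enumerates them, reading the first number as the capsid serotype and the second as the serotype of the AAV genome or ITRs, which is how the individual embodiments of this disclosure use the notation. The helper name is hypothetical.

```python
from itertools import product

# Serotypes named in the embodiment above.
SEROTYPES = (2, 5, 8, 9)


def pseudotype_name(capsid: int, genome: int) -> str:
    """Hypothetical helper: capsid serotype first, genome/ITR serotype second."""
    return f"AAV{capsid}/{genome}"


# Every capsid serotype paired with every genome/ITR serotype (4 x 4 = 16).
names = [pseudotype_name(capsid, genome)
         for capsid, genome in product(SEROTYPES, repeat=2)]

print(len(names))
print(", ".join(names))
```

Because `product` varies the second element fastest, the names are generated in the same order as the list in the text, from AAV2/2 through AAV9/9.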
In some embodiments, a recombinant AAV vector according to the present disclosure comprises a capsid protein shell of AAV serotype 2 and the AAV genome or ITRs present in said vector are derived from AAV serotype 5; such vector is referred to as an AAV 2/5 vector. In some embodiments, a recombinant AAV vector according to the present disclosure comprises a capsid protein shell of AAV serotype 2 and the AAV genome or ITRs present in said vector are derived from AAV serotype 8; such vector is referred to as an AAV 2/8 vector. In some embodiments, a recombinant AAV vector according to the present disclosure comprises a capsid protein shell of AAV serotype 2 and the AAV genome or ITRs present in said vector are derived from AAV serotype 9; such vector is referred to as an AAV 2/9 vector. In some embodiments, a recombinant AAV vector according to the present disclosure comprises a capsid protein shell of AAV serotype 2 and the AAV genome or ITRs present in said vector are derived from AAV serotype 2; such vector is referred to as an AAV 2/2 vector. In some embodiments, a nucleic acid molecule harboring an exon-intron-guide RNA-intron-exon sequence according to the present disclosure represented by a nucleic acid sequence of choice is inserted between the AAV genome or ITR sequences as identified above, for example an expression construct comprising an expression regulatory element operably linked to a coding sequence and a 3′ termination sequence. “AAV helper functions” generally refers to the corresponding AAV functions required for AAV replication and packaging supplied to the AAV vector in trans. AAV helper functions complement the AAV functions which are missing in the AAV vector, but they lack AAV ITRs (which are provided by the AAV vector genome). AAV helper functions include the two major ORFs of AAV, namely the rep coding region and the cap coding region or functional substantially identical sequences thereof. Rep and Cap regions are well known in the art. 
The AAV helper functions can be supplied on an AAV helper construct, which may be a plasmid.
Introduction of the helper construct into the host cell can occur e.g. by transformation, transfection, or transduction prior to or concurrently with the introduction of the AAV genome present in the AAV vector as identified herein. The AAV helper constructs of the present disclosure may thus be chosen such that they produce the desired combination of serotypes for the AAV vector's capsid protein shell on the one hand and for the AAV genome present in said AAV vector replication and packaging on the other hand. “AAV helper virus” provides additional functions required for AAV replication and packaging.
Suitable AAV helper viruses include adenoviruses, herpes simplex viruses (such as HSV types 1 and 2) and vaccinia viruses. The additional functions provided by the helper virus can also be introduced into the host cell via vectors, as described in U.S. Pat. No. 6,531,456. In some embodiments, an AAV genome as present in a recombinant AAV vector according to the present disclosure does not comprise any nucleotide sequences encoding viral proteins, such as the rep (replication) or cap (capsid) genes of AAV. An AAV genome may further comprise a marker or reporter gene, such as a gene for example encoding an antibiotic resistance gene, a fluorescent protein (e.g. gfp) or a gene encoding a chemically, enzymatically or otherwise detectable and/or selectable product (e.g. lacZ, aph, etc.) known in the art. In some embodiments, an AAV vector according to the present disclosure is an AAV2/5, AAV2/8, AAV2/9 or AAV2/2 vector.
In some embodiments, the gsnoRNA and DKC1 are delivered to the cell as a ribonucleoprotein complex (e.g., a complex comprising the gsnoRNA, DKC1, NOP10, GAR1, and/or NHP2). Methods for intracellular delivery of protein or protein complexes, such as pre-formed gsnoRNA-DKC1/NOP10/GAR1/NHP2 complex, include, but are not limited to, mechanical methods, such as microinjection, electroporation and mechanical deformation of cells using a microfluidic device; carrier-based methods, such as cell-penetrating peptides (CPPs), virus-like particles, supercharged proteins, nanocarriers, supramolecular carrier-based delivery systems, and nanoparticle-stabilized nanocapsules. See, for example, Fu et al. Bioconjugate Chem. 2014, 25, 1602-1608. Some mechanical methods, such as microinjection and electroporation, can be invasive and low-throughput. In some embodiments, the ribonucleoprotein complex is delivered into the cell by inserting the complex through the cell membrane while passing cells through a microfluidic system, such as CELL SQUEEZE® (see, for example, U.S. Patent Application Publication No. 20140287509).
As described above, introduction of the nucleic acid molecule according to the present disclosure into the cell is performed by general methods known to the person skilled in the art. After pseudouridylation, the read-out of the effect (alteration of the target RNA sequence) can be monitored through different ways in an optional identification step. Hence, the identification step of whether the desired pseudouridylation of the target uridine has indeed taken place depends generally on the position of the target uridine in the target RNA sequence, and the effect that is incurred by the presence of the uridine (point mutation, PTC). Hence, in some embodiments, depending on the ultimate effect of U to Ψ conversion, the identification step comprises: assessing the presence of a functional, elongated, full length and/or wild type protein; assessing whether splicing of the pre-mRNA was altered by the pseudouridylation; or using a functional read-out, wherein the target RNA after the pseudouridylation encodes a functional, full length, elongated and/or wild type protein. The functional assessment for each of the diseases mentioned herein will generally be according to methods known to the skilled person.
The nucleic acid molecule, such as a gsnoRNA expression construct or vector according to the present disclosure is suitably administrated in aqueous solution, e.g. saline, or in suspension, optionally comprising additives, excipients and other ingredients, compatible with pharmaceutical use. Administration may be by inhalation (e.g. through nebulization), intranasally, orally, by injection or infusion, intravenously, subcutaneously, intra-dermally, intra-cranially, intravitreally, intramuscularly, intra-tracheally, intra-peritoneally, intra-rectally, and the like. Administration may be in solid form, in the form of a powder, a pill, or in any other form compatible with pharmaceutical use in humans. The present disclosure is particularly suitable for treating genetic diseases, such as CF.
In some embodiments, the nucleic acid molecule, such as a gsnoRNA, expression construct or vector can be delivered systemically. In some embodiments, the nucleic acid molecule, such as a gsnoRNA, expression construct or vector can be delivered to cells or delivered locally to a tissue in which the target sequence's phenotype is seen. For instance, mutations in CFTR cause CF, which is primarily seen in lung epithelial tissue, so with a CFTR target sequence, in some embodiments the oligonucleotide construct is delivered specifically and directly to the lungs. This can be conveniently achieved by inhalation, e.g. of a powder or aerosol, typically via the use of a nebulizer. In some embodiments, the nebulizer is a nebulizer that uses a so-called vibrating mesh, including the PARI eFlow (Rapid) or the i-neb from Respironics. It is to be expected that inhaled delivery of oligonucleotide constructs according to the present disclosure can also target these cells efficiently, which in the case of CFTR gene targeting could lead to amelioration of gastrointestinal symptoms also associated with CF. In some diseases the mucus layer shows an increased thickness, leading to a decreased absorption of medicines via the lung. One such disease is chronic bronchitis; another example is CF. A variety of mucus normalizers are available, such as DNases, hypertonic saline or mannitol, which is commercially available under the name of Bronchitol. When mucus normalizers are used in combination with pseudouridylating oligonucleotide constructs, such as the gsnoRNA constructs according to the present disclosure, they might increase the effectiveness of those medicines. Accordingly, administration of an oligonucleotide construct according to the present disclosure to a subject, such as a human subject, may be combined with mucus normalizers.
In addition, administration of the oligonucleotide constructs according to the present disclosure can be combined with administration of small molecules for treatment of CF, such as potentiator compounds, for example Kalydeco (ivacaftor; VX-770), or corrector compounds, for example VX-809 (lumacaftor) and/or VX-661. Alternatively, or in combination with the mucus normalizers, delivery in mucus penetrating particles or nanoparticles can be applied for efficient delivery of pseudouridylating molecules to epithelial cells of, for example, lung and intestine. In some embodiments, administration of an oligonucleotide construct according to the present disclosure to a subject, such as a human subject, is combined with antibiotic treatment to reduce bacterial infections and their symptoms, such as mucus thickening and/or biofilm formation. The antibiotics can be administered systemically or locally or both. For application in CF patients, the oligonucleotide constructs according to the present disclosure, or packaged or complexed oligonucleotide constructs according to the present disclosure, may be combined with any mucus normalizer such as a DNase, mannitol, hypertonic saline and/or antibiotics and/or a small molecule for treatment of CF, such as potentiator compounds, for example ivacaftor, or corrector compounds, for example lumacaftor and/or VX-661. To increase access to the target cells, Broncho-Alveolar Lavage (BAL) could be applied to clean the lungs before administration of the oligonucleotide according to the present disclosure.
In some aspects, provided herein is a pharmaceutical composition comprising any of the gsnoRNAs, nucleic acid constructs/molecules, or engineered RNA-editing systems described herein, and a pharmaceutically acceptable carrier.
Pharmaceutical compositions can be prepared by mixing the therapeutic agents described herein having the desired degree of purity with optional pharmaceutically acceptable carriers, excipients or stabilizers (Remington's Pharmaceutical Sciences 16th edition, Osol, A. Ed. (1980)), in the form of lyophilized formulations or aqueous solutions. Acceptable carriers, excipients, or stabilizers are nontoxic to recipients at the dosages and concentrations employed, and include buffers, antioxidants including ascorbic acid, methionine, Vitamin E, sodium metabisulfite; preservatives, isotonicifiers (e.g. sodium chloride), stabilizers, metal complexes (e.g. Zn-protein complexes); chelating agents such as EDTA and/or non-ionic surfactants.
In some embodiments, the pharmaceutical composition is contained in a single-use vial, such as a single-use sealed vial. In some embodiments, the pharmaceutical composition is contained in a multi-use vial. In some embodiments, the pharmaceutical composition is contained in bulk in a container. In some embodiments, the pharmaceutical composition is cryopreserved.
In some embodiments, the pharmaceutical composition comprises a gsnoRNA. In other embodiments, the pharmaceutical composition comprises a nucleic acid construct (e.g., a vector such as a plasmid or viral vector) encoding the gsnoRNA. In some embodiments, the pharmaceutical composition comprises free gsnoRNAs (‘naked’ gsnoRNAs), or gsnoRNAs conjugated to other components, such as ligands for targeting, for uptake and/or for intracellular trafficking. gsnoRNAs may be used in aqueous solutions (generally pharmaceutically acceptable carriers and/or solvents), or formulated using transfection agents, liposomes or nanoparticulate forms (e.g. SNALPs, LNPs and the like). Such formulations may comprise functional ligands to enhance bioavailability and the like.
The present application further provides kits and articles of manufacture for use in any embodiment of the treatment methods described herein. The kits and articles of manufacture may comprise any one of the formulations and pharmaceutical compositions described herein.
In some aspects, provided herein is a kit for editing a target RNA in a host cell, comprising any of the gsnoRNA or nucleic acid molecules described in Section II B. In some embodiments, the kit further comprises an agent for enhancing expression of an endogenous DKC1 isoform 3 in the host cell. In some embodiments, the kit comprises a splice-switching antisense oligonucleotide (ASO), wherein the ASO enhances expression of a DKC1 protein that is an endogenous DKC1 isoform with cytoplasmic localization in the host cell. In some embodiments, the kit further comprises a DKC1 protein or nucleic acid encoding a DKC1 protein. In some embodiments, the DKC1 protein is a DKC1 isoform (e.g., isoform 3) with cytoplasmic localization. In some embodiments, the DKC1 protein is part of a ribonucleoprotein (RNP) complex that associates with the gsnoRNA. In some embodiments, the DKC1 protein comprises a deletion of a nuclear localization signal (NLS) relative to a wildtype DKC1 protein of the same species. In some embodiments, the DKC1 protein is a truncated DKC1 variant or a DKC1 variant comprising a deletion, such as any of the truncation or deletion variants described in Section II A above. In some embodiments, the kit further includes instructions for editing the target RNA according to any of the methods described herein.
In some aspects, provided herein is a kit for editing a target RNA in a host cell, comprising an engineered RNA-editing system, wherein the engineered RNA editing system comprises: (a) a gsnoRNA comprising a guide sequence that hybridizes to a sequence comprising a target uridine residue in a target RNA in a host cell, or a nucleic acid molecule encoding the gsnoRNA; and (b) a DKC1 protein, or a nucleic acid molecule encoding the DKC1 protein, wherein the gsnoRNA is capable of recruiting the DKC1 protein to modify the target uridine residue into a pseudouridine residue in the target RNA. In some embodiments, the DKC1 protein is a DKC1 isoform with cytoplasmic localization. In some embodiments, the DKC1 protein is a DKC1 isoform (e.g., isoform 3) with cytoplasmic localization. In some embodiments, the DKC1 protein is part of a ribonucleoprotein (RNP) complex that associates with the gsnoRNA. In some embodiments, the DKC1 protein comprises a deletion of a nuclear localization signal (NLS) relative to a wildtype DKC1 protein of the same species. In some embodiments, the DKC1 protein is a truncated DKC1 variant or a DKC1 variant comprising a deletion, such as any of the truncation or deletion variants described in Section II A above. In some embodiments, the kit further includes instructions for editing the target RNA according to any of the methods described herein.
The kits of the present disclosure are in suitable packaging. Suitable packaging includes, but is not limited to, vials, bottles, jars, flexible packaging (e.g., sealed Mylar or plastic bags), and the like. Kits may optionally provide additional components such as buffers and interpretative information. The present application thus also provides articles of manufacture, which include vials (such as sealed vials), bottles, jars, flexible packaging, and the like.
The instructions relating to the use of the compositions generally include information as to dosage, dosing schedule, and route of administration for the intended treatment. The containers may be unit doses, bulk packages (e.g., multi-dose packages) or sub-unit doses. For example, kits may be provided that contain sufficient dosages of the gsnoRNA and/or DKC1 protein, or nucleic acid molecules encoding the gsnoRNA and/or DKC1 protein as disclosed herein to provide effective treatment of an individual or of many individuals. Additionally, kits may be provided that contain sufficient dosages of the gsnoRNA and/or DKC1 protein, or nucleic acid molecules encoding the gsnoRNA and/or DKC1 protein to allow for multiple administrations to an individual. Kits may also include multiple unit doses of the pharmaceutical compositions and instructions for use and packaged in quantities sufficient for storage and use in pharmacies, for example, hospital pharmacies and compounding pharmacies.
In some embodiments, the kit comprises a delivery system. The delivery system may be a unit dose delivery system. Delivery systems for these various dosage forms can be syringes, dropper bottles, plastic squeeze units, atomizers, nebulizers or pharmaceutical aerosols in either unit dose or multiple dose packages. In some embodiments, there is provided a delivery system of any one of the gsnoRNA and/or DKC1 protein, or nucleic acid molecules encoding the gsnoRNA and/or DKC1 protein described herein, comprising the gsnoRNA and/or DKC1 protein, or nucleic acid molecules encoding the gsnoRNA and/or DKC1 protein and a device for delivering the gsnoRNA and/or DKC1 protein, or nucleic acid molecules encoding the gsnoRNA and/or DKC1 protein.
All of the features disclosed in this specification may be combined in any combination. Each feature disclosed in this specification may be replaced by an alternative feature serving the same, equivalent, or similar purpose. Thus, unless expressly stated otherwise, each feature disclosed is only an example of a generic series of equivalent or similar features.
The present disclosure will be more fully understood by reference to the following examples. They should not, however, be construed as limiting the scope of the present disclosure. It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims.
This example demonstrates the efficiency of pseudouridylation of target RNAs (e.g., mRNA pseudouridylation evidenced by pseudouridylation-dependent PTC-read-through) by engineered guide snoRNAs, and provides different expression systems for guide snoRNAs.
To achieve site-specific pseudouridylation in mRNA in vivo, artificial guide snoRNAs (gsnoRNAs) were engineered to target specific mRNAs for modification (
The sequence of the control gsnoRNA (gCtrl) is provided below, with guide regions underlined (SEQ ID NO: 14):
In the human genome, more than 90% of snoRNA genes are encoded in pre-mRNA introns1. The present inventors first evaluated the effect of PTC-read-through mediated by several gsnoRNAs located in host gene introns (RESTART v1.0). The present inventors first selected 4 endogenous snoRNAs that have high expression levels in human2, including ACA19, ACA44, ACA27, and E2 (within EIF3A, SNHG12, RPL21, RPSA host genes, respectively) (
Based on the present inventors' observation that host gene sequences have divergent effects on different gsnoRNAs (as shown in
To determine whether endogenous DKC1 proteins are responsible for the above observation, the present inventors carried out RESTART v1.1 on stable DKC1 knockdown (DKC1 KD) HEK293T cells (
To identify optimal gsnoRNA scaffolds, the present inventors selected five snoRNAs (gACA3, gACA17, gACA19, gACA2b, and gACA36) with stable secondary structures predicted by RNAfold3 as candidate scaffolds for further characterization (RESTART v1.2) (
To investigate the roles of the two hairpins of gsnoRNAs, the present inventors introduced mutations in 5′ and 3′ guide elements, respectively (
The present inventors next sought to further improve the PTC read-through efficiency by engineering the gsnoRNA scaffolds (RESTART v1.3) (
The present inventors next asked whether the spatial proximity of the gsnoRNA and the target PTC site affects the efficiency of PTC read-through. The inventors designed two new reporters: (1) Reporter-2 contains a PTC site between the mCherry and EGFP coding regions and is activated by gsnoRNAs from RESTART v1.3; mCherry was used to normalize for transfection efficiency. (2) In Reporter-3 (RESTART v1.4), the gsnoRNA is arranged in tandem with the PTC reporter, which is the same PTC reporter as in Reporter-2. The gsnoRNAs had comparable efficiencies in suppressing the PTC of Reporter-2 and Reporter-1, indicating that the gsnoRNAs work for different reporters. Unexpectedly, the gsnoRNAs had increased PTC read-through efficiencies in Reporter-3 (relative EGFP-positive cells: ~30%, ~2-fold compared to RESTART v1.3).
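The normalization described above can be illustrated with a minimal Python sketch. It assumes that the relative EGFP-positive fraction is computed as the number of cells positive for both EGFP and mCherry divided by the number of mCherry-positive (transfected) cells; this gating scheme, the function names, and the cell counts are hypothetical and are not taken from the present disclosure.

```python
def relative_egfp_positive(double_positive_cells: int, mcherry_positive_cells: int) -> float:
    """Fraction of transfected (mCherry+) cells that also express EGFP.

    Assumption: mCherry marks transfected cells, so dividing by the
    mCherry+ count normalizes for transfection efficiency.
    """
    if mcherry_positive_cells <= 0:
        raise ValueError("mCherry-positive cell count must be positive")
    return double_positive_cells / mcherry_positive_cells


def fold_change(test_fraction: float, reference_fraction: float) -> float:
    """Ratio of two normalized read-through fractions."""
    if reference_fraction <= 0:
        raise ValueError("reference fraction must be positive")
    return test_fraction / reference_fraction


# Hypothetical cell counts chosen only to illustrate the arithmetic.
reference = relative_egfp_positive(150, 1000)  # e.g., a Reporter-2-style measurement
tandem = relative_egfp_positive(300, 1000)     # e.g., a Reporter-3-style measurement
ratio = fold_change(tandem, reference)
```

With these illustrative counts, a normalized fraction of 0.30 against a reference of 0.15 corresponds to a 2-fold increase, which is the form of comparison reported for Reporter-3 relative to RESTART v1.3.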
The present inventors tested RESTART v1.4 in four different cell lines that originated from distinct tissues, including three human cell lines and one murine cell line. Efficient PTC-read-through events were observed for all cell lines tested, suggesting that the gsnoRNA design of the present disclosure is a versatile strategy to suppress PTC in different mammalian cell types.
Notably, neither combining the optimized mutations nor increasing the gsnoRNA expression level by transfecting a construct containing two tandem gsnoRNAs further increased PTC read-through, suggesting that RESTART v1.3 offers gsnoRNAs with optimal structure and expression levels. Based on the present inventors' realization that the engineered gsnoRNAs of the present disclosure provided optimized gsnoRNA structure and expression levels, the present inventors wondered whether enzyme levels and accessibility, rather than gsnoRNA stability and expression, might be rate-limiting factors. DKC1 is responsible for snoRNA-guided deposition of pseudouridine and the accompanying PTC read-through in RESTART (
First, the present inventors generated DKC1 stable overexpressing cell lines, and transfected said DKC1-isoform1 overexpressing cells with Reporter-3 (
To better characterize RESTART, an additional set of Reporter-3s was constructed to include all three types of stop codons, and the resulting reporter constructs were transfected into HEK293T with and without exogenous DKC1-isoform3 (
Next, Reporter-3 constructs of each of the three stop codons were individually co-transfected together with 200 ng DKC1-isoform3 expression construct into HEK293T cells. The locus-specific pseudouridine modification of the target was detected by a radiolabeling-free, qPCR-based method6 (
This Example demonstrates correction of disease-relevant premature termination codons (PTCs) using RESTART. RNA-guided pseudouridylation of disease-relevant PTCs by the RESTART system resulted in expression of full-length gene products. Furthermore, restoration of protein function using RESTART was demonstrated for a CFTR gene containing a disease-relevant PTC. In the following example, “X” indicates a stop codon mutation. Sequences of the gsnoRNAs tested are provided in Table 4.
PTC-disease reporters were constructed in which a disease gene containing the PTC site was followed by EGFP (as shown in
RESTART was further validated for suppression of the disease-relevant PTCs LMNA-R225X (associated with familial dilated cardiomyopathy (DCM) with conduction disease (DCM-CD)), F9-Y22X and F9-G21X (associated with hemophilia B), ABCA4-R408X (associated with Stargardt disease), RS1-Y65X (associated with X-linked retinoschisis), and Rpe65-R44X (associated with Leber congenital amaurosis), as shown in
Finally, restoration of protein function using RESTART was demonstrated for a CFTR (cystic fibrosis transmembrane conductance regulator) gene containing a disease-relevant PTC. Mutations in CFTR cause the monogenic disease cystic fibrosis, which affects approximately 1 in 2,500 live births in Caucasians. The ability of RESTART to repair the CFTR R553X (CGA-TGA) and W1282X (TGG-TGA) PTC sites and restore protein function was tested by electrophysiological assays, which are the “gold standard” for evaluating CFTR functional rescue. After delivery of RESTART, the function of CFTR containing a PTC could be rescued to about 30% of the WT CFTR level, indicating the therapeutic potential of RESTART in targeting certain monogenic diseases.
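As an illustration of the codon changes named above, the following minimal Python sketch checks that each mutant codon is a stop codon and reports which codon position differs from the wild-type codon. The function name and the small lookup table are hypothetical helpers written for this illustration; only the two codon pairs (CGA to TGA and TGG to TGA) are taken from the text.

```python
# DNA-sense stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Only the wild-type codons needed for this illustration (standard genetic code).
WILD_TYPE_AMINO_ACID = {"CGA": "R", "TGG": "W"}


def describe_ptc(wild_type_codon: str, mutant_codon: str) -> dict:
    """Summarize a single-codon nonsense mutation (hypothetical helper)."""
    if len(wild_type_codon) != 3 or len(mutant_codon) != 3:
        raise ValueError("codons must be exactly three nucleotides long")
    changed = [i + 1 for i in range(3) if wild_type_codon[i] != mutant_codon[i]]
    return {
        "wild_type_amino_acid": WILD_TYPE_AMINO_ACID.get(wild_type_codon),
        "mutant_is_stop": mutant_codon in STOP_CODONS,
        "changed_positions": changed,  # 1-based positions within the codon
        # T in the DNA sense strand corresponds to U in the mRNA.
        "first_position_is_uridine": mutant_codon[0] == "T",
    }


r553x = describe_ptc("CGA", "TGA")   # CFTR R553X
w1282x = describe_ptc("TGG", "TGA")  # CFTR W1282X
```

Both mutations yield the same stop codon (UGA in the mRNA) through single-nucleotide changes at different codon positions, and, as with every stop codon in the standard genetic code, the first position is a uridine.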
This example demonstrates the design and synthesis of functional oligonucleotides for gsnoRNA delivery to cells.
Full-length gsnoRNA oligonucleotides were prepared by in vitro transcription (IVT). To increase the stability of the gsnoRNA oligonucleotides in cells, a 5′ Cap modification (m7G(5′)ppp(5′)G cap analog) was added to the gsnoRNA oligonucleotides. The 5′ Cap modification is not present in endogenous intronic snoRNA. As an example, a 5′ Cap modified full-length gACA19 oligonucleotide targeting Reporter-2 (rACA19) was prepared by in vitro transcription (
Chemically synthesized half rACA19 oligonucleotides with 2′-O-methyl and phosphorothioate linkage modifications were prepared and tested for their ability to achieve efficient PTC-readthrough in cells, as shown in
Advantageously, the half gsnoRNA oligonucleotides are easier to synthesize chemically than the full-length gsnoRNA (~130 nt), which is too long to be synthesized efficiently. Furthermore, the rH5 and rH3 oligonucleotides were synthesized with only six phosphorothioate linkages and four 2′ O-methyl modifications per oligonucleotide, indicating that a small number of modifications is sufficient to promote stability and function of the chemically synthesized half gsnoRNAs. The 5′ hairpin (gH5, with H box) and 3′ hairpin (gH3, with ACA box) constructs reduced the efficiency of PTC read-through compared to the gACA19 oligonucleotide prepared by IVT. However, both the rH5 and rH3 oligonucleotides, which have the same sequences as gH5 and gH3, exhibited efficiency comparable to that of the full-length gACA19 construct (
These results indicate that a gsnoRNA can be effectively delivered to cells as a full-length RNA oligonucleotide prepared by in vitro transcription (e.g., with a 5′ cap to increase stability), or as a half oligonucleotide comprising the 5′ hairpin or the 3′ hairpin prepared by chemical synthesis. Moreover, the data demonstrate that chemically synthesized rH3 or rH5 with six phosphorothioate linkages and only four 2′ O-methyl modifications are stable and functional in cells. Advantageously, the use of chemically synthesized rH3 and rH5 oligonucleotides with a small number of modifications can reduce the cost of preparing the chemically synthesized oligonucleotides. The delivered RNA oligonucleotides can function better than a DNA vector encoding the same gsnoRNA construct delivered to cells.
Number | Date | Country | Kind |
---|---|---|---|
PCT/CN2021/096122 | May 2021 | WO | international |
This application claims priority benefit of International Patent Application No. PCT/CN2021/096122 filed May 26, 2021, the content of which is incorporated herein by reference in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2022/095172 | 5/26/2022 | WO |