The present application relates to methods and compositions for editing RNAs using engineered linear or circular RNA capable of recruiting an adenosine deaminase to deaminate one or more adenosines in target RNAs.
Genome editing is a powerful tool for biomedical research and development of therapeutics for diseases. Editing technologies using engineered nucleases, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and Cas proteins of the CRISPR system have been applied to manipulate the genome in a myriad of organisms. Recently, taking advantage of deaminase proteins, such as Adenosine Deaminase Acting on RNA (ADAR), new tools were developed for RNA editing. In mammalian cells, there are three types of ADAR proteins: ADAR1 (two isoforms, p110 and p150), ADAR2 and ADAR3 (catalytically inactive). The catalytic substrate of ADAR protein is double-stranded RNA, and ADAR can remove the -NH2 group from an adenosine (A) nucleobase, changing A to inosine (I). Inosine (I) is recognized as guanosine (G) and paired with cytidine (C) during subsequent cellular transcription and translation processes. To achieve targeted RNA editing, the ADAR protein or its catalytic domain was fused with a λN peptide, a SNAP-tag or a Cas protein (dCas13b), and a guide RNA was designed to recruit the chimeric ADAR protein to the target site. Alternatively, overexpressing ADAR1 or ADAR2 proteins together with an R/G motif-bearing guide RNA was also reported to enable targeted RNA editing.
However, currently available ADAR-mediated RNA editing technologies have certain limitations. For example, the most effective in vivo delivery for gene therapy is through viral vectors, but the highly desirable adeno-associated virus (AAV) vectors are limited in cargo size (~4.5 kb), making it challenging to accommodate both the protein and the guide RNA. Furthermore, over-expression of ADAR1 has recently been reported to confer oncogenicity in multiple myelomas due to aberrant hyper-editing on RNAs, and to generate substantial global off-target edits. In addition, ectopic expression of proteins or their domains of non-human origin carries a potential risk of eliciting immunogenicity. Moreover, pre-existing adaptive immunity and p53-mediated DNA damage response may compromise the efficacy of the therapeutic protein, such as Cas9.
The present application provides methods of RNA editing using ADAR-recruiting RNAs (“dRNA” or “arRNA”), including circular ADAR-recruiting RNAs (“circ-dRNA” or “circ-arRNA”), which are capable of leveraging endogenous Adenosine Deaminase Acting on RNA (“ADAR”) proteins for the RNA editing. Also provided herein are engineered dRNAs or constructs comprising a nucleic acid sequence encoding the engineered dRNAs used in these methods, and compositions and kits comprising the same. Further provided herein are methods for treating or preventing a disease or condition in an individual comprising editing a target RNA associated with the disease or condition in a cell of the individual.
In one aspect, provided herein is a method for editing a target adenosine in a target RNA in a host cell, comprising introducing a deaminase-recruiting RNA (dRNA) or a construct comprising a nucleic acid sequence encoding the dRNA into the host cell, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising a non-target adenosine in the target RNA; and (2) the dRNA is capable of recruiting an adenosine deaminase acting on RNA (ADAR). In some embodiments, the duplex RNA comprises a bulge at each non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA with the exception of one or more nucleotides opposite to non-target adenosines in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA except for lacking two or more consecutive nucleotides opposite to non-target adenosines in the target RNA. In some embodiments, the method comprises reducing level of editing of non-target adenosine in the target RNA.
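By way of non-limiting illustration only, the relationship between a targeting RNA sequence and the bulges described above can be sketched in Python. The sketch assumes a simple reading in which the targeting RNA is the reverse complement of the target RNA, carries a cytidine opposite the target adenosine, and simply lacks the nucleotide opposite each selected non-target adenosine. The function name `targeting_with_bulges` and the example sequence are hypothetical and are not taken from the present application.

```python
from typing import Iterable

# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def targeting_with_bulges(target: str, target_idx: int,
                          bulge_idxs: Iterable[int] = ()) -> str:
    """Return a targeting RNA (5'->3') for a target RNA given 5'->3'.

    Every target position is paired with its Watson-Crick partner, except:
    - a cytidine is placed opposite the target adenosine (A-C mismatch);
    - no nucleotide is placed opposite each index in `bulge_idxs`, so that
      non-target adenosine is left unpaired and forms a bulge.
    """
    bulges = set(bulge_idxs)
    if target[target_idx] != "A":
        raise ValueError("target_idx must point at an adenosine")
    for i in bulges:
        if i == target_idx or target[i] != "A":
            raise ValueError("bulge indices must be non-target adenosines")

    opposite = []
    for i, base in enumerate(target):
        if i in bulges:
            continue  # nucleotide deleted: nothing opposite this adenosine
        opposite.append("C" if i == target_idx else COMPLEMENT[base])
    # The two strands are antiparallel, so reverse to read 5'->3'.
    return "".join(reversed(opposite))


if __name__ == "__main__":
    # Hypothetical 7-nt target; index 3 is the target A, index 1 a non-target A.
    print(targeting_with_bulges("GAUAGCA", 3))
    print(targeting_with_bulges("GAUAGCA", 3, bulge_idxs=[1]))
```

In this sketch, each deleted position shortens the targeting RNA by one nucleotide relative to the fully paired design, which is what leaves the corresponding non-target adenosine without a partner.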
In some embodiments according to any one of the methods described above, the dRNA is a linear RNA. In some embodiments the dRNA is a linear RNA capable of forming a circular RNA. In some embodiments, the dRNA is a circular RNA.
In some embodiments according to any one of the methods described above, the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA.
In some embodiments according to any one of the methods described above, the dRNA comprises a linker nucleic acid sequence replacing an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA.
In another aspect, provided herein is a method for editing a target adenosine in a target RNA in a host cell, comprising introducing a dRNA or a construct comprising a nucleic acid sequence encoding the dRNA into the host cell, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA, wherein the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA; (2) the dRNA is capable of recruiting an ADAR; and (3) the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA. In some embodiments, the dRNA is a circular RNA. In some embodiments, the dRNA is a linear RNA capable of forming a circular RNA.
In some embodiments according to any one of the methods described above, wherein the dRNA comprises a linker nucleic acid sequence, the linker nucleic acid sequence is about 5 nucleotides (nt) to about 500 nt long. In some embodiments, the linker nucleic acid sequence is about 50 nt to about 500 nt long. In some embodiments, the linker nucleic acid sequence is less than or equal to 70 nt in length, optionally wherein the length of the linker nucleic acid sequence is any integer between 10 nt-50 nt, 10 nt-40 nt, 10 nt-30 nt, 10 nt-20 nt, 20 nt-50 nt, 20 nt-40 nt, 20 nt-30 nt, 30 nt-50 nt, 30 nt-40 nt or 40 nt-50 nt. In some embodiments, the linker nucleic acid sequence is about 20 nt to about 60 nt in length; optionally wherein the linker nucleic acid sequence is about 30 nt in length, or about 50 nt in length. In some embodiments, at least about any one of: 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the linker nucleic acid sequence comprises adenosine or cytidine; optionally wherein 100% of the linker nucleic acid sequence comprises adenosine or cytidine. In some embodiments, the linker nucleic acid sequence comprises a polyadenosine (polyA), polyguanosine (polyG), or polycytosine (polyC) sequence. In some embodiments, at least 50% of the linker nucleic acid sequence comprises adenosine. In some embodiments, the linker nucleic acid sequence comprises a dinucleotide repeat sequence. In some embodiments, the linker nucleic acid sequence comprises (AT)n, wherein n is an integer greater than or equal to 3. In some embodiments, the linker nucleic acid sequence comprises SEQ ID NO: 22.
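As a non-limiting sketch, the length and composition criteria recited above for a linker nucleic acid sequence can be checked programmatically. The Python below is illustrative only: the helper names are hypothetical, the "about 5 nt to about 500 nt" range is treated as hard bounds purely for simplicity, and the (AC) repeat used in the example is an arbitrary dinucleotide repeat rather than a sequence disclosed herein.

```python
def ac_fraction(linker: str) -> float:
    """Fraction of positions in the linker that are adenosine or cytidine."""
    linker = linker.upper()
    if not linker:
        raise ValueError("linker must not be empty")
    return sum(base in "AC" for base in linker) / len(linker)


def dinucleotide_repeat(unit: str, n: int) -> str:
    """Build a dinucleotide repeat (XY)n with n >= 3."""
    if len(unit) != 2:
        raise ValueError("unit must be exactly two nucleotides")
    if n < 3:
        raise ValueError("n must be an integer greater than or equal to 3")
    return unit * n


def check_linker(linker: str, min_len: int = 5, max_len: int = 500,
                 min_ac: float = 0.5) -> bool:
    """True if the linker meets the length and A/C-content thresholds.

    The defaults mirror the broadest ranges recited above; 'about' is
    simplified to exact bounds, which is an assumption of this sketch.
    """
    return min_len <= len(linker) <= max_len and ac_fraction(linker) >= min_ac


if __name__ == "__main__":
    linker = dinucleotide_repeat("AC", 15)  # hypothetical 30-nt linker
    print(len(linker), ac_fraction(linker), check_linker(linker))
```

Narrower embodiments (for example, 20 nt to 60 nt with at least 90% adenosine or cytidine) correspond to calling `check_linker` with tighter `min_len`, `max_len` and `min_ac` arguments.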
In some embodiments according to any one of the methods described above, wherein the dRNA comprises a linker nucleic acid sequence, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the first linker nucleic acid sequence is identical to the second linker nucleic acid sequence. In some embodiments, the first linker nucleic acid sequence is different from the second linker nucleic acid sequence. In some embodiments, the dRNA is a circular RNA, and the linker nucleic acid sequence connects the 5′ end of the targeting RNA sequence and the 3′ end of the targeting RNA sequence.
In some embodiments according to any one of the methods described above, wherein the dRNA is a circular RNA, the dRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the targeting RNA sequence, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the targeting RNA sequence.
In some embodiments according to any one of the methods described above, the dRNA further comprises a 3′ ligation sequence and a 5′ ligation sequence. In some embodiments, the 3′ ligation sequence and the 5′ ligation sequence are at least partially complementary to each other. In some embodiments, the 3′ ligation sequence and the 5′ ligation sequence are about 20 to about 75 nucleotides in length. In some embodiments, the ligation sequence is about 5 nucleotides (nt) to about 500 nt long. In some embodiments, the ligation sequence is less than or equal to 70 nt in length, optionally wherein the length of the ligation sequence is any integer between 10 nt-50 nt, 10 nt-40 nt, 10 nt-30 nt, 10 nt-20 nt, 20 nt-50 nt, 20 nt-40 nt, 20 nt-30 nt, 30 nt-50 nt, 30 nt-40 nt or 40 nt-50 nt. In some embodiments, the ligation sequence is about 20 nt to about 60 nt in length; optionally wherein the ligation sequence is about 30 nt in length, or about 50 nt in length. In some embodiments, at least about any one of: 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the ligation sequence comprises adenosine or cytidine; optionally wherein 100% of the ligation sequence comprises adenosine or cytidine. In some embodiments, the dRNA is circularized by RNA ligase RtcB. In some embodiments, the RNA ligase RtcB is expressed endogenously in the host cell. In some embodiments, the dRNA is circularized by T4 RNA ligase 1 (Rnl1) or RNA ligase 2 (Rnl2).
In some embodiments according to any one of the methods described above, the dRNA or the construct comprising a nucleic acid sequence encoding the dRNA edits the target adenosine in the target RNA in a dose dependent manner.
In some embodiments according to any one of the methods described above, the method comprises introducing a construct comprising a nucleic acid sequence encoding the dRNA into the host cell. In some embodiments, the construct further comprises a promoter operably linked to the nucleic acid sequence encoding the dRNA. In some embodiments, the promoter is a polymerase II promoter (“Pol II promoter”). In some embodiments, the promoter is a polymerase III promoter (“Pol III promoter”). In some embodiments, the construct is a viral vector or a plasmid. In some embodiments, the construct is an adeno-associated viral (AAV) vector. In some embodiments, the construct is a self-complementary AAV (scAAV).
In some embodiments according to any one of the methods described above, the ADAR is endogenously expressed by the host cell. In some embodiments, the host cell is a T cell.
In some embodiments according to any one of the methods described above, the targeting RNA sequence is about 100 to about 200 nt (e.g., about 150 nt or about 170 nt) long. In some embodiments, the targeting RNA sequence comprises a cytidine, adenosine or uridine directly opposite the target adenosine in the target RNA. In some embodiments, the targeting RNA sequence comprises a cytidine mismatch directly opposite the target adenosine in the target RNA. In some embodiments, the cytidine mismatch is located at least 20 nucleotides away from the 3′ end of the targeting RNA sequence, and at least 5 nucleotides away from the 5′ end of the targeting RNA sequence. In some embodiments, the 5′ nearest neighbor of the target adenosine in the target RNA is a nucleotide selected from U, C, A and G with the preference U>C≈A>G and the 3′ nearest neighbor of the target adenosine in the target RNA is a nucleotide selected from G, C, A and U with the preference G>C>A≈U. In some embodiments, the target adenosine is in a three-base motif of UAG, and wherein the targeting RNA comprises an A directly opposite the uridine in the three-base motif, a cytidine directly opposite the target adenosine, and a cytidine, guanosine or uridine directly opposite the guanosine in the three-base motif.
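The nearest-neighbor preferences and the placement of the cytidine mismatch recited above can be expressed, by way of illustration only, as a small Python sketch. The rank tables encode the stated orders (U>C≈A>G for the 5′ neighbor; G>C>A≈U for the 3′ neighbor), with a lower rank meaning more preferred. Counting "nucleotides away from" an end as the number of nucleotides lying between the mismatch and that end is one possible reading and is an assumption of this sketch, as are all function names.

```python
from typing import Tuple

# Lower rank = more preferred neighbor of the target adenosine.
FIVE_PRIME_RANK = {"U": 0, "C": 1, "A": 1, "G": 2}   # U > C ~ A > G
THREE_PRIME_RANK = {"G": 0, "C": 1, "A": 2, "U": 2}  # G > C > A ~ U


def context_rank(target: str, idx: int) -> Tuple[int, int]:
    """Return (5' neighbor rank, 3' neighbor rank) for the adenosine at idx."""
    if target[idx] != "A":
        raise ValueError("idx must point at an adenosine")
    if idx == 0 or idx == len(target) - 1:
        raise ValueError("target adenosine needs a neighbor on both sides")
    return FIVE_PRIME_RANK[target[idx - 1]], THREE_PRIME_RANK[target[idx + 1]]


def mismatch_position_ok(targeting: str, mismatch_idx: int,
                         min_from_5p: int = 5, min_from_3p: int = 20) -> bool:
    """Check that the cytidine mismatch sits far enough from both ends.

    `targeting` is read 5'->3'; the distances are counted as the number of
    nucleotides on each side of the mismatch (an assumed interpretation).
    """
    if targeting[mismatch_idx] != "C":
        return False
    nt_on_5p_side = mismatch_idx
    nt_on_3p_side = len(targeting) - 1 - mismatch_idx
    return nt_on_5p_side >= min_from_5p and nt_on_3p_side >= min_from_3p


if __name__ == "__main__":
    print(context_rank("UAG", 1))  # most preferred context under both tables
    print(context_rank("GAU", 1))  # least preferred context under both tables
```

Under these tables, the UAG motif receives the best rank on both sides, which is consistent with its being singled out above as a three-base motif.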
In some embodiments according to any one of the methods described above, the target RNA is an RNA selected from the group consisting of a pre-messenger RNA, a messenger RNA, a ribosomal RNA, a transfer RNA, a long non-coding RNA and a small RNA. In some embodiments, the target RNA is a pre-messenger RNA.
In some embodiments according to any one of the methods described above, the method further comprises introducing an inhibitor of ADAR3 and/or a stimulator of interferon into the host cell.
In some embodiments according to any one of the methods described above, the method comprises introducing a plurality of dRNAs or constructs each targeting a different target RNA into the host cell.
In some embodiments according to any one of the methods described above, the efficiency of editing the target RNA is at least 40%.
In some embodiments according to any one of the methods described above, the method further comprises introducing an ADAR into the host cell.
In some embodiments according to any one of the methods described above, deamination of the target adenosine in the target RNA results in a missense mutation, an early stop codon, aberrant splicing, or alternative splicing in the target RNA, or reversal of a missense mutation, an early stop codon, aberrant splicing, or alternative splicing in the target RNA. In some embodiments, deamination of the target adenosine in the target RNA results in point mutation, truncation, elongation and/or misfolding of the protein encoded by the target RNA, or a functional, full-length, correctly-folded and/or wild-type protein by reversal of a missense mutation, an early stop codon, aberrant splicing, or alternative splicing in the target RNA.
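To illustrate how deamination of a single adenosine can create or reverse a missense mutation or an early stop codon, the following non-limiting Python sketch applies the standard genetic code to a codon before and after editing. It relies on the statement above that inosine is recognized as guanosine, so an edited A is modeled as G. The function name `edit_codon` and the example codons are hypothetical choices for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ("*" = stop).
BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def edit_codon(codon: str, pos: int) -> str:
    """Model A-to-I editing at `pos`, with inosine read as guanosine."""
    if codon[pos] != "A":
        raise ValueError("only adenosine can be deaminated to inosine")
    return codon[:pos] + "G" + codon[pos + 1:]


if __name__ == "__main__":
    for codon, pos in [("UAG", 1), ("AAA", 0), ("CAG", 1)]:
        edited = edit_codon(codon, pos)
        print(codon, CODON_TABLE[codon], "->", edited, CODON_TABLE[edited])
```

For example, a UGG (tryptophan) codon that has become UAG (stop) through a G to A mutation is read as UGG again once the adenosine is edited, which corresponds to reversal of an early stop codon.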
In some embodiments according to any one of the methods described above, the host cell is a eukaryotic cell. In some embodiments, the host cell is a mammalian cell. In some embodiments, the host cell is a human or mouse cell.
In another aspect, provided herein is an edited RNA or a host cell having an edited RNA produced by any one of the methods presented above.
In another aspect, provided herein is a method for treating or preventing a disease or condition in an individual, comprising editing a target RNA associated with the disease or condition in a cell of the individual according to the method of any one of the methods described above. In some embodiments, the disease or condition is a hereditary genetic disease or a disease or condition associated with one or more acquired genetic mutations. In some embodiments, the disease or condition is a monogenetic or a polygenetic disease or condition. In some embodiments, the target RNA has a G to A mutation.
In some embodiments according to any one of the methods described above, the target RNA is TP53, and the disease or condition is cancer. In some embodiments, the target RNA is IDUA, and the disease or condition is Mucopolysaccharidosis type I (MPS I). In some embodiments, the target RNA is COL3A1, and the disease or condition is Ehlers-Danlos syndrome. In some embodiments, the target RNA is BMPR2, and the disease or condition is Joubert syndrome. In some embodiments, the target RNA is FANCC, and the disease or condition is Fanconi anemia. In some embodiments, the target RNA is MYBPC3, and the disease or condition is primary familial hypertrophic cardiomyopathy. In some embodiments, the target RNA is IL2RG, and the disease or condition is X-linked severe combined immunodeficiency. In some embodiments, the target RNA is MALAT1, and the disease or condition is Hyperglycemia. In some embodiments, the target RNA is RAB7A, and the disease or condition is Charcot-Marie-Tooth disease 2B (CMT2B).
Also provided are compositions, kits and articles of manufacture for use in any one of the methods described above.
In another aspect, provided herein is a dRNA for editing a target RNA comprising a targeting RNA sequence that is capable of hybridizing to the target RNA to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising a non-target adenosine in the target RNA. In some embodiments, the duplex RNA comprises a bulge at each non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA with the exception of one or more nucleotides opposite to non-target adenosines in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA with the exception of two or more consecutive nucleotides opposite to non-target adenosines in the target RNA. In some embodiments, the method comprises reducing level of editing of non-target adenosine in the target RNA.
In some embodiments according to any one of the dRNAs described above, the dRNA is a linear RNA. In some embodiments the dRNA is a linear RNA capable of forming a circular RNA. In some embodiments, the dRNA is a circular RNA.
In some embodiments according to any one of the dRNAs described above, the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA. In some embodiments, the dRNA comprises a linker nucleic acid sequence replacing an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA.
In another aspect presented herein is a dRNA for editing a target RNA comprising a targeting RNA sequence that is capable of hybridizing to the target RNA, wherein the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA, and wherein the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA. In some embodiments, the dRNA is a circular RNA. In some embodiments, the dRNA is a linear RNA capable of forming a circular RNA.
In an additional aspect presented herein is a dRNA for editing a target RNA comprising a targeting RNA sequence that is capable of hybridizing to the target RNA, wherein the dRNA comprises a linker nucleic acid sequence replacing an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA, and wherein the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA. In some embodiments, the dRNA is a circular RNA. In some embodiments, the dRNA is a linear RNA capable of forming a circular RNA.
In some embodiments according to any one of the dRNAs described above, wherein the dRNA comprises a linker nucleic acid sequence, the linker nucleic acid sequence is about 5 nucleotides (nt) to about 500 nt long. In some embodiments, the linker nucleic acid sequence is about 50 nt to about 500 nt long. In some embodiments, the linker nucleic acid sequence is less than or equal to 70 nt in length, optionally wherein the length of the linker nucleic acid sequence is any integer between 10 nt-50 nt, 10 nt-40 nt, 10 nt-30 nt, 10 nt-20 nt, 20 nt-50 nt, 20 nt-40 nt, 20 nt-30 nt, 30 nt-50 nt, 30 nt-40 nt or 40 nt-50 nt. In some embodiments, the linker nucleic acid sequence is about 20 nt to about 60 nt in length; optionally wherein the linker nucleic acid sequence is about 30 nt in length, or about 50 nt in length. In some embodiments, at least about any one of: 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the linker nucleic acid sequence comprises adenosine or cytidine; optionally wherein 100% of the linker nucleic acid sequence comprises adenosine or cytidine. In some embodiments, the linker nucleic acid sequence comprises a polyadenosine (polyA), polyguanosine (polyG), or polycytosine (polyC) sequence. In some embodiments, at least 50% of the linker nucleic acid sequence comprises adenosine. In some embodiments, the linker nucleic acid sequence comprises a dinucleotide repeat sequence. In some embodiments, the linker nucleic acid sequence comprises (AT)n, wherein n is an integer greater than or equal to 3.
In some embodiments according to any one of the dRNAs described above, wherein the dRNA comprises a linker nucleic acid sequence, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the first linker nucleic acid sequence is identical to the second linker nucleic acid sequence. In some embodiments, the first linker nucleic acid sequence is different from the second linker nucleic acid sequence. In some embodiments, the dRNA is a circular RNA, and the linker nucleic acid sequence connects the 5′ end of the targeting RNA sequence and the 3′ end of the targeting RNA sequence.
In some embodiments according to any one of the dRNAs described above, wherein the dRNA is a circular RNA, the dRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the targeting RNA sequence, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the targeting RNA sequence.
In some embodiments according to any one of the dRNAs described above, the dRNA further comprises a 3′ ligation sequence and a 5′ ligation sequence. In some embodiments, the 3′ ligation sequence and the 5′ ligation sequence are at least partially complementary to each other. In some embodiments, the 3′ ligation sequence and the 5′ ligation sequence are about 20 to about 75 nucleotides in length. In some embodiments, the dRNA is circularized by RNA ligase RtcB. In some embodiments, the RNA ligase RtcB is expressed endogenously in the host cell. In some embodiments, the dRNA is circularized by T4 RNA ligase 1 (Rnl1) or RNA ligase 2 (Rnl2).
In some embodiments according to any one of the dRNAs described above, the targeting RNA sequence is about 100 to about 200 nt (e.g., about 150 nt or about 170 nt) long. In some embodiments, the targeting RNA sequence comprises a cytidine, adenosine or uridine directly opposite the target adenosine in the target RNA. In some embodiments, the targeting RNA sequence comprises a cytidine mismatch directly opposite the target adenosine in the target RNA. In some embodiments, the cytidine mismatch is located at least 20 nucleotides away from the 3′ end of the targeting RNA sequence, and at least 5 nucleotides away from the 5′ end of the targeting RNA sequence. In some embodiments, the 5′ nearest neighbor of the target adenosine in the target RNA is a nucleotide selected from U, C, A and G with the preference U>C≈A>G and the 3′ nearest neighbor of the target adenosine in the target RNA is a nucleotide selected from G, C, A and U with the preference G>C>A≈U. In some embodiments, the target adenosine is in a three-base motif of UAG, and wherein the targeting RNA comprises an A directly opposite the uridine in the three-base motif, a cytidine directly opposite the target adenosine, and a cytidine, guanosine or uridine directly opposite the guanosine in the three-base motif.
In some embodiments according to any one of the dRNAs described above, the target RNA is an RNA selected from the group consisting of a pre-messenger RNA, a messenger RNA, a ribosomal RNA, a transfer RNA, a long non-coding RNA and a small RNA. In some embodiments, the target RNA is a pre-messenger RNA.
In some embodiments, there is provided a construct comprising a nucleic acid sequence encoding any one of the dRNAs described above. In some embodiments, the construct further comprises a promoter operably linked to the nucleic acid sequence encoding the dRNA, wherein the promoter is a Pol III promoter. In some embodiments, the construct is a viral vector or a plasmid. In some embodiments, the construct is an adeno-associated viral (AAV) vector. In some embodiments, the construct is a self-complementary AAV (scAAV).
In some embodiments, there is provided a host cell comprising any one of the constructs or dRNAs described above. In some embodiments, there is provided a kit comprising any one of the constructs or dRNAs described above, wherein the kit further comprises instructions for editing a target RNA in a host cell.
It is to be understood that one, some, or all of the properties of the various embodiments described herein may be combined to form other embodiments of the present application. These and other embodiments of the present application are further described by the detailed description that follows.
The present application provides improved RNA editing methods and specially designed RNAs, referred to herein as deaminase-recruiting RNAs (“dRNAs”) or ADAR-recruiting RNAs (“arRNAs”), or constructs comprising nucleic acids encoding these arRNAs, to edit target RNAs in a host cell.
“LEAPER” (Leveraging Endogenous ADAR for Programmable Editing on RNA), which leverages endogenous ADAR to edit target RNA by utilizing dRNAs, was previously developed by the inventors of the present application. The LEAPER method was described in WO2021/008447 and PCT/CN2021/071292, which are incorporated herein by reference in their entirety. Specifically, a targeting RNA that is partially complementary to the target transcript was used to recruit native ADAR1 or ADAR2 to change adenosine to inosine at a specific site in a target RNA. As such, RNA editing can be achieved in certain systems without ectopic or overexpression of the ADAR proteins in the host cell.
The present application provides improved LEAPER methods that allow for increased editing efficiency, decreased off-target effects (also referred to herein as “bystander editing”), and/or more precise and long-lasting RNA editing. In some embodiments, the dRNA is linear or circular, wherein the dRNA contains a deletion of one or more uridines that would otherwise base-pair with non-target adenosines in the target RNA. In some embodiments, the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA, wherein the circular RNA comprises one or more linker sequences flanking the targeting RNA sequence that is partially complementary to the target RNA. The linker sequence(s) may enhance on-target editing efficiency for a circular dRNA having a targeting RNA sequence that is prone to form secondary structure(s). The circular dRNAs described herein may provide enhanced stability and efficacy compared to linear dRNAs. The methods described herein have been successfully used to correct pathogenic point mutations. The improved LEAPER methods may provide broad applicability for both therapeutics and biomedical research.
Accordingly, one aspect of the present application provides a method for editing a target adenosine in a target RNA in a host cell, comprising introducing a deaminase-recruiting RNA (dRNA) or a construct comprising a nucleic acid sequence encoding the dRNA into the host cell, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising a non-target adenosine in the target RNA; and (2) the dRNA is capable of recruiting an adenosine deaminase acting on RNA (ADAR).
Another aspect of the present application provides a method for editing a target adenosine in a target RNA in a host cell, comprising introducing a dRNA or a construct comprising a nucleic acid sequence encoding the dRNA into the host cell, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA, wherein the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA; (2) the dRNA is capable of recruiting an ADAR; and (3) the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA.
A further aspect of the present application provides a method for editing a target adenosine in a target RNA in a host cell, comprising introducing a dRNA or a construct comprising a nucleic acid sequence encoding the dRNA into the host cell, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA, wherein the dRNA comprises a linker nucleic acid sequence replacing an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA; (2) the dRNA is capable of recruiting an ADAR; and (3) the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art to which this disclosure belongs. All patents, applications, published applications and other publications referred to herein are incorporated by reference in their entireties. If a definition set forth in this section is contrary to or otherwise inconsistent with a definition set forth in a patent, application, or other publication that is herein incorporated by reference, the definition set forth in this section prevails over the definition incorporated herein by reference.
It is appreciated that certain features of the disclosure, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the disclosure, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination. All combinations of the embodiments pertaining to particular method steps, reagents, or conditions are specifically embraced by the present disclosure and are disclosed herein just as if each and every combination was individually and explicitly disclosed.
As used herein, the term “bulge” refers to an asymmetric bubble region in a nucleic acid duplex formed due to one or more unpaired nucleotides (e.g., a non-target adenosine) in one strand of the nucleic acid duplex. A bulge described herein may have a fully unpaired region in one strand that does not have any corresponding complementary region in the opposite strand. Alternatively, a bulge described herein may be formed by two non-complementary regions (one in each strand) having different number of nucleotides, which may further contain mismatched nucleotides that do not form Watson-Crick base pairs. The longer of the two non-complementary regions has at least one nucleotide (e.g., a non-target adenosine) that is not paired with any nucleotides in the non-complementary region of the opposite strand, i.e., the opposite strand comprises nucleic acid sequence(s) complementary to nucleic acid sequence(s) flanking the bulge but the opposite strand does not contain at least one nucleotide opposite a nucleotide (e.g., a non-target adenosine) in the bulge. A “bulge” described herein does not encompass a fully mispaired region of nucleotides located within one strand of a nucleic acid duplex, i.e., the opposite strand contains a nucleotide that is non-complementary for each of the nucleotides in the bulge, which results in a symmetric bubble in the nucleic acid duplex. In some examples, the bulge contain 1, 2, 3, 4, 5, or greater than 5 nucleotides in the strand having the unpaired nucleotides. For example, the duplex shown in the schematic of
When a first nucleic acid strand and a second nucleic acid strand form a double-stranded nucleic acid region, a first nucleoside in the first nucleic acid strand and a second nucleoside in the second nucleic acid strand that base-pair with each other are described herein as being “opposite” with respect to each other, or “corresponding” to each other, i.e., the first nucleoside is opposite to the second nucleoside, and the second nucleoside is opposite to the first nucleoside.
The terms “polynucleotide,” “nucleic acid,” “nucleotide sequence,” and “nucleic acid sequence” are used interchangeably. They refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. It will be understood by one of ordinary skill in the art that uracil and thymine can both be represented by ‘t’, instead of ‘u’ for uracil and ‘t’ for thymine; in the context of a ribonucleic acid (RNA), it will be understood that ‘t’ is used to represent uracil unless otherwise indicated.
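By way of non-limiting illustration, the notation convention above can be sketched in Python. The function name `as_rna` is hypothetical and is not part of the disclosure; it simply rewrites a sequence listing in which ‘t’ stands for uracil into explicit RNA letters.

```python
def as_rna(sequence: str) -> str:
    """Rewrite a sequence written with 't'/'T' for uracil into RNA letters.

    Hypothetical helper for illustration only: it assumes the input is an
    RNA sequence in which every 't' represents uracil.
    """
    return sequence.replace("t", "u").replace("T", "U")


print(as_rna("atgcat"))
```

The sketch leaves all other characters untouched, so any modified-nucleotide notation in a listing would pass through unchanged.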
The terms “deaminase-recruiting RNA,” “dRNA,” “ADAR-recruiting RNA” and “arRNA” are used herein interchangeably to refer to an engineered RNA capable of recruiting an ADAR to deaminate a target adenosine in an RNA.
The terms “Group I intron” and “Group I catalytic intron” are used interchangeably to refer to a self-splicing ribozyme that can catalyze its own excision from an RNA precursor. Group I introns comprise two fragments, the 5′ catalytic Group I intron fragment and the 3′ catalytic Group I intron fragment, which retain their folding and catalytic function (i.e., self-splicing activity). In its native environment, the 5′ catalytic Group I intron fragment is flanked at its 5′ end by a 5′ exon, which comprises a 5′ exon sequence that is recognized by the 5′ catalytic Group I intron fragment; and the 3′ catalytic Group I intron fragment is flanked at its 3′ end by a 3′ exon, which comprises a 3′ exon sequence that is recognized by the 3′ catalytic Group I intron fragment. The terms “5′ exon sequence” and “3′ exon sequence” used herein are labeled according to the order of the exons with respect to the Group I intron in its natural environment.
The terms “adenine,” “guanine,” “cytosine,” “thymine,” “uracil” and “hypoxanthine” as used herein refer to the nucleobases as such. The terms “adenosine,” “guanosine,” “cytidine,” “thymidine,” “uridine” and “inosine,” refer to the nucleobases linked to the ribose or deoxyribose sugar moiety. The term “nucleoside” refers to the nucleobase linked to the ribose or deoxyribose. The term “nucleotide” refers to the respective nucleobase-ribosyl-phosphate or nucleobase-deoxyribosyl-phosphate. Sometimes the terms adenosine and adenine (with the abbreviation, “A”), guanosine and guanine (with the abbreviation, “G”), cytidine and cytosine (with the abbreviation, “C”), uridine and uracil (with the abbreviation, “U”), thymidine and thymine (with the abbreviation, “T”), inosine and hypoxanthine (with the abbreviation, “I”), are used interchangeably to refer to the corresponding nucleobase, nucleoside or nucleotide. Sometimes the terms nucleobase, nucleoside and nucleotide are used interchangeably, unless the context clearly requires differently.
The term “functional protein” refers to a naturally-occurring protein, functional variants thereof, or an engineered derivative thereof that is functional in treating a genetic disease or condition. The disease or condition may be caused in whole or in part by a change, such as a mutation, in the wildtype, naturally-occurring protein corresponding to the functional protein.
The term “functional variant” of a reference protein refers to a variant polypeptide derived from the reference protein or a portion thereof, and the variant has substantially the same activity (e.g., binding to a target or enzymatic activity) as the reference protein. “Substantially the same activity” means an activity level that is at least about any one of 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more of the activity of the reference protein.
The present disclosure provides several types of compositions that are polynucleotide or polypeptide based, including variants and derivatives. These include, for example, substitutional, insertional, deletion and covalent variants and derivatives. The term “derivative” is synonymous with the term “variant” and generally refers to a molecule that has been modified and/or changed in any way relative to a reference molecule or a starting molecule.
The term “introducing” or “introduction” used herein means delivering one or more polynucleotides, such as dRNAs or one or more constructs including vectors as described herein, or one or more transcripts thereof, to a host cell. The invention serves as a basic platform for enabling targeted editing of RNA, for example, a pre-messenger RNA, a messenger RNA, a ribosomal RNA, a transfer RNA, a long non-coding RNA and a small RNA (such as miRNA). The methods of the present application can employ many delivery systems, including but not limited to, viral, liposome, electroporation, microinjection and conjugation, to achieve the introduction of the dRNA or construct as described herein into a host cell. Conventional viral and non-viral based gene transfer methods can be used to introduce nucleic acids into mammalian cells or target tissues. Such methods can be used to administer nucleic acids encoding dRNA of the present application to cells in culture, or in a host organism. Non-viral vector delivery systems include DNA plasmids, RNA (e.g., a transcript of a construct described herein), naked nucleic acid, and nucleic acid complexed with a delivery vehicle, such as a liposome. Viral vector delivery systems include DNA and RNA viruses, which have either episomal or integrated genomes after delivery to the host cell.
In the context of the present application, “target RNA” refers to an RNA sequence to which a deaminase-recruiting RNA sequence is designed to have perfect complementarity or substantial complementarity, and hybridization between the target sequence and the dRNA forms a double stranded RNA (dsRNA) region containing a target adenosine, which recruits an adenosine deaminase acting on RNA (ADAR) that deaminates the target adenosine. In some embodiments, the ADAR is naturally present in a host cell, such as a eukaryotic cell (such as a mammalian cell, e.g., a human cell). In some embodiments, the ADAR is introduced into the host cell.
As used herein, “operably linked,” when referring to a first nucleic acid sequence that is operably linked with a second nucleic acid sequence, means a situation when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter effects the transcription of the coding sequence. Likewise, the coding sequence of a signal peptide is operably linked to the coding sequence of a polypeptide if the signal peptide effects the extracellular secretion of that polypeptide. Generally, operably linked nucleic acid sequences are contiguous and, where necessary to join two protein coding regions, the open reading frames are aligned.
As used herein, “connect” refers to linking of nucleic acid sequences, either directly or indirectly, for example, via an intervening nucleic acid sequence.
As used herein, “complementarity” refers to the ability of a nucleic acid to form hydrogen bond(s) with another nucleic acid by traditional Watson-Crick base-pairing. A percent complementarity indicates the percentage of residues in a nucleic acid molecule which can form hydrogen bonds (i.e., Watson-Crick base pairing) with a second nucleic acid (e.g., about 5, 6, 7, 8, 9, 10 out of 10, being about 50%, 60%, 70%, 80%, 90%, and 100% complementary respectively). “Perfectly complementary” means that all the contiguous residues of a nucleic acid sequence form hydrogen bonds with the same number of contiguous residues in a second nucleic acid sequence. “Substantially complementary” as used herein refers to a degree of complementarity that is at least about any one of 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of about 40, 50, 60, 70, 80, 100, 150, 200, 250 or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
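By way of non-limiting illustration, the percent complementarity described above can be computed as in the following Python sketch. The function name `percent_complementarity` and the convention that the two strands are supplied already aligned in antiparallel orientation and at equal length are assumptions made for this example only; only Watson-Crick pairs are counted, consistent with the definition above.

```python
# Watson-Crick base pairs for RNA (lower case); 't' is treated as 'u'.
WATSON_CRICK = {("a", "u"), ("u", "a"), ("g", "c"), ("c", "g")}


def percent_complementarity(first: str, second: str) -> float:
    """Percentage of residues in `first` that Watson-Crick pair with `second`.

    Assumption for illustration: `first` is written 5'->3' and `second` is
    written 3'->5', so position i of one strand lies opposite position i of
    the other, and both strands have the same length.
    """
    a = first.lower().replace("t", "u")
    b = second.lower().replace("t", "u")
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum((x, y) in WATSON_CRICK for x, y in zip(a, b))
    return 100.0 * paired / len(a)


# 10 of 10 residues paired, then 7 of 10 residues paired.
print(percent_complementarity("AUGCAUGCAU", "UACGUACGUA"))
print(percent_complementarity("AUGCAUGCAU", "UACGUACCAU"))
```

The two calls correspond to the “10 out of 10” and “7 out of 10” cases mentioned above. The sketch does not model hybridization under stringent conditions, which is the alternative basis for “substantially complementary”.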
As used herein, “stringent conditions” for hybridization refer to conditions under which a nucleic acid having complementarity to a target sequence predominantly hybridizes with the target sequence, and substantially does not hybridize to non-target sequences. Stringent conditions are generally sequence-dependent, and vary depending on a number of factors. In general, the longer the sequence, the higher the temperature at which the sequence specifically hybridizes to its target sequence. Non-limiting examples of stringent conditions are described in detail in Tijssen (1993), Laboratory Techniques In Biochemistry And Molecular Biology—Hybridization With Nucleic Acid Probes Part I, Second Chapter “Overview of principles of hybridization and the strategy of nucleic acid probe assay,” Elsevier, N.Y.
“Hybridization” or “hybridize” refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. A sequence capable of hybridizing with a given sequence is referred to as the “complement” of the given sequence.
A “subject,” “patient,” or “individual” includes a mammal, such as a human or other animal, and typically is human. In some embodiments, the subject, e.g., patient, to whom the therapeutic agents and compositions are administered, is a mammal, typically a primate, such as a human. In some embodiments, the primate is a monkey or an ape. The subject can be male or female and can be any suitable age, including infant, juvenile, adolescent, adult, and geriatric subjects. In some embodiments, the subject is a non-primate mammal, such as a rodent, a dog, a cat, a farm animal, such as a cow or a horse, etc.
As used herein, the term “treatment” refers to clinical intervention designed to have beneficial and desired effects to the natural course of the individual or cell being treated during the course of clinical pathology. For the purpose of this disclosure, desirable effects of treatment include, without limitation, decreasing the rate of disease progression, ameliorating or palliating the disease state, and remission or improved prognosis. For example, an individual is successfully “treated” if one or more symptoms associated with cancer are mitigated or eliminated, including, but not limited to, reducing the proliferation of (or destroying) cancerous cells, increasing cancer cell-killing, decreasing symptoms resulting from the disease, preventing spread of diseases, preventing recurrence of disease, increasing the quality of life of those suffering from the disease, decreasing the dose of other medications required to treat the disease, delaying the progression of the disease, and/or prolonging survival of individuals.
As used herein, the term “effective amount” or “therapeutically effective amount” of a substance is at least the minimum concentration required to effect a measurable improvement or prevention of a particular disorder. An effective amount herein may vary according to factors such as the disease state, age, sex, and weight of the patient, and the ability of the substance to elicit a desired response in the individual. An effective amount is also one in which any toxic or detrimental effects of the treatment are outweighed by the therapeutically beneficial effects. In reference to cancer, an effective amount comprises an amount sufficient to cause a tumor to shrink and/or to decrease the growth rate of the tumor (such as to suppress tumor growth) or to prevent or delay other unwanted cell proliferation in cancer. In some embodiments, an effective amount is an amount sufficient to delay development of cancer. In some embodiments, an effective amount is an amount sufficient to prevent or delay recurrence. In some embodiments, an effective amount is an amount sufficient to reduce recurrence rate in the individual. An effective amount can be administered in one or more administrations. The effective amount of the drug or composition may: (i) reduce the number of cancer cells; (ii) reduce tumor size; (iii) inhibit, retard, slow to some extent and preferably stop cancer cell infiltration into peripheral organs; (iv) inhibit (i.e., slow to some extent and preferably stop) tumor metastasis; (v) inhibit tumor growth; (vi) prevent or delay occurrence and/or recurrence of tumor; (vii) reduce recurrence rate of tumor; and/or (viii) relieve to some extent one or more of the symptoms associated with the cancer.
For purposes of this disclosure, an effective amount of drug, compound, or pharmaceutical composition is an amount sufficient to accomplish prophylactic or therapeutic treatment either directly or indirectly. As is understood in the clinical context, an effective amount of a drug, compound, or pharmaceutical composition may or may not be achieved in conjunction with another drug, compound, or pharmaceutical composition. Thus, an “effective amount” may be considered in the context of administering one or more therapeutic agents, and a single agent may be considered to be given in an effective amount if, in conjunction with one or more other agents, a desirable result may be or is achieved.
As used herein the term “wild type” is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene or characteristic as it occurs in nature as distinguished from mutant or variant forms.
A “host cell” as described herein refers to any cell type that can be used as a host cell provided it can be modified as described herein. For example, the host cell may be a host cell with endogenously expressed adenosine deaminase acting on RNA (ADAR), or may be a host cell into which an adenosine deaminase acting on RNA (ADAR) is introduced by a known method in the art. For example, the host cell may be a prokaryotic cell, a eukaryotic cell or a plant cell. In some embodiments, the host cell is derived from a pre-established cell line, such as mammalian cell lines including human cell lines or non-human cell lines. In some embodiments, the host cell is derived from an individual, such as a human individual.
A “recombinant AAV vector (rAAV vector)” refers to a polynucleotide vector comprising one or more heterologous sequences (i.e., nucleic acid sequence not of AAV origin) that are flanked by at least one, and in embodiments two, AAV inverted terminal repeat sequences (ITRs). Such rAAV vectors can be replicated and packaged into infectious viral particles when present in a host cell that has been infected with a suitable helper virus (or that is expressing suitable helper functions) and that is expressing AAV rep and cap gene products (i.e. AAV Rep and Cap proteins). When a rAAV vector is incorporated into a larger polynucleotide (e.g., in a chromosome or in another vector such as a plasmid used for cloning or transfection), then the rAAV vector may be referred to as a “pro-vector” which can be “rescued” by replication and encapsidation in the presence of AAV packaging functions and suitable helper functions. A rAAV vector can be in any of a number of forms, including, but not limited to, plasmids, linear artificial chromosomes, complexed with lipids, encapsulated within liposomes, and encapsidated in a viral particle, particularly an AAV particle. A rAAV vector can be packaged into an AAV virus capsid to generate a “recombinant adeno-associated viral particle (rAAV particle)”.
An “AAV inverted terminal repeat (ITR)” sequence, a term well-understood in the art, is an approximately 145-nucleotide sequence that is present at both termini of the native single-stranded AAV genome. The outermost 125 nucleotides of the ITR can be present in either of two alternative orientations, leading to heterogeneity between different AAV genomes and between the two ends of a single AAV genome. The outermost 125 nucleotides also contain several shorter regions of self-complementarity (designated A, A′, B, B′, C, C′ and D regions), allowing intrastrand base-pairing to occur within this portion of the ITR.
As such, polynucleotides encoding peptides or polypeptides containing substitutions, insertions and/or additions, deletions and covalent modifications with respect to reference sequences, in particular, the polypeptide sequences disclosed herein, are included within the scope of this disclosure. For example, sequence tags or amino acids, such as one or more lysines, can be added to peptide sequences (e.g., at the N-terminal or C-terminal ends). Sequence tags can be used for peptide detection, purification or localization. Lysines can be used to increase peptide solubility or to allow for biotinylation. Alternatively, amino acid residues located at the carboxy and amino terminal regions of the amino acid sequence of a peptide or protein may optionally be deleted providing for truncated sequences. Certain amino acids (e.g., C-terminal residues or N-terminal residues) alternatively may be deleted depending on the use of the sequence, as for example, expression of the sequence as part of a larger sequence that is soluble, or linked to a solid support.
The terms “non-naturally occurring” or “engineered” are used interchangeably and indicate the involvement of the hand of man. The terms, when referring to nucleic acid molecules or polypeptides mean that the nucleic acid molecule or the polypeptide is at least substantially free from at least one other component with which they are naturally associated in nature and as found in nature.
As used herein, “expression” refers to the process by which a polynucleotide is transcribed from a DNA template (such as into an mRNA or other RNA transcript) and/or the process by which a transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins. Transcripts and encoded polypeptides may be collectively referred to as “gene product.” If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell.
As used herein, a “carrier” includes pharmaceutically acceptable carriers, excipients, or stabilizers that are nontoxic to the cell or mammal being exposed thereto at the dosages and concentrations employed. Often the physiologically acceptable carrier is an aqueous pH buffered solution. Non-limiting examples of physiologically acceptable carriers include buffers such as phosphate, citrate, and other organic acids; antioxidants including ascorbic acid; low molecular weight (less than about 10 residues) polypeptide; proteins, such as serum albumin, gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine, arginine or lysine; monosaccharides, disaccharides, and other carbohydrates including glucose, mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or sorbitol; salt-forming counterions such as sodium; and/or nonionic surfactants such as TWEEN™, polyethylene glycol (PEG), and PLURONICS™.
The term “package insert” is used to refer to instructions customarily included in commercial packages of therapeutic products, that contain information about the indications, usage, dosage, administration, combination therapy, contraindications and/or warnings concerning the use of such therapeutic products.
An “article of manufacture” is any manufacture (e.g., a package or container) or kit comprising at least one reagent, e.g., a medicament for treatment of a disease or condition. In some embodiments, the manufacture or kit is promoted, distributed, or sold as a unit for performing the methods described herein.
As used herein, the terms “including,” “containing,” and “comprising” are used in their open, non-limiting sense. It is also understood that aspects and embodiments of the present application described herein may include “consisting” and/or “consisting essentially of” aspects and embodiments.
It is understood that, whether the term “about” is used explicitly or not, every quantity given herein is meant to refer to the actual given value, and it is also meant to refer to the approximation to such given value that would reasonably be inferred based on the ordinary skill in the art, including equivalents and approximations due to the experimental and/or measurement conditions for such given value. Reference to “about” a value or parameter herein includes (and describes) variations that are directed to that value or parameter per se. For example, description referring to “about X” includes description of “X”.
As used herein, reference to “not” a value or parameter generally means and describes “other than” a value or parameter. For example, the method is not used to treat disease of type X means the method is used to treat disease of types other than X.
The term “about X-Y” used herein has the same meaning as “about X to about Y.”
As used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements, or use of a “negative” limitation.
The term “and/or” as used herein in a phrase such as “A and/or B” is intended to include both A and B; A or B; A (alone); and B (alone). Likewise, the term “and/or” as used herein in a phrase such as “A, B, and/or C” is intended to encompass each of the following embodiments: A, B, and C; A, B, or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone).
Provided herein are methods for editing a target RNA in a host cell using a deaminase-recruiting RNA (dRNA) comprising a targeting RNA sequence having deletion of one or more nucleosides opposite a region comprising a non-target adenosine in a target RNA and/or comprising one or more linker nucleic acid sequences flanking the 5′ and/or 3′ end of the targeting RNA sequence, wherein the dRNA is capable of recruiting an adenosine deaminase acting on RNA (ADAR). The dRNA may be any one of the dRNAs described in section III (“dRNAs, constructs and libraries”) below. In some embodiments, the dRNA is linear. In some embodiments, the dRNA is circular. In some embodiments, the dRNA is a linear RNA capable of forming a circular RNA. In some embodiments, the method uses a construct comprising a nucleic acid sequence encoding the dRNA. The construct may be any one of the constructs described in section III below.
In some embodiments, there is provided a method for editing a target adenosine in a target RNA in a host cell, comprising introducing a dRNA or a construct comprising a nucleic acid sequence encoding the dRNA into the host cell, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising a non-target adenosine in the target RNA; and (2) the dRNA is capable of recruiting an ADAR. In some embodiments, the duplex RNA comprises a bulge at each non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA except for lacking one or more nucleotides opposite non-target adenosines in the target RNA. In some embodiments, the dRNA is linear. In some embodiments, the dRNA is circular. In some embodiments, the dRNA is a linear RNA capable of forming a circular RNA. In some embodiments, the method does not comprise introducing any protein or construct comprising a nucleic acid encoding a protein (e.g., Cas, ADAR or a fusion protein of ADAR and Cas) to the host cell.
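By way of non-limiting illustration, the embodiment in which the targeting RNA sequence is complementary to the target RNA except for lacking nucleotides opposite non-target adenosines can be sketched in Python as follows. The function name `design_targeting_sequence`, the 0-based `target_index` parameter, and the choice to place a uridine opposite the target adenosine are assumptions made for this example only; the sketch omits exactly one nucleotide per non-target adenosine and does not predict whether a bulge actually forms in a folded duplex.

```python
# RNA Watson-Crick complements (upper case).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def design_targeting_sequence(target: str, target_index: int) -> str:
    """Return a targeting RNA sequence (5'->3') for `target` (5'->3').

    Every nucleotide of the target receives its Watson-Crick complement,
    except that no nucleotide is placed opposite any adenosine other than
    the target adenosine at `target_index` (0-based). Each such omission
    leaves a non-target adenosine without a partner in the duplex.
    """
    seq = target.upper().replace("T", "U")
    if seq[target_index] != "A":
        raise ValueError("target_index must point at an adenosine")
    opposite = []
    for i, base in enumerate(seq):
        if base == "A" and i != target_index:
            continue  # deletion opposite a non-target adenosine
        opposite.append(COMPLEMENT[base])
    # The opposite strand runs antiparallel, so reverse it to read 5'->3'.
    return "".join(reversed(opposite))


# Target adenosine at index 5; the adenosine at index 2 is a non-target.
print(design_targeting_sequence("GCAUGACC", 5))
```

In this example the returned sequence is one nucleotide shorter than the target, because a single non-target adenosine lies within the hybridizing region.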
In some embodiments, there is provided a method for editing a target adenosine in a target RNA in a host cell, comprising introducing a dRNA or a construct comprising a nucleic acid sequence encoding the dRNA into the host cell, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA, wherein the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA; (2) the dRNA is capable of recruiting an ADAR; and (3) the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA. In some embodiments, the linker nucleic acid sequence is about 5 nt to about 500 nt long, such as about 50 nt to 200 nt long. In some embodiments, the linker nucleic acid sequence comprises a polyadenosine (polyA), polyguanosine (polyG), or polycytosine (polyC) sequence. In some embodiments, at least 50% of the linker nucleic acid sequence comprises adenosine. In some embodiments, the linker nucleic acid sequence comprises a dinucleotide repeat sequence, such as (AT)n, wherein n is an integer greater than or equal to 3. In some embodiments, the linker nucleic acid sequence comprises SEQ ID NO: 22. In some embodiments, the method does not comprise introducing any protein or construct comprising a nucleic acid encoding a protein (e.g., Cas, ADAR or a fusion protein of ADAR and Cas) to the host cell.
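By way of non-limiting illustration, the linker features recited above (length, adenosine content, and an (AT)n repeat with n of at least 3) can be checked with the following Python sketch. The function name `linker_summary` and its dictionary keys are hypothetical; the sketch treats “about 5 nt to about 500 nt” as the exact bounds 5 and 500, which is an assumption, and it does not assess whether the linker forms secondary structure with any part of the dRNA.

```python
import re


def linker_summary(linker: str) -> dict:
    """Summarize simple sequence features of a candidate linker.

    Illustrative assumptions: the length range is taken as exactly
    5..500 nt, and the dinucleotide repeat is written as (AT)n, as in
    the text, with n >= 3 consecutive copies.
    """
    seq = linker.upper()
    length = len(seq)
    adenosine_fraction = seq.count("A") / length if length else 0.0
    return {
        "length": length,
        "length_in_range": 5 <= length <= 500,
        "adenosine_fraction": adenosine_fraction,
        "at_least_half_adenosine": adenosine_fraction >= 0.5,
        "has_at_repeat": re.search(r"(?:AT){3,}", seq) is not None,
    }


print(linker_summary("ATATATAA"))
```

A candidate linker failing one of these checks may still fall within other embodiments, since each feature above is recited as a separate, optional embodiment.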
In some embodiments, there is provided a method for editing a target adenosine in a target RNA in a host cell, comprising introducing a dRNA or a construct comprising a nucleic acid sequence encoding the dRNA into the host cell, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA, wherein the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA; (2) the dRNA is capable of recruiting an ADAR; and (3) the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA. In some embodiments, the first linker nucleic acid sequence and/or the second linker nucleic acid sequence is about 5 nt to about 500 nt long, such as about 50 nt to 200 nt long. In some embodiments, the first linker nucleic acid sequence is identical to the second linker nucleic acid sequence. In some embodiments, the first linker nucleic acid sequence is different from the second linker nucleic acid sequence. In some embodiments, the first linker nucleic acid sequence and/or the second linker nucleic acid sequence comprises a polyadenosine (polyA), polyguanosine (polyG), or polycytosine (polyC) sequence. In some embodiments, the first linker nucleic acid sequence and/or the second linker nucleic acid sequence comprises a dinucleotide repeat sequence, such as (AT)n, wherein n is an integer greater than or equal to 3. In some embodiments, the first linker nucleic acid sequence and/or the second linker nucleic acid sequence comprises SEQ ID NO: 22. In some embodiments, the dRNA is a circular RNA, and wherein the linker nucleic acid sequence connects the 5′ end of the targeting RNA sequence and the 3′ end of the targeting RNA sequence.
In some embodiments, the method does not comprise introducing any protein or construct comprising a nucleic acid encoding a protein (e.g., Cas, ADAR or a fusion protein of ADAR and Cas) to the host cell.
In some embodiments, there is provided a method for editing a target adenosine in a target RNA in a host cell, comprising introducing a dRNA or a construct comprising a nucleic acid sequence encoding the dRNA into the host cell, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising a non-target adenosine in the target RNA, wherein the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA; (2) the dRNA is capable of recruiting an ADAR; and (3) the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA. In some embodiments, the duplex RNA comprises a bulge at each non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA except for lacking one or more nucleotides opposite non-target adenosines in the target RNA. In some embodiments, the linker nucleic acid sequence is about 5 nt to about 500 nt long, such as about 50 nt to 200 nt long. In some embodiments, the linker nucleic acid sequence comprises a polyadenosine (polyA), polyguanosine (polyG), or polycytosine (polyC) sequence. In some embodiments, at least 50% of the linker nucleic acid sequence comprises adenosine. In some embodiments, the linker nucleic acid sequence comprises a dinucleotide repeat sequence, such as (AT)n, wherein n is an integer greater than or equal to 3. In some embodiments, the linker nucleic acid sequence comprises SEQ ID NO: 22. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence.
In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the first linker nucleic acid sequence is identical to the second linker nucleic acid sequence. In some embodiments, the first linker nucleic acid sequence is different from the second linker nucleic acid sequence. In some embodiments, the dRNA is a circular RNA, and wherein the linker nucleic acid sequence connects the 5′ end of the targeting RNA sequence and the 3′ end of the targeting RNA sequence. In some embodiments, the method does not comprise introducing any protein or construct comprising a nucleic acid encoding a protein (e.g., Cas, ADAR or a fusion protein of ADAR and Cas) to the host cell.
In some embodiments, there is provided a method for editing a target adenosine in a target RNA in a host cell, comprising introducing a circular dRNA or a construct comprising a nucleic acid sequence encoding the circular dRNA into the host cell, wherein: (1) the circular dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising a non-target adenosine in the target RNA; and (2) the circular dRNA is capable of recruiting an ADAR. In some embodiments, the duplex RNA comprises a bulge at each non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA except for lacking one or more nucleotides opposite non-target adenosines in the target RNA. In some embodiments, the circular dRNA further comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA. In some embodiments, the circular dRNA further comprises a linker nucleic acid sequence replacing an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA. In some embodiments, the linker nucleic acid sequence is about 5 nt to about 500 nt long, such as about 50 nt to 200 nt long. In some embodiments, the linker nucleic acid sequence comprises SEQ ID NO: 22. In some embodiments, the linker nucleic acid sequence connects the 5′ end of the targeting RNA sequence and the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. 
In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the targeting RNA sequence, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA further comprises a 3′ ligation sequence and a 5′ ligation sequence. In some embodiments, the method does not comprise introducing any protein or construct comprising a nucleic acid encoding a protein (e.g., Cas, ADAR or a fusion protein of ADAR and Cas) to the host cell.
In some embodiments, there is provided a method for editing a target adenosine in a target RNA in a host cell, comprising introducing into the host cell (a) a dRNA or a construct comprising a nucleic acid sequence encoding the dRNA and (b) an ADAR or a construct comprising a nucleic acid encoding the ADAR, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising a non-target adenosine in the target RNA; and (2) the dRNA is capable of recruiting the ADAR. In some embodiments, the duplex RNA comprises a bulge at each non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA except for lacking one or more nucleotides opposite non-target adenosines in the target RNA. In some embodiments, the ADAR is an endogenously encoded ADAR of the host cell, wherein introduction of the ADAR comprises over-expressing the ADAR in the host cell. In some embodiments, the ADAR is exogenous to the host cell. In some embodiments, the construct comprising a nucleic acid encoding the ADAR is a vector, such as a plasmid, or a viral vector (e.g., an AAV such as a scAAV).
In some embodiments, there is provided a method for editing a target adenosine in a target RNA in a host cell, comprising introducing into the host cell (a) a dRNA or a construct comprising a nucleic acid sequence encoding the dRNA and (b) an ADAR or a construct comprising a nucleic acid encoding the ADAR, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA, wherein the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA; (2) the dRNA is capable of recruiting the ADAR; and (3) the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA. In some embodiments, the linker nucleic acid sequence is about 5 nt to about 500 nt long, such as about 50 nt to 200 nt long. In some embodiments, the linker nucleic acid sequence comprises a polyadenosine (polyA), polyguanosine (polyG), or polycytosine (polyC) sequence. In some embodiments, at least 50% of the linker nucleic acid sequence comprises adenosine. In some embodiments, the linker nucleic acid sequence comprises a dinucleotide repeat sequence, such as (AT)n, wherein n is an integer greater than or equal to 3. In some embodiments, the linker nucleic acid sequence comprises SEQ ID NO: 22. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. 
In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the ADAR is an endogenously encoded ADAR of the host cell, wherein introduction of the ADAR comprises over-expressing the ADAR in the host cell. In some embodiments, the ADAR is exogenous to the host cell. In some embodiments, the construct comprising a nucleic acid encoding the ADAR is a vector, such as a plasmid, or a viral vector (e.g., an AAV such as a scAAV).
In some embodiments, there is provided a method for editing a target adenosine in a target RNA in a host cell, comprising introducing into the host cell (a) a dRNA or a construct comprising a nucleic acid sequence encoding the dRNA and (b) an ADAR or a construct comprising a nucleic acid encoding the ADAR, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising a non-target adenosine in the target RNA, wherein the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA; (2) the dRNA is capable of recruiting the ADAR; and (3) the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA. In some embodiments, the duplex RNA comprises a bulge at each non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA except for lacking one or more nucleotides opposite non-target adenosines in the target RNA. In some embodiments, the linker nucleic acid sequence is about 5 nt to about 500 nt long, such as about 50 nt to 200 nt long. In some embodiments, the linker nucleic acid sequence comprises a polyadenosine (polyA), polyguanosine (polyG), or polycytosine (polyC) sequence. In some embodiments, at least 50% of the linker nucleic acid sequence comprises adenosine. In some embodiments, the linker nucleic acid sequence comprises a dinucleotide repeat sequence, such as (AT)n, wherein n is an integer greater or equal to 3. In some embodiments, the linker nucleic acid sequence comprises SEQ ID NO: 22. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. 
In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA is a circular RNA, and wherein the linker nucleic acid sequence connects the 5′ end of the targeting RNA sequence and the 3′ end of the targeting RNA sequence. In some embodiments, the ADAR is an endogenously encoded ADAR of the host cell, wherein introduction of the ADAR comprises over-expressing the ADAR in the host cell. In some embodiments, the ADAR is exogenous to the host cell. In some embodiments, the construct comprising a nucleic acid encoding the ADAR is a vector, such as a plasmid, or a viral vector (e.g., an AAV such as a scAAV).
In some embodiments, the present invention provides the method for editing a target adenosine in a target RNA as disclosed herein, wherein the dRNA or the construct comprising a nucleic acid sequence encoding the dRNA edits the target adenosine in the target RNA in a dose dependent manner.
In some embodiments, there is provided a method for reducing editing of a non-target adenosine (also referred to herein as “bystander editing”) in a target RNA in a host cell, comprising: introducing into the host cell a dRNA or a construct comprising a nucleic acid sequence encoding the dRNA, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising the non-target adenosine in the target RNA; and (2) the dRNA is capable of recruiting an ADAR; wherein the editing rate of the non-target adenosine is reduced compared to a method using a dRNA comprising a targeting RNA sequence that is complementary to the target RNA. In some embodiments, the duplex RNA comprises a bulge at each non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA except for lacking one or more nucleotides opposite non-target adenosines in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA except for lacking two or more consecutive nucleotides opposite non-target adenosines in the target RNA. In some embodiments, the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA. In some embodiments, the dRNA comprises a linker nucleic acid sequence replacing an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA. In some embodiments, the linker nucleic acid sequence is about 5 nt to about 500 nt long, such as about 50 nt to 200 nt long. In some embodiments, the linker nucleic acid sequence comprises a polyadenosine (polyA), polyguanosine (polyG), or polycytosine (polyC) sequence. 
In some embodiments, at least 50% of the linker nucleic acid sequence comprises adenosine. In some embodiments, the linker nucleic acid sequence comprises a dinucleotide repeat sequence, such as (AT)n, wherein n is an integer greater or equal to 3. In some embodiments, the linker nucleic acid sequence comprises SEQ ID NO: 22. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA is a circular RNA, and wherein the linker nucleic acid sequence connects the 5′ end of the targeting RNA sequence and the 3′ end of the targeting RNA sequence. In some embodiments, the editing rate of the non-target adenosine is reduced by at least about 20%, 30%, 50%, 60%, 70%, 80%, 90%, 95%, or more compared to a method using a dRNA comprising a targeting RNA sequence that is complementary to the target RNA.
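By way of a non-limiting illustration, the relative reduction in bystander editing recited above may be computed as sketched below. The function name and the example rates are hypothetical and are not data from the present application; the sketch assumes both editing rates are expressed as percentages measured at the same non-target adenosine.

```python
def percent_reduction(reference_rate: float, bulge_rate: float) -> float:
    """Relative reduction (in %) of the bystander editing rate.

    reference_rate: editing rate at the non-target adenosine with a dRNA whose
                    targeting sequence is complementary to the target RNA.
    bulge_rate:     editing rate at the same adenosine with a bulge-forming dRNA.
    Both names are illustrative assumptions.
    """
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return 100.0 * (reference_rate - bulge_rate) / reference_rate


# Hypothetical rates: 20% bystander editing without a bulge, 5% with a bulge.
reduction = percent_reduction(20.0, 5.0)
print(f"bystander editing reduced by {reduction:.0f}%")
```

Under these assumed values the reduction is 75%, which would fall within the "at least about 70%" embodiment.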
In some embodiments, there is provided a method for increasing editing efficiency of a target adenosine in a target RNA in a host cell, comprising: introducing into the host cell a dRNA or a construct comprising a nucleic acid sequence encoding the dRNA, wherein: (1) the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA, wherein the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA; and (2) the dRNA is capable of recruiting an ADAR; wherein the editing efficiency of the target adenosine is increased compared to a method using a dRNA that does not comprise the linker nucleic acid sequence. In some embodiments, the linker nucleic acid sequence is about 5 nt to about 500 nt long, such as about 50 nt to 200 nt long. In some embodiments, the linker nucleic acid sequence comprises a polyadenosine (polyA), polyguanosine (polyG), or polycytosine (polyC) sequence. In some embodiments, the linker nucleic acid sequence comprises a dinucleotide repeat sequence, such as (AT)n, wherein n is an integer greater or equal to 3. In some embodiments, the linker nucleic acid sequence comprises SEQ ID NO: 22. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA is a circular RNA, and wherein the linker nucleic acid sequence connects the 5′ end of the targeting RNA sequence and the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a targeting RNA sequence that is capable of hybridizing to the target RNA to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising the non-target adenosine in the target RNA. 
In some embodiments, the duplex RNA comprises a bulge at each non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA except for lacking one or more nucleotides opposite non-target adenosines in the target RNA. In some embodiments, the editing efficiency of the target adenosine is increased by at least about 50%, 2×, 3×, 4×, 5×, 6×, 7×, 8×, 9×, 10× or more compared to a method using a dRNA that does not comprise the linker nucleic acid sequence.
In one aspect, the present application provides a method for editing a plurality of target RNAs (e.g., at least about 2, 3, 4, 5, 10, 20, 50, 100, 1000 or more) in host cells by introducing a plurality of the dRNAs, or one or more constructs encoding the dRNAs, into the host cells.
In some embodiments, the host cell is a prokaryotic cell. In some embodiments, the host cell is a eukaryotic cell. In some embodiments, the host cell is a mammalian cell. In some embodiments, the host cell is a human cell. In some embodiments, the host cell is a murine cell. In some embodiments, the host cell is a plant cell or a fungal cell.
In some embodiments, the host cell is a cell of a cell line, such as an HEK293T, HT29, A549, HepG2, RD, SF268, SW13 or HeLa cell. In some embodiments, the host cell is a primary cell, such as a fibroblast, an epithelial cell, or an immune cell. In some embodiments, the host cell is a T cell. In some embodiments, the host cell is a post-mitotic cell. In some embodiments, the host cell is a cell of the central nervous system (CNS), such as a brain cell, e.g., a cerebellum cell.
In some embodiments, the ADAR is endogenous to the host cell. In some embodiments, the adenosine deaminase acting on RNA (ADAR) is naturally or endogenously present in the host cell, for example, naturally or endogenously present in the eukaryotic cell. In some embodiments, the ADAR is endogenously expressed by the host cell. In some embodiments, the ADAR is exogenously introduced into the host cell. In some embodiments, the ADAR is ADAR1 and/or ADAR2. In some embodiments, the ADAR is one or more ADARs selected from the group consisting of hADAR1, hADAR2, mouse ADAR1 and ADAR2. In some embodiments, the ADAR is ADAR1, such as p110 isoform of ADAR1 (“ADAR1p110”) and/or p150 isoform of ADAR1 (“ADAR1p150”). In some embodiments, the ADAR is ADAR2. In some embodiments, the ADAR is an ADAR2 expressed by the host cell, e.g., ADAR2 expressed by cerebellum cells.
In some embodiments, the ADAR is an ADAR exogenous to the host cell. In some embodiments, the ADAR is a hyperactive mutant of a naturally occurring ADAR. In some embodiments, the ADAR is ADAR1 comprising an E1008Q mutation. In some embodiments, the ADAR is not a fusion protein comprising a binding domain. In some embodiments, the ADAR does not comprise an engineered double-strand nucleic acid-binding domain. In some embodiments, the ADAR does not comprise an MCP domain that binds to an MS2 hairpin that is fused to the complementary RNA sequence in the dRNA.
In some embodiments, the host cell has high expression level of ADAR1 (such as ADAR1p110 and/or ADAR1p150) e.g., at least about any one of 10%, 20%, 50%, 100%, 2×, 3×, 5×, or more relative to the protein expression level of β-tubulin. In some embodiments, the host cell has high expression level of ADAR2, e.g., at least about any one of 10%, 20%, 50%, 100%, 2×, 3×, 5×, or more relative to the protein expression level of β-tubulin. In some embodiments, the host cell has low expression level of ADAR3, e.g., no more than about any one of 5×, 3×, 2×, 100%, 50%, 20% or less relative to the protein expression level of β-tubulin.
In certain embodiments, the method further comprises introducing an inhibitor of ADAR3 to the host cell. In some embodiments, the inhibitor of ADAR3 is an RNAi against ADAR3, such as an shRNA against ADAR3 or an siRNA against ADAR3. In some embodiments, the method further comprises introducing a stimulator of interferon to the host cell. In some embodiments, the ADAR is inducible by interferon, for example, the ADAR is ADAR1p150. In some embodiments, the stimulator of interferon is IFNα. In some embodiments, the inhibitor of ADAR3 and/or the stimulator of interferon are encoded by the same construct (e.g., vector) that encodes the dRNA.
In certain embodiments, the method does not induce immune response, such as innate immune response. In some embodiments, the method does not induce interferon and/or interleukin expression in the host cell. In some embodiments, the method does not induce IFN-β and/or IL-6 expression in the host cell.
The nucleic acids, including dRNAs, constructs thereof, and nucleic acids encoding ADAR, may be delivered using any methods known in the art, including viral delivery or non-viral delivery.
Methods of non-viral delivery of nucleic acids include lipofection, nucleofection, microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid:nucleic acid conjugates, electroporation, nanoparticles, exosomes, microvesicles, gene gun, naked DNA and artificial virions.
The use of RNA or DNA viral based systems for the delivery of nucleic acids has high efficiency in targeting a virus to specific cells and trafficking the viral payload to the cellular nuclei. In certain embodiments, the method comprises introducing a viral vector (such as an AAV, e.g., scAAV, or a lentiviral vector) encoding the dRNA into the host cell. For example, the construct described herein may be any one of the viral vectors described in section III “dRNAs, constructs and libraries” below.
In some embodiments, the method comprises introducing a plasmid encoding the dRNA into the host cell. In some embodiments, the method comprises electroporation of the dRNA (e.g., synthetic dRNA) into the host cell. In some embodiments, the method comprises transfection of the dRNA into the host cell.
In certain embodiments, the efficiency of editing of the target RNA is at least about 10%, such as at least about any one of 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or higher. In some embodiments, the efficiency of editing of the target RNA is at least about 40%. In some embodiments, the efficiency of editing is determined by Sanger sequencing. In some embodiments, the efficiency of editing is determined by next-generation sequencing. In some embodiments, the efficiency of editing is determined by assessing expression of a reporter gene, such as a fluorescence reporter, e.g., EGFP.
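By way of a non-limiting illustration, when editing is assessed by sequencing, the editing efficiency at a target position may be estimated from base counts, because inosine is read as guanosine. The sketch below assumes that per-base read counts at the target position are already available; the function name, the input format and the example counts are hypothetical and are not taken from the present application.

```python
def editing_efficiency(base_counts: dict) -> float:
    """Estimate A-to-I editing efficiency (in %) at one target position.

    base_counts maps a base call ("A", "G", ...) to its read count at the
    target position. Inosine is read as G, so edited reads are counted as G.
    Reads calling any other base are ignored in this simplified sketch.
    """
    unedited = base_counts.get("A", 0)
    edited = base_counts.get("G", 0)
    total = unedited + edited
    if total == 0:
        raise ValueError("no A or G reads at the target position")
    return 100.0 * edited / total


# Hypothetical counts: 60 reads still call A, 40 reads call G.
print(editing_efficiency({"A": 60, "G": 40}))
```

With the assumed counts, 40 of 100 informative reads carry the edit, corresponding to the "at least about 40%" embodiment.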
In certain embodiments, the method has low off-target editing rate. In some embodiments, the method has lower than about 1% (e.g., no more than about any one of 0.5%, 0.1%, 0.05%, 0.01%, 0.001% or lower) editing efficiency on non-target As in the target RNA. In some embodiments, the method does not edit non-target As in the target RNA. In some embodiments, the method has lower than about 0.1% (e.g., no more than about any one of 0.05%, 0.01%, 0.005%, 0.001%, 0.0001% or lower) editing efficiency on As in non-target RNA.
After deamination, modification of the target RNA and/or the protein encoded by the target RNA can be determined using different methods depending on the positions of the targeted adenosines in the target RNA. For example, in order to determine whether “A” has been edited to “I” in the target RNA, RNA sequencing methods known in the art can be used to detect the modification of the RNA sequence. When the target adenosine is located in the coding region of an mRNA, the RNA editing may cause changes to the amino acid sequence encoded by the mRNA. For example, point mutations may be introduced to the mRNA, or an innate or acquired point mutation in the mRNA may be reversed to yield wild-type gene product(s) because of the conversion of “A” to “I”. Amino acid sequencing by methods known in the art can be used to find any changes of amino acid residues in the encoded protein. Modifications of a stop codon may be determined by assessing the presence of a functional, elongated, truncated, full-length and/or wild-type protein. For example, when the target adenosine is located in a UGA, UAG, or UAA stop codon, modification of the target adenosine residue (UGA or UAG) or As (UAA) may create a read-through mutation and/or an elongated protein, or a truncated protein encoded by the target RNA may be reversed to create a functional, full-length and/or wild-type protein. Editing of a target RNA may also generate an aberrant splice site, and/or alternative splice site in the target RNA, thus leading to an elongated, truncated, or misfolded protein, or an aberrant splicing or alternative splicing site encoded in the target RNA may be reversed to create a functional, correctly-folding, full-length and/or wild-type protein. In some embodiments, the present application contemplates editing of both innate and acquired genetic changes, for example, missense mutation, early stop codon, aberrant splicing or alternative splicing site encoded by a target RNA. 
Known methods for assessing the function of the protein encoded by the target RNA can be used to determine whether the RNA editing achieves the desired effects. Because deamination of the adenosine (A) to an inosine (I) may correct a mutated A at the target position in a mutant RNA encoding a protein, identification of the deamination into inosine may provide an assessment of whether a functional protein is present, or whether a disease or drug resistance-associated RNA caused by the presence of a mutated adenosine is reversed or partly reversed. Similarly, because deamination of the adenosine (A) to an inosine (I) may introduce a point mutation in the resulting protein, identification of the deamination into inosine may provide a functional indication for identifying a cause of disease or a relevant factor of a disease.
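By way of a non-limiting illustration, the effect of A-to-I editing on the three stop codons discussed above may be sketched as follows. Because inosine is read as guanosine during translation, each edited adenosine is modeled as G. The function name is hypothetical; the sketch relies only on the standard genetic code, in which UGG encodes tryptophan.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def edit_codon(codon: str, positions: list) -> str:
    """Return the codon as read after A-to-I editing at the given positions.

    positions are 0-based indices within the codon; each must hold an A.
    Inosine is read as G, so every edited A is written as G.
    """
    bases = list(codon)
    for pos in positions:
        if bases[pos] != "A":
            raise ValueError(f"position {pos} of {codon} is not an adenosine")
        bases[pos] = "G"
    return "".join(bases)


# UGA and UAG each contain a single adenosine; one edit yields UGG (Trp).
print(edit_codon("UGA", [2]), edit_codon("UAG", [1]))
# UAA contains two adenosines; a single edit leaves a stop codon in place.
print(edit_codon("UAA", [1]) in STOP_CODONS, edit_codon("UAA", [2]) in STOP_CODONS)
# Editing both adenosines of UAA yields UGG.
print(edit_codon("UAA", [1, 2]))
```

This is consistent with the distinction drawn above between modifying a single adenosine residue (UGA or UAG) and modifying both adenosines (UAA).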
When the presence of a target adenosine causes aberrant splicing, the read-out may be the assessment of occurrence and frequency of aberrant splicing. On the other hand, when the deamination of a target adenosine is desirable to introduce a splice site, then similar approaches can be used to check whether the required type of splicing occurs. An exemplary suitable method to identify the presence of an inosine after deamination of the target adenosine is RT-PCR and sequencing, using methods that are well-known to the person skilled in the art.
The effects of deamination of target adenosine(s) include, for example, point mutation, early stop codon, aberrant splice site, alternative splice site and misfolding of the resulting protein. These effects may induce structural and functional changes of RNAs and/or proteins associated with diseases, whether they are genetically inherited or caused by acquired genetic mutations, or may induce structural and functional changes of RNAs and/or proteins associated with occurrence of drug resistance. Hence, the dRNAs, the constructs encoding the dRNAs, and the RNA editing methods of the present application can be used in prevention or treatment of hereditary genetic diseases or conditions, or diseases or conditions associated with acquired genetic mutations by changing the structure and/or function of the disease-associated RNAs and/or proteins.
In some embodiments, the target RNA is a pre-messenger RNA. In some embodiments, the target RNA is a messenger RNA. In some embodiments, the target RNA is a regulatory RNA. In some embodiments, the target RNA is a ribosomal RNA, a transfer RNA, a long non-coding RNA or a small RNA (e.g., miRNA, pri-miRNA, pre-miRNA, piRNA, siRNA, snoRNA, snRNA, exRNA or scaRNA). The effects of deamination of the target adenosines include, for example, structural and functional changes of the ribosomal RNA, transfer RNA, long non-coding RNA or small RNA (e.g., miRNA), including changes of three-dimensional structure and/or loss of function or gain of function of the target RNA. In some embodiments, deamination of the target As in the target RNA changes the expression level of one or more downstream molecules (e.g., protein, RNA and/or metabolites) of the target RNA. The change in the expression level of the downstream molecules can be an increase or a decrease.
Some embodiments of the present application involve multiplex editing of target RNAs in host cells, which is useful for screening different variants of a target gene or different genes in the host cells. In some embodiments, wherein the method comprises introducing a plurality of dRNAs to the host cells, at least two of the dRNAs of the plurality of dRNAs have different sequences and/or have different target RNAs. In some embodiments, each dRNA has a different sequence and/or different target RNA. In some embodiments, the method generates a plurality (e.g., at least 2, 3, 5, 10, 50, 100, 1000 or more) of modifications in a single target RNA in the host cells. In some embodiments, the method generates a modification in a plurality (e.g., at least 2, 3, 5, 10, 50, 100, 1000 or more) of target RNAs in the host cells. In some embodiments, the method comprises editing a plurality of target RNAs in a plurality of populations of host cells. In some embodiments, each population of host cells receives a different dRNA or a dRNA having a different target RNA from the other populations of host cells.
Also provided are edited RNA or host cells having an edited RNA produced by any one of the methods described herein. In some embodiments, the edited RNA comprises an inosine. In some embodiments, the host cell comprises a target RNA having a missense mutation, an early stop codon, an alternative splice site, or an aberrant splice site. In some embodiments, the host cell comprises a mutant, truncated, or misfolded protein. In some embodiments, the method restores the function of the target RNA.
The present application further provides dRNAs, constructs encoding dRNAs and libraries comprising a plurality of dRNAs or constructs thereof, which can be used in any one of the methods of RNA editing or methods of treatment described herein. It is intended that any of the features and parameters described herein for dRNAs or constructs can be combined with each other, as if each and every combination is individually described.
In one aspect, the present application provides a dRNA for editing a target RNA comprising a targeting RNA sequence that is capable of hybridizing to the target RNA to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising a non-target adenosine in the target RNA. In some embodiments, the duplex RNA comprises a bulge at each non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA except for lacking one or more nucleotides opposite non-target adenosines in the target RNA. In some embodiments, the targeting RNA sequence has deletion of one or more uridine residues opposite one or more non-target adenosines in a sequence complementary to the target RNA. In some embodiments, the dRNA is a linear RNA. In some embodiments, the dRNA is a circular RNA. In some embodiments, the dRNA is a linear RNA capable of forming a circular RNA.
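By way of a non-limiting illustration, a targeting RNA sequence having a deletion of the uridine opposite each selected non-target adenosine may be derived from the target RNA as sketched below. The function name, the 0-based position convention and the example sequence are hypothetical. The sketch uses plain Watson-Crick complementarity at every retained position and does not model any mismatch that a given design may place opposite the target adenosine.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def design_targeting_sequence(target_rna: str, non_target_positions: list) -> str:
    """Return a targeting RNA (5'->3') complementary to target_rna (5'->3').

    The uridine that would pair with each adenosine listed in
    non_target_positions (0-based indices into target_rna) is omitted, so
    that adenosine is left unpaired (bulged) in the resulting duplex.
    """
    skip = set(non_target_positions)
    for pos in skip:
        if target_rna[pos] != "A":
            raise ValueError(f"position {pos} is not an adenosine")
    # Walk the target 3'->5' so the antiparallel complement reads 5'->3'.
    return "".join(
        COMPLEMENT[target_rna[pos]]
        for pos in range(len(target_rna) - 1, -1, -1)
        if pos not in skip
    )


target = "GCAUAGC"  # hypothetical target; adenosines at indices 2 and 4
print(design_targeting_sequence(target, []))   # fully complementary: GCUAUGC
print(design_targeting_sequence(target, [2]))  # U opposite index 2 removed: GCUAGC
```

In the second call the targeting sequence is one nucleotide shorter than the target region, which is what leaves the adenosine at index 2 without a pairing partner.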
In one aspect, the present application provides a dRNA for editing a target RNA comprising a targeting RNA sequence that is capable of hybridizing to the target RNA, wherein the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA, and wherein the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA. In some embodiments, the linker nucleic acid sequence is about 5 nt to about 500 nt long, such as about 50 nt to 200 nt long. In some embodiments, the linker nucleic acid sequence comprises a polyadenosine (polyA), polyguanosine (polyG), or polycytosine (polyC) sequence. In some embodiments, at least 50% of the linker nucleic acid sequence comprises adenosine. In some embodiments, the linker nucleic acid sequence comprises a dinucleotide repeat sequence, such as (AT)n, wherein n is an integer greater or equal to 3. In some embodiments, the linker nucleic acid sequence comprises SEQ ID NO: 22. In some embodiments, the dRNA is a circular RNA. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. 
In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence.
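By way of a non-limiting illustration, the sequence-level linker features recited above (length, adenosine content, homopolymer composition, dinucleotide repeat) may be checked as sketched below. The function name and returned keys are hypothetical. The sketch treats the "about 5 nt to about 500 nt" range as exact bounds, reports a homopolymer only when the entire linker is polyA, polyG or polyC, and accepts either AT or AU as the repeat unit; each of these is a simplifying assumption. It does not assess secondary structure, which would require a separate folding analysis.

```python
import re


def linker_properties(linker: str) -> dict:
    """Summarize simple sequence features of a candidate linker."""
    seq = linker.upper()
    if not seq:
        raise ValueError("linker must not be empty")
    return {
        # Exact bounds stand in for "about 5 nt to about 500 nt".
        "length_in_range": 5 <= len(seq) <= 500,
        # Fraction of positions that are adenosine.
        "adenosine_fraction": seq.count("A") / len(seq),
        # True only if the whole linker is a single repeated A, G or C.
        "is_homopolymer": len(set(seq)) == 1 and seq[0] in "AGC",
        # True if the linker contains (AT)n or (AU)n with n >= 3.
        "has_dinucleotide_repeat": re.search(r"(?:AT){3,}|(?:AU){3,}", seq) is not None,
    }


print(linker_properties("A" * 50))
print(linker_properties("ATATAT"))
```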
In one aspect, the present application provides a dRNA for editing a target RNA comprising a targeting RNA sequence that is capable of hybridizing to the target RNA to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising a non-target adenosine in the target RNA, wherein the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA, and wherein the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA. In some embodiments, the dRNA is a circular RNA. In some embodiments, the targeting RNA sequence is complementary to the target RNA except for lacking one or more nucleotides opposite non-target adenosines in the target RNA. In some embodiments, the linker nucleic acid sequence is about 5 nt to about 500 nt long, such as about 50 nt to 200 nt long. In some embodiments, the linker nucleic acid sequence comprises a polyadenosine (polyA), polyguanosine (polyG), or polycytosine (polyC) sequence. In some embodiments, the linker nucleic acid sequence comprises a dinucleotide repeat sequence, such as (AT)n, wherein n is an integer greater than or equal to 3. In some embodiments, the linker nucleic acid sequence comprises SEQ ID NO: 22. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence flanking the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence flanking the 3′ end of the targeting RNA sequence. 
In some embodiments, the dRNA comprises a first linker nucleic acid sequence replacing the 5′ end of the targeting RNA sequence and a second linker nucleic acid sequence replacing the 3′ end of the targeting RNA sequence.
In one aspect, the present application provides a construct comprising a nucleic acid sequence encoding any one of the dRNAs described herein. In certain embodiments, the construct is a viral vector or a plasmid. In some embodiments, the construct is an adeno-associated viral (AAV) vector. In some embodiments, the construct is a self-complementary AAV (scAAV) vector. In some embodiments, the construct encodes a single dRNA. In some embodiments, the construct encodes a plurality of (e.g., about any one of 1, 2, 3, 4, 5, 10, 20 or more) dRNAs.
In one aspect, the present application provides a library comprising a plurality of the dRNAs or a plurality of the constructs described herein.
In one aspect, the present application provides a composition or a host cell comprising the deaminase-recruiting RNA or the construct described herein. In certain embodiments, the host cell is a prokaryotic cell or a eukaryotic cell. In some embodiments, the host cell is a mammalian cell. In some embodiments, the host cell is a human cell.
dRNAs
The dRNA of the present application comprises a targeting RNA sequence that hybridizes to the target RNA. The targeting RNA sequence is perfectly complementary or substantially complementary to the target RNA to allow hybridization of the targeting RNA sequence to the target RNA. In some embodiments, the targeting RNA sequence has 100% sequence complementarity to the target RNA. In some embodiments, the targeting RNA sequence is at least about any one of 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more complementary over a continuous stretch of at least about any one of 20, 40, 60, 80, 100, 150, 200, or more nucleotides in the target RNA. In some embodiments, the dsRNA formed by hybridization between the targeting RNA sequence and the target RNA has one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more) non-Watson-Crick base pairs (i.e., mismatches).
In some embodiments, the dsRNA (also referred to herein as “duplex RNA”) formed by hybridization between the targeting RNA sequence and the target RNA has one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more) unpaired nucleotides. In some embodiments, the dsRNA formed by hybridization between the targeting RNA sequence and the target RNA has one or more non-target adenosines in the target RNA that are unpaired. In some embodiments, the dRNA lacks one or more nucleotides opposite one or more non-target adenosines in the target RNA. In some embodiments, the targeting RNA sequence in the dRNA lacks the nucleotide opposite each non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence in the dRNA has a deletion of two or more (e.g., 2, 3, 4, or more) consecutive nucleotides opposite a region comprising a non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence in the dRNA is complementary to the target RNA except for lacking one or more nucleotides opposite one or more non-target adenosines in the target RNA. In some embodiments, the targeting RNA sequence in the dRNA is complementary to the target RNA except for lacking a nucleotide opposite each non-target adenosine in the target RNA.
Unpaired nucleotides in a dsRNA give rise to a bulge. In some embodiments, the target RNA hybridizes with the dRNA to form a dsRNA comprising a bulge comprising a non-target adenosine in the target RNA. The bulge in the dsRNA formed by hybridization of the dRNA with the target RNA comprises a non-target adenosine in the target RNA. The bulge may be a single-nucleotide bulge, i.e., containing an unpaired non-target adenosine, or a multi-nucleotide bulge, i.e., containing additional unpaired or mismatched nucleotides that flank the unpaired non-target adenosine. In some embodiments, the bulge may contain more than one (e.g., 2, 3, 4, 5 or more) unpaired nucleotides in the target RNA, i.e., the bulge is made of unpaired nucleotides that directly flank the 5′ and/or the 3′ side of the non-target adenosine residue. In some embodiments, the bulge may contain one or more (e.g., 2, 3, 4, 5 or more) mismatched nucleotides directly flanking the 5′ and/or the 3′ side of the non-target adenosine residue. In some embodiments, the bulge comprises an unpaired non-target adenosine, one or more unpaired nucleotides flanking the 5′ and/or the 3′ side of the non-target adenosine residue, and one or more mismatched nucleotides flanking the 5′ and/or the 3′ side of the non-target adenosine residue. In some embodiments, the bulge is 1 nt, 2 nt, 3 nt, or longer.
In some embodiments, the duplex RNA comprises two or more bulges, such as any one of 2, 3, 4, 5, 6, or more bulges, wherein each bulge comprises a non-target adenosine in the target RNA. In some embodiments, the duplex RNA comprises a bulge at each non-target adenosine in the target RNA.
In some embodiments, the dsRNA formed by hybridization between the dRNA and the target RNA does not comprise a mismatch. In some embodiments, the dsRNA formed by hybridization between the dRNA and the target RNA comprises one or more, such as any one of 1, 2, 3, 4, 5, 6, 7 or more mismatches (e.g., the same type or different types of mismatches). In some embodiments, the dsRNA formed by hybridization between the dRNA and the target RNA comprises one or more kinds of mismatches, for example, 1, 2, 3, 4, 5, 6, 7 kinds of mismatches selected from the group consisting of G-A, C-A, U-C, A-A, G-G, C-C and U-U.
In some embodiments, the mismatch is upstream (5′) or downstream (3′) of the target adenosine, which may promote the efficiency of editing of the target adenosine at the target RNA. In some embodiments, the duplex RNA has additional mismatches upstream and/or downstream of the target adenosine.
In some embodiments, the targeting RNA sequence further comprises one or more guanosines, such as 1, 2, 3, 4, 5, 6, or more Gs, each of which is directly opposite a non-target adenosine in the target RNA. In some embodiments, the targeting RNA sequence comprises two or more consecutive mismatch nucleotides (e.g., 2, 3, 4, 5, or more mismatch nucleotides) opposite a non-target adenosine in the target RNA.
In some embodiments, the dRNA comprises a targeting RNA sequence comprising a G opposite one or more non-target adenosines in the target RNA, and lacks a nucleotide opposite one or more non-target adenosines in the target RNA. The duplex RNA may have one or more bulges comprising one or more non-target adenosines, and the dRNA comprises Gs opposite the other non-target adenosines in the target RNA.
In some embodiments, the target RNA comprises no more than about 20 non-target As, such as no more than about any one of 15, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 non-target A. The U deletions and/or Gs opposite non-target As together with flanking mismatch or unpaired nucleotides in the dRNA may reduce off-target editing effects by ADAR.
In some embodiments, the dRNA comprises one or more linker nucleic acid sequences (also referred to herein as “flanking linker sequence”) flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA. The inventors of the present application discovered that inclusion of a flanking linker sequence may increase editing efficiency of a target adenosine in a target RNA. Without being bound by any theory, it is hypothesized that the linker nucleic acid sequence may increase flexibility of a circular dRNA, which may promote hybridization of the targeting RNA sequence to the target RNA.
In some embodiments, the dRNA comprises a single linker nucleic acid sequence. In some embodiments, the dRNA comprises a linker nucleic acid sequence at the 5′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a linker nucleic acid sequence at the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA comprises a first linker nucleic acid sequence at the 5′ end of the targeting RNA sequence, and a second linker nucleic acid sequence at the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA is a circular RNA comprising a linker nucleic acid sequence connecting directly or indirectly the 5′ end and the 3′ end of the targeting RNA sequence. The first linker nucleic acid sequence and the second linker nucleic acid sequence may have the same or different sequences.
In some embodiments, the linker nucleic acid sequence (including the first linker nucleic acid sequence and the second linker nucleic acid sequence) is at least about any one of 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, or 500 nt long. In some embodiments, the linker nucleic acid sequence (including the first linker nucleic acid sequence and the second linker nucleic acid sequence) is no more than about any one of 500, 450, 400, 350, 300, 250, 200, 150, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10, or 5 nt long. In some embodiments, the linker nucleic acid sequence (including the first linker nucleic acid sequence and the second linker nucleic acid sequence) is about any one of 5-10, 10-20, 20-50, 5-50, 10-100, 50-100, 100-200, 200-300, 300-400, 400-500, 5-100, 5-200, 5-300, 5-400, 5-500, 50-200, 50-300, 50-400 or 50-500 nt long. In some embodiments, the linker nucleic acid sequence (including the first linker nucleic acid sequence and the second linker nucleic acid sequence) is about 50 nt long. In some embodiments, the first linker nucleic acid sequence and the second linker nucleic acid sequence have the same length. In some embodiments, the first linker nucleic acid sequence and the second linker nucleic acid sequence have different lengths.
The linker nucleic acid sequence (including the first linker nucleic acid sequence and the second linker nucleic acid sequence) does not substantially form any secondary structure with any part of the dRNA. Computational tools are known in the art to predict the secondary structure of RNAs, including, for example, RNAfold. In some embodiments, the linker nucleic acid sequence does not form a duplex region with a portion of the targeting RNA sequence that is more than about any one of 3, 4, 5, 6, or more basepairs long. In some embodiments, the linker nucleic acid sequence does not contain complementary regions more than 3, 4, 5, or 6 nucleotides long. In some embodiments, the first linker nucleic acid sequence does not have a complementary region more than 3, 4, 5, or 6 nucleotides long with respect to the second linker nucleic acid sequence.
The linker nucleic acid sequence (including the first linker nucleic acid sequence and the second linker nucleic acid sequence) could be a mononucleotide or dinucleotide repeat sequence, or a random sequence. In some embodiments, the linker nucleic acid sequence comprises a polyadenosine (polyA), polyguanosine (polyG), or polycytosine (polyC) sequence. In some embodiments, at least 50% of the linker nucleic acid sequence comprises adenosine. In some embodiments, the linker nucleic acid sequence comprises a dinucleotide repeat sequence, such as an AT or TA repeat sequence. In some embodiments, the linker nucleic acid sequence comprises (AT)n, wherein n is an integer greater than or equal to 3. In some embodiments, the linker nucleic acid sequence comprises SEQ ID NO: 22.
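As a rough illustration of the duplex-length criterion above, the following Python sketch computes the longest uninterrupted Watson-Crick stretch that a candidate linker could form with a targeting RNA sequence. It is only a crude proxy and not a structure predictor such as RNAfold: it ignores G-U wobble pairs, loops, and thermodynamics. The function names, the default threshold, and the example sequences are hypothetical and are not taken from the present application.

```python
# Watson-Crick complements for RNA (G-U wobble pairs are deliberately ignored).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def longest_complementary_stretch(seq_a: str, seq_b: str) -> int:
    """Length of the longest contiguous antiparallel Watson-Crick duplex
    that seq_a could form with seq_b.

    A substring of seq_a can pair with a region of seq_b exactly when it
    also occurs in the reverse complement of seq_b, so this is a
    longest-common-substring search (dynamic programming, two rows).
    """
    rc_b = reverse_complement(seq_b)
    best = 0
    prev = [0] * (len(rc_b) + 1)
    for i in range(1, len(seq_a) + 1):
        cur = [0] * (len(rc_b) + 1)
        for j in range(1, len(rc_b) + 1):
            if seq_a[i - 1] == rc_b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def linker_is_acceptable(linker: str, targeting: str, max_bp: int = 3) -> bool:
    """True if the linker cannot form a duplex longer than max_bp base pairs
    with the targeting sequence (max_bp is an assumed, adjustable threshold)."""
    return longest_complementary_stretch(linker, targeting) <= max_bp


if __name__ == "__main__":
    # Hypothetical sequences: a polyA linker against a U-rich targeting stretch.
    print(longest_complementary_stretch("AAAAA", "CUUUUG"))
    print(linker_is_acceptable("AAAAAAAAAA", "GGCCAC"))
```

The same function can be applied to a linker against itself, or to the first linker against the second linker, to screen for the other complementary regions mentioned above.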
In some embodiments, the linker nucleic acid sequence serves as a ligation sequence that connects the 5′ end and the 3′ end of the targeting RNA sequence in a circular dRNA.
ADAR enzymes, for example human ADAR enzymes, edit double stranded RNA (dsRNA) structures with varying specificity, depending on a number of factors. One important factor is the degree of complementarity of the two strands making up the dsRNA sequence. Perfect complementarity between the dRNA and the target RNA usually causes the catalytic domain of ADAR to deaminate adenosines in a non-discriminative manner. The specificity and efficiency of ADAR can be modified by introducing mismatches in the dsRNA region. For example, an A-C mismatch is recommended to increase the specificity and efficiency of deamination of the adenosine to be edited. Perfect complementarity is not necessarily required for dsRNA formation between the dRNA and its target RNA, provided there is substantial complementarity for hybridization and formation of the dsRNA between the dRNA and the target RNA. In some embodiments, the dRNA sequence or single-stranded RNA region thereof has at least about any one of 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of sequence complementarity to the target RNA, when optimally aligned. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, and algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows-Wheeler Aligner).
The nucleotides neighboring the target adenosine also affect the specificity and efficiency of deamination. For example, the 5′ nearest neighbor of the target adenosine to be edited in the target RNA sequence has the preference U>C≈A>G and the 3′ nearest neighbor of the target adenosine to be edited in the target RNA sequence has the preference G>C>A≈U in terms of specificity and efficiency of deamination of adenosine. In some embodiments, when the target adenosine is in a three-base motif selected from the group consisting of UAG, UAC, UAA, UAU, CAG, CAC, CAA, CAU, AAG, AAC, AAA, AAU, GAG, GAC, GAA and GAU in the target RNA, the specificity and efficiency of deamination of adenosine are higher than adenosines in other three-base motifs. In some embodiments, where the target adenosine to be edited is in the three-base motif UAG, UAC, UAA, UAU, CAG, CAC, AAG, AAC or AAA, the efficiency of deamination of adenosine is much higher than adenosines in other motifs. With respect to the same three-base motif, different designs of dRNA may also lead to different deamination efficiency. Taking the three-base motif UAG as an example, in some embodiments, when the dRNA comprises cytidine (C) directly opposite the target adenosine to be edited, adenosine (A) directly opposite the uridine, and cytidine (C), guanosine (G) or uridine (U) directly opposite the guanosine, the efficiency of deamination of the target adenosine is higher than that using other dRNA sequences. In some embodiments, when the dRNA comprises ACC, ACG or ACU opposite UAG of the target RNA, the editing efficiency of the A in the UAG of the target RNA may reach about 25%-90% (e.g., about 25%-80%, 25%-70%, 25%-60%, 25%-50%, 25%-40%, or 25%-30%).
Besides the target adenosines, there may be one or more adenosines in the target RNA (referred to herein as “non-target A”), which are not desirable to be edited. With respect to these adenosines, it is preferable to reduce their editing efficiency as much as possible. The inventors of the present application discovered that deletion of a U opposite a non-target A, which results in formation of a bulge having an unpaired non-target A in the dRNA-target RNA duplex, significantly reduces off-target editing at the non-target A. The dRNA may further contain one or more unpaired nucleotides and/or one or more mismatched nucleotides that directly flank the 5′ or 3′ side of a non-target adenosine. As used herein, the term “unpaired” refers to a nucleotide in a first strand of a duplex nucleic acid that does not basepair with any nucleotide in a second strand of the duplex nucleic acid. In some embodiments, where guanosine is directly opposite an adenosine in the target RNA, the deamination efficiency is significantly decreased. Therefore, in order to decrease off-target deamination, dRNAs can be designed to have deletion of one or more nucleotides (e.g., U) opposite a first non-target adenosine, and/or to have a guanosine directly opposite a second non-target adenosine in the target RNA.
The desired level of specificity and efficiency of editing the target RNA sequence may depend on different applications. Following the instructions in the present patent application, those of skill in the art will be capable of designing a dRNA having complementary or substantially complementary sequence to the target RNA sequence according to their needs, and, with some trial and error, obtain their desired results. As used herein, the term “mismatch” refers to opposing nucleotides in a double stranded RNA (dsRNA) which do not form perfect base pairs according to the Watson-Crick base pairing rules. Mismatch base pairs include, for example, G-A, C-A, U-C, A-A, G-G, C-C, U-U base pairs. Taking the A-C mismatch as an example, where a target adenosine residue is to be edited in the target RNA, a dRNA is designed to comprise a C opposite the A to be edited, generating an A-C mismatch in the dsRNA formed by hybridization between the target RNA and dRNA.
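The nearest-neighbor preferences above can be tabulated programmatically when surveying a target RNA. The following Python sketch lists every internal adenosine in a sequence together with its three-base motif and a rank for each neighbor (0 being the most preferred, with equal ranks for neighbors related by ≈). The numeric ranks are merely an encoding of the stated orderings U>C≈A>G and G>C>A≈U; the function name and the example sequence are hypothetical, and the sketch makes no prediction of actual editing efficiency.

```python
# Rank encoding of the stated preferences (0 = most preferred).
FIVE_PRIME_RANK = {"U": 0, "C": 1, "A": 1, "G": 2}   # U > C ≈ A > G
THREE_PRIME_RANK = {"G": 0, "C": 1, "A": 2, "U": 2}  # G > C > A ≈ U


def adenosine_contexts(target: str):
    """List (index, motif, 5' rank, 3' rank) for each internal adenosine.

    target is an RNA sequence written 5'->3'; index is 0-based.
    Adenosines at the very ends are skipped because they lack a neighbor.
    """
    contexts = []
    for i in range(1, len(target) - 1):
        if target[i] == "A":
            motif = target[i - 1:i + 2]
            contexts.append(
                (i, motif, FIVE_PRIME_RANK[motif[0]], THREE_PRIME_RANK[motif[2]])
            )
    return contexts


if __name__ == "__main__":
    # Hypothetical target fragment with three adenosines.
    for index, motif, rank_5, rank_3 in adenosine_contexts("GUAGCAAU"):
        print(index, motif, rank_5, rank_3)
```

How the two ranks should be combined into a single score is not specified in the text above, so the sketch reports them separately.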
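The design rules discussed above (a C opposite the target adenosine, a deleted nucleotide opposite some non-target adenosines, and a G opposite others) can be sketched in Python as follows. This is an illustrative sketch under simplifying assumptions: it handles only single-nucleotide deletions, does not add flanking mismatches, and the function name, parameters, and example sequence are hypothetical rather than taken from the present application.

```python
# Watson-Crick complements for RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def design_targeting_rna(target, target_pos, delete_opposite=(), g_opposite=()):
    """Return a targeting RNA sequence written 5'->3'.

    target:          target RNA region, written 5'->3' (A/C/G/U only).
    target_pos:      0-based index of the adenosine to be edited; a C is
                     placed opposite it to create an A-C mismatch.
    delete_opposite: indices of non-target adenosines whose opposing
                     nucleotide is omitted, leaving them unpaired (bulge).
    g_opposite:      indices of non-target adenosines that receive a G
                     opposite instead of a U.
    """
    special = {target_pos, *delete_opposite, *g_opposite}
    for i in special:
        if target[i] != "A":
            raise ValueError(f"position {i} is {target[i]}, not an adenosine")

    opposite = []  # nucleotides opposite each target position, in target order
    for i, base in enumerate(target):
        if i == target_pos:
            opposite.append("C")
        elif i in delete_opposite:
            continue  # nothing opposite: this adenosine stays unpaired
        elif i in g_opposite:
            opposite.append("G")
        else:
            opposite.append(COMPLEMENT[base])
    # The strands are antiparallel, so reverse to write the result 5'->3'.
    return "".join(reversed(opposite))


if __name__ == "__main__":
    # Hypothetical 7-nt target: edit the A at index 3, bulge the A at
    # index 1, and place a G opposite the A at index 6.
    print(design_targeting_rna("GAUAGCA", 3, delete_opposite={1}, g_opposite={6}))
```

In practice the targeting RNA sequence would be far longer than this toy example, and candidate designs would still need to be evaluated experimentally.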
The targeting RNA sequence in the dRNA is single-stranded. The dRNA may be entirely single-stranded or have one or more (e.g., 1, 2, 3, or more) double-stranded regions and/or one or more stem loop regions.
The dRNAs described herein comprise a targeting RNA sequence that is at least partially complementary to the target RNA. In certain embodiments, the targeting RNA sequence in the dRNA is at least about any one of 40, 45, 50, 55, 60, 65, 70, 75, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, or 250 nucleotides (nt) long. In certain embodiments, the targeting RNA sequence in the dRNA is no more than about any one of 40, 45, 50, 55, 60, 65, 70, 75, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, or 250 nucleotides long. In certain embodiments, the targeting RNA sequence in the dRNA is about any one of 40-260, 45-250, 50-240, 60-230, 65-220, 70-220, 70-210, 70-200, 70-190, 70-180, 70-170, 70-160, 70-150, 70-140, 70-130, 70-120, 70-110, 70-100, 70-90, 70-80, 75-200, 80-190, 85-180, 90-170, 95-160, 100-200, 100-150, 100-175, 110-200, 110-160, 110-175, 110-150, 140-160, 105-140, or 105-155 nucleotides in length. In some embodiments, the targeting RNA sequence in the dRNA is about 100 to about 200 nt long. In some embodiments, the targeting RNA sequence in the dRNA is about 70 nt (e.g., 71 nt) long. In some embodiments, the targeting RNA sequence in the dRNA is about 120 nt (e.g., 121 nt) long. In some embodiments, the targeting RNA sequence in the dRNA is about 150 nt (e.g., 151 nt) long. In some embodiments, the targeting RNA sequence in the dRNA is about 170 nt (e.g., 171 nt) long. In some embodiments, the targeting RNA sequence in the dRNA is about 200 nt (e.g., 201 nt) long. In some embodiments, the targeting RNA sequence in the dRNA is about 220 nt (e.g., 221 nt) long.
In some embodiments, the targeting RNA sequence comprises a cytidine, adenosine or uridine directly opposite the target adenosine residue in the target RNA. In some embodiments, the targeting RNA sequence comprises a cytidine mismatch directly opposite the target adenosine residue in the target RNA. In some embodiments, the cytidine mismatch is located at least 5 nucleotides, e.g., at least 10, 15, 20, 25, 30, or more nucleotides, away from the 5′ end of the targeting RNA sequence. In some embodiments, the cytidine mismatch is located at least 20 nucleotides, e.g., at least 25, 30, 35, or more nucleotides, away from the 3′ end of the targeting RNA sequence. In some embodiments, the cytidine mismatch is not located within 20 (e.g., 15, 10, 5 or fewer) nucleotides of the 3′ end of the targeting RNA sequence. In some embodiments, the cytidine mismatch is located at least 20 nucleotides (e.g., at least 25, 30, 35, or more nucleotides) away from the 3′ end and at least 5 nucleotides (e.g., at least 10, 15, 20, 25, 30, or more nucleotides) away from the 5′ end of the targeting RNA sequence. In some embodiments, the cytidine mismatch is located in the center of the targeting RNA sequence. In some embodiments, the cytidine mismatch is located within 20 nucleotides (e.g., 15, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 nucleotide) of the center of the targeting sequence in the dRNA.
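The positional constraints on the cytidine mismatch can be checked with simple arithmetic. The Python sketch below assumes one particular reading of "away from the end", namely the number of nucleotides lying between the cytidine and that end of the targeting RNA sequence; the function names and default thresholds (5 nt from the 5′ end, 20 nt from the 3′ end) are illustrative choices drawn from the ranges above, not a definitive rule.

```python
def cytidine_position_ok(targeting_len, c_index, min_from_5=5, min_from_3=20):
    """Check the distance of the cytidine mismatch from both ends.

    targeting_len: length of the targeting RNA sequence in nucleotides.
    c_index:       0-based index of the cytidine, counted from the 5' end.
    Distances are counted as the number of nucleotides between the
    cytidine and the respective end (an assumed interpretation).
    """
    from_5 = c_index
    from_3 = targeting_len - 1 - c_index
    return from_5 >= min_from_5 and from_3 >= min_from_3


def offset_from_center(targeting_len, c_index):
    """Absolute distance, in nucleotides, between the cytidine and the
    midpoint of the targeting RNA sequence."""
    return abs(c_index - (targeting_len - 1) / 2)


if __name__ == "__main__":
    # A 71-nt targeting sequence with the cytidine at its center (index 35).
    print(cytidine_position_ok(71, 35), offset_from_center(71, 35))
```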
In certain embodiments, the 5′ nearest neighbor of the target adenosine residue is a nucleotide selected from U, C, A and G with the preference U>C≈A>G and the 3′ nearest neighbor of the target adenosine residue is a nucleotide selected from G, C, A and U with the preference G>C>A≈U. In some embodiments, the 5′ nearest neighbor of the target adenosine residue is U. In some embodiments, the 5′ nearest neighbor of the target adenosine residue is C or A. In some embodiments, the 3′ nearest neighbor of the target adenosine residue is G. In some embodiments, the 3′ nearest neighbor of the target adenosine residue is C.
In some embodiments, the target adenosine residue is in a three-base motif selected from the group consisting of UAG, UAC, UAA, UAU, CAG, CAC, CAA, CAU, AAG, AAC, AAA, AAU, GAG, GAC, GAA and GAU in the target RNA. In some embodiments, the three-base motif is UAG, and the dRNA comprises an A directly opposite the U in the three-base motif, a C directly opposite the target A, and a C, G or U directly opposite the G in the three-base motif. In certain embodiments, the three-base motif is UAG in the target RNA, and the dRNA comprises ACC, ACG or ACU that is opposite the UAG of the target RNA. In certain embodiments, the three-base motif is UAG in the target RNA, and the dRNA comprises ACC that is opposite the UAG of the target RNA.
In some embodiments, the dRNA, apart from the targeting RNA sequence, further comprises regions for stabilizing the dRNA, for example, one or more double-stranded regions and/or stem loop regions. In some embodiments, the double-stranded region or stem loop region of the dRNA comprises no more than about any one of 200, 150, 100, 50, 40, 30, 20, 10 or fewer base-pairs. In some embodiments, the dRNA does not comprise a stem loop or double-stranded region. In some embodiments, the dRNA comprises an ADAR-recruiting domain. In some embodiments, the dRNA does not comprise an ADAR-recruiting domain.
The dRNA may comprise one or more modifications. In some embodiments, the dRNA has one or more modified nucleotides, including nucleobase modification and/or backbone modification. Exemplary modifications to the RNA include, but are not limited to, phosphorothioate backbone modification, 2′-substitutions in the ribose (such as 2′-O-methyl and 2′-fluoro substitutions), LNA, and L-RNA.
In some embodiments, the dRNA does not comprise chemical modifications. In some embodiments, the dRNA does not comprise a chemically modified nucleotide, such as 2′-O-methyl nucleotide or a nucleotide having a phosphorothioate linkage. In some embodiments, the dRNA comprises 2′-O-methyl and phosphorothioate linkage modifications only at the first three and last three residues. In some embodiments, the dRNA is not an antisense oligonucleotide (ASO).
The dRNA may further comprise one or more additional expression elements that facilitate expression and/or circularization of the dRNA.
In some embodiments, the dRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the targeting RNA sequence, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the targeting RNA sequence. In some embodiments, the Group I catalytic intron of the T4 phage Td gene is bisected in such a way to preserve structural elements critical for ribozyme folding. Exon fragment 2 is then ligated upstream of exon fragment 1, and a targeting RNA sequence (optionally with linker nucleic acid sequence(s) flanking the 5′ and/or the 3′ ends) is inserted between the exon-exon junction.
In some embodiments, the dRNA is a linear RNA that is capable of forming a circular RNA. In some embodiments, the circularization is performed using the Tornado expression system (“Twister-optimized RNA for durable overexpression”) as described in Litke, J. L. & Jaffrey, S. R. Highly efficient expression of circular RNA aptamers in cells using autocatalytic transcripts. Nat Biotechnol 37, 667-675 (2019), which is hereby incorporated herein by reference in its entirety. Briefly, Tornado-expressed transcripts contain an RNA of interest flanked by Twister ribozymes. A Twister ribozyme is a catalytic RNA sequence that is capable of self-cleavage. The ribozymes rapidly undergo autocatalytic cleavage, leaving termini that are ligated by an RNA ligase.
In some embodiments, the dRNA comprises a targeting RNA sequence flanked (directly or indirectly) by 5′ and/or 3′ ligation sequences. In some embodiments, the dRNA comprises a 3′ ligation sequence. In some embodiments, the dRNA comprises a 5′ ligation sequence. In some embodiments, the dRNA comprises a 3′ ligation sequence and a 5′ ligation sequence. In some embodiments, the 3′ ligation sequence and the 5′ ligation sequence are at least partially complementary to each other. In some embodiments, the 3′ ligation sequence and the 5′ ligation sequence are at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98% or at least about 99% complementary to each other. In some embodiments, the 3′ ligation sequence and the 5′ ligation sequence are fully complementary to each other. In some embodiments, the 5′ and/or 3′ ligation sequences are further flanked by a 5′-Twister ribozyme and/or a 3′-Twister ribozyme, respectively.
In some embodiments, the dRNA is a linear RNA capable of forming a circular RNA, wherein the dRNA comprises, from the 5′ to the 3′: a 5′ ligation sequence, a first linker nucleic acid sequence, a targeting RNA sequence, a second linker nucleic acid sequence, and a 3′ ligation sequence. In some embodiments, the dRNA is a linear RNA capable of forming a circular RNA, wherein the dRNA comprises, from the 5′ to the 3′: a 5′ ligation sequence, a linker nucleic acid sequence, a targeting RNA sequence, and a 3′ ligation sequence. In some embodiments, the dRNA is a linear RNA capable of forming a circular RNA, wherein the dRNA comprises, from the 5′ to the 3′: a 5′ ligation sequence, a targeting RNA sequence, a linker nucleic acid sequence, and a 3′ ligation sequence. In some embodiments, the dRNA is a linear RNA capable of forming a circular RNA, wherein the dRNA comprises, from the 5′ to the 3′: a 5′ ligation sequence, a targeting RNA sequence, and a 3′ ligation sequence. In some embodiments, the 3′ ligation sequence comprises SEQ ID NO: 11. In some embodiments, the 5′ ligation sequence comprises SEQ ID NO: 12.
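The 5′-to-3′ arrangements listed above amount to concatenating named segments in a fixed order, with the linkers being optional. The following Python sketch shows that ordering. All sequences in the example are short hypothetical placeholders; they are not SEQ ID NO: 11, SEQ ID NO: 12, or any other sequence of the present application, and the function name is likewise an assumption.

```python
RNA_ALPHABET = set("ACGU")


def assemble_linear_drna(targeting, ligation_5, ligation_3,
                         linker_5="", linker_3=""):
    """Concatenate dRNA segments in 5'->3' order:

        5' ligation - [first linker] - targeting - [second linker] - 3' ligation

    Either linker may be left empty to obtain the single-linker or
    linker-free arrangements.
    """
    segments = [ligation_5, linker_5, targeting, linker_3, ligation_3]
    for segment in segments:
        if not set(segment) <= RNA_ALPHABET:
            raise ValueError(f"non-RNA character in segment: {segment!r}")
    return "".join(segments)


if __name__ == "__main__":
    # Hypothetical placeholder sequences, for illustration only.
    precursor = assemble_linear_drna(
        targeting="GGCCAC",
        ligation_5="GCGC",
        ligation_3="CGCG",
        linker_5="AAAAA",
        linker_3="AAAAA",
    )
    print(precursor)
```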
In some embodiments, the dRNA is a circular RNA comprising a ligation sequence connecting directly or indirectly the 5′ end and the 3′ end of the targeting RNA sequence. In some embodiments, the ligation sequence comprises a 5′ ligation sequence and a 3′ ligation sequence that are ligated to each other via a ligase, e.g., T4 RNA ligase such as Rnl1 or Rnl2. In some embodiments, the ligation sequence comprises SEQ ID NO: 10.
In some embodiments, the dRNA is a circular RNA comprising in a clockwise direction: a ligation sequence, a first linker nucleic acid sequence, the targeting RNA sequence, and a second linker nucleic acid sequence, wherein the ligation sequence directly connects the 5′ end of the first linker nucleic acid sequence to the 3′ end of the second linker nucleic acid sequence. In some embodiments, the dRNA is a circular RNA comprising in a clockwise direction: a ligation sequence, a linker nucleic acid sequence, and the targeting RNA sequence, wherein the ligation sequence directly connects the 5′ end of the linker nucleic acid sequence to the 3′ end of the targeting RNA sequence. In some embodiments, the dRNA is a circular RNA comprising in a clockwise direction: a ligation sequence, the targeting RNA sequence, and a linker nucleic acid sequence, wherein the ligation sequence directly connects the 5′ end of the targeting RNA sequence to the 3′ end of the linker nucleic acid sequence. In some embodiments, the dRNA is a circular RNA comprising a ligation sequence and the targeting RNA sequence, wherein the ligation sequence directly connects the 5′ end of the targeting RNA sequence to the 3′ end of the targeting RNA sequence.
In some embodiments, the 3′ ligation sequence and the 5′ ligation sequence are independently at least about 20 nucleotides, at least about 25 nucleotides, at least about 30 nucleotides, at least about 35 nucleotides, at least about 40 nucleotides, at least about 45 nucleotides, at least about 50 nucleotides, at least about 55 nucleotides, at least about 60 nucleotides, at least about 65 nucleotides, at least about 70 nucleotides, at least about 75 nucleotides, at least about 80 nucleotides, at least about 85 nucleotides, at least about 90 nucleotides, at least about 95 nucleotides or at least about 100 nucleotides in length. In some embodiments, the 3′ ligation sequence and the 5′ ligation sequence are independently about 20-30 nucleotides, about 30-40 nucleotides, about 40-50 nucleotides, about 50-60 nucleotides, about 60-70 nucleotides, about 70-80 nucleotides, about 80-90 nucleotides, about 90-100 nucleotides, about 100-125 nucleotides, about 125-150 nucleotides, about 20-50 nucleotides, about 50-100 nucleotides or about 100-150 nucleotides in length.
In some embodiments, the dRNA is circularized by an RNA ligase. Non-limiting examples of RNA ligase include: RtcB, T4 RNA Ligase 1 (Rnl1), T4 RNA Ligase 2 (Rnl2), Rnl3 and Trl1. In some embodiments, the RNA ligase is expressed endogenously in the host cell. In some embodiments, the RNA ligase is RNA ligase RtcB. In some embodiments, the method further comprises introducing an RNA ligase (e.g., RtcB) into the host cell.
In some embodiments, the dRNA is circularized before being introduced to the host cell. In some embodiments, the dRNA is chemically synthesized. In some embodiments, the dRNA is circularized through in vitro enzymatic ligation (e.g., using RNA or DNA ligase) or chemical ligation (e.g., using cyanogen bromide or a similar condensing agent).
The dRNAs described herein do not comprise a tracrRNA, crRNA or gRNA used in a CRISPR/Cas system. In some embodiments, the dRNA does not comprise an ADAR-recruiting domain. “ADAR-recruiting domain” can be a nucleotide sequence or structure that binds with high affinity to ADAR, or a nucleotide sequence that binds to a binding partner fused to ADAR in an engineered ADAR construct. Exemplary ADAR-recruiting domains include, but are not limited to, GluR-2, GluR-B (R/G), GluR-B (Q/R), GluR-6 (R/G), 5HT2C, and FlnA (Q/R) domain; see, for example, Wahlstedt, Helene, and Marie Öhman, “Site-selective versus promiscuous A-to-I editing.” Wiley Interdisciplinary Reviews: RNA 2.6 (2011): 761-771, which is incorporated herein by reference in its entirety. In some embodiments, the dRNA does not comprise a double-stranded portion. In some embodiments, the dRNA does not comprise a hairpin, such as MS2 stem loop. In some embodiments, the dRNA is single stranded.
In some embodiments, the dRNA comprises a snoRNA sequence linked to the 5′ end of the targeting RNA sequence (“5′ snoRNA sequence”). In some embodiments, the dRNA comprises a snoRNA sequence linked to the 3′ end of the targeting RNA sequence (“3′ snoRNA sequence”). In some embodiments, the dRNA comprises a snoRNA sequence linked to the 5′ end of the targeting RNA sequence (“5′ snoRNA sequence”) and a snoRNA sequence linked to the 3′ end of the targeting RNA sequence (“3′ snoRNA sequence”). In some embodiments, the snoRNA sequence is at least about 50 nucleotides, at least about 60 nucleotides, at least about 70 nucleotides, at least about 80 nucleotides, at least about 90 nucleotides, at least about 100 nucleotides, at least about 110 nucleotides, at least about 120 nucleotides, at least about 130 nucleotides, at least about 140 nucleotides, at least about 150 nucleotides, at least about 160 nucleotides, at least about 170 nucleotides, at least about 180 nucleotides, at least about 190 nucleotides or at least about 200 nucleotides in length. In some embodiments, the snoRNA sequence is about 50-75 nucleotides, about 75-100 nucleotides, about 100-125 nucleotides, about 125-150 nucleotides, about 150-175 nucleotides, about 175-200 nucleotides, about 50-100 nucleotides, about 100-150 nucleotides, about 150-200 nucleotides, about 125-175 nucleotides, or about 100-200 nucleotides in length. In some embodiments, the snoRNA sequence is a C/D Box snoRNA sequence. In some embodiments, the snoRNA sequence is an H/ACA Box snoRNA sequence. In some embodiments, the snoRNA sequence is a composite C/D Box and H/ACA Box snoRNA sequence. In some embodiments, the snoRNA sequence is an orphan snoRNA sequence.
Small nucleolar RNAs (snoRNAs) are small non-coding RNA molecules that are known to guide chemical modifications of other RNAs such as ribosomal RNAs, transfer RNAs, and small nuclear RNAs. There are two major groups of snoRNAs according to their specific secondary structure features: box C/D and box H/ACA. Both structural features of snoRNAs enable them to bind to corresponding RNA binding proteins (RBPs) along with accessory proteins, forming functional small nucleolar ribonucleoprotein (snoRNP) complexes. Box C/D snoRNAs are believed to be associated with methylation, while box H/ACA snoRNAs are believed to be associated with pseudouridylation. Other families of snoRNAs include, for example, composite H/ACA and C/D box snoRNAs and orphan snoRNAs. The snoRNA sequence described herein can comprise a naturally-occurring snoRNA, a portion thereof, or a variant thereof.
The present application provides constructs encoding the dRNAs and/or ADAR. In some embodiments, there is provided a construct (e.g., vector, such as viral vector) comprising a nucleotide sequence encoding the dRNA. In some embodiments, there is provided a construct (e.g., vector, such as viral vector) comprising a nucleotide sequence encoding the ADAR. In some embodiments, there is provided a construct comprising a first nucleotide sequence encoding the dRNA and a second nucleotide sequence encoding the ADAR. In some embodiments, the first nucleotide sequence and the second nucleotide sequence are operably linked to the same promoter. In some embodiments, the first nucleotide sequence and the second nucleotide sequence are operably linked to different promoters. In some embodiments, the promoter is inducible. In some embodiments, the construct does not encode for the ADAR. In some embodiments, the vector further comprises nucleic acid sequence(s) encoding an inhibitor of ADAR3 (e.g., ADAR3 shRNA or siRNA) and/or a stimulator of interferon (e.g., IFN-α).
The term “construct” as used herein refers to DNA or RNA molecules that comprise a coding nucleic acid sequence that can be transcribed into RNAs or expressed into proteins. In some embodiments, the construct contains one or more regulatory elements operably linked to the nucleic acid sequence encoding the RNA or protein. When the construct is introduced into a host cell, under suitable conditions, the coding nucleic acid sequence in the construct can be transcribed or expressed.
The constructs described herein may comprise a promoter that is operably linked to the nucleic acid sequence encoding the dRNA, such that the promoter controls the transcription or expression of the coding nucleotide sequence. The promoter may be positioned 5′ (upstream) of a coding nucleotide sequence under its control. The distance between the promoter and the coding sequence may be approximately the same as the distance between that promoter and the gene it controls in the gene from which the promoter is derived. As is known in the art, variation in this distance may be accommodated without loss of promoter function. In some embodiments, the construct comprises a 5′ UTR and/or a 3′UTR that regulates the transcription or expression of the coding nucleotide sequence. In some embodiments, the promoter is driving expression of two or more dRNAs.
The promoter may be a polymerase II promoter (“Pol II promoter”) or a polymerase III promoter (“Pol III promoter”). In some embodiments, wherein the dRNA is a linear RNA, the construct comprises a Pol II promoter operably linked to a nucleic acid sequence encoding the dRNA. Non-limiting examples of Pol II promoters include: CMV, SV40, EF-1α, CAG and RSV. In some embodiments, the Pol II promoter is a CMV promoter. In some embodiments, the CMV promoter comprises the nucleic acid sequence of SEQ ID NO: 5.
In some embodiments, wherein the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA, the construct comprises a Pol III promoter. In some embodiments, the promoter is a U6 promoter. In some embodiments, the U6 promoter comprises the nucleic acid sequence of SEQ ID NO: 6.
In some embodiments, the construct is a vector encoding any one of the dRNAs disclosed in the present application. The term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, no free ends (e.g. circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. One type of vector is a “plasmid,” which refers to a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the transcription or expression of coding nucleotide sequences to which they are operatively linked. Such vectors are referred to herein as “expression vectors”.
In some embodiments, the construct is a viral vector. In some embodiments, the construct is a lentiviral vector. In some embodiments, the vector is a recombinant adeno-associated virus (rAAV) vector. Use of any AAV serotype is considered within the scope of the present disclosure. In some embodiments, the rAAV vector is a vector derived from an AAV serotype, including without limitation, AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAVrh8, AAVrh8R, AAV9, AAV10, AAVrh10, AAV11, AAV12, AAV2R471A, AAV DJ, a goat AAV, bovine AAV, or mouse AAV capsid serotype or the like. In some embodiments, the construct is flanked by one or more AAV inverted terminal repeat (ITR) sequences. In some embodiments, the construct is flanked by two AAV ITRs. In some embodiments, the AAV ITRs are AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAVrh8, AAVrh8R, AAV9, AAV10, AAVrh10, AAV11, AAV12, AAV2R471A, AAV DJ, a goat AAV, bovine AAV, or mouse AAV serotype ITRs. In some embodiments, the AAV ITRs are AAV2 ITRs.
In some embodiments, the vector further comprises a stuffer nucleic acid. In some embodiments, the stuffer nucleic acid is located upstream or downstream of the nucleic acid encoding the dRNA. In some embodiments, the vector is a self-complementary rAAV vector. In some embodiments, the vector comprises a first nucleic acid sequence encoding the dRNA and a second nucleic acid sequence encoding a complement of the dRNA, wherein the first nucleic acid sequence can form intrastrand base pairs with the second nucleic acid sequence along most or all of its length. In some embodiments, the first nucleic acid sequence and the second nucleic acid sequence are linked by a mutated AAV ITR, wherein the mutated AAV ITR comprises a deletion of the D region and comprises a mutation of the terminal resolution sequence. In some embodiments, the vector is encapsidated in a rAAV particle. In some embodiments, the AAV viral particle comprises an AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAVrh8, AAVrh8R, AAV9, AAV10, AAVrh10, AAV11, AAV12, AAV2R471A, AAV2/2-7m8, AAV DJ, AAV2 N587A, AAV2 E548A, AAV2 N708A, AAV2 V708K, AAV2-HBKO, AAVDJ8, AAVPHP.B, AAVPHP.eB, AAVBR1, AAVHSC15, AAVHSC17, goat AAV, AAV1/AAV2 chimeric, bovine AAV, mouse AAV, or rAAV2/HBoV1 serotype capsid.
In some embodiments, the construct further comprises a 3′ twister ribozyme sequence linked to the 3′ end of the nucleic acid sequence encoding the dRNA. In some embodiments, the construct further comprises a 5′ twister ribozyme sequence linked to the 5′ end of the nucleic acid sequence encoding the dRNA. In some embodiments, the construct further comprises a 3′ twister ribozyme sequence linked to the 3′ end of the nucleic acid sequence encoding the dRNA and a 5′ twister ribozyme sequence linked to the 5′ end of the nucleic acid sequence encoding the dRNA. In some embodiments, the 3′ twister sequence is twister P3 U2A and the 5′ twister sequence is twister P1. In some embodiments, the 5′ twister sequence is twister P3 U2A and the 3′ twister sequence is twister P1. In some embodiments, the dRNA undergoes autocatalytic cleavage. In some embodiments, the catalyzed dRNA product comprises a 5′-hydroxyl group and a 2′,3′-cyclic phosphate at the 3′ terminus.
The dRNAs described herein may be prepared using any known methods in the art, including chemical synthesis and in vitro transcription. Circular dRNAs may be prepared by chemical ligation, enzymatic ligation, or ribozyme autocatalysis of linear RNAs. In some embodiments, the circular dRNA is prepared by circularizing a linear RNA in vitro.
In some embodiments, the present application provides a linear RNA capable of forming the circular dRNA of any one of the embodiments described above. In some embodiments, the linear RNA can be circularized by chemical circularization methods using cyanogen bromide or a similar condensing agent. In some embodiments, the linear RNA can be circularized by autocatalysis of a Group I intron comprising a 5′ catalytic Group I intron fragment and a 3′ catalytic Group I intron fragment. In some embodiments, the linear RNA can be circularized by a ligase. In some embodiments, the linear RNA can be circularized by a T4 RNA ligase. In some embodiments, the linear RNA can be circularized by a DNA ligase. Suitable ligases include, but are not limited to, a T4 DNA ligase (T4 Dnl), a T4 RNA ligase 1 (T4 Rnl1) and a T4 RNA ligase 2 (T4 Rnl2). The circular dRNA may be purified using known methods in the art, for example, by gel-purification or by high-performance liquid chromatography (HPLC).
In some embodiments, a linear RNA can be circularized by chemical methods to provide a circular dRNA. In some chemical methods, the 5′-end and the 3′-end of the nucleic acid (e.g., a linear circular polyribonucleotide) includes chemically reactive groups that, when close together, may form a new covalent linkage between the 5′-end and the 3′-end of the molecule. The 5′-end may contain an NHS ester reactive group and the 3′-end may contain a 3′-amino terminated nucleotide such that in an organic solvent the 3′-amino-terminated nucleotide on the 3′-end of a linear RNA molecule will undergo a nucleophilic attack on the 5′-NHS-ester moiety forming a new 5′-/3′-amide bond.
In some embodiments, a circular dRNA can be obtained by circularizing a linear RNA by ribozyme autocatalysis. In some embodiments, the linear RNA is circularized in vitro. In some embodiments, circularization by ribozyme autocatalysis comprises (a) subjecting the linear RNA to a condition that activates autocatalysis of the Group I intron (or 5′ and 3′ catalytic Group I intron fragments thereof) to provide a circularized RNA product; and (b) isolating the circularized RNA product, thereby providing the circular dRNA.
In some embodiments, the method comprises a step of obtaining the linear RNA by first cloning the sequence encoding the linearized RNAs into a plasmid vector, and then linearizing the recombinant plasmids. In some embodiments, the recombinant plasmids are linearized by restriction enzyme digestion. In some embodiments, the recombinant plasmids are linearized by PCR amplification. In some embodiments, the method further comprises performing in vitro transcription with the linearized plasmid template. In some embodiments, the in vitro transcription is driven by a T7 promoter. In some embodiments, the method further comprises purifying the linear RNA transcripts. In some embodiments, the linear RNAs are purified by gel purification.
In some embodiments, the present application provides a method of cyclizing a linear RNA (e.g., purified linear RNA) by ribozyme autocatalysis of the Group I intron. During splicing, the 3′ hydroxyl group of a guanosine nucleotide engages in a transesterification reaction at the 5′ splice site. The 5′ intron half is excised, and the freed hydroxyl group at the end of the intermediate engages in a second transesterification at the 3′ splice site, resulting in circularization of the intervening region and excision of the 3′ intron. In some embodiments, the condition that activates autocatalysis of the Group I intron or 5′ and 3′ catalytic Group I intron fragments is the addition of GTPs and Mg2+. In some embodiments, there is provided a step of cyclizing the linear RNAs by adding GTPs and Mg2+ at 55° C. for 15 min. In some embodiments, the method further comprises treating with RNase R to digest the linear RNA transcripts. In some embodiments, the method further comprises isolating the circular dRNA. In some embodiments, the step of isolating the circular dRNA comprises gel-purifying the circular dRNA.
In some embodiments, a circular dRNA can be obtained by circularizing a linear RNA using a ligase such as a RNA ligase. In some embodiments, the linear RNA is circularized in vitro. In some embodiments, the linear RNA can be circularized by a T4 RNA ligase. In some embodiments, the linear RNA comprises a 5′ ligation sequence at the 5′ end of the targeting RNA sequence, and a 3′ ligation sequence at the 3′ end of the targeting RNA sequence, wherein the 5′ ligation sequence and the 3′ ligation sequence can be ligated to each other via the RNA ligase. In non-limiting examples, the linear RNA can be circularized by a ligase such as a T4 DNA ligase (T4 Dnl), T4 RNA ligase 1 (T4 Rnl1), and T4 RNA ligase 2 (T4 Rnl2). The linear RNA may be circularized with or without the presence of a single stranded nucleic acid adaptor, e.g., a splint DNA.
In some embodiments, a DNA or RNA ligase may be used to enzymatically link a 5′-phosphorylated nucleic acid molecule (e.g., a linear RNA) to the 3′-hydroxyl group of a nucleic acid (e.g., a linear nucleic acid) forming a new phosphodiester linkage. In an example reaction, a linear circular RNA is incubated at 37° C. for 1 hour with 1-10 units of T4 RNA ligase (New England Biolabs, Ipswich, Mass.) according to the manufacturer's protocol. The ligation reaction may occur in the presence of a linear nucleic acid capable of base-pairing with both the 5′- and 3′-region in juxtaposition to assist the enzymatic ligation reaction. In some embodiments, the ligation is splint ligation. For example, a splint ligase, like SPLINTR® ligase, can be used for splint ligation. For splint ligation, a single stranded polynucleotide (splint), like a single stranded RNA, can be designed to hybridize with both termini of a linear polyribonucleotide, so that the two termini can be juxtaposed upon hybridization with the single-stranded splint. Splint ligase can thus catalyze the ligation of the juxtaposed two termini of the linear polyribonucleotide, generating a circular polyribonucleotide. In some embodiments, a DNA or RNA ligase may be used in the synthesis of a circular dRNA. As a non-limiting example, the ligase may be a circ ligase or circular ligase.
The RNA editing methods and compositions described herein may be used to treat or prevent a disease or condition in an individual, including, but not limited to hereditary genetic diseases and drug resistance.
In some embodiments, there is provided a method of editing a target RNA in a cell of an individual (e.g., human individual) ex vivo, comprising editing the target RNA using any one of the methods of RNA editing described herein.
In some embodiments, there is provided a method of treating or preventing a disease or condition in an individual (e.g., human individual), comprising editing a target RNA associated with the disease or condition in a cell of the individual using any one of the methods of RNA editing described herein, wherein the dRNA comprises a targeting RNA sequence that hybridizes to a target RNA associated with the disease or condition. In some embodiments, the method comprises introducing the dRNA or the construct comprising a nucleic acid encoding the dRNA into an isolated cell of the individual ex vivo. In some embodiments, the method comprises administering an effective amount of the dRNA or the construct comprising a nucleic acid encoding the dRNA to the individual.
In some embodiments, the target RNA is associated with a disease or condition of the individual. In some embodiments, the disease or condition is a hereditary genetic disease, or a disease or condition associated with one or more acquired genetic mutations (e.g., drug resistance). In some embodiments, the method further comprises obtaining the cell from the individual. In some embodiments, the ADAR is an endogenously expressed ADAR in the isolated cell. In some embodiments, the method comprises introducing the ADAR or a construct comprising a nucleic acid encoding the ADAR to the isolated cell. In some embodiments, the method further comprises culturing the cell having the edited RNA. In some embodiments, the method further comprises administering the cell having the edited RNA to the individual.
Diseases and conditions suitable for treatment using the methods of the present application include diseases associated with a mutation, such as a G to A mutation, e.g., a G to A mutation that results in missense mutation, early stop codon, aberrant splicing, or alternative splicing in an RNA transcript. Examples of disease-associated mutations that may be restored by the methods of the present application include, but are not limited to, TP53W53X (e.g., 158G>A) associated with cancer, IDUAW402X (e.g., TGG>TAG mutation in exon 9) associated with Mucopolysaccharidosis type I (MPS I), COL3A1W1278X (e.g., 3833G>A mutation) associated with Ehlers-Danlos syndrome, BMPR2W298X (e.g., 893G>A) associated with primary pulmonary hypertension, AHI1W725X (e.g., 2174G>A) associated with Joubert syndrome, FANCCW506X (e.g., 1517G>A) associated with Fanconi anemia, MYBPC3W1098X (e.g., 3293G>A) associated with primary familial hypertrophic cardiomyopathy, and IL2RGW237X (e.g., 710G>A) associated with X-linked severe combined immunodeficiency. In some embodiments, the disease or condition is a cancer. In some embodiments, the disease or condition is a monogenetic disease. In some embodiments, the disease or condition is a polygenetic disease.
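The tryptophan-to-stop examples listed above share one codon-level logic: a G to A change converts the tryptophan codon UGG into the stop codon UAG, and deamination of that adenosine to inosine, which is read as guanosine during translation, restores a tryptophan codon. The following is a minimal Python sketch of that logic only. The three-entry codon table, the function names, and the use of the letter I for inosine are illustrative assumptions and are not part of the disclosed methods.

```python
# Illustrative sketch: how A-to-I editing of a UAG stop codon restores tryptophan.
# The table covers only the codons needed here; it is not a full genetic code.
CODON_TABLE = {"UGG": "W", "UAG": "*", "UGA": "*"}


def read_inosine_as_guanosine(codon: str) -> str:
    """Inosine (written 'I' here) is decoded as guanosine during translation."""
    return codon.replace("I", "G")


def translate_codon(codon: str) -> str:
    """Translate one codon, treating any inosine as guanosine."""
    return CODON_TABLE[read_inosine_as_guanosine(codon)]


def deaminate(codon: str, position: int) -> str:
    """Replace the adenosine at `position` (0-based) with inosine."""
    if codon[position] != "A":
        raise ValueError("only adenosine can be deaminated to inosine")
    return codon[:position] + "I" + codon[position + 1:]


wild_type = "UGG"             # tryptophan codon
mutant = "UAG"                # G>A change creates a premature stop codon
edited = deaminate(mutant, 1) # "UIG", decoded as UGG

print(translate_codon(wild_type), translate_codon(mutant), translate_codon(edited))
```

The sketch treats a single codon in isolation; in practice the target adenosine sits within a longer transcript and editing is never guaranteed to be complete.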
In some embodiments, there is provided a method of treating a cancer associated with a target RNA having a mutation (e.g., G>A mutation) in an individual, comprising editing the target RNA in a cell of the individual using any one of the methods of RNA editing described herein. In some embodiments, the target RNA is TP53W53X (e.g., 158G>A).
In some embodiments, there is provided a method of treating MPS I (e.g., Hurler syndrome or Scheie syndrome) associated with a target RNA having a mutation (e.g., G>A mutation) in an individual, comprising editing the target RNA in a cell of the individual using any one of the methods of RNA editing described herein. In some embodiments, the target RNA is IDUAW402X (e.g., TGG>TAG mutation in exon 9).
In some embodiments, there is provided a method of treating Ehlers-Danlos syndrome associated with a target RNA having a mutation (e.g., G>A mutation) in an individual, comprising editing the target RNA in a cell of the individual using any one of the methods of RNA editing described herein. In some embodiments, the target RNA is COL3A1W1278X (e.g., 3833G>A mutation).
In some embodiments, there is provided a method of treating primary pulmonary hypertension associated with a target RNA having a mutation (e.g., G>A mutation) in an individual, comprising editing the target RNA in a cell of the individual using any one of the methods of RNA editing described herein. In some embodiments, the target RNA is BMPR2W298X (e.g., 893G>A).
In some embodiments, there is provided a method of treating Joubert syndrome associated with a target RNA having a mutation (e.g., G>A mutation) in an individual, comprising editing the target RNA in a cell of the individual using any one of the methods of RNA editing described herein. In some embodiments, the target RNA is AHI1W725X (e.g., 2174G>A).
In some embodiments, there is provided a method of treating Fanconi anemia associated with a target RNA having a mutation (e.g., G>A mutation) in an individual, comprising editing the target RNA in a cell of the individual using any one of the methods of RNA editing described herein. In some embodiments, the target RNA is FANCCW506X (e.g., 1517G>A).
In some embodiments, there is provided a method of treating primary familial hypertrophic cardiomyopathy associated with a target RNA having a mutation (e.g., G>A mutation) in an individual, comprising editing the target RNA in a cell of the individual using any one of the methods of RNA editing described herein. In some embodiments, the target RNA is MYBPC3W1098X (e.g., 3293G>A).
In some embodiments, there is provided a method of treating X-linked severe combined immunodeficiency associated with a target RNA having a mutation (e.g., G>A mutation) in an individual, comprising editing the target RNA in a cell of the individual using any one of the methods of RNA editing described herein. In some embodiments, the target RNA is IL2RGW237X (e.g., 710G>A).
In some embodiments, there is provided a method of treating hyperglycemia associated with a target RNA having a mutation (e.g., G>A mutation) in an individual, comprising editing the target RNA in a cell of the individual using any one of the methods of RNA editing described herein. In some embodiments, the target RNA is MALAT1.
In some embodiments, there is provided a method of treating Charcot-Marie-Tooth disease 2B (CMT2B) associated with a target RNA having a mutation (e.g., G>A mutation) in an individual, comprising editing the target RNA in a cell of the individual using any one of the methods of RNA editing described herein. In some embodiments, the target RNA is RAB7A.
Generally, dosages, schedules, and routes of administration of the compositions (e.g., dRNA or construct comprising a nucleic acid encoding dRNA) may be determined according to the size and condition of the individual, and according to standard pharmaceutical practice. Exemplary routes of administration include intravenous, intra-arterial, intraperitoneal, intrapulmonary, intravesicular, intramuscular, intra-tracheal, subcutaneous, intraocular, intrathecal, or transdermal.
The RNA editing methods of the present application can be used not only in animal cells, for example mammalian cells, but also in the modification of RNAs of plants or fungi, for example, in plants or fungi that have endogenously expressed ADARs. The methods described herein can be used to generate genetically engineered plants and fungi with improved properties.
Further provided are any one of the dRNAs, constructs, cells having edited RNA, and compositions described herein for use in any one of the methods of treatment described herein, and any one of the dRNAs, constructs, edited cells, and compositions described herein in the manufacture of a medicament for treating a disease or condition.
Also provided herein are compositions (such as pharmaceutical compositions) comprising any one of the dRNAs, constructs, libraries, or host cells having edited RNA as described herein.
In some embodiments, there is provided a pharmaceutical composition comprising any one of the dRNAs or constructs encoding the dRNA described herein, and a pharmaceutically acceptable carrier, excipient or stabilizer (Remington's Pharmaceutical Sciences 16th edition, Osol, A. Ed. (1980)). Acceptable carriers, excipients, or stabilizers are nontoxic to recipients at the dosages and concentrations employed, and include buffers such as phosphate, citrate, and other organic acids; antioxidants including ascorbic acid and methionine; preservatives (such as octadecyldimethylbenzyl ammonium chloride; hexamethonium chloride; benzalkonium chloride, benzethonium chloride; phenol, butyl or benzyl alcohol; alkyl parabens such as methyl or propylparaben; catechol; resorcinol; cyclohexanol; 3-pentanol; and m-cresol); low molecular weight (less than about 10 residues) polypeptides; proteins, such as serum albumin, gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine, histidine, arginine, or lysine; monosaccharides, disaccharides, and other carbohydrates including glucose, mannose, or dextrins; chelating agents such as EDTA; sugars such as sucrose, mannitol, trehalose or sorbitol; salt-forming counter-ions such as sodium; metal complexes (e.g. Zn-protein complexes); and/or non-ionic surfactants such as TWEEN™, PLURONICS™ or polyethylene glycol (PEG). In some embodiments, lyophilized formulations are provided. Pharmaceutical compositions to be used for in vivo administration must be sterile. This is readily accomplished by, e.g., filtration through sterile filtration membranes.
Further provided are kits useful for any one of the methods of RNA editing or methods of treatment described herein, comprising any one of the dRNAs, constructs, compositions, libraries, or edited host cells as described herein.
In some embodiments, there is provided a kit for editing a target RNA in a host cell, comprising a dRNA or a construct comprising a nucleic acid encoding the dRNA, wherein the dRNA comprises a targeting RNA sequence that hybridizes to a target RNA associated with the disease or condition to form a duplex RNA, wherein the duplex RNA comprises a bulge comprising a non-target adenosine in the target RNA, wherein the dRNA is capable of recruiting an ADAR to deaminate a target adenosine residue in the target RNA. In some embodiments, the dRNA is circular. In some embodiments, the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence.
In some embodiments, there is provided a kit for editing a target RNA in a host cell, comprising a dRNA or a construct comprising a nucleic acid encoding the dRNA, wherein the dRNA comprises a targeting RNA sequence that hybridizes to a target RNA associated with the disease or condition, wherein the dRNA comprises a linker nucleic acid sequence flanking an end of the targeting RNA sequence, wherein the linker nucleic acid sequence does not substantially form any secondary structure with any part of the dRNA, wherein the dRNA is capable of recruiting an ADAR to deaminate a target adenosine residue in the target RNA, and wherein the dRNA is a circular RNA or a linear RNA capable of forming a circular RNA.
In some embodiments, the kit further comprises an ADAR or a construct comprising a nucleic acid encoding an ADAR. In some embodiments, the kit further comprises an inhibitor of ADAR3 or a construct thereof. In some embodiments, the kit further comprises a stimulator of interferon or a construct thereof. In some embodiments, the kit further comprises an instruction for carrying out any one of the RNA editing methods or methods of treatment described herein.
The kits of the present application are in suitable packaging. Suitable packaging includes, but is not limited to, vials, bottles, jars, flexible packaging (e.g., sealed Mylar or plastic bags), and the like. Kits may optionally provide additional components such as transfection or transduction reagents, cell culturing medium, buffers, and interpretative information.
The present application thus also provides articles of manufacture. The article of manufacture can comprise a container and a label or package insert on or associated with the container. Suitable containers include vials (such as sealed vials), bottles, jars, flexible packaging, and the like. In some embodiments, the container holds a pharmaceutical composition, and may have a sterile access port (for example, the container may be an intravenous solution bag or a vial having a stopper pierceable by a hypodermic injection needle). The container holding the pharmaceutical composition may be a multi-use vial, which allows for repeat administrations (e.g. from 2-6 administrations) of the reconstituted formulation. Package insert refers to instructions customarily included in commercial packages of therapeutic products that contain information about the indications, usage, dosage, administration, contraindications and/or warnings concerning the use of such products. Additionally, the article of manufacture may further comprise a second container comprising a pharmaceutically-acceptable buffer, such as bacteriostatic water for injection (BWFI), phosphate-buffered saline, Ringer's solution and dextrose solution. It may further include other materials desirable from a commercial and user standpoint, including other buffers, diluents, filters, needles, and syringes.
The kits or article of manufacture may include multiple unit doses of the pharmaceutical compositions and instructions for use, packaged in quantities sufficient for storage and use in pharmacies, for example, hospital pharmacies and compounding pharmacies.
The examples below are intended to be purely exemplary of the present application and should therefore not be considered to limit the invention in any way. The following exemplary embodiments and examples and detailed description are offered by way of illustration and not by way of limitation.
For linear arRNA-expressing constructs, the sequences of arRNAs were synthesized and golden-gate cloned into the pLenti-sgRNA-lib 2.0 (Addgene no. 89638) backbone, and the transcription of arRNA was driven by the hU6 or CMV promoter, respectively. For genetically encoded circ-arRNA-expressing constructs, we first constructed a cloning vector based on the pLenti-sgRNA-lib 2.0 vector that included the Twister P3 U2A, 5′ ligation sequence, 3′ ligation sequence and Twister P1. Then the sequences of arRNAs were synthesized and golden-gate cloned into the autocatalytic circular RNA expression vector.
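The element order of the circ-arRNA cloning vector described above (Twister P3 U2A, 5′ ligation sequence, arRNA insert, 3′ ligation sequence, Twister P1) can be pictured as a simple ordered concatenation. The Python sketch below is illustrative only: the element keys, the helper name, and the short lowercase strings standing in for sequences are hypothetical placeholders, not the actual sequences used, and placing the arRNA insert between the two ligation sequences is assumed from the description of 5′ and 3′ ligation sequences elsewhere in this application.

```python
from typing import Dict, List

# Assumed 5'->3' order of elements in the transcribed cassette.
CASSETTE_ORDER: List[str] = [
    "twister_P3_U2A",
    "ligation_5p",
    "arRNA",
    "ligation_3p",
    "twister_P1",
]


def assemble_cassette(parts: Dict[str, str]) -> str:
    """Concatenate the named parts in cassette order; fail if any part is missing."""
    missing = [name for name in CASSETTE_ORDER if name not in parts]
    if missing:
        raise KeyError(f"missing cassette elements: {missing}")
    return "".join(parts[name] for name in CASSETTE_ORDER)


# Placeholder strings only; real element sequences are not reproduced here.
placeholder_parts = {
    "twister_P3_U2A": "[P3U2A]",
    "ligation_5p": "[5lig]",
    "arRNA": "[arRNA]",
    "ligation_3p": "[3lig]",
    "twister_P1": "[P1]",
}

print(assemble_cassette(placeholder_parts))
```

Keeping the order in a single list makes it easy to swap the arRNA insert while leaving the flanking ribozyme and ligation elements fixed, which mirrors how a golden-gate cloning vector is reused across inserts.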
To increase editing efficiency further, circ-arRNA151 was flanked by 20 nt spacer and 30 nt poly AC sequences and then golden-gate cloned into the genetically encoded circ-arRNA-expressing vector.
To reduce off-target editing, the nucleotides opposite potential off-target adenosines were deleted, and the resulting arRNA sequences were then cloned into the genetically encoded circ-arRNA-expressing vector.
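The deletion step described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the guide is taken to be the exact reverse complement of the target window (the mismatch at the intended editing site is ignored), positions are 0-based on the target strand read 5′ to 3′, and the function names and the toy sequence are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def delete_opposite(target: str, guide: str, off_target_positions) -> str:
    """Remove guide nucleotides that sit opposite the given target adenosines.

    Target position i pairs with guide position len(target) - 1 - i when the
    guide is the same length as the target and fully antiparallel to it.
    """
    if len(guide) != len(target):
        raise ValueError("guide and target window must have equal length")
    drop = set()
    for pos in off_target_positions:
        if target[pos] != "A":
            raise ValueError(f"target position {pos} is not an adenosine")
        drop.add(len(target) - 1 - pos)
    return "".join(base for i, base in enumerate(guide) if i not in drop)


target = "CCAGUAGGC"                 # toy window; adenosines at positions 2 and 5
guide = reverse_complement(target)   # "GCCUACUGG"
# Treat position 2 as a potential off-target adenosine and position 5 as the intended site.
trimmed = delete_opposite(target, guide, [2])
print(guide, trimmed)
```

Deleting the opposing nucleotide leaves the off-target adenosine unpaired in the duplex, which is the intent of the step; how strongly this suppresses editing at a given site is an experimental question the sketch does not address.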
For the dual fluorescence reporter, mCherry and EGFP (the start codon ATG of EGFP was deleted) coding sequences were PCR amplified, digested using BsmBI (ThermoFisher Scientific, ER0452), followed by T4 DNA ligase (NEB, M0202L) mediated ligation with 3×GGGGS linkers. The ligation product was subsequently inserted into the pLenti-CMV-MCS-PURO backbone.
For the constructs expressing genes with pathogenic mutations, full-length coding sequences of TP53 (ordered from Vigenebio) gifts from J. Wang's laboratory, Institute of Pathogen Biology, Chinese Academy of Medical Sciences) were amplified from the constructs encoding the corresponding genes with introduction of G-to-A mutations through mutagenesis PCR. The amplified product was cloned into the pLenti-CMV-MCS-mCherry backbone through the Gibson cloning method.
Production and Purification of circRNA In Vitro
The circRNAs were produced according to methods described in Abe, N. et al. “Preparation of Circular RNA In Vitro,” Circular RNAs. Humana Press, New York, NY, 2018. 181-192; and Chen, H. et al. “Preferential production of RNA rings by T4 RNA ligase 2 without any splint through rational design of precursor strand,” Nucleic Acids Research 48, e54-e54 (2020). Briefly, circRNA precursors were synthesized via in vitro transcription (IVT) from the linearized circRNA plasmid templates with the HISCRIBE™ T7 High Yield RNA Synthesis Kit (New England Biolabs, #E2040S). After IVT, the IVT products were treated with DNase I (New England Biolabs, #M0303S) for 30 min to digest the DNA templates. For T4 Rnl cyclization, T4 Rnl 1 (New England Biolabs, #M0239L) or T4 Rnl 2 (New England Biolabs, #M0204L) was added to the linear circRNA precursors after DNase I digestion and incubated at 37° C. overnight. For Group I autocatalysis cyclization, GTP was added to the reaction at a final concentration of 2 mM after DNase I digestion, and the reactions were then incubated at 55° C. for 15 min to catalyze cyclization of the circRNAs. The cyclized circ-arRNA was then column purified with the Monarch RNA Cleanup Kit (New England Biolabs, #T2040L). The column-purified RNA was heated at 65° C. for 3 min and cooled on ice. Reactions were treated with RNase R (Epicentre, #RNR07250) at 37° C. for 15 min to enrich circRNAs. The RNase R-treated RNA was column purified.
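As a worked example of the "final concentration of 2 mM" GTP step, the volume of stock to add can be derived from conservation of solute: stock × v = final × (V + v), so v = final × V / (stock − final). The Python sketch below applies this formula. The 20 µL reaction volume and the 100 mM GTP stock are assumptions chosen purely for illustration and are not taken from the protocol.

```python
def spike_volume_ul(reaction_volume_ul: float, stock_mm: float, final_mm: float) -> float:
    """Volume of stock (uL) to add so the mixture reaches `final_mm`.

    Accounts for the dilution caused by the added volume itself:
    stock_mm * v = final_mm * (reaction_volume_ul + v).
    """
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the target concentration")
    return final_mm * reaction_volume_ul / (stock_mm - final_mm)


# Assumed values for illustration: 20 uL reaction, 100 mM GTP stock, 2 mM final.
gtp_ul = spike_volume_ul(reaction_volume_ul=20.0, stock_mm=100.0, final_mm=2.0)
print(f"add {gtp_ul:.2f} uL of GTP stock")
```

For small additions the simpler approximation v ≈ final × V / stock gives nearly the same answer; the exact form is used here so the check on the resulting concentration holds without rounding caveats.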
To further enrich circ-arRNA, purified RNase R-treated circ-arRNA was resolved using high-performance liquid chromatography (Agilent HPLC1260) 4.6×300 mm size-exclusion column with particle size of 5 μm and pore size of 2000 Å (Sepax Technologies, #215980P-4630) in RNase-free TE buffer. The circ-arRNA enrich fractions were collected and then column purified (New England Biolabs, #T2040L). To further diminish the immunogenicity of the purified circ-arRNA, circ-arRNA were heated at 65° C. for 3 min, cooled down on ice, and subsequently treated with the quick CIP phosphatase (New England Biolabs, #M0525S). Finally, circ-arRNA were column purified and concentrated with RNA Clean & Concentrator Kit (ZYMO, #R1018).
HeLa cell lines came from Z. Jiang's laboratory (Peking University) and the HEK293T cell line was from C. Zhang's laboratory (Peking University). A549 cell lines were from EdiGene. C2C12 cell lines were purchased from Procell. The MEF cells were generated from Idua-W392X mice. The Hep G2/RPE1/SF268/Cos7/NIH3T3 cell lines were maintained in our laboratory at Peking University. These mammalian cell lines were cultured in Dulbecco's Modified Eagle Medium (Corning, 10-013-CV) with 10% fetal bovine serum (BI), additionally supplemented with 1% penicillin-streptomycin under 5% CO2 at 37° C.
Plasmids were transfected into cells with X-tremeGENE HP DNA transfection reagent (Roche, 06366546001) or PEI (Proteintech, B600070), and RNAs cyclized in vitro were transfected into cells with Lipofectamine MessengerMax (Invitrogen, LMRNA003), according to the manufacturer's instructions.
For the stable reporter cell lines, reporter constructs (pLenti-CMV-MCS-PURO backbone) were cotransfected into HEK293T cells, together with two viral packaging plasmids, pR8.74 and pVSVG. After 72 hours, the supernatant virus was collected and stored at −80° C. HEK293T cells were infected with lentivirus, and then, mCherry-positive cells were sorted via fluorescence-activated cell sorting (FACS) and cultured to select a single clone cell line stably expressing dual fluorescence reporter system without detectable EGFP background. The HEK293T ADAR1−/− and TP53−/− cell lines were generated according to a method described in Zhou, Y., Zhang, H. & Wei, W, “Simultaneous generation of multi-gene knockouts in human cells,” FEBS Letters 11 (2016). ADAR1-targeting single-guide RNA and PCR amplified donor DNA containing CMV-driven puromycin resistant gene were co-transfected into HEK293T cells. Then, cells were treated with puromycin 7 days after transfection. Single clones were isolated from puromycin resistant cells followed by verification through sequencing and Western blot.
For assessing RNA editing on the dual fluorescence reporter, HEK293T-reporter-cells were seeded in 12-well plates (~1-3×10⁵ cells per well). After 24 hours, cells were transfected with 2 μg linear arRNA or circ-arRNA plasmids. 48 hours after transfection, the editing efficiency was assayed by EGFP+ ratio. ADAR1−/− HEK293T cells were transfected with reporter and linear arRNA or circ-arRNA plasmids as in dual fluorescence reporter cells.
To assess RNA editing efficiency in multiple cell lines, 1×10⁵ (HeLa, Hep G2, A549, RPE1, SF268, C2C12, NIH3T3) or 4×10⁵ (HEK293T) cells were seeded in 12-well plates. Twenty-four hours later, reporter and arRNA plasmids were transfected into these cells. The editing efficiency was assayed by the EGFP+ ratio.
To evaluate the EGFP+ ratio, cells were sorted and collected by FACS analysis 48 hours post transfection. The mCherry signal served as a fluorescent selection marker for the reporter/circ-arRNA-expressing cells, and the percentages of EGFP+/mCherry+ cells were calculated as the readout for editing efficiency.
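The readout described above can be illustrated with a minimal sketch. It assumes one reading of the EGFP+/mCherry+ percentage, namely the share of mCherry+ cells that are also EGFP+; the function name and the gate counts are hypothetical and are not taken from the experiments.

```python
def egfp_positive_ratio(n_mcherry_pos: int, n_double_pos: int) -> float:
    """Return the percentage of mCherry+ cells that are also EGFP+.

    Assumption: both counts come from the same FACS gate hierarchy,
    with the EGFP+ gate drawn inside the mCherry+ population.
    """
    if n_mcherry_pos <= 0:
        raise ValueError("at least one mCherry+ cell is required")
    if not 0 <= n_double_pos <= n_mcherry_pos:
        raise ValueError("double-positive count must lie within the mCherry+ count")
    return 100.0 * n_double_pos / n_mcherry_pos


# Hypothetical gate counts, for illustration only.
ratio = egfp_positive_ratio(n_mcherry_pos=20000, n_double_pos=5000)
print(f"{ratio:.1f}% EGFP+ among mCherry+ cells")
```

Restricting the denominator to mCherry+ cells keeps untransfected cells from diluting the apparent editing efficiency.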
To assess RNA editing on endogenous mRNA transcripts, HEK293T cells were seeded in six-well plates (8×10⁵ cells per well). Twenty-four hours later, cells were transfected with 3 μg of linear or circular arRNA plasmids. At 48 hours post transfection, cells were sorted and collected by FACS analysis. The editing efficiency was assayed by deep sequencing.
For NGS quantification of the A-to-I editing rate, at 48 hours post transfection, cells were sorted and collected by FACS assay and were subjected to RNA isolation (Zymo, R1055). Then, total RNAs were reverse-transcribed into cDNA via PCR with reverse transcription (RT-PCR) (TIANGEN, KR118), and the targeted locus was PCR amplified with the corresponding primers: mCherry-SpeI-F (SEQ ID NO: 1), mCherry-BsmBI-R1 (SEQ ID NO: 2) and EGFP-BsmBI-F1 (SEQ ID NO: 3). PCR products were purified for Sanger sequencing or NGS (Illumina HiSeq X Ten).
For deep sequencing analysis, an index was generated using the targeted site sequence (upstream and downstream 20-nt) of arRNA covering sequences. Reads were aligned and quantified using BWA (v.0.7.10-r789). Alignment BAMs were then sorted by Samtools, and RNA editing sites were analyzed using REDitools (v.1.0.4). The parameters were as follows: -U [AG or TC] -t 8 -n 0.0 -T 6-6 -e -d -u. All significant A-to-G conversions within the arRNA targeting region, as calculated by Fisher's exact test (P value <0.05), were considered edits by arRNA. All conversions other than the targeted adenosine were considered off-target edits. Mutations that appeared in the control and experimental groups simultaneously were considered to be due to single nucleotide polymorphisms.
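The site-calling rule above can be sketched as follows. This is an illustration under stated assumptions, not the REDitools implementation: it assumes that, at each adenosine, G and A read counts from the arRNA-treated sample are compared with those from the control sample in a 2×2 table, and all function names, positions and counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def classify_sites(counts: dict, target_pos: int, alpha: float = 0.05) -> dict:
    """Label significant A-to-G sites as on-target or off-target.

    counts maps position -> (G treated, A treated, G control, A control).
    A conversion present equally in both groups is not significant and is
    therefore left uncalled, matching the SNP rule in the text.
    """
    calls = {}
    for pos, (g_t, a_t, g_c, a_c) in counts.items():
        if fisher_exact_two_sided(g_t, a_t, g_c, a_c) < alpha:
            calls[pos] = "on-target" if pos == target_pos else "off-target"
    return calls


# Hypothetical read counts at three adenosines in the arRNA-covered region.
example = {76: (30, 70, 0, 100), 60: (8, 92, 0, 100), 90: (50, 50, 50, 50)}
print(classify_sites(example, target_pos=76))
```

In this example position 90 shows the same G fraction in both groups, so it is treated as a likely polymorphism and is not reported as an edit.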
Ctrl RNA151 or circ-arRNA151-PPIA-expressing plasmids with blue fluorescent protein (BFP) expression cassette were transfected into HEK293T cells. BFP+ cells were enriched by FACS 48 hours after transfection, and RNAs were purified with RNAprep Pure Micro kit (TIANGEN, DP420). Then, mRNA was purified using NEBNext Poly(A) mRNA Magnetic Isolation Module (New England Biolabs, E7490), processed with NEBNext Ultra II RNA Library Prep Kit for Illumina (New England Biolabs, E7770), and followed by deep sequencing analysis using Illumina HiSeq X Ten platform (2×150-base pair paired end; 30G for each sample). To exclude nonspecific effect caused by transfection, we included the mock group in which we only treated cells with transfection reagent. Each group contained four replications.
The bioinformatics analysis pipeline followed the work by Vogel et al. Quality control was conducted using FastQC, and quality trimming was performed with Cutadapt (the first 6-bp of each read were trimmed and up to 20-bp were quality trimmed). AWK scripts were used to filter out the introduced circ-arRNA. After trimming, reads shorter than 90-nt were filtered out. Subsequently, filtered reads were mapped to the reference genome (GRCh38-hg38) by STAR software. We used GATK HaplotypeCaller to call the variants. Raw VCF (variant call format) files generated by GATK were filtered and annotated by GATK VariantFiltration, bcftools, and ANNOVAR. The variants in dbSNP, 1000 Genomes, and EVS were filtered out. The shared variants in six replicates of each group were then selected as the RNA editing sites. The RNA editing level of the mock group was viewed as the background, and the global targets of Ctrl RNA151 and circ-arRNA151-PPIA were obtained by subtracting the variants in the mock group.
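The last two steps of the pipeline above (selecting variants shared by all replicates, then subtracting the mock background) can be sketched with set operations. The sketch assumes that each variant is identified by chromosome, position, reference base and alternate base, and that the mock background is itself defined by the variants shared across mock replicates; the coordinates are hypothetical.

```python
def shared_variants(replicates: list) -> set:
    """Return the variants present in every replicate of one group."""
    sets = [set(r) for r in replicates]
    return set.intersection(*sets) if sets else set()


def subtract_background(group: list, mock: list) -> set:
    """Shared variants of a group minus the shared variants of the mock group."""
    return shared_variants(group) - shared_variants(mock)


# Hypothetical (chrom, pos, ref, alt) calls for two replicates per group.
treated = [
    {("chr1", 100, "A", "G"), ("chr2", 200, "A", "G"), ("chr3", 300, "A", "G")},
    {("chr1", 100, "A", "G"), ("chr2", 200, "A", "G")},
]
mock = [
    {("chr2", 200, "A", "G")},
    {("chr2", 200, "A", "G"), ("chr5", 5, "A", "G")},
]
print(subtract_background(treated, mock))
```

Requiring a variant in every replicate is a conservative choice: a call missing from a single replicate is dropped rather than averaged in.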
To assess whether circ-arRNA perturbs natural editing homeostasis, we analyzed the global editing sites shared by the Ctrl RNA151 group and circ-arRNA151-PPIA group. The differential RNA editing rates at native A-to-I editing sites were assessed using Pearson's correlation coefficient analysis.
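As a minimal sketch of the comparison above, Pearson's correlation coefficient can be computed over the editing rates of sites shared by the two groups. The function name and the rates below are hypothetical.

```python
from math import sqrt


def pearson_r(x: list, y: list) -> float:
    """Pearson's correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / sqrt(sxx * syy)


# Hypothetical editing rates at the same native sites in the two groups.
ctrl_rates = [0.10, 0.25, 0.40, 0.80]
circ_rates = [0.12, 0.24, 0.43, 0.78]
print(f"r = {pearson_r(ctrl_rates, circ_rates):.3f}")
```

A coefficient close to 1 would indicate that editing rates at native sites are largely unchanged between the control and circ-arRNA groups.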
TP53W53X cDNA-expressing plasmids and circ-arRNA-expressing plasmids were transfected into HEK293T TP53−/− cells, together with p53-Firefly-luciferase cis-reporting plasmids (YRGene, VXS0446) and Renilla-luciferase plasmids (a gift from Z. Jiang's laboratory, Peking University) for detecting the transcriptional regulatory activity of p53. 48 hours after transfection, cells were collected and assayed with Promega Dual-Glo Luciferase Assay System (Promega, E2940) according to manufacturer's protocol. Luminescence was measured by Infinite M200 reader (TECAN). Fold change of p53-induced luciferase was calculated by the ratio of Firefly luminescence to Renilla luminescence.
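The normalization above can be sketched as follows. The Firefly-to-Renilla ratio is taken from the text; expressing the result as a fold change relative to a control well is an assumption of this sketch, and the luminescence values are hypothetical.

```python
def relative_luciferase(firefly: float, renilla: float) -> float:
    """Firefly luminescence normalized to the Renilla internal control."""
    if renilla <= 0:
        raise ValueError("Renilla luminescence must be positive")
    return firefly / renilla


def fold_change(sample: tuple, control: tuple) -> float:
    """Normalized signal of a sample relative to a control (assumed baseline)."""
    return relative_luciferase(*sample) / relative_luciferase(*control)


# Hypothetical (Firefly, Renilla) readings.
print(fold_change(sample=(8000.0, 2000.0), control=(1000.0, 1000.0)))
```

Dividing by the Renilla signal corrects for well-to-well differences in transfection efficiency and cell number.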
Mouse monoclonal primary antibodies against p53 (Santa Cruz, sc-126) and β-tubulin (CWBiotech, CW0098) were used. HRP-conjugated goat anti-mouse IgG (H+L, 115-035-003) secondary antibody was purchased from Jackson ImmunoResearch. Then, 2×10⁶ cells were sorted and lysed, and an equal amount of protein from each lysate was loaded for SDS-PAGE. Sample proteins were then transferred onto a polyvinylidene difluoride membrane (Bio-Rad Laboratories) and immunoblotted with primary antibodies (anti-p53, 1:300; anti-Tubulin, 1:2000), followed by secondary antibody incubation (1:3,000) and exposure. The experiments were repeated three times. The semi-quantitative analysis was done with Image Lab software.
Idua-W392X mice were ordered from The Jackson Laboratory. All mice were bred and kept under SPF (specific pathogen-free) conditions in the Laboratory Animal Center of Peking University. Animal experiments were approved by Peking University Laboratory Animal Center (Beijing), and undertaken in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals.
The AAV8 vectors carrying circ-arRNA were packaged by PackGene Biotech. AAVs were injected by tail vein into IDUA-W392X mice (B6.129S-Iduatm1.1Kmke/J) at 6-8 weeks of age, at a dose of 1×10¹³ vector genomes per mouse. Mice were monitored four times a week for the duration of the experiment (4-5 weeks).
Harvested mouse tissues were homogenized in 1 mL Trizol and RNA was extracted by the chloroform extraction method. Then, reverse-transcribed RNAs from the tissues were PCR amplified and analyzed by Sanger sequencing or NGS.
The gathered cell pellet was resuspended and lysed with 28 μL 0.5% Triton X-100 in 1×PBS buffer and kept on ice for 30 min. Then, 25 μL of the cell lysate was added to 25 μL of 190 μM 4-methylumbelliferyl-α-L-iduronidase substrate (Cayman, 2A-19543-500) dissolved in 0.4 M sodium formate buffer containing 0.2% Triton X-100 [pH 3.5], and the mixture was incubated in the dark for 90 min at 37° C. The catalytic reaction was quenched by adding 200 μL 0.5 M NaOH/Glycine buffer [pH 10.3] and then centrifuged for 2 min at 4° C. The supernatant was transferred to a 96-well plate and fluorescence was measured at 365 nm excitation wavelength and 450 nm emission wavelength with an Infinite M200 reader (TECAN).
An unpaired two-sided Student's t-test was implemented for group comparison. For transcriptome-wide RNA-seq data, DESeq2 (v.1.18.1) was used for analyzing statistical significance. Statistical analyses were performed with R and Prism 8 (GraphPad Software, Inc.).
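For illustration, the test statistic of an unpaired Student's t-test with pooled variance can be computed as sketched below. The sketch returns only the t statistic and the degrees of freedom; the two-sided P value is then obtained from the t distribution, which the statistical packages named above provide. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(x: list, y: list) -> tuple:
    """Return (t statistic, degrees of freedom) for an unpaired Student's t-test."""
    nx, ny = len(x), len(y)
    if nx < 2 or ny < 2:
        raise ValueError("each group needs at least two observations")
    df = nx + ny - 2
    # Pooled variance: Student's test assumes equal variance in both groups.
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    if pooled == 0:
        raise ValueError("t statistic is undefined when both groups are constant")
    t = (mean(x) - mean(y)) / sqrt(pooled * (1 / nx + 1 / ny))
    return t, df


# Hypothetical editing rates (%) in two groups of three replicates.
t_stat, dof = student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(f"t = {t_stat:.3f}, df = {dof}")
```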
To test whether an RNA Polymerase II (Pol II) promoter can improve editing efficiency, a plasmid expressing arRNA driven by a Pol II promoter (CMV) was constructed. Using a reporter system containing an in-frame stop codon between mCherry and EGFP (Reporter 1,
Sequences used are as follows: arRNA151 has a sequence of SEQ ID NO: 4. CMV promoter has the nucleic acid sequence of SEQ ID NO: 5. U6 promoter has the nucleic acid sequence of SEQ ID NO: 6. Dual fluorescence reporter-1 comprises sequence of mCherry (SEQ ID NO: 7), sequence comprising 3×GS linker and the targeted A (SEQ ID NO: 8), and sequence of eGFP (SEQ ID NO: 9).
It was found that CMV-arRNA outperforms U6-arRNA in RNA editing (
To test whether cyclization of arRNA can improve stability and half-life of RNA, HEK293T cells stably expressing Reporter-1, as described in Example 1, containing an in-frame stop codon between mCherry and EGFP (
The efficiency of cyclization of arRNA (circ-arRNA) was determined by PCR amplifying the targeted locus with corresponding primer pairs and the PCR products were purified for Sanger sequencing. The Sanger sequence indicated that the circ-arRNAs were successfully generated. Using Reporter 1, RNA editing efficiency between 151-nt arRNAs (arRNA151; SEQ ID NO: 4) and circ-arRNAs (circ-arRNA151) driven by CMV (Pol II) or U6 (Pol III) promoters was compared. It was found that U6-circ-arRNA151 outperforms CMV-arRNA151 in RNA editing (
Using Reporter 1, the RNA editing efficiency between arRNAs (arRNA151) and circ-arRNAs (circ-arRNA151) driven by U6 promoters was compared. Cells transfected with a non-targeting RNA were used as a control (Ctrl RNA151). It was found that circ-arRNA outperforms arRNA, as indicated by the significant increase of EGFP+ percentage in transfected HEK293T cells (
To test whether RNA editing by circ-arRNAs was also dependent on the endogenous ADAR1 proteins, Reporter-1 and plasmids expressing circular arRNA151 targeting Reporter-1 were transfected into HEK293T cells with and without an ADAR1 knockout (HEK293T ADAR−/−). EGFP+ percentages were normalized by transfection efficiency, which was determined by mCherry+. It was found that RNA editing by both arRNA and circ-arRNAs was dependent on endogenous ADAR, as indicated by complete disappearance of the editing-generated EGFP signal in HEK293T ADAR−/− cells (
The RNA editing efficiency of arRNAs and circ-arRNAs was further compared in a variety of cell types, including HeLa, HepG2, A549, RPE1, SF268, C2C12, NIH3T3, and Cos7 (
Next, to test whether circ-arRNAs could also enable efficient RNA editing on multiple target sites of endogenous transcripts, 151-nt circ-arRNAs were designed to target 20 different RNA sites of nine endogenous genes, PPIB, GUSB, KRAS, MALAT1, TUBB, RAB7A, PPIA, SMYD5 and CTNNB1 (Table 2). NGS analysis revealed that circ-arRNAs outperformed their linear counterparts in targeted RNA editing at 17 of 20 sites. However, circ-arRNAs showed a comparable editing rate on MALAT1 (site 1) and even a decreased editing rate on KRAS (sites 1 and 2). Circ-arRNAs targeting these three sites might have certain structures that interfered with their target recognition or their ability to mediate targeted editing. To test whether the addition of flexible RNA linkers flanking the arRNA sequence in circ-arRNAs could further improve their ability to mediate editing, fifty-nucleotide flexible polyAC RNA linkers, termed AC50, were added to flank the arRNA sequence in circ-arRNA151, and the resulting circ-arRNA_AC50 constructs gave rise to improved editing rates at 14 sites compared with circ-arRNA151. The circ-arRNA_AC50 targeting KRAS (sites 1 and 2) did elevate the editing efficiency of the original circ-arRNAs to a level comparable to that of the corresponding linear arRNAs. On average, the editing efficiency of circ-arRNA and circ-arRNA_AC50 was 2.3 and 3.1 fold higher than that of their linear counterparts, respectively (
Adeno-associated virus (AAV) was used to deliver circ-arRNAs into HEK293T cells, human primary hepatocytes and human cerebral organoids. NGS results showed that AAV-delivered circ-arRNAs yielded much higher levels of targeted editing in all these cells and organoids than their linear counterparts, and in a long-lasting fashion (
AACCAAAAAAACAAAACACAgggugaugggugcuggcca
AACCAAAAAAACAAAACACAcaguugcugccuacauuuu
To test an in vitro strategy to generate circ-arRNAs (
To investigate Group-I ribozyme-mediated autocatalysis activity51, 52, circ-arRNAs ligated by group I ribozyme autocatalysis were introduced into a primary MEF cell line generated from Hurler syndrome mice. The editing rate on a targeted adenosine of Idua transcripts with a pathogenic point mutation (IDUAW392X) was measured. Untreated cells were used as a mock control (Untreated). Cells transfected with a non-targeting RNA were also used as a control (Ctrl RNAs). NGS results showed that the circ-arRNAs generated by Group-I ribozyme autocatalysis could correct the pathogenic point mutation of IDUAW392X transcripts in the MEF cells, with about 25% editing rate (
To evaluate the RNA editing specificity of circ-arRNAs, a transcriptome-wide RNA-sequencing analysis was performed. HEK293T cells were transfected with circ-arRNA151-PPIA-expressing plasmids. Cells transfected with a non-targeting RNA were used as a control (Ctrl RNAs). The transcriptome-wide RNA-sequencing results showed that there were 17 potential off-target edits in the circ-arRNA151-PPIA transfection group (
To test whether circ-arRNA151-PPIA would affect the expression level of targeted PPIA transcripts, the above transcriptome-wide RNA-seq data was used for further analysis. Circ-arRNA151-PPIA mediated editing in PPIA transcripts affected neither the expression nor splicing pattern of PPIA transcripts (
To test bystander off-targets on the arRNA-covered regions of the targeted transcripts, HEK293T cells stably expressing Reporter-1 containing an in-frame stop codon between mCherry and EGFP were transfected with either linear arRNA or circular arRNA targeting Reporter-1 (
To reduce bystander off-targets, nucleotides opposite to unwanted adenosines in the circ-arRNA-covered region were deleted (
In vitro-synthesized circ-arRNA151-AΔ14 was then tested, and it was found that it could also achieve efficient editing in a dose-dependent manner (
To explore potential therapeutic uses of circ-arRNAs, TP53 tumor suppressor gene, which undergoes frequent mutations in more than 50% of human cancers59, was targeted. The c.158G-to-A variant of TP53 is a clinically relevant non-sense mutation (Trp53Ter), generating a non-functional truncated protein (
NGS analysis showed variable editing rates on the targeted adenosine, ˜30% with circ-arRNA151, and ˜40% with circ-arRNA151-AG1, circ-arRNA151-AG4, and circ-arRNA151-AΔ1, with A-G mismatch or U-deletion on one unwanted off-target site (
To test whether addition of flexible RNA linkers flanking the arRNA sequence in circ-arRNA improved its binding ability to targeted RNA resulting in elevated editing efficiency, 50-nt polyAC RNA linkers, termed AC50 (SEQ ID NO: 22), were added to flank arRNA sequences in both circ-arRNA151 and circ-arRNA151-AΔ4, giving rise to circ-arRNA151_AC50 and circ-arRNA151-AΔ4_AC50. NGS analysis showed that such optimization with flexible linkers boosted the targeted RNA editing efficiency, especially for circ-arRNA151-AΔ4_AC50, which yielded about 70% of editing (
In addition to transcript editing, all circ-arRNA versions could effectively rescue production of full-length p53 protein in HEK293T TP53−/− cells (
Potential off-targets in the circ-arRNA-covered region were also examined. It was found that deleting a U nucleotide opposite to the potential off-target A nucleotide on circ-arRNAs almost abolished bystander off-targets on four predicted sites (
Hurler syndrome is the most severe subtype of Mucopolysaccharidosis type I because of deficiency of α-L-iduronidase (IDUA), a lysosomal metabolic enzyme responsible for metabolism of mucopolysaccharides. To explore the therapeutic potential of circ-arRNAs, the Hurler syndrome mouse model was used. This mouse model harbors a homozygous W392X (TGG-to-TAG) point mutation in the exon 9 of Idua, which is analogous to the W402X mutation found in clinical Hurler syndrome patients.
Two versions of circ-arRNAs targeting the mature mRNA or pre-mRNA of Idua, respectively, were designed. The IDUAW392X pre-mRNA transcript has a target sequence of SEQ ID NO: 23. The IDUAW392X mRNA transcript has a target sequence of SEQ ID NO: 25. The circ-arRNAmRNA-151 (with targeting RNA sequence of SEQ ID NO: 26) or circ-arRNApre-mRNA-151 (with targeting RNA sequence of SEQ ID NO: 24) was delivered into Idua-W392X mice by the transduction of self-complementary AAV (scAAV) virus. Four weeks later, the mice were sacrificed, and the liver tissues were collected for the measurement of targeted RNA editing and catalytic activity of α-L-iduronidase. NGS analysis revealed that both circ-arRNA151/mRNA targeting and circ-arRNA151/pre-RNA targeting achieved a targeted editing rate of 10% (
This example describes the impact of adding flexible linker sequence(s) flanking the arRNA sequence (i.e., flanking linkers) and deleting one or more Us opposite non-target adenosines in a circular RNA on its on-target and off-target editing rates.
To test whether addition of flexible RNA linkers flanking the arRNA sequence in circ-arRNA could improve its binding ability to targeted RNA resulting in elevated on-target editing efficiency and/or reducing bystander off-target editing effects, 50-nt polyAC RNA linkers, termed AC50 (SEQ ID NO: 22), were added to flank different nucleotide positions at the 5′ or 3′ end of circular 171-nt arRNA (circ-arRNA171) sequences. See,
Three target sequences were tested: (1) an A at position 129 in the 3′ UTR of endogenous PPIA transcript (“mf-PPIA-UTR2”; SEQ ID NO: 27); (2) an A at position 155 in the 3′UTR of endogenous PPIA transcript (“mf-PPIA-3”; SEQ ID NO: 28); and (3) an A at position 134 in the 3′ UTR of endogenous IDUA-1 transcript (“mf-IDUA-1”; SEQ ID NO: 29). The reference sequence of Macaca fascicularis can be found with NCBI identifier NC_022274. The reference sequence for Rhesus Ush2a exon is XM_005540847. The targeting RNA sequence of each circ-arRNA171 is complementary to the sequence of each target sequence.
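The complementarity stated above can be illustrated with a minimal sketch that derives a fully complementary targeting RNA as the reverse complement of a target RNA. The function name and the short input are hypothetical; the sketch does not model any designed mismatch or linker and does not reproduce the sequences of SEQ ID NOs: 27-29.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(target: str) -> str:
    """Return the RNA strand that pairs with `target`, written 5' to 3'."""
    target = target.upper().replace("T", "U")  # accept cDNA-style input
    unknown = set(target) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return "".join(COMPLEMENT[base] for base in reversed(target))


# Hypothetical 5-nt target, for illustration only.
print(reverse_complement_rna("AUGGC"))
```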
Briefly, fetal Rhesus Kidney cells (FRHK-4) and Rhesus Kidney cells (LLC-MK2) were transfected with a recombinant AAV plasmid expressing circular arRNA and eGFP. The rAAV plasmid has AAV2 ITRs flanking an expression cassette, including, from the 5′ to the 3′: U6 promoter operably linked to arRNA, a CAG promoter, EGFP, and WPRE. 48 hours after transfection, RNA samples were obtained from the cells. In FRHK-4 experiments, the cells were subject to FACS sorting based on expression of GFP. Amplicons of the target RNA were obtained, which were subject to NGS analysis.
As shown in
Depending on the target, different positions of the flanking linker have different effects on the on-target editing efficiency. For example, with respect to the mf-PPIA-3 target, the circ-arRNA having a linker that partially replaced the 3′ sequence (-R) in the arRNA region consistently led to on-target editing efficiency comparable to that of the original circ-arRNA171 (
With respect to the mf-IDUA-1 target, circ-arRNAs having a linker that partially replaced either the 5′ sequence (-L) or the 3′ sequence (-R) in the arRNA sequence both reduced on-target editing efficiency. The circ-arRNA having linker sequences flanking both the 5′ end and the 3′ end of the arRNA sequence (RL) had comparable on-target editing efficiency as the original circ-arRNA171 (
Without being bound by any theory, whether including flanking sequences in the circ-arRNA could increase on-target editing efficiency may depend on whether the arRNA could form complex secondary structures. Reduction of off-target editing in the regions of mRNA that correspond to AC50 linker-replaced regions of the arRNA using -L, -R, and -LR circ-arRNAs is consistent with the previous observation that double stranded RNA formation is required for ADAR editing.
To reduce bystander off-target editing, circ-arRNAs with deletion of the U opposite certain non-target A residues were constructed to target mf-PPIA-3 and mf-IDUA-1. Table 1 below lists the positions of the non-target A residues corresponding to the deleted U residue(s) in the circ-arRNAs, and the corresponding circ-arRNA sequences.
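The U-deletion design described above can be sketched as follows, under stated assumptions: the targeting RNA is taken to be the plain reverse complement of the target, bystander positions are given as 0-based indices into the target sequence, and the function name and the short input are hypothetical rather than the sequences listed in Table 1.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def arrna_with_u_deletions(target: str, bystander_positions: list) -> str:
    """Build a targeting RNA (5' to 3') lacking the U opposite each bystander A.

    `bystander_positions` holds 0-based indices of non-target adenosines in
    `target`; the nucleotide that would pair with each one is left out.
    """
    target = target.upper().replace("T", "U")
    skip = set(bystander_positions)
    for pos in skip:
        if not 0 <= pos < len(target):
            raise ValueError(f"position {pos} is outside the target")
        if target[pos] != "A":
            raise ValueError(f"position {pos} is not an adenosine")
    kept = [COMPLEMENT[base] for i, base in enumerate(target) if i not in skip]
    return "".join(reversed(kept))


# Hypothetical 5-nt target with a bystander A at index 1.
print(arrna_with_u_deletions("GAUAC", []))    # full reverse complement
print(arrna_with_u_deletions("GAUAC", [1]))   # U opposite index 1 removed
```

Removing the paired U leaves the bystander adenosine unpaired in the duplex, which fits the observation in this disclosure that double-stranded RNA formation is required for ADAR editing.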
The A to G editing rates at the non-target and target A positions of mf-PPIA-3 and mf-IDUA-1 using the circ-arRNAs are shown in heatmaps in
With respect to mf-IDUA-1 (
Number | Date | Country | Kind |
---|---|---|---|
PCT/CN2021/113290 | Aug 2021 | WO | international |
PCT/CN2022/085144 | Apr 2022 | WO | international |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2022/113304 | 8/18/2022 | WO |