COMPOSITIONS AND METHODS FOR CIRCULAR RNA AFFINITY PURIFICATION

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in XML format and is hereby incorporated by reference in its entirety. Said XML file, created on Dec. 12, 2024, is named 759390_SA9-328PCCON_ST26.xml and is 206,994 bytes in size.

BACKGROUND OF THE DISCLOSURE

Exogenous circularized RNAs (circRNAs) containing a protein coding region are emerging as a valuable molecular tool and an alternative to messenger RNA (mRNA) therapeutics. CircRNAs are single-stranded and characterized by a covalently closed structure. In contrast to linear RNA, circRNAs have elevated stability and a significantly longer half-life, and are resistant to degradation by exonucleases. Uses of exogenous circRNAs include (1) the overexpression of native circRNAs, (2) the engineering of in vitro produced circRNA as a substitute for existing linear mRNA delivery, and/or (3) as described herein, use as part of a production and purification method for linear and/or circular RNA.

Methods for efficiently purifying exogenous circRNA remain a significant obstacle that must be overcome before the protein coding potential of circRNA can be fully realized. This is partly due to the different types and combinations of undesired contaminants in a sample that need to be removed to obtain a pure sample of circRNA. Such contaminants are typically components and by-products of any upstream processes, for example RNA manufacturing and circularization conditions. The sample typically contains the desired circRNA alongside various contaminants such as linear precursor RNA, nicked circular RNA, double stranded RNA, triphosphate-RNA, free nucleotides, endotoxins, and solvents.

There remains a need for more effective, reliable, and safer methods of purifying circRNA from large scale manufacturing processes for potential therapeutic applications which are also economical in terms of the number of steps, the complexity of the steps, and the resources used in the steps.

BRIEF SUMMARY OF THE DISCLOSURE

In one aspect, the disclosure provides a circular RNA comprising a protein coding region and at least one RNA aptamer.

In certain embodiments, an internal ribosome entry site (IRES) is positioned at the 5′ end of the protein coding region.

In certain embodiments, an IRES is positioned at the 3′ end of the protein coding region.

In certain embodiments, the IRES is derived from Coxsackievirus B3 (CVB3), Encephalomyocarditis virus (EMCV), Dicistroviruses, hepatitis C virus (HCV), poliovirus (PV), enterovirus 71 (EV71), human rhinovirus (HRV), foot-and-mouth disease virus (FMDV), or a synthetic IRES.

In certain embodiments, the IRES comprises a polynucleotide sequence of SEQ ID NO: 75.

In certain embodiments, the protein coding region encodes at least one polypeptide or peptide.

In certain embodiments, the polypeptide is a biologically active polypeptide, a therapeutic polypeptide, or an antigenic polypeptide.

In certain embodiments, the circular RNA comprises at least one 5′ internal homology arm and at least one 3′ internal homology arm.

In certain embodiments, the 5′ internal homology arm is about 5 to about 50 nucleotides in length.

In certain embodiments, the 5′ internal homology arm comprises the nucleotide sequence of SEQ ID NO: 70.

In certain embodiments, the 3′ internal homology arm is about 5 to about 50 nucleotides in length.

In certain embodiments, the 3′ internal homology arm comprises the nucleotide sequence of SEQ ID NO: 71.

In certain embodiments, the circular RNA comprises at least one 3′ exon element.

In certain embodiments, the 3′ exon element comprises the nucleotide sequence of SEQ ID NO: 81.

In certain embodiments, the circular RNA comprises at least one 5′ exon element.

In certain embodiments, the 5′ exon element comprises the nucleotide sequence of SEQ ID NO: 83.

In certain embodiments, the circular RNA comprises at least one spacer sequence.

In certain embodiments, the spacer sequence is about 5 to about 75 nucleotides in length.

In certain embodiments, the spacer sequence comprises the nucleotide sequence of SEQ ID NO: 78 or 79.

In certain embodiments, the spacer sequence is positioned at one or both of a 5′ end and 3′ end of any one of the following elements: the protein coding region, the IRES, the 5′ internal homology arm, the 3′ internal homology arm, the 5′ exon element, and the 3′ exon element.

In certain embodiments, the circular RNA comprises the following elements, from 5′ to 3′: a) the 3′ exon element, b) the 5′ internal homology arm, c) the spacer sequence, d) the IRES, e) the protein coding region, f) the spacer sequence, g) the 3′ internal homology arm, and h) the 5′ exon element.

In certain embodiments, the circular RNA comprises the following elements, from 5′ to 3′: a) the 3′ exon element, b) the 5′ internal homology arm, c) the spacer sequence, d) the protein coding region, e) the IRES, f) the spacer sequence, g) the 3′ internal homology arm, and h) the 5′ exon element.

In certain embodiments, the at least one RNA aptamer is positioned at a 5′ end or a 3′ end of any one of elements a)-h).
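
By way of illustration only, the two element orders recited above may be written as ordered lists, with an RNA aptamer placed at the 5′ end or the 3′ end of a chosen element. The following Python sketch is not part of the disclosed compositions; the names LAYOUT_IRES_FIRST, LAYOUT_CDS_FIRST, and place_aptamer are hypothetical and are used solely to make the ordering concrete.

```python
# Illustrative sketch only; all names are hypothetical, not from the disclosure.
# Element orders (5' to 3') as recited for the two circular RNA arrangements.
LAYOUT_IRES_FIRST = [
    "3' exon element",           # a)
    "5' internal homology arm",  # b)
    "spacer sequence",           # c)
    "IRES",                      # d)
    "protein coding region",     # e)
    "spacer sequence",           # f)
    "3' internal homology arm",  # g)
    "5' exon element",           # h)
]

LAYOUT_CDS_FIRST = [
    "3' exon element",           # a)
    "5' internal homology arm",  # b)
    "spacer sequence",           # c)
    "protein coding region",     # d)
    "IRES",                      # e)
    "spacer sequence",           # f)
    "3' internal homology arm",  # g)
    "5' exon element",           # h)
]


def place_aptamer(layout, index, end):
    """Return a copy of layout with an aptamer at the 5' or 3' end of layout[index]."""
    if end not in ("5'", "3'"):
        raise ValueError("end must be \"5'\" or \"3'\"")
    if not 0 <= index < len(layout):
        raise IndexError("index out of range")
    # The 5' end of an element is just before it; the 3' end is just after it.
    position = index if end == "5'" else index + 1
    return layout[:position] + ["RNA aptamer"] + layout[position:]


if __name__ == "__main__":
    # Aptamer at the 3' end of element d) (the IRES) in the first arrangement.
    for element in place_aptamer(LAYOUT_IRES_FIRST, 3, "3'"):
        print(element)
```

The function returns a new list and leaves the input layout unchanged, so several candidate placements can be compared against the same starting arrangement.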

In certain embodiments, the circular RNA contains at least one 5′ untranslated region (5′ UTR), at least one 3′ untranslated region (3′ UTR), and/or at least one polyadenylation (polyA) sequence.

In certain embodiments, the 5′ UTR, the 3′ UTR, and/or the polyA sequence are spacer sequences.

In certain embodiments, the RNA aptamer is embedded in an RNA scaffold.

In certain embodiments, the RNA scaffold comprises at least one secondary structure motif.

In certain embodiments, the secondary structure motif is a tetraloop, a pseudoknot, or a stem-loop.

In certain embodiments, the RNA scaffold comprises at least one tertiary structure.

In certain embodiments, the secondary structure motif and/or tertiary structure are nuclease resistant.

In certain embodiments, the RNA scaffold comprises a transfer RNA (tRNA).

In certain embodiments, the RNA aptamer is embedded in a tRNA hairpin loop of the tRNA.

In certain embodiments, the RNA aptamer is embedded in a tRNA anticodon loop of the tRNA.

In certain embodiments, the RNA aptamer is embedded in a tRNA D loop of the tRNA.

In certain embodiments, the RNA aptamer is S1m, Sm, or a derivative or fragment thereof.

In certain embodiments, the circular RNA comprises between one and four RNA aptamers.

In certain embodiments, the RNA aptamers are identical.

In certain embodiments, at least one of the RNA aptamers is distinct.

In certain embodiments, the RNA aptamer is synthetically derived.

In certain embodiments, the RNA aptamer is a split aptamer or an X-aptamer.

In certain embodiments, the RNA aptamer is naturally-derived.

In certain embodiments, the RNA aptamer is derived from a hairpin RNA, a tRNA, or a riboswitch.

In certain embodiments, the RNA aptamer binds to an affinity ligand.

In certain embodiments, the affinity ligand comprises protein A, protein G, streptavidin, glutathione, dextran, or a fluorescent molecule.

In certain embodiments, the affinity ligand comprises streptavidin.

In certain embodiments, the affinity ligand is immobilized on a chromatography resin.

In certain embodiments, the at least one RNA aptamer is positioned: a) before the 3′ exon element, b) between the 3′ exon element and the 5′ internal homology arm, c) between the 5′ internal homology arm and the 5′ spacer sequence, d) between the 5′ spacer sequence and the IRES, e) between the protein coding region and the 3′ spacer sequence, f) between the 3′ spacer sequence and the 3′ internal homology arm, g) between the 3′ internal homology arm and the 5′ exon element, h) after the 5′ exon element, i) between the 3′ exon and the IRES, and/or j) between the IRES and the 5′ exon element.

In certain embodiments, the at least one RNA aptamer is positioned: a) before the 3′ exon element, b) between the 3′ exon element and the 5′ internal homology arm, c) between the 5′ internal homology arm and the 5′ spacer sequence, d) between the 5′ spacer sequence and the protein coding region, e) between the IRES and the 3′ spacer sequence, f) between the 3′ spacer sequence and the 3′ internal homology arm, g) between the 3′ internal homology arm and the 5′ exon element, h) after the 5′ exon element, i) between the 3′ exon and the protein coding region, and/or j) between the protein coding region and the 5′ exon element.

In certain embodiments, the RNA aptamer comprises the nucleotide sequence of SEQ ID NO: 65 or 66.

In certain embodiments, the RNA aptamer comprises the nucleotide sequence of SEQ ID NO: 84. In certain embodiments, the RNA aptamer comprises the nucleotide sequence of SEQ ID NO: 85. In certain embodiments, the RNA aptamer comprises the nucleotide sequence of SEQ ID NO: 86. In certain embodiments, the RNA aptamer comprises the nucleotide sequence of SEQ ID NO: 87. In certain embodiments, the RNA aptamer comprises the nucleotide sequence of SEQ ID NO: 88. In certain embodiments, the RNA aptamer comprises the nucleotide sequence of SEQ ID NO: 89. In certain embodiments, the RNA aptamer comprises the nucleotide sequence of SEQ ID NO: 90. In certain embodiments, the RNA aptamer comprises the nucleotide sequence of SEQ ID NO: 91. In certain embodiments, the RNA aptamer comprises the nucleotide sequence of SEQ ID NO: 92. In certain embodiments, the RNA aptamer comprises the nucleotide sequence of SEQ ID NO: 93.

In certain embodiments, the RNA aptamer embedded tRNA comprises the nucleotide sequence of SEQ ID NO: 67.

In certain embodiments, the RNA aptamer is about 30-200 nucleotides in length.

In certain embodiments, the RNA aptamer is about 50-200 nucleotides in length.

In certain embodiments, the RNA aptamer is not a histone stem-loop.

In certain embodiments, the circular RNA comprises at least one chemical modification.

In certain embodiments, the chemical modification is pseudouridine, N1-methylpseudouridine, 2-thiouridine, 4′-thiouridine, 5-methylcytosine, 2-thio-1-methyl-1-deaza-pseudouridine, 2-thio-1-methyl-pseudouridine, 2-thio-5-aza-uridine, 2-thio-dihydropseudouridine, 2-thio-dihydrouridine, 2-thio-pseudouridine, 4-methoxy-2-thio-pseudouridine, 4-methoxy-pseudouridine, 4-thio-1-methyl-pseudouridine, 4-thio-pseudouridine, 5-aza-uridine, dihydropseudouridine, 5-methyluridine, 5-methoxyuridine, 2′-O-methyl uridine, or N6-methyladenosine.

In certain embodiments, the chemical modification is pseudouridine, N1-methylpseudouridine, 5-methylcytosine, 5-methoxyuridine, N6-methyladenosine or a combination thereof.

In certain embodiments, the chemical modification is N1-methylpseudouridine.

In another aspect, the disclosure provides a linear precursor RNA comprising at least a self-splicing ribozyme and a protein coding region, wherein the linear precursor RNA comprises at least one RNA aptamer.

In certain embodiments, the self-splicing ribozyme comprises at least two catalytic subunits.

In certain embodiments, the self-splicing ribozyme catalytic subunits derive from either a group I intron or a group II intron RNA transcript or a fragment thereof.

In certain embodiments, the self-splicing ribozyme catalytic subunits derive from a permuted intron-exon (PIE) sequence from the cyanobacterium Anabaena pre-tRNA-Leu gene, the T4 phage Td gene, or Tetrahymena pre-rRNA.

In certain embodiments, the catalytic activity of the two subunits results in a circularized RNA.

In certain embodiments, the linear precursor RNA comprises the following elements, from 5′ to 3′: a) a 5′ external homology arm, b) a 3′ self-splicing PIE fragment, c) a 5′ internal homology arm, d) a 5′ spacer sequence, e) an internal ribosome entry site (IRES), f) a protein coding region, g) a 3′ spacer sequence, h) a 3′ internal homology arm, i) a 5′ self-splicing PIE fragment, and j) a 3′ external homology arm, wherein the RNA aptamer is present at one or both of the 5′ end or 3′ end of any one of elements a)-j).

In certain embodiments, the linear precursor RNA comprises the following elements, from 5′ to 3′: a) a 5′ external homology arm, b) a 3′ self-splicing PIE fragment, c) a 5′ internal homology arm, d) a 5′ spacer sequence, e) a protein coding region, f) an IRES, g) a 3′ spacer sequence, h) a 3′ internal homology arm, i) a 5′ self-splicing PIE fragment, and j) a 3′ external homology arm, wherein the RNA aptamer is present at one or both of the 5′ end or 3′ end of any one of elements a)-j).
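
By way of illustration only, the two linear precursor arrangements recited above differ solely in the relative order of the IRES and the protein coding region. The following Python sketch, which is not part of the disclosure, builds either arrangement and lists the distinct positions at which an RNA aptamer could sit. It assumes that adjacent elements abut directly, so that the 3′ end of one element and the 5′ end of the next are treated as the same junction; the function names precursor_layout and aptamer_junctions are hypothetical.

```python
# Illustrative sketch only; function and variable names are hypothetical.

def precursor_layout(ires_before_coding_region=True):
    """Return the recited 5'-to-3' element order of the linear precursor RNA."""
    middle = ["IRES", "protein coding region"]
    if not ires_before_coding_region:
        middle.reverse()
    return (
        ["5' external homology arm",
         "3' self-splicing PIE fragment",
         "5' internal homology arm",
         "5' spacer sequence"]
        + middle
        + ["3' spacer sequence",
           "3' internal homology arm",
           "5' self-splicing PIE fragment",
           "3' external homology arm"]
    )


def aptamer_junctions(layout):
    """List distinct aptamer positions as (upstream, downstream) element pairs.

    Assumption: adjacent elements abut directly, so the 3' end of one element
    and the 5' end of the next are one junction. None marks a molecule terminus.
    """
    padded = [None] + list(layout) + [None]
    return list(zip(padded[:-1], padded[1:]))


if __name__ == "__main__":
    for upstream, downstream in aptamer_junctions(precursor_layout()):
        print(upstream or "5' terminus", "|", downstream or "3' terminus")
```

Under the stated assumption, the ten elements a)-j) yield eleven distinct junctions: one before element a), one after element j), and nine between adjacent elements.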

In certain embodiments, the 5′ external homology arm and the 3′ external homology arm comprise the nucleotide sequence of SEQ ID NO: 69 or SEQ ID NO: 72.

In certain embodiments, the 5′ external homology arm and the 3′ external homology arm are each independently about 5 to about 50 nucleotides in length.

In certain embodiments, the 5′ self-splicing PIE fragment comprises the nucleotide sequence of SEQ ID NO: 74.

In certain embodiments, the 5′ internal homology arm comprises the nucleotide sequence of SEQ ID NO: 70.

In certain embodiments, the 5′ internal homology arm is about 5 to about 50 nucleotides in length.

In certain embodiments, the 5′ spacer and the 3′ spacer comprise the nucleotide sequence of SEQ ID NO: 78 or SEQ ID NO: 79.

In certain embodiments, the 5′ spacer and the 3′ spacer are each independently about 5 to about 75 nucleotides in length.
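
By way of illustration only, the length ranges recited for the homology arms and spacer sequences may be collected into a simple lookup and used to flag elements of a candidate design that fall outside them. The following Python sketch is not part of the disclosure. The recited ranges are qualified by "about"; treating the endpoints as hard inclusive bounds is an assumption made here purely for illustration, and the names LENGTH_RANGES_NT and out_of_range are hypothetical.

```python
# Illustrative sketch only; names are hypothetical. The disclosure recites
# "about" ranges; hard inclusive bounds are an assumption for illustration.
LENGTH_RANGES_NT = {
    "5' external homology arm": (5, 50),
    "3' external homology arm": (5, 50),
    "5' internal homology arm": (5, 50),
    "3' internal homology arm": (5, 50),
    "5' spacer sequence": (5, 75),
    "3' spacer sequence": (5, 75),
}


def out_of_range(lengths):
    """Return {element: (length, low, high)} for lengths outside the range.

    Elements with no recited range in LENGTH_RANGES_NT are ignored.
    """
    problems = {}
    for element, length in lengths.items():
        if element in LENGTH_RANGES_NT:
            low, high = LENGTH_RANGES_NT[element]
            if not low <= length <= high:
                problems[element] = (length, low, high)
    return problems


if __name__ == "__main__":
    # Hypothetical candidate design; the lengths are made-up example values.
    candidate = {"5' spacer sequence": 60, "3' spacer sequence": 80}
    print(out_of_range(candidate))
```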

In certain embodiments, the 3′ self-splicing PIE fragment comprises the nucleotide sequence of SEQ ID NO: 73.

In certain embodiments, the IRES comprises the nucleotide sequence of SEQ ID NO: 75.

In certain embodiments, the linear precursor RNA comprises at least one 5′ untranslated region (5′ UTR), at least one 3′ untranslated region (3′ UTR), and/or a polyadenylation (polyA) sequence.

In certain embodiments, the protein coding region encodes at least one polypeptide.