CIRCULAR RNA VACCINES AND METHODS OF USE THEREOF

Abstract
The present application provides circular RNAs (circRNAs) encoding therapeutic polypeptides (e.g., an antigenic polypeptide, a functional protein, a receptor protein, or a targeting protein). In some embodiments, the present application provides circRNA vaccines against a coronavirus such as SARS-CoV-2. In some embodiments, the circRNA vaccine comprises a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus. Also provided are methods of treating or preventing a disease or condition using the circRNAs or compositions thereof.
Description
SUBMISSION OF SEQUENCE LISTING ON ASCII TEXT FILE

The content of the following submission on ASCII text file is incorporated herein by reference in its entirety: a computer readable form (CRF) of the Sequence Listing (file name: 165392000242SEQLIST.TXT, date recorded: Aug. 18, 2021, size: 247,489 bytes).


FIELD

The present application relates to circular RNA (circRNA) encoding a therapeutic polypeptide, such as circRNA vaccines against a coronavirus, and methods of use thereof.


BACKGROUND

COVID-19 is a serious worldwide public health emergency caused by a coronavirus infection with the SARS-CoV-2 virus. Currently, no effective drugs or vaccines are available. Thus, there is an urgent need for the development of safe and effective vaccines for coronavirus infections, such as SARS-CoV-2. Vaccines typically fall into two broad categories: vaccines comprising a complete virus (live attenuated vaccines or inactivated vaccines), or vaccines comprising a part of a virus, which can be recombinant protein or DNA or RNA-based vaccines. Vaccines based on a complete virus are subject to several disadvantages, including the need to handle large amounts of an infectious virus during vaccine production for an inactivated vaccine, and the need for extensive safety testing of live attenuated vaccines. Vaccines based on recombinant protein are also limited by global production capacity of recombinant proteins, while DNA-based vaccines suffer from difficulties related to safe delivery of DNA and effectiveness to generate immune responses (Amanat, F. & Krammer, F. SARS-CoV-2 Vaccines: Status Report. (2020) Immunity 52, 583-589).


The development of RNA-based vaccines provides a potential pathway to an immunogenic vaccine without requiring handling of infectious virus during production. RNA molecules are considered to be significantly safer than DNA vaccines, as RNAs are more easily degraded. They are cleared quickly from the organism and cannot integrate into the genome and influence the cell's gene expression in an uncontrollable manner. RNA vaccines are also less likely to cause severe side effects such as the generation of autoimmune disease or anti-DNA antibodies (Bringmann A. et al., Journal of Biomedicine and Biotechnology (2010), vol. 2010, article ID 623687). Transfection with RNA requires only delivery into the cell's cytoplasm, which is easier to achieve than delivery into the nucleus.


BRIEF SUMMARY

The present application provides circRNAs encoding polypeptides, such as therapeutic polypeptides, and methods of treatment using the circRNAs. In some embodiments, the present application provides novel vaccines against a coronavirus (e.g., SARS-CoV-2) based on circular RNAs (circRNA). Optionally the SARS-CoV-2 infection is caused by a SARS-CoV-2 variant (e.g., B.1.351 or B.1.617.2 variant). Also provided are methods of producing the circRNA vaccines and methods of treating or preventing a coronavirus infection using the circRNA vaccines.


One aspect of the present application provides a circular RNA (circRNA) comprising a nucleic acid sequence encoding a therapeutic polypeptide, wherein the therapeutic polypeptide is selected from the group consisting of an antigenic polypeptide, a functional protein, a receptor protein, and a targeting protein. In some embodiments, the circRNA further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide.


In some embodiments according to any one of the circRNAs described above, the circRNA further comprises an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide.


In some embodiments according to any one of the circRNAs described above, the circRNA further comprises an internal ribosomal entry site (IRES) sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the IRES sequence is a CVB3 virus, EV71 virus, EMCV virus, PV virus, or CSFV virus IRES sequence. In some embodiments, the circRNA comprises a nucleic acid sequence comprising from the 5′ end to the 3′ end: the IRES sequence, the Kozak sequence, and the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments according to any one of the circRNAs comprising an IRES sequence described above, the circRNA further comprises a polyAC or polyA sequence disposed at the 5′ end of the IRES.


In some embodiments according to any one of the circRNAs described above, the circRNA further comprises an m6A modification motif sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the circRNA comprises a nucleic acid sequence comprising from the 5′ end to the 3′ end: the m6A modification motif sequence, the Kozak sequence, and the nucleic acid sequence encoding the therapeutic polypeptide.


In some embodiments according to any one of the circRNAs described above, the nucleic acid further encodes a signal peptide (SP) fused to the N-terminus of the therapeutic polypeptide. In some embodiments, the SP is an SP of a human tissue plasminogen activator (tPA) or an SP of a human IgE immunoglobulin (e.g., the sequence shown in SEQ ID NO: 16). In some embodiments, the SP is an SP of a human IgE immunoglobulin (e.g., the sequence shown in SEQ ID NO: 17).


In some embodiments according to any one of the circRNAs described above, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the 3′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 21, and the 5′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 22.


In some embodiments according to any one of the circRNAs described above, the circRNA is circularized in vitro.


In some embodiments according to any one of the circRNAs described above, the therapeutic polypeptide is for treating or preventing an infection. In some embodiments, the infection is an infection by a virus, such as a coronavirus. In some embodiments, the coronavirus is selected from the group consisting of SARS-CoV, MERS-CoV, and SARS-CoV-2. In some embodiments, the coronavirus is SARS-CoV-2.


One aspect of the present application provides a circRNA comprising a nucleic acid sequence encoding a therapeutic polypeptide, wherein the therapeutic polypeptide is an antigenic polypeptide. In some embodiments, the circRNA is according to any one of the circRNAs described above.


One aspect of the present application provides a circRNA comprising a nucleic acid sequence encoding a therapeutic polypeptide, wherein the therapeutic polypeptide is a receptor protein. In some embodiments, the circRNA is according to any one of the circRNAs described above. In some embodiments, the therapeutic polypeptide is a soluble receptor comprising an extracellular domain of a naturally occurring receptor. In some embodiments, the receptor is an ACE2 receptor. In some embodiments, the receptor is a high-affinity mutant ACE2 receptor.


In some embodiments, there is provided a composition comprising a plurality of circRNAs according to any one of the circRNAs encoding a receptor protein described above, wherein the receptor proteins corresponding to the plurality of circRNAs are different with respect to each other. In some embodiments, the plurality of circRNAs target a plurality of strains of a coronavirus, e.g., SARS-CoV-2.


One aspect of the present application provides a circRNA comprising a nucleic acid sequence encoding a therapeutic polypeptide, wherein the therapeutic polypeptide is a targeting protein. In some embodiments, the circRNA is according to any one of the circRNAs described above. In some embodiments, the targeting protein is an antibody. In some embodiments, the antibody is a neutralizing antibody, e.g., a neutralizing antibody targeting a coronavirus such as SARS-CoV-2. In some embodiments, the targeting protein is a therapeutic antibody.


In some embodiments, there is provided a composition comprising a plurality of circRNAs according to any one of the circRNAs encoding a targeting protein described above, wherein the targeting proteins corresponding to the plurality of circRNAs are different with respect to each other. In some embodiments, the targeting proteins are neutralizing antibodies. In some embodiments, the plurality of circRNAs target a plurality of strains of a coronavirus, e.g., SARS-CoV-2.


One aspect of the present application provides a circRNA comprising a nucleic acid sequence encoding a therapeutic polypeptide, wherein the therapeutic polypeptide is a functional protein. In some embodiments, the circRNA is according to any one of the circRNAs described above. In some embodiments, the functional protein is a tumor suppressor, such as p53 or PTEN. In some embodiments, the functional protein is an enzyme, such as OTC, FAH, or IDUA. In some embodiments, the functional protein is selected from the group consisting of DMD, COL3A1, BMPR2, AHI1, FANCC, MYBPC3, and IL2RG. In some embodiments, the therapeutic polypeptide comprises a sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 18-25.


One aspect of the present application provides a circular RNA (circRNA) comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus. In some embodiments, the circRNA is according to any one of the circRNAs described above. In some embodiments, the coronavirus is SARS-CoV, MERS-CoV, or SARS-CoV-2. In some embodiments, the coronavirus is SARS-CoV-2. In some embodiments, the S protein or fragment thereof comprises a D614G mutation.


In some embodiments according to any one of the circRNAs encoding an antigenic polypeptide described above, the antigenic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 8-10 and 62-63. In some embodiments, the circRNA comprises a nucleic acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) sequence identity to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 11-15 and 64.
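The sequence identity thresholds recited above can be illustrated with a short sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identity as matching columns divided by the total alignment length. This passage does not fix an alignment method or an identity definition, so both choices, along with the function names and toy sequences, are assumptions made only for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = identical non-gap columns / alignment length.
    Other conventions (e.g., dividing by the shorter sequence) give
    different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the pair reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy peptides (hypothetical, not taken from the Sequence Listing).
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKV"  # one substitution in ten positions
print(percent_identity(reference, variant))
print(meets_threshold(reference, variant, threshold=80.0))
```

In practice the alignment itself would come from a dedicated tool; the sketch only shows how a threshold such as 80% or 95% is applied once an alignment exists.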


In some embodiments according to any one of the circRNAs encoding an antigenic polypeptide described above, the antigenic polypeptide comprises a receptor-binding domain (RBD) of the S protein. In some embodiments, the RBD comprises amino acid residues 319 to 542 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the RBD comprises an amino acid sequence having at least about 80% identity (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100% identity) to the amino acid sequence of SEQ ID NO: 2.
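Residue ranges such as "amino acid residues 319 to 542" are 1-based and inclusive, which differs from the 0-based, end-exclusive slicing of most programming languages. The sketch below shows the conversion. The reference string is a synthetic placeholder of length 1273 and is not SEQ ID NO: 1; the helper name is hypothetical.

```python
def residue_range(seq: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive numbering."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"range {start}-{end} is outside 1-{len(seq)}")
    return seq[start - 1:end]


# Synthetic placeholder with the same length as a full-length S protein
# numbered 1-1273. It is NOT the sequence of SEQ ID NO: 1.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
toy_reference = "".join(ALPHABET[i % len(ALPHABET)] for i in range(1273))

rbd = residue_range(toy_reference, 319, 542)   # RBD range recited above
s2 = residue_range(toy_reference, 686, 1273)   # S2 range recited in this summary
print(len(rbd), len(s2))
```

The inclusive convention means the range 319 to 542 spans 542 - 319 + 1 residues.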


In some embodiments according to any one of the circRNAs encoding an antigenic polypeptide described above, the antigenic polypeptide comprises a receptor-binding domain (RBD) of the S protein, wherein the RBD comprises an amino acid sequence having at least about 80% identity (e.g., at least about 85%, 90%, 95%, 98%, 99% or more, or 100% identity) to the amino acid sequence of SEQ ID NO: 63.


In some embodiments according to any one of the circRNAs comprising an RBD described above, the antigenic polypeptide further comprises a multimerization domain. In some embodiments, the multimerization domain is a C-terminal Foldon (Fd) domain of a T4 fibritin protein or a GCN4-based isoleucine zipper domain. In some embodiments, the multimerization domain comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 3 or SEQ ID NO: 4. In some embodiments, the RBD is fused to the multimerization domain via a peptide linker. In some embodiments, the peptide linker comprises the amino acid sequence of SEQ ID NO: 5.


In some embodiments according to any one of the circRNAs encoding an antigenic polypeptide described above, the antigenic polypeptide comprises an S2 region of the S protein. In some embodiments, the S2 region comprises amino acid residues 686 to 1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region comprises one or more mutations that stabilize a pre-fusion conformation of the S protein. In some embodiments, the one or more mutations comprise K986P and V987P. In some embodiments, the S2 region comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 6 or SEQ ID NO: 7.


In some embodiments according to any one of the circRNAs encoding an antigenic polypeptide described above, the antigenic polypeptide comprises amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the antigenic polypeptide comprises one or more mutations that inhibit cleavage of the S protein. In some embodiments, the one or more mutations that inhibit cleavage of the S protein comprise deletion of amino acid residues 681-684, wherein the numbering is based on SEQ ID NO: 1.
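The substitutions (K986P, V987P) and the deletion of residues 681-684 are all expressed in the numbering of the reference sequence. A minimal sketch of applying them follows. It assumes substitutions are applied before the deletion so that reference numbering stays valid. The toy sequence, the filler residues, and the helper names are hypothetical and do not reproduce SEQ ID NO: 1.

```python
import re


def substitute(seq: str, mutation: str) -> str:
    """Apply a point substitution such as 'K986P' (1-based numbering).

    Raises ValueError if the stated wild-type residue is not found at
    that position, which guards against numbering mistakes.
    """
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if m is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    wild_type, pos, new = m.group(1), int(m.group(2)), m.group(3)
    if not 1 <= pos <= len(seq):
        raise ValueError(f"position {pos} is outside 1-{len(seq)}")
    if seq[pos - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + new + seq[pos:]


def delete_range(seq: str, start: int, end: int) -> str:
    """Delete residues start..end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"range {start}-{end} is outside 1-{len(seq)}")
    return seq[:start - 1] + seq[end:]


# Toy stand-in of length 1273: glycine filler with marker residues
# placed at the positions discussed in the text (assumption for illustration).
residues = ["G"] * 1273
residues[680:684] = "PRRA"           # toy residues at positions 681-684
residues[985], residues[986] = "K", "V"  # toy residues at positions 986, 987
toy_spike = "".join(residues)

# Substitutions first, while reference numbering still applies.
stabilized = substitute(substitute(toy_spike, "K986P"), "V987P")
# Deleting four residues shifts every downstream position by four.
cleavage_resistant = delete_range(stabilized, 681, 684)
print(len(cleavage_resistant))
```

Applying the deletion first would shift positions 986 and 987 to 982 and 983, so a later substitution written in reference numbering would hit the wrong residues or fail the wild-type check.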


In some embodiments, there is provided a composition comprising a plurality of circRNAs according to any one of the circRNAs encoding an antigenic polypeptide described above, wherein the antigenic polypeptides corresponding to the plurality of circRNAs are different with respect to each other. In some embodiments, the plurality of circRNAs target a plurality of strains of a coronavirus, e.g., SARS-CoV-2.


In some embodiments, there is provided a circRNA vaccine comprising a circRNA or a plurality of circRNAs according to any one of the circRNAs encoding an antigenic polypeptide described above.


In some embodiments, there is provided a pharmaceutical composition comprising the circRNA according to any one of the circRNAs described above, and a pharmaceutically acceptable carrier.


In some embodiments according to any one of the circRNA vaccines or pharmaceutical compositions described above, the circRNA vaccine or the pharmaceutical composition further comprises a transfection agent. In some embodiments, the transfection agent is polyethylenimine (PEI) or a lipid nanoparticle (LNP). In some embodiments, the LNP comprises MC3-lipid, DSPC, cholesterol, and PEG2000-DMG. In some embodiments, the circRNA or the pharmaceutical composition is not formulated with a transfection agent.


Other aspects of the present application provide methods of treating or preventing a coronavirus infection in an individual, comprising administering to the individual an effective amount of any one of the circRNA vaccines described above. In some embodiments, the infection is SARS-CoV-2 infection. In some embodiments, the circRNA is subject to rolling circle translation by a ribosome in the individual.


Another aspect of the present application provides a method of treating or preventing a disease or condition in an individual, comprising administering to the individual an effective amount of any one of the circRNAs described above or any one of the pharmaceutical compositions described above. In some embodiments, wherein the circRNA encodes an antigenic polypeptide, a receptor or a targeting protein (e.g., antibody), the disease or condition is an infection, such as a viral infection. In some embodiments, the disease or condition is a disease or condition associated with insufficient levels and/or activity of a naturally-occurring protein corresponding to the therapeutic polypeptide. In some embodiments, the disease or condition is a hereditary genetic disease associated with one or more mutations in the protein corresponding to the therapeutic polypeptide. In some embodiments, the therapeutic polypeptide is TP53 or PTEN, and the disease or condition is cancer. In some embodiments, the therapeutic polypeptide is OTC, and the disease is ornithine transcarbamylase deficiency. In some embodiments, the therapeutic polypeptide is FAH, and the disease is tyrosinemia. In some embodiments, the therapeutic polypeptide is DMD, and the disease is Duchenne and Becker muscular dystrophy, X-linked dilated cardiomyopathy, or familial dilated cardiomyopathy. In some embodiments, the therapeutic polypeptide is IDUA, and the disease or condition is Mucopolysaccharidosis type I (MPS I). In some embodiments, the therapeutic polypeptide is COL3A1, and the disease or condition is Ehlers-Danlos syndrome. In some embodiments, the therapeutic polypeptide is AHI1, and the disease or condition is Joubert syndrome. In some embodiments, the therapeutic polypeptide is BMPR2, and the disease or condition is pulmonary arterial hypertension or pulmonary veno-occlusive disease. In some embodiments, the therapeutic polypeptide is FANCC, and the disease or condition is Fanconi anemia. In some embodiments, the therapeutic polypeptide is MYBPC3, and the disease or condition is primary familial hypertrophic cardiomyopathy. In some embodiments, the therapeutic polypeptide is IL2RG, and the disease or condition is X-linked severe combined immunodeficiency. In some embodiments, the circRNA is subject to rolling circle translation by a ribosome in the individual.


Other aspects of the present application provide a linear RNA capable of forming the circRNA of any one of the circRNAs provided herein.


In some embodiments, the linear RNA can be circularized by autocatalysis of a Group I intron comprising a 5′ catalytic Group I intron fragment and a 3′ catalytic Group I intron fragment. In some embodiments, the linear RNA comprises a 3′ catalytic Group I intron fragment flanking the 5′ end of a 3′ exon sequence recognizable by a Group I intron, and a 5′ catalytic Group I intron fragment flanking the 3′ end of a 5′ exon sequence recognizable by a Group I intron. In some embodiments, the 3′ catalytic Group I intron fragment comprises the sequence of SEQ ID NO: 28, and the 5′ catalytic Group I intron fragment comprises the sequence of SEQ ID NO: 29. In some embodiments, the linear RNA further comprises a 5′ homology sequence flanking the 5′ end of the 3′ catalytic Group I intron fragment, and a 3′ homology sequence flanking the 3′ end of the 5′ catalytic Group I intron fragment. In some embodiments, the 5′ homology sequence comprises the nucleic acid sequence of SEQ ID NO: 23, and the 3′ homology sequence comprises the nucleic acid sequence of SEQ ID NO: 24.
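The element order of the linear precursor and of the resulting circle can be written down as a small data model. The sketch below is a simplification: it assumes splicing joins the RNA exactly at the fragment boundaries, so that the circle consists of the 3′ exon sequence, the insert, and the 5′ exon sequence, with the intron fragments and homology sequences released. All sequences are toy placeholders rather than SEQ ID NOs, and the names are hypothetical.

```python
# Element order of the linear precursor, 5'->3', following the text.
PRECURSOR_ORDER = (
    "homology_5p",     # 5' homology sequence
    "intron_3p_frag",  # 3' catalytic Group I intron fragment
    "exon_3p",         # 3' exon sequence (Exon 2)
    "insert",          # e.g., IRES + Kozak + coding sequence
    "exon_5p",         # 5' exon sequence (Exon 1)
    "intron_5p_frag",  # 5' catalytic Group I intron fragment
    "homology_3p",     # 3' homology sequence
)

# Elements assumed to remain in the circle after self-splicing.
CIRCLE_ORDER = ("exon_3p", "insert", "exon_5p")


def join_elements(parts: dict, order: tuple) -> str:
    """Concatenate named elements 5'->3' in the given order."""
    missing = [name for name in order if name not in parts]
    if missing:
        raise KeyError(f"missing elements: {missing}")
    return "".join(parts[name] for name in order)


def same_circle(a: str, b: str) -> bool:
    """True if b is a rotation of a; a circle has no fixed start point."""
    return len(a) == len(b) and b in (a + a)


# Toy placeholder sequences (hypothetical).
toy_parts = {
    "homology_5p": "GGGAAA",
    "intron_3p_frag": "CCCUUU",
    "exon_3p": "AAC",
    "insert": "AUGGCUGGA",
    "exon_5p": "GUC",
    "intron_5p_frag": "GAGAGA",
    "homology_3p": "UUUCCC",
}

precursor = join_elements(toy_parts, PRECURSOR_ORDER)
circle = join_elements(toy_parts, CIRCLE_ORDER)
print(precursor)
print(circle)
```

Because a circular molecule can be written starting from any nucleotide, `same_circle` compares sequences up to rotation, which is useful when checking a sequenced junction against the expected product.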


In some embodiments, the linear RNA can be circularized by a ligase (e.g., an RNA ligase). In some embodiments, the ligase is selected from the group consisting of a T4 DNA ligase (T4 Dnl), a T4 RNA ligase 1 (T4 Rnl1) and a T4 RNA ligase 2 (T4 Rnl2). In some embodiments, the linear RNA comprises a 5′ ligation sequence at the 5′ end of the nucleic acid sequence encoding the circRNA, and a 3′ ligation sequence at the 3′ end of the nucleic acid sequence encoding the circRNA, wherein the 5′ ligation sequence and the 3′ ligation sequence can be ligated to each other via the ligase.


One aspect of the present application provides a nucleic acid construct comprising a nucleic acid sequence encoding any one of the linear RNAs described above. In some embodiments, the nucleic acid construct comprises a T7 promoter operably linked to the nucleic acid sequence encoding the linear RNA.


One aspect of the present application provides a method of producing a circRNA, comprising: (a) subjecting any one of the linear RNAs described above, wherein the linear RNA comprises a 3′ catalytic Group I intron fragment flanking the 5′ end of a 3′ exon sequence recognizable by a Group I intron, and a 5′ catalytic Group I intron fragment flanking the 3′ end of a 5′ exon sequence recognizable by a Group I intron, to a condition that activates autocatalysis of the 5′ catalytic Group I intron fragment and the 3′ catalytic Group I intron fragment to provide a circularized RNA product; and (b) isolating the circularized RNA product, thereby providing the circRNA vaccine.


One aspect of the present application provides a method of producing a circRNA, comprising: (a) contacting any one of the linear RNAs described above, wherein the linear RNA comprises a 5′ ligation sequence at the 5′ end of the nucleic acid sequence encoding the circRNA, and a 3′ ligation sequence at the 3′ end of the nucleic acid sequence encoding the circRNA, with a single-stranded adaptor nucleic acid comprising from the 5′ end to the 3′ end: a first sequence complementary to the 3′ ligation sequence and a second sequence complementary to the 5′ ligation sequence, and wherein the 5′ ligation sequence and the 3′ ligation sequence hybridize to the single-stranded adaptor nucleic acid to provide a duplex nucleic acid intermediate comprising a single strand break between the 3′ end of the 5′ ligation sequence and the 5′ end of the 3′ ligation sequence; (b) contacting the intermediate with an RNA ligase under a condition that allows ligation of the 5′ ligation sequence to the 3′ ligation sequence to provide a circularized RNA product; and (c) isolating the circularized RNA product, thereby providing the circRNA vaccine.


One aspect of the present application provides a method of producing a circRNA, comprising: (a) contacting any one of the linear RNAs described above, wherein the linear RNA comprises a 5′ ligation sequence at the 5′ end of the nucleic acid sequence encoding the circRNA, and a 3′ ligation sequence at the 3′ end of the nucleic acid sequence encoding the circRNA, with an RNA ligase under a condition that allows ligation of the 5′ ligation sequence to the 3′ ligation sequence to provide a circularized RNA product; and (b) isolating the circularized RNA product, thereby providing the circular RNA.


In some embodiments according to any one of the methods of producing a circRNA vaccine described above, the method further comprises obtaining the linear RNA by in vitro transcription of a nucleic acid construct comprising a nucleic acid sequence encoding the linear RNA.


In some embodiments according to any one of the methods of producing a circRNA vaccine described above, the method further comprises purifying the circularized RNA product.


Also provided are compositions, kits and articles of manufacture for use in any one of the methods described above.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1A shows an exemplary method of generating a circRNA vaccine in vitro based on a Group I catalytic intron. A typical Group I catalytic intron comprises, from the 5′ end to the 3′ end: a 5′ exon comprising a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment (Exon 1), a 5′ catalytic Group I intron fragment, a 3′ catalytic Group I intron fragment, and a 3′ exon comprising a 3′ exon sequence recognizable by the 3′ catalytic Group I intron fragment (Exon 2). A linear RNA construct with an insert sequence can be made to allow autocatalysis of the Group I intron fragments in order to join the two ends of the insert sequence and obtain a circular RNA after self-splicing by the Group I intron. The linear construct comprises, from 5′ to 3′, a 3′ catalytic Group I intron fragment, a 3′ exon (Exon 2), an insert sequence, a 5′ exon (Exon 1), and a 5′ catalytic Group I intron fragment. The insert sequence may comprise the nucleic acid sequence encoding the antigenic polypeptide.



FIG. 1B shows a schematic of an exemplary nucleotide sequence having an IRES, and an exemplary method of circularizing purified linear RNAs by ribozyme autocatalysis of the Group I catalytic intron.



FIG. 1C shows a schematic of an exemplary nucleotide sequence having an m6A modification motif sequence before the start codon, and an exemplary method of circularizing purified linearized RNAs by ribozyme autocatalysis of the Group I catalytic intron.



FIG. 2A shows a schematic of an exemplary nucleotide sequence having an IRES, and an exemplary method of circularizing linear RNAs by enzyme catalysis using T4 RNA ligases by supplying ssDNA adaptors.



FIG. 2B shows a schematic of an exemplary nucleotide sequence wherein the IRES sequence is replaced with an m6A modification motif and the TAA stop codon is replaced with a 2A peptide coding sequence (in non-limiting examples, a T2A, P2A or other 2A peptide-coding sequence). Also shown is an exemplary method of circularizing linear RNAs by enzyme catalysis using T4 RNA ligases by supplying ssDNA adaptors.



FIG. 2C shows a schematic illustration of the ribosome rolling circle translation of the circRNA vaccine. The translation factors can be recruited and translation initiated by either the IRES site or the m6A modification motifs.
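Rolling circle translation implies that a ribosome can keep reading around the circle without terminating. One simple sequence-level check is sketched below. It assumes that continuous translation requires a circle length divisible by three and no in-frame stop codon over one full lap from the start codon. These conditions are a simplified model introduced for illustration, not a statement of the applicants' design rules, and the function name and toy sequences are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def supports_rolling_circle(circ_rna: str, start: int) -> bool:
    """Check a circular RNA for an uninterrupted, frame-preserving ORF.

    circ_rna: circle written 5'->3' from an arbitrary nucleotide (RNA alphabet).
    start:    0-based index of the A of the AUG start codon.
    """
    n = len(circ_rna)
    if n == 0 or n % 3 != 0:
        return False  # the frame would shift on the second lap
    if not 0 <= start < n:
        raise ValueError("start index is outside the circle")
    doubled = circ_rna + circ_rna  # lets codons be read across the junction
    if doubled[start:start + 3] != "AUG":
        return False
    # Read exactly one lap, codon by codon, in the frame of the AUG.
    for offset in range(0, n, 3):
        if doubled[start + offset:start + offset + 3] in STOP_CODONS:
            return False
    return True


# Toy circles (hypothetical sequences).
print(supports_rolling_circle("AUGGCUGGA", 0))  # no stop, length 9
print(supports_rolling_circle("AUGGCUUAA", 0))  # ends with a UAA stop
print(supports_rolling_circle("AAAUGGCUU", 2))  # stop formed across the junction
```

The third example shows why the junction matters: the stop codon UAA only appears when the last nucleotide of the written sequence is read together with the first two.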



FIG. 3A shows the results of agarose gel electrophoresis of an exemplary purified circRNARBD and precursor RNA (LinRNARBD, wherein the 3′ Intron sequence was mutated to random sequence), demonstrating that the circRNARBD ran faster than LinRNARBD, indicating circularization of the RNA.



FIG. 3B shows the results of an exonuclease RNase R digestion assay of an exemplary circRNA (circRNARBD) or LinRNA (LinRNARBD). Following incubation with RNase R for the indicated time periods, the reaction products were resolved by agarose gel electrophoresis, indicating that the circRNA, which lacks a 5′ or 3′ end, was more resistant to RNase R than the LinRNA.



FIG. 3C shows the agarose gel electrophoresis result of the PCR products of linear RNARBD and circRNARBD, using the primers shown in FIG. 3E.



FIG. 3D shows the results of a quantitative ELISA assay to measure the concentration of RBD antigens in the supernatant. The data are shown as the mean±S.E.M. (n=3).



FIG. 3E shows a schematic diagram of circRNARBD circularization by the Group I ribozyme autocatalysis. SP, signal peptide sequence of human tPA protein. T4, the trimerization domain from bacteriophage T4 fibritin protein. RBD, the receptor binding domain of SARS-CoV-2 Spike protein. The arrows indicate the design of primers for PCR analysis shown in FIG. 3C.



FIGS. 4A-4B show Western Blot analysis demonstrating expression and secretion of an exemplary protein from eukaryotic cells after transfection with an exemplary circRNA. Human HEK293T cells (FIG. 4A) and mouse NIH3T3 cells (FIG. 4B) were transfected with circRNARBD, or with circRNAEGFP or the precursor RNA named LinRNARBD as controls. After 48 hours, the culture supernatant of transfected cells was collected for Western Blot analysis. Using the SARS-CoV-2 Spike RBD antibody (ABclonal, A20135) for detection, the Western Blot results showed that the circRNARBD could efficiently express the SARS-CoV-2 RBD antigen and secrete it into the cellular supernatant.



FIG. 4C shows results demonstrating the stability of an exemplary circRNA after extended incubation at room temperature. Purified circRNARBD was kept at room temperature (about 25° C.) for 3, 7 or 14 days, and then transfected into human HEK293T cells. The Western Blot results showed that the SARS-CoV-2 RBD antigen could be expressed by the circRNARBD and secreted into the cellular supernatant efficiently, even when the circRNARBD had been kept at room temperature for 14 days.



FIG. 4D shows the results of ELISA analysis measuring the expression level of RBD antigens in the supernatant of HEK293T cells transfected with circRNARBD-LNP formulations stored for different shelf times (1, 3, 7, 14, 24 and 31 days) at 4° C. or room temperature (˜25° C.). The data are shown as the mean±S.E.M. (n=3 or 4).



FIGS. 5A-5B show results of a pseudovirus competition experiment demonstrating that the secreted SARS-CoV-2 RBD antigen produced by circRNA effectively interfered with infection of cells by a SARS-CoV-2 pseudovirus. Collected supernatant from HEK293T cells transfected with circRNARBD or controls was incubated with a lentivirus-based SARS-CoV-2 pseudovirus expressing an EGFP fluorescence marker at 37° C. for 2 hours. The resulting supernatant was then added into the culture medium of ACE2-overexpressing cells named HEK293-ACE2. After 48 hours, the cells were collected for FACS analysis for the EGFP marker, indicating infection of the cells by the pseudovirus. The results are shown as a bar graph in FIG. 5A, and the FACS plots are shown in FIG. 5B.



FIGS. 6A-6E show results demonstrating that immunization of mice with the circRNARBD or the circRNASpike results in the production of RBD-specific neutralizing antibodies. BALB/c mice were immunized with either the circRNARBD or the circRNASpike. The first immunization was conducted via intramuscular injection at day 0, and a second dose was given at day 14 to boost the immune response (FIG. 6A). At day 28, the serum of the immunized mice was collected for the following assays (FIG. 6A). First, the RBD-specific IgG titer was measured by ELISA, which showed that the IgG titer of the circRNARBD (10 μg) group was about 32000 and the IgG titer of the circRNARBD (50 μg) group was about 64000, while the Placebo group had almost no RBD-specific IgG signal (FIG. 6B). Next, an in vitro surrogate neutralizing assay was used to measure the neutralization activity of the immunized mouse serum, and the result showed that the circRNARBD (10 μg) group had about 70% neutralization activity and the circRNARBD (50 μg) group had over 95% neutralization activity (FIG. 6C). Finally, a lentivirus-based SARS-CoV-2 pseudovirus coated with SARS-CoV-2 spike protein was used to determine the neutralizing activity at the cell level. The serum of immunized mice was incubated with the SARS-CoV-2 pseudovirus, and the mixture was then added to a culture of ACE2-overexpressing HEK293T cells. After 48 hours, the luciferase reporter activity of the pseudovirus was measured. The luciferase assay results showed that both the circRNARBD and the circRNASpike could induce SARS-CoV-2 spike-specific neutralizing antibodies that block infection by the pseudovirus (FIG. 6D and FIG. 6E, respectively).



FIGS. 7A-7B show results demonstrating increased spleen weight following immunization of mice with circRNARBD (10 μg) or circRNARBD (50 μg) compared to placebo. Four weeks after the second dose of circRNA vaccine or placebo, the mice were sacrificed and the spleens of the immunized mice were isolated (FIG. 7A). The weight of each mouse was then measured, and the spleen weight in the circRNARBD (10 μg) and circRNARBD (50 μg) groups was significantly higher than in the placebo group (FIG. 7B).



FIG. 8A shows an exemplary method of generating a circRNA and a schematic of an exemplary circRNA construct for expression of a neutralizing antibody, such as a SARS-CoV-2 neutralizing antibody. Although the construct shown comprises an IRES sequence, it will be appreciated that any of the exemplary circRNA constructs described herein can be used for expression of a secreted neutralizing antibody (e.g., a variation of the construct comprising an m6A site and/or a 2A peptide instead of a stop codon, as shown in FIG. 1C).



FIG. 8B shows the pseudovirus neutralization activity of a secreted nAb produced by exemplary circRNA-nAb constructs (circRNAnAb-1, comprising a nucleotide sequence encoding nAb-1 (amino acid sequence shown in SEQ ID NO: 27); circRNAnAb-2, comprising a nucleotide sequence encoding nAb-2 (amino acid sequence shown in SEQ ID NO: 28); and circRNAnAB-5, comprising a nucleotide sequence encoding nAb-5 (amino acid sequence shown in SEQ ID NO: 31)). A circRNA expressing Luciferase (circRNALuc) and a linear RNA encoding nAb-5 (LinRNAnAB-5) were used as negative controls and a commercial SARS-CoV-2 neutralizing antibody (ABclonal, A19215) was used as the positive control.



FIG. 8C shows the results of a lentiviral-based pseudovirus neutralization assay with the supernatant from cells transfected with circRNA encoding neutralizing nanobodies nAB1, nAB1-Tri, nAB2, nAB2-Tri, nAB3 and nAB3-Tri or ACE2 decoys. The luciferase value was normalized to the circRNAEGFP control. The data were shown as the mean±S.E.M. (n=2).



FIG. 8D shows the results of a neutralization assay of VSV-based D614G, B.1.1.7 or B.1.351 pseudovirus by the supernatant from cells transfected with neutralizing nanobodies nAB1-Tri, nAB3-Tri or ACE2 decoys expressed through the circRNA platform. The data were shown as the mean±S.E.M. (n=3).



FIG. 9A shows an exemplary method of generating a circRNA and a schematic of an exemplary circRNA construct for expression of a therapeutic polypeptide, such as IDUA. A mouse α-1-iduronidase (IDUA) coding sequence was inserted into the circRNA backbone. Although the construct shown comprises an IRES sequence, and a nucleotide sequence coding for IDUA, it will be appreciated that any of the exemplary circRNA constructs described herein can be used for expression of any of the therapeutic polypeptides described herein (e.g., a variation of the construct comprising an m6A site and/or a 2A peptide instead of a stop codon, as shown in FIG. 1C).



FIGS. 9B-9C show the results of an α-1-iduronidase assay demonstrating that the circRNA-IDUA, but not a linear RNA control (LinRNA-IDUA), could efficiently recover the catalytic activity of α-1-iduronidase in primary MEF cells from Hurler Syndrome mouse models (FIG. 9B) as well as in human HEK293T/IDUA−/− cells (FIG. 9C).



FIG. 10 shows results demonstrating in vivo restoration of the catalytic activity of α-1-iduronidase in Hurler Syndrome mouse models by injection of encapsulated circRNA-IDUA. The purified circRNA-IDUA was encapsulated and delivered into Hurler Syndrome mice via tail-vein injection at a dosage of 30 μg per mouse. After 4 or 24 hours, the Hurler Syndrome mice were sacrificed to isolate the liver tissues and assay α-1-iduronidase activity. The results demonstrated that circRNA-IDUA could efficiently restore the catalytic activity of α-1-iduronidase in Hurler Syndrome mouse models, reaching nearly 20% of the activity of the wildtype mouse, and that the catalytic activity increased from 4 hours to 24 hours, indicating that the circRNA-IDUA could be utilized in the therapy of genetic diseases.



FIGS. 11A-11H provide results demonstrating humoral immune responses in mice immunized with SARS-CoV-2 circRNARBD vaccines. FIG. 11A shows a schematic representation of an LNP-circRNA complex. FIG. 11B shows a representative concentration-size graph of LNP-circRNARBD measured by the dynamic light scattering method. FIG. 11C shows a schematic diagram of the LNP-circRNARBD vaccination process in BALB/c mice and the serum collection schedule for specific antibody analysis. FIG. 11D shows results of measuring the SARS-CoV-2 specific IgG antibody titer with ELISA. The data were shown as the mean±S.E.M. (n=4 or 5). Each symbol represented an individual mouse. FIG. 11E shows a sigmoidal curve diagram of the inhibition rate by sera of immunized mice with surrogate virus neutralization assay. Sera from circRNARBD (10 μg) and circRNARBD (50 μg) immunized mice were collected at 2 weeks post the second dose. The data were shown as the mean±S.E.M. (n=4). FIG. 11F shows a sigmoidal curve diagram of the inhibition rate by sera of immunized mice with surrogate virus neutralization assay. Sera from circRNARBD (10 μg) and circRNARBD (50 μg) immunized mice were collected at 5 weeks post boost. The data were shown as the mean±S.E.M. (n=5). FIG. 11G shows the NT50 of circRNARBD, calculated using lentivirus-based SARS-CoV-2 pseudovirus. The data were shown as the mean±S.E.M. (n=5). Each symbol represented an individual mouse. FIG. 11H shows the NT50 of circRNARBD, determined using infectious SARS-CoV-2 authentic virus. Sera from circRNARBD (50 μg) immunized mice were collected at 5 weeks post the second dose. The data were shown as the mean±S.E.M. (n=4 or 5). Each symbol represented an individual mouse.



FIGS. 12A-12D provide results demonstrating SARS-CoV-2 specific T cell immune responses in mice immunized with SARS-CoV-2 circRNARBD vaccines. FIG. 12A shows FACS analysis results showing the percentages of cytokine positive cells evaluated among single and viable CD44+CD62L−CD4+ T cells. FIG. 12B shows an intracellular staining assay for cytokine (IFN-γ, TNF-α, and IL-2) production among SARS-CoV-2 specific CD4+ effector memory T cells (CD44+CD62L−) in splenocytes. Results were pooled from two independent experiments. Data were presented as the mean±S.E.M. (n=3 or 4). Each symbol represented an individual mouse.



FIG. 12C shows FACS analysis results showing the percentages of cytokine positive cells evaluated among single and viable CD44+CD62L−CD8+ T cells. FIG. 12D shows an intracellular staining assay for cytokine (IFN-γ, TNF-α, and IL-2) production among SARS-CoV-2 specific CD8+ effector memory T cells (CD44+CD62L−) in splenocytes. Results were pooled from two independent experiments. Data were presented as the mean±S.E.M. (n=3 or 4). Each symbol represented an individual mouse.



FIGS. 13A-13G provide results demonstrating the susceptibility of SARS-CoV-2 D614G, B.1.1.7 or B.1.351 variants to neutralizing antibodies elicited by the circRNARBD or circRNARBD-501Y.V2 vaccines in mice. FIG. 13A shows the schematic diagram of circRNARBD-501Y.V2 circularization by the Group I ribozyme autocatalysis. SP, signal peptide sequence of human tPA protein. T4, the trimerization domain from bacteriophage T4 fibritin protein. RBD-501Y.V2, the RBD antigen harboring the K417N-E484K-N501Y mutations in SARS-CoV-2 501Y.V2 variant. FIG. 13B shows the SARS-CoV-2 specific IgG antibody titer with ELISA. The data were shown as the mean±S.E.M. Each symbol represented an individual mouse. FIG. 13C shows the sigmoidal curve diagram of the inhibition rate by sera of immunized mice with surrogate virus neutralization assay. Sera from circRNARBD-501Y.V2 (50 μg) immunized mice were collected at 1 week or 2 weeks post boost. The data were shown as the mean±S.E.M. FIG. 13D shows the neutralization results of VSV-based D614G, B.1.1.7 or B.1.351 pseudovirus with the serum of mice immunized with circRNARBD vaccines. The serum samples were collected at 5 weeks post boost. The data were shown as the mean±S.E.M. (n=5). FIG. 13E shows the neutralization results of VSV-based D614G, B.1.1.7 or B.1.351 pseudovirus with the serum of mice immunized with circRNARBD-501Y.V2 vaccines. The serum samples were collected at 1 week post boost. The data were shown as the mean±S.E.M. (n=5). FIGS. 13F-13G show the NT50 determined using authentic SARS-CoV-2 B.1.351/501Y.V2 strain (FIG. 13F) or D614G strain (FIG. 13G). Sera from circRNARBD-501Y.V2 (50 μg) immunized mice were collected at 2 weeks post the second dose. The data were shown as the mean±S.E.M. Each symbol represented an individual mouse.



FIGS. 14A-14E provide results demonstrating protection of circRNARBD-501Y.V2 vaccines against SARS-CoV-2 (B.1.351 strain) challenge in mice. FIG. 14A shows a schematic of the dosing regimen and serum collection. At seven weeks post the second immunization with circRNARBD-501Y.V2 vaccine, the BALB/c mice were challenged with 5×10⁴ PFU of authentic SARS-CoV-2 B.1.351/501Y.V2 strain via the intranasal (i.n.) route, and the lung tissues were collected at 3 days after challenge for detecting the viral loads. FIG. 14B shows measurement of the SARS-CoV-2 specific IgG antibody titers with ELISA. The data were shown as the mean±S.E.M. (n=5). Each symbol represented an individual mouse. FIG. 14C shows a sigmoidal curve diagram of the inhibition rate by sera of immunized mice with surrogate virus neutralization assay. In FIG. 14B and FIG. 14C, the sera from circRNARBD-501Y.V2 (50 μg) immunized mice were collected at 3 days before challenge with authentic SARS-CoV-2 B.1.351/501Y.V2 strain. FIG. 14D shows the weight change of immunized mice after virus challenge. FIG. 14E shows viral loads in the lung tissues of challenged mice. The data were shown as the mean±S.E.M. (n≥5). Each symbol represented an individual mouse. The statistical test was performed by unpaired two-sided Student's t-test.





DETAILED DESCRIPTION

The present application provides circRNAs encoding a therapeutic polypeptide, such as an antigenic polypeptide, a functional protein, a receptor protein, or a targeting protein (e.g., antibody). In some embodiments, the present application provides a novel vaccine against a coronavirus such as the SARS-CoV-2 virus based on circular RNAs (circRNAs). In some embodiments, the circRNA vaccine encodes an antigenic polypeptide comprising a Spike protein or fragment thereof of the coronavirus such as SARS-CoV-2. Unlike other types of coronavirus vaccines, the circRNA vaccines described herein do not require the handling of large amounts of infectious particles during production. Furthermore, the circRNA vaccines described herein may provide enhanced stability and efficacy compared to linear RNA vaccines. For example, given their circular nature, circRNAs are particularly stable compared to many linear RNAs because they are resistant to exonucleolytic decay by the cellular exosome ribonuclease complex. In some embodiments, the circRNA in the circRNA vaccines disclosed herein can be subject to rolling circle translation by a ribosome in an individual to whom the vaccine has been administered, giving rise to high amounts of antigenic polypeptides. The production of this circRNA vaccine could be performed using various methods, such as chemical ligation, enzyme catalysis, or ribozyme autocatalysis. The circRNA vaccines described herein provide a platform for rapid development of vaccines against emerging coronavirus strains. Moreover, circular RNAs could be quickly generated in large quantities in vitro, and they do not require any nucleotide modification, which is strikingly different from canonical mRNA vaccines. Our data demonstrated that an exemplary circRNA and encapsulated circRNA-LNP complex were highly thermostable at 4° C. or room temperature for 7 to 14 days. Owing to their specific properties, circRNAs hold potential for biomedical applications.


I. DEFINITIONS

Terms are used herein as generally used in the art, unless otherwise defined as follows.


The terms “polynucleotide,” “nucleic acid,” “nucleotide sequence,” and “nucleic acid sequence” are used interchangeably. They refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof.


The term “vaccine” is understood as being directed to an immunoactive pharmaceutical preparation. In certain embodiments, the vaccine induces adaptive immunity when administered to a host. The vaccine preparation may further contain a pharmaceutical carrier, which may be designed for the particular mode by which the vaccine is intended to be administered.


The terms “Group I intron” and “Group I catalytic intron” are used interchangeably to refer to a self-splicing ribozyme that can catalyze its own excision from an RNA precursor. Group I introns comprise two fragments, the 5′ catalytic Group I intron fragment and the 3′ catalytic Group I intron fragment, which retain their folding and catalytic function (i.e., self-splicing activity). In its native environment, the 5′ catalytic Group I intron fragment is flanked at its 5′ end by a 5′ exon, which comprises a 5′ exon sequence that is recognized by the 5′ catalytic Group I intron fragment; and the 3′ catalytic Group I intron fragment is flanked at its 3′ end by a 3′ exon, which comprises a 3′ exon sequence that is recognized by the 3′ catalytic Group I intron fragment. The terms “5′ exon sequence” and “3′ exon sequence” used herein are labeled according to the order of the exons with respect to the Group I intron in its natural environment, e.g., as shown in FIG. 1A.


The term “therapeutic polypeptide” refers to a polypeptide having a therapeutic effect. A therapeutic polypeptide may be a naturally-occurring protein or an engineered functional variant thereof, including functional fragments, derivatives having one or more mutations (e.g., insertion, deletion, substitution, etc.) to the amino acid sequence of the naturally-occurring protein, and fusion proteins comprising a naturally-occurring protein or fragment thereof. A therapeutic polypeptide may also be an engineered protein that does not have a naturally-occurring counterpart. A therapeutic polypeptide may have a single polypeptide chain or multiple polypeptide chains.


The term “antigenic polypeptide” refers to a polypeptide that can be used to trigger the immune system of a mammal to develop antibodies specific to the polypeptide or a portion thereof. Antigenic polypeptides described herein include naturally-occurring proteins, protein domains, and short peptide fragments derived from a naturally-occurring protein. An antigenic polypeptide may contain one or more known epitopes of a naturally-occurring protein. The antigenic polypeptide may comprise a carrier protein or multimerization protein that improves immunogenicity.


The term “functional protein” refers to a naturally-occurring protein, functional variants thereof, or an engineered derivative thereof that is functional in treating a genetic disease or condition. The disease or condition may be caused in whole or in part by a change, such as a mutation, in the wildtype, naturally-occurring protein corresponding to the functional protein.


The term “targeting protein” refers to a polypeptide that specifically binds to a target molecule. Targeting proteins described herein include both antibody-based and non-antibody based binding proteins or target-binding portions thereof.


The term “antibody” is used in its broadest sense and encompasses various antibody structures, including but not limited to monoclonal antibodies, polyclonal antibodies, multispecific antibodies (e.g., bispecific antibodies), full-length antibodies and antigen-binding fragments thereof, so long as they exhibit the desired antigen-binding activity. The term “antigen-binding fragment” as used herein refers to an antibody fragment including, for example, a diabody, a Fab, a Fab′, a F(ab′)2, an Fv fragment, a disulfide stabilized Fv fragment (dsFv), a (dsFv)2, a bispecific dsFv (dsFv-dsFv′), a disulfide stabilized diabody (ds diabody), a single-chain Fv (scFv), an scFv dimer (bivalent diabody), a multispecific antibody formed from a portion of an antibody comprising one or more CDRs, a camelized single domain antibody, a nanobody, a domain antibody, a bivalent domain antibody, or any other antibody fragment that binds to an antigen but does not comprise a complete antibody structure.


As used herein, the terms “specifically binds,” “specifically recognizing,” and “is specific for” refer to measurable and reproducible interactions, such as binding between a target and a targeting moiety. For example, a targeting moiety that specifically recognizes a target (which can be an epitope) is a targeting moiety (e.g., antibody) that binds this target with greater affinity, avidity, more readily, and/or with greater duration than it binds to other molecules. In some embodiments, the extent of binding of a targeting moiety to an unrelated molecule is less than about 10% of the binding of the targeting moiety to the target as measured, e.g., by a radioimmunoassay (RIA). In some embodiments, a targeting moiety that specifically binds a target has a dissociation constant (KD) of ≤10⁻⁵ M, ≤10⁻⁶ M, ≤10⁻⁷ M, ≤10⁻⁸ M, ≤10⁻⁹ M, ≤10⁻¹⁰ M, ≤10⁻¹¹ M, or ≤10⁻¹² M. In some embodiments, specific binding can include, but does not require, exclusive binding. Binding specificity of the targeting moiety can be determined experimentally by methods known in the art. Such methods comprise, but are not limited to, Western blots, ELISA, RIA, ECL, IRMA, EIA, BIACORE™ and peptide scans.


The term “functional variant” of a reference protein refers to a variant polypeptide derived from the reference protein or a portion thereof, and the variant has substantially the same activity (e.g., binding to a target or enzymatic activity) as the reference protein. “Substantially the same activity” means an activity level that is at least about any one of 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more as the activity of the reference protein.
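By way of a non-limiting illustration, the activity comparison in the preceding definition can be sketched as a simple ratio test. The function name, the parameter names, and the default cutoff of 20% are hypothetical choices made only for this sketch; the qualifier “about” is not modeled, and any of the other recited percentages could be supplied instead.

```python
# Illustrative sketch only: the names and the default cutoff are assumptions,
# not part of the disclosure. Activities must be measured in the same assay
# and expressed in the same units.
def has_substantially_same_activity(variant_activity: float,
                                    reference_activity: float,
                                    min_fraction: float = 0.20) -> bool:
    """Return True if the variant retains at least min_fraction of the
    reference protein's activity."""
    if reference_activity <= 0:
        raise ValueError("reference activity must be positive")
    return variant_activity / reference_activity >= min_fraction


if __name__ == "__main__":
    # Hypothetical activity values chosen for illustration.
    print(has_substantially_same_activity(25.0, 100.0))
    print(has_substantially_same_activity(85.0, 100.0, min_fraction=0.90))
```

Under this sketch, a variant with one quarter of the reference activity qualifies at the 20% level but would not qualify if a 90% level were selected.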


The term “introducing” or “introduction” used herein means delivering one or more polynucleotides, such as circRNAs or one or more constructs including vectors as described herein, or one or more transcripts thereof, to a host cell. The methods of the present application can employ many delivery systems, including but not limited to, viral, liposome, electroporation, microinjection and conjugation, to achieve the introduction of the circRNA or construct as described herein into a host cell. Conventional viral and non-viral based gene transfer methods can be used to introduce nucleic acids into mammalian cells or target tissues. Such methods can be used to administer nucleic acids encoding the circRNA of the present application to cells in culture, or in a host organism. Non-viral vector delivery systems include DNA plasmids, RNA (e.g., a transcript of a construct described herein), naked nucleic acid, and nucleic acid complexed with a delivery vehicle, such as a liposome. Viral vector delivery systems include DNA and RNA viruses, which have either episomal or integrated genomes for delivery to the host cell.


As used herein, “operably linked,” when referring to a first nucleic acid sequence that is operably linked with a second nucleic acid sequence, means a situation when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter effects the transcription of the coding sequence. Likewise, the coding sequence of a signal peptide is operably linked to the coding sequence of a polypeptide if the signal peptide effects the extracellular secretion of that polypeptide. Generally, operably linked nucleic acid sequences are contiguous and, where necessary to join two protein coding regions, the open reading frames are aligned.


As used herein, “complementarity” refers to the ability of a nucleic acid to form hydrogen bond(s) with another nucleic acid by traditional Watson-Crick base-pairing. A percent complementarity indicates the percentage of residues in a nucleic acid molecule which can form hydrogen bonds (i.e., Watson-Crick base pairing) with a second nucleic acid (e.g., about 5, 6, 7, 8, 9, 10 out of 10, being about 50%, 60%, 70%, 80%, 90%, and 100% complementary respectively). “Perfectly complementary” means that all the contiguous residues of a nucleic acid sequence form hydrogen bonds with the same number of contiguous residues in a second nucleic acid sequence. “Substantially complementary” as used herein refers to a degree of complementarity that is at least about any one of 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of about 40, 50, 60, 70, 80, 100, 150, 200, 250 or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
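By way of a non-limiting illustration, percent complementarity as defined above can be computed for two strands of equal length by pairing them in antiparallel orientation and counting Watson-Crick pairs. The function and variable names and the example sequences are hypothetical; the sketch assumes both strands are written 5′ to 3′, counts only A-U (or A-T) and G-C pairs, and does not treat G-U wobble pairs as complementary.

```python
# Illustrative sketch only: names and example sequences are assumptions.
# Only traditional Watson-Crick pairs are counted (no G-U wobble).
WATSON_CRICK_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("A", "T"), ("T", "A"),
    ("G", "C"), ("C", "G"),
}


def percent_complementarity(strand_a: str, strand_b: str) -> float:
    """Percentage of residues in strand_a that can pair with strand_b.

    Both strands are given 5'->3' and must have the same length; strand_b
    is reversed so the two strands are compared in antiparallel orientation.
    """
    a = strand_a.upper()
    b = strand_b.upper()[::-1]
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum((x, y) in WATSON_CRICK_PAIRS for x, y in zip(a, b))
    return 100.0 * paired / len(a)


if __name__ == "__main__":
    # 10 of 10 residues pair: perfectly complementary.
    print(percent_complementarity("GGGAAACCCU", "AGGGUUUCCC"))  # 100.0
    # 8 of 10 residues pair after two terminal mismatches are introduced.
    print(percent_complementarity("GGGAAACCCU", "CGGGUUUCCA"))  # 80.0
```

The second call corresponds to the “8 out of 10, being about 80%” case recited in the definition.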


As used herein, “treatment” or “treating” is an approach for obtaining beneficial or desired results including clinical results. For purposes of this application, beneficial or desired clinical results include, but are not limited to, one or more of the following: decreasing one or more symptoms resulting from the disease, diminishing the extent of the disease, stabilizing the disease (e.g., preventing or delaying the worsening of the disease), preventing or delaying the spread of the disease, preventing or delaying the occurrence or recurrence of the disease, delaying or slowing the progression of the disease, ameliorating the disease state, providing a remission (whether partial or total) of the disease, decreasing the dose of one or more other medications required to treat the disease, increasing the quality of life, and/or prolonging survival. Also encompassed by “treatment” is a reduction of pathological consequence of the disease. The methods of the present application contemplate any one or more of these aspects of treatment.


The terms “individual,” “subject” and “patient” are used interchangeably herein to describe a mammal, including humans. In some embodiments, the individual is human. In some embodiments, the individual is a rodent, such as a mouse. In some embodiments, the individual suffers from a genetic disease or condition. In some embodiments, the individual suffers from a coronavirus infection. In some embodiments, the individual is at risk of contracting a coronavirus infection. In some embodiments, the individual is in need of treatment.


As is understood in the art, an “effective amount” refers to an amount of a composition sufficient to produce a desired therapeutic outcome (e.g., stimulating the production of antibodies and improving immunity against one or more coronaviruses, reducing the severity or duration of, stabilizing the severity of, or eliminating one or more symptoms of a coronavirus infection). For therapeutic use, beneficial or desired results include, e.g., decreasing one or more symptoms resulting from the disease (biochemical, histologic and/or behavioral), including its complications and intermediate pathological phenotypes presented during development of the disease, increasing the quality of life of those suffering from the disease, decreasing the dose of other medications required to treat the disease, enhancing effect of another medication, delaying the progression of the disease, and/or prolonging survival of patients. In some embodiments, an effective amount of the therapeutic agent may extend survival (including overall survival and progression free survival); result in an objective response (including a complete response or a partial response); relieve to some extent one or more signs or symptoms of the disease or condition; and/or improve the quality of life of the subject. In some embodiments, an effective amount is a prophylactically effective amount, which is an amount of a composition sufficient to prevent or reduce the severity of one or more future symptoms of a coronavirus infection when administered to an individual who is susceptible and/or who may develop the coronavirus infection. 
For prophylactic use, beneficial or desired results include, e.g., results such as eliminating or reducing the risk, lessening the severity of future disease, or delaying the onset of the disease (e.g., delaying biochemical, histologic and/or behavioral symptoms of the disease, its complications, and intermediate pathological phenotypes presenting during future development of the disease).


As used herein, the term “wild type” is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene or characteristic as it occurs in nature as distinguished from mutant or variant forms.


The present disclosure provides several types of compositions that are polynucleotide or polypeptide based, including variants and derivatives. These include, for example, substitutional, insertional, deletion and covalent variants and derivatives. The term “derivative” is synonymous with the term “variant” and generally refers to a molecule that has been modified and/or changed in any way relative to a reference molecule or a starting molecule.


As such, polynucleotides encoding peptides or polypeptides containing substitutions, insertions and/or additions, deletions and covalent modifications with respect to reference sequences, in particular, the polypeptide sequences disclosed herein, are included within the scope of this disclosure. For example, sequence tags or amino acids, such as one or more lysines, can be added to peptide sequences (e.g., at the N-terminal or C-terminal ends). Sequence tags can be used for peptide detection, purification or localization. Lysines can be used to increase peptide solubility or to allow for biotinylation. Alternatively, amino acid residues located at the carboxy and amino terminal regions of the amino acid sequence of a peptide or protein may optionally be deleted providing for truncated sequences. Certain amino acids (e.g., C-terminal residues or N-terminal residues) alternatively may be deleted depending on the use of the sequence, as for example, expression of the sequence as part of a larger sequence that is soluble, or linked to a solid support.


The term “identity” refers to the overall relatedness between polymeric molecules, for example, between polynucleotide molecules (e.g. DNA molecules and/or RNA molecules) and/or between polypeptide molecules. Calculation of the percent identity of two polynucleic acid sequences, for example, can be performed by aligning the two sequences for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second nucleic acid sequences for optimal alignment and non-identical sequences can be disregarded for comparison purposes). In certain embodiments, the length of a sequence aligned for comparison purposes is at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of the length of the reference sequence. The nucleotides at corresponding nucleotide positions are then compared. When a position in the first sequence is occupied by the same nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which needs to be introduced for optimal alignment of the two sequences. The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. For example, the percent identity between two nucleic acid sequences can be determined using methods such as those described in Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Sequence Analysis in Molecular Biology, von Heinje, G., Academic Press, 1987; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. 
G., eds., Humana Press, New Jersey, 1994; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; each of which is incorporated herein by reference. For example, the percent identity between two nucleic acid sequences can be determined using the algorithm of Meyers and Miller (CABIOS, 1989, 4:11-17), which has been incorporated into the ALIGN program (version 2.0) using a PAM 120 weight residue table, a gap length penalty of 12 and a gap penalty of 4. The percent identity between two nucleic acid sequences can, alternatively, be determined using the GAP program in the GCG software package using an NWSgapdna.CMP matrix. Methods commonly employed to determine percent identity between sequences include, but are not limited to, those disclosed in Carillo, H., and Lipman, D., SIAM J Applied Math., 48:1073 (1988); incorporated herein by reference. Techniques for determining identity are codified in publicly available computer programs. Exemplary computer software to determine homology between two sequences includes, but is not limited to, the GCG program package (Devereux, J., et al., Nucleic Acids Research, 12(1), 387 (1984)), BLASTP, BLASTN, and FASTA (Altschul, S. F. et al., J. Molec. Biol., 215, 403 (1990)).
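By way of a non-limiting illustration, the position-by-position comparison described above can be sketched for two sequences that have already been aligned. The sketch does not perform the alignment itself, which the programs named above carry out; the function name, the gap symbol “-”, and the example sequences are hypothetical, and the choice to divide by the number of alignment columns (so that each gap position lowers the percentage) is only one of several conventions in use.

```python
# Illustrative sketch only: names, gap symbol, and example sequences are
# assumptions. Input sequences must already be aligned to equal length.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pairwise alignment.

    Columns in which both sequences carry a gap are ignored; columns with a
    gap in only one sequence count as non-identical positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # uninformative column
        columns += 1
        if x == y:
            matches += 1  # same nucleotide at corresponding positions
    if columns == 0:
        raise ValueError("alignment contains no comparable positions")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Eight columns, six identical positions, one mismatch, one gap.
    print(percent_identity("AUGGC-UA", "AUGACCUA"))  # 75.0
```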


“Percent (%) amino acid sequence identity” with respect to the polypeptide sequences identified herein is defined as the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in the polypeptide being compared, after aligning the sequences considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, Megalign (DNASTAR), or MUSCLE software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared. For purposes herein, however, % amino acid sequence identity values are generated using the sequence comparison computer program MUSCLE (Edgar, R. C., Nucleic Acids Research 32(5):1792-1797, 2004; Edgar, R. C., BMC Bioinformatics 5(1):113, 2004, each of which is incorporated herein by reference in its entirety for all purposes).


The terms “non-naturally occurring” or “engineered” are used interchangeably and indicate the involvement of the hand of man. The terms, when referring to nucleic acid molecules or polypeptides mean that the nucleic acid molecule or the polypeptide is at least substantially free from at least one other component with which they are naturally associated in nature and as found in nature.


As used herein, “expression” refers to the process by which a polynucleotide is transcribed from a DNA template (such as into an mRNA or other RNA transcript) and/or the process by which a transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins. Transcripts and encoded polypeptides may be collectively referred to as “gene product.” If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell.


The terms “polypeptide” or “peptide” are used herein to encompass all kinds of naturally occurring and synthetic proteins, including protein fragments of all lengths, fusion proteins and modified proteins, including without limitation, glycoproteins, as well as all other types of modified proteins (e.g., proteins resulting from phosphorylation, acetylation, myristoylation, palmitoylation, glycosylation, oxidation, formylation, amidation, polyglutamylation, ADP-ribosylation, pegylation, biotinylation, etc.).


The term “simultaneous administration,” as used herein, means that a first therapy and second therapy in a combination therapy are administered with a time separation of no more than about 15 minutes, such as no more than about any of 10, 5, or 1 minutes. When the first and second therapies are administered simultaneously, the first and second therapies may be contained in the same composition (e.g., a composition comprising both a first and second therapy) or in separate compositions (e.g., a first therapy contained in one composition and a second therapy contained in another composition).


As used herein, the term “sequential administration” means that the first therapy and second therapy in a combination therapy are administered with a time separation of more than about 15 minutes, such as more than about any of 20, 30, 40, 50, 60, or more minutes. Either the first therapy or the second therapy may be administered first. The first and second therapies are contained in separate compositions, which may be contained in the same or different packages or kits.
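By way of a non-limiting illustration, the 15-minute boundary that distinguishes simultaneous from sequential administration in the two preceding definitions can be sketched as follows. The function name, the constant name, and the example times are hypothetical; the qualifier “about” is not modeled, and the sketch treats each administration as a single time point rather than an interval.

```python
# Illustrative sketch only: names and example times are assumptions.
from datetime import datetime, timedelta

SIMULTANEOUS_WINDOW = timedelta(minutes=15)


def classify_administration(first_time: datetime, second_time: datetime) -> str:
    """Classify two administrations by their time separation.

    The order of the arguments does not matter, reflecting that either
    therapy may be administered first.
    """
    separation = abs(second_time - first_time)
    if separation <= SIMULTANEOUS_WINDOW:
        return "simultaneous"
    return "sequential"


if __name__ == "__main__":
    start = datetime(2021, 1, 1, 9, 0)
    print(classify_administration(start, start + timedelta(minutes=10)))
    print(classify_administration(start, start + timedelta(minutes=45)))
```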


As used herein, the term “concurrent administration” means that the administration of the first therapy and that of a second therapy in a combination therapy overlap with each other.


The term “pharmaceutical composition” refers to a preparation which is in such form as to permit the biological activity of an active ingredient contained therein to be effective, and which contains no additional components which are unacceptably toxic to a subject to which the formulation would be administered.


A “pharmaceutically acceptable carrier” refers to one or more ingredients in a pharmaceutical formulation, other than an active ingredient, which is nontoxic to a subject. A pharmaceutically acceptable carrier includes, but is not limited to, a buffer, excipient, stabilizer, cryoprotectant, tonicity agent, preservative, and combinations thereof. Pharmaceutically acceptable carriers or excipients have preferably met the required standards of toxicological and manufacturing testing and/or are included on the Inactive Ingredient Guide prepared by the U.S. Food and Drug Administration or other state/federal government or listed in the U.S. Pharmacopeia or other generally recognized pharmacopeia for use in mammals, and more particularly in humans.


The term “package insert” is used to refer to instructions customarily included in commercial packages of therapeutic products, that contain information about the indications, usage, dosage, administration, combination therapy, contraindications and/or warnings concerning the use of such therapeutic products.


An “article of manufacture” is any manufacture (e.g., a package or container) or kit comprising at least one reagent, e.g., a medicament for treatment of a disease or condition (e.g., coronavirus infection), or a probe for specifically detecting a biomarker described herein. In certain embodiments, the manufacture or kit is promoted, distributed, or sold as a unit for performing the methods described herein.


It is understood that embodiments of the invention described herein include “consisting” and/or “consisting essentially of” embodiments.


Reference to “about” a value or parameter herein includes (and describes) variations that are directed to that value or parameter per se. For example, description referring to “about X” includes description of “X”.


As used herein, reference to “not” a value or parameter generally means and describes “other than” a value or parameter. For example, “the method is not used to treat disease of type X” means that the method is used to treat disease of types other than X.


The term “about X-Y” used herein has the same meaning as “about X to about Y.”


As used herein and in the appended claims, the singular forms “a,” “an,” or “the” include plural referents unless the context clearly dictates otherwise.


The term “and/or” as used herein in a phrase such as “A and/or B” is intended to include both A and B; A or B; A (alone); and B (alone). Likewise, the term “and/or” as used herein in a phrase such as “A, B, and/or C” is intended to encompass each of the following embodiments: A, B, and C; A, B, or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone).


II. THERAPEUTIC CIRCULAR RNA

The present application provides circular RNAs (circRNAs) encoding polypeptides, such as therapeutic polypeptides, including any one of the therapeutic polypeptides described in Section A, “Therapeutic Polypeptides” below.


In some embodiments, there is provided a circRNA comprising a nucleic acid sequence encoding a therapeutic polypeptide, wherein the therapeutic polypeptide is selected from the group consisting of an antigenic polypeptide, a functional protein, a receptor protein, and a targeting protein.


In some embodiments, the circRNA is stable for at least 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 days when stored at 4° C. or at room temperature. In some embodiments, the circRNA is stable for at least 7 days when stored at 4° C. or at room temperature. In some embodiments, the circRNA is stable for at least 14 days when stored at 4° C. or at room temperature. In some embodiments, the circRNA is stable for at least 30 days when stored at 4° C. In some embodiments, the circRNA is less than 40% degraded after storage at room temperature for 14 days.


In some embodiments, the present application provides a circRNA comprising: (a) a nucleic acid sequence encoding a therapeutic polypeptide, wherein the therapeutic polypeptide is selected from the group consisting of an antigenic polypeptide, a functional protein, a receptor protein, and a targeting protein, and (b) an internal ribosomal entry site (IRES) sequence, wherein the IRES sequence is operably linked to the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the therapeutic polypeptide. In some embodiments, the circRNA further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the circRNA comprises a nucleic acid sequence comprising from the 5′ end to the 3′ end: the IRES sequence, the Kozak sequence, and the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of an IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the therapeutic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the circRNA further comprises a 5′ ligation sequence at the 5′ end of the circRNA, and a 3′ ligation sequence at the 3′ end of the circRNA, wherein the 5′ ligation sequence and the 3′ ligation sequence are ligated to each other via a ligase (e.g., T4 RNA ligase).


In some embodiments, the present application provides a circRNA comprising: (a) a nucleic acid sequence encoding a therapeutic polypeptide, wherein the therapeutic polypeptide is selected from the group consisting of an antigenic polypeptide, a functional protein, a receptor protein, and a targeting protein; (b) an IRES sequence, wherein the IRES sequence is operably linked to the nucleic acid sequence encoding the therapeutic polypeptide; and (c) an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the therapeutic polypeptide. In some embodiments, the circRNA further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the circRNA comprises a nucleic acid sequence comprising from the 5′ end to the 3′ end: the IRES sequence, the Kozak sequence, the nucleic acid sequence encoding the therapeutic polypeptide, and the in-frame 2A peptide coding sequence. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of an IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the therapeutic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the circRNA further comprises a 5′ ligation sequence at the 5′ end of the circRNA, and a 3′ ligation sequence at the 3′ end of the circRNA, wherein the 5′ ligation sequence and the 3′ ligation sequence are ligated to each other via a ligase (e.g., T4 RNA ligase).
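The 5′-to-3′ arrangements recited above can be summarized in a short sketch. The function name, its flags, and the element labels are hypothetical and chosen only for illustration; the exon and ligation sequences are left out because their position relative to the other elements is not fixed by the passage above.

```python
from typing import List


def element_order(polya_spacer: bool = False,
                  signal_peptide: bool = False,
                  two_a_peptide: bool = False) -> List[str]:
    """Return element labels in 5'-to-3' order for one recited arrangement."""
    order: List[str] = []
    if polya_spacer:
        # Optional polyA or polyAC sequence disposed at the 5' end of the IRES.
        order.append("polyA/polyAC")
    order += ["IRES", "Kozak"]
    if signal_peptide:
        # The SP is fused to the N-terminus, so its coding region comes first.
        order.append("SP coding sequence")
    order.append("therapeutic polypeptide coding sequence")
    if two_a_peptide:
        # In-frame 2A peptide coding sequence linked at the 3' end.
        order.append("2A peptide coding sequence")
    return order


print(" - ".join(element_order()))
print(" - ".join(element_order(polya_spacer=True, two_a_peptide=True)))
```

The labels are placeholders rather than sequences; an actual construct would substitute the corresponding nucleic acid sequences for each label.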


In some embodiments, the present application provides a circRNA comprising: (a) a nucleic acid sequence encoding a therapeutic polypeptide, wherein the therapeutic polypeptide is selected from the group consisting of an antigenic polypeptide, a functional protein, a receptor protein, and a targeting protein; and (b) an m6A modification motif sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the therapeutic polypeptide. In some embodiments, the circRNA further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the circRNA comprises a nucleic acid sequence comprising from the 5′ end to the 3′ end: the m6A modification motif sequence, the Kozak sequence, and the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the therapeutic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the circRNA further comprises a 5′ ligation sequence at the 5′ end of the circRNA, and a 3′ ligation sequence at the 3′ end of the circRNA, wherein the 5′ ligation sequence and the 3′ ligation sequence are ligated to each other via a ligase (e.g., T4 RNA ligase).


In some embodiments, the present application provides a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide. In some embodiments, the antigenic polypeptide is a protein or fragment thereof of an infectious agent. In some embodiments, the infectious agent is a virus. In some embodiments, the virus is a coronavirus. In some embodiments, the coronavirus is selected from the group consisting of SARS-CoV, MERS-CoV, and SARS-CoV-2. In some embodiments, the coronavirus is SARS-CoV-2. The circRNA may comprise any one of the circRNA expression and/or circularization elements described in Section B, “Additional circRNA expression and circularization elements” below.


In some embodiments, the present application provides a circRNA comprising a nucleic acid sequence encoding a receptor protein. In some embodiments, the receptor protein is a soluble receptor comprising an extracellular domain of a naturally occurring receptor. In some embodiments, the receptor protein is a receptor of an infectious agent (e.g., a virus such as a coronavirus). In some embodiments, the receptor is an ACE2 receptor, such as a soluble ACE2 receptor. In some embodiments, the receptor is a high-affinity mutant ACE2 receptor. The circRNA may comprise any one of the circRNA expression and/or circularization elements described in Section B, “Additional circRNA expression and circularization elements” below.


In some embodiments, the present application provides a circRNA comprising a nucleic acid sequence encoding a targeting protein. In some embodiments, the targeting protein is an antibody. In some embodiments, the antibody is a neutralizing antibody, e.g., a neutralizing antibody targeting a coronavirus such as SARS-CoV-2. In some embodiments, the targeting protein is a therapeutic antibody. The circRNA may comprise any one of the circRNA expression and/or circularization elements described in Section B, “Additional circRNA expression and circularization elements” below.


In some embodiments, the present application provides circular RNA vaccines for treatment or prevention of a coronavirus infection.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., SARS-CoV, MERS-CoV, or SARS-CoV-2).


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising an S protein or a fragment thereof of SARS-CoV-2.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising: (a) an S protein or a fragment thereof of a coronavirus (e.g., SARS-CoV, MERS-CoV, or SARS-CoV-2); and (b) a multimerization domain. In some embodiments, the multimerization domain is a C-terminal Foldon (Fd) domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein. In some embodiments, the multimerization domain is a GCN-4 based isoleucine zipper domain. In some embodiments, the multimerization domain comprises an amino acid sequence set forth in SEQ ID NOs: 3-4. In some embodiments, the multimerization domain is fused to the RBD domain of the S protein via a peptide linker, e.g., a peptide linker comprising the amino acid sequence of SEQ ID NO: 5.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) of an S protein of a coronavirus (e.g., SARS-CoV-2). In some embodiments, the RBD comprises amino acid residues 319 to 542 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the RBD comprises the amino acid sequence of SEQ ID NO: 2. In some embodiments, the RBD comprises the amino acid sequence of SEQ ID NO: 63.
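Residue ranges such as “319 to 542” use 1-based, inclusive numbering against a reference sequence, which differs from the 0-based, end-exclusive slicing used by most programming languages. The following minimal sketch shows the conversion; the helper name is hypothetical, and the reference string is a synthetic stand-in of the same length as a 1273-residue S protein, not SEQ ID NO: 1.

```python
def subsequence(protein: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not (1 <= start <= end <= len(protein)):
        raise ValueError("residue range lies outside the sequence")
    # 1-based inclusive [start, end] maps to the 0-based slice [start-1, end).
    return protein[start - 1:end]


# Synthetic 1273-residue placeholder; NOT the sequence of SEQ ID NO: 1.
reference = "".join("ACDEFGHIKLMNPQRSTVWY"[i % 20] for i in range(1273))

rbd = subsequence(reference, 319, 542)
print(len(rbd))  # 224 residues, i.e. 542 - 319 + 1
```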


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising: (a) a RBD of an S protein fragment of a coronavirus (e.g., SARS-CoV, MERS-CoV, or SARS-CoV-2) and (b) a multimerization domain. In some embodiments, the RBD comprises amino acid residues 319 to 542 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the RBD comprises the amino acid sequence of SEQ ID NO: 2. In some embodiments, the RBD comprises the amino acid sequence of SEQ ID NO: 63. In some embodiments, the multimerization domain is a C-terminal Foldon (Fd) domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein. In some embodiments, the multimerization domain is a GCN-4 based isoleucine zipper domain. In some embodiments, the multimerization domain comprises an amino acid sequence set forth in SEQ ID NOs: 3-4. In some embodiments, the multimerization domain is fused to the RBD domain of the S protein via a peptide linker, e.g., a peptide linker comprising the amino acid sequence of SEQ ID NO: 5.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising an S2 region of an S protein of a coronavirus (e.g., SARS-CoV-2). In some embodiments, the S2 region comprises amino acid residues 686 to 1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region comprises one or more mutations (e.g., K986P and V987P) that stabilize a pre-fusion conformation of the S protein. In some embodiments, the S2 region comprises the amino acid sequence of SEQ ID NO: 6 or 7.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region of the S protein comprises one or more mutations (e.g., K986P and V987P) that stabilize a pre-fusion conformation of the S protein. In some embodiments, the antigenic polypeptide comprises one or more mutations (e.g., deletion of amino acid residues 681-684) that inhibit cleavage of the S protein. In some embodiments, the antigenic polypeptide comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 8-10 and 62-63.
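Mutation notation such as K986P (lysine at position 986 replaced by proline) and a deletion of residues 681-684 can be applied programmatically, as in the following minimal sketch. The helper names are hypothetical, and the reference is a synthetic 1273-residue placeholder carrying K and V at positions 986 and 987, not SEQ ID NO: 1. Substitutions are applied before the deletion so that the reference numbering remains valid while they are made.

```python
import re


def substitute(protein: str, mutation: str) -> str:
    """Apply a point substitution such as 'K986P' (1-based numbering)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} lies outside the sequence")
    # Guard against numbering errors by checking the expected residue.
    if protein[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}")
    return protein[:position - 1] + new + protein[position:]


def delete_range(protein: str, start: int, end: int) -> str:
    """Delete residues start..end (1-based, inclusive)."""
    if not (1 <= start <= end <= len(protein)):
        raise ValueError("residue range lies outside the sequence")
    return protein[:start - 1] + protein[end:]


# Synthetic placeholder reference; NOT the sequence of SEQ ID NO: 1.
residues = ["A"] * 1273
residues[985] = "K"  # position 986
residues[986] = "V"  # position 987
reference = "".join(residues)

variant = substitute(reference, "K986P")
variant = substitute(variant, "V987P")
variant = delete_range(variant, 681, 684)  # applied last; shifts later residues by 4

print(len(variant))       # 1269
print(variant[981:983])   # PP (original positions 986-987, shifted by the deletion)
```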


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising: (a) a nucleic acid sequence encoding an antigenic polypeptide comprising an S protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2), and (b) an internal ribosomal entry site (IRES) sequence, wherein the IRES sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof. In some embodiments, the circRNA further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA comprises a nucleic acid sequence comprising from the 5′ end to the 3′ end: the IRES sequence, the Kozak sequence, the SP, and the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of an IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 5′ ligation sequence at the 5′ end of the circRNA, and a 3′ ligation sequence at the 3′ end of the circRNA, wherein the 5′ ligation sequence and the 3′ ligation sequence are ligated to each other via a ligase (e.g., T4 RNA ligase). In some embodiments, the antigenic polypeptide comprises a RBD of the S protein. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal Fd domain, or a GCN-4 based isoleucine zipper domain).
In some embodiments, the antigenic polypeptide comprises an S2 region of the S protein. In some embodiments, the antigenic polypeptide comprises amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region of the S protein comprises one or more mutations (e.g., K986P and V987P) that stabilize a pre-fusion conformation of the S protein. In some embodiments, the antigenic polypeptide comprises one or more mutations (e.g., deletion of amino acid residues 681-684) that inhibit cleavage of the S protein. In some embodiments, the circRNA comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 11-15.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising: (a) a nucleic acid sequence encoding an antigenic polypeptide comprising an S protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2); (b) an IRES sequence, wherein the IRES sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide; and (c) an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof. In some embodiments, the circRNA further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA comprises a nucleic acid sequence comprising from the 5′ end to the 3′ end: the IRES sequence, the Kozak sequence, the SP, the nucleic acid sequence encoding the antigenic polypeptide, and the in-frame 2A peptide coding sequence. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of an IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 5′ ligation sequence at the 5′ end of the circRNA, and a 3′ ligation sequence at the 3′ end of the circRNA, wherein the 5′ ligation sequence and the 3′ ligation sequence are ligated to each other via a ligase (e.g., T4 RNA ligase). In some embodiments, the antigenic polypeptide comprises a RBD of the S protein. 
In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal Fd domain, or a GCN-4 based isoleucine zipper domain). In some embodiments, the antigenic polypeptide comprises an S2 region of the S protein. In some embodiments, the antigenic polypeptide comprises amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region of the S protein comprises one or more mutations (e.g., K986P and V987P) that stabilize a pre-fusion conformation of the S protein. In some embodiments, the antigenic polypeptide comprises one or more mutations (e.g., deletion of amino acid residues 681-684) that inhibit cleavage of the S protein. In some embodiments, the circRNA comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 11-15.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising: (a) a nucleic acid sequence encoding an antigenic polypeptide comprising an S protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2), and (b) an m6A modification motif sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof. In some embodiments, the circRNA further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA comprises a nucleic acid sequence comprising from the 5′ end to the 3′ end: the m6A modification motif sequence, the Kozak sequence, the SP, and the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 5′ ligation sequence at the 5′ end of the circRNA, and a 3′ ligation sequence at the 3′ end of the circRNA, wherein the 5′ ligation sequence and the 3′ ligation sequence are ligated to each other via a ligase (e.g., T4 RNA ligase). In some embodiments, the antigenic polypeptide comprises a RBD of the S protein. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal Fd domain, or a GCN-4 based isoleucine zipper domain). In some embodiments, the antigenic polypeptide comprises an S2 region of the S protein. 
In some embodiments, the antigenic polypeptide comprises amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region of the S protein comprises one or more mutations (e.g., K986P and V987P) that stabilize a pre-fusion conformation of the S protein. In some embodiments, the antigenic polypeptide comprises one or more mutations (e.g., deletion of amino acid residues 681-684) that inhibit cleavage of the S protein. In some embodiments, the circRNA comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 11-15.


The present application further provides a cocktail composition comprising a plurality of circRNAs each comprising a nucleic acid sequence encoding an antigenic polypeptide, a receptor protein of an infectious agent, or a targeting protein (e.g., an antibody such as a neutralizing antibody). In some embodiments, the plurality of circRNAs encode antigenic polypeptides that are different with respect to each other, such as different mutants of an antigenic polypeptide (e.g., S protein or fragment thereof). In some embodiments, the plurality of circRNAs encode receptor proteins that are different with respect to each other, such as different mutants of a receptor protein (e.g., ACE2). In some embodiments, the plurality of circRNAs encode targeting proteins that are different with respect to each other, such as different antibodies (e.g., neutralizing antibodies).


A. Therapeutic Polypeptides

In some aspects, provided herein is a circRNA encoding a therapeutic polypeptide. In some embodiments, the therapeutic polypeptide is an antigenic polypeptide, a functional protein, a receptor protein, or a targeting protein (e.g., an antibody).


In some embodiments, the nucleic acid sequence may be codon-optimized. A codon optimized sequence may be one in which codons in a polynucleotide encoding a polypeptide have been substituted in order to increase the expression, stability and/or activity of the polypeptide. Factors that influence codon optimization include, but are not limited to one or more of: (i) variation of codon biases between two or more organisms or genes or synthetically constructed bias tables, (ii) variation in the degree of codon bias within an organism, gene, or set of genes, (iii) systematic variation of codons including context, (iv) variation of codons according to their decoding tRNAs, (v) variation of codons according to GC %, either overall or in one position of the triplet, (vi) variation in degree of similarity to a reference sequence for example a naturally occurring sequence, (vii) variation in the codon frequency cutoff, (viii) structural properties of mRNAs transcribed from the DNA sequence, (ix) prior knowledge about the function of the DNA sequences upon which design of the codon substitution set is to be based, and/or (x) systematic variation of codon sets for each amino acid. In some embodiments, a codon optimized polynucleotide may minimize ribozyme collisions and/or limit structural interference between the expression sequence and the IRES.
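A codon substitution that preserves the encoded polypeptide can be sketched as follows. This toy example addresses only the codon-bias factor from the list above: each codon is replaced by a single “preferred” synonymous codon. The preferred-codon table is hypothetical and covers only a few amino acids; the codon-to-amino-acid assignments follow the standard genetic code. A practical design would use organism-specific usage tables and weigh the other listed factors as well.

```python
# Hypothetical preferred codons, for illustration only.
PREFERRED_CODON = {
    "M": "AUG", "K": "AAG", "P": "CCC", "V": "GUG", "G": "GGC", "*": "UGA",
}

# Subset of the standard genetic code ('*' denotes a stop codon).
CODON_TO_AA = {
    "AUG": "M",
    "AAA": "K", "AAG": "K",
    "CCU": "P", "CCC": "P", "CCA": "P", "CCG": "P",
    "GUU": "V", "GUC": "V", "GUA": "V", "GUG": "V",
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def codons(rna: str) -> list:
    """Split an RNA coding sequence into codons."""
    if len(rna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]


def translate(rna: str) -> str:
    """Translate an RNA coding sequence using the subset table above."""
    return "".join(CODON_TO_AA[c] for c in codons(rna))


def optimize(rna: str) -> str:
    """Replace each codon with the preferred synonymous codon."""
    return "".join(PREFERRED_CODON[CODON_TO_AA[c]] for c in codons(rna))


original = "AUGAAACCUGUUGGUUAA"
optimized = optimize(original)

# The encoded polypeptide must be unchanged by the substitution.
assert translate(original) == translate(optimized)
print(optimized)  # AUGAAGCCCGUGGGCUGA
```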


i. Antigenic Polypeptides


The circRNA vaccines described herein comprise circular RNAs (circRNA) encoding an antigenic polypeptide. In some embodiments, the antigenic polypeptide comprises a Spike (S) protein or a fragment thereof of a coronavirus, such as any one of the S proteins or fragments thereof as described in the “Spike protein or fragment thereof” subsection below. In some embodiments, the antigenic polypeptide comprises a multimerization domain, such as a native multimerization domain of the S protein, or an exogenous multimerization domain. Suitable multimerization domains are described in the “Multimerization domain” subsection below. The S protein or fragment thereof may be fused to the multimerization domain via a peptide linker, such as any one of the peptide linkers described in the “peptide linker” subsection below.


An antigenic polypeptide comprises at least one epitope recognizable by a T cell receptor (TCR). In some embodiments, the antigenic polypeptide is a full-length protein or a fragment thereof, or an antigenic fusion protein that can trigger an immune response in a subject. In some embodiments, the antigenic polypeptide is a short peptide no more than 100 amino acids long. The antigenic polypeptide can be a naturally derived peptide fragment from a protein antigen containing one or more epitopes, or an artificially designed peptide with one or more natural epitope sequences, wherein a peptide linker may optionally be placed in between adjacent epitope sequences. In some embodiments, the antigenic polypeptide comprises a single epitope of an antigenic protein. In some embodiments, the antigenic polypeptide comprises about any one of 1, 2, 3, 4, 5, 10 or more epitopes from a single antigenic protein. In some embodiments, the antigenic polypeptide comprises epitopes from a plurality (e.g., 2, 3, 4, 5, 10 or more) of different antigenic proteins. In some embodiments, the antigenic polypeptide comprises a Major Histocompatibility Complex (MHC) class I-restricted epitope. In some embodiments, the antigenic polypeptide comprises an MHC class II-restricted epitope. In some embodiments, the antigenic polypeptide comprises both MHC class I-restricted and MHC class II-restricted epitopes.


In some embodiments, the antigenic polypeptide is an antigenic protein or fragment thereof or a variant thereof from a pathogenic agent, such as a bacterium or a virus. In some embodiments, the antigenic polypeptide is an antigenic protein or fragment of a coronavirus, such as SARS-CoV-2, including variants thereof.


In some embodiments, the antigenic polypeptide is an antigenic protein or fragment thereof or a variant thereof of a self antigen, such as an antigen involved in a disease or condition. In some embodiments, the antigenic polypeptide is a tumor antigen peptide. Tumor antigen peptide sequences are known in the art and can be found at public databases, such as the Cancer Antigenic Peptide Database (van der Bruggen P et al. (2013) “Peptide database: T cell-defined tumor antigens.” Cancer Immunity. URL: caped.icp.ucl.ac.be). The coding RNA sequence in the linear RNA or circRNA described herein may encode any of the known tumor antigen peptides or combinations thereof. In some embodiments, the antigenic polypeptide comprises an epitope of a tumor associated antigen (TAA). In some embodiments, the antigenic polypeptide comprises an epitope of a tumor specific antigen. In some embodiments, the antigenic polypeptide comprises an epitope of a neoantigen, i.e., newly acquired and expressed antigens present in tumor cells of an individual.


In some embodiments, the amino acid sequences of one or more epitope peptides are predicted based on the sequence of the antigen protein (including neoantigens) using a bioinformatics tool for T cell epitope prediction. Exemplary bioinformatics tools for T cell epitope prediction are known in the art, for example, see Yang X. and Yu X. (2009) “An introduction to epitope prediction methods and software” Rev. Med. Virol. 19(2): 77-96. In some embodiments, the sequence of the antigen protein is known in the art or available in public databases. In some embodiments, the sequence of the antigen protein (including neoantigens) is determined by sequencing a sample (such as a tumor sample) of the individual being treated.


In some embodiments, the antigenic polypeptide comprises a Spike (S) protein or a fragment thereof of a coronavirus, such as a SARS-CoV, MERS-CoV, or SARS-CoV-2 virus. In some embodiments, the antigenic polypeptide is a full-length S protein. In some embodiments, the antigenic polypeptide is a fragment of a naturally occurring S protein. In some embodiments, the antigenic polypeptide comprises a Spike (S) protein or a fragment thereof of SARS-CoV-2.


In some embodiments, the antigenic polypeptide comprises a variant of an S protein or fragment thereof of a coronavirus. In some embodiments, the antigenic polypeptide comprises a naturally occurring variant of an S protein or fragment thereof of a coronavirus (e.g., SARS-CoV-2). Variants of the SARS-CoV-2 genome have been described. See, for example, Forster et al. (2020). Phylogenetic network analysis of SARS-CoV-2 genomes. PNAS 117 (17) 9241-9243, which is incorporated herein by reference in its entirety. In some embodiments, the antigenic polypeptide comprises a variant of an S protein or fragment thereof that confers a fitness advantage to a coronavirus, such as enhanced infectivity. In some embodiments, the antigenic polypeptide comprises an S protein or fragment thereof of SARS-CoV-2 having a D614G mutation. In some embodiments, the antigenic polypeptide is capable of eliciting an immune response in an individual against different strains and variants of a coronavirus, such as SARS-CoV-2 variants. In some embodiments, the antigenic polypeptide is capable of eliciting an immune response in an individual against a specific strain or variant of a coronavirus.


In some embodiments, the antigenic polypeptide comprises a receptor-binding domain (RBD) of an S protein of a coronavirus (e.g., SARS-CoV-2). In some embodiments, the RBD comprises amino acid residues 319 to 542 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the RBD comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98% or more, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 2. In some embodiments, the RBD comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98% or more, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 63.


In some embodiments, the antigenic polypeptide comprises an S2 region of an S protein of a coronavirus (e.g., SARS-CoV-2). In some embodiments, the S2 region comprises amino acid residues 686 to 1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98% or more, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 6. In some embodiments, the S2 region comprises one or more mutations that stabilize a pre-fusion conformation of the S protein. In some embodiments, the S2 region comprises K986P and V987P mutations. In some embodiments, the S2 region comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98% or more, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 7.
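The residue ranges recited above (319 to 542 for the RBD and 686 to 1273 for the S2 region) use 1-based, inclusive numbering relative to the full-length S protein. The following minimal sketch shows how such a range could be extracted from a sequence string. The helper name `slice_residues` is hypothetical, and the 1273-residue placeholder string stands in for a full-length sequence; it is not SEQ ID NO: 1.

```python
def slice_residues(sequence: str, start: int, end: int) -> str:
    """Return residues start..end of a protein sequence.

    Positions are 1-based and inclusive, as in conventional residue
    numbering, so they are converted to Python's 0-based slicing.
    """
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError(f"range {start}-{end} is outside the sequence")
    return sequence[start - 1:end]


# Placeholder of the same length as a full-length S protein (1273 residues).
# This is NOT a real sequence; substitute the actual sequence when available.
placeholder_spike = "A" * 1273

rbd = slice_residues(placeholder_spike, 319, 542)
s2_region = slice_residues(placeholder_spike, 686, 1273)
print(len(rbd), len(s2_region))  # 224 588
```

Under this numbering the RBD fragment is 224 residues long and the S2 fragment is 588 residues long.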


In some embodiments, the antigenic polypeptide comprises both an RBD and an S2 region of an S protein of a coronavirus (e.g., SARS-CoV-2). In some embodiments, the antigenic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98% or more, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 1.


In some embodiments, the antigenic polypeptide comprises a Spike (S) protein fragment of a coronavirus (e.g., SARS-CoV, MERS-CoV, or SARS-CoV-2) and a multimerization domain, which can be operably linked to the S protein fragment. In some embodiments, the multimerization domain is a C-terminal Foldon (Fd) domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein. In some embodiments, the multimerization domain is a GCN4-based isoleucine zipper domain. In some embodiments, the multimerization domain comprises the amino acid sequence as set forth in SEQ ID NO: 3 or 4. In some embodiments, the multimerization domain is fused to the S protein fragment via a peptide linker. In some embodiments, the antigenic polypeptide comprises an RBD of an S protein fused to a multimerization domain via a peptide linker. In some embodiments, the peptide linker comprises the amino acid sequence of SEQ ID NO: 5.


In some embodiments, the antigenic polypeptide comprises a Spike (S) protein or a fragment thereof of SARS-CoV-2 fused to a multimerization domain. In some embodiments, the antigenic polypeptide comprises an S protein fragment fused to a C-terminal Foldon (Fd) domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein (e.g., SEQ ID NO: 3). In some embodiments, the antigenic polypeptide comprises an S protein fragment fused to a GCN4-based isoleucine zipper domain (e.g., SEQ ID NO: 4). In some embodiments, the antigenic polypeptide comprises a receptor-binding domain (RBD) of an S protein of SARS-CoV-2 fused to a multimerization domain via a peptide linker. In some embodiments, the peptide linker comprises the amino acid sequence of SEQ ID NO: 5.


The antigenic polypeptide may comprise a signal peptide (SP). In some embodiments, the SP is fused to the N-terminus of the S protein or fragment thereof. In non-limiting examples, the signal peptide is the signal sequence and propeptide from human tissue plasminogen activator (tPA), the signal sequence from human IgE Immunoglobulin, or the signal peptide sequence of MHC I. In some embodiments, the signal peptide can facilitate secretion of the antigenic polypeptide encoded by the circRNA vaccine.


In some embodiments, the circRNA comprises an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA does not comprise a stop codon at the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the in-frame 2A peptide coding sequence replaces the stop codon. In some embodiments, the circRNA contains no stop codon and the number of nucleotides composing the RNA is a multiple of three. In some embodiments, the circRNA having no stop codon and the number of nucleotides composing the RNA being a multiple of three allows for rolling circle translation of the circRNA. In some embodiments, the 2A peptide coding sequence allows for rolling circle translation of the circRNA. In some embodiments, the 2A peptide allows cleavage of a polypeptide generated by rolling circle translation into monomeric polypeptide sequences. In non-limiting examples, the 2A peptide coding sequence encodes a P2A or T2A peptide, such as the sequence set forth in SEQ ID NO: 44 or 45.
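The two sequence conditions described above for rolling circle translation (no stop codon, and a total length that is a multiple of three) can be checked computationally. The following is a minimal sketch under stated assumptions: the function name `supports_rolling_circle` is hypothetical, "no stop codon" is interpreted as no stop codon in the reading frame set by the start codon, and the short sequences are toy examples rather than sequences of this application. Because the length is a multiple of three, the reading frame is unchanged after each lap, so a single pass around the circle is sufficient.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def supports_rolling_circle(circ_rna: str, start_index: int = 0) -> bool:
    """Check the sequence conditions for rolling circle translation.

    circ_rna is the circular sequence written linearly from an arbitrary
    position; start_index is the 0-based position of the first nucleotide
    of the start codon. Returns True if the length is a multiple of three
    and no stop codon occurs in the frame defined by start_index.
    """
    seq = circ_rna.upper().replace("T", "U")
    n = len(seq)
    if n == 0 or n % 3 != 0:
        return False
    for offset in range(0, n, 3):
        # Modular indexing lets the last codons wrap around the junction.
        codon = "".join(seq[(start_index + offset + k) % n] for k in range(3))
        if codon in STOP_CODONS:
            return False
    return True


# Toy sequences for illustration only.
print(supports_rolling_circle("AUGGCUGGAUCC"))      # True
print(supports_rolling_circle("AUGGCUUAAUCC"))      # False: in-frame UAA
print(supports_rolling_circle("AUGGCUGGAUC"))       # False: length 11
print(supports_rolling_circle("GCUGGAUCCAUG", 9))   # True: frame wraps around
```

This check addresses only the frame and stop codon conditions; it does not model the 2A peptide or translation initiation elements.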


Also provided is a circRNA comprising a nucleic acid sequence encoding any one of the antigenic polypeptides described herein. The nucleic acid sequences encoding the antigenic polypeptides may be codon-optimized. In some embodiments, the circRNA comprises a nucleic acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98% or more, or 100%) sequence identity to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 11-15 and SEQ ID NOs: 48-49.


Spike Protein or Fragment Thereof.

The circRNA vaccines described herein comprise a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2, MERS-CoV, or SARS-CoV). Sequences of S proteins of coronaviruses are known in the art, including, for example, NCBI RefSeq ID: YP_009047204.1 (MERS-CoV), GenBank Accession number: AAT74874 (SARS-CoV), or NCBI RefSeq ID: YP_009724390 (SARS-CoV-2, provided as SEQ ID NO: 1 of the present application).


In some embodiments, the S protein or fragment thereof comprises amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S protein or fragment thereof comprises a deletion of amino acid residues 681-684. In some embodiments, the S protein or fragment thereof comprises at least one point mutation in the S2 region, for example, a K986P, V987P, F817P, A892P, A899P, or A942P mutation or combinations thereof. In some embodiments, the S protein or fragment thereof comprises at least one mutation selected from A222V, E406W, K417N, K417T, N439K, L452R, L452Q, L455N, L478K, E484K, Q493F, F490S, N501Y, A570D, D614G, P681H, A701V, T716I, S982A, or combinations thereof. In some embodiments, the S protein or fragment thereof comprises an N501Y point mutation. In some embodiments, the S protein or fragment thereof comprises K417N, E484K, and/or N501Y point mutations. In some embodiments, the S protein or fragment thereof comprises an E484K point mutation. In some embodiments, the S protein or fragment thereof comprises K417T, E484K, and N501Y point mutations. In some embodiments, the S protein or fragment thereof of SARS-CoV-2 comprises K986P and V987P point mutations, either alone or in combination with a deletion of amino acid residues 681-684. In some embodiments, the S protein or fragment thereof comprises an amino acid sequence set forth in any one of SEQ ID NOs: 1-2, SEQ ID NOs: 6-10, or SEQ ID NO: 63. In some embodiments, the S protein or fragment thereof comprises an amino acid sequence set forth in SEQ ID NO: 2. In some embodiments, the S protein or fragment thereof comprises an amino acid sequence set forth in SEQ ID NO: 63.


In some embodiments, the S protein or fragment thereof is an Alpha (B.1.1.7), Beta (B.1.351, B.1.351.2, B.1.351.3), Delta (B.1.617.2, AY.1, AY.2, AY.3), or Gamma (P.1, P.1.1, P.1.2) S protein or fragment thereof. In some embodiments, the S protein or fragment thereof comprises two, three, four, five, or more mutations selected from the group consisting of T19R, V70F, T95I, G142D, E156-, F157-, R158G, A222V, W258L, K417N, L452R, T478K, D614G, P681R, and D950N, wherein the amino acid numbering is based on SEQ ID NO: 1. In some embodiments, the S protein or fragment thereof comprises an RBD comprising one, two, or three of the mutations selected from the group consisting of K417N, L452R, and T478K, wherein the amino acid numbering is based on SEQ ID NO: 1. In some embodiments, the S protein or fragment thereof comprises two, three, four, five, or more mutations selected from the group consisting of residue 69 deletion, residue 70 deletion, residue 144 deletion, E484K, S494P, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H, and K1191N, wherein the amino acid numbering is based on SEQ ID NO: 1. In some embodiments, the S protein or fragment thereof comprises an RBD comprising one, two, or three of the mutations selected from the group consisting of E484K, S494P, and N501Y, wherein the amino acid numbering is based on SEQ ID NO: 1. In some embodiments, the S protein or fragment thereof comprises one, two, three, four, five, or more mutations selected from the group consisting of D80A, D215G, 241del, 242del, 243del, K417N, E484K, N501Y, D614G, and A701V, wherein the amino acid numbering is based on SEQ ID NO: 1. In some embodiments, the S protein or fragment thereof comprises an RBD comprising one, two, or three of the mutations selected from the group consisting of K417N, E484K, and N501Y, wherein the amino acid numbering is based on SEQ ID NO: 1.
In some embodiments, the S protein or fragment thereof comprises one, two, three, four, five, or more mutations selected from the group consisting of L18F, T20N, P26S, D138Y, R190S, K417T, E484K, N501Y, D614G, H655Y, and T1027I, wherein the amino acid numbering is based on SEQ ID NO: 1. In some embodiments, the S protein or fragment thereof comprises an RBD comprising one, two, or three of the mutations selected from the group consisting of K417T, E484K, and N501Y, wherein the amino acid numbering is based on SEQ ID NO: 1. In some embodiments, the S protein or fragment thereof comprises an RBD comprising one, two, or three of the mutations selected from the group consisting of E484K, N501Y, and L452R, wherein the amino acid numbering is based on SEQ ID NO: 1.
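Point mutations in the notation used above (wild-type residue, 1-based position, substituted residue, e.g., K986P) can be applied to a reference sequence programmatically. The following minimal sketch handles substitutions only; deletions such as "E156-" or "241del" are deliberately out of scope. The function name `apply_substitutions`, the toy reference sequence, and the toy mutation labels are assumptions for illustration and do not correspond to SEQ ID NO: 1.

```python
import re
from typing import Iterable

# Matches substitutions such as "K986P": wild-type residue, position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, mutations: Iterable[str]) -> str:
    """Apply point substitutions to a protein sequence.

    Positions are 1-based relative to `sequence`. Each mutation is checked
    against the residue currently at that position, so a mismatch between
    the mutation label and the reference numbering raises an error instead
    of silently producing a wrong variant.
    """
    residues = list(sequence)
    for label in mutations:
        match = _SUBSTITUTION.match(label)
        if match is None:
            raise ValueError(f"unsupported mutation format: {label}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if position < 1 or position > len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{label}: expected {wild_type} at {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Toy reference and toy mutations for illustration only.
toy_reference = "MKVLNE"
print(apply_substitutions(toy_reference, ["K2P", "V3P"]))  # MPPLNE
```

Validating the wild-type residue before substitution is a design choice that guards against numbering that is not based on the intended reference sequence.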


In some embodiments, the S protein or fragment thereof comprises an N-terminal domain (NTD) of an S protein of a coronavirus (e.g., SARS-CoV-2, MERS-CoV, or SARS-CoV).


In some embodiments, the S protein or fragment thereof comprises an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98% or more sequence identity to a wild-type S protein or a fragment thereof of a coronavirus, or to any one of the sequences set forth in SEQ ID NOs: 1-2, SEQ ID NOs: 6-10, and SEQ ID NOs: 62-63.
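Percent sequence identity thresholds such as those recited above are evaluated on aligned sequences. The following minimal sketch computes percent identity for two sequences that have already been aligned to equal length, with gaps written as "-". It assumes, for illustration, that identity is the number of identical non-gap columns divided by the total number of alignment columns; other conventions for the denominator exist, and this application does not prescribe one here. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Gap characters ("-") never count as matches. The denominator is the
    full alignment length, which is one of several conventions in use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """Return True if percent identity is at least `threshold` percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned sequences for illustration only.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))              # 90.0
print(meets_identity_threshold("ACDEFGHIKL", "ACDEFGHIKV", 80))  # True
```

In practice the alignment itself would be produced by a dedicated alignment tool before this calculation is applied.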


RBD Domain

In some embodiments, the S protein or fragment thereof comprises a receptor-binding domain (RBD) of the S protein. In some embodiments, the RBD comprises amino acid residues 319 to 542 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the RBD comprises the amino acid sequence of SEQ ID NO: 2. In some embodiments, the RBD comprises a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98% or more sequence identity with the amino acid sequence of SEQ ID NO: 2. In some embodiments, the RBD comprises a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98% or more sequence identity with the amino acid sequence of SEQ ID NO: 63. In some embodiments, the RBD is linked to a multimerization domain. In some embodiments, the RBD is fused to a multimerization domain by a flexible peptide linker.


S2 Region

In some embodiments, the S protein or fragment thereof comprises an S2 region of the S protein. In some embodiments, the S2 region comprises amino acid residues 686 to 1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region comprises the amino acid sequence of SEQ ID NO: 6. In some embodiments, the S2 region comprises one or more mutations that stabilize a pre-fusion conformation of the S protein. In some embodiments, the S2 region comprises K986P and V987P mutations, for example, as in the sequence set forth in SEQ ID NO: 7. In some embodiments, the S2 region comprises a single point mutation, for example, a K986P, V987P, F817P, A892P, A899P, or A942P mutation. In some embodiments, the S2 region comprises a combination of point mutations selected from K986P, V987P, F817P, A892P, A899P, and A942P. In some embodiments, the S2 region comprises the wild-type sequence of an S protein of a coronavirus, such as the sequence of SEQ ID NO: 6, or a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98% or more sequence identity with the amino acid sequence of SEQ ID NO: 6.


Multimerization Domain

In some embodiments, the antigenic polypeptide further comprises a multimerization domain, such as a dimerization domain, a trimerization domain, or a domain that mediates formation of higher order multimers. In some embodiments, the multimerization domain is a trimerization domain. In non-limiting examples, the multimerization domain comprises a C-terminal Foldon (Fd) domain of a T4 fibritin protein, wherein the C-terminal Foldon domain is the domain that mediates trimerization of the T4 fibritin protein, such as the amino acid sequence set forth in SEQ ID NO: 3. In another example, the multimerization domain comprises a GCN4-based isoleucine zipper (IZ) domain based on the trimerization domain of the GCN4 transcriptional activator from Saccharomyces cerevisiae, such as the amino acid sequence set forth in SEQ ID NO: 4. In some embodiments, the multimerization domain has at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98% or more sequence identity with the amino acid sequence of SEQ ID NO: 3 or SEQ ID NO: 4. In some embodiments, the GCN4 IZ domain or T4 fibritin Fd domain can be modified to reduce its immunogenicity according to techniques known in the art. For example, the GCN4 IZ domain can be modified with N-linked glycosylation sites to reduce its immunogenicity (Sliepen et al. Immunosilencing a Highly Immunogenic Protein Trimerization Domain. The Journal of Biol. Chem. Vol. 290, No. 12, pp. 7436-7442). In some embodiments, the multimerization domain is fused to the N-terminus of the S protein or fragment thereof. In some embodiments, the multimerization domain is fused to the C-terminus of the S protein or fragment thereof.


ii. Targeting Proteins


In some embodiments, the therapeutic polypeptide is a targeting protein. In some embodiments, the targeting protein is an antibody or an antigen-binding fragment thereof.


In some embodiments, the therapeutic polypeptide is an antibody. In some embodiments, the therapeutic polypeptide is a neutralizing antibody, i.e., an antibody that blocks an interaction between a protein and its binding partner. In some embodiments, the antibody inhibits activity of a protein, e.g., by blocking binding of the protein to a binding partner. In some embodiments, the targeting protein is a therapeutic antibody. In some embodiments, the antibody is a checkpoint inhibitor, e.g., an antibody inhibitor of CTLA-4, PD-1, or PD-L1. In some embodiments, the antibody can be an antibody against a viral protein or a receptor that binds to a viral protein.


The antibody can be an antigen-binding fragment of an antibody, e.g., a portion or fragment of an intact or complete antibody having fewer amino acid residues than the intact or complete antibody, which is capable of binding to an antigen or competing with the intact antibody (i.e., the intact antibody from which the antigen-binding fragment is derived) for binding to an antigen. Antigen-binding fragments can be prepared by recombinant DNA techniques, or by enzymatic or chemical cleavage of intact antibodies. Antigen-binding fragments include, but are not limited to, Fab′, F(ab′)2, Fv, single chain Fv (scFv), single chain Fab, diabody, single domain antibody (sdAb, nanobody), camel Ig, Ig NAR, F(ab)′3 fragment, bis-scFv, (scFv)2, minibodies, diabodies, triabodies, tetrabodies, and disulfide stabilized Fv proteins (“dsFv”). In some embodiments, the neutralizing antibody can be a genetically engineered antibody, such as a chimeric antibody (e.g., humanized murine antibodies), heteroconjugate antibody (e.g., bispecific antibodies), or antigen-binding fragments thereof.


In some embodiments, the antibody is a neutralizing antibody that binds to a viral protein. In some embodiments, the antibody is a neutralizing antibody that binds to a receptor for a viral protein. In some embodiments, the antibody binds to a receptor that is required for viral entry into a cell (e.g., an ACE2 receptor). In some embodiments, the antibody is a neutralizing antibody (nAb) that binds to the S protein of a coronavirus and prevents or reduces its ability to infect cells. In some embodiments, the coronavirus is SARS-CoV-2. In some embodiments, the nAb is a monoclonal antibody (mAb), a functional antigen-binding fragment (Fab), a single-chain variable region fragment (scFv), or a single-domain antibody (a VHH or nanobody).


In some embodiments, the nAb binds to the RBD of an S protein of a coronavirus. In some embodiments, the nAb binds to the NTD of an S protein of a coronavirus. In some embodiments, the nAb binds to the S2 region of an S protein of a coronavirus. In some embodiments, the nAb binds the S1/S2 proteolytic cleavage site of an S protein of a coronavirus. In some embodiments, the coronavirus is SARS-CoV-2. In some embodiments, binding of the nAb to the S protein interferes with interaction of the RBD of the S protein with an ACE2 receptor. In some embodiments, the nAb binds to the ACE2 binding site of the RBD. In some embodiments, binding of the nAb to the S protein interferes with S2-mediated membrane fusion. In some embodiments, binding of the nAb to the S protein interferes with viral entry into the host cell.


In some embodiments, the nAb binds to an S protein comprising one or more mutations. In some embodiments, the nAb binds to an S protein or fragment thereof that comprises at least one point mutation in the S2 region, for example, a K986P, V987P, F817P, A892P, A899P, or A942P mutation or combinations thereof. In some embodiments, the nAb binds to an S protein or fragment thereof that comprises at least one point mutation selected from A222V, E406W, K417N, K417T, N439K, L452R, L452Q, L455N, L478K, E484K, Q493F, F490S, N501Y, A570D, D614G, P681H, A701V, T716I, S982A, or combinations thereof. In some embodiments, the nAb binds to an S protein or fragment thereof that comprises an N501Y point mutation. In some embodiments, the nAb binds to an S protein or fragment thereof that comprises K417N, E484K, and N501Y point mutations. In some embodiments, the nAb binds to an S protein or fragment thereof that comprises an E484K point mutation. In some embodiments, the nAb binds to an S protein or fragment thereof that comprises K417T, E484K, and N501Y point mutations. In some embodiments, the nAb binds to an S protein or fragment thereof of SARS-CoV-2 that comprises K986P and V987P point mutations, either alone or in combination with a deletion of amino acid residues 681-684. In some embodiments, binding of the nAb to an S protein having any of the combinations of mutations described above (e.g., K417N, K417T, E484K, and/or N501Y) interferes with interaction of the RBD of the S protein with an ACE2 receptor. In some embodiments, binding of the nAb to an S protein having any of the combinations of mutations described above (e.g., K417N, K417T, E484K, and/or N501Y) interferes with S2-mediated membrane fusion. In some embodiments, binding of the nAb to an S protein having any of the combinations of mutations described above (e.g., K417N, K417T, E484K, and/or N501Y) interferes with viral entry into the host cell.


Exemplary nAbs for binding and neutralization of the S protein of SARS-CoV-2 have been described, for example, in Barnes, C. O. et al. SARS-CoV-2 neutralizing antibody structures inform therapeutic strategies. Nature 588, 682-687 (2020), and Chinese Patent Application No. CN111690058A, the contents of which are herein incorporated by reference in their entirety.


In some embodiments, the nAb comprises a sequence selected from SEQ ID NOs: 26-33. In some embodiments, the nAb comprises a sequence having at least 80% (e.g., at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, or at least 99%) amino acid sequence identity to a sequence selected from SEQ ID NOs: 26-33.


In some embodiments, the antibody is an antibody against the S protein of SARS-CoV-2. In some embodiments, the antibody comprises a sequence having at least 80% (e.g., at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, or at least 99%, or 100%) amino acid sequence identity to SEQ ID NO: 26. In some embodiments, the antibody comprises a sequence having at least 80% (e.g., at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, or at least 99%, or 100%) amino acid sequence identity to SEQ ID NO: 27. In some embodiments, the antibody comprises a sequence having at least 80% (e.g., at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, or at least 99%, or 100%) amino acid sequence identity to SEQ ID NO: 30.


In some embodiments, the targeting protein is not an antibody. Examples of non-antibody-based targeting proteins include, but are not limited to, a lipocalin, an anticalin (an artificial antibody mimetic protein derived from human lipocalins), a “T-body”, a peptide (e.g., a BICYCLE™ peptide), an affibody (an antibody mimetic composed of alpha helices, e.g., a three-helix bundle), a peptibody (peptide-Fc fusion), a DARPin (designed ankyrin repeat protein, an engineered antibody mimetic protein consisting of repeat motifs), an affimer, an avimer, a knottin (a protein structural motif containing 3 disulfide bridges), a monobody, an affinity clamp, an ectodomain, a receptor ectodomain, a receptor, a cytokine, a ligand, an immunocytokine, and a centyrin. See, for example, Vazquez-Lombardi, Rodrigo, et al. Drug discovery today 20.10 (2015): 1271-1283.


iii. Soluble Receptors


In some embodiments, the therapeutic polypeptide is a soluble receptor. Soluble receptors (sometimes referred to as soluble receptor decoys or “traps”) can comprise all or a portion of the extracellular domain of a receptor protein. In some embodiments, a nucleotide sequence encoding all or a portion of the extracellular domain of a receptor protein is operably linked to a signal peptide for secretion from cells.


In some embodiments, the soluble receptor comprises an extracellular domain of a naturally occurring receptor. In some embodiments, the soluble receptor variant comprises an engineered variant of an extracellular domain of a naturally occurring receptor, such as a variant comprising one or more mutations in the extracellular domain. In some embodiments, the soluble receptor comprises one or more mutations that increase the affinity of the soluble receptor for its ligand compared to the affinity of the naturally occurring receptor for its ligand.


In some embodiments, the soluble receptor is a fusion protein comprising one or more additional protein domains operably linked to the extracellular domain of the receptor or a variant thereof. In some embodiments, the soluble receptor comprises an Fc domain of an immunoglobulin (Ig), e.g., a human immunoglobulin. In some embodiments, the soluble receptor comprises an Fc domain of a human IgG1.


In some embodiments, the soluble receptor comprises the extracellular domain of a signaling receptor, and the soluble receptor can reduce or inhibit activity of the signaling pathway by blocking binding between the endogenous receptor and its ligand.


In some embodiments, the soluble receptor is a receptor that binds to a viral protein and/or that mediates viral entry. In some embodiments, the soluble receptor is a soluble ACE2 receptor. In some embodiments, the therapeutic polypeptide is a soluble ACE2 receptor variant capable of binding to an S protein of a coronavirus. In some embodiments, the soluble ACE2 (sACE2) receptor can have an advantage over antibodies due to resistance to escape mutations: a virus carrying a mutation that escapes sACE2 binding would be expected to also have limited binding affinity for native cell-surface ACE2 receptors, leading to diminished or eliminated virulence.


In some embodiments, the ACE2 receptor fragment is engineered to have higher affinity to an S protein of a coronavirus. In some embodiments, the soluble ACE2 receptor variant is capable of binding to an S protein of a coronavirus and blocking or reducing binding of the S protein to an endogenous ACE2 receptor. In some embodiments, the soluble ACE2 receptor variant binds to the receptor binding domain (RBD) of the S protein. In some embodiments, the ACE2 receptor variant is enzymatically active. In other embodiments, the ACE2 receptor variant is enzymatically inactive.


In some embodiments, the soluble ACE2 receptor variant comprises the soluble extracellular domain of wild-type (WT) human recombinant ACE2 (APN01). APN01 has been found to be safe in healthy volunteers and in a small cohort of patients with acute respiratory distress syndrome by virtue of ACE2's intrinsic angiotensin-converting activity, which is not required for viral entry. APN01 is currently in phase 2 clinical trials in Europe for treatment of SARS-CoV-2 (NCT04335136). In some embodiments, the soluble ACE2 receptor variant comprises one or more mutations in the extracellular domain of human ACE2. In some embodiments, the soluble ACE2 receptor variant is engineered via affinity maturation to have increased binding affinity to the RBD of the S protein. For example, a nucleotide sequence encoding a wild-type extracellular domain of ACE2 may be subjected to one or more cycles of random mutation and cell sorting to identify ACE2 variants having a higher affinity for the RBD of the S protein than wild-type ACE2.


In some embodiments, the soluble ACE2 receptor variant comprises a sequence having at least 80% (e.g., at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, or at least 99%, or 100%) amino acid sequence identity to SEQ ID NO: 34 or 35.


In some embodiments, the soluble ACE2 receptor variant is a fusion protein, e.g., a fusion of the extracellular ACE2 receptor domain to the Fc region of the human IgG1.


In some embodiments, the KD of the soluble ACE2 receptor variant for the RBD of the S protein is about 15-20 nM. In some embodiments, the KD of the soluble ACE2 receptor variant for the RBD of the S protein is less than 15 nM, less than 10 nM, less than 5 nM, less than 1 nM, less than 500 pM, less than 250 pM, less than 200 pM, or less than 150 pM.
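To illustrate what the KD values above imply, the following minimal sketch computes the equilibrium fraction of RBD occupied at a given free concentration of soluble ACE2 receptor variant. It assumes a simple 1:1 binding model at equilibrium, in which the fraction bound equals [L] / ([L] + KD); this model, the function name `fraction_bound`, and the example concentration of 15 nM are assumptions for illustration and are not measurements from this application.

```python
def fraction_bound(free_conc_nm: float, kd_nm: float) -> float:
    """Equilibrium fraction of binding sites occupied in a 1:1 binding model.

    free_conc_nm is the free concentration of the binding partner (nM) and
    kd_nm is the dissociation constant (nM). At a free concentration equal
    to KD, half of the sites are occupied.
    """
    if free_conc_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return free_conc_nm / (free_conc_nm + kd_nm)


# Hypothetical free concentration of 15 nM compared across several KD values.
for kd_nm in (15.0, 5.0, 1.0, 0.15):  # 0.15 nM corresponds to 150 pM
    occupancy = fraction_bound(15.0, kd_nm)
    print(f"KD = {kd_nm} nM -> fraction bound = {occupancy:.3f}")
```

Under this model, lowering KD from 15 nM to 150 pM raises the occupancy at a 15 nM free concentration from one half to about 99%.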


Soluble ACE2 receptor variants have been described, for example, in Haschke M et al., Clin Pharmacokinet. 2013 September; 52(9):783-92; Glasgow A et al., Proceedings of the National Academy of Sciences November 2020, 117 (45) 28046-28055; and Higuchi Y. et al., bioRxiv 2020.09.16.299891, the contents of which are herein incorporated by reference in their entirety.


iv. Functional Proteins


In some embodiments, the therapeutic polypeptide can be any polypeptide that is capable of being expressed by target cells (e.g., human or mouse cells) for the production (and in certain instances, the excretion) of a functional enzyme or protein as disclosed, for example, in International Application No. PCT/US2010/058457 and WO2020237227, the contents of which are herein incorporated by reference in their entirety. In some embodiments, the therapeutic polypeptide can be engineered for secretion by operably linking a signal peptide to the amino terminus of the therapeutic polypeptide. For example, in some embodiments, upon the expression of one or more therapeutic polynucleotides by target cells, the production of a functional enzyme or protein in which a subject is deficient (e.g., a urea cycle enzyme or an enzyme associated with a lysosomal storage disorder) may be observed.


In some embodiments, the therapeutic polypeptide comprises a protein such as IDUA, OTC, FAH, miniDMD, DMD, p53, PTEN, COL3A1, BMPR2, AHI1, FANCC, MYBPC3, ILRG2, or ARG1, wherein deficiency of functional protein is associated with a disease or disorder.


In some embodiments, the therapeutic polypeptide comprises a protein (e.g., a lysosomal enzyme) wherein deficiency of the protein is associated with a lysosomal storage disorder.


In some embodiments, the therapeutic polypeptide comprises a protein (e.g., an enzyme), wherein deficiency of the protein is associated with a metabolic disorder. In some embodiments, the therapeutic polypeptide comprises a urea cycle enzyme (e.g., ARG1).


In some embodiments, the therapeutic polypeptide comprises a protein (e.g., p53 or PTEN), wherein deficiency of the protein is associated with a cancer. In some embodiments, the therapeutic polypeptide comprises a tumor suppressor.


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) to the amino acid sequence of a wildtype mouse IDUA protein (e.g., SEQ ID NO: 18).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) to the amino acid sequence of a wildtype human IDUA protein (e.g., SEQ ID NO: 19).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) to the amino acid sequence of a wildtype mouse OTC protein (e.g., SEQ ID NO: 20).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) to the amino acid sequence of a wildtype mouse FAH protein (e.g., SEQ ID NO: 21).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) to the amino acid sequence of a human miniDMD protein (e.g. SEQ ID NO: 22).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) to the amino acid sequence of a wildtype human DMD protein (e.g., SEQ ID NO: 23).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) to the amino acid sequence of a wildtype human p53 protein (e.g., SEQ ID NO: 24).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) to the amino acid sequence of a wildtype human PTEN protein (e.g., SEQ ID NO: 25).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) sequence identity to the amino acid sequence of a wildtype human COL3A1 protein (e.g., SEQ ID NO: 56).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) sequence identity to the amino acid sequence of a wildtype human BMPR2 protein (e.g., SEQ ID NO: 57).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) sequence identity to the amino acid sequence of a wildtype human AHI1 protein (e.g., SEQ ID NO: 58).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) sequence identity to the amino acid sequence of a wildtype human FANCC protein (e.g., SEQ ID NO: 59).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) sequence identity to the amino acid sequence of a wildtype human MYBPC3 protein (e.g., SEQ ID NO: 60).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) sequence identity to the amino acid sequence of a wildtype human IL2RG protein (e.g., SEQ ID NO: 61).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) sequence identity to the amino acid sequence of a wildtype human OTC protein (e.g., SEQ ID NO: 55).


In some embodiments, the therapeutic polypeptide comprises an amino acid sequence having at least about 80% (e.g., at least about 85%, 90%, 95%, 98%, or more, or 100%) sequence identity to the amino acid sequence of a wildtype human FAH protein (e.g., SEQ ID NO: 54).
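
As an illustration of how a percent-identity threshold such as those recited above might be checked, the following Python sketch computes identity over two pre-aligned sequences. It is not part of the disclosed method: the function names, the toy sequences, and the choice of denominator (full alignment length, with gap columns counted as mismatches) are assumptions made for the example, and other conventions for computing sequence identity exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption for this sketch: identity = identical non-gap columns divided
    by the total alignment length, expressed as a percentage. '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the two aligned sequences share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-residue sequences (hypothetical, not any SEQ ID NO) with 2 mismatches.
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQK"
print(percent_identity(reference, candidate))
print(meets_threshold(reference, candidate, threshold=85.0))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch only covers the final counting step.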


v. Peptide Linker


In some embodiments, the various domains in the therapeutic polypeptide (e.g., the various domains of a Spike protein or fragment thereof) may be fused to each other, or the therapeutic polypeptide may comprise domains (e.g., an antigenic polypeptide domain and a carrier protein or a multimerization domain) that are fused to each other via a peptide linker. In some embodiments, the antigenic polypeptide comprises a domain of an S protein of a coronavirus fused to a multimerization domain via a peptide linker. Flexible peptide linkers such as glycine linkers, glycine-serine linkers, and linkers containing other amino acids are known in the art (for example, suitable peptide linkers are described by Chen et al. in Fusion Protein Linkers: Property, Design and Functionality. Adv. Drug Deliv. Rev. 2013 Oct. 15; 65(10): 1357-1369). Peptide linkers can also be designed by computational methods. The peptide linker can be of any length from 1 to 10, 10 to 20, 20 to 30, 30 to 40, 40 to 50, or greater than 50 amino acids. In some embodiments, the peptide linker comprises the amino acid sequence of SEQ ID NO: 5.
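
By way of a non-limiting illustration, the sketch below builds a glycine-serine linker of a chosen length and uses it to join two domains. The repeating unit "GGGGS" is a commonly used flexible linker unit and is an assumption of this example; it is not asserted to be SEQ ID NO: 5. The function names and the placeholder domain strings are likewise hypothetical.

```python
def gs_linker(repeats: int, unit: str = "GGGGS") -> str:
    """Return a glycine-serine linker made of `repeats` copies of `unit`.

    The default unit is an assumption for illustration only.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return unit * repeats


def fuse(n_terminal_domain: str, linker: str, c_terminal_domain: str) -> str:
    """Concatenate two domains through a peptide linker (N- to C-terminus)."""
    return n_terminal_domain + linker + c_terminal_domain


linker = gs_linker(3)  # 15 residues, within the "10 to 20" length range
# Placeholder one-letter strings standing in for real domain sequences.
fusion = fuse("AAAA", linker, "CCCC")
print(linker, len(linker))
print(fusion)
```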


B. Additional circRNA Expression and Circularization Elements


The circRNAs of the circRNA vaccines described herein comprise one or more additional expression elements that facilitate expression and/or circularization of the circRNA.


In some embodiments, the circRNA comprises a Kozak sequence operably linked to a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2). In some embodiments, the Kozak sequence functions as a protein translation initiation site.


In some embodiments, the circRNA comprises a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2), which is operably linked to an internal ribosomal entry site (IRES). In non-limiting examples, the IRES sequence can be a CVB3 virus, EV71 virus, EMCV virus, PV virus, or a CSFV virus IRES sequence. See, for example, Searching for IRES. RNA. 2006 October; 12(10): 1755-1785, which is incorporated herein by reference in its entirety. In some embodiments, the IRES sequence is a cellular IRES sequence. In some embodiments, the IRES sequence is followed by a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence.


In some embodiments, a polyA sequence or polyAC spacer is disposed at the 5′ end of an IRES. In some embodiments, the polyA or polyAC sequence is disposed between the 5′ end of the IRES and the exon-exon splice junction. The internal polyA sequence or polyAC spacer may range from 1 to 500 nucleotides in length (e.g., at least 20, 30, 40, 50, 60, 70, 80, 90, 120, 140, 160, 180, 200, 250, 300, 350, 400, 450, or 500 nucleotides). In some embodiments, the polyA sequence or polyAC sequence may range from 10 to 70, 20 to 60, or 30 to 60 nucleotides in length. In some embodiments, the circRNA comprises the polyAC sequence set forth in SEQ ID NO: 37 disposed at the 5′ end of the IRES sequence. In some embodiments, no polyA sequence or polyAC sequence is disposed at the 5′ end of the IRES sequence. Without being bound by any theory or hypothesis, an internal polyA sequence or a polyAC spacer added upstream of an IRES sequence can help preserve the functional secondary structure of the IRES element for efficient IRES-initiated protein translation. In some embodiments, the polyA sequence or polyAC spacer increases expression of the RNA construct.
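
The length and composition constraints described above can be sketched as a simple check. In this hedged example, a spacer is treated as "polyA" if it contains only A and as "polyAC" if it contains only A and C (with both present); that composition rule, the function names, and the example spacers are assumptions for illustration and do not reproduce SEQ ID NO: 37.

```python
# Narrower length ranges named in the text, in nucleotides (inclusive).
PREFERRED_RANGES = [(10, 70), (20, 60), (30, 60)]


def spacer_kind(seq: str) -> str:
    """Classify a spacer as 'polyA' or 'polyAC' and enforce the 1-500 nt bound."""
    s = seq.upper()
    if not 1 <= len(s) <= 500:
        raise ValueError("spacer length must be between 1 and 500 nucleotides")
    bases = set(s)
    if bases == {"A"}:
        return "polyA"
    if bases == {"A", "C"}:
        return "polyAC"
    raise ValueError("spacer must consist of A only, or of A and C")


def preferred_ranges_containing(seq: str) -> list:
    """Return the preferred (min, max) ranges that contain the spacer length."""
    n = len(seq)
    return [(lo, hi) for lo, hi in PREFERRED_RANGES if lo <= n <= hi]


print(spacer_kind("A" * 30))                      # all-A spacer of 30 nt
print(spacer_kind("AC" * 25))                     # A/C spacer of 50 nt
print(preferred_ranges_containing("A" * 15))      # only the 10-70 range applies
```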


In some embodiments, the circRNA comprises a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2), which is operably linked to an m6A (N6-methyladenosine) modification motif sequence. The m6A modification motif sequence can comprise an m6A consensus sequence. m6A consensus sequences are known in the art (for example, the consensus sequences identified by Ke et al., 2017 (m6A mRNA modifications are deposited in nascent pre-mRNA and are not required for splicing but do specify cytoplasmic turnover. Genes & Dev. 2017. 31: 990-1006), which are available for download from GEO (GSE86336)). In some embodiments, the m6A modification motif sequence comprises the sequence set forth in SEQ ID NO: 38. In some embodiments, the m6A modification motif sequence is followed by a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide.
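
As a hedged illustration of what scanning for an m6A consensus sequence can look like, the sketch below locates occurrences of the widely reported DRACH motif (D = A, G, or U; R = A or G; H = A, C, or U) in an RNA string. The use of DRACH is an assumption of this example: the present text does not state that SEQ ID NO: 38 or the cited consensus sequences correspond to this motif. The function name and test sequence are hypothetical.

```python
import re

# DRACH written as a regular expression over the RNA alphabet. The lookahead
# lets overlapping occurrences be reported as separate hits.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")


def find_drach(rna: str) -> list:
    """Return 0-based start positions of DRACH motifs in `rna`.

    DNA-style input is accepted: T is converted to U before searching.
    """
    normalized = rna.upper().replace("T", "U")
    return [match.start() for match in DRACH.finditer(normalized)]


# Hypothetical 9-nt sequence containing one motif (GGACU) starting at index 2.
print(find_drach("CCGGACUCC"))
```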


In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment comprises the nucleic acid sequence of SEQ ID NO: 39. In some embodiments, the 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment comprises the nucleic acid sequence of SEQ ID NO: 40. In some embodiments, the 3′ catalytic Group I intron fragment comprises the nucleic acid sequence of SEQ ID NO: 46, and the 5′ catalytic Group I intron fragment sequence comprises the nucleic acid sequence of SEQ ID NO: 47.


In some embodiments, the Group I catalytic intron of the T4 phage Td gene is bisected in such a way as to preserve structural elements critical for ribozyme folding. Exon fragment 2 is then ligated upstream of exon fragment 1, and a nucleic acid sequence comprising a sequence encoding the antigenic polypeptide comprising a Spike (S) protein or fragment thereof of a coronavirus is inserted at the exon-exon junction. In some embodiments, a sequence comprising an IRES or m6A modification motif sequence, a Kozak sequence, a signal peptide encoding sequence, a sequence encoding an antigenic polypeptide comprising an S protein or fragment thereof of a coronavirus, and a stop codon or in-frame 2A peptide sequence is inserted at the exon-exon junction.
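
The permuted arrangement described above can be sketched as simple string assembly. This is an illustrative model only: it assumes that the linear precursor carries the 3′ intron fragment at its 5′ end and the 5′ intron fragment at its 3′ end, that exon fragment 2 corresponds to the 3′ exon sequence and exon fragment 1 to the 5′ exon sequence, and that splicing releases both intron fragments. All names and the bracketed placeholder strings are hypothetical and stand in for real sequences.

```python
def build_precursor(intron_3p: str, exon_2: str, insert: str,
                    exon_1: str, intron_5p: str) -> str:
    """Linear precursor, 5'->3': 3' intron fragment, E2, insert, E1, 5' intron fragment."""
    return intron_3p + exon_2 + insert + exon_1 + intron_5p


def spliced_circle(exon_2: str, insert: str, exon_1: str) -> str:
    """One linear reading of the circular product (E2, insert, E1).

    In the actual molecule the 3' end of E1 is joined to the 5' end of E2,
    so any rotation of this string describes the same circle.
    """
    return exon_2 + insert + exon_1


def same_circle(a: str, b: str) -> bool:
    """True if `a` and `b` are rotations of one another."""
    return len(a) == len(b) and a in (b + b)


precursor = build_precursor("[3'intron]", "[E2]", "[IRES-Kozak-SP-antigen-stop]",
                            "[E1]", "[5'intron]")
circle = spliced_circle("[E2]", "[IRES-Kozak-SP-antigen-stop]", "[E1]")
print(precursor)
print(circle)
```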


In some embodiments, the circRNA comprises a 5′ ligation sequence at the 5′ end of the circRNA, and a 3′ ligation sequence at the 3′ end of the circRNA, wherein the 5′ ligation sequence and the 3′ ligation sequence are ligated to each other via a ligase (e.g., T4 RNA ligase).


C. Exemplary Therapeutic circRNAs


i. Exemplary circRNAs for Expression of a Therapeutic Polypeptide


In some embodiments, the present application provides a circular RNA (circRNA) comprising a nucleic acid sequence encoding a therapeutic polypeptide (e.g., any of the therapeutic polypeptides described in Section A above) and further comprising an internal ribosomal entry site (IRES) sequence or an m6A modification motif sequence, wherein the IRES or m6A modification motif sequence is operably linked to the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the IRES sequence is a CVB3 virus, EV71 virus, EMCV virus, PV virus, or a CSFV virus IRES sequence. In some embodiments, the nucleic acid sequence further encodes a signal peptide (SP) fused to the N-terminus of the therapeutic polypeptide (e.g., an antigenic polypeptide, soluble receptor, or antibody). In non-limiting examples, the signal peptide is human tissue plasminogen activator (tPA) or IgE signal peptide. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of an IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the therapeutic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the 3′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 39. In some embodiments, the 5′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 40. In some embodiments, the 3′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 39, and the 5′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 40.


In some embodiments, the present application provides a circular RNA (circRNA) comprising a nucleic acid sequence, the nucleic acid sequence comprising from the 5′ end to the 3′ end: an m6A modification motif sequence, a Kozak sequence, and a nucleic acid sequence encoding a therapeutic polypeptide. In some embodiments, the circRNA further comprises an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the therapeutic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) comprising a nucleic acid sequence, the nucleic acid sequence comprising from the 5′ end to the 3′ end: an internal ribosomal entry site (IRES) sequence, a Kozak sequence, and a nucleic acid sequence encoding a therapeutic polypeptide. In some embodiments, the IRES sequence is a CVB3 virus, EV71 virus, EMCV virus, PV virus, or a CSFV virus IRES sequence. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the therapeutic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) comprising a nucleic acid sequence encoding a therapeutic polypeptide, further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide (e.g., instead of a stop codon). In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence or an m6A modification motif sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the therapeutic polypeptide for secretion of the therapeutic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) comprising a nucleic acid sequence, the nucleic acid sequence comprising from the 5′ end to the 3′ end: an internal ribosomal entry site (IRES) sequence or an m6A modification motif sequence, a Kozak sequence, a nucleic acid sequence encoding a therapeutic polypeptide, and an in-frame 2A peptide coding sequence. In some embodiments, the IRES sequence is a CVB3 virus, EV71 virus, EMCV virus, PV virus, or a CSFV virus IRES sequence. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the therapeutic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide.
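
The 5′-to-3′ element orders recited in the preceding paragraphs can be sketched as an ordered list with a validity check. The sketch is illustrative only: the element kind names, the rule that the initiator (IRES or m6A motif), Kozak sequence, and coding sequence are required while the spacer, signal peptide, and terminator (stop codon or 2A sequence) are optional, and the bracketed placeholder strings are assumptions made for this example.

```python
# Element kinds in 5'->3' order. "initiator" stands for an IRES or an m6A
# modification motif; "terminator" stands for a stop codon or a 2A sequence.
ORDER = ["spacer", "initiator", "kozak", "signal_peptide", "cds", "terminator"]
REQUIRED = {"initiator", "kozak", "cds"}


def validate_order(kinds: list) -> bool:
    """True if kinds appear in 5'->3' order, at most once each, with required kinds present."""
    positions = [ORDER.index(kind) for kind in kinds]  # ValueError on unknown kind
    in_order = all(a < b for a, b in zip(positions, positions[1:]))
    return in_order and REQUIRED <= set(kinds)


def assemble(elements: list) -> str:
    """Join (kind, sequence) pairs into one 5'->3' string after validating the order."""
    kinds = [kind for kind, _ in elements]
    if not validate_order(kinds):
        raise ValueError("elements are missing or out of 5'->3' order")
    return "".join(sequence for _, sequence in elements)


construct = assemble([
    ("spacer", "[polyAC]"),
    ("initiator", "[IRES]"),
    ("kozak", "[Kozak]"),
    ("signal_peptide", "[SP]"),
    ("cds", "[therapeutic polypeptide]"),
    ("terminator", "[2A]"),
])
print(construct)
```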


ii. Exemplary circRNA Vaccines


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2), and further comprising an internal ribosomal entry site (IRES) sequence or an m6A modification motif sequence, wherein the IRES or m6A modification motif sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a signal peptide (SP) fused to the N-terminus of the S protein or fragment thereof. In non-limiting examples, the signal peptide is human tissue plasminogen activator (tPA) or IgE signal peptide. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of an IRES sequence.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2), and an internal ribosomal entry site (IRES) sequence or an m6A modification motif sequence, wherein the IRES or m6A modification motif sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a signal peptide (SP) fused to the N-terminus of the S protein or fragment thereof. In non-limiting examples, the signal peptide is human tissue plasminogen activator (tPA) or IgE signal peptide. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of an IRES sequence.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2), and further comprising an internal ribosomal entry site (IRES) or m6A modification motif sequence, and a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a signal peptide (SP) fused to the N-terminus of the S protein or fragment thereof. In some embodiments, the signal peptide is, for example, human tissue plasminogen activator (tPA) or IgE signal peptide. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the 3′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 39. In some embodiments, the 5′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 40. In some embodiments, the 3′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 39, and the 5′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 40.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence, the nucleic acid sequence comprising from the 5′ end to the 3′ end: an internal ribosomal entry site (IRES) sequence, a Kozak sequence, a nucleic acid sequence encoding a signal peptide (SP), and a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2). In some embodiments, the IRES sequence is a CVB3 virus, EV71 virus, EMCV virus, PV virus, or a CSFV virus IRES sequence. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence, the nucleic acid sequence comprising from the 5′ end to the 3′ end: an m6A modification motif sequence, a Kozak sequence, a nucleic acid sequence encoding a signal peptide (SP), and a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2). In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) of a Spike (S) protein of a coronavirus (e.g., SARS-CoV-2) and a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence or an m6A modification motif sequence, wherein the IRES or m6A modification motif sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a signal peptide (SP) fused to the N-terminus of the S protein or fragment thereof (e.g., human tPA or IgE signal peptide). In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) of a Spike (S) protein of a coronavirus (e.g., derived from the wild type or B.1.351/501Y.V2 variant of SARS-CoV-2) and a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence or an m6A modification motif sequence, wherein the IRES or m6A modification motif sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a signal peptide (SP) fused to the N-terminus of the S protein or fragment thereof (e.g., human tPA or IgE signal peptide). In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the RBD domain has at least 90% (e.g., at least 95%, 96%, 97%, 98%, or 99%) sequence identity to the amino acid sequence of SEQ ID NO: 2. In some embodiments, the RBD domain comprises the amino acid sequence of SEQ ID NO: 2. In some embodiments, the RBD domain has at least 90% (e.g., at least 95%, 96%, 97%, 98%, or 99%) sequence identity to the amino acid sequence of SEQ ID NO: 63. In some embodiments, the RBD domain comprises the amino acid sequence of SEQ ID NO: 63.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence, the nucleic acid sequence comprising from the 5′ end to the 3′ end: an internal ribosomal entry site (IRES) sequence, a Kozak sequence, a nucleic acid sequence encoding a signal peptide (SP), and a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) of a Spike (S) protein of a coronavirus (e.g., derived from the wild type SARS-CoV-2 or a variant such as the B.1.351 or B.1.617.2 variant of SARS-CoV-2). In some embodiments, the IRES sequence is a CVB3 virus, EV71 virus, EMCV virus, PV virus, or a CSFV virus IRES sequence. In some embodiments, the circRNA further comprises a polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises the polyAC sequence set forth in SEQ ID NO: 37 disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the 3′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 39. In some embodiments, the 5′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 40. In some embodiments, the 3′ exon comprises the nucleic acid sequence of SEQ ID NO: 39, and the 5′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 40.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence, the nucleic acid sequence comprising from the 5′ end to the 3′ end: an m6A modification motif sequence, a Kozak sequence, a nucleic acid sequence encoding a signal peptide (SP), and a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) of a Spike (S) protein of a coronavirus (e.g., derived from the wild type SARS-CoV-2 or a variant such as the B.1.351 or B.1.617.2 variant of SARS-CoV-2). In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the 3′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 39. In some embodiments, the 5′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 40. In some embodiments, the 3′ exon comprises the nucleic acid sequence of SEQ ID NO: 39, and the 5′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 40.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence, the nucleic acid sequence comprising from the 5′ end to the 3′ end: an internal ribosomal entry site (IRES) sequence, a Kozak sequence, a nucleic acid sequence encoding a signal peptide (SP), and a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) of a Spike (S) protein of a coronavirus (e.g., derived from the wild type SARS-CoV-2 or a variant such as the B.1.351 or B.1.617.2 variant of SARS-CoV-2) and a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the IRES sequence is a CVB3 virus, EV71 virus, EMCV virus, PV virus, or a CSFV virus IRES sequence. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the RBD domain has at least 90% (e.g., at least 95%, 96%, 97%, 98%, or 99%) sequence identity to the amino acid sequence of SEQ ID NO: 2. In some embodiments, the RBD domain comprises the amino acid sequence of SEQ ID NO: 2. In some embodiments, the RBD domain has at least 90% (e.g., at least 95%, 96%, 97%, 98%, or 99%) sequence identity to the amino acid sequence of SEQ ID NO: 63. In some embodiments, the RBD domain comprises the amino acid sequence of SEQ ID NO: 63.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) and an S2 region of a Spike (S) protein of a coronavirus (e.g., SARS-CoV-2) and a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) or m6A modification motif sequence, wherein the IRES or m6A modification motif sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a signal peptide (SP) fused to the N-terminus of the S protein or fragment thereof (e.g., human tPA or IgE signal peptide). In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) comprising from the 5′ end to the 3′ end: an internal ribosomal entry site (IRES) sequence, a Kozak sequence, a nucleic acid sequence encoding a signal peptide (SP), and a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) and an S2 region of a Spike (S) protein of a coronavirus (e.g., derived from the wild type SARS-CoV-2 or a variant such as the B.1.351 or B.1.617.2 variant of SARS-CoV-2) and a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the IRES sequence is a CVB3 virus, EV71 virus, EMCV virus, PV virus, or a CSFV virus IRES sequence. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence, the nucleic acid sequence comprising from the 5′ end to the 3′ end: an m6A modification motif sequence, a Kozak sequence, a nucleic acid sequence encoding a signal peptide (SP), and a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) and an S2 region of a Spike (S) protein of a coronavirus (e.g., derived from the wild type SARS-CoV-2 or a variant such as the B.1.351 or B.1.617.2 variant of SARS-CoV-2) and a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the IRES sequence is a CVB3 virus, EV71 virus, EMCV virus, PV virus, or a CSFV virus IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide, the antigenic polypeptide comprising amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) or m6A modification motif sequence, wherein the IRES or m6A modification motif sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a signal peptide (SP) fused to the N-terminus of the S protein or fragment thereof. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.
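
Because residue ranges such as "2-1273" are stated with 1-based, inclusive numbering relative to a reference sequence, extracting them from a zero-indexed string requires an offset. The sketch below shows that conversion under stated assumptions: the function name is hypothetical, and the sequences are toy placeholders rather than SEQ ID NO: 1.

```python
def residues(sequence: str, start: int, end: int) -> str:
    """Return residues start..end of `sequence`, using 1-based inclusive numbering."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("residue range falls outside the reference sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return sequence[start - 1:end]


# Toy 10-residue reference: residues 2-10 drop only the first residue.
print(residues("MKTAYIAKQR", 2, 10))

# Placeholder of length 1273: residues 2-1273 yield 1272 residues.
placeholder = "M" + "A" * 1272
print(len(residues(placeholder, 2, 1273)))
```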


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide, the antigenic polypeptide comprising amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) or m6A modification motif sequence, and a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a signal peptide (SP) fused to the N-terminus of the S protein or fragment thereof. In some embodiments, the signal peptide is, for example, human tissue plasminogen activator (tPA) or IgE signal peptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence, the nucleic acid sequence comprising from the 5′ end to the 3′ end: an internal ribosomal entry site (IRES) sequence, a Kozak sequence, a nucleic acid sequence encoding a signal peptide (SP), and a nucleic acid sequence encoding an antigenic polypeptide, the antigenic polypeptide comprising amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the IRES sequence is a CVB3 virus, EV71 virus, EMCV virus, PV virus, or a CSFV virus IRES sequence. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence, the nucleic acid sequence comprising from the 5′ end to the 3′ end: an m6A modification motif sequence, a Kozak sequence, a nucleic acid sequence encoding a signal peptide (SP), and a nucleic acid sequence encoding an antigenic polypeptide, the antigenic polypeptide comprising amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the IRES sequence is a CVB3 virus, EV71 virus, EMCV virus, PV virus, or a CSFV virus IRES sequence. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of SARS-CoV-2 wherein the antigenic polypeptide comprises an S2 region of the S protein. In some embodiments, the S2 region comprises amino acid residues 686 to 1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region comprises the amino acid sequence of SEQ ID NO: 6. In some embodiments, the S2 region comprises one or more mutations that stabilize a pre-fusion conformation of the S protein. In some embodiments, the S2 region comprises K986P and V987P mutations. In some embodiments, the S2 region comprises the amino acid sequence of SEQ ID NO: 7. In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) or m6A modification motif sequence, wherein the IRES or m6A modification motif sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a signal peptide (SP) fused to the N-terminus of the S protein or fragment thereof (e.g., human tPA or IgE signal peptide). In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence set forth in any one of SEQ ID NOs: 11-15 or SEQ ID NOs: 48-49.
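
For illustration only, the residue numbering used above (an S2 region spanning residues 686 to 1273, and K986P and V987P substitutions numbered against the full-length S protein) can be sketched in Python. The helper names `subsequence` and `apply_substitutions` and the placeholder sequence are hypothetical and are not part of the Sequence Listing; SEQ ID NO: 1 is not reproduced here, so a stand-in sequence of the same length is used.

```python
import re


def subsequence(full_length: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not (1 <= start <= end <= len(full_length)):
        raise ValueError("residue range lies outside the sequence")
    return full_length[start - 1:end]


def apply_substitutions(full_length: str, substitutions: list[str]) -> str:
    """Apply point substitutions written as e.g. 'K986P' (1-based numbering)."""
    residues = list(full_length)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        wild_type = match.group(1)
        position = int(match.group(2))
        replacement = match.group(3)
        if not (1 <= position <= len(residues)):
            raise ValueError(f"position {position} lies outside the sequence")
        # Refuse to mutate if the reference residue is not the expected one.
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


# Placeholder 1273-residue sequence (NOT SEQ ID NO: 1): alanine everywhere,
# with K and V placed at positions 986 and 987 so the example can be checked.
placeholder = list("A" * 1273)
placeholder[985] = "K"
placeholder[986] = "V"
placeholder_s = "".join(placeholder)

stabilized = apply_substitutions(placeholder_s, ["K986P", "V987P"])
s2_region = subsequence(stabilized, 686, 1273)
```

In this sketch the substitutions are applied in full-length numbering before the S2 region is excised, so that positions 986 and 987 do not need to be re-indexed relative to the fragment.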


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2), further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence, the nucleic acid sequence comprising from the 5′ end to the 3′ end: an internal ribosomal entry site (IRES) sequence, a Kozak sequence, a nucleic acid sequence encoding a signal peptide (SP), a nucleic acid sequence encoding an antigenic polypeptide comprising an S protein or fragment thereof of a coronavirus, and an in-frame 2A peptide coding sequence. In some embodiments, the IRES sequence is a CVB3 virus, EV71 virus, EMCV virus, PV virus, or a CSFV virus IRES sequence. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) comprising from the 5′ end to the 3′ end: an m6A modification motif sequence, a Kozak sequence, a nucleic acid sequence encoding a signal peptide (SP), a nucleic acid sequence encoding an antigenic polypeptide comprising an S protein or fragment thereof of a coronavirus, and an in-frame 2A peptide coding sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., derived from the wild type SARS-CoV-2 or a variant such as the B.1.351 or B.1.617.2 variant of SARS-CoV-2), further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises an m6A modification motif sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., derived from the wild type SARS-CoV-2 or a variant such as the B.1.351 or B.1.617.2 variant of SARS-CoV-2), further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence (e.g., a CVB3 virus, EV71 virus, EMCV virus, PV virus, or CSFV virus IRES sequence) operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof, and a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the 3′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 39. In some embodiments, the 5′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 40. In some embodiments, the 3′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 39, and the 5′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 40.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) of a Spike (S) protein of a coronavirus (e.g., derived from the wild type SARS-CoV-2 or a variant such as the B.1.351 or B.1.617.2 variant of SARS-CoV-2), further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) or m6A modification motif sequence, wherein the IRES or m6A modification motif sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) of a Spike (S) protein of a coronavirus (e.g., derived from the wild type SARS-CoV-2 or a variant such as the B.1.351 or B.1.617.2 variant of SARS-CoV-2), further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) or m6A modification motif sequence, wherein the IRES or m6A modification motif sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) and an S2 region of a Spike (S) protein of a coronavirus (e.g., derived from the wild type SARS-CoV-2 or a variant such as the B.1.351 or B.1.617.2 variant of SARS-CoV-2), further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., a C-terminal Foldon domain of a T4 fibritin protein or a GCN4-based isoleucine zipper domain). In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence (e.g., a CVB3 virus, EV71 virus, EMCV virus, PV virus, or CSFV virus IRES sequence) or an m6A modification motif sequence, wherein the IRES or m6A modification motif sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) and an S2 region of a Spike (S) protein of a coronavirus (e.g., derived from the wild type SARS-CoV-2 or a variant such as the B.1.351 or B.1.617.2 variant of SARS-CoV-2), further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence (e.g., a CVB3 virus, EV71 virus, EMCV virus, PV virus, or CSFV virus IRES sequence) operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof, and a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the RBD domain has at least 90% (e.g., at least 95%, 96%, 97%, 98%, 99%) sequence identity to the amino acid sequence of SEQ ID NO: 2. In some embodiments, the RBD domain comprises the amino acid sequence of SEQ ID NO: 2. In some embodiments, the RBD domain has at least 90% (e.g., at least 95%, 96%, 97%, 98%, 99%) sequence identity to the amino acid sequence of SEQ ID NO: 63. In some embodiments, the RBD domain comprises the amino acid sequence of SEQ ID NO: 63.
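
As a non-limiting illustration of the "at least 90% sequence identity" criterion, the following Python sketch computes percent identity for two sequences that have already been aligned to equal length. This passage does not specify an alignment method or an identity formula, so the convention used here (identical, non-gap columns divided by the total alignment length, with "-" as the gap character) is an assumption, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: identical non-gap columns / alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(
    aligned_a: str, aligned_b: str, threshold: float = 90.0
) -> bool:
    """True if the two aligned sequences share at least `threshold` % identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions (for example, dividing by the length of the shorter sequence or excluding gap columns from the denominator) give different values near the threshold, so the convention would need to be fixed before comparing a candidate RBD against SEQ ID NO: 2 or SEQ ID NO: 63.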


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1, further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the antigenic polypeptide further comprises a multimerization domain. In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence (e.g., a CVB3 virus, EV71 virus, EMCV virus, PV virus, or CSFV virus IRES sequence) operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1, further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence (e.g., a CVB3 virus, EV71 virus, EMCV virus, PV virus, or CSFV virus IRES sequence) operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1, further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the antigenic polypeptide further comprises a multimerization domain. In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence (e.g., a CVB3 virus, EV71 virus, EMCV virus, PV virus, or CSFV virus IRES sequence) operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof, and a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) and an S2 region of a Spike (S) protein of a coronavirus (e.g., SARS-CoV-2), further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the S2 region comprises amino acid residues 686 to 1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region comprises the amino acid sequence of SEQ ID NO: 6. In some embodiments, the S2 region comprises one or more mutations that stabilize a pre-fusion conformation of the S protein. In some embodiments, the S2 region comprises the amino acid sequence of SEQ ID NO: 7. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence (e.g., a CVB3 virus, EV71 virus, EMCV virus, PV virus, or CSFV virus IRES sequence) operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) and an S2 region of a Spike (S) protein of a coronavirus (e.g., SARS-CoV-2), further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the S2 region comprises amino acid residues 686 to 1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region comprises the amino acid sequence of SEQ ID NO: 6. In some embodiments, the S2 region comprises one or more mutations that stabilize a pre-fusion conformation of the S protein. In some embodiments, the S2 region comprises the amino acid sequence of SEQ ID NO: 7. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence (e.g., a CVB3 virus, EV71 virus, EMCV virus, PV virus, or CSFV virus IRES sequence) operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the present application provides a circular RNA (circRNA) vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a receptor-binding domain (RBD) and an S2 region of a Spike (S) protein of a coronavirus (e.g., SARS-CoV-2), further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the S2 region comprises amino acid residues 686 to 1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region comprises the amino acid sequence of SEQ ID NO: 6. In some embodiments, the S2 region comprises one or more mutations that stabilize a pre-fusion conformation of the S protein. In some embodiments, the S2 region comprises the amino acid sequence of SEQ ID NO: 7. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the circRNA further comprises an internal ribosomal entry site (IRES) sequence (e.g., a CVB3 virus, EV71 virus, EMCV virus, PV virus, or CSFV virus IRES sequence) operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof, and a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the circRNA vaccine provided comprises a nucleic acid sequence comprising from the 5′ end to the 3′ end: the IRES sequence, the Kozak sequence, the SP, the nucleic acid sequence encoding the antigenic polypeptide, and the in-frame 2A peptide coding sequence. In some embodiments, the circRNA vaccine further comprises a polyA or polyAC sequence disposed at the 5′ end of the IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.


In some embodiments, the circRNA vaccine provided comprises a nucleic acid sequence comprising from the 5′ end to the 3′ end: the polyA or polyAC sequence, the IRES sequence, the Kozak sequence, the SP, the nucleic acid sequence encoding the antigenic polypeptide, and the in-frame 2A peptide coding sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide.
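
By way of a non-limiting sketch, the 5′ to 3′ arrangement recited above (polyA or polyAC sequence, IRES sequence, Kozak sequence, SP, antigenic polypeptide coding sequence, in-frame 2A peptide coding sequence) can be represented in Python as an ordered concatenation. The element keys, the function name `assemble_cassette`, and the bracketed tokens standing in for sequences are all hypothetical; no actual sequence from the Sequence Listing is reproduced, and the polyA or polyAC element is treated as optional, as in the embodiments above.

```python
# 5' -> 3' order of the elements recited in the embodiment above.
CASSETTE_ORDER = ("polyA_or_polyAC", "IRES", "Kozak", "SP", "antigen", "2A")

# Elements assumed mandatory for this sketch; the polyA or polyAC spacer
# is treated as optional.
REQUIRED = frozenset({"IRES", "Kozak", "SP", "antigen", "2A"})


def assemble_cassette(parts: dict[str, str]) -> str:
    """Concatenate the supplied elements in 5' -> 3' order."""
    unknown = set(parts) - set(CASSETTE_ORDER)
    if unknown:
        raise ValueError(f"unknown elements: {sorted(unknown)}")
    missing = REQUIRED - set(parts)
    if missing:
        raise ValueError(f"missing elements: {sorted(missing)}")
    # The caller's dictionary order is ignored; CASSETTE_ORDER decides.
    return "".join(parts[name] for name in CASSETTE_ORDER if name in parts)


# Bracketed tokens are placeholders, not real nucleotide sequences.
example_parts = {
    "antigen": "[antigen]",
    "2A": "[2A]",
    "SP": "[SP]",
    "Kozak": "[Kozak]",
    "IRES": "[IRES]",
    "polyA_or_polyAC": "[polyAC]",
}
with_spacer = assemble_cassette(example_parts)

without_spacer = assemble_cassette(
    {k: v for k, v in example_parts.items() if k != "polyA_or_polyAC"}
)
```

The flanking 3′ and 5′ exon sequences are left out of this sketch because their exact placement relative to the other elements is construct-specific.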


In some embodiments, there is provided a circular RNA (circRNA) comprising a nucleic acid sequence comprising from the 5′ end to the 3′ end: the m6A modification motif sequence, the Kozak sequence, the SP, the nucleic acid sequence encoding the antigenic polypeptide, and the in-frame 2A peptide coding sequence. In some embodiments, the antigenic polypeptide comprises a receptor-binding domain (RBD) of the S protein. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein).


In some embodiments, there is provided a circular RNA (circRNA) comprising a nucleic acid sequence comprising from the 5′ end to the 3′ end: the m6A modification motif sequence, the Kozak sequence, the SP, and the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the antigenic polypeptide comprises a receptor-binding domain (RBD) and an S2 region of the S protein. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein).


In some embodiments, there is provided a circular RNA (circRNA) comprising a nucleic acid sequence comprising from the 5′ end to the 3′ end: the m6A modification motif sequence, the Kozak sequence, the SP, and a sequence encoding the antigenic polypeptide. In some embodiments, the antigenic polypeptide comprises amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein). In some embodiments, the circRNA comprises an in-frame 2A peptide coding sequence following the sequence encoding the antigenic polypeptide.


III. METHODS OF TREATMENT

The circRNAs and compositions described herein may be used to treat or prevent a disease or condition in an individual, including, but not limited to, genetic diseases (e.g., hereditary genetic diseases, metabolic diseases and cancer), and infections (e.g., viral infections such as coronavirus infections). In some embodiments, the circRNA is subject to rolling circle translation by a ribosome in the individual.


In some embodiments, there is provided a method of treating or preventing a disease or condition in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide. In some embodiments, the antigenic polypeptide is a protein or a fragment thereof of an infectious agent, such as a virus, e.g., a coronavirus. In some embodiments, the infectious agent is SARS-CoV-2. In some embodiments, the antigenic polypeptide is an S protein or fragment thereof. In some embodiments, the disease or condition is a coronavirus infection. In some embodiments, the method comprises administering an effective amount of a cocktail composition comprising a plurality of circRNAs encoding different antigenic polypeptides.


In some embodiments, there is provided a method of treating or preventing a disease or condition in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding a functional protein. In some embodiments, the functional protein is an enzyme, a receptor, a ligand, a signaling molecule, or a transcription factor. In some embodiments, the disease or condition is a metabolic disease. In some embodiments, the disease or condition is a lysosomal storage disorder. In some embodiments, the disease or condition is a cancer.


In some embodiments, there is provided a method of treating or preventing a disease or condition in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding a receptor protein. In some embodiments, the receptor protein is a receptor of an infectious agent, such as a virus, e.g., a coronavirus. In some embodiments, the receptor protein is a soluble receptor, such as a soluble ACE2 receptor.


In some embodiments, there is provided a method of treating or preventing a disease or condition in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding a targeting protein, such as an antibody. In some embodiments, the targeting protein is a neutralizing antibody. In some embodiments, the targeting protein is a therapeutic antibody. In some embodiments, the targeting protein specifically binds an infectious agent, such as a virus, e.g., a coronavirus.


In some embodiments, the present application provides circRNAs for use in treating or preventing a disease or condition in an individual.


In some embodiments, the present application provides circRNA vaccines for use in treating or preventing a coronavirus (e.g., SARS-CoV, MERS-CoV, or SARS-CoV-2) infection in an individual.


In some embodiments, the present application provides use of a circRNA comprising a nucleic acid sequence encoding a therapeutic polypeptide for the manufacture of a medicament for treating or preventing a disease or condition in an individual.


In some embodiments, the present application provides use of a circRNA vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising a Spike (S) protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2) for the manufacture of a vaccine for treating or preventing a coronavirus infection in an individual.


A. Treating a Genetic Disease or Condition

The circRNAs described herein may be used for treating a genetic disease or condition that is associated with a mutation or deficiency in a naturally-occurring protein corresponding to the therapeutic polypeptide encoded by the circRNA. In some embodiments, the disease or condition is a disease or condition associated with insufficient levels and/or activity of a naturally-occurring protein corresponding to the therapeutic polypeptide. In some embodiments, the disease or condition is a hereditary genetic disease associated with one or more mutations in a naturally-occurring protein corresponding to the therapeutic polypeptide. In some embodiments, the therapeutic polypeptide is a wildtype protein, or a functional variant thereof (e.g., a functional fragment, fusion protein, or mutant).


In some aspects, the present application provides methods and compositions for treatment of a disease or condition associated with a deficiency of a functional protein, such as an enzyme (e.g., IDUA), using a circRNA expressing a therapeutic polypeptide. In some embodiments, the circRNA comprises a nucleotide sequence encoding the protein or a derivative thereof. In some embodiments, the circRNA is capable of expressing a functional protein or functional derivative of a protein that is capable of restoring function of the protein associated with the disease or condition. In some embodiments, the circRNA is capable of restoring 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% of the activity of the protein as compared to the endogenous wild-type protein in a cell or organism (e.g., a mouse or a human), e.g., up to 8, 12, 16, 24, 30, 36, or 40 hours after administration of the circRNA.


In some embodiments, the therapeutic polypeptide can be any polypeptide that is capable of being expressed by target cells (e.g., human or mouse cells) for the production (and in certain instances, the excretion) of a functional enzyme or protein as disclosed, for example, in International Application No. PCT/US2010/058457. In some embodiments, the therapeutic polypeptide can be engineered for secretion by operably linking a signal peptide to the amino terminus of the therapeutic polypeptide. For example, in some embodiments, upon the expression of one or more therapeutic polynucleotides by target cells, the production of a functional enzyme or protein in which a subject is deficient (e.g., a urea cycle enzyme or an enzyme associated with a lysosomal storage disorder) may be observed.


Examples of disease-associated mutations that may be treated by the methods of the present application include, but are not limited to, TP53W53X (e.g., 158G>A) associated with cancer, IDUAW402X (e.g., TGG>TAG mutation in exon 9) associated with Mucopolysaccharidosis type I (MPS I), COL3A1W1278X (e.g., 3833G>A mutation) associated with Ehlers-Danlos syndrome, BMPR2W298X (e.g., 893G>A) associated with primary pulmonary hypertension, AHI1W725X (e.g., 2174G>A) associated with Joubert syndrome, FANCCW506X (e.g., 1517G>A) associated with Fanconi anemia, MYBPC3W1098X (e.g., 3293G>A) associated with primary familial hypertrophic cardiomyopathy, and IL2RGW237X (e.g., 710G>A) associated with X-linked severe combined immunodeficiency. In some embodiments, the disease or condition is a cancer. In some embodiments, the disease or condition is a monogenetic disease. In some embodiments, the disease or condition is a polygenetic disease.
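

The coding-sequence positions listed in the examples above are consistent with a simple codon calculation. The following is a minimal sketch, assuming that each tryptophan-to-stop (W to X) mutation is a TGG>TAG change at the middle base of the codon and that coding-sequence numbering starts at 1 on the A of the start codon, so that codon n spans positions 3n-2 to 3n. The function name and the lookup table are hypothetical and are used only for illustration.

```python
def tgg_to_tag_position(codon_number: int) -> int:
    """Return the 1-based coding-sequence position of the middle base of a codon.

    Assumption: numbering starts at 1 on the A of the ATG start codon, so
    codon n spans positions 3n-2 .. 3n and its middle base is at 3n-1.
    """
    if codon_number < 1:
        raise ValueError("codon_number must be >= 1")
    return 3 * codon_number - 1


# Codon numbers and G>A positions as listed in the paragraph above.
LISTED_EXAMPLES = {
    "TP53 W53X": (53, 158),
    "COL3A1 W1278X": (1278, 3833),
    "BMPR2 W298X": (298, 893),
    "AHI1 W725X": (725, 2174),
    "FANCC W506X": (506, 1517),
    "MYBPC3 W1098X": (1098, 3293),
    "IL2RG W237X": (237, 710),
}


def check_listed_examples() -> bool:
    """Return True if every listed position equals 3n-1 for its codon number."""
    return all(
        tgg_to_tag_position(codon) == position
        for codon, position in LISTED_EXAMPLES.values()
    )
```

Under these assumptions, a TGG>TGA change would instead fall on the third base of the codon, at position 3n.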


In some embodiments, the disease or condition is a liver disease or condition. In some embodiments, the disease or condition is a disease or condition of the respiratory tract of the individual, such as a lung disease or condition.


In some embodiments, there is provided a method of treating a cancer in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding a tumor suppressor. In some embodiments, the tumor suppressor is TP53 (including a functional variant thereof). In some embodiments, the tumor suppressor is PTEN (including a functional variant thereof). In some embodiments, the tumor suppressor comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 24 or 25.


In some embodiments, there is provided a method of treating a lysosomal storage disorder in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding a lysosomal enzyme.


In some embodiments, there is provided a method of treating a liver disease or condition in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding a liver protein (e.g., an enzyme).


In some embodiments, there is provided a method of treating Mucopolysaccharidosis type I (MPS I) in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding IDUA (including a functional variant thereof). In some embodiments, the IDUA comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 18. In some embodiments, the IDUA comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 19.


In some embodiments, there is provided a method of treating ornithine transcarbamylase deficiency in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding OTC (including a functional variant thereof). In some embodiments, the OTC comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 20. In some embodiments, the OTC comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 56.


In some embodiments, there is provided a method of treating tyrosinemia in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding FAH (including a functional variant thereof). In some embodiments, the FAH comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 54. In some embodiments, the FAH comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 21.


In some embodiments, there is provided a method of treating Duchenne and Becker muscular dystrophy, X-linked dilated cardiomyopathy, or familial dilated cardiomyopathy in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding DMD (including a functional variant thereof, e.g., miniDMD). In some embodiments, the DMD comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 23. In some embodiments, the DMD comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 22.


In some embodiments, there is provided a method of treating Ehlers-Danlos syndrome in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding COL3A1 (including a functional variant thereof). In some embodiments, the COL3A1 comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 56.


In some embodiments, there is provided a method of treating Joubert syndrome in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding AHI1 (including a functional variant thereof). In some embodiments, the AHI1 comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 58.


In some embodiments, there is provided a method of treating Fanconi anemia in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding FANCC (including a functional variant thereof). In some embodiments, the FANCC comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 59.


In some embodiments, there is provided a method of treating primary familial hypertrophic cardiomyopathy in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding MYBPC3 (including a functional variant thereof). In some embodiments, the MYBPC3 comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 60.


In some embodiments, there is provided a method of treating X-linked severe combined immunodeficiency in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding IL2RG (including a functional variant thereof). In some embodiments, the IL2RG comprises an amino acid sequence having at least about 80% (e.g., at least about any one of 85%, 90%, 93%, 95%, 97%, 98%, or 99%, or 100%) sequence identity to the amino acid sequence of SEQ ID NO: 61.
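

The percentage thresholds recited in the preceding embodiments (e.g., at least about 80%) can be illustrated with a percent identity calculation. The following is a minimal sketch only, assuming that the two amino acid sequences have already been aligned to equal length with "-" as the gap character, and assuming one common convention in which identity is the number of matching columns divided by the number of aligned columns (columns where both sequences carry a gap are ignored). Other conventions use a different denominator and give different values. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences have a gap.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """Return True if percent identity is at or above the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy 10-residue sequences, not taken from any SEQ ID NO.
    print(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"))
    print(meets_threshold("MKTAYIAKQR", "MKTAYLAKQR", threshold=80.0))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch covers only the final counting step.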


In some embodiments, the circRNA has a functional half-life of at least or at least about 20 hours, 24 hours, 30 hours, or 36 hours. In some embodiments, the circRNA has a duration of therapeutic effect in a human cell of at least or at least about 20 hours, 24 hours, 30 hours, or 36 hours. In some embodiments, the circRNA has a duration of therapeutic effect in a human cell greater than or equal to that of an equivalent linear RNA comprising the same expression sequence. In some embodiments, the circRNA has a functional half-life in a human cell greater than or equal to that of an equivalent linear RNA comprising the same expression sequence.


In some embodiments, the therapeutic polypeptide comprises IDUA, and the disease or condition is Hurler Syndrome. In some embodiments, administration of the circRNA restores at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100% of α-L-iduronidase in a human or an animal model with a mutation in IDUA compared to wild-type. In some embodiments, the catalytic activity of IDUA increases from 4 to 24 hours (e.g., from 4 to 8 hours, from 8 to 12 hours, from 12 to 16 hours, from 16 to 20 hours, and/or from 16 to 24 hours) following administration of the circRNA encoding IDUA. In some embodiments, the circRNA encoding IDUA has a functional half-life of at least or at least about 20 hours, 24 hours, 30 hours, or 36 hours.


B. Treating or Preventing a Coronavirus Infection

The present application provides methods of treating or preventing a coronavirus infection (e.g., a SARS-CoV-2 infection) in an individual, comprising administering to the individual an effective amount of the circRNA of any one of the embodiments described herein, wherein the circRNA encodes an antigenic polypeptide or a receptor protein (e.g., soluble receptor) of the coronavirus, or a neutralizing antibody specifically binding the coronavirus. In some embodiments, the coronavirus is SARS-CoV, MERS-CoV, or SARS-CoV-2. In some embodiments, the coronavirus is SARS-CoV-2. In some embodiments, the present application provides methods of preventing or decreasing the risk of a coronavirus infection (e.g., a SARS-CoV-2 infection) in an individual, comprising administering to the individual an effective amount of the circRNA of any one of the embodiments described above, wherein the circRNA encodes an antigenic polypeptide or a receptor protein (e.g., soluble receptor) of the coronavirus, or a neutralizing antibody specifically binding the coronavirus. In some embodiments, the method comprises administering a cocktail composition comprising a plurality of circRNAs encoding different antigenic polypeptides, receptor proteins, or neutralizing antibodies. In some embodiments, the circRNA is subject to rolling circle translation by a ribosome in the individual. In some embodiments, the circRNA is administered as naked circRNA, or as a pharmaceutical composition comprising a transfection agent.


In some embodiments, there is provided a method of treating or preventing a coronavirus infection (e.g., a SARS-CoV-2 infection) in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding a receptor protein of the coronavirus. In some embodiments, the coronavirus is SARS-CoV-2. In some embodiments, the receptor protein is a soluble receptor, such as a soluble ACE2 receptor. In some embodiments, the method comprises administering an effective amount of a cocktail composition comprising a plurality of circRNAs encoding different receptor proteins.


In some embodiments, there is provided a method of treating or preventing a coronavirus infection (e.g., a SARS-CoV-2 infection) in an individual, comprising administering to the individual an effective amount of a circRNA comprising a nucleic acid sequence encoding a neutralizing antibody that specifically binds the coronavirus. In some embodiments, the coronavirus is SARS-CoV-2. In some embodiments, the method comprises administering an effective amount of a cocktail composition comprising a plurality of circRNAs encoding different neutralizing antibodies.


In some embodiments, the present application provides methods of treating or preventing a coronavirus infection in an individual, comprising administering to the individual an effective amount of the circRNA vaccine of any one of the embodiments described herein. In some embodiments, the coronavirus is SARS-CoV, MERS-CoV, or SARS-CoV-2. In some embodiments, the coronavirus is SARS-CoV-2. In some embodiments, the present application provides methods of preventing or decreasing the risk of a coronavirus infection (e.g., a SARS-CoV-2 infection) in an individual, comprising administering to the individual an effective amount of the circRNA vaccine of any one of the embodiments described above. In some embodiments, the circRNA is subject to rolling circle translation by a ribosome in the individual. In some embodiments, the circRNA vaccine is administered as naked circRNA, or as a pharmaceutical composition comprising a transfection agent.


In some embodiments, the present application provides methods of treating or preventing a coronavirus infection in an individual, comprising administering to the individual an effective amount of the circRNA vaccine of any one of the embodiments described herein. In some embodiments, the coronavirus is a wild-type strain of SARS-CoV-2 or a variant strain of SARS-CoV-2. In some embodiments, the coronavirus is SARS-CoV-2. In some embodiments, the SARS-CoV-2 is an Alpha (B.1.1.7), Beta (B.1.351, B.1.351.2, B.1.351.3), Delta (B.1.617.2, AY.1, AY.2, AY.3), or Gamma (P.1, P.1.1, P.1.2) variant of SARS-CoV-2. In some embodiments, the variant can be any variant described on cdc.gov/coronavirus/2019-ncov/variants/. In some embodiments, the present application provides methods of preventing or decreasing the risk of a coronavirus infection (e.g., a SARS-CoV-2 infection, such as an infection with any of the variant SARS-CoV-2 strains described herein) in an individual, comprising administering to the individual an effective amount of the circRNA vaccine of any one of the embodiments described above. In some embodiments, the circRNA vaccine encodes an S protein or fragment thereof comprising one, two, or three of the mutations selected from the group consisting of K417N, L452R, and T478K, wherein the amino acid numbering is based on SEQ ID NO: 1. In some embodiments, the circRNA vaccine encodes an S protein or fragment thereof comprising one, two, three, four, five, or more mutations selected from the group consisting of residue 69 deletion, residue 70 deletion, residue 144 deletion, E484K, S494P, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H, and K1191N, wherein the amino acid numbering is based on SEQ ID NO: 1. In some embodiments, the circRNA vaccine encodes an S protein or fragment thereof comprising one, two, or three of the mutations selected from the group consisting of E484K, S494P, and N501Y, wherein the amino acid numbering is based on SEQ ID NO: 1.
In some embodiments, the circRNA vaccine encodes an S protein or fragment thereof comprising one, two, three, four, five, or more mutations selected from the group consisting of D80A, D215G, 241del, 242del, 243del, K417N, E484K, N501Y, D614G, and A701V, wherein the amino acid numbering is based on SEQ ID NO: 1. In some embodiments, the circRNA vaccine encodes an S protein or fragment thereof comprising one, two, or three of the mutations selected from the group consisting of K417N, E484K, and N501Y, wherein the amino acid numbering is based on SEQ ID NO: 1. In some embodiments, the circRNA vaccine encodes an S protein or fragment thereof comprising one, two, three, four, five, or more mutations selected from the group consisting of L18F, T20N, P26S, D138Y, R190S, K417T, E484K, N501Y, D614G, H655Y, and T1027I, wherein the amino acid numbering is based on SEQ ID NO: 1. In some embodiments, the circRNA vaccine encodes an S protein or fragment thereof comprising one, two, or three of the mutations selected from the group consisting of K417T, E484K, and N501Y.
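

The mutation labels used above follow a reference-residue, position, replacement-residue pattern (e.g., K417N) or a position-plus-"del" pattern (e.g., 241del), with positions counted on a reference sequence. The following is a minimal sketch of how such labels could be parsed and applied to a reference amino acid sequence, assuming 1-based numbering on that reference. The function names are hypothetical, and the short toy sequence stands in for the reference; it is not SEQ ID NO: 1.

```python
import re
from typing import Dict, List, Optional, Tuple

SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")  # e.g. K417N
DELETION = re.compile(r"^(\d+)del$")                 # e.g. 241del


def parse_mutation(label: str) -> Tuple[int, Optional[str], Optional[str]]:
    """Parse 'K417N' -> (417, 'K', 'N') or '241del' -> (241, None, None)."""
    match = SUBSTITUTION.match(label)
    if match:
        return int(match.group(2)), match.group(1), match.group(3)
    match = DELETION.match(label)
    if match:
        return int(match.group(1)), None, None
    raise ValueError(f"unrecognized mutation label: {label}")


def apply_mutations(reference: str, labels: List[str]) -> str:
    """Apply substitutions and deletions using 1-based reference numbering."""
    # Key residues by reference position so deletions do not shift later positions.
    residues: Dict[int, str] = {i + 1: aa for i, aa in enumerate(reference)}
    for label in labels:
        position, ref_aa, alt_aa = parse_mutation(label)
        if position not in residues:
            raise ValueError(f"{label}: position outside the reference")
        if alt_aa is None:
            residues[position] = ""  # deletion
        else:
            if residues[position] != ref_aa:
                raise ValueError(f"{label}: reference residue mismatch")
            residues[position] = alt_aa
    return "".join(residues[i] for i in sorted(residues))


if __name__ == "__main__":
    toy_reference = "MKTAYIAKQR"  # toy sequence, positions 1-10
    print(apply_mutations(toy_reference, ["K2N", "5del"]))
```

Keying residues by reference position keeps every label interpretable against the original numbering, even when several deletions are combined.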


In some embodiments, the present application provides methods of treating or preventing an infection by multiple strains of a coronavirus (e.g., multiple strains of SARS-CoV-2) in an individual, comprising administering to the individual an effective amount of the circRNA vaccine of any one of the embodiments described herein. In some embodiments, the present application provides methods of treating or preventing an infection by multiple strains of a coronavirus (e.g., multiple strains of SARS-CoV-2) in an individual, comprising administering to the individual an effective amount of multiple different circRNA vaccines of any one of the embodiments described herein. In some embodiments, the method comprises administering to the individual a composition comprising a plurality of (e.g., two or more) circRNAs, wherein a first circRNA encodes an S protein or fragment thereof of a first strain of a coronavirus, and a second circRNA encodes an S protein or fragment thereof of a second strain of a coronavirus. In some embodiments, at least one of the circRNAs of the plurality encodes an S protein or fragment thereof comprising the mutations found in the D614G variant, the B.1.1.7/501Y.V1 variant, or the B.1.351/501Y.V2 variant of SARS-CoV-2.


In some embodiments, the present application provides methods of treating or preventing a coronavirus infection in an individual, comprising administering to the individual an effective amount of a circRNA vaccine comprising a circRNA comprising a nucleic acid sequence encoding an antigenic polypeptide comprising an S protein or a fragment thereof of the coronavirus (e.g., SARS-CoV-2). In some embodiments, the antigenic polypeptide comprises a RBD of the S protein. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal Fd domain, or a GCN-4 based isoleucine zipper domain). In some embodiments, the antigenic polypeptide comprises an S2 region of the S protein. In some embodiments, the antigenic polypeptide comprises amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region of the S protein comprises one or more mutations (e.g., K986P and V987P) that stabilize a pre-fusion conformation of the S protein. In some embodiments, the antigenic polypeptide comprises one or more mutations (e.g., deletion of amino acid residues 681-684) that inhibit cleavage of the S protein. In some embodiments, the antigenic polypeptide comprises an S protein or fragment thereof of SARS-CoV-2 having a D614G mutation. In some embodiments, the circRNA comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 11-15. In some embodiments, the circRNA is subject to rolling circle translation by a ribosome in the individual.


In some embodiments, the present application provides methods of treating or preventing a coronavirus infection in an individual, comprising administering to the individual an effective amount of a circRNA vaccine comprising a circRNA comprising: (a) a nucleic acid sequence encoding an antigenic polypeptide comprising an S protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2), and (b) an IRES sequence, wherein the IRES sequence is operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof. In some embodiments, the circRNA further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA comprises a nucleic acid sequence comprising from the 5′ end to the 3′ end: the IRES sequence, the Kozak sequence, the SP, and the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a polyA or polyAC sequence disposed at the 5′ end of an IRES sequence. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. 
In some embodiments, the circRNA further comprises a 5′ ligation sequence at the 5′ end of the circRNA, and a 3′ ligation sequence at the 3′ end of the circRNA, wherein the 5′ ligation sequence and the 3′ ligation sequence are ligated to each other via a ligase (e.g., T4 RNA ligase). In some embodiments, the antigenic polypeptide comprises a RBD of the S protein. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal Fd domain, or a GCN-4 based isoleucine zipper domain). In some embodiments, the antigenic polypeptide comprises an S2 region of the S protein. In some embodiments, the antigenic polypeptide comprises amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region of the S protein comprises one or more mutations (e.g., K986P and V987P) that stabilize a pre-fusion conformation of the S protein. In some embodiments, the antigenic polypeptide comprises one or more mutations (e.g., deletion of amino acid residues 681-684) that inhibit cleavage of the S protein. In some embodiments, the antigenic polypeptide comprises an S protein or fragment thereof of SARS-CoV-2 having a D614G mutation. In some embodiments, the circRNA comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 11-15. In some embodiments, the circRNA is subject to rolling circle translation by a ribosome in the individual. In some embodiments, the circRNA vaccine is administered via intramuscular (i.m) injection. In some embodiments, one or more doses of the circRNA vaccine are administered. In some embodiments, the interval between doses is about 2 weeks (e.g., 12, 13, 14, 15, or 16 days). In some embodiments, the method comprises administering a first dose of the circRNA vaccine and administering a second dose of the circRNA vaccine after 2 weeks or about 2 weeks.


In some embodiments, the present application provides methods of treating or preventing a coronavirus infection in an individual, comprising administering to the individual an effective amount of a circRNA vaccine comprising a circRNA comprising: (a) a nucleic acid sequence encoding an antigenic polypeptide comprising an S protein or a fragment thereof of a coronavirus (e.g., SARS-CoV-2), and (b) an m6A modification motif sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the nucleic acid sequence further encodes a SP (e.g., human tPA or IgE SP) fused to the N-terminus of the S protein or fragment thereof. In some embodiments, the circRNA further comprises a Kozak sequence operably linked to the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA comprises a nucleic acid sequence comprising from the 5′ end to the 3′ end: the m6A modification motif sequence, the Kozak sequence, the SP, and the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the antigenic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the antigenic polypeptide. In some embodiments, the circRNA further comprises a 5′ ligation sequence at the 5′ end of the circRNA, and a 3′ ligation sequence at the 3′ end of the circRNA, wherein the 5′ ligation sequence and the 3′ ligation sequence are ligated to each other via a ligase (e.g., T4 RNA ligase). In some embodiments, the antigenic polypeptide comprises a RBD of the S protein. In some embodiments, the antigenic polypeptide further comprises a multimerization domain (e.g., C-terminal Fd domain, or a GCN-4 based isoleucine zipper domain). 
In some embodiments, the antigenic polypeptide comprises an S2 region of the S protein. In some embodiments, the antigenic polypeptide comprises amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1. In some embodiments, the S2 region of the S protein comprises one or more mutations (e.g., K986P and V987P) that stabilize a pre-fusion conformation of the S protein. In some embodiments, the antigenic polypeptide comprises one or more mutations (e.g., deletion of amino acid residues 681-684) that inhibit cleavage of the S protein. In some embodiments, the antigenic polypeptide comprises an S protein or fragment thereof of SARS-CoV-2 having a D614G mutation. In some embodiments, the circRNA comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 11-15. In some embodiments, the circRNA is subject to rolling circle translation by a ribosome in the individual. In some embodiments, the circRNA vaccine is administered via intramuscular (i.m) injection. In some embodiments, one or more doses of the circRNA vaccine are administered. In some embodiments, the interval between doses is about 2 weeks (e.g., 12, 13, 14, 15, or 16 days). In some embodiments, the method comprises administering a first dose of the circRNA vaccine and administering a second dose of the circRNA vaccine after 2 weeks or about 2 weeks.
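

The two-dose schedule described above, with an interval of about 2 weeks (e.g., 12 to 16 days), can be sketched as a simple date calculation. This is an illustration only, assuming whole calendar days and treating the 12 to 16 day range given in the example as the acceptable window; the function names and the sample date are hypothetical.

```python
from datetime import date, timedelta
from typing import Tuple


def second_dose_window(
    first_dose: date, min_days: int = 12, max_days: int = 16
) -> Tuple[date, date]:
    """Return the earliest and latest dates for the second dose."""
    if min_days > max_days:
        raise ValueError("min_days must not exceed max_days")
    return (
        first_dose + timedelta(days=min_days),
        first_dose + timedelta(days=max_days),
    )


def is_within_window(
    first_dose: date, second_dose: date, min_days: int = 12, max_days: int = 16
) -> bool:
    """Return True if the second dose falls inside the window (inclusive)."""
    earliest, latest = second_dose_window(first_dose, min_days, max_days)
    return earliest <= second_dose <= latest


if __name__ == "__main__":
    first = date(2021, 8, 1)  # arbitrary example date
    print(second_dose_window(first))
    print(is_within_window(first, date(2021, 8, 15)))
```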


C. Formulation and Administration

In some embodiments, the circRNA composition for administration (e.g., circRNA vaccine or pharmaceutical composition) further comprises a transfection agent. In non-limiting examples, the transfection agent is polyethylenimine (PEI) or a lipid nanoparticle (LNP). Suitable lipid nanoparticles for administration of the circRNA have been described, for example, in Ickenstein, L. M. & Garidel, P. Lipid-based nanoparticle formulations for small molecules and RNA drugs. Expert Opin Drug Deliv 16, 1205-1226, doi:10.1080/17425247.2019.1669558 (2019), U.S. Patent App. Pub. No. 20200121809, U.S. Patent App. Pub. No. 20200163878, U.S. Patent App. Pub. No. 20190022247, and International Patent App. Pub. No. WO2021/030701, the contents of which are herein incorporated by reference in their entirety. In some embodiments, the LNPs are formed from a lipid mixture of MC3-lipid:DSPC:cholesterol:PEG2000-DMG. In some embodiments, the MC3-lipid:DSPC:cholesterol:PEG2000-DMG are mixed in molar ratios of 50:10:38.5:1.5.
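

The molar ratio recited above can be converted into mole fractions and per-component amounts for a chosen total lipid quantity. The following is a minimal sketch that works in moles only, so that no molecular weights need to be assumed; the function names and the 1000 nmol total are hypothetical values chosen for illustration.

```python
from typing import Dict

# Molar ratio of MC3-lipid:DSPC:cholesterol:PEG2000-DMG as recited above.
MOLAR_RATIO: Dict[str, float] = {
    "MC3-lipid": 50.0,
    "DSPC": 10.0,
    "cholesterol": 38.5,
    "PEG2000-DMG": 1.5,
}


def mole_fractions(ratio: Dict[str, float]) -> Dict[str, float]:
    """Normalize ratio parts so that the fractions sum to 1."""
    total = sum(ratio.values())
    if total <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    return {name: part / total for name, part in ratio.items()}


def component_amounts(total_lipid_nmol: float, ratio: Dict[str, float]) -> Dict[str, float]:
    """Split a total lipid amount (nmol) across components by mole fraction."""
    return {
        name: total_lipid_nmol * fraction
        for name, fraction in mole_fractions(ratio).items()
    }


if __name__ == "__main__":
    for name, nmol in component_amounts(1000.0, MOLAR_RATIO).items():
        print(f"{name}: {nmol:.1f} nmol")
```

Because the four parts sum to 100, each part of the ratio is also the mole percentage of that component.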


Other examples of transfection agents, including lipidosomes, that can be used to administer the circRNA composition for administration (e.g., circRNA vaccine or pharmaceutical composition) include protamines, cationic nanoemulsions, modified dendrimer nanoparticles, protamine liposomes, cationic polymers, cationic polymer liposomes, polysaccharide particles, cationic lipid nanoparticles, cationic lipid-cholesterol nanoparticles, cationic lipid-cholesterol PEG nanoparticles, cationic lipid transfection reagents sold under the trademark LIPOFECTAMINE, nonliposomal transfection reagents sold under the trademark FUGENE, and any combination thereof.


In some embodiments, the liposome formulation may be influenced by factors including, but not limited to, the selection of the cationic lipid component, the degree of cationic lipid saturation, the nature of the PEGylation, the ratio of all components, and biophysical parameters such as size. In some embodiments, the liposome formulation comprises a cationic lipid, cholesterol, and a PEGylated lipid. For example, a liposome formulation may comprise a cationic lipid, dipalmitoylphosphatidylcholine, cholesterol, and PEG-c-DMA. See, for example, Semple et al. Nature Biotech. 2010 28:172-176, herein incorporated by reference in its entirety. In some embodiments, liposome formulations may comprise from about 35% to about 45% cationic lipid, from about 40% to about 50% cationic lipid, from about 50% to about 60% cationic lipid and/or from about 55% to about 65% cationic lipid. In some embodiments, the ratio of lipid to RNA in liposomes may be from about 5:1 to about 20:1, from about 10:1 to about 25:1, from about 15:1 to about 30:1 and/or at least 30:1. Suitable liposome formulations have been described, for example, in WO2020237227, the contents of which are herein incorporated by reference in their entirety.


In some embodiments, the circRNA is not formulated with a transfection reagent. In some embodiments, the circRNA is delivered as naked RNA. In some embodiments, the circRNA is delivered by gene gun or by electroporation.


The circRNA composition for administration (e.g., circRNA vaccine or pharmaceutical composition) can be administered to a subject by systemic injection into the vasculature, systemic injection into the lymph nodes, subcutaneous injection or depots, or by local injection.


In some embodiments, a circRNA vaccine herein (e.g., encoding an S protein or fragment thereof of a coronavirus) is administered via intramuscular (i.m) injection. In some embodiments, one or more doses of the circRNA vaccine are administered. In some embodiments, two or more doses of the circRNA vaccine are administered. In some embodiments, the interval between doses is about 2 weeks (e.g., 12, 13, 14, 15, or 16 days). In some embodiments, the method comprises administering a first dose of the circRNA vaccine and administering a second dose of the circRNA vaccine after 2 weeks or about 2 weeks.
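The two-dose schedule above can be sketched as a simple date calculation. The following Python example is illustrative only: it treats "about 2 weeks" as the 12 to 16 day range exemplified above, and the function names `second_dose_window` and `is_about_two_weeks` are assumptions introduced for this sketch.

```python
from datetime import date, timedelta

# "About 2 weeks" is exemplified above as 12, 13, 14, 15, or 16 days.
# Treating that range as inclusive bounds is an assumption of this sketch.
MIN_INTERVAL_DAYS = 12
MAX_INTERVAL_DAYS = 16


def second_dose_window(first_dose):
    """Return the (earliest, latest) dates for the second dose."""
    earliest = first_dose + timedelta(days=MIN_INTERVAL_DAYS)
    latest = first_dose + timedelta(days=MAX_INTERVAL_DAYS)
    return earliest, latest


def is_about_two_weeks(first_dose, second_dose):
    """Check whether the interval between two doses is 12 to 16 days."""
    days = (second_dose - first_dose).days
    return MIN_INTERVAL_DAYS <= days <= MAX_INTERVAL_DAYS


if __name__ == "__main__":
    first = date(2021, 1, 1)
    earliest, latest = second_dose_window(first)
    print(f"Second dose between {earliest} and {latest}")
```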


In some embodiments, the circRNA may be formulated in a lipid nanoparticle such as those described in International Publication No. WO2012170930, herein incorporated by reference in its entirety.


In some embodiments, the synthetic nanocarriers may be formulated for controlled and/or sustained release of the circRNA described herein. As a non-limiting example, the synthetic nanocarriers for sustained release may be formulated by methods known in the art, described herein and/or as described in International Pub No. WO2010138192 and US Pub No. 20100303850, each of which is herein incorporated by reference in its entirety.


In some embodiments, the circRNA may be formulated for controlled and/or sustained release wherein the formulation comprises at least one polymer that is a crystalline side chain (CYSC) polymer. CYSC polymers are described in U.S. Pat. No. 8,399,007, herein incorporated by reference in its entirety.


In some embodiments, the synthetic nanocarrier may be formulated for use as a vaccine. In some embodiments, the synthetic nanocarrier may encapsulate at least one circRNA, which encodes at least one antigen. As a non-limiting example, the synthetic nanocarrier may include at least one antigen and an excipient for a vaccine dosage form (see International Pub No. WO2011150264 and US Pub No. US20110293723, each of which is herein incorporated by reference in its entirety). As another non-limiting example, a vaccine dosage form may include at least two synthetic nanocarriers with the same or different antigens and an excipient (see International Pub No. WO2011150249 and US Pub No. US20110293701, each of which is herein incorporated by reference in its entirety). The vaccine dosage form may be selected by methods described herein, known in the art and/or described in International Pub No. WO2011150258 and US Pub No. US20120027806, each of which is herein incorporated by reference in its entirety.


In some embodiments, the synthetic nanocarrier may comprise at least one circRNA, which encodes at least one adjuvant. As a non-limiting example, the adjuvant may comprise dimethyldioctadecylammonium-bromide, dimethyldioctadecylammoniumchloride, dimethyldioctadecylammonium-phosphate or dimethyldioctadecylammoniumacetate (DDA) and an apolar fraction or part of said apolar fraction of a total lipid extract of a mycobacterium (see, e.g., U.S. Pat. No. 8,241,610, herein incorporated by reference in its entirety). In another embodiment, the synthetic nanocarrier may comprise at least one circRNA and an adjuvant. As a non-limiting example, the synthetic nanocarrier comprising an adjuvant may be formulated by the methods described in International Pub No. WO2011150240 and US Pub No. US20110293700, each of which is herein incorporated by reference in its entirety.


In some embodiments, the circRNA functions as an adjuvant. As an example, RNA-sensing in the cytoplasm can trigger innate immunity, and innate immune signaling is known to contribute to adaptive immunity by diverse routes. Thus, the circRNA comprising the antigenic polypeptide or a second circRNA (e.g., a circRNA that does not encode a polypeptide) can be used as an adjuvant for boosting the adaptive immune response to the antigenic polypeptide.


In some embodiments, the circRNA composition for administration (e.g., circRNA vaccine or pharmaceutical composition) may be administered intranasally. For example, circRNA vaccines may be administered intranasally similar to the administration of live vaccines. In some embodiments, the circRNA may be administered intramuscularly or intradermally similarly to the administration of inactivated vaccines known in the art.


In some embodiments, the circRNA vaccine comprises an adjuvant, which may enable the vaccine to elicit a higher immune response. As a non-limiting example, the adjuvant could be a sub-micron oil-in-water emulsion, which can elicit a higher immune response in human pediatric populations (see e.g., the adjuvant-containing vaccines described in US Patent Publication No. US20120027813 and U.S. Pat. No. 8,506,966, the contents of each of which are herein incorporated by reference in their entirety).


In some embodiments, the circRNA compositions of the present application may be administered with other prophylactic or therapeutic compounds. As a non-limiting example, the prophylactic or therapeutic compound may be an adjuvant or a booster. As used herein, when referring to a prophylactic composition, such as a vaccine, the term “booster” refers to an extra administration of the prophylactic composition. A booster (or booster vaccine) may be given after an earlier administration of the prophylactic composition. The time of administration between the initial administration of the prophylactic composition and the booster may be, but is not limited to, 1 minute, 2 minutes, 3 minutes, 4 minutes, 5 minutes, 6 minutes, 7 minutes, 8 minutes, 9 minutes, 10 minutes, 15 minutes, 20 minutes, 35 minutes, 40 minutes, 45 minutes, 50 minutes, 55 minutes, 1 hour, 2 hours, 3 hours, 4 hours, 5 hours, 6 hours, 7 hours, 8 hours, 9 hours, 10 hours, 11 hours, 12 hours, 13 hours, 14 hours, 15 hours, 16 hours, 17 hours, 18 hours, 19 hours, 20 hours, 21 hours, 22 hours, 23 hours, 1 day, 36 hours, 2 days, 3 days, 4 days, 5 days, 6 days, 1 week, 10 days, 2 weeks, 3 weeks, 1 month, 2 months, 3 months, 4 months, 5 months, 6 months, 7 months, 8 months, 9 months, 10 months, 11 months, 1 year, 18 months, 2 years, 3 years, 4 years, 5 years, 6 years, 7 years, 8 years, 9 years, 10 years, 11 years, 12 years, 13 years, 14 years, 15 years, 16 years, 17 years, 18 years, 19 years, 20 years, 25 years, 30 years, 35 years, 40 years, 45 years, 50 years, 55 years, 60 years, 65 years, 70 years, 75 years, 80 years, 85 years, 90 years, 95 years or more than 99 years.


IV. METHOD OF PREPARATION

The present application further provides nucleic acid constructs (e.g., linear RNA and vectors, etc.) for preparation of the circRNAs described herein, and methods for preparing the circRNAs, for example, by chemical ligation, enzymatic ligation, or ribozyme autocatalysis of linear RNAs. In some embodiments, the circRNA is prepared by circularizing a linear RNA in vitro.


Linear RNA and Nucleic Acid Constructs Encoding Thereof

In some embodiments, the present application provides a linear RNA capable of forming the circRNA of any one of the embodiments described above. In some embodiments, the linear RNA can be circularized by chemical circularization methods using cyanogen bromide or a similar condensing agent. In some embodiments, the linear RNA can be circularized by autocatalysis of a Group I intron comprising a 5′ catalytic Group I intron fragment and a 3′ catalytic Group I intron fragment. In some embodiments, the linear RNA can be circularized by a ligase. In some embodiments, the linear RNA can be circularized by a T4 RNA ligase. In some embodiments, the linear RNA can be circularized by a DNA ligase. Suitable ligases include, but are not limited to, a T4 DNA ligase (T4 Dnl), a T4 RNA ligase 1 (T4 Rnl1) and a T4 RNA ligase 2 (T4 Rnl2).


In some embodiments, the present application provides a linear RNA capable of forming the circRNA of any one of the embodiments described above, wherein the linear RNA can be circularized by autocatalysis of a Group I intron. In some embodiments, the Group I intron comprises a 5′ catalytic Group I intron fragment and a 3′ catalytic Group I intron fragment. In some embodiments, the linear RNA comprises a 3′ catalytic Group I intron fragment (such as the sequence set forth in SEQ ID NO: 46) flanking the 5′ end of a 3′ exon sequence recognizable by the 3′ catalytic Group I intron fragment (such as the sequence set forth in SEQ ID NO: 39), and the 5′ catalytic Group I intron fragment (such as the sequence set forth in SEQ ID NO: 47) flanking the 3′ end of a 5′ exon sequence recognizable by the 5′ catalytic Group I intron fragment (such as the sequence set forth in SEQ ID NO: 40).


In some embodiments, the linear RNA comprises, from 5′ to 3′ end, a 3′ Intron-IRES-Kozak-SP-Spike-5′ Intron sequence. In some embodiments, the Spike sequence comprises one of the sequences set forth in SEQ ID NOs: 11-15 and SEQ ID NOs: 48-49.


In some embodiments, the linear RNA comprises, from 5′ to 3′ end, a 3′ Intron-IRES-Kozak-SP-RBD-5′ Intron sequence. In some embodiments, the RBD sequence comprises amino acid residues 319 to 542 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1.


In some embodiments, the linear RNA comprises, from 5′ to 3′ end, a 3′ Intron-IRES-Kozak-SP-nAb-5′ Intron sequence. In some embodiments, the nAb sequence encodes the amino acid sequence of any one of SEQ ID NOs: 26-35. In some embodiments, the nAb sequence encodes the amino acid sequence of SEQ ID NO: 26. In some embodiments, the nAb sequence encodes the amino acid sequence of SEQ ID NO: 27. In some embodiments, the nAb sequence encodes the amino acid sequence of SEQ ID NO: 30.


In some embodiments, the linear RNA comprises, from 5′ to 3′ end, a 3′ Intron-IRES-Kozak-IDUA-5′ Intron sequence. In some embodiments, the IDUA sequence encodes the amino acid sequence of SEQ ID NO: 18. In some embodiments, the IDUA sequence encodes the amino acid sequence of SEQ ID NO: 19.


In some embodiments, the linear RNA further comprises a 5′ homology sequence flanking the 5′ end of the 3′ catalytic Group I intron fragment, and a 3′ homology sequence flanking the 3′ end of the 5′ catalytic Group I intron fragment. In some embodiments, the linear RNA comprises, from 5′ to 3′ end, a 5′ homology arm-3′ catalytic Group I Intron fragment-3′ exon sequence-IRES-Kozak-SP-antigenic polypeptide (e.g., Spike protein or fragment thereof)-5′ exon sequence-5′ catalytic Group I Intron fragment-3′ homology arm sequence. In some embodiments, the homology sequence can be between 1 and 100, between 5 and 80, between 5 and 60, between 10 and 50, or between 12 and 50 nucleotides in length. In some embodiments, the homology sequence is about 20-30 nucleotides in length. In some embodiments, the 5′ homology sequence comprises the nucleic acid sequence of SEQ ID NO: 41, and the 3′ homology sequence comprises the nucleic acid sequence of SEQ ID NO: 42. In some embodiments, the homology arms increase the efficiency of RNA circularization by about 0 to 20%, more than 20%, more than 30%, more than 40%, or more than 50%.
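The 5′ to 3′ element order recited above can be sketched as a small assembly routine. The following Python example is a minimal illustration under stated assumptions: the function name `assemble_linear_rna`, the element labels, and the placeholder strings of `N` are hypothetical and do not correspond to any SEQ ID NO of the present application. The sketch also assumes the "between 1 and 100" nucleotide range for homology sequences is inclusive.

```python
# Illustrative sketch: concatenates sequence elements in the 5'-to-3' order
# recited above and records where each element lands in the linear RNA.
# All sequences used here are "N" placeholders, not real sequences.

ELEMENT_ORDER = [
    "5' homology arm",
    "3' catalytic Group I intron fragment",
    "3' exon sequence",
    "IRES",
    "Kozak",
    "SP",
    "antigenic polypeptide coding sequence",
    "5' exon sequence",
    "5' catalytic Group I intron fragment",
    "3' homology arm",
]


def assemble_linear_rna(parts):
    """Join elements in order; return (sequence, [(name, start, end), ...])."""
    missing = [name for name in ELEMENT_ORDER if name not in parts]
    if missing:
        raise ValueError(f"missing elements: {missing}")
    # Homology sequences are described as 1 to 100 nucleotides in length
    # (treated as inclusive bounds here, which is an assumption).
    for arm in ("5' homology arm", "3' homology arm"):
        if not 1 <= len(parts[arm]) <= 100:
            raise ValueError(f"{arm} must be 1 to 100 nucleotides long")
    sequence, layout, position = [], [], 0
    for name in ELEMENT_ORDER:
        element = parts[name]
        layout.append((name, position, position + len(element)))
        sequence.append(element)
        position += len(element)
    return "".join(sequence), layout


# Placeholder lengths chosen only for illustration.
TOY_PARTS = {name: "N" * 20 for name in ELEMENT_ORDER}

if __name__ == "__main__":
    seq, layout = assemble_linear_rna(TOY_PARTS)
    for name, start, end in layout:
        print(f"{start:>4}-{end:<4} {name}")
```

The layout uses zero-based, end-exclusive coordinates so that each element's length equals `end - start`.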


In some embodiments, there is provided a nucleic acid construct comprising a nucleic acid sequence encoding the linear RNA. In some embodiments, a T7 promoter is operably linked to the nucleic acid sequence encoding the linear RNA. In some embodiments, the T7 promoter comprises the sequence set forth in SEQ ID NO: 43. In some embodiments, the T7 promoter is capable of driving in vitro transcription.


Plasmids

In some embodiments, the present application provides plasmids comprising the nucleotide sequences described herein. In some embodiments, the plasmids are obtained by cloning the sequence encoding the linear RNAs into a plasmid vector. Plasmids can be generated by techniques known in the art, such as Gibson cloning or cloning using restriction enzymes. In some embodiments, the plasmid vector includes an antibiotic resistance expression cassette allowing antibiotic selection of bacteria harboring the plasmid. In some embodiments, the plasmids provided can be purified from bacteria and used for production of the linear RNA constructs. Any plasmid vector suitable for in vitro transcription of the linear RNA may be used.


In some embodiments, the plasmids are linearized prior to in vitro transcription of the linear RNA. In some embodiments, the recombinant plasmids are linearized by restriction enzyme digestion. In some embodiments, the recombinant plasmids are linearized by PCR amplification. In some embodiments, the method further comprises performing in vitro transcription with the linearized plasmid template. In some embodiments, the in vitro transcription is driven by a T7 promoter.


Linear RNA Circularized by Chemical Ligation

In some embodiments, there is provided a method of preparing a circRNA described herein, comprising: (a) chemically ligating the 5′ end and the 3′ end of a linear RNA comprising a nucleic acid sequence encoding the circRNA; and (b) isolating the circularized RNA product, thereby providing the circRNA.


In some embodiments, the step of circularizing the linear RNA comprises chemical circularization methods using cyanogen bromide or a similar condensing agent.


In some embodiments, the linear RNA can be circularized by chemical methods. In some chemical methods, the 5′-end and the 3′-end of the nucleic acid (e.g., a linear circular polyribonucleotide) include chemically reactive groups that, when close together, may form a new covalent linkage between the 5′-end and the 3′-end of the molecule. The 5′-end may contain an NHS ester reactive group and the 3′-end may contain a 3′-amino terminated nucleotide such that in an organic solvent the 3′-amino-terminated nucleotide on the 3′-end of a linear RNA molecule will undergo a nucleophilic attack on the 5′-NHS-ester moiety forming a new 5′-/3′-amide bond.


In some embodiments, the circularization efficiency of the circularization methods provided herein is at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, or 100%. In some embodiments, the circularization efficiency of the circularization methods provided herein is at least about 40%.


Linear RNA Circularized by Ribozyme Autocatalysis

In some embodiments, the circRNA can be obtained by circularizing a linear RNA by ribozyme autocatalysis. In some embodiments, the linear RNA is circularized in vitro. In some embodiments, circularization by ribozyme autocatalysis comprises (a) subjecting the linear RNA to a condition that activates autocatalysis of the Group I intron (or 5′ and 3′ catalytic Group I intron fragments thereof) to provide a circularized RNA product; and (b) isolating the circularized RNA product, thereby providing the circRNA.


In some embodiments, the method comprises a step of obtaining the linear RNA by first cloning the sequence encoding the linearized RNAs into a plasmid vector, and then linearizing the recombinant plasmids. In some embodiments, the recombinant plasmids are linearized by restriction enzyme digestion. In some embodiments, the recombinant plasmids are linearized by PCR amplification. In some embodiments, the method further comprises performing in vitro transcription with the linearized plasmid template. In some embodiments, the in vitro transcription is driven by a T7 promoter. In some embodiments, the method further comprises purifying the linear RNA transcripts. In some embodiments, the linear RNAs are purified by gel purification.


In some embodiments, the present application provides a method of cyclizing a linear RNA (e.g., purified linear RNA) by ribozyme autocatalysis of the Group I intron. During splicing, the 3′ hydroxyl group of a guanosine nucleotide engages in a transesterification reaction at the 5′ splice site. The 5′ intron half is excised, and the freed hydroxyl group at the end of the intermediate engages in a second transesterification at the 3′ splice site, resulting in circularization of the intervening region and excision of the 3′ intron. In some embodiments, the condition that activates autocatalysis of the Group I intron or 5′ and 3′ catalytic Group I intron fragments is the addition of GTPs and Mg2+. In some embodiments, there is provided a step of cyclizing the linear RNAs by adding GTPs and Mg2+ at 55° C. for 15 min. In some embodiments, the method further comprises treating with RNase R to digest the linear RNA transcripts. In some embodiments, the method further comprises isolating the circular RNA (circRNA). In some embodiments, the step of isolating the circRNA comprises gel-purifying the circRNA. In some embodiments, the purified circRNA can be stored at −80° C.


In some embodiments, the circularization has an efficiency of at least 2%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 32%, at least 34%, at least 36%, at least 38%, at least 40%, at least 42%, at least 44%, at least 46%, at least 48%, or at least 50%. In some embodiments, the circularization has an efficiency of about 40% to about 50% or more than 50%.
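The efficiency thresholds above can be illustrated with a short calculation. The present application does not define how circularization efficiency is measured, so the following Python sketch assumes, purely for illustration, that efficiency is the molar amount of circular product divided by the molar amount of linear RNA input, expressed as a percentage. The function names are hypothetical.

```python
# Illustrative sketch. ASSUMPTION: efficiency is taken as the percentage of
# input linear RNA molecules recovered as circular product; the source text
# does not specify the measurement method.


def circularization_efficiency(circular_amount, linear_input_amount):
    """Return efficiency in percent; both amounts in the same molar units."""
    if linear_input_amount <= 0:
        raise ValueError("linear input amount must be positive")
    if not 0 <= circular_amount <= linear_input_amount:
        raise ValueError("circular amount must be between 0 and the input")
    return 100.0 * circular_amount / linear_input_amount


def meets_threshold(efficiency_percent, threshold_percent):
    """Check an 'at least X%' style criterion such as those recited above."""
    return efficiency_percent >= threshold_percent


if __name__ == "__main__":
    # Hypothetical amounts, e.g. in picomoles.
    efficiency = circularization_efficiency(45.0, 100.0)
    print(f"Efficiency: {efficiency:.1f}%")
    print("At least 40%:", meets_threshold(efficiency, 40.0))
```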


Linear RNA Circularized by Ligation

In some embodiments, the circRNA can be obtained by circularizing a linear RNA using a ligase such as an RNA ligase. In some embodiments, the linear RNA is circularized in vitro. In some embodiments, the linear RNA can be circularized by a T4 RNA ligase. In some embodiments, the linear RNA comprises a 5′ ligation sequence at the 5′ end of the nucleic acid sequence encoding the circRNA, and a 3′ ligation sequence at the 3′ end of the nucleic acid sequence encoding the circRNA, wherein the 5′ ligation sequence and the 3′ ligation sequence can be ligated to each other via the RNA ligase. In non-limiting examples, the linear RNA can be circularized by a ligase such as a T4 DNA ligase (T4 Dnl), T4 RNA ligase 1 (T4 Rnl1), or T4 RNA ligase 2 (T4 Rnl2). The linear RNA may be circularized with or without the presence of a single-stranded nucleic acid adaptor, e.g., a splint DNA.


In some embodiments, the present application provides a method of producing any one of the circRNAs described above, comprising: (a) contacting any one of the linear RNAs comprising a 5′ ligation sequence at the 5′ end of the nucleic acid sequence encoding the circRNA, and a 3′ ligation sequence at the 3′ end of the nucleic acid sequence encoding the circRNA described above with a single-stranded adaptor nucleic acid comprising from the 5′ end to the 3′ end: a first sequence complementary to the 3′ ligation sequence and a second sequence complementary to the 5′ ligation sequence, and wherein the 5′ ligation sequence and the 3′ ligation sequence hybridize to the single-stranded adaptor nucleic acid to provide a duplex nucleic acid intermediate comprising a single strand break between the 3′ end of the 5′ ligation sequence and the 5′ end of the 3′ ligation sequence; (b) contacting the intermediate with an RNA ligase under a condition that allows ligation of the 5′ ligation sequence to the 3′ ligation sequence to provide a circularized RNA product; and (c) isolating the circularized RNA product, thereby providing the circRNA.


In some embodiments, the method described herein comprises circularizing a linear RNA in vitro, comprising: (a) contacting any one of the linear RNAs comprising a 5′ ligation sequence at the 5′ end of the nucleic acid sequence encoding the circular RNA, and a 3′ ligation sequence at the 3′ end of the nucleic acid sequence encoding the circular RNA described above with an RNA ligase under a condition that allows ligation of the 5′ ligation sequence to the 3′ ligation sequence to provide a circularized RNA product; and (b) isolating the circularized RNA product, thereby providing the circular RNA.


In some embodiments, the method further comprises treating with RNase R to digest the linear RNA transcripts. In some embodiments, the method further comprises isolating the circular RNA (circRNA). In some embodiments, the step of isolating the circRNA comprises gel-purifying the circRNA. In some embodiments, the purified circRNA can be stored at −80° C.


In some embodiments, a DNA or RNA ligase may be used to enzymatically link a 5′-phosphorylated nucleic acid molecule (e.g., a linear RNA) to the 3′-hydroxyl group of a nucleic acid (e.g., a linear nucleic acid) forming a new phosphodiester linkage. In an example reaction, a linear circular RNA is incubated at 37° C. for 1 hour with 1-10 units of T4 RNA ligase (New England Biolabs, Ipswich, Mass.) according to the manufacturer's protocol. The ligation reaction may occur in the presence of a linear nucleic acid capable of base-pairing with both the 5′- and 3′-region in juxtaposition to assist the enzymatic ligation reaction. In some embodiments, the ligation is splint ligation. For example, a splint ligase, like SPLINTR® ligase, can be used for splint ligation. For splint ligation, a single-stranded polynucleotide (splint), like a single-stranded RNA, can be designed to hybridize with both termini of a linear polyribonucleotide, so that the two termini can be juxtaposed upon hybridization with the single-stranded splint. Splint ligase can thus catalyze the ligation of the juxtaposed two termini of the linear polyribonucleotide, generating a circular polyribonucleotide.


In some embodiments, a DNA or RNA ligase may be used in the synthesis of the circular RNA. As a non-limiting example, the ligase may be a circ ligase or circular ligase.


Purification of circRNA


In some embodiments, the method provided herein of producing a circRNA further comprises a step of purifying the circularized RNA product. In non-limiting examples, the circRNA is purified by gel-purification or by high-performance liquid chromatography (HPLC). In some embodiments, agarose gel electrophoresis allows for simple and effective separation of circular splicing products from linear precursor molecules, nicked circles, splicing intermediates, and excised introns. In some embodiments, the method comprises purifying the circular RNA by chromatography, such as HPLC. In some embodiments, the purified circular RNA can be stored at −80° C.


V. PHARMACEUTICAL COMPOSITIONS, KITS AND ARTICLES OF MANUFACTURE

Further provided by the present application are pharmaceutical compositions comprising any one of the circRNAs described herein, and a pharmaceutically acceptable carrier. Pharmaceutical compositions can be prepared by mixing the therapeutic agents described herein having the desired degree of purity with optional pharmaceutically acceptable carriers, excipients or stabilizers (Remington's Pharmaceutical Sciences 16th edition, Osol, A. Ed. (1980)), in the form of lyophilized formulations or aqueous solutions. Acceptable carriers, excipients, or stabilizers are nontoxic to recipients at the dosages and concentrations employed, and include buffers; antioxidants including ascorbic acid, methionine, Vitamin E, and sodium metabisulfite; preservatives; isotonicifiers (e.g., sodium chloride); stabilizers; metal complexes (e.g., Zn-protein complexes); chelating agents such as EDTA; and/or non-ionic surfactants.


In some embodiments, the pharmaceutical composition is contained in a single-use vial, such as a single-use sealed vial. In some embodiments, the pharmaceutical composition is contained in a multi-use vial. In some embodiments, the pharmaceutical composition is contained in bulk in a container. In some embodiments, the pharmaceutical composition is cryopreserved.


The present application further provides kits and articles of manufacture for use in any embodiment of the treatment methods described herein. The kits and articles of manufacture may comprise any one of the formulations and pharmaceutical compositions described herein.


In some embodiments, there is provided a kit comprising any one of the circRNAs described herein and instructions for treating or preventing a disease or condition (e.g., coronavirus infection).


In some embodiments, there is provided a kit comprising any one of the circRNAs described herein and instructions for treating or preventing a coronavirus infection.


In some embodiments, there is provided a kit comprising any one of the plasmids or linear RNAs described herein, and instructions for preparing any one of the circRNAs. In some embodiments, there is provided a kit comprising any one of the plasmids, linear RNAs, or circRNAs described herein, and instructions for administering the circRNA.


The kits of the invention are in suitable packaging. Suitable packaging includes, but is not limited to, vials, bottles, jars, flexible packaging (e.g., sealed Mylar or plastic bags), and the like. Kits may optionally provide additional components such as buffers and interpretative information. The present application thus also provides articles of manufacture, which include vials (such as sealed vials), bottles, jars, flexible packaging, and the like.


The instructions relating to the use of the compositions generally include information as to dosage, dosing schedule, and route of administration for the intended treatment. The containers may be unit doses, bulk packages (e.g., multi-dose packages) or sub-unit doses. For example, kits may be provided that contain sufficient dosages of the circRNA as disclosed herein to provide effective treatment of an individual or of many individuals. Additionally, kits may be provided that contain sufficient dosages of the circRNA to allow for multiple administrations to an individual (e.g., initial vaccine administration and subsequent booster administration, in the case of a circRNA vaccine). Kits may also include multiple unit doses of the pharmaceutical compositions and instructions for use and packaged in quantities sufficient for storage and use in pharmacies, for example, hospital pharmacies and compounding pharmacies.


In some embodiments, the kit comprises a delivery system. The delivery system may be a unit dose delivery system. The volume of solution or suspension delivered per dose can be anywhere from about 5 to about 2000 microliters, from about 10 to about 1000 microliters, or from about 50 to about 500 microliters. Delivery systems for these various dosage forms can be syringes, dropper bottles, plastic squeeze units, atomizers, nebulizers or pharmaceutical aerosols in either unit dose or multiple dose packages. In some embodiments, there is provided a delivery system of any one of the circRNAs described herein, comprising the circRNA and a device for delivering the circRNA.


All of the features disclosed in this specification may be combined in any combination. Each feature disclosed in this specification may be replaced by an alternative feature serving the same, equivalent, or similar purpose. Thus, unless expressly stated otherwise, each feature disclosed is only an example of a generic series of equivalent or similar features.


VI. EXEMPLARY EMBODIMENTS

In some aspects, the present application provides the following exemplary embodiments.


Embodiment 1. A circular RNA (circRNA) comprising a nucleic acid sequence encoding a therapeutic polypeptide, wherein the therapeutic polypeptide is selected from the group consisting of an antigenic polypeptide, a functional protein, a receptor protein, and a targeting protein.


Embodiment 2. The circRNA of embodiment 1, further comprising a Kozak sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide.


Embodiment 3. The circRNA of embodiment 1 or 2, further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide.


Embodiment 4. The circRNA of any one of embodiments 1-3, further comprising an internal ribosomal entry site (IRES) sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide.


Embodiment 5. The circRNA of embodiment 4, wherein the IRES sequence is an IRES sequence selected from the group consisting of CVB3 virus, EV71 virus, EMCV virus, PV virus, and CSFV virus IRES sequences.


Embodiment 6. The circRNA of embodiment 4 or 5, comprising a nucleic acid sequence comprising from the 5′ end to the 3′ end: the IRES sequence, the Kozak sequence, and the nucleic acid sequence encoding the therapeutic polypeptide.


Embodiment 7. The circRNA of any one of embodiments 4-6, further comprising a polyAC or polyA sequence disposed at the 5′ end of the IRES sequence.


Embodiment 8. The circRNA of any one of embodiments 1-3, further comprising an m6A modification motif sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide.


Embodiment 9. The circRNA of embodiment 8, comprising a nucleic acid sequence comprising from the 5′ end to the 3′ end: the m6A modification motif sequence, the Kozak sequence, and the nucleic acid sequence encoding the therapeutic polypeptide.


Embodiment 10. The circRNA of any one of embodiments 1-9, wherein the nucleic acid sequence further encodes a signal peptide (SP) fused to the N-terminus of the therapeutic polypeptide.


Embodiment 11. The circRNA of embodiment 10, wherein the SP is an SP of a human tissue plasminogen activator (tPA) or a human IgE immunoglobulin.


Embodiment 12. The circRNA of any one of embodiments 1-11, further comprising a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the therapeutic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide.


Embodiment 13. The circRNA of embodiment 12, wherein the 3′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 40, and the 5′ exon sequence comprises the nucleic acid sequence of SEQ ID NO: 41.


Embodiment 14. The circRNA of any one of embodiments 1-13, wherein the therapeutic protein is for treating or preventing an infection.


Embodiment 15. The circRNA of embodiment 14, wherein the infection is an infection by a virus.


Embodiment 16. The circRNA of embodiment 15, wherein the virus is a coronavirus.


Embodiment 17. The circRNA of embodiment 16, wherein the coronavirus is selected from the group consisting of SARS-CoV, MERS-CoV, and SARS-CoV-2.


Embodiment 18. The circRNA of embodiment 17, wherein the coronavirus is SARS-CoV-2.


Embodiment 19. The circRNA of any one of embodiments 1-18, wherein the therapeutic polypeptide is an antigenic polypeptide.


Embodiment 20. The circRNA of embodiment 19, wherein the antigenic polypeptide comprises a Spike (S) protein or a fragment thereof of a coronavirus.


Embodiment 21. The circRNA of embodiment 20, wherein the antigenic polypeptide comprises a receptor-binding domain (RBD) of the S protein.


Embodiment 22. The circRNA of embodiment 21, wherein the RBD comprises amino acid residues 319 to 542 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1.


Embodiment 23. The circRNA of embodiment 22, wherein the RBD comprises the amino acid sequence of SEQ ID NO: 2.


Embodiment 24. The circRNA of any one of embodiments 21-23, wherein the antigenic polypeptide further comprises a multimerization domain.


Embodiment 25. The circRNA of embodiment 24, wherein the multimerization domain is a C-terminal Foldon (Fd) domain of a T4 fibritin protein that mediates trimerization of the T4 fibritin protein, or a GCN-4 based isoleucine zipper domain.


Embodiment 26. The circRNA of embodiment 24, wherein the multimerization domain comprises the amino acid sequence of SEQ ID NO: 3 or 4.


Embodiment 27. The circRNA of any one of embodiments 24-26, wherein the RBD is fused to the multimerization domain via a peptide linker.


Embodiment 28. The circRNA of embodiment 27, wherein the peptide linker comprises the amino acid sequence of SEQ ID NO: 5.


Embodiment 29. The circRNA of any one of embodiments 20-28, wherein the antigenic polypeptide comprises an S2 region of the S protein.


Embodiment 30. The circRNA of embodiment 29, wherein the S2 region comprises amino acid residues 686 to 1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1.


Embodiment 31. The circRNA of embodiment 29, wherein the S2 region comprises one or more mutations that stabilize a pre-fusion conformation of the S protein.


Embodiment 32. The circRNA of embodiment 31, wherein the one or more mutations comprise K986P and V987P.


Embodiment 33. The circRNA of any one of embodiments 29-32, wherein the S2 region comprises the amino acid sequence of SEQ ID NO: 6 or 7.


Embodiment 34. The circRNA of any one of embodiments 20-23 and 29-33, wherein the antigenic polypeptide comprises amino acid residues 2-1273 of a full-length S protein of SARS-CoV-2, wherein the numbering is based on SEQ ID NO: 1.


Embodiment 35. The circRNA of embodiment 34, wherein the antigenic polypeptide comprises one or more mutations that inhibit cleavage of the S protein.


Embodiment 36. The circRNA of embodiment 35, wherein the one or more mutations comprise deletion of amino acid residues 681-684.


Embodiment 37. The circRNA of embodiment 20, wherein the antigenic polypeptide comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 8-10 and 62-63.


Embodiment 38. The circRNA of embodiment 20, wherein the circRNA comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 11-15 and 64.


Embodiment 39. The circRNA of any one of embodiments 1-18, wherein the therapeutic protein is a receptor protein.


Embodiment 40. The circRNA of embodiment 39, wherein the therapeutic protein is a soluble receptor comprising an extracellular domain of a naturally occurring receptor.


Embodiment 41. The circRNA of embodiment 39 or 40, wherein the receptor is an ACE2 receptor.


Embodiment 42. The circRNA of embodiment 41, wherein the receptor is a high-affinity mutant ACE2 receptor.


Embodiment 43. The circRNA of any one of embodiments 1-18, wherein the therapeutic protein is a targeting protein.


Embodiment 44. The circRNA of embodiment 43, wherein the targeting protein is an antibody.


Embodiment 45. The circRNA of embodiment 44, wherein the antibody is a neutralizing antibody.


Embodiment 46. The circRNA of embodiment 44, wherein the targeting protein is a therapeutic antibody.


Embodiment 47. The circRNA of any one of embodiments 1-13, wherein the therapeutic protein is a functional protein.


Embodiment 48. The circRNA of embodiment 47, wherein the functional protein is a tumor suppressor.


Embodiment 49. The circRNA of embodiment 48, wherein the tumor suppressor is selected from the group consisting of p53 and PTEN.


Embodiment 50. The circRNA of embodiment 47, wherein the functional protein is an enzyme.


Embodiment 51. The circRNA of embodiment 50, wherein the enzyme is selected from the group consisting of OTC, FAH and IDUA.


Embodiment 52. The circRNA of embodiment 47, wherein the functional protein is selected from the group consisting of DMD, COL3A1, BMPR2, AHI1, FANCC, MYBPC3, and IL2RG.


Embodiment 53. A composition comprising a plurality of circRNAs of any one of embodiments 20-38, wherein the antigenic polypeptides corresponding to the plurality of circRNAs are different with respect to each other.


Embodiment 54. A composition comprising a plurality of circRNAs of any one of embodiments 39-42, wherein the receptor proteins corresponding to the plurality of circRNAs are different with respect to each other.


Embodiment 55. A composition comprising a plurality of circRNAs of any one of embodiments 43-46, wherein the targeting proteins corresponding to the plurality of circRNAs are different with respect to each other.


Embodiment 56. The composition of any one of embodiments 53-55, wherein the plurality of circRNAs target a plurality of strains of a coronavirus.


Embodiment 57. A circRNA vaccine comprising the circRNA of any one of embodiments 20-38 or the composition of embodiment 53 or 56.


Embodiment 58. A pharmaceutical composition comprising the circRNA of any one of embodiments 1-52 and a pharmaceutically acceptable carrier.


Embodiment 59. The circRNA vaccine of embodiment 57 or the pharmaceutical composition of embodiment 58, further comprising a transfection agent.


Embodiment 60. The circRNA vaccine or the pharmaceutical composition of embodiment 59, wherein the transfection agent is polyethylenimine (PEI) or a lipid nanoparticle (LNP), optionally wherein the LNP comprises MC3-lipid, DSPC, cholesterol, and PEG2000-DMG.


Embodiment 61. The circRNA vaccine of embodiment 57 or the pharmaceutical composition of embodiment 58, wherein the circRNA is not formulated with a transfection agent.


Embodiment 62. A method of treating or preventing an infection in an individual, comprising administering to the individual an effective amount of the circRNA of any one of embodiments 20-46, the composition of any one of embodiments 53-56, or the circRNA vaccine of any one of embodiments 57 and 59-61.


Embodiment 63. The method of embodiment 62, wherein the infection is a coronavirus infection.


Embodiment 64. The method of embodiment 63, wherein the infection is SARS-CoV-2 infection, optionally the SARS-CoV-2 infection is caused by a SARS-CoV-2 variant (e.g., B.1.351 variant).


Embodiment 65. A method of treating or preventing a disease or condition in an individual, comprising administering to the individual an effective amount of the circRNA of any one of embodiments 1-52, or the pharmaceutical composition of any one of embodiments 58-61.


Embodiment 66. The method of embodiment 65, wherein the disease or condition is a disease or condition associated with insufficient levels and/or activity of a protein corresponding to the therapeutic protein.


Embodiment 67. The method of embodiment 66, wherein the disease or condition is a hereditary genetic disease associated with one or more mutations in the protein corresponding to the therapeutic protein.


Embodiment 68. The method of any of embodiments 65-67, wherein:

    • (i) the therapeutic polypeptide is TP53 or PTEN, and the disease or condition is cancer;
    • (ii) the therapeutic polypeptide is OTC, and the disease is ornithine transcarbamylase deficiency;
    • (iii) the therapeutic polypeptide is FAH, and the disease is tyrosinemia;
    • (iv) the therapeutic polypeptide is DMD, and the disease is Duchenne and Becker muscular dystrophy, X-linked dilated cardiomyopathy, or familial dilated cardiomyopathy;
    • (v) the therapeutic polypeptide is IDUA, and the disease or condition is Mucopolysaccharidosis type I (MPS I);
    • (vi) the therapeutic polypeptide is COL3A1, and the disease or condition is Ehlers-Danlos syndrome;
    • (vii) the therapeutic polypeptide is AHI1, and the disease or condition is Joubert syndrome;
    • (viii) the therapeutic polypeptide is BMPR2, and the disease or condition is pulmonary arterial hypertension, or pulmonary veno-occlusive disease;
    • (ix) the therapeutic polypeptide is FANCC, and the disease or condition is Fanconi anemia;
    • (x) the therapeutic polypeptide is MYBPC3, and the disease or condition is primary familial hypertrophic cardiomyopathy; or
    • (xi) the therapeutic polypeptide is IL2RG, and the disease or condition is X-linked severe combined immunodeficiency.


Embodiment 69. The method of any one of embodiments 62-68, wherein the circRNA is subject to rolling circle translation by a ribosome in the individual.


Embodiment 70. The method of embodiment 66, wherein the disease or condition is a hereditary genetic disease associated with one or more mutations in the protein corresponding to the therapeutic protein.


Embodiment 71. A linear RNA capable of forming the circRNA of any one of embodiments 1-52.


Embodiment 72. The linear RNA of embodiment 71, wherein the linear RNA comprises a Group I intron comprising a 5′ catalytic Group I intron fragment and a 3′ catalytic Group I intron fragment, wherein the linear RNA can be circularized by autocatalysis of the Group I intron.


Embodiment 73. The linear RNA of embodiment 72, comprising the 3′ catalytic Group I intron fragment flanking the 5′ end of a 3′ exon sequence recognizable by the 3′ catalytic Group I intron fragment, and the 5′ catalytic Group I intron fragment flanking the 3′ end of a 5′ exon sequence recognizable by the 5′ catalytic Group I intron fragment.


Embodiment 74. The linear RNA of embodiment 73, comprising a 5′ homology sequence flanking the 5′ end of the 3′ catalytic Group I intron fragment, and a 3′ homology sequence flanking the 3′ end of the 5′ catalytic Group I intron fragment.


Embodiment 75. The linear RNA of embodiment 74, wherein the 5′ homology sequence comprises the nucleic acid sequence of SEQ ID NO: 41, and the 3′ homology sequence comprises the nucleic acid sequence of SEQ ID NO: 42.


Embodiment 76. The linear RNA of embodiment 71, wherein the linear RNA can be circularized by a ligase.


Embodiment 77. The linear RNA of embodiment 76, wherein the ligase is selected from the group consisting of a T4 DNA ligase (T4 Dnl), a T4 RNA ligase 1 (T4 Rnl1) and a T4 RNA ligase 2 (T4 Rnl2).


Embodiment 78. The linear RNA of embodiment 76 or 77, comprising a 5′ ligation sequence at the 5′ end of the nucleic acid sequence encoding the circRNA, and a 3′ ligation sequence at the 3′ end of the nucleic acid sequence encoding the circRNA, wherein the 5′ ligation sequence and the 3′ ligation sequence can be ligated to each other via the RNA ligase.


Embodiment 79. A nucleic acid construct comprising a nucleic acid sequence encoding the linear RNA of any one of embodiments 71-78.


Embodiment 80. The nucleic acid construct of embodiment 79, comprising a T7 promoter operably linked to the nucleic acid sequence encoding the linear RNA.


Embodiment 81. A method of producing a circRNA, comprising:

    • (a) subjecting the linear RNA of any one of embodiments 71-75 to a condition that activates autocatalysis of the 5′ catalytic Group I intron fragment and the 3′ catalytic Group I intron fragment to provide a circularized RNA product; and
    • (b) isolating the circularized RNA product, thereby providing the circRNA.


Embodiment 82. A method of producing a circRNA, comprising:

    • (a) contacting the linear RNA of any one of embodiments 76-78 with a single-stranded adaptor nucleic acid comprising from the 5′ end to the 3′ end: a first sequence complementary to the 3′ ligation sequence and a second sequence complementary to the 5′ ligation sequence, and wherein the 5′ ligation sequence and the 3′ ligation sequence hybridize to the single-stranded adaptor nucleic acid to provide a duplex nucleic acid intermediate comprising a single strand break between the 3′ end of the 5′ ligation sequence and the 5′ end of the 3′ ligation sequence;
    • (b) contacting the intermediate with an RNA ligase under a condition that allows ligation of the 5′ ligation sequence to the 3′ ligation sequence to provide a circularized RNA product; and
    • (c) isolating the circularized RNA product, thereby providing the circRNA.


Embodiment 83. A method of producing a circRNA, comprising:

    • (a) contacting the linear RNA of any one of embodiments 76-78 with an RNA ligase under a condition that allows ligation of the 5′ ligation sequence to the 3′ ligation sequence to provide a circularized RNA product; and
    • (b) isolating the circularized RNA product, thereby providing the circular RNA.


Embodiment 84. The method of any one of embodiments 81-83, further comprising obtaining the linear RNA by in vitro transcription of a nucleic acid construct comprising a nucleic acid sequence encoding the linear RNA.


Embodiment 85. The method of any one of embodiments 81-84, further comprising purifying the circularized RNA product.


EXAMPLES

The invention will be more fully understood by reference to the following examples. They should not, however, be construed as limiting the scope of the invention. It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended embodiments.


Example 1. In Vitro circRNA Production by Ligation

This example demonstrates in vitro production of a circular RNA (circRNA) by ligation.


A linear RNA is designed that can be circularized to produce a circRNA comprising, from 5′ to 3′, an IRES-Kozak-SP-Spike sequence, as shown in FIG. 2A. The linear RNA is designed with, from 5′ to 3′, an IRES sequence (SEQ ID NO: 53), a Kozak sequence (SEQ ID NO: 36), a signal peptide coding sequence (SEQ ID NO: 16 or SEQ ID NO: 17), and a Spike protein coding sequence having K986P/V987P and Δ681-684 modifications (SEQ ID NO: 15) followed by a TAA stop codon.
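
The Spike modifications named above use the residue numbering of the full-length S protein (SEQ ID NO: 1): K986P and V987P are point substitutions, and Embodiment 36 describes the remaining modification as a deletion of residues 681-684. The following Python sketch illustrates one way such edits may be applied with 1-based numbering. The function name `apply_spike_modifications` and the toy sequence are hypothetical illustrations; the toy sequence is a placeholder and is not the Spike sequence or any SEQ ID NO.

```python
def apply_spike_modifications(seq, substitutions, deletion):
    """Return seq with point substitutions and one deletion applied.

    Positions are 1-based and refer to the unmodified sequence.
    substitutions: list of (original_residue, position, new_residue)
    deletion: (first, last), an inclusive range of residues to remove
    """
    residues = list(seq)
    for original, position, new in substitutions:
        found = residues[position - 1]
        if found != original:
            raise ValueError(f"expected {original} at {position}, found {found}")
        residues[position - 1] = new
    first, last = deletion
    # Delete only after substituting, so the numbering used above stays valid.
    del residues[first - 1:last]
    return "".join(residues)


# Toy 1273-residue stand-in for a full-length S protein (NOT the real sequence).
toy = list("A" * 1273)
toy[680:684] = "WWWW"  # marks residues 681-684
toy[985] = "K"         # residue 986
toy[986] = "V"         # residue 987
toy_spike = "".join(toy)

modified = apply_spike_modifications(
    toy_spike,
    substitutions=[("K", 986, "P"), ("V", 987, "P")],
    deletion=(681, 684),
)
```

Because the deletion lies upstream of residues 986 and 987, the two prolines sit four positions earlier in the modified sequence than in the reference numbering.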


Linear RNAs that can be circularized to produce the circular RNAs (circRNAs) disclosed herein may be made using standard laboratory methods and materials. The cDNA sequence encoding the linear RNA may be synthesized by de novo DNA synthesis. The synthetic nucleic acid can be ordered from a synthetic nucleotide service such as GBLOCKS® (Integrated DNA Technologies). The nucleic acid sequence encoding the linear RNA can be cloned into a plasmid vector containing a T7 promoter and a multiple cloning site flanked by restriction sites such as XbaI restriction sites. The resulting plasmid may be transformed into chemically competent E. coli.


For the present example, NEB DH5-alpha Competent E. coli cells are used. Transformations are performed according to NEB instructions using 100 ng of plasmid. The protocol is as follows:

    • 1. Thaw a tube of NEB 5-alpha Competent E. coli cells on ice for 10 minutes.
    • 2. Add 1-5 μL containing 1 pg-100 ng of plasmid DNA to the cell mixture. Carefully flick the tube 4-5 times to mix cells and DNA. Do not vortex.
    • 3. Place the mixture on ice for 30 minutes. Do not mix.
    • 4. Heat shock at 42° C. for exactly 30 seconds. Do not mix.
    • 5. Place on ice for 5 minutes. Do not mix.
    • 6. Pipette 950 μL of room temperature SOC into the mixture.
    • 7. Place at 37° C. for 60 minutes. Shake vigorously (250 rpm) or rotate.
    • 8. Warm selection plates to 37° C.
    • 9. Mix the cells thoroughly by flicking the tube and inverting.


Spread 50-100 μL of each dilution onto a selection plate and incubate overnight at 37° C. Alternatively, incubate at 30° C. for 24-36 hours or 25° C. for 48 hours.


A single colony is then used to inoculate 5 ml of LB growth media containing the appropriate antibiotic and then allowed to grow (250 RPM, 37° C.) for 5 hours. This culture is then used to inoculate 200 ml of culture medium, which is allowed to grow overnight under the same conditions. To isolate the plasmid (up to 850 μg), a maxi prep is performed using the Invitrogen PURELINK™ HiPure Maxiprep Kit (Carlsbad, CA), following the manufacturer's instructions.


In order to generate a linearized plasmid DNA template for in vitro transcription (IVT), the plasmid (an example of which is shown in FIG. 2) is first linearized using a restriction enzyme such as XbaI. A typical restriction digest with XbaI will comprise the following: Plasmid 1.0 μg; 10× Buffer 1.0 μL; XbaI 1.5 μL; dH2O up to 10 μL; incubated at 37° C. for 1 hr. If performing at lab scale (<5), the reaction is cleaned up using Invitrogen's PURELINK™ PCR Micro Kit (Carlsbad, CA) per manufacturer's instructions. Larger scale purifications may need to be done with a product that has a larger load capacity such as Invitrogen's standard PURELINK™ PCR Kit (Carlsbad, CA). Following the cleanup, the linearized vector is quantified using the NanoDrop and analyzed to confirm linearization using agarose gel electrophoresis.


Unmodified linear RNA is synthesized by in vitro transcription using T7 RNA polymerase from the linearized plasmid. Transcribed RNA is purified with an RNA purification system (QIAGEN), treated with alkaline phosphatase (ThermoFisher Scientific, EF0652) following the manufacturer's instructions, and purified again with the RNA purification system.


Splint-ligated circular RNA is generated by treating the transcribed linear RNA and a DNA splint with T4 DNA ligase (New England Biolabs, Inc., M0202M), and the circular RNA is isolated following enrichment with RNase R treatment. RNA quality is assessed by agarose gel or through automated electrophoresis (Agilent).
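
In splint ligation, the DNA splint holds the two ends of the linear RNA next to each other so that the ligase can seal the nick. The following Python sketch illustrates one way a splint sequence may be derived: it takes the junction that would exist after circularization (3′ end followed by 5′ end) and returns its reverse complement as DNA. The function name `design_dna_splint`, the arm length, and the example RNA are hypothetical and are not taken from this example.

```python
# DNA base that pairs with each RNA base.
_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def design_dna_splint(linear_rna, arm_length=20):
    """Return a DNA splint (5'->3') that bridges the two ends of linear_rna.

    arm_length (bases of each RNA end covered by the splint) is a
    hypothetical choice, not a value from the text.
    """
    rna = linear_rna.upper()
    if len(rna) < 2 * arm_length:
        raise ValueError("RNA is shorter than the two splint arms")
    # After ligation, the 3' end is followed directly by the 5' end.
    junction = rna[-arm_length:] + rna[:arm_length]
    # The splint pairs antiparallel with the junction: reverse complement.
    return "".join(_DNA_COMPLEMENT[base] for base in reversed(junction))


# Made-up RNA used only to exercise the function.
example_rna = "GGGAAACCCUUU" + "ACGU" * 10 + "AAACCCGGGUUU"
splint = design_dna_splint(example_rna, arm_length=6)
```

Read 5′ to 3′, the first half of the resulting splint pairs with the 5′ end of the RNA and the second half pairs with the 3′ end, which follows from the antiparallel orientation of the duplex.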


Example 2. In Vitro circRNA Production by Group I Ribozyme Autocatalysis

This example demonstrates in vitro production of a circular RNA (circRNA) by Group I ribozyme autocatalysis.


A linear RNA is designed that can be circularized to produce a circRNA comprising, from 5′ to 3′, a 5′ Homology arm-3′ catalytic Group I intron fragment-3′ exon sequence recognizable by the 3′ catalytic Group I intron fragment (i.e., Exon 2)-m6A modification motif-Kozak-SP-Spike-2A peptide-5′ exon sequence recognizable by the 5′ catalytic Group I intron fragment (i.e., Exon 1)-5′ catalytic Group I intron fragment-3′ Homology arm, as shown in FIG. 1C. The linear RNA is designed with, from 5′ to 3′, a 5′ homology arm (SEQ ID NO: 41), a 3′ catalytic Group I intron sequence (SEQ ID NO: 46), a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment (SEQ ID NO: 39), a m6A modification motif sequence (SEQ ID NO: 38), a Kozak sequence (SEQ ID NO: 37), a signal peptide coding sequence (SEQ ID NO: 16 or SEQ ID NO: 17), a Spike protein coding sequence having K986P/V987P and Δ681-684 modifications (SEQ ID NO: 15), a 2A peptide coding sequence (SEQ ID NO: 44 or SEQ ID NO: 45), a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment (SEQ ID NO: 40), a 5′ catalytic Group I intron fragment (SEQ ID NO: 47), and a 3′ homology arm (SEQ ID NO: 43).
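
The 5′ to 3′ element order listed above can be checked programmatically when a precursor sequence is assembled from its parts. The following Python sketch is an illustration only: `ELEMENT_ORDER` restates the order given in this example, while the function name `assemble_precursor` and the placeholder sequences (runs of N) are hypothetical and do not correspond to any SEQ ID NO.

```python
# Element order of the linear precursor, 5' to 3', as listed in this example.
ELEMENT_ORDER = [
    "5' homology arm",
    "3' catalytic Group I intron fragment",
    "3' exon sequence (Exon 2)",
    "m6A modification motif",
    "Kozak sequence",
    "signal peptide coding sequence",
    "Spike coding sequence",
    "2A peptide coding sequence",
    "5' exon sequence (Exon 1)",
    "5' catalytic Group I intron fragment",
    "3' homology arm",
]


def assemble_precursor(parts):
    """Concatenate parts in ELEMENT_ORDER.

    Returns (sequence, coordinates), where coordinates maps each element
    name to its (start, end) span, 0-based and end-exclusive.
    """
    missing = [name for name in ELEMENT_ORDER if name not in parts]
    if missing:
        raise ValueError(f"missing elements: {missing}")
    sequence = ""
    coordinates = {}
    for name in ELEMENT_ORDER:
        start = len(sequence)
        sequence += parts[name]
        coordinates[name] = (start, len(sequence))
    return sequence, coordinates


# Placeholder sequences of lengths 3, 4, ..., 13 (NOT the real elements).
placeholder_parts = {name: "N" * (i + 3) for i, name in enumerate(ELEMENT_ORDER)}
precursor, coords = assemble_precursor(placeholder_parts)
```

Recording the coordinates of each element makes it straightforward to confirm, for example, that the coding region lies between the two exon sequences that remain in the circularized product.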


Linear RNAs that can be circularized to produce the circular RNAs (circRNAs) disclosed herein may be made by the same methods described in Example 1 above.


Circularized RNA is generated by ribozyme autocatalysis of the Group I intron. During splicing, the 3′ hydroxyl group of a guanosine nucleotide engages in a transesterification reaction at the 5′ splice site. The 5′ intron half is excised, and the freed hydroxyl group at the end of the intermediate engages in a second transesterification at the 3′ splice site, resulting in circularization of the intervening region and excision of the 3′ intron.


Unmodified linear mRNA or circRNA precursors are synthesized by in vitro transcription from a linearized plasmid DNA template using a T7 High Yield RNA Synthesis Kit (New England Biolabs). After in vitro transcription, reactions are treated with DNase I (New England Biolabs) for 20 min. After DNase treatment, unmodified linear mRNA is column purified using a MEGAclear Transcription Clean-up kit (Ambion). RNA is then heated to 70° C. for 5 min and immediately placed on ice for 3 min, after which the RNA is capped using mRNA cap-2′-O-methyltransferase (NEB) and Vaccinia capping enzyme (NEB) according to the manufacturer's instructions. Polyadenosine tails are added to capped linear transcripts using E. coli PolyA Polymerase (NEB) according to the manufacturer's instructions, and fully processed mRNA is column purified. For circRNA, after DNase treatment additional GTP is added to a final concentration of 2 mM, and then reactions are heated at 55° C. for 15 min. RNA is then column purified. In some cases, purified RNA is recircularized: RNA is heated to 70° C. for 5 min and then immediately placed on ice for 3 min, after which GTP is added to a final concentration of 2 mM along with a buffer including magnesium (50 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT, pH 7.5; New England Biolabs). RNA is then heated to 55° C. for 8 min, and then column purified. To enrich for circRNA, 20 μg of RNA is diluted in water (86 μL final volume) and then heated at 65° C. for 3 min and cooled on ice for 3 min. 20 U RNase R and 10 μL of 10× RNase R buffer (Epicentre) are added, and the reaction is incubated at 37° C. for 15 min; an additional 10 U RNase R is added halfway through the reaction. RNase R-digested RNA is column purified. RNA is separated on precast 2% E-gel EX agarose gels (Invitrogen) on the E-gel iBase (Invitrogen) using the E-gel EX 1-2% program; ssRNA Ladder (NEB) is used as a standard.
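
The protocol above specifies several reagents by final concentration rather than by volume. The following Python sketch shows the underlying dilution arithmetic under stated assumptions: the 100 mM GTP stock, the 49 μL starting reaction volume, and the 4 μL RNase R enzyme volume are hypothetical values chosen for illustration, and any GTP already present in the reaction is ignored. The function names are likewise hypothetical.

```python
def stock_volume_to_add(final_conc, stock_conc, start_volume):
    """Volume of stock to add to start_volume to reach final_conc.

    Solves stock_conc * v = final_conc * (start_volume + v) for v, so the
    dilution caused by the added volume itself is taken into account.
    Concentrations share one unit; volumes share one unit.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * start_volume / (stock_conc - final_conc)


def buffer_fold_in_reaction(buffer_fold, buffer_volume, total_volume):
    """Final fold-concentration of a concentrated buffer after dilution."""
    return buffer_fold * buffer_volume / total_volume


# GTP to 2 mM final from a hypothetical 100 mM stock into a 49 uL reaction.
gtp_ul = stock_volume_to_add(final_conc=2.0, stock_conc=100.0, start_volume=49.0)

# RNase R digest: 86 uL RNA in water + 10 uL of 10x buffer + enzyme.
rnase_r_total_ul = 86.0 + 10.0 + 4.0  # the 4.0 uL enzyme volume is hypothetical
buffer_x = buffer_fold_in_reaction(10.0, 10.0, rnase_r_total_ul)

# Total RNase R units per microgram of RNA (20 U initially, then 10 U more).
units_per_ug = (20.0 + 10.0) / 20.0
```

Under these assumptions the RNase R buffer ends at 1× only if the enzyme brings the reaction to 100 μL; a different enzyme volume shifts the final buffer concentration slightly.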


For gel extractions, bands corresponding to the circRNA are excised from the gel and then extracted using a Zymoclean Gel RNA Extraction Kit (Zymo Research). For high-performance liquid chromatography, 30 μg of RNA is heated at 65° C. for 3 min and then placed on ice for 3 min. RNA is run through a 4.6×300 mm size-exclusion column with particle size of 5 μm and pore size of 200 Å (Sepax Technologies; part number: 215980P-4630) on an Agilent 1100 Series HPLC (Agilent). RNA is run in RNase-free TE buffer (10 mM Tris, 1 mM EDTA, pH 6) at a flow rate of 0.3 mL/minute. RNA is detected by UV absorbance at 260 nm. Resulting RNA fractions are precipitated with 5 M ammonium acetate, resuspended in water, and then in some cases treated with RNase R as described above.
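
The column dimensions and flow rate given above allow a rough estimate of run time. The following Python sketch computes the geometric (empty-tube) volume of a 4.6×300 mm column and the time needed to pump one such volume at 0.3 mL/minute. The function names are hypothetical, and the estimate assumes an ideal cylinder; the liquid volume accessible inside a packed column is smaller, so this is best read as an approximate upper bound on elution time rather than a measured retention time.

```python
import math


def column_volume_ml(inner_diameter_mm, length_mm):
    """Geometric volume of a cylindrical column in mL (1 cm^3 == 1 mL)."""
    radius_cm = inner_diameter_mm / 20.0  # mm diameter -> cm radius
    length_cm = length_mm / 10.0
    return math.pi * radius_cm ** 2 * length_cm


def minutes_per_column_volume(volume_ml, flow_ml_per_min):
    """Time to pump one column volume at the given flow rate."""
    if flow_ml_per_min <= 0:
        raise ValueError("flow rate must be positive")
    return volume_ml / flow_ml_per_min


volume = column_volume_ml(4.6, 300.0)                 # about 4.99 mL
run_minutes = minutes_per_column_volume(volume, 0.3)  # about 16.6 min
```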


The resulting circRNA is shown in FIG. 1C.


Example 3. Gel-Electrophoresis and RNase R Resistance of circRNA

This example demonstrates the purity and exonuclease (RNase R) resistance of the purified circRNA.


First, a circRNA construct was designed comprising a nucleotide sequence encoding an RBD of a SARS-CoV-2 Spike protein, using the circRNA backbone as described in Examples 1 and 2 above.


Briefly, linear RNAs were designed that can be circularized to produce a circRNA, the linear RNAs comprising, from 5′ to 3′, a 5′ Homology arm-3′ catalytic Group I intron fragment-3′ exon sequence recognizable by the 3′ catalytic Group I intron fragment (i.e., Exon 2)-IRES-Kozak-SP-RBD-TAA stop codon-5′ exon sequence recognizable by the 5′ catalytic Group I intron fragment (i.e., Exon 1)-5′ catalytic Group I intron fragment-3′ Homology arm. The linear RNA is designed with, from 5′ to 3′, a 5′ homology arm (SEQ ID NO: 41), a 3′ catalytic Group I intron sequence (SEQ ID NO: 46), a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment (SEQ ID NO: 39), an IRES sequence (SEQ ID NO: 53), a Kozak sequence (SEQ ID NO: 37), a signal peptide coding sequence (SEQ ID NO: 16 or SEQ ID NO: 17), a Spike protein RBD sequence encoding the amino acid sequence shown in SEQ ID NO: 2 or a Spike protein sequence encoding the amino acid sequence set forth in SEQ ID NO: 63, a stop codon, a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment (SEQ ID NO: 40), a 5′ catalytic Group I intron fragment (SEQ ID NO: 47), and a 3′ homology arm (SEQ ID NO: 43). The circularized RNAs produced from these linear RNAs were termed circRNARBD and circRNASpike, respectively. As a control, the 3′ Intron sequence was mutated to a random sequence to prevent circularization of the RNA, and the resulting construct was termed LinRNARBD.


A circRNA was generated and purified as described in Example 2. The purified circRNARBD and precursor linear RNA (LinRNARBD, wherein the 3′ Intron sequence was mutated to a random sequence) were resolved by agarose gel electrophoresis. The gel electrophoresis results showed that the circRNARBD ran faster than LinRNARBD (FIG. 3A), indicating that the RNA was circularized. The circRNARBD is a circRNA that encodes the RBD of the Spike protein of SARS-CoV-2. The RBD consists of amino acid residues 319 to 542 of a full-length Spike protein of SARS-CoV-2, as shown in SEQ ID NO: 2. The circularization of circRNARBD was verified (FIG. 1b) by reverse transcription and RT-PCR analysis (FIG. 3C) using the specific primers shown in FIG. 3E.
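
Residue ranges such as 319 to 542 are stated in 1-based, inclusive numbering relative to the full-length S protein. The following Python sketch shows how such a range maps onto a sequence string and how long the resulting fragment is. The helper name `residue_range` is hypothetical, and the toy sequence is a repeating placeholder of the same length as a 1273-residue protein, not the actual Spike sequence.

```python
def residue_range(seq, first, last):
    """Return residues first..last of seq (1-based, inclusive)."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError("range lies outside the sequence")
    return seq[first - 1:last]


# Toy 1273-residue placeholder (NOT the real Spike sequence).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
toy_spike = "".join(AMINO_ACIDS[i % 20] for i in range(1273))

rbd = residue_range(toy_spike, 319, 542)   # RBD range used in this example
s2 = residue_range(toy_spike, 686, 1273)   # S2 range recited in Embodiment 30
```

With inclusive numbering, the RBD range spans 224 residues and the S2 range spans 588 residues.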


Next, the exonuclease resistance of the purified circRNA construct was tested. Because the circRNA has no free 5′ or 3′ end, it is resistant to exonucleases. The exonuclease RNase R was used to digest the circRNARBD or LinRNARBD for different times, and the reaction products were resolved by agarose gel electrophoresis. The gel electrophoresis results showed that the circRNARBD was more resistant to RNase R than the LinRNARBD (FIG. 3B).


Example 4. Expression of SARS-CoV-2 RBD Antigen Via circRNA Transfection in Human HEK293T Cells and Mouse NIH3T3 Cells

This example demonstrates the ability of the circRNA to express a protein (e.g., a SARS-CoV-2 RBD of an S protein) in eukaryotic cells. Additionally, this example demonstrates the surprising stability of the circRNA for two weeks at room temperature. After two weeks of incubation of the circRNA at room temperature, the protein could still be expressed and secreted in cells transfected with the circRNA.


After purification of the circRNA (RNase R treatment and HPLC), the circRNARBD was transfected into human HEK293T cells and mouse NIH3T3 cells with the Lipofectamine MessengerMAX Transfection Reagent (Thermo LMRNA003). The circRNA-EGFP and the precursor linear RNA, LinRNARBD, were used as controls. A quantitative ELISA showed that the RBD protein reached ˜143 ng/mL in the supernatant, 50-fold more than in the LinRNARBD group (FIG. 3D).


After 48 hours, the culture supernatant of transfected cells was collected for Western Blot analysis. Using the SARS-CoV-2 Spike RBD antibody (ABclonal, A20135) for detection, the Western Blot results showed that the circRNARBD could efficiently express the SARS-CoV-2 RBD antigen and secrete it into the cellular supernatant. The Western Blot results are shown in FIG. 4A and FIG. 4B.


The circRNA was stable at room temperature (about 25° C.). The purified circRNARBD was kept at room temperature (about 25° C.) for 3, 7 or 14 days, and was then transfected into human HEK293T cells. The Western Blot results showed that the circRNARBD could efficiently express the SARS-CoV-2 RBD antigen and secrete it into the cellular supernatant, even when the circRNARBD had been kept at room temperature for 14 days. The results are shown in FIG. 4C.


Moreover, to test the thermostability of circRNA-LNP formulations, the encapsulated circRNARBD-LNP was stored at 4° C. or room temperature (˜25° C.) for 1, 3, 7, 14, 24, and 31 days before transfection. The sequentially collected circRNARBD-LNPs were transfected into cells and the abundance of RBD antigen production was quantified by ELISA. No reduction of RBD antigens could be detected from circRNARBD-LNPs after storage for 7 days at either 4° C. or room temperature (˜25° C.) (FIG. 4D). The expression levels dropped to ˜95% and ˜75% after 14-day storage of circRNARBD-LNP at 4° C. and room temperature, respectively (FIG. 4D). Deterioration did occur with extended storage and higher temperature.


The stability of the circRNA at room temperature holds advantages for applications including vaccines and gene therapy, including for storage and transportation of the therapeutic circRNA (e.g., the circRNA vaccine).


Example 5. The SARS-CoV-2 RBD Antigen was Functional and could Block the Infection of SARS-CoV-2 Pseudovirus

This example demonstrates that a secreted SARS-CoV-2 RBD antigen expressed from an exemplary circRNA can directly interfere with infection of ACE2-expressing cells by a SARS-CoV-2 pseudovirus.


To evaluate whether the secreted SARS-CoV-2 RBD antigen produced by circRNA was functional, the cellular supernatant of HEK293T cells transfected with circRNARBD or control circRNA was incubated with the lentivirus-based SARS-CoV-2 pseudovirus encoding EGFP at 37° C. for 2 hours, and then the resulting SARS-CoV-2 pseudovirus/supernatant mixtures were added into the culture medium of ACE2-overexpressing cells named HEK293-ACE2. After 48 hours, the cells were collected for FACS analysis, as the SARS-CoV-2 pseudovirus expressed an EGFP fluorescence marker. Cellular expression of EGFP indicated infection of cells by the SARS-CoV-2 pseudovirus. The commercial SARS-CoV-2 neutralizing antibody (ABclonal, A19215) was used as a positive control for neutralization of the SARS-CoV-2 S protein.


This pseudovirus competition experiment demonstrated that the secreted SARS-CoV-2 RBD antigen in the supernatant produced by cells transfected with the circRNARBD could efficiently block infection by the SARS-CoV-2 pseudovirus, indicating that the circRNA-produced SARS-CoV-2 RBD antigen was functional at the cellular level. The secreted RBD antigen was able to interfere with binding between the RBD of the SARS-CoV-2 pseudovirus and the ACE2 receptor, thus blocking infection of the cells. The results are shown in FIG. 5A and FIG. 5B.


Example 6. The circRNA Vaccine could Induce SARS-CoV-2 Specific Immune Response and Generate High-Level Neutralizing Antibody

As demonstrated in Example 5 above, an RBD antigen expressed from an exemplary circRNA can directly interfere with binding of the SARS-CoV-2 S protein with the ACE2 receptor, thereby preventing or reducing infection of cells by a SARS-CoV-2 pseudovirus. This example demonstrates that an antigenic polypeptide (e.g., an RBD of a coronavirus S protein) expressed from a circRNA administered in vivo stimulates a specific immune response and generates a high level of neutralizing antibodies. Based on these results, the circRNAs described herein encoding antigenic polypeptides can serve as effective vaccines against viruses such as coronaviruses (e.g., SARS-CoV-2).


The purified circRNARBD (the circRNA backbone as shown in FIG. 1B comprising a nucleotide sequence encoding the amino acid sequence shown in SEQ ID NO: 2 as the “Spike” in FIG. 1B) and circRNASpike (the circRNA backbone as shown in FIG. 1B comprising a nucleotide sequence encoding the amino acid sequence shown in SEQ ID NO: 62 as the “Spike” in FIG. 1B) were each used to immunize BALB/c mice. The first immunization was conducted via intramuscular injection at day 0, and a second dose was given at day 14 to boost the immune response (FIG. 6A). At day 28, the serum of the immunized mice was collected for the following assays (FIG. 6A).


First, the RBD-specific IgG titer was measured by ELISA. The IgG titer of the circRNARBD (10 μg) group was about 32000 and that of the circRNARBD (50 μg) group was about 64000, while the Placebo group had almost no RBD-specific IgG signal (FIG. 6B). Second, an in vitro surrogate neutralizing assay was used to measure the neutralization activity of immunized mouse serum; the circRNARBD (10 μg) group had about 70% neutralization activity, and the circRNARBD (50 μg) group had over 95% neutralization activity (FIG. 6C). Finally, the lentivirus-based SARS-CoV-2 pseudovirus coated with SARS-CoV-2 spike protein was used to evaluate the neutralizing activity at the cell level. The serum of immunized mice was incubated with SARS-CoV-2 pseudovirus, and the mixture was then added into the culture of ACE2-overexpressing HEK293T cells. 48 hours later, the luciferase reporter activity of the pseudovirus was measured. The luciferase assay results showed that both the circRNARBD and circRNASpike could induce SARS-CoV-2 spike-specific neutralizing antibodies that block pseudovirus infection (FIG. 6D and FIG. 6E).


The above results proved that the circRNA vaccine could induce SARS-CoV-2 specific immune response and generate a high level of SARS-CoV-2 spike specific neutralizing antibody.


Example 7. Measuring the Weight of Mouse Spleen after Two-Dose Immunization

This example demonstrates the effect of immunization with an exemplary circRNA encoding an antigenic polypeptide (circRNARBD) on the weight of the spleen in mice following two-dose immunization.


The circRNA dosing scheme is shown in FIG. 6A. Four weeks after the second dose of circRNA vaccine or placebo, the mice were sacrificed and the spleens of immunized mice were isolated (FIG. 7A). The weight of each mouse was then measured, and the spleen weight in the circRNARBD (10 μg) and circRNARBD (50 μg) groups was significantly higher than in the placebo group (FIG. 7B).


Example 8. Expression of SARS-CoV-2 Neutralizing Antibody Via circRNA

This example demonstrates expression of a secreted virus neutralizing antibody using exemplary circRNAs. Neutralizing antibodies expressed and secreted from cells transfected with the circRNAs described herein could effectively block infection by a SARS-CoV-2 pseudovirus.


The circRNA could also be used to express SARS-CoV-2 neutralizing antibody. Similar to the above RBD antigen, the SARS-CoV-2 neutralizing antibody-coding sequence was also circularized via the above circularization method (FIG. 8A).


Linear RNAs were designed that can be circularized to circRNAs, the linear RNA comprising, from 5′ to 3′, a 5′ Homology arm-3′ catalytic Group I intron fragment-3′ exon sequence recognizable by the 3′ catalytic Group I intron fragment (i.e., Exon 2)-IRES-Kozak-SP-RBD-TAA stop codon-5′ exon sequence recognizable by the 5′ catalytic Group I intron fragment (i.e., Exon 1)-5′ catalytic Group I intron fragment-3′ Homology arm. The linear RNA is designed with, from 5′ to 3′, a 5′ homology arm (SEQ ID NO: 41), a 3′ catalytic Group I intron sequence (SEQ ID NO: 46), a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment (SEQ ID NO: 39), an IRES sequence (SEQ ID NO: 53), a Kozak sequence (SEQ ID NO: 37), a signal peptide coding sequence (SEQ ID NO: 16 or SEQ ID NO: 17), a nucleotide sequence encoding a nAb (nAb-1 (the amino acid sequence shown in SEQ ID NO: 27), nAb-2 (the amino acid sequence shown in SEQ ID NO: 28), or nAb-5 (the amino acid sequence shown in SEQ ID NO: 30)), a stop codon, a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment (SEQ ID NO: 40), a 5′ catalytic Group I intron fragment (SEQ ID NO: 47), and a 3′ homology arm (SEQ ID NO: 43). The circularized RNAs produced from these linear RNAs were termed circRNAnAb-1, circRNAnAb-2, and circRNAnAB-5, respectively. As a control, the 3′ Intron sequence of the linear construct designed to generate circRNAnAB-5 was mutated to a random sequence to prevent circularization of the RNA, and the resulting construct was termed LinRNAnAB-5.
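The element order of the linear precursor described above can be written down as a small data structure. The following is a minimal sketch, not the applicants' software: the `Element` class, the `retained` flag, and the `circular_product` helper are hypothetical names introduced for illustration. The `retained` flags assume the usual outcome of permuted intron-exon self-splicing, in which the homology arms and the catalytic intron fragments are excised and the region between them is joined into a circle; that outcome is an assumption of this sketch rather than a statement of this passage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str       # element name as used in the text
    seq_id: str     # SEQ ID NO cited in the text ("" when none is cited)
    retained: bool  # ASSUMPTION: True if the element remains in the circRNA


# 5' -> 3' order of the linear precursor encoding a nAb, as listed in the text.
LINEAR_PRECURSOR = [
    Element("5' homology arm", "41", False),
    Element("3' catalytic Group I intron fragment", "46", False),
    Element("3' exon (Exon 2)", "39", True),
    Element("IRES", "53", True),
    Element("Kozak", "37", True),
    Element("signal peptide coding sequence", "16 or 17", True),
    Element("nAb coding sequence", "27, 28, or 30 (amino acid)", True),
    Element("stop codon", "", True),
    Element("5' exon (Exon 1)", "40", True),
    Element("5' catalytic Group I intron fragment", "47", False),
    Element("3' homology arm", "43", False),
]


def circular_product(elements):
    """Return the names of the elements assumed to remain after circularization."""
    return [e.name for e in elements if e.retained]


if __name__ == "__main__":
    for name in circular_product(LINEAR_PRECURSOR):
        print(name)
```

Under these assumptions the circular product begins at Exon 2 and ends at Exon 1, which are joined to close the circle.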


Circular RNAs were generated comprising a nucleotide sequence encoding nAb-1, nAb-2, nAb-3, nAb-4, nAb-5, nAb-6, nAb-7H, or nAb-7L. The amino acid sequences of the neutralizing antibodies are shown in SEQ ID NOs: 26-33, respectively. Alternatively, an antibody that binds to ACE2 and blocks binding of the S protein can be used, such as the amino acid sequences shown in SEQ ID NO: 34 or 35.


Exemplary circRNAs encoding nAbs (circRNAnAb-1, circRNAnAb-2, and circRNAnAB-5) were transfected into HEK293T cells, and 48 hours later the supernatant was collected and used to conduct the SARS-CoV-2 pseudovirus neutralization assay. A circRNA encoding luciferase (circRNALuc) and a linear precursor RNA LinRNAnAB-5 were used as negative controls, and the commercial SARS-CoV-2 neutralizing antibody (ABclonal, A19215) was used as the positive control.


The pseudovirus neutralization assay results demonstrated that circRNAnAb-1, circRNAnAb-2, and circRNAnAB-5 could neutralize the infection of SARS-CoV-2 pseudovirus compared to negative controls (FIG. 8B). These results indicate that the circRNA can be utilized to express neutralizing antibodies for therapeutic purposes, such as to treat a coronavirus (e.g., SARS-CoV-2) infection. Pseudovirus neutralization assays showed that supernatants of HEK293T cells transfected with circRNAnAB or circRNAhACE2 decoys could effectively inhibit infection by pseudovirus based on the wild-type SARS-CoV-2 S protein (FIG. 8C).


Next, we tested neutralizing antibodies against the recently emerged SARS-CoV-2 variants, including B.1.1.7/501Y.V1 and B.1.351/501Y.V2, by pseudovirus assays. The supernatants of circRNAnAB1-Tri and circRNAnAB3-Tri transfected cells effectively blocked B.1.1.7/501Y.V1 and D614G pseudovirus infection (FIG. 8D). However, both nanobodies showed markedly decreased neutralizing activity against the B.1.351/501Y.V2 variant (FIG. 8D). The hACE2 decoys showed no inhibition activity against the B.1.1.7/501Y.V1 and B.1.351/501Y.V2 variants (FIG. 8D).


In this example, circRNA-encoded SARS-CoV-2 nanobodies showed strong neutralizing ability against the SARS-CoV-2 native, D614G and B.1.1.7/501Y.V1 strains in vitro, but the B.1.351/501Y.V2 variant escaped them completely (FIG. 8D). Beyond viral receptors, this circRNA expression platform holds the potential to become therapeutic drugs, encoding therapeutic antibodies in vivo, e.g., anti-PD1/PD-L1 antibodies. Compared with antibody protein drugs, the circRNAs could reach intracellular targets, such as TP53 and KRAS, because they encode therapeutic antibodies in the cytoplasm, bypassing the cell membrane barrier.


Example 9. A circRNA Encoding IDUA could Restore the Catalytic Activity of α-1-Iduronidase (IDUA) in Primary Cells from Hurler Syndrome Mouse Models

The above examples describe exemplary circRNA backbones, generation and purification of circRNAs, use of circRNAs to produce antigenic polypeptides that can effectively generate an immune response in vivo for use as vaccines, and circRNA expression of neutralizing antibodies to treat an infection (e.g., a SARS-CoV-2 infection). However, the circRNAs described herein can also be applied to therapy of other diseases that would benefit from expression of a therapeutic polypeptide, such as genetic diseases associated with a deficiency in a protein or functional protein. This example provides results demonstrating that the circRNA can be used to express a functional protein such as an enzyme (e.g., IDUA). Therefore, the circRNAs provided herein can be used for production of functional therapeutic polypeptides for gene therapy applications.


Instead of the SARS-CoV-2 RBD/Spike antigen, the functional wildtype disease-related proteins can also be expressed and function via the circRNAs and methods described herein. In an example, the mouse α-1-iduronidase (IDUA) coding sequence was inserted into the circRNA backbone to generate circRNAIDUA (FIG. 9A).


Briefly, a linear RNA was designed that can be circularized to produce a circRNA, the linear RNAs comprising, from 5′ to 3′, a 5′ Homology arm-3′ catalytic Group I intron fragment-3′ exon sequence recognizable by the 3′ catalytic Group I intron fragment (i.e., Exon 2)-IRES-Kozak-SP-RBD-TAA stop codon-5′ exon sequence recognizable by the 5′ catalytic Group I intron fragment (i.e., Exon 1)-5′ catalytic Group I intron fragment-3′ Homology arm. The linear RNA is designed with, from 5′ to 3′, a 5′ homology arm (SEQ ID NO: 41), a 3′ catalytic Group I intron sequence (SEQ ID NO: 46), a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment (SEQ ID NO: 39), an IRES sequence (SEQ ID NO: 53), a Kozak sequence (SEQ ID NO: 37), a signal peptide coding sequence (SEQ ID NO: 16 or SEQ ID NO: 17), a nucleotide sequence encoding IDUA (amino acid sequence set forth in SEQ ID NO: 18), a stop codon, a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment (SEQ ID NO: 40), a 5′ catalytic Group I intron fragment (SEQ ID NO: 47), and a 3′ homology arm (SEQ ID NO: 43). The circularized RNA produced from this linear RNA was termed circRNAIDUA. As a control, the 3′ Intron sequence was mutated to a random sequence to prevent circularization of the RNA, and the resulting construct was termed LinRNAIDUA.


The circRNAIDUA was circularized and purified according to the method described in Example 2, and then was transfected into primary MEF cells from Hurler Syndrome Mouse models or human HEK293T/IDUA−/− cells with a cationic-lipid transfection reagent, Lipofectamine™ MessengerMAX Transfection Reagent (Thermo LMRNA003).


After 48 hours, the catalytic activity of α-1-iduronidase was detected with the reported α-1-iduronidase assay (Qu et al., Nature Biotechnology, vol 37, September 2019, 1059-1069). The α-1-iduronidase assay showed that the circRNAIDUA could efficiently recover the catalytic activity of α-1-iduronidase in the primary MEF cells from Hurler Syndrome mouse models as well as in the human HEK293T/IDUA−/− cells, indicating that the circRNAIDUA was functional in Hurler Syndrome mouse-derived primary cells. The results are shown in FIG. 9B and FIG. 9C.


Example 10. In Vivo Restoration of the Catalytic Activity of α-1-Iduronidase in Hurler Syndrome Mouse Models

Example 9 above demonstrated that an exemplary circRNA could express a functional enzyme in mouse or human cells deficient for the enzyme, thereby restoring the protein function in those cells. The present example demonstrates that an exemplary circRNA can be used to restore function of a protein in vivo. Moreover, the protein expressed from the circRNA can restore protein function for an extended period (e.g., at least 24 hours).


The purified circRNAIDUA (30 μg per dose) was encapsulated and delivered into Hurler Syndrome (IDUA deficient) mice via tail-vein injection. After 4 hours or 24 hours, the Hurler Syndrome mice were sacrificed to isolate the liver tissues, and α-1-iduronidase activity was assayed in the isolated liver tissues.


The α-1-iduronidase assay on the livers of mice injected with circRNAIDUA or control showed that circRNAIDUA could efficiently restore the catalytic activity of α-1-iduronidase in Hurler Syndrome mouse models, reaching nearly 20% of the activity of the wildtype mouse (FIG. 10). Moreover, the catalytic activity increased from 4 hours to 24 hours, indicating that the effect of circRNAIDUA is long lasting and could be utilized for the therapy of genetic diseases.


Example 11. SARS-CoV-2 circRNA Vaccines Elicit Sustained Humoral Immune Responses with High-Level Neutralizing Antibodies

With its stability and immunogen-coding capability, we reasoned that circRNA could be developed into a new type of vaccine. We then attempted to assess the immunogenicity of circRNARBD encapsulated with lipid nanoparticle in BALB/c mice (FIG. 11A). The circRNARBD encapsulation efficiency was greater than 93%, with an average size of 100 nm in diameter (FIG. 11B). Animals were immunized with LNP-circRNARBD through intramuscular injection twice, using a dose of 10 μg or 50 μg per mouse at a two-week interval, while empty LNP was used as the placebo (FIG. 11C). The amount of RBD-specific IgG and pseudovirus neutralization activity were evaluated at two or five weeks post LNP-circRNARBD boost.


The circRNAs were encapsulated with lipid nanoparticle (LNP) through a previously described process (Ickenstein, L. M. & Garidel, P. Lipid-based nanoparticle formulations for small molecules and RNA drugs. Expert Opin Drug Deliv 16, 1205-1226, doi:10.1080/17425247.2019.1669558 (2019), the contents of which are herein incorporated by reference). Briefly, the circRNAs were diluted in 50 mM citrate buffer (pH 3.0) and the lipids were dissolved and mixed in ethanol at molar ratios of 50:10:38.5:1.5 (MC3-lipid:DSPC:cholesterol:PEG2000-DMG). The lipid mixture was then mixed with the circRNA solution at a volume ratio of 1:3 in the NANOASSEMBLR BENCHTOP (PRECISION, #NIT0046). The LNP-circRNA formulations were then diluted 40-fold with 1×PBS buffer (pH 7.2~7.4) and concentrated by ultrafiltration with an Amicon® Ultra Centrifugal Filter Unit (Millipore). The concentration and encapsulation rate of circRNAs were measured with the Quant-iT™ RiboGreen™ RNA Assay Kit (Invitrogen™ #R11490). The size of LNP-circRNA particles was measured using dynamic light scattering on a Malvern Zetasizer Nano-ZS 300 (Malvern). Samples were irradiated with a red laser (λ=632.8 nm) and scattered light was detected at a backscattering angle of 173°. Results were analyzed to obtain an autocorrelation function using the software (Zetasizer V7.13).
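The ratios in the formulation step above can be turned into working quantities. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the 3 mL circRNA volume is an arbitrary illustration, and "diluted 40-fold" is read as a final volume forty times the mixed volume. Only the molar ratio, the 1:3 volume ratio, and the 40-fold dilution come from the text.

```python
# Molar ratio from the text (MC3-lipid:DSPC:cholesterol:PEG2000-DMG).
MOLAR_RATIO = {
    "MC3-lipid": 50.0,
    "DSPC": 10.0,
    "cholesterol": 38.5,
    "PEG2000-DMG": 1.5,
}


def mole_fractions(ratio):
    """Convert a molar ratio into mole fractions that sum to 1."""
    total = sum(ratio.values())
    return {lipid: parts / total for lipid, parts in ratio.items()}


def mixing_volumes(rna_volume_ml, lipid_parts=1.0, rna_parts=3.0):
    """Return (lipid/ethanol volume, total mixed volume) for a lipid:RNA volume ratio."""
    lipid_volume_ml = rna_volume_ml * lipid_parts / rna_parts
    return lipid_volume_ml, lipid_volume_ml + rna_volume_ml


def ethanol_fraction_after_dilution(lipid_volume_ml, mixed_volume_ml, fold=40.0):
    """Ethanol volume fraction after diluting the mixture `fold`-fold.

    ASSUMPTION: the lipid phase is pure ethanol and volumes are additive.
    """
    return (lipid_volume_ml / mixed_volume_ml) / fold


if __name__ == "__main__":
    print(mole_fractions(MOLAR_RATIO))
    lipid_ml, mixed_ml = mixing_volumes(3.0)  # hypothetical 3 mL of circRNA solution
    print(lipid_ml, mixed_ml, ethanol_fraction_after_dilution(lipid_ml, mixed_ml))
```

With these assumptions, the ethanol content drops from 25% by volume at mixing to well under 1% after the PBS dilution.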


High titers of RBD-specific IgG were elicited by circRNARBD in a dose-dependent manner, ~3×10⁴ and ~1×10⁶ for each dose and for both 2 and 5 weeks post boost, indicating that circRNARBD could induce long-lasting antibodies against SARS-CoV-2 RBD (FIG. 11D).


To test the antigen-specific binding capability of IgG from vaccinated animals, we performed a surrogate neutralization assay. In line with the amount of RBD-specific IgG (FIG. 11D), antibodies elicited by circRNARBD vaccines showed evident neutralizing capacity in a dose-dependent manner, with an NT50 of ~2×10⁴ for the dose of 50 μg (FIG. 11E, FIG. 11F).
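The text reports NT50 values but does not state how they were calculated. One common way to estimate a 50% neutralization titer from a serum dilution series is log-linear interpolation between the two dilutions that bracket 50% neutralization. The sketch below illustrates that approach only; the function name and the example readings are hypothetical and are not data from this application.

```python
import math


def nt50_log_interpolation(dilutions, percent_neutralization):
    """Estimate NT50 as a reciprocal serum dilution.

    `dilutions` are reciprocal dilutions in ascending order and
    `percent_neutralization` the matching percent values, expected to fall
    as the serum is diluted. Returns None if 50% is never bracketed.
    """
    points = list(zip(dilutions, percent_neutralization))
    for (d_low, n_high), (d_high, n_low) in zip(points, points[1:]):
        if n_high >= 50.0 >= n_low:
            if n_high == n_low:
                return float(d_low)
            # Fraction of the way from d_low to d_high on a log10 scale.
            fraction = (n_high - 50.0) / (n_high - n_low)
            log_d = math.log10(d_low) + fraction * (
                math.log10(d_high) - math.log10(d_low)
            )
            return 10.0 ** log_d
    return None


# Hypothetical readings for illustration only.
example_dilutions = [1e2, 1e3, 1e4, 1e5]
example_neutralization = [95.0, 80.0, 60.0, 40.0]
estimate = nt50_log_interpolation(example_dilutions, example_neutralization)
```

A four-parameter logistic fit is another frequently used option; interpolation is shown here because it needs no fitting library.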


We further demonstrated that sera from circRNARBD-vaccinated mice could neutralize both SARS-CoV-2 pseudovirus (FIG. 11G) and authentic SARS-CoV-2 virus (FIG. 11H), with an NT50 of ~5.6×10³ (FIG. 11G) and an NT50 of ~2.2×10⁵ (FIG. 11H) in mice immunized with 50 μg of circRNARBD, respectively. The large amount of RBD-specific IgG, potent RBD antigen neutralization, and sustained SARS-CoV-2 neutralizing capacity suggest that circRNARBD vaccines did induce a long-lasting humoral immune response in mice.


Example 12. SARS-CoV-2 circRNA Vaccines Induce Strong T Cell Immune Responses in the Spleen

B cells (the source of antibodies), CD4+ T cells, and CD8+ T cells are three pillars of adaptive immunity, and they mediate effector functions that have been associated with the control of SARS-CoV-2 in both non-hospitalized and hospitalized cases of COVID-19.


To probe CD4+ and CD8+ T cell immune responses in circRNARBD vaccinated mice (5 weeks post-boost), splenocytes were stimulated with SARS-CoV-2 Spike RBD pooled peptides (Table E1 below), and cytokine-producing T cells were quantified by intracellular cytokine staining among effector memory T cells (Tem, CD44+CD62L−). When stimulated with RBD peptide pools, CD4+ T cells of mice immunized with circRNARBD vaccines exhibited Th1-biased responses, producing interferon-γ (IFN-γ), tumor necrosis factor-α (TNF-α), and interleukin-2 (IL-2) (FIG. 12A, FIG. 12B), but not interleukin-4 (IL-4), indicating that circRNARBD vaccines mainly induced Th1- but not Th2-biased immune responses. In addition, multiple cytokine-producing CD8+ T cells were detected in circRNARBD vaccinated mice (FIG. 12C, FIG. 12D). For unknown reasons, 10 μg of circRNARBD elicited stronger immune responses in both CD4+ and CD8+ effector memory T cells than 50 μg (FIGS. 12A-12D), while the latter induced higher potency of neutralizing antibodies in the B cell responses (FIG. 11G and FIG. 11H).









TABLE E1
Peptide sequences of RBD antigens

Name    Sequence              SEQ ID NO
S-45    GIYQTSNFRVQPTESIVR    65
S-46    RVQPTESIVRFPNITNL     66
S-47    IVRFPNITNLCPFGEVF     67
S-48    TNLCPFGEVFNATRFASV    68
S-49    VFNATRFASVYAWNRKRI    69
S-50    SVYAWNRKRISNCVADY     70
S-51    KRISNCVADYSVLYNSA     71
S-52    ADYSVLYNSASFSTFKCY    72
S-53    SASFSTFKCYGVSPTKL     73
S-54    KCYGVSPTKLNDLCFTNV    74
S-55    KLNDLCFTNVYADSFVIR    75
S-56    NVYADSFVIRGDEVRQIA    76
S-57    IRGDEVRQIAPGQTGKIA    77
S-58    IAPGQTGKIADYNYKL      78
S-59    GKIADYNYKLPDDFTGCV    79
S-60    KLPDDFTGCVIAWNSNNL    80
S-61    CVIAWNSNNLDSKVGGNY    81
S-62    NLDSKVGGNYNYLYRLFR    82
S-63    NYNYLYRLFRKSNLKPF     83
S-64    LFRKSNLKPFERDISTEI    84
S-65    PFERDISTEIYQA         85
S-66    RDISTEIYQAGSTPCNGV    86
S-67    YQAGSTPCNGVEGFNCYF    87
S-68    NGVEGFNCYFPLQSYGF     88
S-69    CYFPLQSYGFQPTNGVGY    89
S-70    GFQPTNGVGYQPYRVVVL    90
S-71    GYQPYRVVVLSFELLHA     91
S-72    VVLSFELLHAPATVCGPK    92
S-73    HAPATVCGPKKSTNLVK     93
S-74    GPKKSTNLVKNKCVNFNF    94
S-75    VKNKCVNFNFNGLTGTGV    95
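The peptides in Table E1 overlap their neighbors, so consecutive entries can be stitched back into a contiguous stretch of the S protein. The following is a minimal sketch of that check using the first three peptides of the table and a fragment of SEQ ID NO: 1; the function names are hypothetical and only three peptides are included to keep the example short.

```python
def suffix_prefix_overlap(left, right):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge_overlapping(peptides):
    """Merge consecutive overlapping peptides; return (merged sequence, overlaps)."""
    merged = peptides[0]
    overlaps = []
    for nxt in peptides[1:]:
        k = suffix_prefix_overlap(merged, nxt)
        overlaps.append(k)
        merged += nxt[k:]
    return merged, overlaps


# S-45, S-46, and S-47 from Table E1.
PEPTIDES = [
    "GIYQTSNFRVQPTESIVR",
    "RVQPTESIVRFPNITNL",
    "IVRFPNITNLCPFGEVF",
]

# Fragment of the full-length S protein (SEQ ID NO: 1) spanning these peptides.
S_FRAGMENT = "KCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISN"

merged_sequence, overlap_lengths = merge_overlapping(PEPTIDES)
covers_s_protein = merged_sequence in S_FRAGMENT
```

The same helper can be run over the full table to confirm that each peptide continues the previous one without gaps.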










Collectively, these results demonstrated that SARS-CoV-2 circRNARBD vaccines could induce high levels of humoral and cellular immune responses in mice. In this report, circRNARBD-501Y.V2 immunized mice produced high titers of neutralizing antibodies. Given that the K417N-E484K-N501Y mutant RBD has reduced interactions with certain neutralizing antibodies, as shown in Example 8, we also demonstrated that neutralizing antibodies produced by mice immunized with circRNARBD or circRNARBD-501Y.V2 had preferential neutralizing abilities against their corresponding virus strains. Recent studies suggested that 501Y.V2 showed no higher infectivity but had immune escape capability, and a variety of vaccines were reported to be less effective against SARS-CoV-2 variants. Vaccine breakthrough infections with SARS-CoV-2 variants have also been reported. Thus, it is important to develop and implement vaccines against emerging variants, and the circRNA vaccine is a platform that could be rapidly tailored for specific variants. For example, a vaccine containing the E484K, N501Y, and L452R mutations in the RBD can be developed quickly via the circRNA platform to deal with a potential outbreak caused by SARS-CoV-2 variants (L452R being found in the recently reported B.1.617 variant emerging in India and in the B.1.429 variant that has emerged in the USA).


We highlight this generalizable strategy for designing immunogens. The coding sequence of circular RNA can be quickly adapted to deal with any emerging SARS-CoV-2 variants, such as the recently reported B.1.1.7/501Y.V1, B.1.351/501Y.V2, P.1/501Y.V3 and B.1.617 variants. Moreover, circular RNAs could be quickly generated in large quantities in vitro, and they do not require any nucleotide modification.
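Adapting the encoded antigen to a variant amounts, at the protein level, to applying point substitutions to the RBD sequence. The following is a minimal sketch of that step, using the RBD sequence listed as SEQ ID NO: 2 with numbering starting at S protein residue 319. The helper name `apply_mutations` is hypothetical, and the sketch covers only the amino acid sequence, not codon choice or construct design.

```python
import re

RBD_START = 319  # S protein numbering of the first residue of SEQ ID NO: 2

# RBD sequence as listed for SEQ ID NO: 2.
RBD_WT = (
    "RVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFK"
    "CYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWN"
    "SNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGF"
    "QPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNF"
)


def apply_mutations(seq, mutations, start=RBD_START):
    """Apply substitutions such as 'K417N' given in S protein numbering.

    Raises ValueError if a mutation is malformed, out of range, or if the
    stated wild-type residue does not match the sequence.
    """
    residues = list(seq)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"cannot parse mutation {mutation!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        index = position - start
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation} lies outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(f"{mutation}: found {residues[index]} at {position}")
        residues[index] = new
    return "".join(residues)


# RBD carrying the three B.1.351/501Y.V2 substitutions named in the text.
RBD_501Y_V2 = apply_mutations(RBD_WT, ["K417N", "E484K", "N501Y"])
```

The wild-type check guards against numbering mistakes, which are easy to make when a fragment rather than the full-length protein is edited.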


Example 13. SARS-CoV-2 circRNARBD-501Y.V2 Vaccine-Elicited Antibodies Show Preferential Neutralizing Activity Against B.1.351 Variant

Next, we evaluated the efficacy of a circRNA vaccine encoding RBD/K417N-E484K-N501Y derived from the B.1.351/501Y.V2 variant, termed circRNARBD-501Y.V2 (FIG. 13A). BALB/c mice were immunized with an i.m. injection of the circRNARBD-501Y.V2 vaccine, followed by a boost at a two-week interval. The immunized mice's sera were collected at 1 and 2 weeks post the boost. The ELISA showed that the RBD-501Y.V2-specific IgG titer reached 7×10⁴ at 2 weeks post boost (FIG. 13B). The surrogate neutralization assay showed that sera of circRNARBD-501Y.V2 immunized mice effectively neutralized RBD antigens (FIG. 13C). We then went on to assess the neutralization activity of the sera from mice immunized with circRNARBD or circRNARBD-501Y.V2 vaccines against D614G, B.1.1.7/501Y.V1, or B.1.351/501Y.V2 variants. VSV-based pseudovirus neutralization assay revealed that antibodies elicited by circRNARBD vaccines, which encode the native RBD sequence, effectively neutralized all three viral strains, with the highest activity against the D614G strain (FIG. 13D). The circRNARBD-501Y.V2 immunized mouse serum could also neutralize all three pseudoviruses, with the highest neutralizing activity against its corresponding variant, 501Y.V2 (FIG. 13E).


We further tested the neutralizing capacity of the circRNARBD-501Y.V2 immunized mouse serum against authentic SARS-CoV-2 strains. In line with the pseudovirus neutralization assay, the serum could effectively neutralize the authentic SARS-CoV-2 B.1.351/501Y.V2 strain with an NT50 of 7.1×10⁴ (FIG. 13F), and could neutralize the authentic SARS-CoV-2 D614G strain less effectively, with an NT50 of 9.8×10³ (FIG. 13G). Collectively, circRNA vaccine-elicited antibodies showed the best neutralizing activity against their corresponding variant strains. It is worth noting that both vaccines could neutralize all three strains, albeit with variable efficacies. Nevertheless, updated vaccines for corresponding variant strains or multivalent vaccines might provide better protection against both the native SARS-CoV-2 strain and its circulating variants.
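The preference of the circRNARBD-501Y.V2 immune serum for its matching strain can be expressed as a fold difference between the two NT50 values reported above for the authentic virus assays. The short calculation below is a worked illustration only; the variable names are hypothetical.

```python
import math

# NT50 values reported in the text for circRNARBD-501Y.V2 immune serum.
NT50_B1351 = 7.1e4  # authentic B.1.351/501Y.V2 strain (FIG. 13F)
NT50_D614G = 9.8e3  # authentic D614G strain (FIG. 13G)

# Fold difference in neutralizing titer, matched versus mismatched strain.
fold_difference = NT50_B1351 / NT50_D614G

# The same difference on a log10 scale, as titers are often plotted.
log10_difference = math.log10(fold_difference)

if __name__ == "__main__":
    print(f"{fold_difference:.1f}-fold ({log10_difference:.2f} log10 units)")
```

On these figures the serum neutralizes the matched strain at a titer roughly seven times higher than the D614G strain.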


Example 14. Long-Lasting Protection of circRNARBD-501Y.V2 Vaccines Against Authentic B.1.351 Strain in a Novel Mouse Model

To further evaluate the protective efficacy of SARS-CoV-2 circRNARBD-501Y.V2 vaccines in vivo, we employed the B.1.351/501Y.V2 strain for authentic virus challenge experiments due to its severe antibody escape capability. Consistent with a recent report, the B.1.351/501Y.V2 variant could infect BALB/c mice and replicate in their lungs, possibly due to the mutations in the spike protein, especially in the RBD, such as K417N, E484K and N501Y. We therefore employed BALB/c mice for assessing the protective efficacy of SARS-CoV-2 circRNARBD-501Y.V2 vaccines. The BALB/c mice received a two-dose immunization of 50 μg circRNARBD-501Y.V2 vaccine or placebo via the i.m. route, at a two-week interval (FIG. 14A). To evaluate the long-term protection of circRNA vaccines, each immunized mouse was challenged with 5×10⁴ PFU of authentic SARS-CoV-2 B.1.351/501Y.V2 strain via the intranasal (i.n.) route at 7 weeks post the boost dose, and the lung tissues were collected 3 days after challenge for detecting viral RNA (FIG. 14A). Three days before virus challenge, the sera of immunized mice were collected to detect the RBD-501Y.V2-specific IgG (FIG. 14A). Nearly two months after the immunization, the titer of RBD-501Y.V2-specific IgG was around 2×10⁴ (FIG. 14B), and the serum showed significant neutralizing capacity against RBD-501Y.V2 antigens (FIG. 14C).
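The intervals in the challenge protocol above can be laid out as study days. The following is a minimal sketch that assumes the first immunization is counted as day 0, which this passage does not state explicitly; the dictionary keys are hypothetical labels.

```python
DAYS_PER_WEEK = 7

# ASSUMPTION: the first (prime) immunization is counted as study day 0.
prime_day = 0
boost_day = prime_day + 2 * DAYS_PER_WEEK      # two-week interval between doses
challenge_day = boost_day + 7 * DAYS_PER_WEEK  # challenge 7 weeks post boost
serum_day = challenge_day - 3                  # sera collected 3 days before challenge
lung_day = challenge_day + 3                   # lungs collected 3 days after challenge

schedule = {
    "prime": prime_day,
    "boost": boost_day,
    "serum collection": serum_day,
    "virus challenge": challenge_day,
    "lung collection": lung_day,
}

if __name__ == "__main__":
    for event, day in sorted(schedule.items(), key=lambda item: item[1]):
        print(f"day {day:>2}: {event}")
```

Under this assumption the sera are drawn on day 60, in line with the statement that the IgG titer was measured nearly two months after the immunization.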


Moreover, we found that the mice in the placebo group underwent greater weight loss than the vaccinated mice (FIG. 14D). Consistently, virus titers in the lungs of vaccinated mice were significantly decreased compared with those of mice that received placebo (FIG. 14E). These results indicated that the circRNARBD-501Y.V2 vaccine could effectively protect the mice from infection by the SARS-CoV-2 B.1.351/501Y.V2 variant.












EXEMPLARY SEQUENCES















SEQ ID NO: 1 Full-length S protein sequence of SARS-COV-2


MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFS


NVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIV


NNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMD


LEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQT


LLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSET


KCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISN


CVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIA


DYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGST


PCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKN


KCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVS


VITPGTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEH


VNNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPT


NFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDK


NTQEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQY


GDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIP


FAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQN


AQALNTLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAA


EIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKN


FTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVN


NTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLN


ESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCC


KFDEDDSEPVLKGVKLHYT





SEQ ID NO: 2 RBD amino acid residues 319-542 of S protein


RVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFK


CYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWN


SNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGF


QPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNF





SEQ ID NO: 3 C-terminal Foldon domain of a T4 fibritin domain


GSGYIPEAPRDGQAYVRKDGEWVLLSTFLGRS





SEQ ID NO: 4 GCN4-based leucine zipper domain


RMKQIEDKIEEILSKIYHIENEIARIKKLIGER





SEQ ID NO: 5 Exemplary peptide linker


GGGGSGGGGS





SEQ ID NO: 6 wildtype S2 region of SARS-COV-2 S protein


SVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDST


ECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGFNFSQILP


DPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDE


MIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQF


NSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDKV


EAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKG


YHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFV


TQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDV


DLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLI


AIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT





SEQ ID NO: 7 K986P/V987P S2 region sequence of SARS-COV-2 S protein


SVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDST


ECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGFNFSQILP


DPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDE


MIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQF


NSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDPPE


AEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGY


HLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVT


QRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVD


LGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIA


IVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT





SEQ ID NO: 8 wildtype amino acid residues 2-1273 sequence of S


protein of SARS-COV-2


FVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSN


VTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVN


NATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDL


EGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTL


LALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETK


CTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNC


VADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIAD


YNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTP


CNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNK


CVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVI


TPGTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHV


NNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNF


TISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNT


QEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGD


CLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFA


MQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQ


ALNTLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEI


RASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNF


TTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNN


TVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNE


SLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCK


FDEDDSEPVLKGVKLHYT





SEQ ID NO: 9 amino acid residues 2-1273 sequence of S protein


of SARS-COV-2, Δ681-684


FVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSN


VTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVN


NATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDL


EGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTL


LALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETK


CTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNC


VADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIAD


YNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTP


CNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNK


CVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVI


TPGTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHV


NNSYECDIPIGAGICASYQTQTNSRSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISV


TTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVF


AQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDI


AARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMA


YRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTL


VKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASAN


LAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAI


CHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDP


LQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQ


ELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDD


SEPVLKGVKLHYT





SEQ ID NO: 10 amino acid residues 2-1273 sequence of S protein


of SARS-COV-2, K986P V987P Δ681-684 sequence


FVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSN


VTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVN


NATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDL


EGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTL


LALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETK


CTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNC


VADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIAD


YNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTP


CNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNK


CVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVI


TPGTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHV


NNSYECDIPIGAGICASYQTQTNSRSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISV


TTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVF


AQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDI


AARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMA


YRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTL


VKQLSSNFGAISSVLNDILSRLDPPEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANL


AATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAIC


HDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPL


QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQE


LGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDS


EPVLKGVKLHYT





SEQ ID NO: 11 Nucleic acid sequence of the wildtype S2 region sequence


AGTGTGGCTTCTCAAAGCATTATAGCATACACTATGTCTCTTGGTGCCGAAAATTCC


GTGGCCTATTCTAACAATTCAATCGCCATCCCAACCAACTTCACAATTAGCGTGACT


ACCGAAATACTGCCTGTGAGCATGACGAAAACCAGCGTAGACTGCACTATGTATAT


CTGTGGAGACTCCACTGAGTGCTCCAACCTTCTCCTGCAGTACGGTAGCTTCTGTAC


CCAATTGAACCGCGCCCTTACAGGCATCGCTGTTGAGCAAGATAAGAATACCCAGG


AAGTTTTTGCCCAGGTTAAGCAGATATACAAAACACCGCCCATTAAGGACTTCGGAG


GCTTCAACTTCTCTCAGATACTGCCTGACCCCTCCAAGCCATCAAAACGCAGCTTCA


TTGAGGACCTCTTGTTCAACAAAGTGACTCTGGCTGATGCTGGCTTCATTAAGCAGT


ACGGAGATTGCCTGGGAGATATTGCTGCCAGGGACCTCATCTGCGCCCAGAAGTTTA


ATGGCCTGACAGTCTTGCCCCCACTTCTGACAGACGAGATGATTGCTCAGTACACAT


CTGCCCTCCTCGCTGGCACCATAACATCCGGATGGACATTTGGTGCTGGTGCTGCCC


TCCAGATTCCCTTCGCAATGCAGATGGCGTATCGCTTTAACGGCATCGGTGTCACAC


AAAACGTGTTGTATGAGAACCAAAAGCTCATCGCTAACCAGTTTAATTCTGCTATTG


GTAAGATTCAGGACAGCCTGTCATCAACCGCGTCTGCCCTTGGTAAGTTGCAGGACG


TGGTGAACCAGAATGCTCAGGCTTTGAATACTCTGGTGAAGCAACTCTCTTCAAATT


TCGGCGCTATCTCTTCTGTGTTGAACGACATCCTGAGTCGCCTTGATAAGGTGGAAG


CTGAAGTTCAAATTGATAGATTGATTACTGGCAGGCTCCAGTCTTTGCAGACCTACG


TTACACAGCAGCTGATTAGGGCGGCTGAAATTAGAGCTTCCGCCAATCTGGCTGCAA


CCAAGATGTCCGAATGCGTCCTGGGTCAGTCAAAGCGCGTTGACTTTTGTGGTAAAG


GCTACCACCTCATGTCATTTCCCCAGTCAGCACCTCACGGAGTAGTGTTCCTCCACG


TCACCTACGTTCCAGCACAGGAAAAGAATTTTACCACTGCGCCGGCAATCTGTCACG


ACGGTAAGGCACACTTCCCCCGCGAGGGCGTATTCGTGTCTAACGGAACTCATTGGT


TCGTCACACAGAGAAACTTCTATGAGCCTCAGATCATTACCACCGACAATACATTTG


TGTCCGGTAACTGCGACGTTGTGATTGGAATCGTCAACAACACTGTGTACGATCCAC


TTCAGCCAGAACTGGATAGCTTCAAGGAAGAATTGGACAAATATTTCAAAAATCAC


ACTTCACCCGATGTGGACCTGGGTGACATTAGTGGTATCAATGCGTCCGTGGTCAAT


ATTCAAAAAGAGATTGACAGGCTCAACGAAGTGGCCAAGAACCTGAACGAAAGTCT


TATCGATCTGCAAGAATTGGGAAAGTATGAGCAGTACATCAAGTGGCCGTGGTACA


TTTGGTTGGGTTTTATCGCCGGTCTGATCGCCATCGTTATGGTTACCATTATGCTTTG


CTGCATGACGAGCTGTTGCTCCTGTCTGAAGGGATGCTGCTCTTGCGGATCATGTTG


CAAGTTCGATGAAGACGATAGCGAACCAGTTCTGAAGGGCGTCAAGCTGCATTACA


CA





SEQ ID NO: 12 Nucleic acid sequence of the K986P/V987P S2 region sequence


AGTGTGGCTTCTCAAAGCATTATAGCATACACTATGTCTCTTGGTGCCGAAAATTCC


GTGGCCTATTCTAACAATTCAATCGCCATCCCAACCAACTTCACAATTAGCGTGACT


ACCGAAATACTGCCTGTGAGCATGACGAAAACCAGCGTAGACTGCACTATGTATAT


CTGTGGAGACTCCACTGAGTGCTCCAACCTTCTCCTGCAGTACGGTAGCTTCTGTAC


CCAATTGAACCGCGCCCTTACAGGCATCGCTGTTGAGCAAGATAAGAATACCCAGG


AAGTTTTTGCCCAGGTTAAGCAGATATACAAAACACCGCCCATTAAGGACTTCGGAG


GCTTCAACTTCTCTCAGATACTGCCTGACCCCTCCAAGCCATCAAAACGCAGCTTCA


TTGAGGACCTCTTGTTCAACAAAGTGACTCTGGCTGATGCTGGCTTCATTAAGCAGT


ACGGAGATTGCCTGGGAGATATTGCTGCCAGGGACCTCATCTGCGCCCAGAAGTTTA


ATGGCCTGACAGTCTTGCCCCCACTTCTGACAGACGAGATGATTGCTCAGTACACAT


CTGCCCTCCTCGCTGGCACCATAACATCCGGATGGACATTTGGTGCTGGTGCTGCCC


TCCAGATTCCCTTCGCAATGCAGATGGCGTATCGCTTTAACGGCATCGGTGTCACAC


AAAACGTGTTGTATGAGAACCAAAAGCTCATCGCTAACCAGTTTAATTCTGCTATTG


GTAAGATTCAGGACAGCCTGTCATCAACCGCGTCTGCCCTTGGTAAGTTGCAGGACG


TGGTGAACCAGAATGCTCAGGCTTTGAATACTCTGGTGAAGCAACTCTCTTCAAATT


TCGGCGCTATCTCTTCTGTGTTGAACGACATCCTGAGTCGCCTTGATcctccaGAAGCTG


AAGTTCAAATTGATAGATTGATTACTGGCAGGCTCCAGTCTTTGCAGACCTACGTTA


CACAGCAGCTGATTAGGGCGGCTGAAATTAGAGCTTCCGCCAATCTGGCTGCAACC


AAGATGTCCGAATGCGTCCTGGGTCAGTCAAAGCGCGTTGACTTTTGTGGTAAAGGC


TACCACCTCATGTCATTTCCCCAGTCAGCACCTCACGGAGTAGTGTTCCTCCACGTC


ACCTACGTTCCAGCACAGGAAAAGAATTTTACCACTGCGCCGGCAATCTGTCACGAC


GGTAAGGCACACTTCCCCCGCGAGGGCGTATTCGTGTCTAACGGAACTCATTGGTTC


GTCACACAGAGAAACTTCTATGAGCCTCAGATCATTACCACCGACAATACATTTGTG


TCCGGTAACTGCGACGTTGTGATTGGAATCGTCAACAACACTGTGTACGATCCACTT


CAGCCAGAACTGGATAGCTTCAAGGAAGAATTGGACAAATATTTCAAAAATCACAC


TTCACCCGATGTGGACCTGGGTGACATTAGTGGTATCAATGCGTCCGTGGTCAATAT


TCAAAAAGAGATTGACAGGCTCAACGAAGTGGCCAAGAACCTGAACGAAAGTCTTA


TCGATCTGCAAGAATTGGGAAAGTATGAGCAGTACATCAAGTGGCCGTGGTACATTT


GGTTGGGTTTTATCGCCGGTCTGATCGCCATCGTTATGGTTACCATTATGCTTTGCTG


CATGACGAGCTGTTGCTCCTGTCTGAAGGGATGCTGCTCTTGCGGATCATGTTGCAA


GTTCGATGAAGACGATAGCGAACCAGTTCTGAAGGGCGTCAAGCTGCATTACACA
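
The lowercase nucleotides (cctcca) in SEQ ID NO: 12 occupy the position held by AAGGTG in SEQ ID NO: 11. As an illustrative sketch only, and not part of the sequence listing, the following Python snippet translates a short window around that site with the standard genetic code to show the K986P/V987P change. The function name `translate` and the 21-nucleotide windows are assumptions made for illustration; the windows are copied from the two listings above (joined across a line break for the wildtype) and are assumed to be in frame.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate an in-frame DNA or RNA string into one-letter amino acids."""
    seq = nucleotides.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical in-frame windows chosen for illustration.
wildtype_window = "CGCCTTGATAAGGTGGAAGCT"  # from SEQ ID NO: 11
mutant_window = "CGCCTTGATcctccaGAAGCT"    # from SEQ ID NO: 12

print(translate(wildtype_window))  # RLDKVEA
print(translate(mutant_window))    # RLDPPEA
```

Under these assumptions the two windows differ only at the lysine and valine codons, which become two proline codons.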





SEQ ID NO: 13 Nucleic acid sequence of the wildtype 2-1273 sequence of Spike


TTCGTTTTCCTTGTTCTGTTGCCTCTCGTTAGTAGCCAATGCGTCAACCTTACTACTA


GAACCCAGCTCCCTCCAGCATATACCAACTCTTTCACCAGGGGCGTATATTACCCGG


ACAAAGTGTTCCGCTCAAGTGTGCTGCATTCTACGCAGGACCTTTTCTTGCCCTTTTT


CAGTAATGTTACTTGGTTTCATGCTATCCATGTGTCTGGAACTAACGGAACCAAGCG


CTTTGACAACCCCGTCCTCCCTTTCAACGATGGCGTGTACTTCGCTTCCACGGAAAA


GTCAAACATAATTCGCGGCTGGATCTTTGGTACAACACTCGACTCAAAGACGCAGA


GCCTGCTGATCGTTAATAACGCTACAAATGTTGTGATAAAGGTGTGTGAATTTCAGT


TCTGCAATGATCCCTTCCTGGGTGTGTACTACCATAAGAATAACAAGAGCTGGATGG


AATCCGAATTTAGGGTTTACAGTTCCGCTAACAACTGCACATTCGAATACGTAAGCC


AGCCATTTCTTATGGATCTTGAGGGCAAGCAAGGAAACTTCAAGAACTTGAGGGAG


TTCGTGTTCAAAAATATCGACGGCTATTTTAAGATATATAGCAAGCACACTCCAATA


AACTTGGTGCGCGACCTGCCCCAGGGATTCTCTGCTCTGGAGCCCCTGGTGGATCTG


CCCATTGGAATAAACATAACTCGCTTTCAAACACTGCTCGCCCTGCATCGCAGTTAC


CTCACCCCTGGTGATAGTAGTTCAGGATGGACAGCAGGAGCCGCCGCATACTACGT


CGGCTACCTGCAGCCTAGGACCTTCTTGCTGAAGTACAACGAGAACGGTACAATAA


CTGACGCTGTGGACTGCGCTCTGGACCCTCTGTCCGAGACGAAGTGCACCCTGAAGA


GCTTTACTGTTGAAAAAGGCATTTACCAAACCAGCAACTTCCGCGTCCAGCCAACCG


AGAGCATCGTCAGATTTCCCAACATTACAAATCTGTGTCCCTTCGGCGAGGTGTTCA


ACGCCACACGCTTCGCTTCAGTGTACGCATGGAACCGCAAGCGCATATCTAACTGCG


TCGCGGATTATTCTGTCCTCTACAACTCCGCCTCTTTCTCCACCTTCAAGTGCTACGG


AGTGTCACCGACTAAGCTGAACGATCTCTGCTTTACCAACGTCTACGCGGACTCCTT


CGTGATAAGAGGTGATGAAGTGAGACAAATAGCCCCAGGTCAGACTGGTAAGATCG


CAGATTACAACTACAAATTGCCTGATGATTTCACTGGTTGCGTTATCGCGTGGAACT


CTAATAACCTCGATTCTAAGGTCGGTGGTAACTACAATTACCTGTACCGCTTGTTTA


GGAAGTCAAACCTGAAGCCTTTCGAGAGGGATATTTCAACCGAAATCTATCAAGCG


GGTTCAACACCGTGTAACGGTGTGGAAGGATTTAACTGCTACTTCCCCCTGCAGTCT


TACGGATTCCAGCCAACCAATGGCGTGGGTTACCAACCTTATCGCGTGGTGGTTCTG


AGTTTCGAACTGTTGCACGCTCCCGCCACGGTATGCGGTCCCAAGAAGAGCACTAAC


TTGGTGAAGAATAAGTGCGTGAATTTCAATTTCAATGGCCTCACTGGAACTGGAGTG


CTGACCGAATCCAATAAGAAGTTCTTGCCCTTCCAGCAGTTCGGAAGAGACATTGCT


GACACAACCGACGCGGTGCGCGATCCTCAGACTCTGGAGATATTGGACATTACACC


ATGTTCTTTCGGCGGTGTGTCTGTCATTACTCCGGGCACGAATACTAGCAACCAGGT


AGCCGTGCTGTACCAAGACGTGAATTGCACAGAGGTTCCCGTCGCAATTCACGCTGA


CCAGCTGACCCCCACGTGGAGGGTTTACAGCACTGGTAGTAACGTCTTCCAGACGA


GAGCCGGTTGCTTGATCGGAGCGGAACATGTGAATAACTCCTACGAGTGCGACATC


CCCATCGGAGCCGGTATATGCGCCTCTTATCAGACACAAACTAACTCACCCAGGAG


AGCCCGCAGTGTGGCTTCTCAAAGCATTATAGCATACACTATGTCTCTTGGTGCCGA


AAATTCCGTGGCCTATTCTAACAATTCAATCGCCATCCCAACCAACTTCACAATTAG


CGTGACTACCGAAATACTGCCTGTGAGCATGACGAAAACCAGCGTAGACTGCACTA


TGTATATCTGTGGAGACTCCACTGAGTGCTCCAACCTTCTCCTGCAGTACGGTAGCTT


CTGTACCCAATTGAACCGCGCCCTTACAGGCATCGCTGTTGAGCAAGATAAGAATAC


CCAGGAAGTTTTTGCCCAGGTTAAGCAGATATACAAAACACCGCCCATTAAGGACTT


CGGAGGCTTCAACTTCTCTCAGATACTGCCTGACCCCTCCAAGCCATCAAAACGCAG


CTTCATTGAGGACCTCTTGTTCAACAAAGTGACTCTGGCTGATGCTGGCTTCATTAA


GCAGTACGGAGATTGCCTGGGAGATATTGCTGCCAGGGACCTCATCTGCGCCCAGA


AGTTTAATGGCCTGACAGTCTTGCCCCCACTTCTGACAGACGAGATGATTGCTCAGT


ACACATCTGCCCTCCTCGCTGGCACCATAACATCCGGATGGACATTTGGTGCTGGTG


CTGCCCTCCAGATTCCCTTCGCAATGCAGATGGCGTATCGCTTTAACGGCATCGGTG


TCACACAAAACGTGTTGTATGAGAACCAAAAGCTCATCGCTAACCAGTTTAATTCTG


CTATTGGTAAGATTCAGGACAGCCTGTCATCAACCGCGTCTGCCCTTGGTAAGTTGC


AGGACGTGGTGAACCAGAATGCTCAGGCTTTGAATACTCTGGTGAAGCAACTCTCTT


CAAATTTCGGCGCTATCTCTTCTGTGTTGAACGACATCCTGAGTCGCCTTGATAAGGT


GGAAGCTGAAGTTCAAATTGATAGATTGATTACTGGCAGGCTCCAGTCTTTGCAGAC


CTACGTTACACAGCAGCTGATTAGGGCGGCTGAAATTAGAGCTTCCGCCAATCTGGC


TGCAACCAAGATGTCCGAATGCGTCCTGGGTCAGTCAAAGCGCGTTGACTTTTGTGG


TAAAGGCTACCACCTCATGTCATTTCCCCAGTCAGCACCTCACGGAGTAGTGTTCCT


CCACGTCACCTACGTTCCAGCACAGGAAAAGAATTTTACCACTGCGCCGGCAATCTG


TCACGACGGTAAGGCACACTTCCCCCGCGAGGGCGTATTCGTGTCTAACGGAACTCA


TTGGTTCGTCACACAGAGAAACTTCTATGAGCCTCAGATCATTACCACCGACAATAC


ATTTGTGTCCGGTAACTGCGACGTTGTGATTGGAATCGTCAACAACACTGTGTACGA


TCCACTTCAGCCAGAACTGGATAGCTTCAAGGAAGAATTGGACAAATATTTCAAAA


ATCACACTTCACCCGATGTGGACCTGGGTGACATTAGTGGTATCAATGCGTCCGTGG


TCAATATTCAAAAAGAGATTGACAGGCTCAACGAAGTGGCCAAGAACCTGAACGAA


AGTCTTATCGATCTGCAAGAATTGGGAAAGTATGAGCAGTACATCAAGTGGCCGTG


GTACATTTGGTTGGGTTTTATCGCCGGTCTGATCGCCATCGTTATGGTTACCATTATG


CTTTGCTGCATGACGAGCTGTTGCTCCTGTCTGAAGGGATGCTGCTCTTGCGGATCAT


GTTGCAAGTTCGATGAAGACGATAGCGAACCAGTTCTGAAGGGCGTCAAGCTGCAT


TACACA





SEQ ID NO: 14 Nucleic acid sequence of the Δ681-684 sequence of Spike


TTCGTTTTCCTTGTTCTGTTGCCTCTCGTTAGTAGCCAATGCGTCAACCTTACTACTA


GAACCCAGCTCCCTCCAGCATATACCAACTCTTTCACCAGGGGCGTATATTACCCGG


ACAAAGTGTTCCGCTCAAGTGTGCTGCATTCTACGCAGGACCTTTTCTTGCCCTTTTT


CAGTAATGTTACTTGGTTTCATGCTATCCATGTGTCTGGAACTAACGGAACCAAGCG


CTTTGACAACCCCGTCCTCCCTTTCAACGATGGCGTGTACTTCGCTTCCACGGAAAA


GTCAAACATAATTCGCGGCTGGATCTTTGGTACAACACTCGACTCAAAGACGCAGA


GCCTGCTGATCGTTAATAACGCTACAAATGTTGTGATAAAGGTGTGTGAATTTCAGT


TCTGCAATGATCCCTTCCTGGGTGTGTACTACCATAAGAATAACAAGAGCTGGATGG


AATCCGAATTTAGGGTTTACAGTTCCGCTAACAACTGCACATTCGAATACGTAAGCC


AGCCATTTCTTATGGATCTTGAGGGCAAGCAAGGAAACTTCAAGAACTTGAGGGAG


TTCGTGTTCAAAAATATCGACGGCTATTTTAAGATATATAGCAAGCACACTCCAATA


AACTTGGTGCGCGACCTGCCCCAGGGATTCTCTGCTCTGGAGCCCCTGGTGGATCTG


CCCATTGGAATAAACATAACTCGCTTTCAAACACTGCTCGCCCTGCATCGCAGTTAC


CTCACCCCTGGTGATAGTAGTTCAGGATGGACAGCAGGAGCCGCCGCATACTACGT


CGGCTACCTGCAGCCTAGGACCTTCTTGCTGAAGTACAACGAGAACGGTACAATAA


CTGACGCTGTGGACTGCGCTCTGGACCCTCTGTCCGAGACGAAGTGCACCCTGAAGA


GCTTTACTGTTGAAAAAGGCATTTACCAAACCAGCAACTTCCGCGTCCAGCCAACCG


AGAGCATCGTCAGATTTCCCAACATTACAAATCTGTGTCCCTTCGGCGAGGTGTTCA


ACGCCACACGCTTCGCTTCAGTGTACGCATGGAACCGCAAGCGCATATCTAACTGCG


TCGCGGATTATTCTGTCCTCTACAACTCCGCCTCTTTCTCCACCTTCAAGTGCTACGG


AGTGTCACCGACTAAGCTGAACGATCTCTGCTTTACCAACGTCTACGCGGACTCCTT


CGTGATAAGAGGTGATGAAGTGAGACAAATAGCCCCAGGTCAGACTGGTAAGATCG


CAGATTACAACTACAAATTGCCTGATGATTTCACTGGTTGCGTTATCGCGTGGAACT


CTAATAACCTCGATTCTAAGGTCGGTGGTAACTACAATTACCTGTACCGCTTGTTTA


GGAAGTCAAACCTGAAGCCTTTCGAGAGGGATATTTCAACCGAAATCTATCAAGCG


GGTTCAACACCGTGTAACGGTGTGGAAGGATTTAACTGCTACTTCCCCCTGCAGTCT


TACGGATTCCAGCCAACCAATGGCGTGGGTTACCAACCTTATCGCGTGGTGGTTCTG


AGTTTCGAACTGTTGCACGCTCCCGCCACGGTATGCGGTCCCAAGAAGAGCACTAAC


TTGGTGAAGAATAAGTGCGTGAATTTCAATTTCAATGGCCTCACTGGAACTGGAGTG


CTGACCGAATCCAATAAGAAGTTCTTGCCCTTCCAGCAGTTCGGAAGAGACATTGCT


GACACAACCGACGCGGTGCGCGATCCTCAGACTCTGGAGATATTGGACATTACACC


ATGTTCTTTCGGCGGTGTGTCTGTCATTACTCCGGGCACGAATACTAGCAACCAGGT


AGCCGTGCTGTACCAAGACGTGAATTGCACAGAGGTTCCCGTCGCAATTCACGCTGA


CCAGCTGACCCCCACGTGGAGGGTTTACAGCACTGGTAGTAACGTCTTCCAGACGA


GAGCCGGTTGCTTGATCGGAGCGGAACATGTGAATAACTCCTACGAGTGCGACATC


CCCATCGGAGCCGGTATATGCGCCTCTTATCAGACACAAACTAACTCACGCAGTGTG


GCTTCTCAAAGCATTATAGCATACACTATGTCTCTTGGTGCCGAAAATTCCGTGGCC


TATTCTAACAATTCAATCGCCATCCCAACCAACTTCACAATTAGCGTGACTACCGAA


ATACTGCCTGTGAGCATGACGAAAACCAGCGTAGACTGCACTATGTATATCTGTGGA


GACTCCACTGAGTGCTCCAACCTTCTCCTGCAGTACGGTAGCTTCTGTACCCAATTG


AACCGCGCCCTTACAGGCATCGCTGTTGAGCAAGATAAGAATACCCAGGAAGTTTTT


GCCCAGGTTAAGCAGATATACAAAACACCGCCCATTAAGGACTTCGGAGGCTTCAA


CTTCTCTCAGATACTGCCTGACCCCTCCAAGCCATCAAAACGCAGCTTCATTGAGGA


CCTCTTGTTCAACAAAGTGACTCTGGCTGATGCTGGCTTCATTAAGCAGTACGGAGA


TTGCCTGGGAGATATTGCTGCCAGGGACCTCATCTGCGCCCAGAAGTTTAATGGCCT


GACAGTCTTGCCCCCACTTCTGACAGACGAGATGATTGCTCAGTACACATCTGCCCT


CCTCGCTGGCACCATAACATCCGGATGGACATTTGGTGCTGGTGCTGCCCTCCAGAT


TCCCTTCGCAATGCAGATGGCGTATCGCTTTAACGGCATCGGTGTCACACAAAACGT


GTTGTATGAGAACCAAAAGCTCATCGCTAACCAGTTTAATTCTGCTATTGGTAAGAT


TCAGGACAGCCTGTCATCAACCGCGTCTGCCCTTGGTAAGTTGCAGGACGTGGTGAA


CCAGAATGCTCAGGCTTTGAATACTCTGGTGAAGCAACTCTCTTCAAATTTCGGCGC


TATCTCTTCTGTGTTGAACGACATCCTGAGTCGCCTTGATAAGGTGGAAGCTGAAGT


TCAAATTGATAGATTGATTACTGGCAGGCTCCAGTCTTTGCAGACCTACGTTACACA


GCAGCTGATTAGGGCGGCTGAAATTAGAGCTTCCGCCAATCTGGCTGCAACCAAGA


TGTCCGAATGCGTCCTGGGTCAGTCAAAGCGCGTTGACTTTTGTGGTAAAGGCTACC


ACCTCATGTCATTTCCCCAGTCAGCACCTCACGGAGTAGTGTTCCTCCACGTCACCT


ACGTTCCAGCACAGGAAAAGAATTTTACCACTGCGCCGGCAATCTGTCACGACGGT


AAGGCACACTTCCCCCGCGAGGGCGTATTCGTGTCTAACGGAACTCATTGGTTCGTC


ACACAGAGAAACTTCTATGAGCCTCAGATCATTACCACCGACAATACATTTGTGTCC


GGTAACTGCGACGTTGTGATTGGAATCGTCAACAACACTGTGTACGATCCACTTCAG


CCAGAACTGGATAGCTTCAAGGAAGAATTGGACAAATATTTCAAAAATCACACTTC


ACCCGATGTGGACCTGGGTGACATTAGTGGTATCAATGCGTCCGTGGTCAATATTCA


AAAAGAGATTGACAGGCTCAACGAAGTGGCCAAGAACCTGAACGAAAGTCTTATCG


ATCTGCAAGAATTGGGAAAGTATGAGCAGTACATCAAGTGGCCGTGGTACATTTGG


TTGGGTTTTATCGCCGGTCTGATCGCCATCGTTATGGTTACCATTATGCTTTGCTGCA


TGACGAGCTGTTGCTCCTGTCTGAAGGGATGCTGCTCTTGCGGATCATGTTGCAAGT


TCGATGAAGACGATAGCGAACCAGTTCTGAAGGGCGTCAAGCTGCATTACACA
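
Comparing SEQ ID NO: 13 with SEQ ID NO: 14 around the deletion site suggests that four codons (CCC AGG AGA GCC) are absent from the Δ681-684 sequence. The following Python sketch, offered as an illustration rather than as the method used to prepare the sequences, removes the codons for a residue range from a short in-frame window. The helper name `delete_residues` and the residue numbering of the window (678-687) are assumptions; the numbering is inferred from the header of SEQ ID NO: 13, which indicates that the listing starts at residue 2.

```python
def delete_residues(coding_seq: str, first_residue: int, last_residue: int,
                    window_start_residue: int) -> str:
    """Remove the codons for residues first_residue..last_residue (inclusive).

    coding_seq is an in-frame window whose first codon encodes
    window_start_residue (1-based protein numbering).
    """
    start = (first_residue - window_start_residue) * 3
    end = (last_residue - window_start_residue + 1) * 3
    return coding_seq[:start] + coding_seq[end:]


# Hypothetical window copied from SEQ ID NO: 13 (joined across a line break),
# assumed to encode residues 678-687.
wildtype_window = "ACTAACTCACCCAGGAGAGCCCGCAGTGTG"

deleted_window = delete_residues(wildtype_window, 681, 684,
                                 window_start_residue=678)
print(deleted_window)  # ACTAACTCACGCAGTGTG
```

The resulting 18-nucleotide string matches the corresponding stretch of SEQ ID NO: 14, which is consistent with an in-frame deletion of 12 nucleotides.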





SEQ ID NO: 15 Nucleic acid sequence of the K986P/V987P Δ681-684 sequence of Spike


TTCGTTTTCCTTGTTCTGTTGCCTCTCGTTAGTAGCCAATGCGTCAACCTTACTACTA


GAACCCAGCTCCCTCCAGCATATACCAACTCTTTCACCAGGGGCGTATATTACCCGG


ACAAAGTGTTCCGCTCAAGTGTGCTGCATTCTACGCAGGACCTTTTCTTGCCCTTTTT


CAGTAATGTTACTTGGTTTCATGCTATCCATGTGTCTGGAACTAACGGAACCAAGCG


CTTTGACAACCCCGTCCTCCCTTTCAACGATGGCGTGTACTTCGCTTCCACGGAAAA


GTCAAACATAATTCGCGGCTGGATCTTTGGTACAACACTCGACTCAAAGACGCAGA


GCCTGCTGATCGTTAATAACGCTACAAATGTTGTGATAAAGGTGTGTGAATTTCAGT


TCTGCAATGATCCCTTCCTGGGTGTGTACTACCATAAGAATAACAAGAGCTGGATGG


AATCCGAATTTAGGGTTTACAGTTCCGCTAACAACTGCACATTCGAATACGTAAGCC


AGCCATTTCTTATGGATCTTGAGGGCAAGCAAGGAAACTTCAAGAACTTGAGGGAG


TTCGTGTTCAAAAATATCGACGGCTATTTTAAGATATATAGCAAGCACACTCCAATA


AACTTGGTGCGCGACCTGCCCCAGGGATTCTCTGCTCTGGAGCCCCTGGTGGATCTG


CCCATTGGAATAAACATAACTCGCTTTCAAACACTGCTCGCCCTGCATCGCAGTTAC


CTCACCCCTGGTGATAGTAGTTCAGGATGGACAGCAGGAGCCGCCGCATACTACGT


CGGCTACCTGCAGCCTAGGACCTTCTTGCTGAAGTACAACGAGAACGGTACAATAA


CTGACGCTGTGGACTGCGCTCTGGACCCTCTGTCCGAGACGAAGTGCACCCTGAAGA


GCTTTACTGTTGAAAAAGGCATTTACCAAACCAGCAACTTCCGCGTCCAGCCAACCG


AGAGCATCGTCAGATTTCCCAACATTACAAATCTGTGTCCCTTCGGCGAGGTGTTCA


ACGCCACACGCTTCGCTTCAGTGTACGCATGGAACCGCAAGCGCATATCTAACTGCG


TCGCGGATTATTCTGTCCTCTACAACTCCGCCTCTTTCTCCACCTTCAAGTGCTACGG


AGTGTCACCGACTAAGCTGAACGATCTCTGCTTTACCAACGTCTACGCGGACTCCTT


CGTGATAAGAGGTGATGAAGTGAGACAAATAGCCCCAGGTCAGACTGGTAAGATCG


CAGATTACAACTACAAATTGCCTGATGATTTCACTGGTTGCGTTATCGCGTGGAACT


CTAATAACCTCGATTCTAAGGTCGGTGGTAACTACAATTACCTGTACCGCTTGTTTA


GGAAGTCAAACCTGAAGCCTTTCGAGAGGGATATTTCAACCGAAATCTATCAAGCG


GGTTCAACACCGTGTAACGGTGTGGAAGGATTTAACTGCTACTTCCCCCTGCAGTCT


TACGGATTCCAGCCAACCAATGGCGTGGGTTACCAACCTTATCGCGTGGTGGTTCTG


AGTTTCGAACTGTTGCACGCTCCCGCCACGGTATGCGGTCCCAAGAAGAGCACTAAC


TTGGTGAAGAATAAGTGCGTGAATTTCAATTTCAATGGCCTCACTGGAACTGGAGTG


CTGACCGAATCCAATAAGAAGTTCTTGCCCTTCCAGCAGTTCGGAAGAGACATTGCT


GACACAACCGACGCGGTGCGCGATCCTCAGACTCTGGAGATATTGGACATTACACC


ATGTTCTTTCGGCGGTGTGTCTGTCATTACTCCGGGCACGAATACTAGCAACCAGGT


AGCCGTGCTGTACCAAGACGTGAATTGCACAGAGGTTCCCGTCGCAATTCACGCTGA


CCAGCTGACCCCCACGTGGAGGGTTTACAGCACTGGTAGTAACGTCTTCCAGACGA


GAGCCGGTTGCTTGATCGGAGCGGAACATGTGAATAACTCCTACGAGTGCGACATC


CCCATCGGAGCCGGTATATGCGCCTCTTATCAGACACAAACTAACTCACGCAGTGTG


GCTTCTCAAAGCATTATAGCATACACTATGTCTCTTGGTGCCGAAAATTCCGTGGCC


TATTCTAACAATTCAATCGCCATCCCAACCAACTTCACAATTAGCGTGACTACCGAA


ATACTGCCTGTGAGCATGACGAAAACCAGCGTAGACTGCACTATGTATATCTGTGGA


GACTCCACTGAGTGCTCCAACCTTCTCCTGCAGTACGGTAGCTTCTGTACCCAATTG


AACCGCGCCCTTACAGGCATCGCTGTTGAGCAAGATAAGAATACCCAGGAAGTTTTT


GCCCAGGTTAAGCAGATATACAAAACACCGCCCATTAAGGACTTCGGAGGCTTCAA


CTTCTCTCAGATACTGCCTGACCCCTCCAAGCCATCAAAACGCAGCTTCATTGAGGA


CCTCTTGTTCAACAAAGTGACTCTGGCTGATGCTGGCTTCATTAAGCAGTACGGAGA


TTGCCTGGGAGATATTGCTGCCAGGGACCTCATCTGCGCCCAGAAGTTTAATGGCCT


GACAGTCTTGCCCCCACTTCTGACAGACGAGATGATTGCTCAGTACACATCTGCCCT


CCTCGCTGGCACCATAACATCCGGATGGACATTTGGTGCTGGTGCTGCCCTCCAGAT


TCCCTTCGCAATGCAGATGGCGTATCGCTTTAACGGCATCGGTGTCACACAAAACGT


GTTGTATGAGAACCAAAAGCTCATCGCTAACCAGTTTAATTCTGCTATTGGTAAGAT


TCAGGACAGCCTGTCATCAACCGCGTCTGCCCTTGGTAAGTTGCAGGACGTGGTGAA


CCAGAATGCTCAGGCTTTGAATACTCTGGTGAAGCAACTCTCTTCAAATTTCGGCGC


TATCTCTTCTGTGTTGAACGACATCCTGAGTCGCCTTGATcctccaGAAGCTGAAGTTCA


AATTGATAGATTGATTACTGGCAGGCTCCAGTCTTTGCAGACCTACGTTACACAGCA


GCTGATTAGGGCGGCTGAAATTAGAGCTTCCGCCAATCTGGCTGCAACCAAGATGTC


CGAATGCGTCCTGGGTCAGTCAAAGCGCGTTGACTTTTGTGGTAAAGGCTACCACCT


CATGTCATTTCCCCAGTCAGCACCTCACGGAGTAGTGTTCCTCCACGTCACCTACGTT


CCAGCACAGGAAAAGAATTTTACCACTGCGCCGGCAATCTGTCACGACGGTAAGGC


ACACTTCCCCCGCGAGGGCGTATTCGTGTCTAACGGAACTCATTGGTTCGTCACACA


GAGAAACTTCTATGAGCCTCAGATCATTACCACCGACAATACATTTGTGTCCGGTAA


CTGCGACGTTGTGATTGGAATCGTCAACAACACTGTGTACGATCCACTTCAGCCAGA


ACTGGATAGCTTCAAGGAAGAATTGGACAAATATTTCAAAAATCACACTTCACCCG


ATGTGGACCTGGGTGACATTAGTGGTATCAATGCGTCCGTGGTCAATATTCAAAAAG


AGATTGACAGGCTCAACGAAGTGGCCAAGAACCTGAACGAAAGTCTTATCGATCTG


CAAGAATTGGGAAAGTATGAGCAGTACATCAAGTGGCCGTGGTACATTTGGTTGGG


TTTTATCGCCGGTCTGATCGCCATCGTTATGGTTACCATTATGCTTTGCTGCATGACG


AGCTGTTGCTCCTGTCTGAAGGGATGCTGCTCTTGCGGATCATGTTGCAAGTTCGAT


GAAGACGATAGCGAACCAGTTCTGAAGGGCGTCAAGCTGCATTACACA





SEQ ID NO: 16 tissue plasminogen activator SP


DAMKRGLCCVLLLCGAVFVSPSQEIHARFRR





SEQ ID NO: 17 human IgE immunoglobulin SP


DWTWILFLVAAATRVHS





SEQ ID NO: 18 amino acid sequence of mouse Alpha-L-iduronidase (IDUA) protein


MLTFFAAFLAAPLALAESPYLVRVDAARPLRPLLPFWRSTGFCPPLPHDQADQYDLSWD


QQLNLAYIGAVPHSGIEQVRIHWLLDLITARKSPGQGLMYNFTHLDAFLDLLMENQLLP


GFELMGSPSGYFTDFDDKQQVFEWKDLVSLLARRYIGRYGLTHVSKWNFETWNEPDH


HDFDNVSMTTQGFLNYYDACSEGLRIASPTLKLGGPGDSFHPLPRSPMCWSLLGHCANG


TNFFTGEVGVRLDYISLHKKGAGSSIAILEQEMAVVEQVQQLFPEFKDTPIYNDEADPLV


GWSLPQPWRADVTYAALVVKVIAQHQNLLFANSSSSMRYVLLSNDNAFLSYHPYPFSQ


RTLTARFQVNNTHPPHVQLLRKPVLTVMGLMALLDGEQLWAEVSKAGAVLDSNHTVG


VLASTHHPEGSAAAWSTTVLIYTSDDTHAHPNHSIPVTLRLRGVPPGLDLVYIVLYLDNQ


LSSPYSAWQHMGQPVFPSAEQFRRMRMVEDPVAEAPRPFPARGRLTLHRKLPVPSLLLV


HVCTRPLKPPGQVSRLRALPLTHGQLILVWSDERVGSKCLWTYEIQFSQKGEEYAPINRR


PSTFNLFVFSPDTAVVSGSYRVRALDYWARPGPFSDPVTYLDVPAS





SEQ ID NO: 19 amino acid sequence of human Alpha-L-iduronidase (IDUA) protein


MRPLRPRAALLALLASLLAAPPVAPAEAPHLVHVDAARALWPLRRFWRSTGFCPPLPHS


QADQYVLSWDQQLNLAYVGAVPHRGIKQVRTHWLLELVTTRGSTGRGLSYNFTHLDG


YLDLLRENQLLPGFELMGSASGHFTDFEDKQQVFEWKDLVSSLARRYIGRYGLAHVSK


WNFETWNEPDHHDFDNVSMTMQGFLNYYDACSEGLRAASPALRLGGPGDSFHTPPRSP


LSWGLLRHCHDGTNFFTGEAGVRLDYISLHRKGARSSISILEQEKVVAQQIRQLFPKFAD


TPIYNDEADPLVGWSLPQPWRADVTYAAMVVKVIAQHQNLLLANTTSAFPYALLSNDN


AFLSYHPHPFAQRTLTARFQVNNTRPPHVQLLRKPVLTAMGLLALLDEEQLWAEVSQA


GTVLDSNHTVGVLASAHRPQGPADAWRAAVLIYASDDTRAHPNRSVAVTLRLRGVPPG


PGLVYVTRYLDNGLCSPDGEWRRLGRPVFPTAEQFRRMRAAEDPVAAAPRPLPAGGRL


TLRPALRLPSLLLVHVCARPEKPPGQVTRLRALPLTQGQLVLVWSDEHVGSKCLWTYEI


QFSQDGKAYTPVSRKPSTFNLFVFSPDTGAVSGSYRVRALDYWARPGPFSDPVPYLEVP


VPRGPPSPGNP





SEQ ID NO: 20 amino acid sequence of mouse Ornithine carbamoyltransferase (OTC) protein


MLSNLRILLNNAALRKGHTSVVRHFWCGKPVQSQVQLKGRDLLTLKNFTGEEIQYMLW


LSADLKFRIKQKGEYLPLLQGKSLGMIFEKRSTRTRLSTETGFALLGGHPSFLTTQDIHLG


VNESLTDTARVLSSMTDAVLARVYKQSDLDTLAKEASIPIVNGLSDLYHPIQILADYLTL


QEHYGSLKGLTLSWIGDGNNILHSIMMSAAKFGMHLQAATPKGYEPDPNIVKLAEQYA


KENGTKLSMTNDPLEAARGGNVLITDTWISMGQEDEKKKRLQAFQGYQVTMKTAKVA


ASDWTFLHCLPRKPEEVDDEVFYSPRSLVFPEAENRKWTIMAVMVSLLTDYSPVLQKPK


F





SEQ ID NO: 21 amino acid sequence of mouse Fumarylacetoacetase (FAH) protein


MSFIPVAEDSDFPIQNLPYGVFSTQSNPKPRIGVAIGDQILDLSVIKHLFTGPALSKHQHVF


DETTLNNFMGLGQAAWKEARASLQNLLSASQARLRDDKELRQRAFTSQASATMHLPA


TIGDYTDFYSSRQHATNVGIMFRGKENALLPNWLHLPVGYHGRASSIVVSGTPIRRPMG


RDIQQWEYVPLGPFLGKSFGTTISPWVVPMDALMPFVVPNPKQDPKPLPYLCHSQPYTF


DINLSVSLKGEGMSQAATICRSNFKHMYWTMLQQLTHHSVNGCNLRPGDLLASGTISGS


DPESFGSMLELSWKGTKAIDVEQGQTRTFLLDGDEVIITGHCQGDGYRVGFGQCAGKVL


PALSPA





SEQ ID NO: 22 amino acid sequence of human miniDMD protein


MLWWEEVEDCYEREDVQKKTFTKWVNAQFSKFGKQHIENLFSDLQDGRRLLDLLEGL


TGQKLPKEKGSTRVHALNNVNKALRVLQNNNVDLVNIGSTDIVDGNHKLTLGLIWNIIL


HWQVKNVMKNIMAGLQQTNSEKILLSWVRQSTRNYPQVNVINFTTSWSDGLALNALIH


SHRPDLFDWNSVVCQQSATQRLEHAFNIARYQLGIEKLLDPEDVDTTYPDKKSILMYITS


LFQVLPQQVSIEAIQEVEMLPRPPKVTKEEHFQLHHQMHYSQQITVSLAQGYERTSSPKP


RFKSYAYTQAAYVTTSDPTRSPFPSQHLEAPEDKSFGSSLMESEVNLDRYQTALEEVLS


WLLSAEDTLQAQGEISNDVEVVKDQFHTHEGYMMDLTAHQGRVGNILQLGSKLIGTGK


LSEDEETEVQEQMNLLNSRWECLRVASMEKQSNLHRVLMDLQNQKLKELNDWLTKTE


ERTRKMEEEPLGPDLEDLKRQVQQHKVLQEDLEQEQVRVNSLTHMVVVVDESSGDHA


TAALEEQLKVLGDRWANICRWTEDRWVLLQDILLKWQRLTEEQCLFSAWLSEKEDAV


NKIHTTGFKDQNEMLSSLQKLAVLKADLEKKKQSMGKLYSLKQDLLSTLKNKSVTQKT


EAWLDNFARCWDNLVQKLEKSTAQETEIAVQAKQPDVEEILSKGQHLYKEKPATQPVK


RKLEDLSSEWKAVNRLLQELRAKQPDLAPGLTTIGASPTQTVTLVTQPVVTKETAISKLE


MPSSLMLEVPALADFNRAWTELTDWLSLLDQVIKSQRVMVGDLEDINEMIIKQKATMQ


DLEQRRPQLEELITAAQNLKNKTSNQEARTHITDRIERIQNQWDEVQEHLQNRRQQLNE


MLKDSTQWLEAKEEAEQVLGQARAKLESWKEGPYTVDAIQKKITETKQLAKDLRQWQ


TNVDVANDLALKLLRDYSADDTRKVHMITENINASWRSIHKRVSEREAALEETHRLLQ


QFPLDLEKFLAWLTEAETTANVLQDATRKERLLEDSKGVKELMKQWQDLQGEIEAHTD


VYHNLDENSQKILRSLEGSDDAVLLQRRLDNMNFKWSELRKKSLNIRSHLEASSDQWK


RLHLSLQELLVWLQLKDDELSRQAPIGGDFPAVQKQNDVHRAFKRELKTKEPVIMSTLE


TVRIFLTEQPLEGLEKLYQEPRELPPEERAQNVTRLLRKQAEEVNTEWEKLNLHSADWQ


RKIDETLERLRELQEATDELDLKLRQAEVIKGSWQPVGDLLIDSLQDHLEKVKALRGEIA


PLKENVSHVNDLARQLTTLGIQLSPYNLSTLEDLNTRWKLLQVAVEDRVRQLHEAHRD


FGPASQHFLSTSVQGPWERAISPNKVPYYINHETQTTCWDHPKMTELYQSLADLNNVRF


SAYRTAMKLRRLQKALCLDLLSLSAACDALDQHNLKQNDQPMDILQIINCLTTIYDRLE


QEHNNLVNVPLCVDMCLNWLLNVYDTGRTGRIRVLSFKTGIISLCKAHLEDKYRYLFK


QVASSTGFCDQRRLGLLLHDSIQIPRQLGEVASFGGSNIEPSVRSCFQFANNKPEIEAALF


LDWMRLEPQSMVWLPVLHRVAAAETAKHQAKCNICKECPIIGFRYRSLKHFNYDICQS


CFFSGRVAKGHKMHYPMVEYCTPTTSGEDVRDFAKVLKNKFRTKRYFAKHPRMGYLP


VQTVLEGDNMETPVTLINFWPVDSAPASSPQLSHDDTHSRIEHYASRLAEMENSNGSYL


NDSISPNESIDDEHLLIQHYCQSLNQDSPLSQPRSPAQILISLESEERGELERILADLEEENR


NLQAEYDRLKQQHEHKGLSPLPSPPEMMPTSPQSPRDAELIAEAKLLRQHKGRLEARMQ


ILEDHNKQLESQLHRLRQLLEQPQAEAKVNGTTVSSPSTSLQRSDSSQPMLLRVVGSQTS


DSMGEEDLLSPPQDTSTGLEEVMEQLNNSFPSSRGRNTPGKPMREDTM





SEQ ID NO: 23 amino acid sequence of human DMD protein


MLWWEEVEDCYEREDVQKKTFTKWVNAQFSKFGKQHIENLFSDLQDGRRLLDLLEGL


TGQKLPKEKGSTRVHALNNVNKALRVLQNNNVDLVNIGSTDIVDGNHKLTLGLIWNIIL


HWQVKNVMKNIMAGLQQTNSEKILLSWVRQSTRNYPQVNVINFTTSWSDGLALNALIH


SHRPDLFDWNSVVCQQSATQRLEHAFNIARYQLGIEKLLDPEDVDTTYPDKKSILMYITS


LFQVLPQQVSIEAIQEVEMLPRPPKVTKEEHFQLHHQMHYSQQITVSLAQGYERTSSPKP


RFKSYAYTQAAYVTTSDPTRSPFPSQHLEAPEDKSFGSSLMESEVNLDRYQTALEEVLS


WLLSAEDTLQAQGEISNDVEVVKDQFHTHEGYMMDLTAHQGRVGNILQLGSKLIGTGK


LSEDEETEVQEQMNLLNSRWECLRVASMEKQSNLHRVLMDLQNQKLKELNDWLTKTE


ERTRKMEEEPLGPDLEDLKRQVQQHKVLQEDLEQEQVRVNSLTHMVVVVDESSGDHA


TAALEEQLKVLGDRWANICRWTEDRWVLLQDILLKWQRLTEEQCLFSAWLSEKEDAV


NKIHTTGFKDQNEMLSSLQKLAVLKADLEKKKQSMGKLYSLKQDLLSTLKNKSVTQKT


EAWLDNFARCWDNLVQKLEKSTAQISQAVTTTQPSLTQTTVMETVTTVTTREQILVKH


AQEELPPPPPQKKRQITVDSEIRKRLDVDITELHSWITRSEAVLQSPEFAIFRKEGNFSDLK


EKVNAIEREKAEKFRKLQDASRSAQALVEQMVNEGVNADSIKQASEQLNSRWIEFCQLL


SERLNWLEYQNNIIAFYNQLQQLEQMTTTAENWLKIQPTTPSEPTAIKSQLKICKDEVNR


LSDLQPQIERLKIQSIALKEKGQGPMFLDADFVAFTNHFKQVFSDVQAREKELQTIFDTL


PPMRYQETMSAIRTWVQQSETKLSIPQLSVTDYEIMEQRLGELQALQSSLQEQQSGLYY


LSTTVKEMSKKAPSEISRKYQSEFEEIEGRWKKLSSQLVEHCQKLEEQMNKLRKIQNHIQ


TLKKWMAEVDVFLKEEWPALGDSEILKKQLKQCRLLVSDIQTIQPSLNSVNEGGQKIKN


EAEPEFASRLETELKELNTQWDHMCQQVYARKEALKGGLEKTVSLQKDLSEMHEWMT


QAEEEYLERDFEYKTPDELQKAVEEMKRAKEEAQQKEAKVKLLTESVNSVIAQAPPVA


QEALKKELETLTTNYQWLCTRLNGKCKTLEEVWACWHELLSYLEKANKWLNEVEFKL


KTTENIPGGAEEISEVLDSLENLMRHSEDNPNQIRILAQTLTDGGVMDELINEELETFNSR


WRELHEEAVRRQKLLEQSIQSAQETEKSLHLIQESLTFIDKQLAAYIADKVDAAQMPQE


AQKIQSDLTSHEISLEEMKKHNQGKEAAQRVLSQIDVAQKKLQDVSMKFRLFQKPANFE


QRLQESKMILDEVKMHLPALETKSVEQEVVQSQLNHCVNLYKSLSEVKSEVEMVIKTG


RQIVQKKQTENPKELDERVTALKLHYNELGAKVTERKQQLEKCLKLSRKMRKEMNVL


TEWLAATDMELTKRSAVEGMPSNLDSEVAWGKATQKEIEKQKVHLKSITEVGEALKTV


LGKKETLVEDKLSLLNSNWIAVTSRAEEWLNLLLEYQKHMETFDQNVDHITKWIIQADT


LLDESEKKKPQQKEDVLKRLKAELNDIRPKVDSTRDQAANLMANRGDHCRKLVEPQIS


ELNHRFAAISHRIKTGKASIPLKELEQFNSDIQKLLEPLEAEIQQGVNLKEEDFNKDMNED


NEGTVKELLQRGDNLQQRITDERKREEIKIKQQLLQTKHNALKDLRSQRRKKALEISHQ


WYQYKRQADDLLKCLDDIEKKLASLPEPRDERKIKEIDRELQKKKEELNAVRRQAEGLS


EDGAAMAVEPTQIQLSKRWREIESKFAQFRRLNFAQIHTVREETMMVMTEDMPLEISYV


PSTYLTEITHVSQALLEVEQLLNAPDLCAKDFEDLFKQEESLKNIKDSLQQSSGRIDIIHSK


KTAALQSATPVERVKLQEALSQLDFQWEKVNKMYKDRQGRFDRSVEKWRRFHYDIKIF


NQWLTEAEQFLRKTQIPENWEHAKYKWYLKELQDGIGQRQTVVRTLNATGEEIIQQSSK


TDASILQEKLGSLNLRWQEVCKQLSDRKKRLEEQKNILSEFQRDLNEFVLWLEEADNIA


SIPLEPGKEQQLKEKLEQVKLLVEELPLRQGILKQLNETGGPVLVSAPISPEEQDKLENKL


KQTNLQWIKVSRALPEKQGEIEAQIKDLGQLEKKLEDLEEQLNHLLLWLSPIRNQLEIYN


QPNQEGPFDVKETEIAVQAKQPDVEEILSKGQHLYKEKPATQPVKRKLEDLSSEWKAVN


RLLQELRAKQPDLAPGLTTIGASPTQTVTLVTQPVVTKETAISKLEMPSSLMLEVPALAD


FNRAWTELTDWLSLLDQVIKSQRVMVGDLEDINEMIIKQKATMQDLEQRRPQLEELITA


AQNLKNKTSNQEARTIITDRIERIQNQWDEVQEHLQNRRQQLNEMLKDSTQWLEAKEE


AEQVLGQARAKLESWKEGPYTVDAIQKKITETKQLAKDLRQWQTNVDVANDLALKLL


RDYSADDTRKVHMITENINASWRSIHKRVSEREAALEETHRLLQQFPLDLEKFLAWLTE


AETTANVLQDATRKERLLEDSKGVKELMKQWQDLQGEIEAHTDVYHNLDENSQKILRS


LEGSDDAVLLQRRLDNMNFKWSELRKKSLNIRSHLEASSDQWKRLHLSLQELLVWLQL


KDDELSRQAPIGGDFPAVQKQNDVHRAFKRELKTKEPVIMSTLETVRIFLTEQPLEGLEK


LYQEPRELPPEERAQNVTRLLRKQAEEVNTEWEKLNLHSADWQRKIDETLERLRELQEA


TDELDLKLRQAEVIKGSWQPVGDLLIDSLQDHLEKVKALRGEIAPLKENVSHVNDLARQ


LTTLGIQLSPYNLSTLEDLNTRWKLLQVAVEDRVRQLHEAHRDFGPASQHFLSTSVQGP


WERAISPNKVPYYINHETQTTCWDHPKMTELYQSLADLNNVRFSAYRTAMKLRRLQKA


LCLDLLSLSAACDALDQHNLKQNDQPMDILQIINCLTTIYDRLEQEHNNLVNVPLCVDM


CLNWLLNVYDTGRTGRIRVLSFKTGIISLCKAHLEDKYRYLFKQVASSTGFCDQRRLGL


LLHDSIQIPRQLGEVASFGGSNIEPSVRSCFQFANNKPEIEAALFLDWMRLEPQSMVWLP


VLHRVAAAETAKHQAKCNICKECPIIGFRYRSLKHFNYDICQSCFFSGRVAKGHKMHYP


MVEYCTPTTSGEDVRDFAKVLKNKFRTKRYFAKHPRMGYLPVQTVLEGDNMETPVTLI


NFWPVDSAPASSPQLSHDDTHSRIEHYASRLAEMENSNGSYLNDSISPNESIDDEHLLIQH


YCQSLNQDSPLSQPRSPAQILISLESEERGELERILADLEEENRNLQAEYDRLKQQHEHKG


LSPLPSPPEMMPTSPQSPRDAELIAEAKLLRQHKGRLEARMQILEDHNKQLESQLHRLRQ


LLEQPQAEAKVNGTTVSSPSTSLQRSDSSQPMLLRVVGSQTSDSMGEEDLLSPPQDTSTG


LEEVMEQLNNSFPSSRGRNTPGKPMREDTM





SEQ ID NO: 24 amino acid sequence of human p53 protein


MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP


DEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTA


KSVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCP


HHERCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNY


MCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPH


HELPPGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAG


KEPGGSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD





SEQ ID NO: 25 amino acid sequence of human PTEN protein


HKNHYKIYNLCAERHYDTAKFNCRVAQYPFEDHNPPQLELIKPFCEDLDQWLSEDDNH


VAAIHCKAGKGRTGVMICAYLLHRGKFLKAQEALDFYGEVRTRDKKGVTIPSQRRYVY


YYSYLLKNHLDYRPVALLFHKMMFETIPMFSGGTCNPQFVVCQLKVKIYSSNSGPTRRE


DKFMYFEFPQPLPVCGDIKVEFFHKQNKMLKKDKMFHFWVNTFFIPGPEETSEKVENGS


LCDQEIDSICSIERADNDKEYLVLTLTKNDLDKANKDKANRYFSPNFKVKLYFTKTVEEP


SNPEASSSTSVTPDVSDNEPDHYRYSDTTDSDPENEPFDEDQHTQITKV





SEQ ID NO: 26 amino acid sequence of SARS-CoV-2 neutralizing antibody nAB-1


QVQLVESGGGLVQAGGSLRLSCAVSGAGAHRVGWFRRAPGKEREFVAAIGASGGMTN


YLDSVKGRFTISRDNAKNTIYLQMNSLKPQDTAVYYCAARDIETAEYIYWGQGTQVTVS


S





SEQ ID NO: 27 amino acid sequence of SARS-CoV-2 neutralizing antibody nAB-2


QVQLVESGGGLVQAGGSLRLSCAVSGLGAHRVGWFRRAPGKEREFVAAIGANGGNTN


YLDSVKGRFTISRDNAKNTIYLQMNSLKPQDTAVYYCAARDIETAEYTYWGQGTQVTV


SS





SEQ ID NO: 28 amino acid sequence of SARS-CoV-2 neutralizing antibody nAB-3


QVQLVESGGGLVQAGGSLRLSCAVSGAGAHRVGWFRRAPGKEREFVAAIGASGGMTN


YLDSVKGRFTISRDNAKNTIYLQMNSLKPQDTAVYYCAARDIETAEYIYWGQGTQVTVS


SKLGGGGSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQAGGSLRLSCAVSGAGA


HRVGWFRRAPGKEREFVAAIGASGGMTNYLDSVKGRFTISRDNAKNTIYLQMNSLKPQ


DTAVYYCAARDIETAEYIYWGQGTQVTVSSGGGGSGGGGSGGGGSGGGGSGGGGSQV


QLVESGGGLVQAGGSLRLSCAVSGAGAHRVGWFRRAPGKEREFVAAIGASGGMTNYL


DSVKGRFTISRDNAKNTIYLQMNSLKPQDTAVYYCAARDIETAEYIYWGQGTQVTVSS





SEQ ID NO: 29 amino acid sequence of SARS-CoV-2 neutralizing antibody nAB-4


QVQLVESGGGLVQAGGSLRLSCAVSGLGAHRVGWFRRAPGKEREFVAAIGANGGNTN


YLDSVKGRFTISRDNAKNTIYLQMNSLKPQDTAVYYCAARDIETAEYTYWGQGTQVTV


SSKLGGGGSGGGGSGGGGSGGGGSGGGGSSQVQLVESGGGLVQAGGSLRLSCAVSGL


GAHRVGWFRRAPGKEREFVAAIGANGGNTNYLDSVKGRFTISRDNAKNTIYLQMNSLK


PQDTAVYYCAARDIETAEYTYWGQGTQVTVSSGGGGSGGGGSGGGGSGGGGSGGGGS


QVQLVESGGGLVQAGGSLRLSCAVSGLGAHRVGWFRRAPGKEREFVAAIGANGGNTN


YLDSVKGRFTISRDNAKNTIYLQMNSLKPQDTAVYYCAARDIETAEYTYWGQGTQVTV


SS





SEQ ID NO: 30 amino acid sequence of SARS-CoV-2 neutralizing antibody nAB-5


QVQLVESGGGLVQAGGSLRLSCAASGYIFGRNAMGWYRQAPGKERELVAGITRRGSIT


YYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCAADPASPAYGDYWGQGTQV


TVSS





SEQ ID NO: 31 amino acid sequence of SARS-CoV-2 neutralizing antibody nAB-6


QVQLVESGGGLVQAGGSLRLSCAASGYIFGRNAMGWYRQAPGKERELVAGITRRGSIT


YYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCAADPASPAYGDYWGQGTQV


TVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQAGGSLRLSCAASGYIFGRNA


MGWYRQAPGKERELVAGITRRGSITYYADSVKGRFTISRDNAKNTVYLQMNSLKPEDT


AVYYCAADPASPAYGDYWGQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESG


GGLVQAGGSLRLSCAASGYIFGRNAMGWYRQAPGKERELVAGITRRGSITYYADSVKG


RFTISRDNAKNTVYLQMNSLKPEDTAVYYCAADPASPAYGDYWGQGTQVTVSS





SEQ ID NO: 32 amino acid sequence of SARS-CoV-2 neutralizing antibody nAB-7H


EVQLLESGGGVVQPGGSLRLSCAASGFAFTTYAMNWVRQAPGRGLEWVSAISDGGGSA


YYADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCAKTRGRGLYDYVWGSKDY


WGQGTLVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTS


GVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEPKSCDKTH


TCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVE


VHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKG


QPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDS


DGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK





SEQ ID NO: 33 amino acid sequence of SARS-CoV-2 neutralizing antibody nAB-7L


DIVMTQSPLSLPVTPGEPASISCRSSQSLLHSNGYNYLDWYLQKPGQSPQLLIYLGSNRAS


GVPDRFSGSGSGTDFTLKISRVEAEDVGVYYCMQALQTPGTFGQGTRLEIKRTVAAPSV


FIFPPSDEQLKSGTASVVCLLNNFYPREAKVQWKVDNALQSGNSQESVTEQDSKDSTYS


LSSTLTLSKADYEKHKVYACEVTHQGLSSPVTKSFNRGEC





SEQ ID NO: 34 amino acid sequence of ACE-binding-1


SAEIDLGKGDFREIRASEDAREAAEALAEAARAMKEALEIIREIAEKLRDSSRASEAAKRI


AKAIRKAADAIAEAAKIAARAAKDGDAARNAENAARKAKEFAEEQAKLADMYAELAK


NGDKSSVLEQLKTFADKAFHEMEDRFYQAALAVFEAAEAAAGGSGWGSG





SEQ ID NO: 35 amino acid sequence of ACE-binding-2


SAEIDLGKGDFREIRASEDAREAAEALAEAARAMKEALEIIREIAEKLRDSSRASEAAKRI


AKAIRKAADAIAEAAKIAARAAKDGDAARNAENAARKAKEFAEEQAKLADMYAELAK


NGDKSSVLEQLKTFADKAFHEMEDRFYQAALAVFEAAEAAAGGGGSGGSGSGGSGGG


SPGSAEIDLGKGDFREIRASEDAREAAEALAEAARAMKEALEIIREIAEKLRDSSRASEAA


KRIAKAIRKAADAIAEAAKIAARAAKDGDAARNAENAARKAKEFAEEQAKLADMYAE


LAKNGDKSSVLEQLKTFADKAFHEMEDRFYQAALAVFEAAEAAAGGSGWGS





SEQ ID NO: 36 Kozak nucleic acid sequence


GCCACCAUG





SEQ ID NO: 37 polyAC sequence


GAAAAACAAAAAACAAAAAAAACAAAAAAAAAACCAAAAAAACAAAACACA





SEQ ID NO: 38 m6A modification sequence


ACGAGTCCTGGACTGAAACGGACTTGT





SEQ ID NO: 39 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment


AAAAUCCGUUGACCUUAAACGGUCGUGUGGGUUCAAGUCCCUCCACCCCCAC





SEQ ID NO: 40 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment


GAGACGCUACGGACUU





SEQ ID NO: 41 Exemplary 5′ homology sequence


GGGAGACCCUCGACCGUCGAUUGUCCACUGGUC





SEQ ID NO: 42 Exemplary 3′ homology sequence


ACCAGUGGACAAUCGACGGAUAACAGCAUAUCUAG





SEQ ID NO: 43 T7 Promoter


UAAUACGACUCACUAUAGG





SEQ ID NO: 44 T2A peptide coding sequence


GAGGGCAGAGGAAGUCUUCUAACAUGCGGUGACGUGGAGGAGAAUCCCGGCCCU





SEQ ID NO: 45 P2A peptide coding sequence


GCUACUAACUUCAGCCUGCUGAAGCAGGCUGGAGACGUGGAGGAGAACCCUGGAC


CU
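
As a hedged illustration, and not part of the sequence listing, the following Python sketch translates the T2A and P2A coding sequences of SEQ ID NOs: 44 and 45 with the standard genetic code, so that the encoded peptides can be read directly. The names `translate_rna`, `T2A_RNA`, and `P2A_RNA` are assumptions introduced for this example; the RNA strings are copied from the listings above, with the two lines of SEQ ID NO: 45 joined.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1) in the RNA alphabet.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
RNA_CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_rna(rna: str) -> str:
    """Translate an in-frame RNA string into one-letter amino acids."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(RNA_CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


T2A_RNA = "GAGGGCAGAGGAAGUCUUCUAACAUGCGGUGACGUGGAGGAGAAUCCCGGCCCU"      # SEQ ID NO: 44
P2A_RNA = "GCUACUAACUUCAGCCUGCUGAAGCAGGCUGGAGACGUGGAGGAGAACCCUGGACCU"   # SEQ ID NO: 45

t2a_peptide = translate_rna(T2A_RNA)
p2a_peptide = translate_rna(P2A_RNA)
print(t2a_peptide)  # EGRGSLLTCGDVEENPGP
print(p2a_peptide)  # ATNFSLLKQAGDVEENPGP
```

Both translated peptides end in the NPGP motif that is characteristic of 2A peptides.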





SEQ ID NO: 46 catalytic Group I intron fragment


AACAAUAGAUGACUUACAACUAAUCGGAAGGUGCAGAGACUCGACGGGAGCUAC


CCUAACGUCAAGACGAGGGUAAAGAGAGAGUCCAAUUCUCAAAGCCAAUAGGCA


GUAGCGAAAGCUGCAAGAGAAUG





SEQ ID NO: 47 5′ catalytic Group I intron fragment


AAAUAAUUGAGCCUUAAAGAAGAAAUUCUUUAAGUGGAUGCUCUCAAACUCAGG


GAAACCUAAAUCUAGUUAUAGACAAGGCAAUCCUGAGCCAAGCCGAAGUAGUAA


UUAGUAAG





SEQ ID NO: 48 Nucleic acid sequence of full-length S protein sequence of SARS-CoV-2


ATGTTCGTTTTCCTTGTTCTGTTGCCTCTCGTTAGTAGCCAATGCGTCAACCTTACTA


CTAGAACCCAGCTCCCTCCAGCATATACCAACTCTTTCACCAGGGGCGTATATTACC


CGGACAAAGTGTTCCGCTCAAGTGTGCTGCATTCTACGCAGGACCTTTTCTTGCCCTT


TTTCAGTAATGTTACTTGGTTTCATGCTATCCATGTGTCTGGAACTAACGGAACCAA


GCGCTTTGACAACCCCGTCCTCCCTTTCAACGATGGCGTGTACTTCGCTTCCACGGA


AAAGTCAAACATAATTCGCGGCTGGATCTTTGGTACAACACTCGACTCAAAGACGC


AGAGCCTGCTGATCGTTAATAACGCTACAAATGTTGTGATAAAGGTGTGTGAATTTC


AGTTCTGCAATGATCCCTTCCTGGGTGTGTACTACCATAAGAATAACAAGAGCTGGA


TGGAATCCGAATTTAGGGTTTACAGTTCCGCTAACAACTGCACATTCGAATACGTAA


GCCAGCCATTTCTTATGGATCTTGAGGGCAAGCAAGGAAACTTCAAGAACTTGAGG


GAGTTCGTGTTCAAAAATATCGACGGCTATTTTAAGATATATAGCAAGCACACTCCA


ATAAACTTGGTGCGCGACCTGCCCCAGGGATTCTCTGCTCTGGAGCCCCTGGTGGAT


CTGCCCATTGGAATAAACATAACTCGCTTTCAAACACTGCTCGCCCTGCATCGCAGT


TACCTCACCCCTGGTGATAGTAGTTCAGGATGGACAGCAGGAGCCGCCGCATACTA


CGTCGGCTACCTGCAGCCTAGGACCTTCTTGCTGAAGTACAACGAGAACGGTACAAT


AACTGACGCTGTGGACTGCGCTCTGGACCCTCTGTCCGAGACGAAGTGCACCCTGAA


GAGCTTTACTGTTGAAAAAGGCATTTACCAAACCAGCAACTTCCGCGTCCAGCCAAC


CGAGAGCATCGTCAGATTTCCCAACATTACAAATCTGTGTCCCTTCGGCGAGGTGTT


CAACGCCACACGCTTCGCTTCAGTGTACGCATGGAACCGCAAGCGCATATCTAACTG


CGTCGCGGATTATTCTGTCCTCTACAACTCCGCCTCTTTCTCCACCTTCAAGTGCTAC


GGAGTGTCACCGACTAAGCTGAACGATCTCTGCTTTACCAACGTCTACGCGGACTCC


TTCGTGATAAGAGGTGATGAAGTGAGACAAATAGCCCCAGGTCAGACTGGTAAGAT


CGCAGATTACAACTACAAATTGCCTGATGATTTCACTGGTTGCGTTATCGCGTGGAA


CTCTAATAACCTCGATTCTAAGGTCGGTGGTAACTACAATTACCTGTACCGCTTGTTT


AGGAAGTCAAACCTGAAGCCTTTCGAGAGGGATATTTCAACCGAAATCTATCAAGC


GGGTTCAACACCGTGTAACGGTGTGGAAGGATTTAACTGCTACTTCCCCCTGCAGTC


TTACGGATTCCAGCCAACCAATGGCGTGGGTTACCAACCTTATCGCGTGGTGGTTCT


GAGTTTCGAACTGTTGCACGCTCCCGCCACGGTATGCGGTCCCAAGAAGAGCACTA


ACTTGGTGAAGAATAAGTGCGTGAATTTCAATTTCAATGGCCTCACTGGAACTGGAG


TGCTGACCGAATCCAATAAGAAGTTCTTGCCCTTCCAGCAGTTCGGAAGAGACATTG


CTGACACAACCGACGCGGTGCGCGATCCTCAGACTCTGGAGATATTGGACATTACA


CCATGTTCTTTCGGCGGTGTGTCTGTCATTACTCCGGGCACGAATACTAGCAACCAG


GTAGCCGTGCTGTACCAAGACGTGAATTGCACAGAGGTTCCCGTCGCAATTCACGCT


GACCAGCTGACCCCCACGTGGAGGGTTTACAGCACTGGTAGTAACGTCTTCCAGAC


GAGAGCCGGTTGCTTGATCGGAGCGGAACATGTGAATAACTCCTACGAGTGCGACA


TCCCCATCGGAGCCGGTATATGCGCCTCTTATCAGACACAAACTAACTCACCCAGGA


GAGCCCGCAGTGTGGCTTCTCAAAGCATTATAGCATACACTATGTCTCTTGGTGCCG


AAAATTCCGTGGCCTATTCTAACAATTCAATCGCCATCCCAACCAACTTCACAATTA


GCGTGACTACCGAAATACTGCCTGTGAGCATGACGAAAACCAGCGTAGACTGCACT


ATGTATATCTGTGGAGACTCCACTGAGTGCTCCAACCTTCTCCTGCAGTACGGTAGC


TTCTGTACCCAATTGAACCGCGCCCTTACAGGCATCGCTGTTGAGCAAGATAAGAAT


ACCCAGGAAGTTTTTGCCCAGGTTAAGCAGATATACAAAACACCGCCCATTAAGGA


CTTCGGAGGCTTCAACTTCTCTCAGATACTGCCTGACCCCTCCAAGCCATCAAAACG


CAGCTTCATTGAGGACCTCTTGTTCAACAAAGTGACTCTGGCTGATGCTGGCTTCATT


AAGCAGTACGGAGATTGCCTGGGAGATATTGCTGCCAGGGACCTCATCTGCGCCCA


GAAGTTTAATGGCCTGACAGTCTTGCCCCCACTTCTGACAGACGAGATGATTGCTCA


GTACACATCTGCCCTCCTCGCTGGCACCATAACATCCGGATGGACATTTGGTGCTGG


TGCTGCCCTCCAGATTCCCTTCGCAATGCAGATGGCGTATCGCTTTAACGGCATCGG


TGTCACACAAAACGTGTTGTATGAGAACCAAAAGCTCATCGCTAACCAGTTTAATTC


TGCTATTGGTAAGATTCAGGACAGCCTGTCATCAACCGCGTCTGCCCTTGGTAAGTT


GCAGGACGTGGTGAACCAGAATGCTCAGGCTTTGAATACTCTGGTGAAGCAACTCTC


TTCAAATTTCGGCGCTATCTCTTCTGTGTTGAACGACATCCTGAGTCGCCTTGATAAG


GTGGAAGCTGAAGTTCAAATTGATAGATTGATTACTGGCAGGCTCCAGTCTTTGCAG


ACCTACGTTACACAGCAGCTGATTAGGGCGGCTGAAATTAGAGCTTCCGCCAATCTG


GCTGCAACCAAGATGTCCGAATGCGTCCTGGGTCAGTCAAAGCGCGTTGACTTTTGT


GGTAAAGGCTACCACCTCATGTCATTTCCCCAGTCAGCACCTCACGGAGTAGTGTTC


CTCCACGTCACCTACGTTCCAGCACAGGAAAAGAATTTTACCACTGCGCCGGCAATC


TGTCACGACGGTAAGGCACACTTCCCCCGCGAGGGCGTATTCGTGTCTAACGGAACT


CATTGGTTCGTCACACAGAGAAACTTCTATGAGCCTCAGATCATTACCACCGACAAT


ACATTTGTGTCCGGTAACTGCGACGTTGTGATTGGAATCGTCAACAACACTGTGTAC


GATCCACTTCAGCCAGAACTGGATAGCTTCAAGGAAGAATTGGACAAATATTTCAA


AAATCACACTTCACCCGATGTGGACCTGGGTGACATTAGTGGTATCAATGCGTCCGT


GGTCAATATTCAAAAAGAGATTGACAGGCTCAACGAAGTGGCCAAGAACCTGAACG


AAAGTCTTATCGATCTGCAAGAATTGGGAAAGTATGAGCAGTACATCAAGTGGCCG


TGGTACATTTGGTTGGGTTTTATCGCCGGTCTGATCGCCATCGTTATGGTTACCATTA


TGCTTTGCTGCATGACGAGCTGTTGCTCCTGTCTGAAGGGATGCTGCTCTTGCGGATC


ATGTTGCAAGTTCGATGAAGACGATAGCGAACCAGTTCTGAAGGGCGTCAAGCTGC


ATTACACA





SEQ ID NO: 49 Nucleic acid sequence of the RBD amino acid residues


319-542 of S protein


CGCGTCCAGCCAACCGAGAGCATCGTCAGATTTCCCAACATTACAAATCTGTGTCCC


TTCGGCGAGGTGTTCAACGCCACACGCTTCGCTTCAGTGTACGCATGGAACCGCAAG


CGCATATCTAACTGCGTCGCGGATTATTCTGTCCTCTACAACTCCGCCTCTTTCTCCA


CCTTCAAGTGCTACGGAGTGTCACCGACTAAGCTGAACGATCTCTGCTTTACCAACG


TCTACGCGGACTCCTTCGTGATAAGAGGTGATGAAGTGAGACAAATAGCCCCAGGT


CAGACTGGTAAGATCGCAGATTACAACTACAAATTGCCTGATGATTTCACTGGTTGC


GTTATCGCGTGGAACTCTAATAACCTCGATTCTAAGGTCGGTGGTAACTACAATTAC


CTGTACCGCTTGTTTAGGAAGTCAAACCTGAAGCCTTTCGAGAGGGATATTTCAACC


GAAATCTATCAAGCGGGTTCAACACCGTGTAACGGTGTGGAAGGATTTAACTGCTAC


TTCCCCCTGCAGTCTTACGGATTCCAGCCAACCAATGGCGTGGGTTACCAACCTTAT


CGCGTGGTGGTTCTGAGTTTCGAACTGTTGCACGCTCCCGCCACGGTATGCGGTCCC


AAGAAGAGCACTAACTTGGTGAAGAATAAGTGCGTGAATTTC





SEQ ID NO: 50 Nucleic acid sequence of the C-terminal Foldon domain


of a T4 fibritin protein


GGAAGCGGCTACATCCCAGAAGCCCCTAGAGACGGACAGGCTTACGTGCGAAAAGA


CGGCGAGTGGGTGCTGCTGAGCACATTCCTGGGAAGGAGC





SEQ ID NO: 51 Nucleic acid sequence of the GCN4-based isoleucine


zipper domain


CGAATGAAGCAGATTGAGGATAAAATTGAGGAGATTCTCAGCAAAATTTACCACAT


AGAAAATGAGATCGCTCGGATTAAAAAACTGATCGGAGAAAGA





SEQ ID NO: 52 Nucleic acid sequence of the GS peptide linker


GGCGGAGGAGGCAGCGGCGGAGGAGGCAGC
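
As an illustrative check that is not part of the original listing, the nucleic acid sequence of SEQ ID NO: 52 can be translated with the standard genetic code to confirm that it encodes a linker made of two GGGGS repeats. The following is a minimal Python sketch; the helper name `translate` and the way the codon table is built are assumptions introduced here for illustration only.

```python
# Minimal sketch: translate a DNA coding sequence with the standard genetic code.
# `translate` is a hypothetical helper, not something defined in this application.
BASES = "TCAG"
# Standard genetic code, ordered by first, second and third base in T, C, A, G order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA (or RNA) sequence; '*' marks a stop codon."""
    dna = dna.upper().replace("U", "T")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# SEQ ID NO: 52 as listed above.
SEQ_ID_NO_52 = "GGCGGAGGAGGCAGCGGCGGAGGAGGCAGC"
print(translate(SEQ_ID_NO_52))  # GGGGSGGGGS
```

The same helper could be applied to the other coding sequences in this listing, provided the sequence passed in starts on a codon boundary.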





SEQ ID NO: 53 CVB3 virus IRES


TTAAAACAGCCTGTGGGTTGATCCCACCCACAGGCCCATTGGGCGCTAGCACTCTGG


TATCACGGTACCTTTGTGCGCCTGTTTTATACCCCCTCCCCCAACTGTAACTTAGAAG


TAACACACACCGATCAACAGTCAGCGTGGCACACCAGCCACGTTTTGATCAAGCAC


TTCTGTTACCCCGGACTGAGTATCAATAGACTGCTCACGCGGTTGAAGGAGAAAGCG


TTCGTTATCCGGCCAACTACTTCGAAAAACCTAGTAACACCGTGGAAGTTGCAGAGT


GTTTCGCTCAGCACTACCCCAGTGTAGATCAGGTCGATGAGTCACCGCATTCCCCAC


GGGCGACCGTGGCGGTGGCTGCGTTGGCGGCCTGCCCATGGGGAAACCCATGGGAC


GCTCTAATACAGACATGGTGCGAAGAGTCTATTGAGCTAGTTGGTAGTCCTCCGGCC


CCTGAATGCGGCTAATCCTAACTGCGGAGCACACACCCTCAAGCCAGAGGGCAGTG


TGTCGTAACGGGCAACTCTGCAGCGGAACCGACTACTTTGGGTGTCCGTGTTTCATT


TTATTCCTATACTGGCTGCTTATGGTGACAATTGAGAGATCGTTACCATATAGCTATT


GGATTGGCCATCCGGTGACTAATAGAGCTATTATATATCCCTTTGTTGGGTTTATACC


ACTTAGCTTGAAAGAGGTTAAAACATTACAATTCATTGTTAAGTTGAATACAGCAAA





SEQ ID NO: 54 amino acid sequence of human Fumarylacetoacetase


(FAH) protein


MSFIPVAEDSDFPIHNLPYGVFSTRGDPRPRIGVAIGDQILDLSIIKHLFTGPVLSKHQDVF


NQPTLNSFMGLGQAAWKEARVFLQNLLSVSQARLRDDTELRKCAFISQASATMHLPATI


GDYTDFYSSRQHATNVGIMFRDKENALMPNWLHLPVGYHGRASSVVVSGTPIRRPMGQ


DIQKWEYVPLGPFLGKSFGTTVSPWVVPMDALMPFAVPNPKQDPRPLPYLCHDEPYTFD


INLSVNLKGEGMSQAATICKSNFKYMYWTMLQQLTHHSVNGCNLRPGDLLASGTISGP


EPENFGSMLELSWKGTKPIDLGNGQTRKFLLDGDEVIITGYCQGDGYRIGFGQCAGKVL


PALLPS





SEQ ID NO: 55 amino acid sequence of human Ornithine carbamoyltransferase


(OTC) protein


MLFNLRILLNNAAFRNGHNFMVRNFRCGQPLQNKVQLKGRDLLTLKNFTGEEIKYMLW


LSADLKFRIKQKGEYLPLLQGKSLGMIFEKRSTRTRLSTETGLALLGGHPCFLTTQDIHLG


VNESLTDTARVLSSMADAVLARVYKQSDLDTLAKEASIPIINGLSDLYHPIQILADYLTL


QEHYSSLKGLTLSWIGDGNNILHSIMMSAAKFGMHLQAATPKGYEPDASVTKLAEQYA


KENGTKLLLTNDPLEAAHGGNVLITDTWISMGQEEEKKKRLQAFQGYQVTMKTAKVA


ASDWTFLHCLPRKPEEVDDEVFYSPRSLVFPEAENRKWTIMAVMVSLLTDYSPQLQKPK


F





SEQ ID NO: 56 amino acid sequence of human COL3A1 protein


MMSFVQKGSWLLLALLHPTIILAQQEAVEGGCSHLGQSYADRDVWKPEPCQICVCDSG


SVLCDDIICDDQELDCPNPEIPFGECCAVCPQPPTAPTRPPNGQGPQGPKGDPGPPGIPGR


NGDPGIPGQPGSPGSPGPPGICESCPTGPQNYSPQYDSYDVKSGVAVGGLAGYPGPAGPP


GPPGPPGTSGHPGSPGSPGYQGPPGEPGQAGPSGPPGPPGAIGPSGPAGKDGESGRPGRP


GERGLPGPPGIKGPAGIPGFPGMKGHRGFDGRNGEKGETGAPGLKGENGLPGENGAPGP


MGPRGAPGERGRPGLPGAAGARGNDGARGSDGQPGPPGPPGTAGFPGSPGAKGEVGPA


GSPGSNGAPGQRGEPGPQGHAGAQGPPGPPGINGSPGGKGEMGPAGIPGAPGLMGARG


PPGPAGANGAPGLRGGAGEPGKNGAKGEPGPRGERGEAGIPGVPGAKGEDGKDGSPGE


PGANGLPGAAGERGAPGFRGPAGPNGIPGEKGPAGERGAPGPAGPRGAAGEPGRDGVP


GGPGMRGMPGSPGGPGSDGKPGPPGSQGESGRPGPPGPSGPRGQPGVMGFPGPKGNDG


APGKNGERGGPGGPGPQGPPGKNGETGPQGPPGPTGPGGDKGDTGPPGPQGLQGLPGT


GGPPGENGKPGEPGPKGDAGAPGAPGGKGDAGAPGERGPPGLAGAPGLRGGAGPPGPE


GGKGAAGPPGPPGAAGTPGLQGMPGERGGLGSPGPKGDKGEPGGPGADGVPGKDGPR


GPTGPIGPPGPAGQPGDKGEGGAPGLPGIAGPRGSPGERGETGPPGPAGFPGAPGQNGEP


GGKGERGAPGEKGEGGPPGVAGPPGKDGTSGHPGPIGPPGPRGNRGERGSEGSPGHPGQ


PGPPGPPGAPGPCCGGVGAAAIAGIGGEKAGGFAPYYGDEPMDFKINTDEIMTSLKSVN


GQIESLISPDGSRKNPARNCRDLKFCHPELKSGEYWVDPNQGCKLDAIKVFCNMETGET


CISANPLNVPRKHWWTDSSAEKKHVWFGESMDGGFQFSYGNPELPEDVLDVQLAFLRL


LSSRASQNITYHCKNSIAYMDQASGNVKKALKLMGSNEGEFKAEGNSKFTYTVLEDGC


TKHTGEWSKTVFEYRTRKAVRLPIVDIAPYDIGGPDQEFGVDVGPVCFL





SEQ ID NO: 57 amino acid sequence of human BMPR2 protein


MTSSLQRPWRVPWLPWTILLVSTAAASQNQERLCAFKDPYQQDLGIGESRISHENGTILC


TDLCNVNFTENFPPPDTTPLSPPHSFNRDETIIIALASVSVLAVLIVALCFGYRMLTGDRK


QGLHSMNMMEAAASEPSLDLDNLKLLELIGRGRYGAVYKGSLDERPVAVKVFSFANRQ


NFINEKNIYRVPLMEHDNIARFIVGDERVTADGRMEYLLVMEYYPNGSLCKYLSLHTSD


WVSSCRLAHSVTRGLAYLHTELPRGDHYKPAISHRDLNSRNVLVKNDGTCVISDFGLSM


RLTGNRLVRPGEEDNAAISEVGTIRYMAPEVLEGAVNLRDCESALKQVDMYALGLIYW


EIFMRCTDLFPGESVPEYQMAFQTEVGNHPTFEDMQVLVSREKQRPKFPEAWKENSLA


VRSLKETIEDCWDQDAEARLTAQCAEERMAELMMIWERNKSVSPTVNPMSTAMQNER


NLSHNRRVPKIGPYPDYSSSSYIEDSIHHTDSIVKNISSEHSMSSTPLTIGEKNRNSINYERQ


QAQARIPSPETSVTSLSTNTTTTNTTGLTPSTGMTTISEMPYPDETNLHTTNVAQSIGPTP


VCLQLTEEDLETNKLDPKEVDKNLKESSDENLMEHSLKQFSGPDPLSSTSSSLLYPLIKL


AVEATGQQDFTQTANGQACLIPDVLPTQIYPLPKQQNLPKRPTSLPLNTKNSTKEPRLKF


GSKHKSNLKQVETGVAKMNTINAAEPHVVTVTMNGVAGRNHSVNSHAATTQYANGT


VLSGQTTNIVTHRAQEMLQNQFIGEDTRLNINSSPDEHEPLLRREQQAGHDEGVLDRLV


DRRERPLEGGRTNSNNNNSNPCSEQDVLAQGVPSTAADPGPSKPRRAQRPNSLDLSATN


VLDGSSIQIGESTQDGKSGSGEKIKKRVKTPYSLKRWRPSTWVISTESLDCEVNNNGSNR


AVHSKSSTAVYLAEGGTATTMVSKDIGMNCL





SEQ ID NO: 58 amino acid sequence of human AHI1 protein


MPTAESEAKVKTKVRFEELLKTHSDLMREKKKLKKKLVRSEENISPDTIRSNLHYMKET


TSDDPDTIRSNLPHIKETTSDDVSAANTNNLKKSTRVTKNKLRNTQLATENPNGDASVEE


DKQGKPNKKVIKTVPQLTTQDLKPETPENKVDSTHQKTHTKPQPGVDHQKSEKANEGR


EETDLEEDEELMQAYQCHVTEEMAKEIKRKIRKKLKEQLTYFPSDTLFHDDKLSSEKRK


KKKEVPVFSKAETSTLTISGDTVEGEQKKESSVRSVSSDSHQDDEISSMEQSTEDSMQDD


TKPKPKKTKKKTKAVADNNEDVDGDGVHEITSRDSPVYPKCLLDDDLVLGVYIHRTDR


LKSDFMISHPMVKIHVVDEHTGQYVKKDDSGRPVSSYYEKENVDYILPIMTQPYDFKQL


KSRLPEWEEQIVFNENFPYLLRGSDESPKVILFFEILDFLSVDEIKNNSEVQNQECGFRKIA


WAFLKLLGANGNANINSKLRLQLYYPPTKPRSPLSVVEAFEWWSKCPRNHYPSTLYVV


RGLKVPDCIKPSYRSMMAPQEEKGKPVHCERHHESSSVDTEPGLEESKEVIKWKRLPGQ


NIIYDLSWSKDDHYILTSSSDGTARIWKNEINNTNTFRVLPHPSFVYTAKFHPAVRELVV


TGCYDSMIRIWKVEMREDSAILVRQFDVHKSFINSLCFDTEGHHMYSGDCTGVIVVWNT


YVKINDLEHSVHHWTINKEIKETEFKGIPISYLEIHPNGKRLLIHTKDSTLRIMDLRILVAR


KFVGAANYREKIHSTLTPCGTFLFAGSEDGIVYVWNPETGEQVAMYSDLPFKSPIRDISY


HPFENMVAFCAFGQNEPILLYIYDFHVAQQEAEMFKRYNGTFPLPGIHQSQDALCTCPK


LPHQGSFQIDEFVHTESSSTKMQLVKQRLETVTEVIRSCAAKVNKNLSFTSPPAVSSQQS


KLKQSNMLTAQEILHQFGFTQTGIISIERKPCNHQVDTAPTVVALYDYTANRSDELTIHR


GDIIRVFFKDNEDWWYGSIGKGQEGYFPANHVASETLYQELPPEIKERSPPLSPEEKTKIE


KSPAPQKQSINKNKSQDFRLGSESMTHSEMRKEQSHEDQGHIMDTRMRKNKQAGRKVT


LIE





SEQ ID NO: 59 amino acid sequence of human FANCC protein


MAQDSVDLSCDYQFWMQKLSVWDQASTLETQQDTCLHVAQFQEFLRKMYEALKEMD


SNTVIERFPTIGQLLAKACWNPFILAYDESQKILIWCLCCLINKEPQNSGQSKLNSWIQGV


LSHILSALRFDKEVALFTQGLGYAPIDYYPGLLKNMVLSLASELRENHLNGFNTQRRMA


PERVASLSRVCVPLITLTDVDPLVEALLICHGREPQEILQPEFFEAVNEAILLKKISLPMSA


VVCLWLRHLPSLEKAMLHLFEKLISSERNCLRRIECFIKDSSLPQAACHPAIFRVVDEMFR


CALLETDGALEIIATIQVFTQCFVEALEKASKQLRFALKTYFPYTSPSLAMVLLQDPQDIP


RGHWLQTLKHISELLREAVEDQTHGSCGGPFESWFLFIHFGGWAEMVAEQLLMSAAEP


PTALLWLLAFYYGPRDGRQQRAQTMVQVKAVLGHLLAMSRSSSLSAQDLQTVAGQGT


DTDLRAPAQQLIRHLLLNFLLWAPGGHTIAWDVITLMAHTAEITHEIIGFLDQTLYRWNR


LGIESPRSEKLARELLKELRTQV





SEQ ID NO: 60 amino acid sequence of human MYBPC3 protein


MPEPGKKPVSAFSKKPRSVEVAAGSPAVFEAETERAGVKVRWQRGGSDISASNKYGLA


TEGTRHTLAVREVGPADQGSYAVIAGSSKVKFDLKVIEAEEAEPMLAPAPAPAEATGAP


GEAPAPAAELGESAPSPKGSSSAALNGPTPGAPDDPIGLFVMRPQDGEVTVGGSITFSAR


VAGASLLKPPVVKWFKGKWVDLSSKVGQHLQLHDSYDRASKVYLFELHITDAQPAFTG


SYRCEVSTKDKFDCSNFNLTVHEAMGTGDLDLLSAFRRTSLAGGGRRISDSHEDTGILDF


SSLLKKRDSFRTPRDSKLEAPAEEDVWETLRQAPPSEYERIAFQYGVTDLRGMLKRLKG


MRRDEKKSTAFQKKLEPAYQVSKGHKIRLTVELADHDAEVKWLKDGQEIQMSGSKYIF


ESIGAKRTLTISQCSLADDAAYQCVVGGEKCSTELFVKEPPVLITRPLEDQLVMVGQRVE


FECEVSEEGAQVKWLKDGVELTREETFKYRFKKDGQRHHLIINEAMLEDAGHYALCTS


GGQALAELIVQEKKLEVYQSIADLMVGAKDQAVFKCEVSDENVRGVWLKNGKELVPD


SRIKVSHIGRVHKLTIDDVTPADEADYSFVPEGFACNLSAKLHFMEVKIDFVPRQEPPKIH


LDCPGRIPDTIVVVAGNKLRLDVPISGDPAPTVIWQKAITQGNKAPARPAPDAPEDTGDS


DEWVFDKKLLCETEGRVRVETTKDRSIFTVEGAEKEDEGVYTVTVKNPVGEDQVNLTV


KVIDVPDAPAAPKISNVGEDSCTVQWEPPAYDGGQPILGYILERKKKKSYRWMRLNFDL


IQELSHEARRMIEGVVYEMRVYAVNAIGMSRPSPASQPFMPIGPPSEPTHLAVEDVSDTT


VSLKWRPPERVGAGGLDGYSVEYCPEGCSEWVAALQGLTEHTSILVKDLPTGARLLSR


VRAHNMAGPGAPVTTTEPVTVQEILQRPRLQLPRHLRQTIQKKVGEPVNLLIPFQGKPRP


QVTWTKEGQPLAGEEVSIRNSPTDTILFIRAARRVHSGTYQVTVRIENMEDKATLVLQV


VDKPSPPQDLRVTDAWGLNVALEWKPPQDVGNTELWGYTVQKADKKTMEWFTVLEH


YRRTHCVVPELIIGNGYYFRVFSQNMVGFSDRAATTKEPVFIPRPGITYEPPNYKALDFSE


APSFTQPLVNRSVIAGYTAMLCCAVRGSPKPKISWFKNGLDLGEDARFRMFSKQGVLTL


EIRKPCPFDGGIYVCRATNLQGEARCECRLEVRVPQ





SEQ ID NO: 61 amino acid sequence of human IL2RG protein


MLKPSLPFTSLLFLQLPLLGVGLNTTILTPNGNEDTTADFFLTTMPTDSLSVSTLPLPEVQ


CFVFNVEYMNCTWNSSSEPQPTNLTLHYWYKNSDNDKVQKCSHYLFSEEITSGCQLQK


KEIHLYQTFVVQLQDPREPRRQATQMLKLQNLVIPWAPENLTLHKLSESQLELNWNNRF


LNHCLEHLVQYRTDWDHSWTEQSVDYRHKFSLPSVDGQKRYTFRVRSRFNPLCGSAQH


WSEWSHPIHWGSNTSKENPFLFALEAVVISVGSMGLIISLLCVYFWLERTMPRIPTLKNL


EDLVTEYHGNFSAWSGVSKGLAESLQPDYSERLCLVSEIPPKGGALGEGPGASPCNQHS


PYWAPPCYTLKPET





SEQ ID NO: 62 amino acid residues 2-1273 sequence of S protein of


SARS-CoV-2, K986P V987P


FVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSN


VTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVN


NATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDL


EGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTL


LALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETK


CTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNC


VADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIAD


YNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTP


CNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNK


CVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVI


TPGTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHV


NNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNF


TISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNT


QEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGD


CLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFA


MQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQ


ALNTLVKQLSSNFGAISSVLNDILSRLDPPEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIR


ASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTT


APAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTV


YDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLI


DLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFD


EDDSEPVLKGVKLHYT





SEQ ID NO: 63 amino acid sequence of SARS-CoV-2 strain B.1.351 RBD


RVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFK


CYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGNIADYNYKLPDDFTGCVIAWN


SNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVKGFNCYFPLQSYGF


QPTYGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNF





SEQ ID NO: 64 Nucleic acid sequence encoding SARS-CoV-2 strain B.1.351 RBD


CGCGTCCAGCCAACCGAGAGCATCGTCAGATTTCCCAACATTACAAATCTGTGTCCC


TTCGGCGAGGTGTTCAACGCCACACGCTTCGCTTCAGTGTACGCATGGAACCGCAAG


CGCATATCTAACTGCGTCGCGGATTATTCTGTCCTCTACAACTCCGCCTCTTTCTCCA


CCTTCAAGTGCTACGGAGTGTCACCGACTAAGCTGAACGATCTCTGCTTTACCAACG


TCTACGCGGACTCCTTCGTGATAAGAGGTGATGAAGTGAGACAAATAGCCCCAGGT


CAGACTGGTAACATCGCAGATTACAACTACAAATTGCCTGATGATTTCACTGGTTGC


GTTATCGCGTGGAACTCTAATAACCTCGATTCTAAGGTCGGTGGTAACTACAATTAC


CTGTACCGCTTGTTTAGGAAGTCAAACCTGAAGCCTTTCGAGAGGGATATTTCAACC


GAAATCTATCAAGCGGGTTCAACACCGTGTAACGGTGTGAAAGGATTTAACTGCTAC


TTCCCCCTGCAGTCTTACGGATTCCAGCCAACCTATGGCGTGGGTTACCAACCTTAT


CGCGTGGTGGTTCTGAGTTTCGAACTGTTGCACGCTCCCGCCACGGTATGCGGTCCC


AAGAAGAGCACTAACTTGGTGAAGAATAAGTGCGTGAATTTC
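
As an illustrative check that is not part of the original listing, SEQ ID NO: 64 can be compared codon by codon with SEQ ID NO: 49 to list the amino acid substitutions that distinguish the B.1.351 RBD coding sequence from the reference RBD coding sequence. The following is a minimal Python sketch; the helper name `substitutions` is hypothetical, and the numbering assumes that the first codon of each sequence corresponds to residue 319 of the S protein, as stated in the heading of SEQ ID NO: 49.

```python
# Minimal sketch: list amino acid substitutions between two in-frame coding sequences.
# `substitutions` is a hypothetical helper, not something defined in this application.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# SEQ ID NO: 49 (reference RBD coding sequence), copied from the listing above.
SEQ_ID_NO_49 = (
    "CGCGTCCAGCCAACCGAGAGCATCGTCAGATTTCCCAACATTACAAATCTGTGTCCC"
    "TTCGGCGAGGTGTTCAACGCCACACGCTTCGCTTCAGTGTACGCATGGAACCGCAAG"
    "CGCATATCTAACTGCGTCGCGGATTATTCTGTCCTCTACAACTCCGCCTCTTTCTCCA"
    "CCTTCAAGTGCTACGGAGTGTCACCGACTAAGCTGAACGATCTCTGCTTTACCAACG"
    "TCTACGCGGACTCCTTCGTGATAAGAGGTGATGAAGTGAGACAAATAGCCCCAGGT"
    "CAGACTGGTAAGATCGCAGATTACAACTACAAATTGCCTGATGATTTCACTGGTTGC"
    "GTTATCGCGTGGAACTCTAATAACCTCGATTCTAAGGTCGGTGGTAACTACAATTAC"
    "CTGTACCGCTTGTTTAGGAAGTCAAACCTGAAGCCTTTCGAGAGGGATATTTCAACC"
    "GAAATCTATCAAGCGGGTTCAACACCGTGTAACGGTGTGGAAGGATTTAACTGCTAC"
    "TTCCCCCTGCAGTCTTACGGATTCCAGCCAACCAATGGCGTGGGTTACCAACCTTAT"
    "CGCGTGGTGGTTCTGAGTTTCGAACTGTTGCACGCTCCCGCCACGGTATGCGGTCCC"
    "AAGAAGAGCACTAACTTGGTGAAGAATAAGTGCGTGAATTTC"
)

# SEQ ID NO: 64 (B.1.351 RBD coding sequence), copied from the listing above.
SEQ_ID_NO_64 = (
    "CGCGTCCAGCCAACCGAGAGCATCGTCAGATTTCCCAACATTACAAATCTGTGTCCC"
    "TTCGGCGAGGTGTTCAACGCCACACGCTTCGCTTCAGTGTACGCATGGAACCGCAAG"
    "CGCATATCTAACTGCGTCGCGGATTATTCTGTCCTCTACAACTCCGCCTCTTTCTCCA"
    "CCTTCAAGTGCTACGGAGTGTCACCGACTAAGCTGAACGATCTCTGCTTTACCAACG"
    "TCTACGCGGACTCCTTCGTGATAAGAGGTGATGAAGTGAGACAAATAGCCCCAGGT"
    "CAGACTGGTAACATCGCAGATTACAACTACAAATTGCCTGATGATTTCACTGGTTGC"
    "GTTATCGCGTGGAACTCTAATAACCTCGATTCTAAGGTCGGTGGTAACTACAATTAC"
    "CTGTACCGCTTGTTTAGGAAGTCAAACCTGAAGCCTTTCGAGAGGGATATTTCAACC"
    "GAAATCTATCAAGCGGGTTCAACACCGTGTAACGGTGTGAAAGGATTTAACTGCTAC"
    "TTCCCCCTGCAGTCTTACGGATTCCAGCCAACCTATGGCGTGGGTTACCAACCTTAT"
    "CGCGTGGTGGTTCTGAGTTTCGAACTGTTGCACGCTCCCGCCACGGTATGCGGTCCC"
    "AAGAAGAGCACTAACTTGGTGAAGAATAAGTGCGTGAATTTC"
)


def substitutions(ref: str, alt: str, first_residue: int) -> list[str]:
    """Return substitutions such as 'K417N' between two equal-length coding sequences."""
    if len(ref) != len(alt) or len(ref) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    found = []
    for i in range(0, len(ref), 3):
        ref_aa = CODON_TABLE[ref[i:i + 3]]
        alt_aa = CODON_TABLE[alt[i:i + 3]]
        if ref_aa != alt_aa:
            found.append(f"{ref_aa}{first_residue + i // 3}{alt_aa}")
    return found


# Residue 319 as the first position is an assumption taken from the SEQ ID NO: 49 heading.
print(substitutions(SEQ_ID_NO_49, SEQ_ID_NO_64, 319))  # ['K417N', 'E484K', 'N501Y']
```

Comparing at the codon level rather than base by base reports only changes that alter the encoded amino acid, so any silent nucleotide differences between the two sequences would not appear in the output.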








Claims
  • 1. A circular RNA (circRNA) comprising a nucleic acid sequence encoding a therapeutic polypeptide, wherein the therapeutic polypeptide is selected from the group consisting of an antigenic polypeptide, a functional protein, a receptor protein, and a targeting protein.
  • 2. The circRNA of claim 1, further comprising a Kozak sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide.
  • 3. The circRNA of claim 1, further comprising an in-frame 2A peptide coding sequence operably linked to the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide.
  • 4. The circRNA of claim 1, further comprising an internal ribosomal entry site (IRES) sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide.
  • 5. The circRNA of claim 4, comprising a nucleic acid sequence comprising from the 5′ end to the 3′ end: the IRES sequence, the Kozak sequence, and the nucleic acid sequence encoding the therapeutic polypeptide.
  • 6. The circRNA of claim 4, further comprising a polyAC or polyA sequence disposed at the 5′ end of the IRES sequence.
  • 7. The circRNA of claim 1, further comprising an m6A modification motif sequence operably linked to the nucleic acid sequence encoding the therapeutic polypeptide.
  • 8. The circRNA of claim 7, comprising a nucleic acid sequence comprising from the 5′ end to the 3′ end: the m6A modification motif sequence, the Kozak sequence, and the nucleic acid sequence encoding the therapeutic polypeptide.
  • 9. The circRNA of claim 1, further comprising a 3′ exon sequence recognizable by a 3′ catalytic Group I intron fragment flanking the 5′ end of the nucleic acid sequence encoding the therapeutic polypeptide, and a 5′ exon sequence recognizable by a 5′ catalytic Group I intron fragment flanking the 3′ end of the nucleic acid sequence encoding the therapeutic polypeptide.
  • 10. The circRNA of claim 1, wherein the therapeutic polypeptide is for treating or preventing an infection.
  • 11. The circRNA of claim 10, wherein the infection is an infection by a virus.
  • 12. The circRNA of claim 11, wherein the virus is a coronavirus.
  • 13. The circRNA of claim 12, wherein the coronavirus is SARS-CoV-2.
  • 14. The circRNA of claim 1, wherein the therapeutic polypeptide is an antigenic polypeptide.
  • 15. The circRNA of claim 14, wherein the antigenic polypeptide comprises a Spike (S) protein or a fragment thereof of a coronavirus.
  • 16. The circRNA of claim 15, wherein the antigenic polypeptide comprises a receptor-binding domain (RBD) of the S protein.
  • 17. The circRNA of claim 14, wherein the antigenic polypeptide further comprises a multimerization domain.
  • 18. The circRNA of claim 14, wherein the antigenic polypeptide comprises an S2 region of the S protein.
  • 19. The circRNA of claim 15,
  • 20. The circRNA of claim 16, wherein the antigenic polypeptide comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 8-10 or SEQ ID NOs: 62-63, and/or wherein the circRNA comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 11-15 and 64.
  • 21. The circRNA of claim 1, wherein the therapeutic polypeptide is a receptor protein.
  • 22. The circRNA of claim 21, wherein the receptor is an ACE2 receptor.
  • 23. The circRNA of claim 22, wherein the receptor is a high-affinity mutant ACE2 receptor.
  • 24. The circRNA of claim 1, wherein the therapeutic polypeptide is a targeting protein.
  • 25. The circRNA of claim 24, wherein the targeting protein is an antibody.
  • 26. The circRNA of claim 25, wherein the antibody is a neutralizing antibody.
  • 27. The circRNA of claim 24, wherein the targeting protein is a therapeutic antibody.
  • 28. The circRNA of claim 1, wherein the therapeutic polypeptide is a functional protein.
  • 29. The circRNA of claim 28, wherein the functional protein is a tumor suppressor.
  • 30. The circRNA of claim 28, wherein the functional protein is an enzyme.
  • 31. The circRNA of claim 30, wherein the functional protein is selected from the group consisting of DMD, COL3A1, BMPR2, AHI1, FANCC, MYBPC3, and IL2RG.
  • 32. A composition comprising a plurality of circRNAs of claim 1, wherein the therapeutic polypeptides corresponding to the plurality of circRNAs are different with respect to each other.
  • 33. The composition of claim 32, wherein the plurality of circRNAs target a plurality of strains of a coronavirus.
  • 34. A circRNA vaccine comprising the circRNA of claim 1.
  • 35. A pharmaceutical composition comprising the circRNA of claim 1 and a pharmaceutically acceptable carrier.
  • 36. The circRNA vaccine of claim 34, further comprising a transfection agent.
  • 37. The circRNA vaccine of claim 34, wherein the circRNA is not formulated with a transfection agent.
  • 38. A method of treating or preventing an infection in an individual, comprising administering to the individual an effective amount of the circRNA of claim 1.
  • 39. The method of claim 38, wherein the infection is a coronavirus infection.
  • 40. The method of claim 39, wherein the infection is SARS-CoV-2 infection.
  • 41. A method of treating or preventing a disease or condition in an individual, comprising administering to the individual an effective amount of the circRNA of claim 1.
  • 42. The method of claim 41, wherein the disease or condition is a disease or condition associated with insufficient levels and/or activity of a protein corresponding to the therapeutic protein, or wherein the disease or condition is a hereditary genetic disease associated with one or more mutations in the protein corresponding to the therapeutic protein.
  • 43. The method of claim 41, wherein: (i) the therapeutic polypeptide is TP53 or PTEN, and the disease or condition is cancer; (ii) the therapeutic polypeptide is OTC, and the disease is ornithine transcarbamylase deficiency; (iii) the therapeutic polypeptide is FAH, and the disease is tyrosinemia; (iv) the therapeutic polypeptide is DMD, and the disease is Duchenne and Becker muscular dystrophy, X-linked dilated cardiomyopathy, or familial dilated cardiomyopathy; (v) the therapeutic polypeptide is IDUA, and the disease or condition is Mucopolysaccharidosis type I (MPS I); (vi) the therapeutic polypeptide is COL3A1, and the disease or condition is Ehlers-Danlos syndrome; (vii) the therapeutic polypeptide is AHI1, and the disease or condition is Joubert syndrome; (viii) the therapeutic polypeptide is BMPR2, and the disease or condition is pulmonary arterial hypertension, or pulmonary veno-occlusive disease; (ix) the therapeutic polypeptide is FANCC, and the disease or condition is Fanconi anemia; (x) the therapeutic polypeptide is MYBPC3, and the disease or condition is primary familial hypertrophic cardiomyopathy; or (xi) the therapeutic polypeptide is IL2RG, and the disease or condition is X-linked severe combined immunodeficiency.
  • 44. The method of claim 38, wherein the circRNA is subject to rolling circle translation by a ribosome in the individual.
  • 45. A linear RNA capable of forming the circRNA of claim 1.
  • 46. A nucleic acid construct comprising a nucleic acid sequence encoding the linear RNA of claim 45.
Priority Claims (2)
Number Date Country Kind
PCT/CN2020/110486 Aug 2020 WO international
PCT/CN2021/074998 Feb 2021 WO international
CROSS REFERENCE TO RELATED APPLICATIONS

This application claims priority benefit of International Patent Application No. PCT/CN2021/074998 filed Feb. 3, 2021, and International Patent Application No. PCT/CN2020/110486 filed Aug. 21, 2020, the contents of which are incorporated herein by reference in their entirety.

PCT Information
Filing Document Filing Date Country Kind
PCT/CN2021/113865 8/20/2021 WO