In accordance with 37 C.F.R. 1.52(e)(5), the present specification makes reference to a Sequence Listing (submitted electronically as a .txt file named “U119670085WO00-SEQ-KSB”). The .txt file was generated on Feb. 15, 2022 and is 1,016,464 bytes in size. The Sequence Listing is herein incorporated by reference in its entirety.
Recombinant viruses (e.g., recombinant adeno-associated viruses (AAV) and recombinant lentiviruses) can be used to express therapeutic proteins (i.e., therapeutic cargoes) in patients as a form of genetic therapy. Such therapies seeking to deliver a protein cargo commonly package a recombinant virus genome comprising a coding region of interest along with a 5′ untranslated region, a 3′ untranslated region, a promoter that drives expression of the gene of interest, and, sometimes, a constitutive intron to enhance nuclear export and RNA stability. However, most promoter elements are not able to deliver the therapeutic cargo consistently and reliably in conditions of interest (e.g., a specific tissue or a specific cellular environment).
New approaches relating to the use of recombinant viruses for delivering therapeutic cargo consistently and reliably in conditions of interest would be an advance in the art.
The present disclosure relates to the observation that alternatively-spliced exons may be used in the context of viral vectors (e.g., AAV viral vectors or lentivirus viral vectors) to effectively regulate the expression of a coding region of interest (e.g., a coding region of a transgene that encodes a therapeutic protein). In certain embodiments, the alternatively-spliced exons regulate a coding region of interest in a condition-sensitive manner. As used herein, “condition-sensitive manner” means that the alternatively-spliced exon regulates the expression of a coding region of interest in a manner that is controlled or influenced by one or more conditions, including, but not limited to, environmental conditions, intracellular conditions, extracellular conditions, type of cell (e.g., liver versus kidney cell), gene expression pattern, or disease state. Accordingly, the present disclosure relates to a new approach for regulating expression of a coding region of interest (e.g., a coding region of a transgene that encodes a therapeutic protein) from recombinant viral vectors, optionally in a condition-sensitive manner, by coupling the expression of a coding region of interest with an alternatively-spliced exon. The present disclosure describes a variety of exemplary configurations and methods of coupling the expression of a coding region of interest (or multiple portions of coding regions) with an alternatively-spliced exon, but any suitable arrangement or configuration is contemplated so long as the expression of the coding region of interest (e.g., a coding region of a transgene that encodes a therapeutic protein) is configured to come under regulatory control of an alternatively-spliced exon.
The present disclosure further relates to the following embodiments.
Aspects of the invention relate to a recombinant viral genome capable of delivering (e.g., expressing) a transgene or coding region thereof in a subject, wherein said recombinant viral genome comprises at least one alternatively-spliced exon and a coding region of the transgene. In various aspects, the alternatively-spliced exon undergoes differential splicing in a condition-sensitive manner to result in different spliced transcripts (e.g., mRNA isoforms), whereby the alternatively-spliced exon has been either retained (“spliced-in”) or not retained (“spliced-out”) in the resulting spliced transcripts. For example, in a healthy cell environment, the alternatively-spliced exon may be spliced-out of the resulting transcript; however, in a cancer cell, the alternatively-spliced exon may be spliced-in to the resulting transcript. Depending upon the regulatory sequences present in the alternatively-spliced exon, and whether those regulatory sequences impart positive or negative regulatory control on the expression of the coding region of interest, the alternatively-spliced exon regulates the expression of the coding region of interest by virtue of being either present (spliced-in) or not present (spliced-out) in the resulting mRNA transcript isoform.
In some embodiments, the alternatively-spliced exon may be provided in the form of a transgene comprising the alternatively-spliced exon, one or more introns (or portion(s) thereof), and one or more additional exons (e.g., constitutive exons). Such transgenes comprising an alternatively-spliced exon may be referred to herein as comprising an “alternatively-spliced exon cassette.” The configuration of the alternatively-spliced exon cassettes and transgenes is not limited in any way, and examples of such configurations are provided in the Figures.
In some embodiments, the transgene comprises an alternatively-spliced exon, one or more introns (or portion(s) thereof) and one or more exons. In various embodiments, the one or more exons can be constitutive exons (i.e., those that are retained in all mRNA isoforms resulting from splicing). In certain embodiments, the transgene or the alternatively-spliced exon cassette comprises one intron (or portion thereof). In some embodiments, the intron (or portion thereof) is located 3′ or 5′ to an alternatively-spliced exon. In other embodiments, the transgene or the alternatively-spliced exon cassette comprises two introns (or portion(s) thereof) (e.g., whereby the one or more introns are flanking introns, i.e., introns that are immediately upstream or downstream of the alternatively-spliced exon).
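By way of non-limiting illustration only, the arrangement of exons and flanking introns described above can be sketched in Python as an ordered list of segments from which a spliced transcript is assembled. The names Segment, spliced_transcript, and EXAMPLE_CASSETTE are hypothetical, and the short sequences are toy placeholders that are not taken from the Sequence Listing. The sketch simply removes every intron and either retains or skips the alternatively-spliced exon; it does not model how a cell makes that choice.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One element of a hypothetical alternatively-spliced exon cassette."""
    name: str
    kind: str       # "constitutive_exon", "alternative_exon", or "intron"
    sequence: str   # toy placeholder sequence, not from the Sequence Listing


def spliced_transcript(segments: List[Segment], include_alternative: bool) -> str:
    """Join exon sequences in order, dropping introns.

    The alternatively-spliced exon is kept only when include_alternative
    is True (spliced-in); otherwise it is skipped (spliced-out).
    """
    parts = []
    for seg in segments:
        if seg.kind == "intron":
            continue  # introns are always removed
        if seg.kind == "alternative_exon" and not include_alternative:
            continue  # spliced-out isoform
        parts.append(seg.sequence)
    return "".join(parts)


# Toy cassette: constitutive exon, flanking introns, alternative exon, coding exon.
EXAMPLE_CASSETTE = [
    Segment("constitutive exon", "constitutive_exon", "AAAC"),
    Segment("upstream flanking intron", "intron", "GTAAGTCCAG"),
    Segment("alternative exon", "alternative_exon", "GGATG"),
    Segment("downstream flanking intron", "intron", "GTAAGTTTAG"),
    Segment("coding region", "constitutive_exon", "GCTTAA"),
]

spliced_in = spliced_transcript(EXAMPLE_CASSETTE, include_alternative=True)
spliced_out = spliced_transcript(EXAMPLE_CASSETTE, include_alternative=False)
```

In this toy cassette the two isoforms differ only by the five nucleotides of the alternative exon, which is the property that the embodiments herein exploit for regulation.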
In some embodiments, an alternative exon cassette comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in any one of SEQ ID NOs: 107-778. In some embodiments, an alternative exon cassette comprises a polynucleotide having a nucleic acid sequence as set forth in any one of SEQ ID NOs: 107-778.
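As a non-limiting sketch of how the recited sequence identity thresholds might be evaluated, the following Python example computes percent identity between two sequences that are assumed to have already been aligned to equal length, with gaps written as "-". The function names and the choice of denominator (the number of alignment columns) are assumptions made for illustration; other conventions for computing sequence identity exist, and the present disclosure is not limited to this one.

```python
from typing import List

# Identity thresholds recited in the embodiments above (percent).
THRESHOLDS = (70, 75, 80, 90, 92, 95, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences share a nucleotide.

    Assumes the inputs are pre-aligned and of equal length; a gap ("-")
    never counts as a match. The denominator is the alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def thresholds_met(identity: float) -> List[int]:
    """Return the recited thresholds that a given percent identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Toy example: one mismatch in ten aligned positions.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
met = thresholds_met(identity)
```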
In some embodiments, the alternatively-spliced exon comprises at least one modification, relative to a naturally occurring alternatively-spliced exon. In some embodiments, the alternatively-spliced exon comprises at its 3′ end a heterologous start codon or part of a heterologous start codon. In some embodiments, all native start codons located 5′ to the heterologous start codon are disrupted or deleted.
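For illustration only, the start codon arrangement described above can be checked with a short Python sketch. It asks how many nucleotides of an ATG the 3′ end of the exon contributes (all three, or part of the codon completed by the first nucleotides of the downstream exon) and lists any ATG remaining upstream of that position. The function names are hypothetical, the sequences are toy examples, and ATG is assumed to be the only start codon of interest.

```python
from typing import List


def terminal_start_codon_part(exon: str, downstream: str) -> int:
    """Number of ATG nucleotides supplied by the 3' end of the exon.

    Returns 3 if the exon ends in ATG, 2 or 1 if the exon ends in AT or A
    and the downstream exon begins with the remaining nucleotide(s),
    and 0 if no start codon spans or ends at the exon's 3' end.
    """
    exon, downstream = exon.upper(), downstream.upper()
    for k in (3, 2, 1):
        if exon.endswith("ATG"[:k]) and downstream.startswith("ATG"[k:]):
            return k
    return 0


def upstream_atg_positions(exon: str, downstream: str) -> List[int]:
    """0-based positions of any ATG lying 5' of the terminal start codon."""
    k = terminal_start_codon_part(exon, downstream)
    upstream = exon.upper()[: len(exon) - k]
    return [i for i in range(len(upstream) - 2) if upstream[i:i + 3] == "ATG"]


# Toy exon ending in a complete ATG with no upstream ATG.
full = terminal_start_codon_part("GCCACCATG", "GCT")
clean = upstream_atg_positions("GCCACCATG", "GCT")

# Toy exon ending in "AT", completed by a downstream "G",
# but with an undisrupted ATG at position 0.
partial = terminal_start_codon_part("ATGGCCAT", "GAA")
leftover = upstream_atg_positions("ATGGCCAT", "GAA")
```

An empty list from upstream_atg_positions corresponds to the embodiments in which all native start codons located 5′ to the heterologous start codon have been disrupted or deleted.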
In some embodiments, the alternatively-spliced exon is located 5′ to the coding region of the transgene. In some embodiments, the alternatively-spliced exon cassette comprises two alternatively-spliced exons, each with flanking introns. In some embodiments, the two alternatively-spliced exons are adjacent. In some embodiments, the constitutive exon is located 5′ to the two alternatively-spliced exons.
In some embodiments, each alternatively-spliced exon comprises at its 3′ end a heterologous start codon or part of a heterologous start codon. In some embodiments, all native start codons located 5′ to the heterologous start codon of the 5′-most alternatively-spliced exon are disrupted or deleted.
In some embodiments, only one of the two alternatively-spliced exons is retained in the spliced transcript. In some embodiments, the 5′-most alternatively-spliced exon is retained in the spliced transcript. In some embodiments, the 3′-most alternatively-spliced exon is retained in the spliced transcript.
In some embodiments, the alternatively-spliced exon(s) and flanking intron(s) are located within the coding region of the transgene.
In some embodiments, the alternatively-spliced exon comprises a heterologous, in-frame stop codon. In some embodiments, the heterologous, in-frame stop codon is at least 50 nucleotides upstream of the next 5′ splice junction. In some embodiments, the heterologous stop codon elicits nonsense-mediated decay.
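The spacing recited above can be illustrated, without limitation, by the following Python sketch, which locates the first in-frame stop codon in an exon and counts the nucleotides between the end of that stop codon and the 3′ end of the exon (i.e., the next 5′ splice junction). The function names, the reading frame argument, and the convention of measuring from the last nucleotide of the stop codon are assumptions made for illustration. The sketch checks spacing only; it does not predict whether nonsense-mediated decay occurs in a given cell.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(exon: str, frame: int = 0) -> Optional[int]:
    """0-based index of the first in-frame stop codon, or None if absent.

    `frame` is the offset (0, 1, or 2) of the first complete codon in the exon.
    """
    exon = exon.upper()
    for i in range(frame, len(exon) - 2, 3):
        if exon[i:i + 3] in STOP_CODONS:
            return i
    return None


def nt_from_stop_to_junction(exon: str, frame: int = 0) -> Optional[int]:
    """Nucleotides between the end of the stop codon and the exon's 3' end."""
    pos = first_in_frame_stop(exon, frame)
    if pos is None:
        return None
    return len(exon) - (pos + 3)


def satisfies_spacing(exon: str, frame: int = 0, min_distance: int = 50) -> bool:
    """True if an in-frame stop codon lies at least min_distance nt upstream."""
    distance = nt_from_stop_to_junction(exon, frame)
    return distance is not None and distance >= min_distance


# Toy exons: a stop codon followed by 60 nt, and one followed by 30 nt.
far_exon = "ATGGCC" + "TAA" + "GCA" * 20
near_exon = "ATGGCC" + "TAA" + "GCA" * 10
```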
In various embodiments, the alternatively-spliced exon is spliced-in or retained in the presence of one or more conditions (i.e., in a condition-sensitive manner) to result in an mRNA isoform comprising the alternatively-spliced exon and a coding region of interest. In some embodiments, the one or more conditions comprise the conditions that define one cell type from another. In other embodiments, the one or more conditions comprise the intracellular conditions that define a healthy cell state from a diseased cell state. In some embodiments, the one or more conditions comprise the presence or absence of activated T cells and/or the presence or absence of a state of inflammation. In still other embodiments, the one or more conditions comprise one or more signs or symptoms of a disease state, and/or the presence or absence of one or more disease markers. In still other embodiments, the one or more conditions comprise the expression level and/or activity of the endogenous protein that corresponds to the protein encoded by the coding region of interest in the alternatively-spliced exon cassette of the recombinant virus genome. For example, in one embodiment, if the endogenous protein has a low level of expression and/or activity (e.g., due to a defective naturally occurring gene encoding the endogenous protein), the alternatively-spliced exon may be spliced-in, and the coding region of interest may be upregulated (e.g., if the alternatively-spliced exon comprises a positive regulatory sequence). In another embodiment, if the endogenous protein has a low level of expression and/or activity (e.g., due to a defective naturally occurring gene encoding the endogenous protein), the alternatively-spliced exon may be spliced-in, and the coding region of interest may be downregulated (e.g., if the alternatively-spliced exon comprises a negative regulatory sequence). 
In still other embodiments, if the endogenous protein has a low level of expression and/or activity (e.g., due to a defective naturally occurring gene encoding the endogenous protein), the alternatively-spliced exon may be spliced-out, and the coding region of interest may be upregulated (e.g., if the alternatively-spliced exon comprises a negative regulatory sequence that is removed by the splicing-out of the exon). In another embodiment, if the endogenous protein has a low level of expression and/or activity (e.g., due to a defective naturally occurring gene encoding the endogenous protein), the alternatively-spliced exon may be spliced-out, and the coding region of interest may be downregulated (e.g., if the alternatively-spliced exon comprises a positive regulatory sequence that is removed by the splicing-out of the exon).
In various embodiments, the one or more conditions (e.g., environmental, intracellular, disease state, cell type, expression pattern, etc.) may result in the splicing-in or splicing-out of the alternatively-spliced exon. For example, the one or more conditions may cause the alternatively-spliced exon to be spliced-in, and the coding region of interest may be upregulated (e.g., if the alternatively-spliced exon comprises a positive regulatory sequence). In another embodiment, the one or more conditions may cause the alternatively-spliced exon to be spliced-in, and the coding region of interest may be downregulated (e.g., if the alternatively-spliced exon comprises a negative regulatory sequence). In still other embodiments, the one or more conditions may cause the alternatively-spliced exon to be spliced-out, and the coding region of interest may be upregulated (e.g., if the alternatively-spliced exon comprises a negative regulatory sequence that is removed by the splicing-out of the exon). In another embodiment, the one or more conditions may cause the alternatively-spliced exon to be spliced-out, and the coding region of interest may be downregulated (e.g., if the alternatively-spliced exon comprises a positive regulatory sequence that is removed by the splicing-out of the exon).
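The four combinations described above reduce to a simple rule: the coding region of interest is upregulated when a positive regulatory sequence is present in the transcript or a negative regulatory sequence is absent, and is downregulated otherwise. A minimal Python sketch of that rule follows; the names ExonState, RegulatorySign, and expression_effect are hypothetical and are used only for illustration.

```python
from enum import Enum


class ExonState(Enum):
    SPLICED_IN = "spliced-in"
    SPLICED_OUT = "spliced-out"


class RegulatorySign(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"


def expression_effect(state: ExonState, sign: RegulatorySign) -> str:
    """Direction of regulation of the coding region of interest.

    Upregulated when a positive regulatory sequence is retained or a
    negative regulatory sequence is removed; downregulated otherwise.
    """
    exon_present = state is ExonState.SPLICED_IN
    is_positive = sign is RegulatorySign.POSITIVE
    return "upregulated" if exon_present == is_positive else "downregulated"


# Enumerate all four combinations described in the text.
TRUTH_TABLE = {
    (state.value, sign.value): expression_effect(state, sign)
    for state in ExonState
    for sign in RegulatorySign
}
```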
In some embodiments, the alternatively-spliced exon comprises an alternatively-spliced exon from a gene selected from the group consisting of: ABCC1, AK125149, ASCC2, BAT2D1, BBX, BRD8, BRE, C17orf70, CAMKK2, CBFB, CCAR1, CCDC7CD6, CHTF8, COL4A3BP, COL6A3, CUGBP1, CUGBP2, CXorf45, DENND3, DGUOK, DKFZp762G094, DNAJC7, DNASE1, EIF4A2, EIF4G2, EIF4H, EXOC7, EZH2, FAM120A, FAM136A, FAM36A, FARSB, FBXO38, FGFR1OP2, FIP1L1, FOXRED1, FUBP3, GALT, GATA3, GOLGA2, HIF1A, HMMR, HRB, IKZF1, ILF3, IRAK4, IRF1, KCTD13, LEF1, LUC7L, LYRM1, MALT1 e7, MAP2K7, MAP3K7, MAP4K2, MBNL2, MFF, NAE1, NCSTN, NR4A3, NRF1, NUP98, PARP6, PCM1, PLAUR, PLSCR3, PPIL5, PPP5C, PTPRC-E4, PTPRC-E6, PTS, RABL5, RAPH1, SEC16A, SFRS3, SFRS7, SLMAP, SNRNP70, STAT6, TBC1D1, TIMM8B, TIR8, TRA2A, TROVE2, UGCGL1, VAP-B, VAV1, ZNF384, ZNF496, CAMK2B, PKP2, LGMN, NRAP, VPS39, KSR1, PDLIM3, BIN1, ARFGAP2, KIF13A, and/or PICALM. In some embodiments, the alternatively-spliced exon comprises an alternatively-spliced exon from or derived from an alternatively-spliced exon of a gene selected from the group consisting of CAMK2B, PKP2, LGMN, NRAP, VPS39, KSR1, PDLIM3, BIN1, ARFGAP2, KIF13A, and/or PICALM. In some embodiments, the alternatively-spliced exon is or is derived from an alternatively-spliced exon of CAMK2B. In some embodiments, the alternatively-spliced exon is or is derived from an alternatively-spliced exon of PKP2. In some embodiments, the alternatively-spliced exon is or is derived from an alternatively-spliced exon of LGMN. In some embodiments, the alternatively-spliced exon is or is derived from an alternatively-spliced exon of NRAP. In some embodiments, the alternatively-spliced exon is or is derived from an alternatively-spliced exon of VPS39. In some embodiments, the alternatively-spliced exon is or is derived from an alternatively-spliced exon of KSR1. In some embodiments, the alternatively-spliced exon is or is derived from an alternatively-spliced exon of PDLIM3. 
In some embodiments, the alternatively-spliced exon is or is derived from an alternatively-spliced exon of BIN1. In some embodiments, the alternatively-spliced exon is or is derived from an alternatively-spliced exon of ARFGAP2. In some embodiments, the alternatively-spliced exon is or is derived from an alternatively-spliced exon of KIF13A. In some embodiments, the alternatively-spliced exon is or is derived from an alternatively-spliced exon of PICALM.
In some embodiments, the alternatively-spliced exon is or is derived from exon 11 of BIN1. In some embodiments, the alternatively-spliced exon which is or is derived from exon 11 of BIN1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 37. In some embodiments, the alternatively-spliced exon which is or is derived from exon 11 of BIN1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 37. In some embodiments, the alternatively-spliced exon which is or is derived from exon 11 of BIN1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 38. In some embodiments, the alternatively-spliced exon which is or is derived from exon 11 of BIN1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 38.
In some embodiments, a component (e.g., an alternative exon; an intronic sequence) which is “derived from” a gene (e.g., BIN1, SMN1) may be derived from the gene in that the component is taken from its wild-type or natural context and put into a non-natural context (e.g., inserted into the nucleic acid sequence of a transgene), but may comprise the wild-type or natural nucleic acid sequence of said component. In some embodiments, a component (e.g., an alternative exon; an intronic sequence) which is “derived from” a gene (e.g., BIN1, SMN1) may be derived from the gene in that the component is taken from its wild-type or natural context and put into a non-natural context (e.g., inserted into the nucleic acid sequence of a transgene), and may also be derived from the gene in that the nucleic acid sequence of the component is modified, relative to the wild-type or natural nucleic acid sequence of said component. Modifications to the various components (e.g., introns, exons, etc.) are described elsewhere herein.
In some embodiments, the alternatively-spliced exon comprises an alternatively-spliced exon comprising a polynucleotide sequence as set forth in any one of SEQ ID NOs: 23-44.
In some embodiments, the flanking intron(s) (or portion(s) thereof) is a native flanking intron(s) (or portion(s) thereof) of the alternatively-spliced exon(s). In some embodiments, the flanking intron(s) (or portion(s) thereof) comprises at its 5′ end a 5′ splice donor site. In some embodiments, the flanking intron(s) (or portion(s) thereof) comprises at its 3′ end a 3′ splice acceptor site. In some embodiments, the flanking intron(s) (or portion(s) thereof) comprises no modifications, relative to a naturally occurring intron (or portion thereof). In some embodiments, the flanking intron(s) (or portion(s) thereof) comprises at least one modification, relative to a naturally occurring intron (or portion thereof). In some embodiments, the modification is a substitution or deletion of one or more nucleotides. In some embodiments, the flanking intron(s) (or portion(s) thereof) is a regulated intron (or portion thereof).
In some embodiments, the flanking intron(s) is or is derived from an intron of a gene selected from the group consisting of ABCC1, AK125149, ASCC2, BAT2D1, BBX, BRD8, BRE, C17orf70, CAMKK2, CBFB, CCAR1, CCDC7CD6, CHTF8, COL4A3BP, COL6A3, CUGBP1, CUGBP2, CXorf45, DENND3, DGUOK, DKFZp762G094, DNAJC7, DNASE1, EIF4A2, EIF4G2, EIF4H, EXOC7, EZH2, FAM120A, FAM136A, FAM36A, FARSB, FBXO38, FGFR1OP2, FIP1L1, FOXRED1, FUBP3, GALT, GATA3, GOLGA2, HIF1A, HMMR, HRB, IKZF1, ILF3, IRAK4, IRF1, KCTD13, LEF1, LUC7L, LYRM1, MALT1 e7, MAP2K7, MAP3K7, MAP4K2, MBNL2, MFF, NAE1, NCSTN, NR4A3, NRF1, NUP98, PARP6, PCM1, PLAUR, PLSCR3, PPIL5, PPP5C, PTPRC-E4, PTPRC-E6, PTS, RABL5, RAPH1, SEC16A, SFRS3, SFRS7, SLMAP, SMN1, SNRNP70, STAT6, TBC1D1, TIMM8B, TIR8, TRA2A, TROVE2, UGCGL1, VAP-B, VAV1, ZNF384, ZNF496, CAMK2B, PKP2, LGMN, NRAP, VPS39, KSR1, PDLIM3, BIN1, ARFGAP2, KIF13A, and/or PICALM.
In some embodiments, the flanking intron(s) is or is derived from an intron of SMN1. In some embodiments, the flanking intron(s) which is or is derived from an intron of SMN1 flanks a constitutive exon. In some embodiments, the flanking intron(s) is or is derived from intron 6 and/or intron 7 of SMN1. In some embodiments, the flanking intron which is derived from SMN1 intron 6 is a fragment of (e.g., is truncated relative to) the wild-type or naturally occurring sequence of SMN1 intron 6. In some embodiments, the flanking intron which is derived from SMN1 intron 6 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 103. In some embodiments, the flanking intron which is derived from SMN1 intron 6 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 103. In some embodiments, the flanking intron which is derived from SMN1 intron 7 is a fragment of (e.g., is truncated relative to) the wild-type or naturally occurring sequence of SMN1 intron 7. In some embodiments, the flanking intron which is derived from SMN1 intron 7 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 104. In some embodiments, the flanking intron which is derived from SMN1 intron 7 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 104.
In some embodiments, the flanking intron(s) is or is derived from an intron of BIN1. In some embodiments, the flanking intron(s) which is or is derived from an intron of BIN1 flanks an alternative exon. In some embodiments, the flanking intron(s) is or is derived from intron 10 and/or intron 11 of BIN1. In some embodiments, the flanking intron(s) which is or is derived from intron 10 of BIN1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 15. In some embodiments, the flanking intron(s) which is or is derived from intron 10 of BIN1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 15. In some embodiments, the flanking intron(s) which is or is derived from intron 11 of BIN1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 16. In some embodiments, the flanking intron(s) which is or is derived from intron 11 of BIN1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 16.
In some embodiments, the flanking intron(s) comprises an intron comprising a polynucleotide sequence as set forth in any one of SEQ ID NOs: 1-22, 103, and 104.
In some embodiments, the constitutive exon is an exon which is natively associated with the coding region of the transgene. In some embodiments, the constitutive exon is not an exon which is natively associated with the coding region of the transgene. In some embodiments, the constitutive exon is or is derived from the same gene as the alternatively-spliced exon(s). In some embodiments, the gene is the gene from which the coding region of the transgene is also derived. In some embodiments, the constitutive exon is not from or derived from the same gene as the alternatively-spliced exon(s).
In some embodiments, the coding region of the transgene is or is derived from a coding region of a gene selected from the group consisting of MBNL1, MBNL2, MBNL3, hnRNP A1, hnRNP A2B1, hnRNP C, hnRNP D, hnRNP DL, hnRNP F, hnRNP H, hnRNP K, hnRNP L, hnRNP M, hnRNP R, hnRNP U, FUS, TDP43, PABPN1, ATXN2, TAF15, EWSR1, MATR3, TIA1, FMRP, MTM1, MTMR2, LAMP2, KIF5A, a microdystrophin-encoding gene, C9ORF72, HTT, DNM2, BIN1, RYR1, NEB, ACTA, TPM3, TPM2, TNNT2, CFL2, KBTBD13, KLHL40, KLHL41, LMOD3, MYPN, SEPN1, TTN, SPEG, MYH7, TK2, POLG1, GAA, AGL, PYGM, SLC22A5, OCTN2, ETF, ETFH, PNPLA2, a cytochrome b oxidase-encoding gene, a cytochrome c oxidase-encoding gene, CLCN1, SCN4A, DMPK, CNBP, MYOT, LMNA, CAV3, DNAJB6, DES, TNPO3, HNRPDL, CAPN3, DYSF, an alpha-sarcoglycan-encoding gene, a beta-sarcoglycan-encoding gene, a gamma-sarcoglycan-encoding gene, a delta-sarcoglycan-encoding gene, TCAP, TRIM32, FKRP, FXN, POMT1, FKTN, POMT2, POMGnT1, DAG1, ANO5, PLEC1, TRAPPC11, GMPPB, ISPD, LIMS2, POPDC1, TOR1AIP1, POGLUT2, LAMA2, COL6A1, POMT1, POMT2, DUX4, EMD, PAX7, PMP22, MPZ, MFN2, SMCHD1, SMN, Lamin A/C (LAMN), GJB1, ABCC1, AK125149, ASCC2, BAT2D1, BBX, BRD8, BRE, C17orf70, CAMKK2, CBFB, CCAR1, CCDC7CD6, CHTF8, COL4A3BP, COL6A3, CUGBP1, CUGBP2, CXorf45, DENND3, DGUOK, DKFZp762G094, DNAJC7, DNASE1, EIF4A2, EIF4G2, EIF4H, EXOC7, EZH2, FAM120A, FAM136A, FAM36A, FARSB, FBXO38, FGFR1OP2, FIP1L1, FOXRED1, FUBP3, GALT, GATA3, GOLGA2, HIF1A, HMMR, HRB, IKZF1, ILF3, IRAK4, IRF1, KCTD13, LEF1, LUC7L, LYRM1, MALT1 e7, MAP2K7, MAP3K7, MAP4K2, MBNL2, MFF, NAE1, NCSTN, NR4A3, NRF1, NUP98, PARP6, PCM1, PLAUR, PLSCR3, PPIL5, PPP5C, PTPRC-E4, PTPRC-E6, PTS, RABL5, RAPH1, SEC16A, SFRS3, SFRS7, SLMAP, SNRNP70, STAT6, TBC1D1, TIMM8B, TIR8, TRA2A, TROVE2, UGCGL1, VAP-B, VAV1, ZNF384, ZNF496, CAMK2B, PKP2, LGMN, NRAP, VPS39, KSR1, PDLIM3, BIN1, ARFGAP2, KIF13A, and/or PICALM. In some embodiments, the coding region of the transgene is or is derived from MTM1, CAPN3, or FXN. 
In some embodiments, the coding region of the transgene is or is derived from FXN.
In some embodiments, the coding region of the transgene is or is derived from MTM1. In some embodiments, the coding region of the transgene which is or is derived from MTM1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 1881. In some embodiments, the coding region of the transgene which is or is derived from MTM1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 1881.
In some embodiments, the coding region of the transgene is or is derived from CAPN3. In some embodiments, the coding region of the transgene which is or is derived from CAPN3 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 1882. In some embodiments, the coding region of the transgene which is or is derived from CAPN3 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 1882.
In some embodiments, a recombinant viral genome of the present disclosure further comprises a promoter. In some embodiments, the promoter is a native promoter of the coding region of the transgene. In some embodiments, the promoter is not a native promoter of the coding region of the transgene. In some embodiments, the promoter is constitutive. In some embodiments, the promoter is inducible. In some embodiments, the promoter is a cell-specific promoter. In some embodiments, the promoter is a tissue-specific promoter. In some embodiments, the promoter is selected from the group consisting of an EF1 alpha promoter, beta actin promoter, CMV, muscle creatine kinase promoter, C5-12 muscle promoter, MHCK7, CBh, synapsin, MECP2, enolase, GFAP, Desmin, and CAG promoter.
In some embodiments, the promoter is an MHCK7 promoter. In some embodiments, an MHCK7 promoter comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 1880. In some embodiments, an MHCK7 promoter comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 1880.
In some embodiments, the promoter drives expression of the transgene (e.g., expression of the product encoded by the coding region of interest). In some embodiments, the promoter is a ubiquitous promoter. In some embodiments, a ubiquitous promoter is a promoter selected from the group consisting of: an EF1 alpha promoter, a beta actin promoter, CMV, CBh, and CAG promoter. In some embodiments, the promoter is a tissue-specific promoter, such as a muscle- or heart-biased promoter. In some embodiments, a tissue-specific promoter, such as a muscle- or heart-biased promoter, is a promoter selected from the group consisting of: a muscle creatine kinase promoter, a C5-12 muscle promoter, MHCK7, and Desmin. In some embodiments, the promoter is a neuronal-biased promoter. In some embodiments, a neuronal-biased promoter is a promoter selected from the group consisting of: synapsin and MECP2. In some embodiments, the promoter is an astrocyte-biased promoter. In some embodiments, an astrocyte-biased promoter is a GFAP promoter.
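Solely as an illustrative aid, the promoter groupings recited above can be represented as a lookup table. The Python sketch below copies the promoter names from this paragraph; the dictionary name, the class labels used as keys, and the promoter_class function are hypothetical conveniences and do not limit the promoters that may be used.

```python
from typing import Optional

# Groupings as recited in the preceding paragraph; keys are illustrative labels.
PROMOTER_CLASSES = {
    "ubiquitous": ["EF1 alpha", "beta actin", "CMV", "CBh", "CAG"],
    "muscle- or heart-biased": [
        "muscle creatine kinase",
        "C5-12",
        "MHCK7",
        "Desmin",
    ],
    "neuronal-biased": ["synapsin", "MECP2"],
    "astrocyte-biased": ["GFAP"],
}


def promoter_class(name: str) -> Optional[str]:
    """Return the recited class for a promoter name, or None if not listed.

    Matching ignores case and surrounding whitespace.
    """
    key = name.strip().lower()
    for cls, names in PROMOTER_CLASSES.items():
        if key in (n.lower() for n in names):
            return cls
    return None
```

For example, promoter_class("MHCK7") returns the muscle- or heart-biased label, while a promoter that is not recited in the paragraph above returns None.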
In some embodiments, the coding region of the transgene comprises at least one modification, relative to a coding region of a naturally occurring gene. In some embodiments, the modification is an addition, substitution or deletion of at least one nucleotide. In some embodiments, the coding region of the transgene comprises a deletion of a native start codon, or a portion thereof. In some embodiments, the coding region of the transgene comprises an addition of a non-native stop codon, or a portion thereof. In some embodiments, the transgene comprises one or more recombinant introns (e.g., a 3′ UTR intron). In some embodiments, the one or more recombinant introns (e.g., a 3′ UTR intron) elicit nonsense-mediated decay (NMD) when the transcript is translated.
In some embodiments, the naturally occurring gene is a gene selected from the group consisting of MBNL1, MBNL2, MBNL3, hnRNP A1, hnRNP A2B1, hnRNP C, hnRNP D, hnRNP DL, hnRNP F, hnRNP H, hnRNP K, hnRNP L, hnRNP M, hnRNP R, hnRNP U, FUS, TDP43, PABPN1, ATXN2, TAF15, EWSR1, MATR3, TIA1, FMRP, MTM1, MTMR2, LAMP2, KIF5A, a microdystrophin-encoding gene, C9ORF72, HTT, DNM2, BIN1, RYR1, NEB, ACTA, TPM3, TPM2, TNNT2, CFL2, KBTBD13, KLHL40, KLHL41, LMOD3, MYPN, SEPN1, TTN, SPEG, MYH7, TK2, POLG1, GAA, AGL, PYGM, SLC22A5, OCTN2, ETF, ETFH, PNPLA2, a cytochrome b oxidase-encoding gene, a cytochrome c oxidase-encoding gene, CLCN1, SCN4A, DMPK, CNBP, MYOT, LMNA, CAV3, DNAJB6, DES, TNPO3, HNRPDL, CAPN3, DYSF, an alpha-sarcoglycan-encoding gene, a beta-sarcoglycan-encoding gene, a gamma-sarcoglycan-encoding gene, a delta-sarcoglycan-encoding gene, TCAP, TRIM32, FKRP, FXN, POMT1, FKTN, POMT2, POMGnT1, DAG1, ANO5, PLEC1, TRAPPC11, GMPPB, ISPD, LIMS2, POPDC1, TOR1AIP1, POGLUT2, LAMA2, COL6A1, POMT1, POMT2, DUX4, EMD, PAX7, PMP22, MPZ, MFN2, SMCHD1, SMN, Lamin A/C (LAMN), and/or GJB1. In some embodiments, the naturally occurring gene is MTM1, CAPN3, or FXN. In some embodiments, the naturally occurring gene is MTM1. In some embodiments, the naturally occurring gene is CAPN3. In some embodiments, the naturally occurring gene is FXN.
In some embodiments, the coding region of the transgene comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in either SEQ ID NO: 1881 or SEQ ID NO: 1882. In some embodiments, the coding region of the transgene comprises a polynucleotide having a nucleic acid sequence as set forth in either SEQ ID NO: 1881 or SEQ ID NO: 1882.
In some embodiments, the recombinant viral genome is a recombinant genome from an adeno-associated virus (rAAV), lentivirus, retrovirus, or foamyvirus. In some embodiments, the recombinant viral genome is from an AAV. In some embodiments, the transgene is flanked by AAV inverted terminal repeat (ITR) sequences. In some embodiments, the ITR sequences comprise AAV1, AAV2, AAV5, AAV7, AAV8, or AAV9 ITR sequences. In some embodiments, the recombinant viral genome is from a lentivirus. In some embodiments, the alternatively-spliced exon cassette is located on the minus strand of the lentivirus genome.
In some embodiments, a recombinant viral genome of the present disclosure further comprises a 3′ untranslated region (UTR) that is endogenous or exogenous to the transgene. In some embodiments, the exogenous 3′ UTR is the 3′ UTR from bovine growth hormone, SV40, EBV, or Myc.
In some embodiments, the exogenous 3′ UTR is SV40. In some embodiments, the SV40 3′ UTR comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 1883. In some embodiments, the SV40 3′ UTR comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 1883.
In some embodiments, the exogenous 3′ UTR comprises a polyadenylation (pA) signal. In some embodiments, the pA signal is an SV40 pA signal.
Aspects of the invention contemplate a viral particle comprising a viral genome according to any embodiment of the present disclosure. In some embodiments, the viral particle is an rAAV particle. In some embodiments, the rAAV particle comprises an AAV serotype selected from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10. In some embodiments, the rAAV particle comprises AAV serotype 9. In some embodiments, the rAAV particle comprises an AAV derivative or pseudotype selected from the group consisting of an AAV2-AAV3 hybrid, AAVrh.10, AAVhu.14, AAV3a/3b, AAVrh32.33, AAV-HSC15, AAV-HSC17, AAVhu.37, AAVrh.8, CHt-P6, AAV2.5, AAV6.2, AAV2i8, AAV-HSC15/17, AAVM41, AAV9.45, AAV6(Y445F/Y731F), AAV2.5T, AAV-HAE1/2, AAV clone 32/83, AAVShH10, AAV2 (Y→F), AAV8 (Y733F), AAV2.15, AAV2.4, and AAVr3.45.
In some embodiments, the viral particle further comprises at least one helper plasmid. In some embodiments, the helper plasmid comprises a rep gene and a cap gene. In some embodiments, the rep gene encodes Rep78, Rep68, Rep52, or Rep40. In some embodiments, the cap gene encodes a VP1, VP2, and/or VP3 region of the viral capsid protein. In some embodiments, the viral particle comprises two helper plasmids. In some embodiments, the first helper plasmid comprises a rep gene and a cap gene and the second helper plasmid comprises an E1a gene, an E1b gene, an E4 gene, an E2a gene, and a VA gene.
In some embodiments, the viral particle is a recombinant lentivirus particle. In some embodiments, the lentivirus is a human immunodeficiency virus (HIV1 or HIV2), a feline immunodeficiency virus (FIV), a bovine immunodeficiency virus (BIV), a caprine arthritis encephalitis virus, an equine infectious anemia virus, a jembrana disease virus, a puma lentivirus, a simian immunodeficiency virus, or a visna-maedi virus. In some embodiments, the viral particle further comprises a viral envelope.
Aspects of the invention relate to a method of treating a disease or condition in a subject comprising administering a recombinant viral genome or a viral particle according to any embodiment of the present disclosure to the subject. In some embodiments, the subject is a mammal. In some embodiments, the mammal is a human. In some embodiments, the recombinant viral genome or viral particle is administered to the subject at least one time. In some embodiments, the recombinant viral genome or viral particle is administered to the subject 2, 3, 4, 5, 6, 7, 8, 9, or 10 times. In some embodiments, the recombinant viral genome or viral particle is administered to the subject parenterally, subcutaneously, intraocularly, intravitreally, subretinally, intravenously (IV), intracerebro-ventricularly, intramuscularly, intrathecally (IT), intracisternally, intraperitoneally, enterally, via inhalation, topically, or by direct injection to one or more cells, tissues, or organs. In some embodiments, the recombinant viral genome or viral particle is administered to the subject by intravenous injection, intramuscular injection, intrathecal injection, or intravitreal injection.
In some embodiments, the disease or condition is a disease or condition selected from the group consisting of Dentatorubral-pallido-luysian atrophy (DRPLA), myotonic dystrophy type 1 (DM1), myotonic dystrophy type 2 (DM2), Fragile X syndrome of mental retardation (FMR1), Fragile X tremor ataxia syndrome (FXTAS), FRAXE mental retardation (FMR2), Friedreich's ataxia (FRDA), Huntington disease (HD), Huntington disease-like 2 (HDL2), Oculopharyngeal muscular dystrophy (OPMD), Myoclonic epilepsy type 1, Alzheimer's disease, ALS/FTD, spinocerebellar ataxia type 1 (SCA1), spinocerebellar ataxia type 2 (SCA2), spinocerebellar ataxia type 3 (SCA3), spinocerebellar ataxia type 6 (SCA6), spinocerebellar ataxia type 7 (SCA7), spinocerebellar ataxia type 8 (SCA8), spinocerebellar ataxia type 10 (SCA10), spinocerebellar ataxia type 12 (SCA12), spinocerebellar ataxia type 17 (SCA17), Syndromic/non-syndromic X-linked mental retardation, Emery-Dreifuss muscular dystrophy type 2, familial partial lipodystrophy, limb girdle muscular dystrophy type 1B, dilated cardiomyopathy, familial partial lipodystrophy, Charcot-Marie-Tooth disorder type 2B1, mandibuloacral dysplasia, childhood progeria syndrome (Hutchinson-Gilford syndrome), Werner syndrome, Dilated cardiomyopathy (DCM), Hypertrophic cardiomyopathy (HCM), Restrictive cardiomyopathy (RCM), Left Ventricular Non-compaction (LVNC), Arrhythmogenic Right Ventricular Dysplasia (ARVD), takotsubo cardiomyopathy, Duchenne muscular dystrophy, Becker muscular dystrophy, Limb-girdle muscular dystrophy, Facioscapulohumeral muscular dystrophy, Congenital muscular dystrophy, Oculopharyngeal muscular dystrophy, Distal muscular dystrophy, Emery-Dreifuss muscular dystrophy, dementia, Parkinson's disease (PD), a PD-related disorder, Prion disease, a motor neuron disease (MND), Progressive bulbar palsy (PBP), Progressive muscular atrophy (PMA), Primary lateral sclerosis (PLS), Spinal muscular atrophy (SMA), a bladder cancer, a breast cancer, a colorectal cancer, a kidney cancer, a lung cancer, a lymphoma, a melanoma, an oral cancer, an ovarian cancer, an oropharyngeal cancer, a pancreatic cancer, a prostate cancer, a thyroid cancer, a uterine cancer, Down syndrome, Prader-Willi Syndrome (PWS), Bloom Syndrome, Cockayne Syndrome Type I-216400, Cockayne Syndrome Type III, Cockayne Syndrome Type I, Hutchinson-Gilford Progeria Syndrome, Mandibuloacral Dysplasia with Type A Lipodystrophy, Progeria, Adult Onset Progeroid Syndrome, Neonatal Rothmund-Thomson Syndrome, Seip Syndrome, Werner Syndrome, Replication Focus-Forming Activity 1, myotubular myopathy, Danon Disease, and/or centronuclear myopathy.
Aspects of the invention relate to a method of regulating transgene expression (e.g., comprising a coding region of interest which encodes a protein of interest, such as a therapeutic protein) using a viral vector comprising a recombinant viral genome as described herein, wherein the transgene, or coding region of the transgene, is under the regulatory control of an alternatively-spliced exon. In some embodiments, the method comprises inserting into the recombinant viral genome at least one alternatively-spliced exon and at least one coding region of interest (e.g., which encodes a therapeutic protein), wherein the expression of the at least one coding region of interest is regulated by the alternatively-spliced exon. In turn, how the regulation of the coding region of interest is imparted depends on (a) the presence or absence of positive or negative regulatory control sequences in the alternatively-spliced exon, and (b) whether the alternatively-spliced exon is spliced-in (i.e., retained) or spliced-out (i.e., removed) from the final mRNA transcript isoform. The recombinant viral genome may be configured with one or more additional introns, exons, and/or regulatory sequences (e.g., promoters, enhancers, and the like that control transcription from the recombinant viral genome). In addition, the alternatively-spliced exon may be comprised on a cassette (which may be referred to as an alternatively-spliced exon cassette), comprising the alternatively-spliced exon(s) and one or more introns, which may be inserted into the recombinant viral genome in a manner that couples it to the coding region of interest, such that the expression of the coding region of interest comes under regulatory control of the alternatively-spliced exon of the cassette.
In other embodiments, the transgene comprises an alternatively-spliced exon, optionally one or more introns (or portion(s) thereof), optionally one or more constitutive exons, and a coding region of interest.
Aspects of the invention relate to a method of regulating transgene (e.g., comprising a coding region of interest which encodes a protein of interest, such as a therapeutic protein) expression using a viral vector comprising a recombinant viral genome as described herein. In some embodiments, the method comprises: (a) inserting into the recombinant viral genome at least one transgene, wherein the transgene comprises a constitutive exon, at least one alternatively-spliced exon, at least one flanking intron (or portion thereof), and a coding region of a transgene; (b) introducing a heterologous start codon or part of a heterologous start codon at the 3′ end of the alternatively-spliced exon; (c) disrupting or deleting all native start codons located 5′ to the heterologous start codon; and (d) deleting or disrupting one or more native start codons, or a portion(s) thereof, from the coding region of the transgene. In some embodiments, the method comprises: (a) inserting into the recombinant viral genome at least one transgene, wherein the transgene comprises a constitutive exon, at least one alternatively-spliced exon, at least one flanking intron (or portion thereof), and a coding region of a transgene; (b) introducing a heterologous start codon or part of a heterologous start codon at the 3′ end of the alternatively-spliced exon; (c) disrupting or deleting all native start codons located 5′ to the heterologous start codon; and (d) adding a heterologous 3′ UTR, or a portion thereof, to the coding region of the transgene. In some embodiments, translation of the heterologous 3′ UTR elicits nonsense mediated decay. 
In some embodiments, the method comprises: (a) inserting into the recombinant viral genome at least one alternatively-spliced exon cassette, wherein the alternatively-spliced exon cassette comprises a constitutive exon, at least one alternatively-spliced exon, at least one flanking intron (or portion thereof), and a coding region of a transgene; (b) introducing a heterologous start codon or part of a heterologous start codon at the 3′ end of the alternatively-spliced exon; (c) disrupting or deleting all native start codons located 5′ to the heterologous start codon; (d) deleting or disrupting one or more native start codons, or a portion(s) thereof, from the coding region of the transgene; and (e) adding a heterologous 3′ UTR, or a portion thereof, to the coding region of the transgene. In some embodiments, translation of the heterologous 3′ UTR elicits nonsense mediated decay. In some embodiments, the constitutive exon, alternatively-spliced exon, and flanking intron (or portion thereof) are each located 5′ to the coding region of the transgene.
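By way of non-limiting illustration, the effect of placing a heterologous start codon at the 3′ end of the alternatively-spliced exon, while removing native start codons upstream of it and from the coding region, can be sketched on toy sequences as follows. All sequences and function names below are hypothetical and chosen only so that the logic is easy to follow; the sketch treats the first ATG in the spliced transcript as the translation start, which is a simplifying assumption.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def splice(constitutive_exon, alt_exon, coding_region, include_alt):
    """Join exonic sequences into a mature transcript (introns already removed)."""
    return constitutive_exon + (alt_exon if include_alt else "") + coding_region


def open_reading_frame(mrna):
    """Return the ORF starting at the first ATG, or None if there is no
    ATG or no in-frame stop codon downstream of it."""
    start = mrna.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return mrna[start:i + 3]
    return None


# Hypothetical toy sequences (not from the Sequence Listing).
constitutive_exon = "GGCTCCAGC"      # contains no ATG (native starts disrupted)
alt_exon = "CCTGCAGCCATG"            # heterologous ATG at its 3' end
coding_region = "GCTTCTGGAAAATAA"    # native ATG removed from its 5' end

spliced_in = splice(constitutive_exon, alt_exon, coding_region, include_alt=True)
spliced_out = splice(constitutive_exon, alt_exon, coding_region, include_alt=False)

print(open_reading_frame(spliced_in))   # ORF begins at the heterologous ATG
print(open_reading_frame(spliced_out))  # None: no start codon is available
```

In this toy model the coding region is read only from the isoform that retains the alternatively-spliced exon, because that isoform is the only one that supplies a start codon.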
Aspects of the invention relate to a method of regulating transgene (e.g., comprising a coding region of interest which encodes a protein of interest, such as a therapeutic protein) expression using a viral vector comprising a recombinant viral genome as described herein. In some embodiments, the method comprises: (a) inserting into the recombinant viral genome at least one transgene, wherein the transgene comprises an alternatively-spliced exon and at least one flanking intron (or portion thereof) within the coding region of the transgene; and (b) introducing into the alternatively-spliced exon a heterologous, in-frame stop codon upstream of the next 5′ splice junction. In some embodiments, the heterologous, in-frame stop codon elicits nonsense-mediated decay. In certain embodiments, the in-frame stop codon is inserted at least 100 nucleotides, at least 95 nucleotides, at least 90 nucleotides, at least 85 nucleotides, at least 80 nucleotides, at least 75 nucleotides, at least 70 nucleotides, at least 65 nucleotides, at least 60 nucleotides, at least 55 nucleotides, at least 50 nucleotides, at least 45 nucleotides, at least 40 nucleotides, at least 35 nucleotides, at least 30 nucleotides, at least 25 nucleotides, at least 20 nucleotides, at least 15 nucleotides, at least 10 nucleotides, or at least 5 nucleotides, or between 1 and 5 nucleotides upstream of the next 5′ splice junction.
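By way of non-limiting illustration, the position of an in-frame stop codon relative to the next 5′ splice junction could be checked as sketched below. The function names, the toy exon, and the convention of measuring the distance from the last nucleotide of the stop codon to the 3′ end of the exon are assumptions made for this example; the `frame_offset` parameter is a hypothetical way of supplying the reading frame in which the exon is entered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codon_distance(exon, frame_offset=0):
    """Nucleotides between the first in-frame stop codon and the exon's
    3' end (the next 5' splice junction), or None if there is no
    in-frame stop codon.

    Assumption: distance is counted from the nucleotide after the stop
    codon to the end of the exon.
    """
    for i in range(frame_offset, len(exon) - 2, 3):
        if exon[i:i + 3] in STOP_CODONS:
            return len(exon) - (i + 3)
    return None


def meets_minimum(exon, min_nt, frame_offset=0):
    """True if an in-frame stop codon lies at least min_nt upstream of the junction."""
    distance = stop_codon_distance(exon, frame_offset)
    return distance is not None and distance >= min_nt


# Hypothetical exon: one sense codon, a stop codon, then 60 nt of filler.
toy_exon = "GCT" + "TAA" + "GC" * 30

print(stop_codon_distance(toy_exon))   # 60
print(meets_minimum(toy_exon, 50))     # True
print(meets_minimum(toy_exon, 100))    # False
```

A check of this kind only reports where the stop codon sits; whether a given placement elicits nonsense-mediated decay in a particular cell type would have to be determined experimentally.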
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises a constitutive exon; (ii) a nucleotide sequence comprising a first intronic sequence having a 5′ to 3′ orientation, wherein the first intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (iii) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises an alternatively-spliced exon comprising at its 3′ end a heterologous ATG start codon; (iv) a nucleotide sequence comprising a second intronic sequence having a 5′ to 3′ orientation, wherein the second intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (v) a nucleotide sequence comprising a coding region of the transgene having a 5′ to 3′ orientation, wherein the coding region of the transgene comprises at its 5′ end a modification comprising the removal of a native ATG start codon. In some embodiments, all native ATG start codons located upstream of the heterologous ATG start codon are mutated or deleted.
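By way of non-limiting illustration, a 5′ to 3′ arrangement of this kind can be represented as an ordered list of elements from which the two possible mature transcript isoforms are derived. The class name, field names, and labels below are hypothetical, and the sketch assumes simple exon inclusion or skipping with complete intron removal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str                  # "exon", "intron", or "cds"
    label: str
    alternative: bool = False  # True for an alternatively-spliced exon


# Elements (i) through (v) of the arrangement described above, in 5' to 3' order.
TRANSGENE = [
    Element("exon", "constitutive exon"),
    Element("intron", "first intron"),
    Element("exon", "alternatively-spliced exon (heterologous ATG at 3' end)",
            alternative=True),
    Element("intron", "second intron"),
    Element("cds", "coding region (native ATG removed)"),
]


def mature_transcript(elements, include_alternative):
    """Labels of the elements retained after splicing, in 5' to 3' order."""
    return [
        e.label
        for e in elements
        if e.kind != "intron" and (include_alternative or not e.alternative)
    ]


for included in (True, False):
    print(included, mature_transcript(TRANSGENE, include_alternative=included))
```

The same representation could be reused for the other arrangements described herein by reordering the list or by marking additional exons as alternative.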
Other aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a first portion of a coding region of the transgene having a 5′ to 3′ orientation; (ii) a nucleotide sequence comprising a first intronic sequence having a 5′ to 3′ orientation, wherein the first intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (iii) a nucleotide sequence comprising an exonic sequence having a 5′ to 3′ orientation, wherein the exonic sequence comprises an alternatively-spliced exon comprising at its 3′ end a heterologous stop codon; (iv) a nucleotide sequence comprising a second intronic sequence having a 5′ to 3′ orientation, wherein the second intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (v) a nucleotide sequence comprising a second portion of a coding region of the transgene having a 5′ to 3′ orientation.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a coding region of the transgene having a 5′ to 3′ orientation; (ii) a nucleotide sequence comprising a first intronic sequence having a 5′ to 3′ orientation, wherein the first intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (iii) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises an alternatively-spliced exon comprising a positive or negative cis-acting element; (iv) a nucleotide sequence comprising a second intronic sequence having a 5′ to 3′ orientation, wherein the second intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (v) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises a constitutive exon.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises a constitutive exon; (ii) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises an alternatively-spliced exon comprising at its 3′ end a heterologous ATG start codon; and (iii) a nucleotide sequence comprising a coding region of the transgene having a 5′ to 3′ orientation, wherein the coding region of the transgene comprises at its 5′ end a modification comprising the removal of a native ATG start codon. In some embodiments, all native ATG start codons located upstream of the heterologous ATG start codon are mutated or deleted.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a first portion of a coding region of the transgene having a 5′ to 3′ orientation; (ii) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises an alternatively-spliced exon comprising at its 3′ end a heterologous stop codon; (iii) a nucleotide sequence comprising a second portion of a coding region of the transgene having a 5′ to 3′ orientation; (iv) a nucleotide sequence comprising an intronic sequence having a 5′ to 3′ orientation, wherein the intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (v) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises a constitutive exon.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a coding region of the transgene having a 5′ to 3′ orientation; (ii) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises an alternatively-spliced exon comprising a positive or negative cis-acting element; and (iii) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises a constitutive exon.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises a constitutive exon; (ii) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises an alternatively-spliced exon comprising at its 3′ end a heterologous ATG start codon; (iii) a nucleotide sequence comprising an intronic sequence having a 5′ to 3′ orientation, wherein the intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (iv) a nucleotide sequence comprising a coding region of the transgene having a 5′ to 3′ orientation, wherein the coding region of the transgene comprises at its 5′ end a modification comprising the removal of a native ATG start codon. In some embodiments, all native ATG start codons located upstream of the heterologous ATG start codon are mutated or deleted.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a first portion of a coding region of the transgene having a 5′ to 3′ orientation; (ii) a nucleotide sequence comprising an exonic sequence having a 5′ to 3′ orientation, wherein the exonic sequence comprises an alternatively-spliced exon comprising at its 3′ end a heterologous stop codon; (iii) a nucleotide sequence comprising an intronic sequence having a 5′ to 3′ orientation, wherein the intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (iv) a nucleotide sequence comprising a second portion of a coding region of the transgene having a 5′ to 3′ orientation.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a coding region of the transgene having a 5′ to 3′ orientation; (ii) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises an alternatively-spliced exon comprising a positive or negative cis-acting element; (iii) a nucleotide sequence comprising an intronic sequence having a 5′ to 3′ orientation, wherein the intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (iv) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises a constitutive exon.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises a constitutive exon; (ii) a nucleotide sequence comprising an intronic sequence having a 5′ to 3′ orientation, wherein the intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (iii) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises an alternatively-spliced exon comprising at its 3′ end a heterologous ATG start codon; and (iv) a nucleotide sequence comprising a coding region of the transgene having a 5′ to 3′ orientation, wherein the coding region of the transgene comprises at its 5′ end a modification comprising the removal of a native ATG start codon. In some embodiments, all native ATG start codons located upstream of the heterologous ATG start codon are mutated or deleted.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a first portion of a coding region of the transgene having a 5′ to 3′ orientation; (ii) a nucleotide sequence comprising a first intronic sequence having a 5′ to 3′ orientation, wherein the first intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (iii) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises an alternatively-spliced exon comprising at its 3′ end a heterologous stop codon; (iv) a nucleotide sequence comprising a second portion of a coding region of the transgene having a 5′ to 3′ orientation; (v) a nucleotide sequence comprising a second intronic sequence having a 5′ to 3′ orientation, wherein the second intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (vi) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises a constitutive exon.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a coding region of the transgene having a 5′ to 3′ orientation; (ii) a nucleotide sequence comprising an intronic sequence having a 5′ to 3′ orientation, wherein the intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (iii) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises an alternatively-spliced exon comprising a positive or negative cis-acting element; and (iv) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises a constitutive exon.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises a constitutive exon; (ii) a nucleotide sequence comprising a first intronic sequence having a 5′ to 3′ orientation, wherein the first intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (iii) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises an alternatively-spliced exon comprising at its 3′ end a heterologous ATG start codon; (iv) a nucleotide sequence comprising a second intronic sequence having a 5′ to 3′ orientation, wherein the second intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (v) a nucleotide sequence comprising a third exonic sequence having a 5′ to 3′ orientation, wherein the third exonic sequence comprises an alternatively-spliced exon; (vi) a nucleotide sequence comprising a third intronic sequence having a 5′ to 3′ orientation, wherein the third intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (vii) a nucleotide sequence comprising a coding region of the transgene having a 5′ to 3′ orientation, wherein the coding region of the transgene comprises at its 5′ end a modification comprising the removal of a native ATG start codon. In some embodiments, all native ATG start codons located upstream of the heterologous ATG start codon are mutated or deleted.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a first portion of a coding region of the transgene having a 5′ to 3′ orientation; (ii) a nucleotide sequence comprising a first intronic sequence having a 5′ to 3′ orientation, wherein the first intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (iii) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises an alternatively-spliced exon comprising at its 3′ end a heterologous stop codon; (iv) a nucleotide sequence comprising a second intronic sequence having a 5′ to 3′ orientation, wherein the second intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (v) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises an alternatively-spliced exon; (vi) a nucleotide sequence comprising a third intronic sequence having a 5′ to 3′ orientation, wherein the third intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (vii) a nucleotide sequence comprising a second portion of a coding region of the transgene having a 5′ to 3′ orientation.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a coding region of the transgene having a 5′ to 3′ orientation; (ii) a nucleotide sequence comprising a first intronic sequence having a 5′ to 3′ orientation, wherein the first intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (iii) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises a first alternatively-spliced exon comprising a positive or negative cis-acting element; (iv) a nucleotide sequence comprising a second intronic sequence having a 5′ to 3′ orientation, wherein the second intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (v) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises a second alternatively-spliced exon; (vi) a nucleotide sequence comprising a third intronic sequence having a 5′ to 3′ orientation, wherein the third intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (vii) a nucleotide sequence comprising a third exonic sequence having a 5′ to 3′ orientation, wherein the third exonic sequence comprises a constitutive exon.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises a constitutive exon; (ii) a nucleotide sequence comprising a first intronic sequence having a 5′ to 3′ orientation, wherein the first intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (iii) a nucleotide sequence comprising a coding region of the transgene having a 5′ to 3′ orientation; (iv) a nucleotide sequence comprising a second intronic sequence having a 5′ to 3′ orientation, wherein the second intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (v) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises an alternatively-spliced exon.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a first portion of a coding region of the transgene having a 5′ to 3′ orientation; (ii) a nucleotide sequence comprising a first intronic sequence having a 5′ to 3′ orientation, wherein the first intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (iii) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises an alternatively-spliced exon comprising at its 3′ end a heterologous stop codon; (iv) a nucleotide sequence comprising a second intronic sequence having a 5′ to 3′ orientation, wherein the second intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (v) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises a constitutive exon; (vi) a nucleotide sequence comprising a third intronic sequence having a 5′ to 3′ orientation, wherein the third intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (vii) a nucleotide sequence comprising a second portion of a coding region of the transgene having a 5′ to 3′ orientation.
Aspects of the invention relate to a transgene comprising, in the 5′ to 3′ direction: (i) a nucleotide sequence comprising a coding region of the transgene having a 5′ to 3′ orientation; (ii) a nucleotide sequence comprising a first intronic sequence having a 5′ to 3′ orientation, wherein the first intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; (iii) a nucleotide sequence comprising a first exonic sequence having a 5′ to 3′ orientation, wherein the first exonic sequence comprises an alternatively-spliced exon comprising a positive or negative cis-acting element; (iv) a nucleotide sequence comprising a second intronic sequence having a 5′ to 3′ orientation, wherein the second intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site; and (v) a nucleotide sequence comprising a second exonic sequence having a 5′ to 3′ orientation, wherein the second exonic sequence comprises a constitutive exon.
Aspects of the disclosure relate to a transgene comprising: (i) a constitutive exon and one or more intronic sequences, each from a first gene; (ii) an alternatively-spliced exon cassette; and (iii) a coding region of interest from a third gene. In some embodiments, the alternatively-spliced exon cassette comprises: (a) an alternatively-spliced exon, and (b) flanking intronic sequences. In some embodiments, each of (a) and (b) is from a second gene. In some embodiments, the alternatively-spliced exon comprises an ATG start codon at its 3′ end.
In some embodiments, the first and second gene are the same gene; the first and third gene are the same gene; or all of the first, second, and third genes are the same gene.
In some embodiments, the first gene is survival motor neuron 1 (SMN1).
In some embodiments, the constitutive exon comprises exon 6 of SMN1, or a portion thereof. In some embodiments, the constitutive exon comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 102. In some embodiments, the constitutive exon comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 102.
In some embodiments, the one or more intronic sequences of (i) are or are derived from intron 6 and/or intron 7 of SMN1. In some embodiments, the one or more intronic sequences of (i) comprise(s) a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 103 and/or SEQ ID NO: 104. In some embodiments, the one or more intronic sequences of (i) comprise(s) a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 103 and/or SEQ ID NO: 104.
In some embodiments, the second gene is a gene selected from the group consisting of: CAMK2B, PKP2, LGMN, NRAP, VPS39, KSR1, PDLIM3, BIN1, ARFGAP2, KIF13A, and/or PICALM. In some embodiments, the second gene is bridging integrator 1 (BIN1).
In some embodiments, the alternatively-spliced exon comprises exon 11 of BIN1. In some embodiments, the alternatively-spliced exon comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in either SEQ ID NO: 37 or SEQ ID NO: 38. In some embodiments, the alternatively-spliced exon comprises a polynucleotide having a nucleic acid sequence as set forth in either SEQ ID NO: 37 or SEQ ID NO: 38.
In some embodiments, the flanking intronic sequences of (ii) are or are derived from intron 10 and/or intron 11 of BIN1. In some embodiments, the flanking intronic sequences of (ii) each comprise a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in either SEQ ID NO: 15 or SEQ ID NO: 16. In some embodiments, the flanking intronic sequences of (ii) each comprise a polynucleotide having a nucleic acid sequence as set forth in either SEQ ID NO: 15 or SEQ ID NO: 16.
In some embodiments, the alternatively-spliced exon cassette comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in any one of SEQ ID NOs: 107-778. In some embodiments, the alternatively-spliced exon cassette comprises a polynucleotide having a nucleic acid sequence as set forth in any one of SEQ ID NOs: 107-778.
In some embodiments, the third gene is myotubularin 1 (MTM1) or calpain 3 (CAPN3). In some embodiments, the coding region of interest comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in either SEQ ID NO: 1881 or SEQ ID NO: 1882. In some embodiments, the coding region of interest comprises a polynucleotide having a nucleic acid sequence as set forth in either SEQ ID NO: 1881 or SEQ ID NO: 1882.
In some embodiments, if the wild-type alternatively-spliced exon does not comprise an ATG start codon, the alternatively-spliced exon comprises 1-3 nucleic acid substitutions, relative to the wild-type alternatively-spliced exon, to form the ATG start codon within the alternatively-spliced exon. In some embodiments, the ATG start codon is formed in the alternatively-spliced exon by 1 nucleic acid substitution. In some embodiments, the ATG start codon is formed in the alternatively-spliced exon by 2 nucleic acid substitutions. In some embodiments, the ATG start codon is formed in the alternatively-spliced exon by 3 nucleic acid substitutions.
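As a minimal sketch of the substitution count described above, the following assumes that the ATG start codon is to occupy the last three nucleotides of the exon (or a position a fixed number of nucleotides upstream of the 3′ end). The function name `substitutions_to_form_atg` and the toy exon sequences are hypothetical and are used for illustration only.

```python
def substitutions_to_form_atg(exon: str, offset_from_3p_end: int = 0) -> list[tuple[int, str, str]]:
    """List the (index, old_base, new_base) substitutions needed to form an ATG.

    The ATG is placed so that its G lies `offset_from_3p_end` nucleotides
    upstream of the last base of the exon (0 = the ATG ends the exon).
    Indices are 0-based positions within the exon.
    """
    exon = exon.upper()
    start = len(exon) - offset_from_3p_end - 3
    if start < 0:
        raise ValueError("exon is too short for the requested ATG position")
    changes = []
    for i, target in enumerate("ATG"):
        current = exon[start + i]
        if current != target:
            changes.append((start + i, current, target))
    return changes


if __name__ == "__main__":
    # Toy exons (hypothetical): 0, 1, and 3 substitutions respectively.
    for exon in ("TTATG", "GGCCAAG", "CCCCCC"):
        print(exon, substitutions_to_form_atg(exon))
```

Because the target codon has three positions, the number of substitutions returned is always between 0 and 3, consistent with the 1-3 substitutions recited for exons that lack a suitably placed ATG.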
In some embodiments, the alternatively-spliced exon is retained in the spliced transcript. In some embodiments, all native start codons located 5′ to the ATG start codon located within the alternatively-spliced exon are disrupted or deleted.
In some embodiments, the alternatively-spliced exon cassette is located 5′, relative to the coding region of interest. In some embodiments, the constitutive exon is located 5′, relative to the alternatively-spliced exon cassette. In some embodiments, the one or more intronic sequences of (i) flank the alternatively-spliced exon cassette.
In some embodiments, the alternatively-spliced exon comprises a heterologous, in-frame stop codon. In some embodiments, the heterologous, in-frame stop codon is at least 50 nucleotides upstream of the next 5′ splice junction. In some embodiments, the heterologous, in-frame stop codon elicits nonsense-mediated decay.
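For illustration only, the spacing criterion described above can be checked as follows. The sketch assumes that the 3′ end of the exon corresponds to the next 5′ splice junction and that `frame` gives the 0-based offset of the first complete codon in the exon; the function names are hypothetical. The check addresses only the distance criterion and does not predict whether nonsense-mediated decay actually occurs in a given cell.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def distance_from_stop_to_donor(exon: str, frame: int = 0):
    """Distance (nt) from the first in-frame stop codon to the exon's 3' end.

    Returns None when the exon contains no in-frame stop codon.
    Assumption: the exon's 3' end is the next 5' splice junction.
    """
    exon = exon.upper()
    for i in range(frame, len(exon) - 2, 3):
        if exon[i:i + 3] in STOP_CODONS:
            return len(exon) - (i + 3)
    return None


def meets_spacing(exon: str, frame: int = 0, min_distance: int = 50) -> bool:
    """True if an in-frame stop lies at least `min_distance` nt upstream of the 3' end."""
    distance = distance_from_stop_to_donor(exon, frame)
    return distance is not None and distance >= min_distance


if __name__ == "__main__":
    toy_exon = "ATGGCCGCCTAA" + "G" * 60  # hypothetical sequence
    print(distance_from_stop_to_donor(toy_exon), meets_spacing(toy_exon))
```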
In some embodiments, the alternatively-spliced exon is retained in the spliced transcript in distinct tissues. In some embodiments, the alternatively-spliced exon is retained in the spliced transcript in skeletal muscle. In some embodiments, the alternatively-spliced exon is not retained in the spliced transcript in heart and/or liver tissue.
In some embodiments, the flanking intronic sequences of (ii)(b) are or are derived from native flanking introns of the alternatively-spliced exon. In some embodiments, the flanking intronic sequences of (ii)(b) each comprise at least one modification, relative to a naturally occurring intronic sequence. In some embodiments, the modification is a substitution or deletion of one or more nucleic acids.
In some embodiments, the ATG start codon is located at the 3′ end of the alternatively-spliced exon. In some embodiments, the ATG start codon is in the same reading frame as the coding region of interest. In some embodiments, the ATG start codon is within up to 5, 10, 20, or 30 nucleotides upstream of the 3′ end of the alternatively-spliced exon. In some embodiments, the ATG start codon is within up to 5, 10, 20, or 30 nucleotides upstream of the 3′ end of the alternatively-spliced exon and is in the same reading frame as the coding region of interest.
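The positional and reading-frame conditions described above can be sketched as a simple check. The sketch assumes that the alternatively-spliced exon is spliced directly onto the exon carrying the coding region of interest, that the first codon of the coding region begins `cds_offset` nucleotides into that downstream exon, and that "within N nucleotides" is measured from the G of the ATG to the 3′ end of the exon. These are illustrative assumptions, and the function name `find_in_frame_atgs` is hypothetical.

```python
def find_in_frame_atgs(exon: str, window: int = 30, cds_offset: int = 0) -> list[int]:
    """Return 0-based positions of ATGs near the exon's 3' end that are in frame.

    An ATG qualifies when (a) at most `window` nucleotides separate it from the
    exon's 3' end and (b) the number of nucleotides from the A of the ATG to the
    first codon of the downstream coding region is a multiple of three.
    """
    exon = exon.upper()
    hits = []
    pos = exon.find("ATG")
    while pos != -1:
        distance_to_end = len(exon) - (pos + 3)
        in_frame = (len(exon) - pos + cds_offset) % 3 == 0
        if distance_to_end <= window and in_frame:
            hits.append(pos)
        pos = exon.find("ATG", pos + 1)
    return hits


if __name__ == "__main__":
    # Toy exon (hypothetical): the ATG at index 2 is out of frame,
    # the ATG at index 7 ends the exon and is in frame.
    print(find_in_frame_atgs("CCATGCCATG"))
```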
In some embodiments, if the wild-type alternatively-spliced exon does not comprise an ATG start codon at its 3′ end, the first 10 nucleotides of the flanking intronic sequence which is immediately 3′ to the alternatively-spliced exon comprise 1-5 nucleotide substitutions, relative to the wild-type flanking intronic sequence which is immediately 3′ to the wild-type alternatively-spliced exon.
In some embodiments, the one or more intronic sequences of (i) each comprise at least one modification, relative to a naturally occurring intronic sequence. In some embodiments, the modification is a substitution or deletion of one or more nucleic acids.
In some embodiments, the coding region of interest comprises at least one modification, relative to a naturally occurring coding region of the third gene. In some embodiments, the modification is a substitution or deletion of one or more nucleic acids. In some embodiments, the coding region of interest comprises a deletion or disruption of a native start codon. In some embodiments, the coding region of interest comprises at least one heterologous stop codon. In some embodiments, the at least one heterologous stop codon is at least 50 nucleotides upstream of the next 5′ splice junction. In some embodiments, the at least one heterologous stop codon elicits nonsense-mediated decay.
In some embodiments, a transgene as described in any embodiment of the disclosure further comprises a 3′ untranslated region (UTR). In some embodiments, the 3′ UTR is an SV40 3′ UTR. In some embodiments, the SV40 3′ UTR comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 1883. In some embodiments, the SV40 3′ UTR comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 1883. In some embodiments, the 3′ UTR comprises a polyadenylation (pA) site and a cleavage site. In some embodiments, the polyadenylation site is an SV40 pA site.
In some embodiments, a transgene as described in any embodiment of the disclosure further comprises a promoter, wherein the promoter is located 5′, relative to all of (i), (ii), and (iii). In some embodiments, the promoter is a tissue-specific promoter. In some embodiments, the tissue-specific promoter is an MHCK7 promoter. In some embodiments, an MHCK7 promoter comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 1880. In some embodiments, an MHCK7 promoter comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 1880.
In some embodiments, the alternatively-spliced exon cassette comprises a nucleic acid sequence which is 450 to 650 nucleotides in length.
Aspects of the disclosure relate to a recombinant viral genome comprising a transgene as described in any embodiment of the disclosure. In some embodiments, the recombinant viral genome is a genome from a recombinant adeno-associated virus (rAAV). In some embodiments, the transgene is flanked by AAV inverted terminal repeat (ITR) sequences. In some embodiments, the AAV ITR sequences are AAV2 ITR sequences. In some embodiments, an AAV2 ITR comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 1879. In some embodiments, an AAV2 ITR comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 1879.
In some embodiments, the recombinant viral genome comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in either SEQ ID NO: 105 or SEQ ID NO: 106. In some embodiments, the recombinant viral genome comprises a polynucleotide having a nucleic acid sequence as set forth in either SEQ ID NO: 105 or SEQ ID NO: 106.
Aspects of the disclosure relate to an rAAV particle comprising a recombinant viral genome as described in any embodiment of the disclosure. In some embodiments, the rAAV particle comprises AAV serotype 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, or AAV derivative or pseudotype AAV2-AAV3 hybrid, AAVrh.10, AAVhu.14, AAV3a/3b, AAVrh32.33, AAV-HSC15, AAV-HSC17, AAVhu.37, AAVrh.8, CHt-P6, AAV2.5, AAV6.2, AAV2i8, AAV-HSC15/17, AAVM41, AAV9.45, AAV6(Y445F/Y731F), AAV2.5T, AAV-HAE1/2, AAV clone 32/83, AAVShH10, AAV2 (Y→F), AAV8 (Y733F), AAV2.15, AAV2.4, and AAVr3.45. In some embodiments, the rAAV particle further comprises at least one helper plasmid. In some embodiments, the helper plasmid comprises a rep gene and a cap gene. In some embodiments, the rep gene encodes Rep78, Rep68, Rep52, or Rep40, and/or the cap gene encodes a VP1, VP2, and/or VP3 region of the viral capsid protein. In some embodiments, the rAAV particle comprises two helper plasmids. In some embodiments, the first helper plasmid comprises a rep gene and a cap gene and the second helper plasmid comprises an E1a gene, an E1b gene, an E4 gene, an E2a gene, and a VA gene.
Aspects of the disclosure relate to a recombinant viral genome comprising a transgene. In some embodiments, the transgene comprises: (i) a constitutive exon and one or more intronic sequences; (ii) an alternative exon cassette; and (iii) a coding region of interest. In some embodiments, the alternative exon cassette comprises: (a) an alternatively-spliced exon; (b) at least a portion of the intron immediately upstream of the alternatively-spliced exon; and (c) at least a portion of the intron immediately downstream of the alternatively-spliced exon. In some embodiments, if the wild-type alternatively-spliced exon does not comprise an ATG start codon at its 3′ end: (1) the 3′ end of the alternatively-spliced exon comprises 1-3 nucleic acid substitutions relative to the wild-type alternatively-spliced exon to form an ATG start codon, and (2) the first 10 nucleotides of the intron immediately downstream of the alternatively-spliced exon comprise 1-5 nucleic acid substitutions relative to the wild-type intron immediately downstream of the wild-type alternatively-spliced exon.
In some embodiments, the 1-5 nucleic acid substitutions of (2) increase splice site strength. In some embodiments, any wild-type start codons within the alternatively-spliced exon located upstream of the ATG start codon at the 3′ end of the alternatively-spliced exon are disrupted or deleted. In some embodiments, the recombinant viral genome further comprises a tissue-specific promoter upstream of the alternative exon cassette. In some embodiments, the coding region of interest is or is derived from a naturally occurring coding region of MTM1 or CAPN3. In some embodiments, the tissue-specific promoter is an MHCK7 promoter. In some embodiments, the alternative exon is exon 11 of the BIN1 gene. In some embodiments, the constitutive exon is exon 6 of the SMN1 gene. In some embodiments, the alternative exon cassette promotes skeletal muscle expression of the coding region of interest and reduces cardiac muscle expression of the coding region of interest. In some embodiments, the alternative exon cassette is approximately 600 nucleotides in length.
Aspects of the disclosure relate to a method of treating a disease or condition in a subject comprising administering a recombinant viral genome or an rAAV particle according to any embodiment of the present disclosure to the subject. In some embodiments, the subject is a mammal. In some embodiments, the mammal is a human. In some embodiments, the recombinant viral genome or rAAV particle is administered to the subject at least one time. In some embodiments, the viral genome or rAAV particle is administered to the subject 2, 3, 4, 5, 6, 7, 8, 9, or 10 times. In some embodiments, the viral genome or rAAV particle is administered to the subject parenterally, subcutaneously, intraocularly, intravitreally, subretinally, intravenously (IV), intracerebro-ventricularly, intramuscularly, intrathecally (IT), intracisternally, intraperitoneally, enterally, via inhalation, topically, or by direct injection to one or more cells, tissues, or organs. In some embodiments, the viral genome or viral particle is administered to the subject by intravenous injection, intramuscular injection, intrathecal injection, or intravitreal injection. 
In some embodiments, the disease or condition is a disease or condition selected from the group consisting of Dentatorubral-pallido-luysian atrophy (DRPLA), myotonic dystrophy type 1 (DM1), myotonic dystrophy type 2 (DM2), Fragile X syndrome of mental retardation (FMR1), Fragile X tremor ataxia syndrome (FXTAS), FRAXE mental retardation (FMR2), Friedreich's ataxia (FRDA), Huntington disease (HD), Huntington disease-like 2 (HDL2), Oculopharyngeal muscular dystrophy (OPMD), Myoclonic epilepsy type 1, Alzheimer's disease, ALS/FTD, spinocerebellar ataxia type 1 (SCA1), spinocerebellar ataxia type 2 (SCA2), spinocerebellar ataxia type 3 (SCA3), spinocerebellar ataxia type 6 (SCA6), spinocerebellar ataxia type 7 (SCA7), spinocerebellar ataxia type 8 (SCA8), spinocerebellar ataxia type 10 (SCA10), spinocerebellar ataxia type 12 (SCA12), spinocerebellar ataxia type 17 (SCA17), Syndromic/non-syndromic X-linked mental retardation, Emery-Dreifuss muscular dystrophy type 2, familial partial lipodystrophy, limb girdle muscular dystrophy type 1B, dilated cardiomyopathy, familial partial lipodystrophy, Charcot-Marie-Tooth disorder type 2B1, mandibuloacral dysplasia, childhood progeria syndrome (Hutchinson-Gilford syndrome), Werner syndrome, Dilated cardiomyopathy (DCM), Hypertrophic cardiomyopathy (HCM), Restrictive cardiomyopathy (RCM), Left Ventricular Non-compaction (LVNC), Arrhythmogenic Right Ventricular Dysplasia (ARVD), takotsubo cardiomyopathy, Duchenne muscular dystrophy, Becker muscular dystrophy, Limb-girdle muscular dystrophy, Facioscapulohumeral muscular dystrophy, Congenital muscular dystrophy, Oculopharyngeal muscular dystrophy, Distal muscular dystrophy, Emery-Dreifuss muscular dystrophy, dementia, Parkinson's disease (PD), a PD-related disorder, Prion disease, a motor neuron disease (MND), Progressive bulbar palsy (PBP), Progressive muscular atrophy (PMA), Primary lateral sclerosis (PLS), Spinal muscular atrophy (SMA), a bladder cancer, a breast cancer, a colorectal cancer, a kidney cancer, a lung cancer, a lymphoma, a melanoma, an oral cancer, an ovarian cancer, an oropharyngeal cancer, a pancreatic cancer, a prostate cancer, a thyroid cancer, a uterine cancer, Down syndrome, Prader-Willi Syndrome (PWS), Bloom Syndrome, Cockayne Syndrome Type I-216400, Cockayne Syndrome Type III, Cockayne Syndrome Type I, Hutchinson-Gilford Progeria Syndrome, Mandibuloacral Dysplasia with Type A Lipodystrophy, Progeria, Adult Onset Progeroid Syndrome, Neonatal Rothmund-Thomson Syndrome, Seip Syndrome, Werner Syndrome, Replication Focus-Forming Activity 1, myotubular myopathy, Danon Disease, and/or centronuclear myopathy.
The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present disclosure, which can be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.
The present disclosure relates to the observation that alternatively-spliced exons may be used in the context of viral vectors (e.g., AAV viral vectors or lentivirus viral vectors) to effectively regulate the expression of a coding region of interest (e.g., a coding region of a transgene that encodes a therapeutic protein). In certain aspects, the alternatively-spliced exons regulate the expression of a coding region of interest in a condition-sensitive manner (e.g., expression in one type of cell but not another, expression in a diseased condition, or expression in the presence of certain intracellular conditions). Accordingly, the present disclosure relates to a new approach for regulating expression of a transgene (or a coding region thereof) from a recombinant viral vector that couples alternatively-spliced exons with the expression of a coding region of interest (e.g., a coding region of a transgene encoding a therapeutic protein). The present disclosure describes a variety of exemplary configurations as to how to combine or otherwise pair the expression of a coding region of interest (or multiple portions of coding regions) with an alternatively-spliced exon, but any suitable arrangement or configuration is contemplated so long as the expression of the coding region of interest (or portions thereof) is configured to come under regulatory control of the alternatively-spliced exon.
A schematic representing the disclosed new approach for regulating expression of a transgene (or a coding region of a transgene, e.g., a transgene encoding a therapeutic protein) in a recombinant viral genome using alternatively-spliced exons is provided in
Such constructs represent embodiments that enable the disclosed new approach for regulating transgene expression (e.g., the expression of a therapeutic protein) from recombinant viral vectors in a condition-sensitive manner, whereby the condition-sensitive expression is controlled by alternatively-spliced exons which are included in the recombinant genome of the expression vector in a manner that imparts a level of control over the expression of a coding region of interest (e.g., one encoding a therapeutic protein). It will be understood that alternatively-spliced exons are spliced-in or spliced-out in a manner that can be dependent on one or more environmental conditions, e.g., intracellular conditions, such as a disease state (e.g., cancer) or even a type of cell (e.g., a liver cell versus a neuron, each of which has different intracellular conditions), or the presence of an external factor (such as, for example, an administered agent). Thus, whether the alternatively-spliced exon is spliced-in or spliced-out can be dependent upon the condition of the cell in which the splicing machinery operates.
Turning to
The alternatively-spliced exon may be any naturally-occurring alternatively-spliced exon or any recombinant alternatively-spliced exon. A variety of configurations are contemplated, and no limitation is implied by
In
In certain aspects, the disclosure provides methods and compositions for regulating gene expression using viral vectors comprising a recombinant viral genome described herein. Viral vectors can be used to deliver one or more transgenes (comprising a coding region of interest which encodes a protein of interest, such as a therapeutic protein) for therapeutic, diagnostic, or other purposes. In some aspects, expression of a transgene in a recombinant viral genome can be regulated using alternative splicing of an RNA expressed from the viral genome.
Thus, aspects of the disclosure relate to methods and compositions for regulating expression of a transgene (comprising a coding region of interest which encodes a protein of interest, such as a therapeutic protein) using viral vectors comprising a recombinant viral genome described herein. A recombinant viral genome can be engineered to include one or more exons (e.g., one or more of a constitutive exon, an alternatively-spliced exon, and/or engineered versions thereof) that (a) can be either spliced-in or spliced-out of a pre-mRNA encoded by the genome, and (b) include one or more positive or negative regulatory cis-elements that affect protein expression (e.g., mRNA stability and/or translation of the coding region of interest).
Different intron and exon configurations can be used to provide for alternatively-spliced exon splicing, as discussed in greater detail herein, and shown in
It will be appreciated that different types of splice sites exist which may result in splicing under specific conditions. Such splice sites can be chosen for their ability to regulate splicing under conditions of interest. Alternatively or additionally, splice sites may be chosen based upon their relative strength, as calculated using a variety of published methods (see, e.g., Yeo & Burge (2004), Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals, J. Comput. Biol., 11(2-3):377-94). Such relative strength may in some embodiments reflect the efficiency of recognition by the core spliceosomal machinery (e.g., U1 and U2 snRNPs). In some embodiments, splice sites may be altered to enhance or diminish recognition by the core spliceosomal machinery. Such alterations may be performed, in some embodiments, to achieve the desired regulatory behavior in conditions of interest. For example, splice sites may be used to make splicing responsive to certain endogenous or exogenous factors such that the alternative splicing of the pre-mRNA is specific to, for example, certain tissues, certain diseases, or certain intracellular conditions. In some embodiments, splicing may be additionally or alternatively responsive to an exogenous agent (e.g., a small molecule, antibody, or other compound) which regulates splicing of the pre-mRNA.
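As a deliberately crude illustration of comparing relative splice site strength, the sketch below counts how many of the first six intronic nucleotides match the commonly cited 5′ splice site consensus GTRAGT (R = A or G). This is a simplified stand-in chosen for brevity; it is not the maximum entropy method of Yeo & Burge, and the function name `donor_consensus_matches` is hypothetical.

```python
# Commonly cited consensus for the first six intronic bases of a 5' splice site.
DONOR_CONSENSUS = "GTRAGT"

# Bases permitted by each consensus symbol (R is the IUPAC code for A or G).
ALLOWED = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG"}


def donor_consensus_matches(intron_start: str) -> int:
    """Count matches (0-6) between the first six intronic bases and GTRAGT.

    Assumption: a higher count is used here only as a rough proxy for a
    stronger 5' splice site; it is not a validated strength score.
    """
    site = intron_start.upper()[:6]
    if len(site) < 6:
        raise ValueError("at least six intronic nucleotides are required")
    return sum(base in ALLOWED[symbol] for base, symbol in zip(site, DONOR_CONSENSUS))


if __name__ == "__main__":
    # Toy intron starts (hypothetical): a weaker site and a consensus-matching one.
    for intron in ("GTCCGTCCCC", "GTAAGTCCCC"):
        print(intron, donor_consensus_matches(intron))
```

Under this proxy, substitutions within the first nucleotides of the downstream intron that raise the match count would be read as moving the site toward the consensus.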
Alternatively-spliced exons as described herein may in some embodiments be contained within an alternatively-spliced exon cassette, as shown in the various embodiments of
Thus, in some embodiments a recombinant viral genome of the present disclosure comprises a transgene comprising at least one alternatively-spliced exon (or “regulatory”) cassette. In some embodiments, a transgene comprising an alternatively-spliced exon cassette comprises at least one alternatively-spliced exon, intronic sequences flanking the alternatively-spliced exon, and an exon comprising a coding region of interest. However, a transgene comprising a regulatory cassette may in some embodiments also contain additional components, such as a constitutive exon, additional intronic sequences, or both. Accordingly, in some embodiments, a transgene comprising an alternatively-spliced exon cassette comprises any one or more of the following components: an alternatively-spliced exon, a flanking intron, an exon comprising a coding region of interest, and/or a constitutive exon.
In some aspects, alternative splicing regulation can be used to help control the expression of a coding region of interest encoded by a recombinant viral genome (e.g., an rAAV recombinant genome, a lentivirus recombinant genome). Thus, aspects of the invention relate to a method of regulating expression of a coding region of interest using a viral vector comprising a recombinant viral genome described herein. In some embodiments, the method comprises: (i) inserting into the recombinant viral genome at least one transgene comprising an alternatively-spliced exon cassette (e.g., such as any of those shown in
In some embodiments, the heterologous, in-frame stop codon elicits nonsense-mediated decay. In some embodiments, a transgene comprising an alternatively-spliced exon cassette comprises any one or more of the following components: an alternatively-spliced exon, a flanking intron, a coding region of interest, and/or a constitutive exon.
Accordingly, compositions and methods described herein can be useful to regulate expression of therapeutic transcripts in the context of viral vector-based treatments for diseases or disorders. Abnormal cellular regulation (e.g., abnormal regulation of intron splicing of one or more genes) can lead to changes in gene regulation and subsequent protein expression associated with a disease state. Some aspects of the invention therefore concern a method of treating a disease or condition in a subject comprising administering a viral vector of the disclosure to a subject, wherein the viral vector comprises a recombinant viral genome described herein. In some aspects, the present application provides compositions and methods that are useful for delivering genes that retain or restore therapeutically effective levels of regulation (e.g., therapeutically effective regulation of intron splicing).
In some aspects, a viral vector (e.g., an rAAV vector; a lentivirus vector, etc.) comprises a recombinant viral genome that includes a nucleic acid that encodes an RNA (e.g., an mRNA) comprising one or more introns. In some embodiments, splicing of at least one intron is regulated by one or more intracellular factor(s). Regulation of intron splicing can control the expression level of the RNA and/or of the type of RNA (e.g., of an RNA splice alternative) inside a cell.
Unless otherwise defined herein, all scientific and technical terms used in connection with the present disclosure shall have the meanings that are commonly understood by those of ordinary skill in the art. The meaning and scope of the terms are clear; however, in the event of any latent ambiguity, definitions provided herein take precedence over any dictionary or extrinsic definition. Further, unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular. In this disclosure, the use of “or” means “and/or” unless stated otherwise. Furthermore, the use of the term “including,” as well as other forms, such as “includes” and “included,” is not limiting. Things described as “including” or “comprising” can also be configured as “consisting of” or similar language. Also, terms such as “element” or “component” encompass both elements and components comprising one unit and elements and components that comprise more than one subunit unless specifically stated otherwise.
Generally, nomenclatures used in connection with, and techniques of, cell and tissue culture, molecular biology, immunology, microbiology, genetics, and protein and nucleic acid chemistry and hybridization described herein are those well-known and commonly used in the art. The methods and techniques of the present disclosure are generally performed according to conventional methods well known in the art and as described in various general and more specific references that are cited and discussed throughout the present disclosure unless otherwise indicated. Enzymatic reactions and purification techniques are performed according to manufacturer's specifications, as commonly accomplished in the art or as described herein. The nomenclatures used in connection with, and the laboratory procedures and techniques of, analytical chemistry, synthetic organic chemistry, and medicinal and pharmaceutical chemistry described herein are those well-known and commonly used in the art. Standard techniques are used for chemical syntheses, chemical analyses, pharmaceutical preparation, formulation, and delivery, and treatment of subjects.
So that the present disclosure may be more readily understood, select terms are defined below.
As used herein, the term “transgene” refers to any recombinant gene or a segment thereof that includes a non-naturally occurring sequence. The non-naturally occurring sequence may in some embodiments be from a different organism, but it need not be. For example, in some embodiments a transgene is a recombinant gene, or segment thereof, from one organism or infectious agent (e.g., a virus) that is introduced into the genome of another organism or infectious agent. By contrast, in some embodiments, the transgene may contain segments of DNA taken from the same organism, but the segments are arranged in a non-natural configuration. In some embodiments, the non-naturally occurring sequence is an engineered non-naturally occurring sequence. As used herein, a transgene may comprise any combination of naturally-occurring and engineered DNA sequences. In some embodiments, the transgene comprises at least one coding region that encodes a polypeptide of interest (e.g., a therapeutic protein) or fragment thereof. The coding region that encodes a polypeptide of interest (e.g., a therapeutic protein) or fragment thereof may be alternately referred to herein as the “coding region of the transgene.”
A transgene may be introduced into the genome of another organism or infectious agent using recombinant DNA techniques. A transgene may include one or more coding regions of interest that encode a polypeptide of interest, e.g., a therapeutic protein. A transgene may include or may be modified to include one or more regulatory sequences, including, but not limited to, transcription regulatory sequences (e.g., promoter, enhancer, silencer, transcription factor binding sequence, 5′ UTR, or 3′ UTR), post-transcriptional regulatory sequences (e.g., acceptor/donor splicing sites and splicing regulatory sequences), and/or translation regulatory sequences (e.g., translation initiation signals, translation termination signals, mRNA degradation or decay signals, polyadenylation signals). In some embodiments, wherein a transgene is introduced into the genome of another organism using a recombinant adeno-associated virus (AAV), the transgene comprises all components (e.g., exons, introns, regulatory sequences, etc.) which are located between the AAV inverted terminal repeat sequences (see, e.g.,
In some embodiments, a transgene may be modified to comprise an alternatively-spliced exon, defined below, such that the regulation of the expression of the transgene—or of the product encoded by the coding region of the transgene—comes under control of the alternatively-spliced exon. The alternatively-spliced exon may be configured as a “cassette,” defined below.
As used herein, a “regulatory sequence” or, equivalently, a “regulatory element,” may refer to a nucleotide sequence that regulates, directly or indirectly, any aspect of the expression of a gene or transgene, including regulatory sequences that affect transcription of a gene or transgene into one or more mRNAs, the processing of mRNA (e.g., the splicing of a pre-mRNA comprising exons and introns to produce one or more mRNA isoforms), and/or the translation of a coding region in an mRNA to form a polypeptide product.
Where a regulatory sequence or element is near, within, or otherwise proximal to a gene or transgene (or coding sequence thereof), the regulatory sequence may be referred to as a cis-acting regulatory sequence. This is in contrast to a trans-acting regulatory sequence, which is a regulatory sequence that is distal from the gene or transgene being regulated, whether on the same nucleic acid molecule or on a different one. Such cis-acting regulatory sequences may be referred to as “positive or negative regulatory cis-elements,” and, in certain embodiments, are located within an “alternatively-spliced exon.”
Non-limiting examples of positive or negative regulatory cis-elements can include, for instance, (1) a nucleotide sequence element that regulates, modulates, or otherwise controls the amount, stability, and/or degradation of an mRNA encoding a coding region of interest (or portions thereof); and/or (2) a nucleotide sequence element that regulates, modulates, or otherwise controls the translation of a coding region of interest (or portions thereof) encoded by an mRNA. Where positive or negative regulatory cis-elements are located within an alternatively-spliced exon, the splicing-in or splicing-out of the alternatively-spliced exons either retains or removes the positive or negative regulatory cis-element from a resulting post-spliced mRNA encoding the coding region of interest. Depending upon whether the alternatively-spliced exon is spliced-in or spliced-out, and then depending upon which one or more positive or negative regulatory cis-elements are associated with the alternatively-spliced exon, there will be a corresponding effect on the overall regulation of the expression of the transgene or a coding region of interest therein. Such effect may in some embodiments be that the expression level is upregulated or downregulated, or, for example, “turned-off” completely.
(iii) Alternatively-Spliced Exon
As will be understood, an “alternatively-spliced exon” or an “alternatively-regulated exon” or a “cassette exon” refers to certain exons which are either retained (e.g., spliced-in) or excluded (e.g., spliced-out) during post-transcriptional splicing of a pre-mRNA. Whether an alternatively-spliced exon is spliced-in or spliced-out may depend on a number of different factors, including, but not limited to, one or more cellular conditions, such as the presence or absence of a disease state (e.g., cancer), type of cell (e.g., liver cell versus skeletal cell), other intracellular conditions, or an external engineered factor (e.g., the administration of an agent).
The differential splicing events result in different spliced transcripts (e.g., mRNA isoforms) that either retain or exclude the alternatively-spliced exon. Further, as disclosed herein, the alternatively-spliced exons may comprise one or more positive or negative regulatory cis-elements that exert a positive or negative regulatory control on the expression of a coding region of interest (or portions thereof). Alternatively-spliced exons may be found in nature in a naturally-occurring gene, or may be modified by changing or altering the sequence thereof, including adding or changing the splice site, and/or adding or changing a positive or negative regulatory cis-element. Such altered exons may be referred to as “recombinant” or “synthetic” exons. “Recombinant” or “synthetic” may in some embodiments include naturally occurring exons that have been placed into a heterologous gene (e.g., an unmodified exon placed into a non-natural context). In some embodiments, the cis-elements mediate localization to a specific cellular compartment, such as, for example, an organelle, the cytoskeleton, plasma membrane, the endoplasmic reticulum, the mitochondria, the nucleus, etc.
As used herein, the term “cassette” refers to any set of introns and/or exons (including an alternatively-spliced exon) capable of exhibiting a splicing pattern to produce different spliced transcripts (e.g., mRNA isoforms).
In some embodiments, when the cassette comprises an alternatively-spliced exon and, in some embodiments, the intronic sequences (or portions thereof) flanking the alternatively-spliced exon, the cassette may be referred to as an “alternative splicing cassette” or equivalently, “alternatively-spliced exon cassette” or “alternative exon cassette.” When situated in an alternatively-spliced exon cassette, an alternatively-spliced exon may be alternatively referred to as a “cassette exon.” For purposes of clarity, a “cassette,” and in particular, an “alternatively-spliced exon cassette,” may exclude a coding region of interest, but also may be configured to be operatively linked to any coding region of interest such that the alternatively-spliced exon cassette regulates the expression of the coding region of interest.
As used herein, an “engineered intron” is an intron which comprises at least one modification, relative to a native intron. For example, an engineered intron may comprise one or more nucleotide deletions, and thus be truncated, relative to a native intron.
As used herein, an “engineered exon” is an exon which comprises at least one modification, relative to a native exon. For example, an engineered exon may comprise one or more nucleotide deletions, and thus be truncated, relative to a native exon.
As used herein, a “flanking” component (e.g., a flanking intron) refers to a component which is located upstream (e.g., 5′) or downstream (e.g., 3′) of a central component (e.g., an exon). A flanking component may in some embodiments be immediately adjacent to the central component, but that is not required by the methods and compositions of the present disclosure. For example, a central alternatively-spliced exon may, in some embodiments, be flanked by two introns, wherein such introns are immediately adjacent to the central alternatively-spliced exon. The same central alternatively-spliced exon may also be flanked by two additional exons, which are located upstream and downstream of the central alternatively-spliced exon, respectively, but which are not immediately adjacent to the central alternatively-spliced exon.
As used herein, a “constitutive exon” is an exon that is present in all spliced transcripts (e.g., mRNA isoforms) formed as a result of splicing a pre-mRNA transcript that is transcribed from a gene. A constitutive exon is therefore common to different mRNA isoforms of a gene.
Additional terms are defined throughout the disclosure.
Through alternative splicing of pre-mRNAs, individual mammalian genes often produce multiple mRNAs (i.e., mRNA isoforms) and resultant protein isoforms that may have related, distinct or even opposing functions. The mRNA and protein isoforms produced by alternative splicing (or equivalently, alternative processing) of primary RNA transcripts may differ in structure, function, localization or other properties. Alternative splicing in particular is known to affect more than half of all human genes, and has been proposed as a primary driver of the evolution of phenotypic complexity in mammals. The number of variants of a gene ranges from two to potentially thousands. The resulting proteins may exhibit different and sometimes antagonistic functional and structural properties, and may inhabit the same cell with the resulting phenotype representing a balance between their expression levels. Defects in splicing have been implicated in human diseases, including cancer.
Aspects of the invention utilize alternative splicing mechanisms as a method of regulating the expression of a transgene (e.g., encoding a therapeutic protein). However, unlike naturally occurring alternatively-spliced exons, the alternatively-spliced exons of the application do not necessarily result in alternative sequence isoforms of the encoded protein. In many embodiments, an alternatively-spliced exon impacts the level of protein expression without impacting the sequence of the protein that is expressed. That is, the alternatively-spliced exon is utilized as a means of regulation of the expression of the protein of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in the productive translation of a coding region of interest. In some embodiments, exclusion of the alternatively-spliced exon from the spliced transcript results in the coding region of interest not being translated (e.g., the alternatively-spliced exon is spliced out). In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in nonsense-mediated decay. In some embodiments, exclusion of the alternatively-spliced exon from the spliced transcript results in the productive translation of the coding region of interest.
Thus, by manipulating the composition and arrangement of an alternatively-spliced exon cassette, a recombinant viral genome of the present disclosure comprising the alternatively-spliced exon cassette may behave in a predictable manner, and the transgene and/or coding region of interest may be expressed in specific conditions which are therapeutically beneficial (e.g., in a specific cell type, a specific tissue, a disease state, and/or upon an inflammatory response). Transgenes comprising alternatively-spliced exon cassettes may be designed according to any one of several non-limiting models of alternative splicing (shown in
In various aspects, the alternatively-spliced exons are spliced-in or spliced-out in a manner that is dependent upon one or more environmental cues, e.g., cell or tissue type, disease state, or intracellular conditions. The alternatively-spliced exons can be sourced from a naturally occurring gene or may be recombinant, for example, in order to add one or more genetic regulatory elements for influencing expression levels of the transgene and/or coding region of the transgene. Examples of alternatively-spliced exons are disclosed herein.
In various embodiments, the alternatively-spliced exons may comprise one or more regulatory sequences that modulate the expression of a coding sequence of interest. Such regulatory sequences may be referred to as cis-elements. Further, cis-elements that impart a positive regulatory control on a coding sequence of interest may be referred to as positive regulatory cis-elements. Conversely, cis-elements that impart a negative regulatory control on a coding sequence of interest may be referred to as negative regulatory cis-elements.
Alternatively-spliced exons may be found in nature in naturally-occurring genes, or may be modified by changing or altering the sequence thereof (e.g., derived from a naturally-occurring gene), including adding or changing the splice site, and/or adding or changing a positive or negative regulatory cis-element. The one or more positive or negative regulatory cis-elements may be located within an alternatively-spliced exon, and may influence the level of expression of a coding region of interest through positive and/or negative controls, and may include any regulatory sequence which exerts, as a consequence of being spliced-in or spliced-out of the final mRNA, either a positive or negative regulation on the expression of the coding region.
In some embodiments, the one or more cis-elements can include, but are not limited to, a translation start codon, a translation stop codon, an siRNA binding site, a miRNA binding site, a sequence forming a stem-loop structure, a sequence forming an RNA dimerization motif, a sequence forming a hairpin structure, a sequence forming an RNA quadruplex, a polypurine tract, a sequence forming a pair of kissing loops, and a sequence forming a tetraloop/tetraloop receptor pair. In some embodiments, cis-elements include binding sites recognized by regulatory elements, such as, for example, RNA binding proteins. In some embodiments, an RNA binding protein capable of exerting regulatory control once bound is an RNA binding protein described in Van Nostrand, et al. (2020), A large-scale binding and functional map of human RNA-binding proteins, Nature, 583: 711-719, which is herein incorporated by reference with respect to its description of RNA binding proteins.
In various embodiments, the cassettes (e.g., comprised within a transgene) may include one or more additional components, including one or more other constitutive exons, and one or more introns. In
Various specific embodiments of these general groups of configurations are further shown in
In some embodiments, a transgene comprising an alternatively-spliced exon cassette comprises a polynucleotide sequence as set forth in any one of SEQ ID NOs: 45-55. In some embodiments, a transgene comprising an alternatively-spliced exon cassette comprises a polynucleotide sequence that is 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to any one of SEQ ID NOs: 45-55.
In some embodiments, the nucleic acid vectors of the present invention comprise a transgene comprising an alternatively-spliced exon cassette comprising components which, when alternatively spliced, comprise a skipped exon model of alternative splicing (see, e.g.,
Referencing the components as labeled in
Referencing the components as labeled in
Referencing the components as labeled in
In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript does not result in expression of the coding region of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in nonsense-mediated decay. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript does not result in nonsense-mediated decay. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest, wherein expression of the coding region of interest is regulated by a positive or negative cis-acting element. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest, wherein expression of the coding region of interest is not regulated by a positive or negative cis-acting element.
In some embodiments, the nucleic acid vectors of the present invention comprise a transgene comprising an alternatively-spliced exon cassette comprising components which, when alternatively spliced, comprise a retained intron model of alternative splicing (see, e.g.,
Referencing the components as labeled in
Referencing the components as labeled in
Referencing the components as labeled in
In some embodiments, retention of the alternative exon in the spliced transcript results in expression of the coding region of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript does not result in expression of the coding region of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in nonsense-mediated decay. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript does not result in nonsense-mediated decay. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest, wherein expression of the coding region of interest is regulated by a positive or negative cis-acting element. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest, wherein expression of the coding region of interest is not regulated by a positive or negative cis-acting element.
(iii) Alternative 5′ Splice Site Model of Alternative Splicing
In some embodiments, the nucleic acid vectors of the present invention comprise a transgene comprising an alternatively-spliced exon cassette comprising components which, when alternatively spliced, comprise an alternative 5′ donor site model of alternative splicing (see, e.g.,
Referencing the components as labeled in
Referencing the components as labeled in
Referencing the components as labeled in
In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript does not result in expression of the coding region of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in nonsense-mediated decay. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript does not result in nonsense-mediated decay. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest, wherein expression of the coding region of interest is regulated by a positive or negative cis-acting element. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest, wherein expression of the coding region of interest is not regulated by a positive or negative cis-acting element.
In some embodiments, the nucleic acid vectors of the present invention comprise a transgene comprising an alternatively-spliced exon cassette comprising components which, when alternatively spliced, comprise an alternative 3′ acceptor site model of alternative splicing (see, e.g.,
Referencing the components as labeled in
Referencing the components as labeled in
Referencing the components as labeled in
In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript does not result in expression of the coding region of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in nonsense-mediated decay. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript does not result in nonsense-mediated decay. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest, wherein expression of the coding region of interest is regulated by a positive or negative cis-acting element. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest, wherein expression of the coding region of interest is not regulated by a positive or negative cis-acting element.
In some embodiments, the nucleic acid vectors of the present invention comprise a transgene comprising an alternatively-spliced exon cassette comprising components which, when alternatively spliced, comprise a mutually exclusive exon model of alternative splicing (see, e.g.,
Referencing the components as labeled in
Referencing the components as labeled in
Referencing the components as labeled in
In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript does not result in expression of the coding region of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in nonsense-mediated decay. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript does not result in nonsense-mediated decay. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest, wherein expression of the coding region of interest is regulated by a positive or negative cis-acting element. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest, wherein expression of the coding region of interest is not regulated by a positive or negative cis-acting element.
In some embodiments, the nucleic acid vectors of the present invention comprise a transgene comprising an alternatively-spliced exon cassette comprising components which, when alternatively spliced, comprise an alternative last exon model of alternative splicing (see, e.g.,
Referencing the components as labeled in
Referencing the components as labeled in
Referencing the components as labeled in
In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript does not result in expression of the coding region of interest. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in nonsense-mediated decay. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript does not result in nonsense-mediated decay. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest, wherein expression of the coding region of interest is regulated by a positive or negative cis-acting element. In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in expression of the coding region of interest, wherein expression of the coding region of interest is not regulated by a positive or negative cis-acting element.
In some embodiments, a nucleic acid vector (e.g., a viral vector) of the present invention comprises a transgene comprising at least one alternatively-spliced exon cassette as described herein. Nucleic acid vectors or transgenes may have one alternatively-spliced exon cassette, or multiple such cassettes. In some embodiments, a nucleic acid vector or transgene comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15, or more alternatively-spliced exon cassettes. A transgene comprising an alternatively-spliced exon cassette may, in some embodiments, comprise any one or more of the following components: an alternatively-spliced exon, an intron (e.g., a flanking intron), an exon comprising a coding region of interest, and/or a constitutive exon. In some embodiments, a transgene comprising an alternatively-spliced exon cassette comprises an alternatively-spliced exon, a flanking intron, and an exon comprising a coding region of interest (wherein, in some embodiments, the coding region of interest may be split into portions across two or more exons).
In some embodiments, a nucleic acid vector or transgene comprises an alternatively-spliced exon cassette, wherein the alternatively-spliced exon cassette comprises among other components at least one alternatively-spliced exon. In some embodiments, the alternatively-spliced exon cassette comprises 1, 2, 3, or 4 alternatively-spliced exons. In some other embodiments, the alternatively-spliced exon cassette comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 alternatively-spliced exons. In some embodiments, wherein the alternatively-spliced exon cassette comprises more than one alternatively-spliced exon, the alternatively-spliced exons are adjacent. In some embodiments, wherein the alternatively-spliced exon cassette comprises more than one alternatively-spliced exon, the alternatively-spliced exons are not adjacent.
In some embodiments, the alternatively-spliced exon is synthetic or recombinant. In some embodiments, the alternatively-spliced exon is considered to be synthetic or recombinant because it undergoes one or more nucleic acid modifications, relative to the wild-type alternatively-spliced exon. A nucleic acid modification may be a substitution or deletion of one or more nucleotides that form the nucleic acid sequence of the alternatively-spliced exon.
In some embodiments, an alternative exon comprises an ATG start codon at its 3′ end. In some embodiments, the “3′ end” comprises the 1, 2, or 3 nucleotides lying at the 3′ end of the alternative exon. As will be understood, in some embodiments a wild-type or naturally occurring alternative exon may comprise an ATG start codon at its 3′ end. In such embodiments, the alternative exon may comprise nucleic acid modifications unrelated to the insertion of a heterologous start codon at the 3′ end of the alternative exon. However, it will be further understood that in some embodiments a wild-type or naturally occurring alternative exon may not comprise an ATG start codon at its 3′ end. In such embodiments, modifications are made to the 3′ end of the alternative exon to introduce a heterologous start codon, such that when the alternative exon is spliced-in or retained in the spliced transcript, the downstream coding sequence is translated as a full-length protein. As will be understood, in some embodiments 1, 2, or 3 nucleotide substitutions may be necessary in order to introduce the heterologous ATG start codon to the 3′ end of the alternative exon, depending on the sequence which is present at the 3′ end of the wild-type or naturally occurring alternative exon. In some such embodiments, the 3′ end of the alternatively-spliced exon comprises 1 nucleotide substitution, relative to the wild-type alternatively-spliced exon, to form the ATG start codon. In some such embodiments, the 3′ end of the alternatively-spliced exon comprises 2 nucleotide substitutions, relative to the wild-type alternatively-spliced exon, to form the ATG start codon. In some such embodiments, the 3′ end of the alternatively-spliced exon comprises 3 nucleotide substitutions, relative to the wild-type alternatively-spliced exon, to form the ATG start codon.
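By way of non-limiting illustration only, the number of nucleotide substitutions needed to place an ATG start codon at the 3′ end of an alternative exon can be sketched as a position-by-position comparison of the last three nucleotides against ATG. The function name and the example sequences below are hypothetical and are not taken from the Sequence Listing.

```python
START_CODON = "ATG"


def substitutions_for_terminal_atg(exon_seq: str) -> int:
    """Count the substitutions needed so the last three nucleotides read ATG.

    Hypothetical helper: compares the 3' terminal trinucleotide of the exon
    with ATG, position by position, and counts the mismatches (0 to 3).
    """
    seq = exon_seq.upper()
    if len(seq) < 3:
        raise ValueError("exon must be at least 3 nucleotides long")
    terminal = seq[-3:]
    return sum(1 for have, want in zip(terminal, START_CODON) if have != want)


# Made-up exon fragments, for illustration only.
print(substitutions_for_terminal_atg("GGCCATG"))  # already ends in ATG
print(substitutions_for_terminal_atg("GGCCACG"))  # ACG differs from ATG at one position
print(substitutions_for_terminal_atg("GGCCTTA"))  # TTA differs from ATG at two positions
print(substitutions_for_terminal_atg("GGCCCCC"))  # CCC differs from ATG at all three positions
```

A result of 0 corresponds to an exon that already carries the terminal ATG, and results of 1, 2, or 3 correspond to the substitution counts recited above.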
In some embodiments, the modification comprises the insertion of a heterologous start codon or part of a heterologous start codon at the 3′ end of the alternatively-spliced exon (e.g., 1-3 nucleotides are added to the 3′ end of the alternatively-spliced exon, rather than substituted, to form an ATG start codon).
In some embodiments, an alternative exon comprises part of an ATG start codon at its 3′ end. In some embodiments, an alternative exon may comprise, for example, “A” as the last nucleotide, or “AT” as the last two nucleotides, which form the 3′ end of the alternative exon. In such embodiments, the remainder of the ATG start codon may lie at the 5′ end of an exon lying immediately downstream of the alternative exon. For example, in some embodiments the alternative exon may comprise “A” as the last nucleotide which forms the 3′ end of the alternative exon, and the exon lying immediately downstream of the alternative exon may comprise “TG” as the first two nucleotides which form the 5′ end of the downstream exon. In some embodiments, the alternative exon may comprise “AT” as the last two nucleotides which form the 3′ end of the alternative exon, and the exon lying immediately downstream of the alternative exon may comprise “G” as the first nucleotide which forms the 5′ end of the downstream exon. In some embodiments, the ATG formed as a result of the splicing together of the alternative exon and the exon lying immediately downstream of the alternative exon initiates translation of the exon lying immediately downstream of the alternative exon. In some embodiments, the exon lying immediately downstream of the alternative exon may be, for example, the coding region of the transgene (e.g., an MTM1 coding region).
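By way of non-limiting illustration only, the split start codon arrangement described above can be sketched as a check of whether joining two exons creates an ATG that spans the exon-exon junction. The function name and the sequences are hypothetical; the sketch models exons as plain strings and does not model the splicing machinery.

```python
from typing import Optional

START_CODON = "ATG"


def atg_nucleotides_from_upstream_exon(upstream_exon: str,
                                       downstream_exon: str) -> Optional[int]:
    """Report how an ATG is formed at the junction of two spliced exons.

    Returns 3 if the upstream exon ends in a complete ATG, 2 if it ends in
    "AT" and the downstream exon begins with "G", 1 if it ends in "A" and the
    downstream exon begins with "TG", and None if no ATG ends at or spans
    the junction.
    """
    up = upstream_exon.upper()
    down = downstream_exon.upper()
    for k in (3, 2, 1):
        if up.endswith(START_CODON[:k]) and down.startswith(START_CODON[k:]):
            return k
    return None


# Made-up sequences: the alternative exon ends in "A", the coding exon begins with "TG".
alt_exon = "CCGA"
coding_exon = "TGGCC"
constitutive_exon = "CCGC"  # hypothetical upstream exon used when the alternative exon is skipped

print(atg_nucleotides_from_upstream_exon(alt_exon, coding_exon))           # exon spliced-in
print(atg_nucleotides_from_upstream_exon(constitutive_exon, coding_exon))  # exon spliced-out
```

In this sketch the junctional ATG exists only in the isoform that retains the alternative exon, which is the property that couples translation initiation to the splicing decision.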
In some embodiments, an alternative exon comprises an ATG start codon, or part of an ATG start codon, within the nucleic acid sequence of the alternative exon (e.g., not at the 3′ end of the alternative exon). In some embodiments, the ATG start codon is in the same reading frame as the coding region of interest. In some embodiments, the ATG start codon is within up to 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides upstream of the 3′ end of the alternatively-spliced exon. In some embodiments, the ATG start codon is within 4-6, 5-7, 6-8, 7-9, 8-10, 9-11, 10-12, 13-15, 14-16, 15-17, 16-18, 17-19, 18-20, 19-21, 20-22, 21-23, 22-24, 23-25, 24-26, 25-27, 26-28, 27-29, or 28-30 nucleotides upstream of the 3′ end of the alternatively-spliced exon. In some embodiments, the ATG start codon is within 4-12, 8-16, 12-20, 16-24, or 20-30 nucleotides upstream of the 3′ end of the alternatively-spliced exon. In some embodiments, the ATG start codon is within up to 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides upstream of the 3′ end of the alternatively-spliced exon and is in the same reading frame as the coding region of interest. In some embodiments, the ATG start codon is within 4-6, 5-7, 6-8, 7-9, 8-10, 9-11, 10-12, 13-15, 14-16, 15-17, 16-18, 17-19, 18-20, 19-21, 20-22, 21-23, 22-24, 23-25, 24-26, 25-27, 26-28, 27-29, or 28-30 nucleotides upstream of the 3′ end of the alternatively-spliced exon and is in the same reading frame as the coding region of interest. In some embodiments, the ATG start codon is within 4-12, 8-16, 12-20, 16-24, or 20-30 nucleotides upstream of the 3′ end of the alternatively-spliced exon and is in the same reading frame as the coding region of interest.
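By way of non-limiting illustration only, locating an internal ATG and testing whether it is in frame with a downstream coding region can be sketched as follows. Two assumptions are made for the sketch and are not taken from the disclosure: (1) the distance upstream of the 3′ end is counted as the number of nucleotides between the G of the ATG and the last nucleotide of the exon, and (2) the coding region of interest begins at the first nucleotide of the immediately downstream exon, so an ATG is in frame when the number of nucleotides from its A to the end of the exon is a multiple of three. The function name and sequences are hypothetical.

```python
from typing import List, Tuple


def internal_start_codons(exon_seq: str,
                          max_distance: int = 30) -> List[Tuple[int, int, bool]]:
    """List ATG triplets lying within max_distance nucleotides of the exon's 3' end.

    Each hit is (start_index, distance_to_3prime_end, in_frame), where
    distance counts the nucleotides after the G of the ATG (assumption 1) and
    in_frame assumes the coding region starts at the first nucleotide of the
    next exon (assumption 2).
    """
    seq = exon_seq.upper()
    hits = []
    start = seq.find("ATG")
    while start != -1:
        distance = len(seq) - (start + 3)
        in_frame = (len(seq) - start) % 3 == 0
        if distance <= max_distance:
            hits.append((start, distance, in_frame))
        start = seq.find("ATG", start + 1)
    return hits


# Made-up exon fragments, for illustration only.
print(internal_start_codons("CCATGGCCGCC"))  # ATG followed by six nucleotides
print(internal_start_codons("CCATGGCCGC"))   # ATG followed by five nucleotides
```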
In some embodiments, wherein the alternative exon comprises 1, 2, or 3 nucleotide substitutions at the 3′ end to result in a heterologous ATG start codon (e.g., if the wild-type alternatively-spliced exon does not comprise an ATG start codon at its 3′ end), the strength of the 5′ splice site of the alternative exon may be diminished, relative to the strength of the 5′ splice site of the wild-type or naturally occurring alternative exon. In such embodiments, one or more additional modifications may be made to the intronic sequence located immediately downstream of the sequence comprising the 3′ end of the alternative exon (see
Additionally or alternatively, in some embodiments the modification comprises disrupting or deleting all native start codons located 5′ to the heterologous start codon. In some embodiments, wherein the alternatively-spliced exon cassette comprises more than one alternatively-spliced exon, all native start codons located 5′ to the heterologous start codon of the 5′-most alternatively-spliced exon are disrupted or deleted. Additionally or alternatively, in some embodiments the modification comprises introducing into the alternatively-spliced exon a heterologous, in-frame stop codon at least 50 nucleotides upstream of the next 5′ splice junction. In some embodiments, the alternatively-spliced exon is a nonsense-mediated decay (NMD) exon. In some embodiments, the NMD exon comprises an in-frame stop codon that is at least 50 nucleotides upstream of the next 5′ splice junction.
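By way of non-limiting illustration only, the positional criterion recited above (an in-frame stop codon at least 50 nucleotides upstream of the next 5′ splice junction) can be sketched as a simple check on an exon sequence. The sketch assumes that the next 5′ splice junction is the 3′ end of the exon being examined, that distance is counted from the nucleotide after the stop codon to the end of the exon, and that the reading frame is supplied by the caller; these conventions, the function names, and the sequences are hypothetical.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stop_positions(exon_seq: str, frame_offset: int = 0) -> List[int]:
    """Return start indices of stop codons read in the given frame."""
    seq = exon_seq.upper()
    return [i for i in range(frame_offset, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS]


def stop_is_far_enough_upstream(exon_seq: str,
                                frame_offset: int = 0,
                                min_distance: int = 50) -> bool:
    """Check whether the first in-frame stop codon lies at least min_distance
    nucleotides upstream of the exon's 3' end (taken here as the next 5' splice junction)."""
    stops = in_frame_stop_positions(exon_seq, frame_offset)
    if not stops:
        return False
    first_stop = stops[0]
    distance = len(exon_seq) - (first_stop + 3)
    return distance >= min_distance


# Made-up exons: a stop codon followed by 60 or by 30 nucleotides.
long_tail = "GCCGCC" + "TAA" + "GCC" * 20
short_tail = "GCCGCC" + "TAA" + "GCC" * 10
print(stop_is_far_enough_upstream(long_tail))
print(stop_is_far_enough_upstream(short_tail))
```

The check is purely positional; whether a given transcript is actually subject to nonsense-mediated decay in a cell is not modeled.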
In some embodiments, the alternatively-spliced exon is considered to be synthetic when it is situated non-naturally (e.g., is linked to a coding sequence to which it would not be linked in wild-type or naturally-occurring conditions), relative to the wild-type alternatively-spliced exon (e.g., is heterologous). In some embodiments, the alternatively-spliced exon is considered to be synthetic when it (i) undergoes one or more nucleic acid modifications, and (ii) is situated non-naturally, relative to the wild-type alternatively-spliced exon.
In some embodiments, the alternatively-spliced exon is a regulatory exon. In some embodiments, the regulatory exon is an alternatively regulated exon (e.g., an exon known to be subject to alternative splicing mechanisms). It will be appreciated that alternative splicing is a process by which exons or portions of exons or noncoding regions within a pre-mRNA transcript are differentially joined or skipped, resulting in multiple protein isoforms being encoded by a single gene. The regulation of alternative splicing is complex.
Briefly, alternative splicing is known to be regulated by the functional coupling between transcription and splicing. Additional molecular features, such as chromatin structure, RNA structure and alternative transcription initiation or alternative transcription termination, collaborate with these basic components to produce the multiple isoforms that result from alternative splicing (see, e.g., Wang, et al., Biomed Rep. 2015 March; 3(2): 152-158). In certain embodiments, the compositions and methods of the present disclosure utilize the naturally-occurring mechanisms which regulate alternative splicing to express coding regions of interest (e.g., what would be alternatively spliced isoforms in the natural context) in specific biological conditions. In other embodiments, additional genetic elements may be incorporated into the DNA. In some embodiments, such additional genetic elements may become incorporated into the corresponding pre-mRNA, and may consequently influence, control, or otherwise regulate the splicing of the pre-mRNA to form one or more mRNA isoforms.
In some aspects, an alternatively-spliced exon, for which splicing may be regulated, is an exon for which splicing levels differ by at least 5%, for example at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% under two different conditions (e.g., in different tissues, in response to intracellular T cell levels, in response to intracellular levels of one or more RNA binding proteins, in the context of an autoregulated gene, etc.). By “splicing levels differ by 5%”, it is meant that the splicing levels for an exon of interest are measured in two different conditions, and the splicing level is compared between the conditions and expressed as a difference in percentage points. For example, if the splicing level in condition A is 80%, and the splicing level in condition B is 85%, the splicing levels between conditions A and B differ by 5%. Likewise, if the splicing level in condition A is 80%, and the splicing level in condition B is 75%, the splicing levels between conditions A and B also differ by 5%.
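By way of non-limiting illustration, the comparison of splicing levels between two conditions may be sketched as follows. The sketch assumes, consistent with the worked examples above (80% versus 85%, and 80% versus 75%), that the difference is taken as an absolute difference in percentage points. The function names and the default threshold parameter are hypothetical and are chosen only for this illustration.

```python
def splicing_level_difference(level_a: float, level_b: float) -> float:
    """Return the absolute difference, in percentage points, between two
    splicing levels that are each expressed as a percentage (0 to 100)."""
    for level in (level_a, level_b):
        if not 0.0 <= level <= 100.0:
            raise ValueError("splicing levels must be between 0 and 100")
    return abs(level_a - level_b)


def differs_by_at_least(level_a: float, level_b: float,
                        threshold: float = 5.0) -> bool:
    """Check whether two splicing levels differ by at least `threshold`
    percentage points (assumed default of 5, the lowest value recited)."""
    return splicing_level_difference(level_a, level_b) >= threshold


if __name__ == "__main__":
    # Condition A = 80%, condition B = 85%, as in the example in the text.
    print(splicing_level_difference(80.0, 85.0))
    # Condition A = 80%, condition B = 75%.
    print(splicing_level_difference(80.0, 75.0))
```

Because the absolute value is taken, the order in which the two conditions are supplied does not affect the result.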
In some embodiments, the step of calculating a difference in expression of certain isoforms of certain genes in certain conditions as described herein is performed by calculating a percent spliced-in (psi) score. A psi (Ψ) score is a value between 0 and 1 (e.g., 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.30, 0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.40, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.50, 0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99, or 1.0, or any value included therein such as e.g. 0.001, 0.0001, 0.0001, etc.) that quantifies alternative splicing occurrences present within a sample, or under certain conditions of interest.
In some embodiments, the Ψ score is calculated (e.g., calculated from RNAseq reads) by dividing the number of inclusion reads (e.g., the number of alternative splicing events for a gene of interest) by the total number of inclusion reads and exclusion reads (e.g., the number of normal (e.g., non-alternative) splicing events for the gene of interest). Therefore, in some embodiments the Ψ score is calculated according to the following formula for the gene of interest:
Ψ = (number of inclusion reads)/(number of inclusion reads + number of exclusion reads)
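By way of non-limiting illustration, the read-count ratio described above may be sketched as follows. The function name and the handling of a locus with no reads are assumptions made for this illustration, and the read counts shown are hypothetical rather than measured data.

```python
def psi_score(inclusion_reads: int, exclusion_reads: int) -> float:
    """Compute a percent spliced-in (psi) score as
    inclusion / (inclusion + exclusion), giving a value from 0 to 1."""
    if inclusion_reads < 0 or exclusion_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = inclusion_reads + exclusion_reads
    if total == 0:
        # Assumption: with no reads at all the score is treated as undefined.
        raise ValueError("psi is undefined when there are no reads")
    return inclusion_reads / total


if __name__ == "__main__":
    # Hypothetical counts: 80 inclusion reads and 20 exclusion reads.
    print(psi_score(80, 20))
```

In practice, read counts near zero give unstable estimates, which is one reason a model-based approach that reports confidence in its estimates may be preferred.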
In some embodiments, the calculating comprises performing a mixture of isoforms (MISO) analysis. MISO analysis provides an estimate of isoform expression levels within a sample (e.g., a sample comprising a tissue of interest) based on a statistical model and assesses confidence in those estimates. In some embodiments, MISO analysis is performed using MISO software (see, e.g., Katz, Y., E. T. Wang, et al. (2010), Analysis and design of RNA sequencing experiments for identifying isoform regulation, Nat Methods 7(12): 1009-1015).
In some embodiments, a Ψ score higher than (>) 0.50 (for example 0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99, or 1.0, or any value included therein such as e.g. 0.5001, 0.50001, etc.) indicates that a greater number of alternative splicing events for the gene of interest are present in the tested sample than the number of regular splicing events. Conversely, in some embodiments a Ψ score lower than (<) 0.50 (for example 0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.2, 0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.30, 0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.40, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, or any value included therein such as e.g. 0.499, 0.4999, etc.) indicates that a lower number of alternative splicing events for the gene of interest are present in the tested sample than the number of regular splicing events.
As used herein, delta psi (ΔΨ) score is used to refer to the calculation of the difference between two Ψ scores for a single gene of interest (e.g., in different tissues, in different intracellular conditions, etc.). The difference between the two calculated Ψ scores is the ΔΨ score. It will be understood that, because a Ψ score may be any value between 0 and 1, as described herein, a ΔΨ score (that is, the difference between the two calculated Ψ scores) may also be any value between 0 and 1 (e.g., 0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.30, 0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.40, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.50, 0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99, or 1.0, or any value included therein such as e.g. 0.001, 0.0001, 0.0001, etc.) or any value between 0 and −1 (e.g., 0, −0.01, −0.02, −0.03, −0.04, −0.05, −0.06, −0.07, −0.08, −0.09, −0.10, −0.11, −0.12, −0.13, −0.14, −0.15, −0.16, −0.17, −0.18, −0.19, −0.20, −0.21, −0.22, −0.23, −0.24, −0.25, −0.26, −0.27, −0.28, −0.29, −0.30, −0.31, −0.32, −0.33, −0.34, −0.35, −0.36, −0.37, −0.38, −0.39, −0.40, −0.41, −0.42, −0.43, −0.44, −0.45, −0.46, −0.47, −0.48, −0.49, −0.50, −0.51, −0.52, −0.53, −0.54, −0.55, −0.56, −0.57, −0.58, −0.59, −0.60, −0.61, −0.62, −0.63, −0.64, −0.65, −0.66, −0.67, −0.68, −0.69, −0.70, −0.71, −0.72, −0.73, −0.74, −0.75, −0.76, −0.77, −0.78, −0.79, −0.80, −0.81, −0.82, −0.83, −0.84, −0.85, −0.86, −0.87, −0.88, −0.89, −0.90, −0.91, −0.92, −0.93, −0.94, −0.95, −0.96, −0.97, −0.98, −0.99, or −1.0, or any value included therein such as e.g. −0.001, −0.0001, −0.0001, etc.).
In some embodiments, a ΔΨ score may be expressed as an absolute value where the absolute value of e.g. −0.1 is 0.1.
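By way of non-limiting illustration, the delta psi calculation described above may be sketched as follows. The text does not fix which Ψ score is subtracted from which, so the convention used here (the second condition minus the first) is an assumption, as are the function and parameter names. The Ψ values in the example are hypothetical.

```python
def delta_psi(psi_a: float, psi_b: float, absolute: bool = False) -> float:
    """Return the difference between two psi scores for the same exon.

    Assumed convention: psi_b - psi_a, giving a value from -1 to 1.
    With absolute=True the magnitude is returned, giving a value from 0 to 1.
    """
    for psi in (psi_a, psi_b):
        if not 0.0 <= psi <= 1.0:
            raise ValueError("psi scores must be between 0 and 1")
    difference = psi_b - psi_a
    return abs(difference) if absolute else difference


if __name__ == "__main__":
    # Hypothetical psi scores for one exon in two tissues.
    print(delta_psi(0.5, 0.25))                  # signed difference
    print(delta_psi(0.5, 0.25, absolute=True))   # absolute value
```

The signed form retains the direction of the change between conditions, whereas the absolute form only reports its magnitude.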
In some embodiments, the alternatively-spliced exon is a tissue-specific alternatively-spliced exon. In some embodiments, one or more tissue-specific alternatively-spliced exons are included in a recombinant nucleic acid (e.g., in a rAAV). Non-limiting examples of tissue-specific alternatively-spliced exons are described in Supplemental Table S5 from Wang, E. T., et al., (2008), Nature, 456, 470-76, incorporated herein by reference. Other tissue-specific exons can be identified from transcriptome data. Non-limiting examples of RNA sequence motifs that can exhibit tissue-specific activity, thereby controlling the inclusion or exclusion of tissue-specific exons, are described in Badr, E., et al., (2016), PLOS One, 11(11): e0166978, incorporated herein by reference. In some embodiments, alternative splicing of the tissue-specific exon results in the expression of the transgene (e.g., of the product encoded by the coding region of interest) in heart tissue, but not in skeletal tissue. In some embodiments, alternative splicing of the tissue-specific exon results in the expression of the transgene (e.g., of the product encoded by the coding region of interest) in skeletal tissue, but not in heart tissue. In some embodiments, a tissue-specific alternatively-spliced exon comprises an alternatively-spliced exon from any one or more of: CAMK2B, PKP2, LGMN, NRAP, VPS39, KSR1, PDLIM3, BIN1, ARFGAP2, KIF13A, and/or PICALM.
In some embodiments, the tissue-specific alternatively-spliced exon is or is derived from exon 11 of BIN1. In some embodiments, the tissue-specific alternatively-spliced exon which is or is derived from exon 11 of BIN1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 37. In some embodiments, the tissue-specific alternatively-spliced exon which is or is derived from exon 11 of BIN1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 37. In some embodiments, the tissue-specific alternatively-spliced exon which is or is derived from exon 11 of BIN1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 38. In some embodiments, the tissue-specific alternatively-spliced exon which is or is derived from exon 11 of BIN1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 38.
In some embodiments, an alternatively-spliced exon is an immunoresponsive alternatively-spliced exon (e.g., undergoes alternative splicing in the presence of an enhanced immune response, such as an increased T cell presence). In some embodiments, the immunoresponsive alternatively-spliced exon is alternatively spliced in states of cellular inflammation. In some embodiments, the immunoresponsive alternatively-spliced exon is alternatively spliced when an abnormally elevated quantity of T cells is present in the intracellular environment (e.g., more T cells are present than under homeostatic conditions). In some embodiments, an immunoresponsive alternatively-spliced exon comprises an alternatively-spliced exon from any one of ABCC1, AK125149, ASCC2, BAT2D1, BBX, BRD8, BRE, C17orf70, CAMKK2, CBFB, CCAR1, CCDC7CD6, CHTF8, COL4A3BP, COL6A3, CUGBP1, CUGBP2, CXorf45, DENND3, DGUOK, DKFZp762G094, DNAJC7, DNASE1, EIF4A2, EIF4G2, EIF4H, EXOC7, EZH2, FAM120A, FAM136A, FAM36A, FARSB, FBXO38, FGFR1OP2, FIP1L1, FOXRED1, FUBP3, GALT, GATA3, GOLGA2, HIF1A, HMMR, HRB, IKZF1, ILF3, IRAK4, IRF1, KCTD13, LEF1, LUC7L, LYRM1, MALT1 e7, MAP2K7, MAP3K7, MAP4K2, MBNL2, MFF, NAE1, NCSTN, NR4A3, NRF1, NUP98, PARP6, PCM1, PLAUR, PLSCR3, PPIL5, PPP5C, PTPRC-E4, PTPRC-E6, PTS, RABL5, RAPH1, SEC16A, SFRS3, SFRS7, SLMAP, SNRNP70, STAT6, TBC1D1, TIMM8B, TIR8, TRA2A, TROVE2, UGCGL1, VAP-B, VAV1, ZNF384, and ZNF496.
In some embodiments, an alternatively-spliced exon is a cell type-specific alternatively-spliced exon (e.g., undergoes alternative splicing only when located in certain cell types). In some embodiments, a cell type-specific alternatively-spliced exon comprises an alternatively-spliced exon as described in Joglekar, et al. (2021), A spatially resolved brain region- and cell type-specific isoform atlas of the postnatal mouse brain, Nature Comm., 12(463), which is incorporated herein by reference with respect to its description of cell type-specific alternative exons.
In some embodiments, an alternatively-spliced exon is alternatively spliced in cells which exhibit high levels of expression of a particular protein. In some embodiments, an alternatively-spliced exon is alternatively spliced in cells which exhibit low levels of expression of a particular protein. High or low expression of a particular protein may in some embodiments be indicative of a disease state. For example, in some forms of frontotemporal dementia, MAPT exon 10 is aberrantly included, leading to increased levels of the 4R vs. 3R isoform. Increased 4R isoform is associated with neurodegeneration.
Accordingly, in some embodiments an alternatively-spliced exon is alternatively spliced in cells which exhibit disease (e.g., severe disease). In some embodiments, such disease comprises Dentatorubral-pallido-luysian atrophy (DRPLA), myotonic dystrophy type 1 (DM1), myotonic dystrophy type 2 (DM2), Fragile X syndrome of mental retardation (FMR1), Fragile X tremor ataxia syndrome (FXTAS), FRAXE mental retardation (FMR2), Friedreich's ataxia (FRDA), Huntington disease (HD), Huntington disease-like 2 (HDL2), Oculopharyngeal muscular dystrophy (OPMD), Myoclonic epilepsy type 1, Alzheimer's disease, ALS/FTD, spinocerebellar ataxia type 1 (SCA1), spinocerebellar ataxia type 2 (SCA2), spinocerebellar ataxia type 3 (SCA3), spinocerebellar ataxia type 6 (SCA6), spinocerebellar ataxia type 7 (SCA7), spinocerebellar ataxia type 8 (SCA8), spinocerebellar ataxia type 10 (SCA10), spinocerebellar ataxia type 12 (SCA12), spinocerebellar ataxia type 17 (SCA17), Syndromic/non-syndromic X-linked mental retardation, Emery-Dreifuss muscular dystrophy type 2, familial partial lipodystrophy, limb girdle muscular dystrophy type 1B, dilated cardiomyopathy, familial partial lipodystrophy, Charcot-Marie-Tooth disorder type 2B1, mandibuloacral dysplasia, childhood progeria syndrome (Hutchinson-Gilford syndrome), Werner syndrome, Dilated cardiomyopathy (DCM), Hypertrophic cardiomyopathy (HCM), Restrictive cardiomyopathy (RCM), Left Ventricular Non-compaction (LVNC), Arrhythmogenic Right Ventricular Dysplasia (ARVD), takotsubo cardiomyopathy, Duchenne muscular dystrophy, Becker muscular dystrophy, Limb-girdle muscular dystrophy, Facioscapulohumeral muscular dystrophy, Congenital muscular dystrophy, Oculopharyngeal muscular dystrophy, Distal muscular dystrophy, Emery-Dreifuss muscular dystrophy, dementia, Parkinson's disease (PD), a PD-related disorder, Prion disease, a motor neuron disease (MND), Progressive bulbar palsy (PBP), Progressive muscular atrophy (PMA), Primary lateral sclerosis (PLS), Spinal muscular atrophy (SMA), a bladder cancer, a breast cancer, a colorectal cancer, a kidney cancer, a lung cancer, a lymphoma, a melanoma, an oral cancer, an ovarian cancer, an oropharyngeal cancer, a pancreatic cancer, a prostate cancer, a thyroid cancer, a uterine cancer, Down syndrome, Prader-Willi Syndrome (PWS), Bloom Syndrome, Cockayne Syndrome Type I-216400, Cockayne Syndrome Type III, Cockayne Syndrome Type I, Hutchinson-Gilford Progeria Syndrome, Mandibuloacral Dysplasia with Type A Lipodystrophy, Progeria, Adult Onset Progeroid Syndrome, Neonatal Rothmund-Thomson Syndrome, Seip Syndrome, Werner Syndrome, Replication Focus-Forming Activity 1, myotubular myopathy, Danon Disease, and/or centronuclear myopathy.
In some embodiments, an alternatively-spliced exon comprises an exon which may be differentially spliced depending on the intracellular level of the protein encoded by the coding region associated with the alternatively-spliced exon.
In some embodiments, an alternatively-spliced exon comprises an alternatively-spliced exon comprising a polynucleotide sequence as set forth in any one of SEQ ID NOs: 23-44. In some embodiments, an alternatively-spliced exon comprises a polynucleotide sequence that is 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to any one of SEQ ID NOs: 23-44.
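By way of non-limiting illustration, a percent identity between a candidate exon sequence and a reference sequence may be sketched as follows. The sketch assumes the two sequences have already been aligned to equal length without gaps, which is a simplification; in practice a sequence alignment tool would typically be used first. The function name and the short sequences shown are hypothetical and do not correspond to any SEQ ID NO.

```python
def percent_identity(candidate: str, reference: str) -> float:
    """Percent of positions at which two pre-aligned, equal-length
    nucleotide sequences carry the same base (case-insensitive)."""
    if not candidate or len(candidate) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(candidate.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


if __name__ == "__main__":
    # Hypothetical 10-nucleotide sequences that differ at the last position.
    reference = "ACGTACGTAC"
    candidate = "ACGTACGTAA"
    identity = percent_identity(candidate, reference)
    print(identity, identity >= 90.0)
```

The comparison against a threshold such as 90% corresponds to the "at least" identity levels recited above.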
In some embodiments, the alternatively-spliced exon is retained in the spliced transcript. Retention of the alternatively-spliced exon in the spliced transcript occurs under the alternative splicing conditions specific to said alternatively-spliced exon as described herein. In some embodiments, wherein the alternatively-spliced exon cassette comprises more than one alternatively-spliced exon, the 5′-most alternatively-spliced exon is retained in the spliced transcript. In some embodiments, wherein the alternatively-spliced exon cassette comprises more than one alternatively-spliced exon, the 3′-most alternatively-spliced exon is included in the spliced transcript. In some embodiments, wherein the alternatively-spliced exon cassette comprises more than one alternatively-spliced exon, all alternatively-spliced exons are included in the spliced transcript.
In some embodiments, retention of the alternatively-spliced exon in the spliced transcript results in the productive expression of the transgene (e.g., productive translation of the protein). Expression of the product (e.g., therapeutic protein) encoded by the coding region of interest may in some embodiments be desirable. For example, in myotubular myopathy, expression of myotubularin 1 is depleted in skeletal muscle, and therefore restoration of myotubularin 1 in skeletal muscle is desirable. However, in some embodiments, expression of the product (e.g., therapeutic protein) encoded by the coding region of interest may be undesirable. For example, in myotubular myopathy, expression of myotubularin 1 in the heart may be undesirable. Accordingly, in some embodiments retention of the alternatively-spliced exon in the spliced transcript does not result in the productive expression of the transgene (e.g., no productive translation of the protein).
In some embodiments, the alternatively-spliced exon is located 5′ to the coding region of the transgene. In some embodiments, the alternatively-spliced exon is located 3′ to the coding region of the transgene. In some embodiments, the alternatively-spliced exon is located within the coding region of the transgene. In some embodiments, the alternatively-spliced exon is not located within the coding region of the transgene. In some embodiments, the alternatively-spliced exon is located 3′ to a constitutive exon. In some embodiments, the alternatively-spliced exon is located 5′ to a constitutive exon.
In some embodiments, the recombinant viral genomes of the present disclosure comprise one or more constitutive exons. In various embodiments, the alternatively-spliced exon and the one or more constitutive exons may be configured as a cassette (e.g., comprised within a transgene). In some embodiments, the transgene comprising an alternatively-spliced exon cassette comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 constitutive exons. In various embodiments, one or more constitutive exons may comprise a coding region of interest, or a portion thereof. In some embodiments, the constitutive exon is considered to be constitutive when it is present in all isoforms of spliced mRNAs resulting from the splicing of a pre-mRNA transcript.
A constitutive exon may in some embodiments be synthetic, but it need not be. A constitutive exon may be considered synthetic because it undergoes one or more nucleic acid modifications, relative to the wild-type constitutive exon. A nucleic acid modification may be a substitution or deletion of one or more nucleotides that form the nucleic acid sequence of the constitutive exon. In some embodiments, the modification comprises disrupting or deleting all native start codons located within the constitutive exon.
In some embodiments, the constitutive exon is considered to be synthetic when it is situated non-naturally (e.g., is linked to a coding sequence to which it would not be linked in wild-type or naturally-occurring conditions), relative to the wild-type constitutive exon (e.g., is heterologous). In some embodiments, the constitutive exon is considered to be synthetic when it (i) undergoes one or more nucleic acid modifications, and (ii) is situated non-naturally, relative to the wild-type constitutive exon.
In some embodiments, the constitutive exon is naturally occurring (e.g., does not comprise any nucleic acid modifications, relative to the wild-type constitutive exon). In some embodiments, the constitutive exon is a native exon associated with the coding region of the transgene. In some embodiments, the constitutive exon is from or is derived from the same gene as the alternatively-spliced exon.
In some embodiments, the constitutive exon is from or is derived from a constitutive exon of a gene selected from the group consisting of: MBNL1, MBNL2, MBNL3, hnRNP A1, hnRNP A2B1, hnRNP C, hnRNP D, hnRNP DL, hnRNP F, hnRNP H, hnRNP K, hnRNP L, hnRNP M, hnRNP R, hnRNP U, FUS, TDP43, PABPN1, ATXN2, TAF15, EWSR1, MATR3, TIA1, FMRP, MTM1, MTMR2, LAMP2, KIF5A, a microdystrophin-encoding gene, C9ORF72, HTT, DNM2, BIN1, RYR1, NEB, ACTA, TPM3, TPM2, TNNT2, CFL2, KBTBD13, KLHL40, KLHL41, LMOD3, MYPN, SEPN1, TTN, SPEG, MYH7, TK2, POLG1, GAA, AGL, PYGM, SLC22A5, OCTN2, ETF, ETFH, PNPLA2, a cytochrome b oxidase-encoding gene, a cytochrome c oxidase-encoding gene, CLCN1, SCN4A, DMPK, CNBP, MYOT, LMNA, CAV3, DNAJB6, DES, TNPO3, HNRPDL, CAPN3, DYSF, an alpha-sarcoglycan-encoding gene, a beta-sarcoglycan-encoding gene, a gamma-sarcoglycan-encoding gene, a delta-sarcoglycan-encoding gene, TCAP, TRIM32, FKRP, FXN, POMT1, FKTN, POMT2, POMGnT1, DAG1, ANO5, PLEC1, TRAPPC11, GMPPB, ISPD, LIMS2, POPDC1, TOR1AIP1, POGLUT2, LAMA2, COL6A1, POMT1, POMT2, DUX4, EMD, PAX7, PMP22, MPZ, MFN2, SMCHD1, SMN, Lamin A/C (LAMN), and/or GJB1.
In some embodiments, the constitutive exon is from or is derived from a constitutive exon of a gene(s) selected from the group consisting of: ABCC1, AK125149, ASCC2, BAT2D1, BBX, BRD8, BRE, C17orf70, CAMKK2, CBFB, CCAR1, CCDC7CD6, CHTF8, COL4A3BP, COL6A3, CUGBP1, CUGBP2, CXorf45, DENND3, DGUOK, DKFZp762G094, DNAJC7, DNASE1, EIF4A2, EIF4G2, EIF4H, EXOC7, EZH2, FAM120A, FAM136A, FAM36A, FARSB, FBXO38, FGFR1OP2, FIP1L1, FOXRED1, FUBP3, GALT, GATA3, GOLGA2, HIF1A, HMMR, HRB, IKZF1, ILF3, IRAK4, IRF1, KCTD13, LEF1, LUC7L, LYRM1, MALT1 e7, MAP2K7, MAP3K7, MAP4K2, MBNL2, MFF, NAE1, NCSTN, NR4A3, NRF1, NUP98, PARP6, PCM1, PLAUR, PLSCR3, PPIL5, PPP5C, PTPRC-E4, PTPRC-E6, PTS, RABL5, RAPH1, SEC16A, SFRS3, SFRS7, SLMAP, SNRNP70, STAT6, TBC1D1, TIMM8B, TIR8, TRA2A, TROVE2, UGCGL1, VAP-B, VAV1, ZNF384, and/or ZNF496.
In some embodiments, the constitutive exon is from or is derived from a constitutive exon of a gene(s) selected from the group consisting of: CAMK2B, PKP2, LGMN, NRAP, VPS39, KSR1, PDLIM3, BIN1, ARFGAP2, KIF13A, and/or PICALM. In some embodiments, the constitutive exon is from or is derived from a constitutive exon of SMN1. In some embodiments, the constitutive exon is from or is derived from exon 6 of SMN1. In some embodiments, the constitutive exon which is derived from SMN1 exon 6 is a fragment of (e.g., is truncated relative to) the wild-type or naturally occurring sequence of SMN1 exon 6. In some embodiments, the constitutive exon which is derived from SMN1 exon 6 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 102. In some embodiments, the constitutive exon which is derived from SMN1 exon 6 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 102.
In some embodiments, the constitutive exon is not a native exon associated with the coding region of the transgene. In some embodiments, the constitutive exon is neither from nor derived from the same gene as the alternatively-spliced exon.
In some embodiments, a constitutive exon is located 5′ to the alternatively-spliced exon. Additionally or alternatively, in some embodiments a constitutive exon is located 3′ to the alternatively-spliced exon. In some embodiments, a constitutive exon is located 5′ to the coding region of the transgene. Additionally or alternatively, in some embodiments a constitutive exon is located 3′ to the coding region of the transgene.
In some embodiments, the constitutive exon is retained in the spliced transcript (e.g., spliced in). In some embodiments, wherein the transgene comprising an alternatively-spliced exon cassette comprises more than one constitutive exon, the 5′-most constitutive exon is retained in the spliced transcript. In some embodiments, wherein the transgene comprising an alternatively-spliced exon cassette comprises more than one constitutive exon, the 3′-most constitutive exon is retained in the spliced transcript. In some embodiments, wherein the transgene comprising an alternatively-spliced exon cassette comprises more than one constitutive exon, all constitutive exons are retained in the spliced transcript. In some embodiments, the constitutive exon is excluded from the spliced transcript (e.g., spliced out).
(iii) Introns
In other embodiments, the recombinant viral genomes of the present disclosure comprise one or more introns. In various embodiments, the alternatively-spliced exon and the one or more introns (or portions thereof) may be configured as a cassette. In some embodiments, a nucleic acid (e.g., a nucleic acid comprising a recombinant viral genome) comprises an alternatively-spliced exon cassette encoding at least one transgene that contains at least one recombinant (e.g., engineered, truncated) intron that supports sufficient splice regulation of the transgene to be therapeutically effective. In some embodiments an alternatively-spliced exon cassette is an RNA molecule (e.g., a pre-mRNA) that contains one or more (e.g., two or more) recombinant (e.g., engineered; e.g., truncated) introns flanking one or more exons. In some embodiments, an alternatively-spliced exon cassette is a DNA molecule that encodes the RNA molecule containing one or more recombinant (e.g., engineered; e.g., truncated) introns. In some embodiments, a transgene comprising an alternatively-spliced exon cassette contains other regulatory sequences (e.g., promoters, 5′ or 3′ UTRs, or other regulatory sequences) in addition to the gene coding (e.g., protein coding) sequences and the at least one recombinant (e.g., engineered; e.g., truncated) intron for which splicing can be regulated, as described elsewhere herein.
Accordingly, in some embodiments, a recombinant viral genome of the present disclosure comprises a transgene comprising an alternatively-spliced exon cassette, wherein the alternatively-spliced exon cassette comprises among other components at least one intron (or portion thereof). In some embodiments, the intron is a flanking intron (or portion thereof). In some embodiments, the alternatively-spliced exon cassette comprises 1, 2, 3, 4, 5, 6, 7, or 8 flanking introns (or portion(s) thereof).
In some embodiments, an exon (e.g., an alternatively-spliced exon, or a constitutive exon) is flanked by one or more introns (e.g., flanking introns), or portion(s) thereof. In some embodiments, an alternatively-spliced exon is flanked by one or more introns (or portion(s) thereof). In some embodiments, an alternatively-spliced exon is flanked by one intron (or portion thereof). In some embodiments, wherein the alternatively-spliced exon is flanked by one intron, the flanking intron (or portion thereof) is located 3′ to the alternatively-spliced exon. In some embodiments, wherein the alternatively-spliced exon is flanked by one intron, the flanking intron (or portion thereof) is located 5′ to the alternatively-spliced exon. In some embodiments, an alternatively-spliced exon is flanked by two introns (or portions thereof). In some embodiments, wherein the alternatively-spliced exon cassette comprises more than one alternatively-spliced exon, each alternatively-spliced exon is flanked by at least one, and in some embodiments two, flanking intron(s) (or portion(s) thereof). In some embodiments, an intron is a native flanking intron or native flanking intronic sequence of the alternatively-spliced exon. In some embodiments, an intron is not a native flanking intron or native flanking intronic sequence of the alternatively-spliced exon.
In some embodiments, a constitutive exon is flanked by one or more introns (or portion(s) thereof). In some embodiments, a constitutive exon is flanked by one intron (or portion thereof). In some embodiments, wherein the constitutive exon is flanked by one intron, the flanking intron (or portion thereof) is located 3′ to the constitutive exon. In some embodiments, wherein the constitutive exon is flanked by one intron, the flanking intron (or portion thereof) is located 5′ to the constitutive exon. In some embodiments, a constitutive exon is flanked by two introns (or portions thereof). In some embodiments, wherein the alternatively-spliced exon cassette comprises more than one constitutive exon, each constitutive exon is flanked by at least one, and in some embodiments two, flanking intron(s) (or portion(s) thereof). In some embodiments, an intron is a native flanking intron or native flanking intronic sequence of the constitutive exon. In some embodiments, an intron is not a native flanking intron or native flanking intronic sequence of the constitutive exon.
In some embodiments, an intron is a natural intron, and comprises no modifications, relative to a native intron.
An intron or intronic sequence may in some embodiments be synthetic, but it need not be. A synthetic intron or intronic sequence may be considered synthetic because it undergoes one or more nucleic acid modifications, relative to the wild-type or native intron. A nucleic acid modification may be a substitution or deletion of one or more nucleotides that form the nucleic acid sequence of the intron or intronic sequence.
In some embodiments, an intron or intronic sequence is considered to be synthetic when it is situated non-naturally (e.g., is linked to an exon to which it would not be linked in wild-type or naturally-occurring conditions), relative to the wild-type intron or intronic sequence (e.g., is heterologous). In some embodiments, the intron or intronic sequence is considered to be synthetic when it (i) undergoes one or more nucleic acid modifications, and (ii) is situated non-naturally, relative to the wild-type intron or intronic sequence.
In some embodiments, an intron (e.g., a flanking intron) (or portion thereof) comprising one or more nucleic acid modifications, relative to the wild-type intron, is an engineered intron or intronic sequence. In some embodiments, the engineered intron or intronic sequence comprises a splice donor and splice acceptor site, and a functional branch point to which the splice donor site can be joined in the first trans-esterification reaction of splicing.
In some embodiments, an intron (e.g., a flanking intron) or intronic sequence comprising one or more nucleic acid modifications, relative to the wild-type intron, comprises a truncated version of a natural intron. By “truncated version of a natural intron”, it is meant that the naturally-occurring, full-length intron is shortened (e.g., truncated) via the removal of nucleotides. In some embodiments, an engineered (e.g., recombinant) intron or intronic sequence is a truncated version of a natural intron. However, in some embodiments an engineered intron or intronic sequence can be designed to include functional splice donor and acceptor sites and a functional branch point in addition to one or more regulatory regions that are derived from different introns, or that are non-naturally occurring sequences (e.g., sequence variants of naturally-occurring sequences, consensus sequences, or de novo designed sequences). Accordingly, in some embodiments an engineered intron or intronic sequence is not a truncated version of a naturally occurring intron, but contains one or more sequences from a naturally occurring intron.
In some embodiments, an intron (e.g., a flanking intron) (or portion thereof) comprising one or more nucleic acid modifications, relative to the wild-type intron, is truncated at its 5′ end. In some embodiments, 1-10,000 nucleotides are truncated from the 5′ end (e.g., 1-50, 50-100, 100-500, 500-1,000, 1,000-5,000, 5,000-10,000, 10,000-20,000, 20,000-50,000, or 50,000-100,000 nucleotides are truncated from the 5′ end). In some embodiments, the 5′ splice site is not retained in the truncated intron (or portion thereof). In some embodiments, the 5′ splice site is retained in the truncated intron (or portion thereof). In some embodiments, a different 5′ splice site is included in the truncated intron (or portion thereof).
In some embodiments, an intron (e.g., a flanking intron) (or portion thereof) comprising one or more nucleic acid modifications, relative to the wild-type intron, is truncated at its 3′ end. In some embodiments, 1-10,000 nucleotides are truncated from the 3′ end (e.g., 1-50, 50-100, 100-500, 500-1,000, 1,000-5,000, 5,000-10,000, 10,000-20,000, 20,000-50,000, or 50,000-100,000 nucleotides are truncated from the 3′ end). In some embodiments, the 3′ splice site is not retained in the truncated intron (or portion thereof). In some embodiments, the 3′ splice site is retained in the truncated intron (or portion thereof). In some embodiments, a different 3′ splice site is included in the truncated intron (or portion thereof).
In some embodiments, an intron (e.g., a flanking intron) (or portion thereof) comprising one or more nucleic acid modifications, relative to the wild-type intron, is truncated at one or more internal locations. In some embodiments, 1-10,000 internal nucleotides are removed (e.g., 1-50, 50-100, 100-500, 500-1,000, 1,000-5,000, 5,000-10,000, 10,000-20,000, 20,000-50,000, or 50,000-100,000 internal nucleotides are removed). In some embodiments, the splice regulatory region is not retained in the truncated intron (or portion thereof). In some embodiments, the splice regulatory region is retained in the truncated intron (or portion thereof). In some embodiments, a different splice regulatory region is included in the truncated intron (or portion thereof).
In some embodiments, an intron (e.g., a flanking intron) (or portion thereof) comprising one or more nucleic acid modifications, relative to the wild-type intron, comprises one or more 5′, 3′, and/or internal deletions. It should be understood that the extent of truncation may depend on the size of the intron (or portion thereof) and the size of the gene. A truncation may require removal of sufficient intronic sequence to result in a recombinant gene construct that is small enough to be packaged in a recombinant virus of interest (e.g., in a recombinant AAV or lentivirus).
However, an intron typically includes one or more sequences required for efficient splicing and/or regulated splicing. In some embodiments, an intron or intronic sequence comprises one or more splice junction sites (e.g., a 5′ splice donor site, and/or a 3′ splice acceptor site). In some embodiments, an intron or intronic sequence retains a splice donor site (e.g., towards the 5′ end of the intron or intronic sequence), a branch site (e.g., towards the 3′ end of the intron or intronic sequence), a splice acceptor site (e.g., at the 3′ end of the intron or intronic sequence), and a splice regulatory sequence. In some embodiments, the intron or intronic sequence comprises a 5′ splice donor site. In some embodiments, the 5′ splice donor site is a GU or an AU. In some embodiments, the intron or intronic sequence comprises a 3′ splice acceptor site. In some embodiments, the 3′ splice acceptor site is an AG or an AC. In some embodiments, an intron or intronic sequence comprises at its 5′ end a 5′ splice donor site and at its 3′ end a 3′ splice acceptor site. In some embodiments, a regulatory sequence comprises a response element within an AG exclusion zone of the intron. In some embodiments, the intron or intronic sequence retains sequence motifs bound by the encoded protein (e.g., YGCY motifs for MBNL1, or GCAUG for RBFOX, or YCAY for NOVA, etc.). In some embodiments, an intron or intronic sequence is spliced out, and is not included in the spliced transcript.
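By way of illustration only, the terminal dinucleotides and binding motifs named in the preceding paragraph can be checked computationally. The following is a minimal sketch, assuming the intron is supplied as a plain RNA (or DNA) string; the function names, the `MOTIFS` dictionary, and the toy sequence are hypothetical and are not taken from the Sequence Listing.

```python
import re

# IUPAC codes needed for the motifs named in the text: Y = pyrimidine (C or U).
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "Y": "[CU]"}

# Motifs quoted in the paragraph above; the dictionary itself is illustrative.
MOTIFS = {"MBNL1": "YGCY", "RBFOX": "GCAUG", "NOVA": "YCAY"}


def motif_positions(rna, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    pattern = "(?=" + "".join(IUPAC[base] for base in motif) + ")"
    return [m.start() for m in re.finditer(pattern, rna)]


def describe_intron(seq):
    """Report terminal dinucleotides and motif hits for one intron sequence."""
    rna = seq.upper().replace("T", "U")
    return {
        "donor_ok": rna[:2] in ("GU", "AU"),      # 5' splice donor site
        "acceptor_ok": rna[-2:] in ("AG", "AC"),  # 3' splice acceptor site
        "motifs": {name: motif_positions(rna, m) for name, m in MOTIFS.items()},
    }


# Hypothetical toy intron, for demonstration only.
toy_intron = "GUAAGUCUGCUUGCAUGUUUCAUCUAACUUUUUUCAG"
report = describe_intron(toy_intron)
print(report)
```

Such a check may be useful after truncation, to confirm that the shortened intron still begins and ends with the expected dinucleotides and still carries the intended motifs.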
In some embodiments, an intron or intronic sequence may include one or more human, non-human primate, and/or other mammalian or non-mammalian intron splice-regulatory sequences. In some embodiments, the regulatory sequences may have 80%-100% (e.g., 80-85%, 85%-90%, greater than 90%, 90%-95%, or 95%-100%) sequence identity, relative to a wild-type regulatory sequence.
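As a non-limiting illustration of the sequence identity ranges recited above, percent identity may be computed over the columns of a pairwise alignment. The sketch below assumes the two sequences have already been aligned to equal length with `-` marking gaps, and counts a gap opposite a nucleotide as a mismatch; the disclosure does not prescribe a particular alignment method or gap treatment, so both choices are assumptions, and the sequences shown are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns; '-' marks a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical wild-type and variant regulatory sequences (already aligned).
wild_type = "GUAAGUACUGCAUGCUUCAG"
variant = "GUAAGUACUGCAUGAUU-AG"
identity = percent_identity(wild_type, variant)
meets_80 = identity >= 80.0
print(identity, meets_80)
```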
In some embodiments, an intron or intronic sequence is approximately 50 to 4000 nucleotides long. In some embodiments, an intron or intronic sequence is approximately 50 to 100, 75-125, 100-150, 125-175, 200-250, 225-275, 300-350, 325-375, 400-450, 425-475, 500-550, 525-575, 600-650, 625-675, 700-750, 725-775, 800-850, 825-875, 900-950, 925-975, 950-1000, 1025-1075, 1050 to 1100, 1075-1125, 1100-1150, 1125-1175, 1200-1250, 1225-1275, 1300-1350, 1325-1375, 1400-1450, 1425-1475, 1500-1550, 1525-1575, 1600-1650, 1625-1675, 1700-1750, 1725-1775, 1800-1850, 1825-1875, 1900-1950, 1925-1975, 1950-2000, 2025-2075, 2050 to 2100, 2075-2125, 2100-2150, 2125-2175, 2200-2250, 2225-2275, 2300-2350, 2325-2375, 2400-2450, 2425-2475, 2500-2550, 2525-2575, 2600-2650, 2625-2675, 2700-2750, 2725-2775, 2800-2850, 2825-2875, 2900-2950, 2925-2975, 2950-3000, 3025-3075, 3050 to 3100, 3075-3125, 3100-3150, 3125-3175, 3200-3250, 3225-3275, 3300-3350, 3325-3375, 3400-3450, 3425-3475, 3500-3550, 3525-3575, 3600-3650, 3625-3675, 3700-3750, 3725-3775, 3800-3850, 3825-3875, 3900-3950, 3925-3975, or 3950-4000 nucleotides long, or any integer contained therein (e.g., 51, 52, 53, 54, 55, etc.). In some embodiments, an intron or intronic sequence is approximately 50-60, 55-65, 60-70, 65-75, 70-80, 75-85, 80-90, 95-105, 100-110, 105-115, 110-120, 115-125, 120-130, 125-135, 130-140, 135-145, 140-150, 145-155, 150-160, 155-165, 160-170, 165-175, 170-180, 175-185, 180-190, 185-195, or 190-200 nucleotides long, or any integer contained therein (e.g., 100, 101, 102, 103, 104, 105, etc.). In some embodiments, an intron or intronic sequence is approximately 50-80, 60-90, 70-100, 80-110, 90-120, 100-130, 110-140, 120-150, 130-160, 140-170, 150-180, 160-190, or 170-200 nucleotides long, or any integer contained therein (e.g., 120, 121, 122, 123, 124, 125, etc.).
In some embodiments, a natural or wild-type intron is truncated or otherwise modified so as to retain only the sequence which regulates the up- or down-stream alternative exon. In some embodiments, said regulatory sequence is located within approximately 100-300 nucleotides upstream or downstream of the exon-intron (or intron-exon) border. In some embodiments, said regulatory sequence is located within approximately 100-110, 105-115, 110-120, 115-125, 120-130, 125-135, 130-140, 135-145, 140-150, 145-155, 150-160, 155-165, 160-170, 165-175, 170-180, 175-185, 180-190, 185-195, 190-200, 205-215, 210-220, 215-225, 220-230, 225-235, 230-240, 235-245, 240-250, 245-255, 250-260, 255-265, 260-270, 265-275, 270-280, 275-285, 280-290, 285-295, or 290-300 nucleotides upstream or downstream of the exon-intron (or intron-exon) border. In some embodiments, said regulatory sequence is located within approximately 100-130, 110-140, 120-150, 130-160, 140-170, 150-180, 160-190, 170-200, 210-240, 220-250, 230-260, 240-270, 250-280, 260-290, or 270-300 nucleotides upstream or downstream of the exon-intron (or intron-exon) border.
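One way to picture the truncation described above is as an internal deletion that keeps a fixed window of nucleotides adjacent to each exon-intron border. The following is a minimal sketch under that assumption; the function name, the window sizes, and the toy intron are hypothetical and chosen only to fall within the approximately 100-300 nucleotide range mentioned above.

```python
def truncate_intron(intron, keep_5prime, keep_3prime):
    """Remove internal nucleotides, keeping a window at each end of the intron."""
    if keep_5prime < 0 or keep_3prime < 0:
        raise ValueError("window sizes must be non-negative")
    if keep_5prime + keep_3prime >= len(intron):
        return intron  # intron is already short enough; nothing is removed
    return intron[:keep_5prime] + intron[len(intron) - keep_3prime:]


# Hypothetical 1,000-nucleotide intron with GU...AG termini.
toy_intron = "GU" + "A" * 996 + "AG"
truncated = truncate_intron(toy_intron, keep_5prime=200, keep_3prime=200)
removed = len(toy_intron) - len(truncated)
print(len(truncated), removed)
```

In this sketch the 5′ splice donor site and 3′ splice acceptor site are retained because they lie inside the kept windows; a branch site or regulatory motif lying outside the windows would be lost, so the window sizes would need to be chosen with the location of those elements in mind.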
In some embodiments, the only intron that is comprised within an alternatively-spliced exon cassette is a truncated regulated intron. A regulated intron may in some embodiments be a regulated intron that flanks the alternative exon in its natural or wild-type context. In some embodiments, two regulated introns flank the alternative exon in its natural or wild-type context. A regulated intron may be located 5′ or 3′ relative to the alternative exon in its natural or wild-type context. In some embodiments, a regulated intron or truncated regulated intron is 5′ relative to the alternative exon within an alternative exon cassette of the disclosure. In some embodiments, a regulated intron or truncated regulated intron is 3′ relative to the alternative exon within an alternative exon cassette of the disclosure. In some embodiments, two or more regulated introns are retained and truncated in an alternatively-spliced exon cassette. In some embodiments, the two or more truncated regulated introns flank the alternative exon within the alternative exon cassette. In some embodiments, all other (e.g., non-regulatory) introns and intronic sequences have been removed. However, in some embodiments, one or more of the other introns (e.g., the introns that are not subject to regulated splicing) or intronic sequences may be retained (and optionally truncated) depending on the size of the nucleic acid and the size limitations of the virus, respectively. In some embodiments, the only introns or intronic sequences in an alternatively-spliced exon cassette are truncated introns or intronic sequences (e.g., only one, 2, 3, 4, 5, 6, 7, 8, 9, 10 truncated introns or intronic sequences). In some embodiments, an alternatively-spliced exon cassette does not contain any full-length introns. In some embodiments, an alternatively-spliced exon cassette does not contain any truncated introns or intronic sequences that are not regulated.
In some embodiments, the intron(s) or intronic sequence(s) flanking an alternative exon(s) comprise an intron or intronic sequence from or derived from a gene selected from the group consisting of: ABCC1, AK125149, ASCC2, BAT2D1, BBX, BRD8, BRE, C17orf70, CAMKK2, CBFB, CCAR1, CCDC7, CD6, CHTF8, COL4A3BP, COL6A3, CUGBP1, CUGBP2, CXorf45, DENND3, DGUOK, DKFZp762G094, DNAJC7, DNASE1, EIF4A2, EIF4G2, EIF4H, EXOC7, EZH2, FAM120A, FAM136A, FAM36A, FARSB, FBXO38, FGFR1OP2, FIP1L1, FOXRED1, FUBP3, GALT, GATA3, GOLGA2, HIF1A, HMMR, HRB, IKZF1, ILF3, IRAK4, IRF1, KCTD13, LEF1, LUC7L, LYRM1, MALT1 e7, MAP2K7, MAP3K7, MAP4K2, MBNL2, MFF, NAE1, NCSTN, NR4A3, NRF1, NUP98, PARP6, PCM1, PLAUR, PLSCR3, PPIL5, PPP5C, PTPRC-E4, PTPRC-E6, PTS, RABL5, RAPH1, SEC16A, SFRS3, SFRS7, SLMAP, SNRNP70, STAT6, TBC1D1, TIMM8B, TIR8, TRA2A, TROVE2, UGCGL1, VAP-B, VAV1, ZNF384, and/or ZNF496.
In some embodiments, the intron(s) or intronic sequence(s) flanking an alternative exon(s) comprise an intron or intronic sequence from or derived from a gene selected from the group consisting of: CAMK2B, PKP2, LGMN, NRAP, VPS39, KSR1, PDLIM3, BIN1, ARFGAP2, KIF13A, and/or PICALM.
In some embodiments, the intron(s) or intronic sequence(s) flanking an alternative exon(s) is or is derived from an intron of BIN1. In some embodiments, the intron(s) or intronic sequence(s) flanking an alternative exon(s) is or is derived from intron 10 and/or intron 11 of BIN1. In some embodiments, intron(s) or intronic sequence(s) flanking an alternative exon(s) which is or is derived from intron 10 of BIN1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 15. In some embodiments, the intron(s) or intronic sequence(s) flanking an alternative exon(s) which is or is derived from intron 10 of BIN1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 15. In some embodiments, the intron(s) or intronic sequence(s) flanking an alternative exon(s) which is or is derived from intron 11 of BIN1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 16. In some embodiments, the intron(s) or intronic sequence(s) flanking an alternative exon(s) which is or is derived from intron 11 of BIN1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 16.
In some embodiments, the intron(s) or intronic sequence(s) flanking an alternative exon(s) comprise an intron or intronic sequence comprising a polynucleotide sequence as set forth in any one of SEQ ID NOs: 1-22, 103, and 104. In some embodiments, an intron or intronic sequence comprises a polynucleotide sequence that is 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to any one of SEQ ID NOs: 1-22, 103, and 104.
In some embodiments, all the introns (or portion(s) thereof) and exons (or portion thereof) of an alternatively-spliced exon cassette are from the same gene. Some embodiments of the present invention contemplate heterologous gene constructs, wherein introns (or portion(s) thereof) and exons (or portion(s) thereof) from different genes are integrated into a single alternatively-spliced exon cassette or transgene. In some embodiments, at least one intron (or portion thereof) and at least one exon (or portion thereof) of the nucleic acid construct are from different genes.
In some embodiments, an intron (or portion thereof) and/or an exon (or portion thereof) is from or derived from a gene(s) which comprises any one or more of: MBNL1, MBNL2, MBNL3, hnRNP A1, hnRNP A2B1, hnRNP C, hnRNP D, hnRNP DL, hnRNP F, hnRNP H, hnRNP K, hnRNP L, hnRNP M, hnRNP R, hnRNP U, FUS, TDP43, PABPN1, ATXN2, TAF15, EWSR1, MATR3, TIA1, FMRP, MTM1, MTMR2, LAMP2, KIF5A, a microdystrophin-encoding gene, C9ORF72, HTT, DNM2, BIN1, RYR1, NEB, ACTA, TPM3, TPM2, TNNT2, CFL2, KBTBD13, KLHL40, KLHL41, LMOD3, MYPN, SEPN1, TTN, SPEG, MYH7, TK2, POLG1, GAA, AGL, PYGM, SLC22A5, OCTN2, ETF, ETFH, PNPLA2, a cytochrome b oxidase-encoding gene, a cytochrome c oxidase-encoding gene, CLCN1, SCN4A, DMPK, CNBP, MYOT, LMNA, CAV3, DNAJB6, DES, TNPO3, HNRPDL, CAPN3, DYSF, an alpha-sarcoglycan-encoding gene, a beta-sarcoglycan-encoding gene, a gamma-sarcoglycan-encoding gene, a delta-sarcoglycan-encoding gene, TCAP, TRIM32, FKRP, FXN, POMT1, FKTN, POMT2, POMGnT1, DAG1, ANO5, PLEC1, TRAPPC11, GMPPB, ISPD, LIMS2, POPDC1, TOR1AIP1, POGLUT2, LAMA2, COL6A1, POMT1, POMT2, DUX4, EMD, PAX7, PMP22, MPZ, MFN2, SMCHD1, SMN1, and/or GJB1.
In some embodiments, an intron (or portion thereof) and/or an exon (or portion thereof) is from or derived from a gene(s) which comprises any one or more of: ABCC1, AK125149, ASCC2, BAT2D1, BBX, BRD8, BRE, C17orf70, CAMKK2, CBFB, CCAR1, CCDC7, CD6, CHTF8, COL4A3BP, COL6A3, CUGBP1, CUGBP2, CXorf45, DENND3, DGUOK, DKFZp762G094, DNAJC7, DNASE1, EIF4A2, EIF4G2, EIF4H, EXOC7, EZH2, FAM120A, FAM136A, FAM36A, FARSB, FBXO38, FGFR1OP2, FIP1L1, FOXRED1, FUBP3, GALT, GATA3, GOLGA2, HIF1A, HMMR, HRB, IKZF1, ILF3, IRAK4, IRF1, KCTD13, LEF1, LUC7L, LYRM1, MALT1 e7, MAP2K7, MAP3K7, MAP4K2, MBNL2, MFF, NAE1, NCSTN, NR4A3, NRF1, NUP98, PARP6, PCM1, PLAUR, PLSCR3, PPIL5, PPP5C, PTPRC-E4, PTPRC-E6, PTS, RABL5, RAPH1, SEC16A, SFRS3, SFRS7, SLMAP, SNRNP70, STAT6, TBC1D1, TIMM8B, TIR8, TRA2A, TROVE2, UGCGL1, VAP-B, VAV1, ZNF384, and/or ZNF496.
In some embodiments, an intron (or portion thereof) and/or an exon (or portion thereof) is from or derived from a gene(s) which comprises any one or more of: CAMK2B, PKP2, LGMN, NRAP, VPS39, KSR1, PDLIM3, BIN1, ARFGAP2, KIF13A, and/or PICALM.
In some embodiments, one or more introns (or portions thereof) and/or an exon (or portion thereof) is from or derived from BIN1.
In some embodiments, the one or more introns (or portions thereof) is or is derived from an intron(s) of BIN1. In some embodiments, the one or more introns (or portions thereof) is or is derived from intron 10 and/or intron 11 of BIN1. In some embodiments, the one or more introns (or portions thereof) which is or is derived from intron 10 of BIN1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 15. In some embodiments, the one or more introns (or portions thereof) which is or is derived from intron 10 of BIN1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 15. In some embodiments, the one or more introns (or portions thereof) which is or is derived from intron 11 of BIN1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 16. In some embodiments, the one or more introns (or portions thereof) which is or is derived from intron 11 of BIN1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 16.
In some embodiments, an exon (or portion thereof) is or is derived from exon 11 of BIN1. In some embodiments, the exon (or portion thereof) which is or is derived from exon 11 of BIN1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 37. In some embodiments, the exon (or portion thereof) which is or is derived from exon 11 of BIN1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 37. In some embodiments, the exon (or portion thereof) which is or is derived from exon 11 of BIN1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 38. In some embodiments, the exon (or portion thereof) which is or is derived from exon 11 of BIN1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 38.
In some embodiments, the one or more introns (or portions thereof) and/or the exon (or portion thereof) which are from or derived from BIN1 together comprise an alternative exon cassette. In some embodiments, the alternative exon cassette (which comprises the one or more introns (or portions thereof) and/or the exon (or portion thereof) which are from or derived from BIN1) comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in any one of SEQ ID NOs: 107-778. In some embodiments, the alternative exon cassette (which comprises the one or more introns (or portions thereof) and/or the exon (or portion thereof) which are from or derived from BIN1) comprises a polynucleotide having a nucleic acid sequence as set forth in any one of SEQ ID NOs: 107-778.
In some embodiments, an alternative exon cassette (e.g., which comprises the one or more introns (or portions thereof) and/or the exon (or portion thereof) which are from or derived from BIN1) is selected for inclusion in a transgene based on the psi values which the alternative exon cassette achieves in a specific tissue of interest (see, e.g., Table 4; Table 5). For example, if the coding region of the transgene encodes a protein which would be therapeutically useful in skeletal tissue (e.g., MTM1), but which would not be desirable to express in heart tissue, the alternative exon cassette selected for inclusion in a transgene would be one wherein a high psi value is observed for skeletal tissue, and wherein a low psi value is observed for heart tissue (e.g., the Δ psi between skeletal tissue and heart tissue is large). In some embodiments, wherein the coding region of the transgene encodes a protein which would be therapeutically useful in skeletal tissue (e.g., MTM1), but which would not be desirable to express in heart tissue, the alternative exon cassette selected for inclusion in a transgene would be one wherein a high psi value is observed for skeletal tissue. In some embodiments, wherein the coding region of the transgene encodes a protein which would be therapeutically useful in skeletal tissue (e.g., MTM1), but which would not be desirable to express in heart tissue, the alternative exon cassette selected for inclusion in a transgene would be one wherein a low psi value is observed for heart tissue. As will be understood, the alternative exon cassette which is included in a transgene may be selected based on a variety of factors including, but not limited to: the identity of the protein cargo to be encoded by the coding region of interest; the Δ psi observed between a first tissue (or condition, etc.) which is of interest and a second tissue (or condition, etc.) which is not of interest; the psi observed in a tissue (or condition, etc.) which is of interest; and/or the psi observed in a tissue (or condition, etc.) which is not of interest. However, various other factors may also impact which alternative exon cassette is selected for inclusion in a transgene, as described throughout the disclosure.
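The selection logic described above can be sketched in a few lines. The example below assumes psi is computed as the fraction of transcripts (or reads) that include the alternative exon, and ranks candidate cassettes by the difference in psi between a tissue of interest and a tissue not of interest. The cassette names and psi values are hypothetical placeholders and are not the values reported in Table 4 or Table 5.

```python
def psi(inclusion_reads, exclusion_reads):
    """Fraction of reads supporting inclusion of the alternative exon."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads cover the splicing event")
    return inclusion_reads / total


def rank_cassettes(psi_by_cassette, target, off_target):
    """Rank cassettes by delta psi = psi(target) - psi(off_target), largest first."""
    scored = [
        (values[target] - values[off_target], name)
        for name, values in psi_by_cassette.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, round(delta, 3)) for delta, name in scored]


# Hypothetical psi values, for illustration only.
toy_psi = {
    "cassette_A": {"skeletal": 0.90, "heart": 0.10},
    "cassette_B": {"skeletal": 0.85, "heart": 0.60},
    "cassette_C": {"skeletal": 0.40, "heart": 0.05},
}
ranking = rank_cassettes(toy_psi, target="skeletal", off_target="heart")
best = ranking[0][0]
print(ranking)
```

Ranking by delta psi alone is only one of the factors listed above; a cassette with a large delta psi but a modest psi in the tissue of interest (such as the hypothetical cassette_C) might still be a poor choice if high expression is required.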
In some embodiments, an intron (or portion thereof) and/or an exon (or portion thereof) is from or derived from SMN1.
In some embodiments, an intron(s) is or is derived from intron 6 and/or intron 7 of SMN1. In some embodiments, the intron which is derived from SMN1 intron 6 is a fragment of (e.g., is truncated relative to) the wild-type or naturally occurring sequence of SMN1 intron 6. In some embodiments, the intron which is derived from SMN1 intron 6 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 103. In some embodiments, the intron which is derived from SMN1 intron 6 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 103. In some embodiments, the intron which is derived from SMN1 intron 7 is a fragment of (e.g., is truncated relative to) the wild-type or naturally occurring sequence of SMN1 intron 7. In some embodiments, the intron which is derived from SMN1 intron 7 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 104. In some embodiments, the intron which is derived from SMN1 intron 7 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 104.
In some embodiments, an exon is or is derived from exon 6 of SMN1. In some embodiments, the exon which is derived from SMN1 exon 6 is a fragment of (e.g., is truncated relative to) the wild-type or naturally occurring sequence of SMN1 exon 6. In some embodiments, the exon which is derived from SMN1 exon 6 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 102. In some embodiments, the exon which is derived from SMN1 exon 6 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 102.
In other embodiments, the recombinant viral genomes of the present disclosure comprise one or more regulatory sequences. In some embodiments, the regulatory sequences impart a positive control on the expression of a coding sequence of interest. In other embodiments, the regulatory sequences impart a negative control on the expression of a coding sequence of interest. Regulatory sequences may be present, inserted, or otherwise included in an alternatively-spliced exon. Such sequences may be referred to as positive or negative regulatory control cis-elements or “regulatory cis-elements” or merely as “cis-elements.”
The one or more cis-elements located within an alternatively-spliced exon and which may influence the level of expression of a coding region of interest through positive and/or negative controls may comprehensively include any genetic element which exerts, as a consequence of being spliced-in or spliced-out of the final mRNA, either a positive or negative regulation on the expression of the coding region. Non-limiting examples of positive or negative regulatory cis-elements located within the alternatively-spliced exons can include a translation start codon, a translation stop codon, a binding site for an RNA binding protein that serves to positively regulate mRNA translation, a binding site for an RNA binding protein that serves to negatively regulate mRNA translation, a binding site for a nucleic acid molecule (e.g., an miRNA) that serves to positively regulate mRNA translation, a binding site for a nucleic acid molecule (e.g., an siRNA) that serves to negatively regulate mRNA translation, a binding site for an RNA binding protein that serves to positively regulate mRNA stability or degradation, a binding site for an RNA binding protein that serves to negatively regulate mRNA stability or degradation, a binding site for a nucleic acid molecule (e.g., an miRNA) that serves to positively regulate mRNA stability or degradation, a binding site for a nucleic acid molecule (e.g., an siRNA) that serves to negatively regulate mRNA stability or degradation, a nuclease recognition site, a sequence that can form a secondary structure that slows down translation (for example, a stem loop that delays the ribosome), or a sequence that can form a secondary structure that promotes translation. This list of examples is not intended to place any limitation on the scope and meaning of the positive and negative cis-elements, and the disclosure embraces any genetic element or region positioned within or at least associated with an alternatively-spliced exon which exerts a positive or negative control on the overall expression of a coding region of the transgene (e.g., encoding a therapeutic protein).
In some cases, the cis-element is located within the alternatively-spliced exon, but in other cases, the cis-element is separate from, but at least associated with, the alternatively-spliced exon, such that it is spliced-in or spliced-out at the same time as the alternatively-spliced exon. Non-limiting examples of positive or negative regulatory cis-elements can include, for instance, (1) a nucleotide sequence element that regulates, modulates, or otherwise affects the stability and/or degradation of a mRNA; and (2) a nucleotide sequence element that regulates, modulates, or otherwise affects the translation of a mRNA into one or more encoded polypeptide products (e.g., a therapeutic product).
In some embodiments, the one or more cis-elements can include, but are not limited to, a translation start codon, a translation stop codon, an siRNA binding site, a miRNA binding site, a sequence forming a stem-loop structure, a sequence forming an RNA dimerization motif, a sequence forming a hairpin structure, a sequence forming an RNA quadruplex, polypurine tract, a sequence forming a pair of kissing loops, and a sequence forming a tetraloop/tetraloop receptor pair. In some embodiments, cis-elements include binding sites recognized by regulatory elements, such as, for example, RNA binding proteins.
In some embodiments, an RNA binding protein may be involved in binding to one or more positive or negative cis-elements and, as such, may be involved in regulating the expression of the coding region of interest.
In some embodiments, the RNA binding protein is a sequence-specific RNA binding protein. In some embodiments, a useful sequence-specific RNA binding protein binds to a target sequence with a binding affinity (e.g., Kd) of 0.01-1000 nM or less (e.g., 0.01 to 1, 1-10, 10-50, 50-100, 100-500, 500-1,000 nM). In some embodiments, an RNA binding protein has serine/arginine domains that act as splicing enhancers, or glycine-rich domains that act as splicing repressors. In some embodiments, an RNA binding protein acts as an intronic splicing enhancer, intronic splicing silencer, exonic splicing enhancer, or exonic splicing silencer.
Different types of sequence-specific RNA binding proteins (RBPs) can be used. In some embodiments, a sequence-specific RNA binding protein is one that contains zinc fingers, RNA recognition motifs, KH domains, DEAD-box domains, or dsRBDs. Non-limiting examples of RBPs that contain zinc fingers include MBNL, TIS11, and TTP. Non-limiting examples of RBPs that contain RNA recognition motifs include hnRNPs, SR proteins, RbFox, PTB, and Tra2beta. Non-limiting examples of RBPs that contain KH domains include Nova, SF1, and FBP. Non-limiting examples of RBPs that contain DEAD-box domains include DDX5, DDX6, and DDX17. Non-limiting examples of RBPs that contain dsRBDs include ADAR, Staufen, and TRBP.
Further examples of these types of RNA binding proteins and their respective sequence specific binding motifs are known in the art, and can be found, for example, in Perez-Perri, J. I., et al., (2018), Nat. Comm., 9:4408; Van Nostrand, E. L., et al., (2020), Nature, 583, 711-19; and Corley, M., et al., (2020), Cell, (20): 30159-3, the contents of which are hereby incorporated by reference with respect to RNA protein binding sites and RNA binding proteins.
In some embodiments, the recombinant viral vector genomes may further comprise one or more regulatory sequences and/or genes encoding factors that regulate splicing, including splicing of the alternatively-spliced exon.
In some embodiments, the regulatory gene encodes a tissue-specific RNA binding protein, an autoregulatory RNA binding protein, or a condition-specific RNA binding protein. In some embodiments, the protein auto-regulates splicing of the mRNA encoded by the recombinant viral genome. In some embodiments, splicing can be regulated by two or more different splice regulatory proteins that bind to splicing regulatory regions. For example, in some embodiments, NRAP exon 12 is highly included in skeletal muscle but absent in heart. In some embodiments, inclusion of TPM2 exon 2 is low in heart but high in smooth muscle. In some embodiments, SLC25A3 is very high in heart but low in brain. Many other examples can be found in the literature, and one example of a list of such “switch-like exons” can be found in Wang, E. T., et al., (2008), Nature, 456(7221):470-6. Such sequences may be included in the recombinant viral genomes to further regulate splicing under certain desired conditions.
In some embodiments, the recombinant viral genome may further encode a splice-regulatory protein, which can include, for instance, MBNL protein, an SR protein (e.g., SRSF1, SRSF2, SRSF3, SRSF4, SRSF5, SRSF6, SRSF7, SRSF8, SRSF9, SRSF10, SRSF11, or SRSF12), an hnRNP protein, an RbFox protein, a CELF protein, a Nova protein, or a PTB protein.
In some embodiments, the viral vectors may also encode a splicing factor in the form of an RNA, which may comprise a regulatory RNA molecule, a short hairpin RNA molecule (shRNA), a microRNA molecule, a transfer RNA molecule (tRNA), or an RNA that comprises a DMPK-targeting shRNA or microRNA. The RNA that regulates splicing may also comprise a repeat-targeting shRNA or microRNA (e.g., a CUG shRNA, CAG shRNA, or GGGGCC shRNA), e.g., which targets an RNA binding protein or other member of a related biological pathway.
In some embodiments, the viral vectors may also encode a splicing factor that comprises a protein-RNA complex, wherein the protein-RNA complex comprises a ribosome, snRNP complex, or other macromolecular complex that can interact with RNA to regulate splicing decisions. In some embodiments, wherein the intracellular factor comprises a protein-RNA complex, a snRNP complex comprises U1 snRNP or U2 snRNP. In some embodiments, wherein the intracellular factor comprises a protein-RNA complex, the RNA comprises a ribozyme that targets one or more CUG repeats. In some embodiments, wherein the intracellular factor comprises a protein-RNA complex, the RNA comprises a ribozyme that targets specific mRNAs.
Non-limiting examples of RNA binding protein motifs and RNA target sequences that can confer or regulate splicing activity are described, for example, in Ray, D., et al., (2014), Nature, 499(7457): 172-77; Lambert, N., et al., (2014), Mol. Cell., 54(5): 887-900; and Van Nostrand, E. L., et al., (2020), Nature, and may be incorporated in the recombinant viral vector genomes described herein to further regulate splicing activity.
In some embodiments, the recombinant viral vector genomes may comprise an alternatively-spliced exon cassette configured to regulate expression of a coding region of interest by including a nonsense mediated decay (NMD) exon (e.g., an alternative exon comprising a heterologous stop codon) within the RNA. In certain embodiments, the NMD exon is flanked by introns (or portion(s) thereof) for which alternative splicing is regulated. In some embodiments, an NMD exon is an exon that encodes at least one stop codon that is in frame with a previous exon, wherein the stop codon is upstream (5′) from the 3′ splice site of the exon. In various embodiments, the in-frame stop codon is inserted at least 100 nucleotides, at least 95 nucleotides, at least 90 nucleotides, at least 85 nucleotides, at least 80 nucleotides, at least 75 nucleotides, at least 70 nucleotides, at least 65 nucleotides, at least 60 nucleotides, at least 55 nucleotides, at least 50 nucleotides, at least 45 nucleotides, at least 40 nucleotides, at least 35 nucleotides, at least 30 nucleotides, at least 25 nucleotides, at least 20 nucleotides, at least 15 nucleotides, at least 10 nucleotides, or at least 5 nucleotides, or between 1 to 5 nucleotides upstream of the next 5′ splice junction.
In some embodiments, if the NMD exon is included in the spliced RNA, it causes degradation of the RNA via nonsense-mediated decay. In some embodiments, if the NMD exon is spliced out, the resulting transcript is stable, and in some embodiments encodes a functional (e.g., full-length) protein of interest.
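As a rough illustration of the NMD exon geometry described above, the following Python sketch scans a candidate alternative exon for stop codons in the reading frame inherited from the previous exon and reports how many nucleotides separate each stop codon from the 3′ end of the exon (i.e., the next 5′ splice junction). The function name, the `phase` parameter, and the toy sequence are assumptions made for illustration only and are not taken from the present disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(exon, phase=0):
    """Return (stop_start, distance_to_exon_end) for each in-frame stop codon.

    exon  : DNA sequence of the candidate alternative exon, 5' to 3'.
    phase : hypothetical parameter; the number of leading exon bases that
            complete a codon begun in the previous exon (0, 1, or 2).
    distance_to_exon_end is the number of nucleotides between the last base
    of the stop codon and the 3' end of the exon.
    """
    exon = exon.upper()
    hits = []
    # Step through the exon one codon at a time in the inherited frame.
    for i in range(phase, len(exon) - 2, 3):
        if exon[i:i + 3] in STOP_CODONS:
            hits.append((i, len(exon) - (i + 3)))
    return hits


# Toy 20-nt exon (illustrative only) with a TAA in frame 0.
toy_exon = "GCTGCTTAAGCTGCTGCTGC"
stops = in_frame_stops(toy_exon, phase=0)
```

Under these assumptions, a designer could compare each reported distance against the ranges recited above (for example, at least 50 nucleotides upstream of the next 5′ splice junction) when screening candidate NMD exons.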
In some embodiments, an alternatively-spliced exon cassette for which splicing is regulated is a construct configured to regulate expression of a protein by including a 5′ exon comprising an amino terminal amino acid encoding sequence (e.g., an ATG or part of the ATG) and/or translation control sequences, wherein the 5′ exon is separated from subsequent exon(s) by an intron for which splicing is regulated. In some embodiments, if the intron is spliced out of the RNA transcript, the recombinant 5′ exon is spliced in frame to the subsequent exon(s) and the resulting spliced transcript encodes a protein that is expressed. In some embodiments, if the intron is not spliced out of the RNA transcript, the recombinant 5′ exon is not spliced to the subsequent exon(s) and as a result a protein is not expressed from the transcript. In some embodiments, an intron (or portion thereof) for which splicing is regulated can be included within a gene that encodes a regulatory RNA (e.g., an siRNA). In some embodiments, an intron(s) (or portion thereof) for which splicing is regulated and that encodes regulatory RNA(s) can be included in an alternatively-spliced exon cassette encoding an RNA transcript.
(vii) Transgenes and Coding Regions Thereof
In various embodiments, the recombinant genomes disclosed herein may comprise one or more transgenes. A transgene may be recombinant (or “synthetic”), and may be modified to comprise an alternatively-spliced exon or an alternatively-spliced exon cassette described herein (e.g., see
A coding region of a transgene may be naturally-occurring, and may in some embodiments comprise no nucleic acid modifications, relative to the coding region of a wild-type gene. In some embodiments, a coding region of a transgene may be synthetic. The coding region of a transgene may be considered synthetic if it undergoes one or more nucleic acid modifications, relative to the coding region of a wild-type gene. A nucleic acid modification may be a substitution or deletion of one or more nucleotides that form the nucleic acid sequence of the coding region of the transgene. In some embodiments, the modification comprises disrupting or deleting a native start codon located at the 5′ end of the coding region of the transgene. In some embodiments, the modification comprises the insertion of an alternatively-spliced exon into the coding region of the transgene.
In some embodiments, the coding region of the transgene may comprise one or more nucleic acid modifications (e.g., substitutions) such that the coding region comprises a “barcode” sequence. Barcode sequences may be useful in some embodiments to characterize the identity of the transgene (e.g., a transgene comprising a BIN1 alternative exon cassette and MTM1 coding sequence), for example when multiple transgenes are being tested together. In some embodiments, the wobble positions of five codons within the coding region of the transgene are modified to produce a barcode sequence. As will be understood, a “wobble position” is the third nucleic acid of a codon. Nucleic acids lying at wobble positions can be modified without altering the identity of the amino acid encoded by the associated codon (see
In some embodiments, the five codons which are modified are located approximately 350 nucleotides from the 5′ end of the coding region of the transgene. In some embodiments, the five codons which are modified are located approximately 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 525, or 550 nucleotides from the 5′ end of the coding region of the transgene. In some embodiments, the five codons which are modified are located approximately 100-130, 120-150, 140-170, 160-190, 180-210, 200-230, 220-250, 240-270, 260-290, 280-310, 300-330, 320-350, 340-370, 370-400, 390-420, 410-440, 430-460, 450-480, 470-500, 490-520, 510-540, or 530-560 nucleotides from the 5′ end of the coding region of the transgene.
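The wobble-position barcoding described above can be sketched in Python as follows. This is a minimal illustration, assuming the standard genetic code and a simple rule (take the first synonymous third-base change found); the function names, the default offset of 350 nucleotides, and the toy coding sequence are hypothetical and are not the procedure of the present disclosure. Codons with no synonymous third-base alternative (e.g., ATG or TGG) are skipped.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def add_wobble_barcode(cds, start_nt=350, n_codons=5):
    """Change the third base of up to n_codons codons at or after start_nt.

    Only synonymous changes are accepted, so the encoded protein is unchanged.
    Returns the barcoded sequence and the indices of the modified codons.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    changed = []
    idx = start_nt // 3
    while idx < len(codons) and len(changed) < n_codons:
        codon = codons[idx]
        if len(codon) == 3:
            for base in BASES:
                alt = codon[:2] + base
                if alt != codon and CODON_TABLE[alt] == CODON_TABLE[codon]:
                    codons[idx] = alt
                    changed.append(idx)
                    break
        idx += 1
    return "".join(codons), changed


# Toy coding sequence (illustrative only): ATG, 200 alanine codons, stop.
toy_cds = "ATG" + "GCT" * 200 + "TAA"
barcoded, changed_codons = add_wobble_barcode(toy_cds)
```

In this sketch the barcoded sequence differs from the input at exactly five wobble positions while translating to the same protein, which is the property that allows multiple transgenes to be distinguished when tested together.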
In some embodiments, a coding region of a transgene may naturally comprise one or more internal, out-of-frame ATG start codons. As will be understood, in the splicing condition wherein the alternative exon (comprising an ATG start codon at its 3′ end) is spliced-out, translation of the coding region via an alternate, out-of-frame ATG start codon located within the coding region of the transgene would be undesirable. However, any modification made to the coding region of the transgene must also preserve translation of the full-length protein when the alternative exon is spliced-in. Accordingly, in some embodiments one or more modifications are made to the coding region of the transgene which preserve translation of the full-length protein in the condition wherein the alternative exon is spliced-in, but which disrupt or terminate translation of the full-length protein in the condition wherein the alternative exon is spliced-out. In some embodiments, one or more nucleic acid substitutions are made within the coding region of the transgene to introduce one or more heterologous stop codons located downstream of (e.g., 3′ relative to) one or more of the internal, out-of-frame start codons located within the coding region of the transgene. As will be understood, such substitutions may comprise the substitution of 1, 2, or 3 nucleic acids to produce any of a TAA, TGA, or TAG stop codon, depending on the nucleic acids which are naturally present at the desired location within the coding sequence. Additionally or alternatively, in some embodiments a 3′ UTR intron is included in the transgene which elicits nonsense-mediated decay in the condition wherein the alternative exon is spliced-out (such that translation of the full-length protein is disrupted or terminated), but which preserves translation of the full-length protein in the condition wherein the alternative exon is spliced-in.
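One way to look for substitutions of the kind described above is sketched below in Python. The sketch locates internal out-of-frame ATG codons and then searches downstream, in the frame of that ATG, for single-nucleotide substitutions that create a TAA, TAG, or TGA stop codon while leaving the protein encoded in the main reading frame unchanged. It assumes the standard genetic code and considers only single substitutions; the function names and the toy sequence are hypothetical and are offered for illustration only.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(cds):
    """Translate the main (frame 0) reading frame."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def out_of_frame_atgs(cds):
    """Positions of internal ATG codons that are not in frame 0."""
    return [i for i in range(1, len(cds) - 2)
            if cds[i:i + 3] == "ATG" and i % 3 != 0]


def silent_stop_edits(cds, atg_pos, max_hits=3):
    """Single-base edits that add a stop in the ATG's frame, silent in frame 0.

    Returns a list of (position, original_base, new_base) tuples.
    """
    protein = translate(cds)
    hits = []
    for codon_start in range(atg_pos + 3, len(cds) - 2, 3):
        if cds[codon_start:codon_start + 3] in STOP_CODONS:
            break  # this frame already terminates naturally here
        for offset in range(3):
            pos = codon_start + offset
            for base in BASES:
                if base == cds[pos]:
                    continue
                mutated = cds[:pos] + base + cds[pos + 1:]
                makes_stop = mutated[codon_start:codon_start + 3] in STOP_CODONS
                if makes_stop and translate(mutated) == protein:
                    hits.append((pos, cds[pos], base))
                    if len(hits) == max_hits:
                        return hits
    return hits


# Toy sequence (illustrative only): frame 0 encodes M-H-V-K-stop, and an
# out-of-frame ATG begins at position 4.
toy_cds = "ATGCATGTCAAATAA"
atgs = out_of_frame_atgs(toy_cds)
edits = silent_stop_edits(toy_cds, atgs[0])
```

Because each reported edit is silent in the main frame, translation of the full-length protein is preserved when the alternative exon is spliced in, whereas translation initiated at the out-of-frame ATG would terminate at the newly introduced stop codon.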
In some embodiments, the coding region of the transgene is from or is derived from a coding region from a gene selected from the group consisting of: MBNL1, MBNL2, MBNL3, hnRNP A1, hnRNP A2B1, hnRNP C, hnRNP D, hnRNP DL, hnRNP F, hnRNP H, hnRNP K, hnRNP L, hnRNP M, hnRNP R, hnRNP U, FUS, TDP43, PABPN1, ATXN2, TAF15, EWSR1, MATR3, TIA1, FMRP, MTM1, MTMR2, LAMP2, KIF5A, microdystrophin, C9ORF72, HTT, DNM2, BIN1, RYR1, NEB, ACTA, TPM3, TPM2, TNNT2, CFL2, KBTBD13, KLHL40, KLHL41, LMOD3, MYPN, SEPN1, TTN, SPEG, MYH7, TK2, POLG1, GAA, AGL, PYGM, SLC22A5, OCTN2, ETF, ETFH, PNPLA2, cytochrome b/cytochrome c oxidase, CLCN1, SCN4A, DMPK, CNBP, MYOT, LMNA (Lamin A/C), CAV3, DNAJB6, DES, TNPO3, HNRPDL, CAPN3, DYSF, alpha-sarcoglycan, beta-sarcoglycan, gamma-sarcoglycan, delta-sarcoglycan, TCAP, TRIM32, FKRP, FXN, POMT1, FKTN, POMT2, POMGnT1, DAG1, ANO5, PLEC1, TRAPPC11, GMPPB, ISPD, LIMS2, POPDC1, TOR1AIP1, POGLUT2, LAMA2, COL6A1, DUX4, EMD, PAX7, PMP22, MPZ, MFN2, SMCHD1, and GJB1. In some embodiments, the coding region of the transgene is from or is derived from a coding region of FXN.
In some embodiments, the coding region of the transgene is from or is derived from a coding region of MTM1. In some embodiments, the coding region of the transgene which is or is derived from MTM1 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 1881. In some embodiments, the coding region of the transgene which is or is derived from MTM1 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 1881.
In some embodiments, the coding region of the transgene is from or is derived from a coding region of CAPN3. In some embodiments, the coding region of the transgene which is or is derived from CAPN3 comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 1882. In some embodiments, the coding region of the transgene which is or is derived from CAPN3 comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 1882.
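For orientation, the percent sequence identity thresholds recited above can be illustrated with the following Python sketch. It assumes that the two sequences have already been aligned (gaps written as "-") and that identity is computed as matching columns divided by alignment columns; the present disclosure does not prescribe a particular alignment algorithm or denominator, so both choices, as well as the function names and toy sequences, are assumptions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Both inputs must have the same length; '-' denotes a gap. Columns that
    are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=90.0):
    """True if the alignment reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned sequences (illustrative only, not any SEQ ID NO).
reference = "ACGTACGTAC"
variant = "ACGTACGTAA"
identity = percent_identity(reference, variant)
```

In practice the alignment itself would typically come from an established alignment tool, and the resulting identity would be compared against the recited thresholds (e.g., at least 70% through at least 99%).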
In other embodiments, the transgene may encode one or more therapeutic proteins (e.g., a biologic or biosimilar thereof), including, but not limited to: adalimumab, rituximab, pegfilgrastim, infliximab, bevacizumab, trastuzumab, etanercept, and epoetin.
Aspects of the present disclosure provide for the packaging of the herein disclosed recombinant viral genomes into viral vectors (i.e., complete viral particles which may infect cells to deliver the recombinant genomes), and for the concomitant expression of the transgenes in a manner dependent on the alternatively-spliced exons. Thus, in some embodiments a recombinant viral genome comprising an alternatively-spliced exon cassette as described herein is provided in a viral vector (e.g., an rAAV vector; a lentivirus vector). The viral vectors may include rAAV particles, lentivirus particles, or other viral vectors.
In some embodiments, the recombinant viral genomes packaged into the rAAV or lentiviral vectors further comprise a promoter. In some embodiments, the promoter is a constitutive promoter or a regulated promoter. In some embodiments, the regulated promoter is an inducible promoter. In some embodiments, the promoter comprises any one of: CMV, EF1alpha, CBh, synapsin, enolase, MECP2, MHCK7, Desmin, or GFAP.
In some embodiments, an MHCK7 promoter comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 1880. In some embodiments, an MHCK7 promoter comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 1880.
In some embodiments, the promoter is a ubiquitous promoter. In some embodiments, a ubiquitous promoter is a promoter selected from the group consisting of: an EF1 alpha promoter, a beta actin promoter, CMV, CBh, and CAG promoter. In some embodiments, the promoter is a tissue-specific promoter, such as a muscle- or heart-biased promoter. In some embodiments, a tissue-specific promoter, such as a muscle- or heart-biased promoter, is a promoter selected from the group consisting of: a muscle creatine kinase promoter, a C5-12 muscle promoter, MHCK7, and Desmin. In some embodiments, the promoter is a neuronal-biased promoter. In some embodiments, a neuronal-biased promoter is a promoter selected from the group consisting of: synapsin and MECP2. In some embodiments, the promoter is an astrocyte-biased promoter. In some embodiments, an astrocyte-biased promoter is a GFAP promoter. Thus, in some embodiments, the nucleic acid comprises a promoter and sequence corresponding to an RNA molecule that is capable of being expressed from the nucleic acid.
In some embodiments, the recombinant viral genome is sufficiently small to be effectively packaged in an AAV viral particle (e.g., the gene construct may be around 0.5-5 kb long, for example around 4.9 kb, 4.8 kb, 4.7 kb, 4.6 kb, 4.5 kb, 4.4 kb, 4.3 kb, 4.2 kb, 4.1 kb, 4 kb, 3.5 kb, or 3 kb long). So as to fit into the AAV viral particle, in some embodiments a nucleic acid comprises one or more truncated and/or recombinant introns, as described elsewhere herein. Accordingly, a recombinant intron for an rAAV vector is typically shorter than 4 kb, but can be between around 20 bases long and around 2,000 bases long to provide space for other components (e.g., exons, regulatory sequences, other introns, viral packaging sequences) in the nucleic acid (e.g., recombinant gene) construct. In some embodiments a recombinant intron is around 50 bases, around 100 bases, around 250 bases, around 500 bases, around 1,000 bases, around 1,500 bases, or around 2,000 bases long. In some embodiments, a recombinant intron is shorter than 4 kb, shorter than 3 kb, shorter than 2 kb, shorter than 1 kb, 100-900 bases long, or shorter than 500 bases long.
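The size constraint discussed above amounts to simple length bookkeeping, which the following Python sketch illustrates. The 5 kb ceiling is taken from the "around 0.5-5 kb" range given above; the component names and lengths are hypothetical placeholders, and whether ITRs count toward the stated range is an assumption of this sketch rather than a statement of the present disclosure.

```python
AAV_MAX_KB = 5.0  # upper end of the "around 0.5-5 kb" range given in the text


def construct_length_kb(components):
    """Total construct length in kb from a mapping of component -> bases."""
    return sum(components.values()) / 1000.0


def fits_in_aav(components, max_kb=AAV_MAX_KB):
    """True if the summed component lengths do not exceed max_kb."""
    return construct_length_kb(components) <= max_kb


def intron_budget_bases(components, max_kb=AAV_MAX_KB, intron_key="intron"):
    """Bases left for the recombinant intron after all other components."""
    other = sum(n for name, n in components.items() if name != intron_key)
    return max(0, int(max_kb * 1000) - other)


# Hypothetical component lengths in bases (illustrative only).
toy_construct = {
    "itr_5": 145,
    "promoter": 800,
    "exons": 1800,
    "intron": 500,
    "polyA": 200,
    "itr_3": 145,
}
total_kb = construct_length_kb(toy_construct)
budget = intron_budget_bases(toy_construct)
```

A calculation of this kind indicates how far a native intron would need to be truncated (for example, to shorter than 2 kb or shorter than 500 bases) to leave room for the remaining elements of the construct.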
In some embodiments, the recombinant viral genome contains sufficient viral sequences for packaging in a viral vector (e.g., an rAAV particle). For example, in some embodiments a recombinant viral genome is flanked by viral sequences (for example, terminal repeat sequences) that are useful to package the recombinant viral genome in a viral particle (e.g., encapsidated by viral capsid proteins and/or an envelope, where appropriate). In some embodiments, the flanking terminal repeat sequences are rAAV inverted terminal repeats (ITRs). In some embodiments, the AAV ITR sequences comprise AAV1, AAV2, AAV5, AAV7, AAV8, or AAV9 ITR sequences.
In some embodiments, the AAV ITR sequences comprise AAV2 ITR sequences. In some embodiments, an AAV2 ITR comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 1879. In some embodiments, an AAV2 ITR comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 1879.
In some embodiments, the recombinant viral genome comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in either SEQ ID NO: 105 or SEQ ID NO: 106. In some embodiments, the recombinant viral genome comprises a polynucleotide having a nucleic acid sequence as set forth in either SEQ ID NO: 105 or SEQ ID NO: 106.
In some embodiments, the recombinant viral genome is a lentivirus genome comprising a DNA molecule, wherein the DNA molecule comprises sequences that encode an RNA molecule.
(i) Manufacture of rAAV Vectors
In some embodiments, the recombinant viral genome is encapsidated by an rAAV particle as described herein. The rAAV particle may be of any AAV serotype (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10), including any derivative (including non-naturally occurring variants of a serotype) or pseudotype. In some embodiments, the rAAV particle is an AAV8 particle, which may be pseudotyped with AAV2 ITRs. In some embodiments, an AAV2 ITR comprises a polynucleotide having at least 70%, at least 75%, at least 80%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity, relative to a nucleic acid sequence as set forth in SEQ ID NO: 1879. In some embodiments, an AAV2 ITR comprises a polynucleotide having a nucleic acid sequence as set forth in SEQ ID NO: 1879.
Non-limiting examples of derivatives and pseudotypes include AAV2-AAV3 hybrid, AAVrh.10, AAVhu.14, AAV3a/3b, AAVrh32.33, AAV-HSC15, AAV-HSC17, AAVhu.37, AAVrh.8, CHt-P6, AAV2.5, AAV6.2, AAV2i8, AAV-HSC15/17, AAVM41, AAV9.45, AAV6(Y445F/Y731F), AAV2.5T, AAV-HAE1/2, AAV clone 32/83, AAV5hH10, AAV2 (Y→F), AAV8 (Y733F), AAV2.15, AAV2.4, and AAVr3.45; or a derivative thereof. In some embodiments, the rAAV vector is of serotype AAV8. In some embodiments, the rAAV vector is pseudotyped. Such AAV serotypes and derivatives/pseudotypes, and methods of producing such derivatives/pseudotypes are known in the art (see, e.g., Asokan A, Schaffer D V, Samulski R J. The AAV vector toolkit: poised at the clinical crossroads. Mol Ther. 2012 April; 20(4):699-708. doi: 10.1038/mt.2011.287). In some embodiments, the rAAV particle is a pseudotyped rAAV particle, which comprises (a) a nucleic acid vector comprising ITRs from one serotype (e.g., AAV2) and (b) a capsid comprised of capsid proteins derived from another serotype (e.g., AAV1, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, or AAV10). Methods for producing and using pseudotyped rAAV vectors are known in the art (see, e.g., Duan et al., J. Virol., 75:7662-7671, 2001; Halbert et al., J. Virol., 74:1524-1532, 2000; Zolotukhin et al., Methods, 28:158-167, 2002; and Auricchio et al., Hum. Molec. Genet., 10:3075-3081, 2001).
Exemplary rAAV nucleic acid vectors useful according to the disclosure include single-stranded (ss) or self-complementary (sc) AAV nucleic acid vectors, such as single-stranded or self-complementary recombinant viral genomes.
Methods of producing rAAV particles and recombinant viral genomes are also known in the art and commercially available (see, e.g., Zolotukhin et al. Production and purification of serotype 1, 2, and 5 recombinant adeno-associated viral vectors. Methods 28 (2002) 158-167; and U.S. Patent Publication Numbers US20070015238 and US20120322861, which are incorporated herein by reference; and plasmids and kits available from ATCC and Cell Biolabs, Inc.). For example, a plasmid containing the recombinant viral genome may be combined with one or more helper plasmids, e.g., that contain a rep gene (e.g., encoding Rep78, Rep68, Rep52 and Rep40) and a cap gene (encoding VP1, VP2, and VP3, including a modified VP3 region), and transfected into a producer cell line such that the rAAV particle can be packaged and subsequently purified.
In some embodiments, the one or more helper plasmids include a first helper plasmid comprising a rep gene and a cap gene and a second helper plasmid comprising an E1a gene, an E1b gene, an E4 gene, an E2a gene, and a VA gene. In some embodiments, the rep gene is a rep gene derived from AAV2 and the cap gene is derived from AAV2 and includes modifications to the gene in order to produce a modified capsid protein described herein. Helper plasmids, and methods of making such plasmids, are known in the art and commercially available (see, e.g., pDM, pDG, pDP1rs, pDP2rs, pDP3rs, pDP4rs, pDP5rs, pDP6rs, pDG(R484E/R585E), and pDP8.ape plasmids from PlasmidFactory, Bielefeld, Germany; other products and services available from Vector Biolabs, Philadelphia, PA; Cell Biolabs, San Diego, CA; Agilent Technologies, Santa Clara, CA; and Addgene, Cambridge, MA; pxx6; Grimm et al. (1998), Novel Tools for Production and Purification of Recombinant Adenoassociated Virus Vectors, Human Gene Therapy, Vol. 9, 2745-2760; Kern, A. et al. (2003), Identification of a Heparin-Binding Motif on Adeno-Associated Virus Type 2 Capsids, Journal of Virology, Vol. 77, 11072-11081; Grimm et al. (2003), Helper Virus-Free, Optically Controllable, and Two-Plasmid-Based Production of Adeno-associated Virus Vectors of Serotypes 1 to 6, Molecular Therapy, Vol. 7, 839-850; Kronenberg et al. (2005), A Conformational Change in the Adeno-Associated Virus Type 2 Capsid Leads to the Exposure of Hidden VP1 N Termini, Journal of Virology, Vol. 79, 5296-5303; and Moullier, P. and Snyder, R. O. (2008), International efforts for recombinant adeno-associated viral vector reference standards, Molecular Therapy, Vol. 16, 1185-1188).
An exemplary, non-limiting, rAAV particle production method is described next. One or more helper plasmids are produced or obtained, which comprise rep and cap ORFs for the desired AAV serotype and the adenoviral VA, E2A (DBP), and E4 genes under the transcriptional control of their native promoters. The cap ORF may also comprise one or more modifications to produce a modified capsid protein as described herein. HEK293 cells (available from ATCC®) are transfected with the helper plasmid(s) and a plasmid containing a nucleic acid vector described herein, using CaPO4-mediated transfection, lipids, or polymeric molecules such as polyethylenimine (PEI). The HEK293 cells are then incubated for at least 60 hours to allow for rAAV particle production. Alternatively, in another example Sf9-based producer stable cell lines are infected with a single recombinant baculovirus containing the nucleic acid vector. As a further alternative, in another example HEK293 or BHK cell lines are infected with an HSV containing the nucleic acid vector and optionally one or more helper HSVs containing rep and cap ORFs as described herein and the adenoviral VA, E2A (DBP), and E4 genes under the transcriptional control of their native promoters. The HEK293, BHK, or Sf9 cells are then incubated for at least 60 hours to allow for rAAV particle production. The rAAV particles can then be purified using any method known in the art or described herein, e.g., by iodixanol step gradient, CsCl gradient, chromatography, or polyethylene glycol (PEG) precipitation.
As used herein, the terms “engineered” and “recombinant” cells are intended to refer to a cell into which an exogenous polynucleotide segment (such as a DNA segment that leads to the transcription of a biologically active molecule) has been introduced. Therefore, engineered cells are distinguishable from naturally occurring cells, which do not contain a recombinantly introduced exogenous DNA segment. Engineered cells are, therefore, cells that comprise one or more heterologous polynucleotide segments introduced through the hand of man.
To express a therapeutic agent in accordance with the present invention one may prepare a tyrosine capsid-modified rAAV particle containing an expression vector that comprises a therapeutic agent-encoding nucleic acid segment under the control of one or more promoters. To bring a sequence “under the control of” a promoter, one positions the 5′ end of the transcription initiation site of the transcriptional reading frame generally between about 1 and about 50 nucleotides “downstream” of (i.e., 3′ of) the chosen promoter. The “upstream” promoter stimulates transcription of the DNA and promotes expression of the encoded polypeptide. This is the meaning of “recombinant expression” in this context. In some embodiments, the recombinant nucleic acid (e.g., viral) vector constructs are those that comprise an rAAV nucleic acid vector that contains a therapeutic gene of interest operably linked to one or more promoters that are capable of expressing the gene in one or more selected mammalian cells. Such nucleic acid vectors are described in detail herein.
In some embodiments, wherein the recombinant viral genome is an rAAV genome, the transgene comprising an alternatively-spliced exon cassette comprises a polynucleotide sequence as set forth in any one of SEQ ID NOs: 45-55. In some embodiments, wherein the recombinant viral genome is an rAAV genome, the transgene comprising an alternatively-spliced exon cassette comprises a polynucleotide sequence that is 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to any one of SEQ ID NOs: 45-55.
In some embodiments, a viral vector of the present disclosure comprises a recombinant lentivirus genome. Lentiviruses, like other retroviruses, are diploid; each particle carries two copies of a single-stranded RNA genome. The lentivirus is a retrovirus, meaning it has a single-stranded RNA genome and a reverse transcriptase enzyme, which functions to perform reverse transcription of the viral genetic material into DNA upon entering the cell. Lentiviruses also have a viral envelope with protruding glycoproteins that aid in attachment to the outer membrane of a host cell.
Within the lentivirus genome are RNA sequences that code for specific proteins that facilitate the incorporation of the viral sequences into the genome of a host cell. The “gag” gene codes for the structural components of the viral nucleocapsid proteins: the matrix (MA/p17), the capsid (CA/p24) and the nucleocapsid (NC/p7) proteins. The “pol” domain codes for the reverse transcriptase and integrase enzymes. Lastly, the “env” domain of the viral genome encodes the glycoproteins and envelope on the surface of the virus. The ends of the genome are flanked with long terminal repeats (LTRs). LTRs are necessary for integration of the dsDNA into the host chromosome. LTRs also serve as part of the promoter for transcription of the viral genes.
In some embodiments, the env, gag, and/or pol vector(s) forming the particle do not contain a nucleic acid sequence from the lentiviral genome that expresses an envelope protein. In some embodiments, a separate vector containing a nucleic acid sequence encoding an envelope protein operably linked to a promoter is used (e.g., an env vector). In some embodiments, such env vector also does not contain a lentiviral packaging sequence. In some embodiments, the env nucleic acid sequence encodes a lentiviral envelope protein.
The native lentivirus promoter is located in the U3 region of the 3′ LTR. As will be understood by those of skill in the art, the presence of the lentivirus promoter can in some embodiments interfere with heterologous promoters operably linked to a transgene. To minimize such interference and better regulate the expression of transgenes, in some embodiments the lentiviral promoter is deleted. In some embodiments, the lentivirus vector contains a deletion within the viral promoter. After reverse transcription, such a deletion is in some embodiments transferred to the 5′ LTR, yielding a vector/provirus that is incapable of synthesizing vector transcripts from the 5′ LTR in the next round of replication.
In some embodiments, the lentivirus particle is expressed by a vector system encoding the necessary viral proteins to produce a lentivirus particle. In some embodiments, there is at least one vector containing a nucleic acid sequence encoding the lentiviral Pol proteins necessary for reverse transcription and integration, operably linked to a promoter. In some embodiments, the Pol proteins are expressed by multiple vectors. In some embodiments, there is also a vector containing a nucleic acid sequence encoding the lentiviral Gag proteins necessary for forming a viral capsid operably linked to a promoter. In some embodiments, the gag-pol genes are on the same vector. In some embodiments, the gag nucleic acid sequence is on a separate vector than at least some of the pol nucleic acid sequence. In some embodiments, the gag nucleic acid sequence is on a separate vector from all the pol nucleic acid sequences that encode Pol proteins.
In some embodiments, the lentivirus vector does not contain nucleotides from the lentiviral genome that package lentiviral RNA, referred to as the lentiviral packaging sequence.
It will be understood that selective inclusion of envelopes could result in changes in infectivity, such that the lentivirus vector could infect many different types of cells, and could be targeted to specific cell types of interest. Accordingly, in some embodiments, the envelope protein is not from the lentivirus, but from a different virus. The resultant lentivirus particle is referred to as a pseudotyped particle. In some embodiments, an env gene is used that encodes an envelope protein that targets an endocytic compartment, such as that of the influenza virus, VSV-G, alpha viruses (Semliki forest virus, Sindbis virus), arenaviruses (lymphocytic choriomeningitis virus), flaviviruses (tick-borne encephalitis virus, Dengue virus), rhabdoviruses (vesicular stomatitis virus, rabies virus), and orthomyxoviruses (influenza virus).
In some embodiments, the lentivirus is a human immunodeficiency virus (HIV1 or HIV2), a feline immunodeficiency virus (FIV), a bovine immunodeficiency virus (BIV), a caprine arthritis encephalitis virus, an equine infectious anemia virus, a jembrana disease virus, a puma lentivirus, a simian immunodeficiency virus, or a visna-maedi virus.
In some embodiments, a nucleic acid sequence encoding a transgene comprising an alternatively-spliced exon cassette of the present invention is inserted into the empty lentiviral particles by use of a plurality of vectors each containing a nucleic acid segment of interest and a lentiviral packaging sequence necessary to package lentiviral RNA into the lentiviral particles (the packaging vector). In some embodiments, the packaging vector contains a 5′ and 3′ lentiviral LTR with the desired nucleic acid segment inserted between them. The nucleic acid segment can be antisense molecules or, in some embodiments, encodes a therapeutic protein. As will be understood, proper orientation of the transgene within the lentiviral genome is necessary to avoid the loss of introns (e.g., the splicing-out of introns) during viral packaging. Accordingly, in some embodiments, the transgene is oriented in the anti-sense orientation within the lentiviral genome. In some embodiments, orienting the transgene in the anti-sense direction within the lentiviral genome avoids the loss of introns (e.g., the splicing-out of introns) during viral packaging.
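As a simple illustration of the anti-sense orientation described above, the Python sketch below reverse-complements a transgene cassette before placing it between the 5′ and 3′ LTRs, so that the cassette's splice signals read in the sense direction only on the opposite strand. The function names and the placeholder sequences are hypothetical; real LTR and cassette sequences would be substituted.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def place_antisense(ltr_5, cassette, ltr_3):
    """Assemble 5' LTR + anti-sense cassette + 3' LTR (a layout sketch only)."""
    return ltr_5 + reverse_complement(cassette) + ltr_3


# Placeholder sequences (illustrative only). The toy cassette carries an
# exon/intron boundary, with the intron beginning at the 5' splice-site GT.
toy_cassette = "ATGGTAAGT"
toy_genome = place_antisense("LTR5-", toy_cassette, "-LTR3")
```

In this layout the strand corresponding to the packaged genomic RNA carries the reverse complement of the cassette, which is consistent with the stated aim of avoiding the splicing-out of introns during viral packaging.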
In some embodiments, the packaging vector contains a selectable marker gene. Such marker genes are well known in the art and include such genes as green fluorescent protein (GFP), blue fluorescent protein (BFP), luciferase, LacZ, nerve growth factor receptor (NGFR), etc.
Some aspects of the invention contemplate a method of treating a disease or condition in a subject comprising administering a viral vector of the present disclosure to a subject, wherein the viral vector comprises a recombinant viral genome described herein. Accordingly, provided herein is a method of delivering the disclosed viral (e.g., rAAV; lentivirus) particles. In some embodiments, viral particles are delivered by administering any one of the compositions disclosed herein to a subject. In some embodiments, “administering” or “administration” means providing a material to a subject in a manner that is pharmacologically useful. In some embodiments, viral particles are delivered to one or more tissues and cell types in a subject. In some embodiments, viral particles are delivered to one or more of muscle, heart, CNS, and immune cells. In some embodiments, delivery of a viral particle restores transcriptome homeostasis.
Delivery vehicles, vectors, particles, nanoparticles, formulations and components thereof which are suitable for expression of one or more elements of an engineered AAV capsid system as described herein are as described in, for example, International Patent Application Publication Nos. WO 2021/050974 and WO 2021/077000 and International Application No. PCT/US2021/042812, the contents of each of which are incorporated by reference herein.
In some embodiments, a viral particle is administered to the subject parenterally. In some embodiments, a viral particle is administered to a subject subcutaneously, intraocularly, intravitreally, subretinally, intravenously (IV), intracerebro-ventricularly, intramuscularly, intrathecally (IT), intracisternally, intraperitoneally, enterally, via inhalation, topically, or by direct injection to one or more cells, tissues, or organs. In some embodiments, a viral particle is administered to the subject by injection into the hepatic artery or portal vein.
To “treat” a disease, as the term is used herein, means to reduce the frequency or severity of at least one sign or symptom of a disease or disorder experienced by a subject. The compositions described above or elsewhere herein are typically administered to a subject in an effective amount, that is, an amount capable of producing a desirable result. The desirable result will depend upon the active agent being administered. For example, an effective amount of rAAV particles may be an amount of the particles that are capable of transferring an expression construct to a host organ, tissue, or cell. A therapeutically acceptable amount may be an amount that is capable of treating a disease. As is well known in the medical and veterinary arts, dosage for any one subject depends on many factors, including the subject's size, body surface area, age, the particular composition to be administered, the active ingredient(s) in the composition, time and route of administration, general health, and other drugs being administered concurrently.
In some embodiments, a single composition comprising viral particles as disclosed herein is administered only once. In some embodiments, a subject may need more than 1 administration of a viral composition (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more times). For example, a subject may need to be provided a second administration of any one of the viral compositions as disclosed herein 1 day, 1 week, 1 month, 1 year, 2 years, 5 years, or 10 years after the subject was administered a first composition. In some embodiments, a first composition of viral particles is different from the second composition of viral particles.
In some embodiments, the administration of the composition is repeated at least once (e.g., at least once, at least twice, at least thrice, at least four times, at least five times, at least six times, at least 10 times, at least 25 times, or at least 50 times), and wherein the time between a repeated administration and a previous administration is at least 1 month (e.g., at least 1 month, at least 2 months, at least 3 months, at least 4 months, at least 5 months, at least 6 months, or at least 12 months). In some embodiments, the administration of the composition is repeated at least once, and wherein the time between a repeated administration and a previous administration is at least 1 year (e.g., at least 1, at least 2, at least 3, at least 4, at least 5, at least 10, or at least 20 years).
In some embodiments, the administration of the composition is facilitated by AAV capsids such as AAV1-9, e.g., with AAV2 ITRs, or other capsids that sufficiently deliver to affected tissues.
Additional AAV vectors are described in International Patent Application Publication No. WO 2019/2071632, the content of which is incorporated by reference herein.
Further AAV vectors are described in International Patent Application Publication Nos. WO 2020/086881 and WO 2020/235543, the contents of each of which are incorporated by reference herein.
Further AAV vectors are described in International Patent Application Publication Nos. WO 2005/033321; WO 2006/110689; WO 2007/127264; WO 2008/027084; WO 2009/073103; WO 2009/073104; WO 2009/105084; WO 2009/134681; WO 2009/136977; WO 2010/051367; WO 2010/138675; WO 2001/038187; WO 2012/112832; WO 2015/054653; WO 2016/179496; WO 2017/100791; WO 2017/019994; WO 2018/209154; WO 2019/067982; WO 2019/195701; WO 2019/217911; WO 2020/041498; WO 2020/210839; U.S. Pat. Nos. 7,906,111; 9,737,618; 10,265,417; 10,485,883; 10,695,441; 10,722,598; 8,999,678; 10,301,648; 10,626,415; 9,198,984; 10,155,931; 8,524,219; 9,206,238; 8,685,387; 9,359,618; 8,231,880; 8,470,310; 9,597,363; 8,940,290; 9,593,346; 10,501,757; 10,786,568; 10,973,928; 10,519,198; 8,846,031; 9,617,561; 9,884,071; 10,406,173; 9,596,220; 9,719,010; 10,117,125; 10,526,584; 10,881,548; 10,738,087; U.S. Patent Publication No. 2011-023353; U.S. Patent Publication No. 2019-0015527; U.S. Patent Publication No. 2020-155704; U.S. Patent Publication No. 2017-0191079; U.S. Patent Publication No. 2019-0218574; U.S. Patent Publication No. 2020-0208176; U.S. Patent Publication No. 2020-0325491; U.S. Patent Publication No. 2019-0055523; U.S. Patent Publication No. 2020-0385689; U.S. Patent Publication No. 2009-0317417; U.S. Patent Publication No. 2016-0051603; U.S. Patent Publication No. 2016-00244783; U.S. Patent Publication No. 2017-0183636; U.S. Patent Publication No. 2020-0263201; U.S. Patent Publication No. 2020-0101099; U.S. Patent Publication No. 2020-0318082; U.S. Patent Publication No. 2018-0369414; U.S. Patent Publication No. 2019-0330278; U.S. Patent Publication No. 2020-0231986, the contents of each of which are incorporated by reference herein.
Aspects of the disclosure relate to methods for use with a subject (e.g., a mammal). In some embodiments, a mammalian subject is a human, a non-human primate, or other mammalian subject. In some embodiments, the subject has one or more mutations associated with aberrant intron and/or alternative splicing.
In some embodiments, a subject suffers from or is at risk of developing a disease or condition associated with aberrant splice regulation resulting in one or more symptoms of a disease or condition. Non-limiting examples of these diseases/conditions include instances in which the homeostasis of RNA binding proteins is altered (e.g., other repeat expansion diseases), or diseases/conditions in which there are mutations in RNA binding protein sequences. In some embodiments, the disease or condition is selected from: a repeat expansion disease, a laminopathy, a cardiomyopathy, a muscular dystrophy, a neurodegenerative disease, a cancer, an intellectual disability, and/or premature aging.
In a non-limiting example, compositions of this application are administered to a subject resulting in regulated overexpression of the RNA binding protein exhibiting aberrant activity. In another non-limiting example, compositions of this application are administered to a subject resulting in the regulated addition of additional non-mutated, non-aberrant RNA binding protein(s).
In some embodiments, the disease or condition is selected from the group consisting of: Dentatorubral-pallido-luysian atrophy (DRPLA), myotonic dystrophy type 1 (DM1), myotonic dystrophy type 2 (DM2), Fragile X syndrome of mental retardation (FMR1), Fragile X tremor ataxia syndrome (FXTAS), FRAXE mental retardation (FMR2), Friedreich's ataxia (FRDA), Huntington disease (HD), Huntington disease-like 2 (HDL2), Oculopharyngeal muscular dystrophy (OPMD), Myoclonic epilepsy type 1, Alzheimer's disease, ALS/FTD, spinocerebellar ataxia type 1 (SCA1), spinocerebellar ataxia type 2 (SCA2), spinocerebellar ataxia type 3 (SCA3), spinocerebellar ataxia type 6 (SCA6), spinocerebellar ataxia type 7 (SCA7), spinocerebellar ataxia type 8 (SCA8), spinocerebellar ataxia type 10 (SCA10), spinocerebellar ataxia type 12 (SCA12), spinocerebellar ataxia type 17 (SCA17), Syndromic/non-syndromic X-linked mental retardation, Emery-Dreifuss muscular dystrophy type 2, familial partial lipodystrophy, limb girdle muscular dystrophy type 1B, dilated cardiomyopathy, familial partial lipodystrophy, Charcot-Marie-Tooth disorder type 2B1, mandibuloacral dysplasia, childhood progeria syndrome (Hutchinson-Gilford syndrome), Werner syndrome, Dilated cardiomyopathy (DCM), Hypertrophic cardiomyopathy (HCM), Restrictive cardiomyopathy (RCM), Left Ventricular Non-compaction (LVNC), Arrhythmogenic Right Ventricular Dysplasia (ARVD), takotsubo cardiomyopathy, Duchenne muscular dystrophy, Becker muscular dystrophy, Limb-girdle muscular dystrophy, Facioscapulohumeral muscular dystrophy, Congenital muscular dystrophy, Oculopharyngeal muscular dystrophy, Distal muscular dystrophy, Emery-Dreifuss muscular dystrophy, dementia, Parkinson's disease (PD), a PD-related disorder, Prion disease, a motor neuron disease (MND), Progressive bulbar palsy (PBP), Progressive muscular atrophy (PMA), Primary lateral sclerosis (PLS), Spinal muscular atrophy (SMA), a bladder cancer, a breast cancer, a colorectal cancer, a kidney cancer, a lung cancer, a lymphoma, a melanoma, an oral cancer, an ovarian cancer, an oropharyngeal cancer, a pancreatic cancer, a prostate cancer, a thyroid cancer, a uterine cancer, Down syndrome, Prader-Willi Syndrome (PWS), Bloom Syndrome, Cockayne Syndrome Type I-216400, Cockayne Syndrome Type III, Cockayne Syndrome Type I, Hutchinson-Gilford Progeria Syndrome, Mandibuloacral Dysplasia with Type A Lipodystrophy, Progeria, Adult Onset Progeroid Syndrome, Neonatal Rothmund-Thomson Syndrome, Seip Syndrome, Werner Syndrome, Replication Focus-Forming Activity 1, myotubular myopathy, Danon Disease, and/or centronuclear myopathy.
Non-limiting examples of symptoms of these diseases/conditions include neurodevelopmental, neurofunctional, or neurodegenerative changes (e.g., ALS, FTD, Spinocerebellar Ataxias, FXTAS, or Huntington's Disease symptoms) or abnormal proliferation or migration of cells (e.g., as in cancer). For example, myotonic dystrophy type 1 and type 2 (dystrophia myotonica, DM1 and DM2, respectively) are caused by expanded CTG repeats in the DMPK gene and CCTG repeats in the CNBP gene, respectively. Both diseases are highly multi-systemic with symptoms in skeletal muscles, cardiac tissue, gastrointestinal tract, endocrine system, and central nervous system, among others.
In some aspects, the present disclosure relates to methods and compositions that are useful for treating myotonic dystrophy type 1 and type 2 (dystrophia myotonica, DM1 and DM2, respectively), for example by delivering viral particles comprising viral constructs (e.g., containing one or more alternative splicing cassettes) to cells or tissue in a subject. In addition to the symptoms described above, DM1 can also manifest in a severe form called congenital DM1, in which profound developmental delays occur. A 25% chance of death before the age of 18 months and a 50% chance of survival into the mid-30s have been reported. Methods and compositions of the application can be useful to treat, alleviate, or otherwise improve one or more symptoms of DM1.
Accordingly, in some embodiments one or more viral constructs can be delivered to a subject having one or more symptoms of myotonic dystrophy. Such symptoms may include, but are not limited to, delayed muscle relaxation, muscle weakness, prolonged involuntary muscle contraction, loss of muscle, abnormal heart rhythm, cataracts, or difficulty swallowing. In some embodiments, a viral composition provided herein is administered to a subject having congenital DM1 or DM2. In some embodiments, the viral constructs treat, alleviate, ameliorate, or otherwise improve one or more symptoms associated with DM1 and/or DM2. In some embodiments, the viral constructs reduce muscle weakness, reduce muscle loss, reduce muscle wasting, reduce prolonged muscle contractions, improve speech, and/or improve swallowing in a subject. In some embodiments, treatment reduces or corrects one or more other symptoms of myotonic dystrophy.
In some embodiments, splicing of a recombinant intron and/or an alternatively-spliced exon is sufficiently regulated to be therapeutically effective.
Certain embodiments are set forth in the enumerated clauses below.
Clause 1. A recombinant viral genome for delivering a transgene, wherein said genome comprises at least one alternatively-spliced exon cassette comprising at least one alternatively-spliced exon, at least one flanking intron, and a coding region of the transgene.
Clause 2. The viral genome of clause 1, wherein the alternatively-spliced exon is retained in the spliced transcript.
Clause 3. The viral genome of clause 1 or clause 2, wherein the alternatively-spliced exon cassette further comprises at least one constitutive exon.
Clause 4. The viral genome of any preceding clause, wherein the alternatively-spliced exon cassette comprises one flanking intron.
Clause 5. The viral genome of clause 4, wherein the flanking intron is located 3′ or 5′ to the alternatively-spliced exon.
Clause 6. The viral genome of any one of clauses 1-3, wherein the alternatively-spliced exon cassette comprises two flanking introns.
Clause 7. The viral genome of any preceding clause, wherein the alternatively-spliced exon comprises at least one modification, relative to a naturally occurring alternatively-spliced exon.
Clause 8. The viral genome of any preceding clause, wherein the alternatively-spliced exon comprises at its 3′ end a heterologous start codon or part of a heterologous start codon.
Clause 9. The viral genome of clause 8, wherein all native start codons located 5′ to the heterologous start codon are disrupted or deleted.
Clause 10. The viral genome of any preceding clause, wherein the alternatively-spliced exon is located 5′ to the coding region of the transgene.
Clause 11. The viral genome of any one of clauses 1-7, wherein the alternatively-spliced exon cassette comprises two alternatively-spliced exons, each with flanking introns.
Clause 12. The viral genome of clause 11, wherein the two alternatively-spliced exons are adjacent.
Clause 13. The viral genome of clause 11 or clause 12, wherein the constitutive exon is located 5′ to the two alternatively-spliced exons.
Clause 14. The viral genome of any one of clauses 11-13, wherein each alternatively-spliced exon comprises at its 3′ end a heterologous start codon or part of a heterologous start codon.
Clause 15. The viral genome of clause 14, wherein all native start codons located 5′ to the heterologous start codon of the 5′-most alternatively-spliced exon are disrupted or deleted.
Clause 16. The viral genome of any one of clauses 11-15, wherein only one of the two alternatively-spliced exons is retained in the spliced transcript.
Clause 17. The viral genome of any one of clauses 11-16, wherein the 5′-most alternatively-spliced exon is retained in the spliced transcript.
Clause 18. The viral genome of any one of clauses 11-16, wherein the 3′-most alternatively-spliced exon is retained in the spliced transcript.
Clause 19. The viral genome of any preceding clause, wherein the alternatively-spliced exon(s) and flanking intron(s) are located within the coding region of the transgene.
Clause 20. The viral genome of any preceding clause, wherein the alternatively-spliced exon comprises a heterologous, in-frame stop codon.
Clause 21. The viral genome of clause 20, wherein the heterologous, in-frame stop codon is at least 50 nucleotides upstream of the next 5′ splice junction.
Clause 22. The viral genome of clause 20 or clause 21, wherein the heterologous stop codon elicits nonsense-mediated decay.
Clause 23. The viral genome of any preceding clause, wherein the alternatively-spliced exon is retained in the spliced transcript in distinct tissues or in distinct cell types.
Clause 24. The viral genome of any preceding clause, wherein the alternatively-spliced exon is retained in the spliced transcript in the presence of activated T cells, and/or in states of inflammation.
Clause 25. The viral genome of any preceding clause, wherein the alternatively-spliced exon is retained in the spliced transcript in cells exhibiting one or more signs or symptoms of a disease state, and/or in cells exhibiting non-homeostatic levels of the protein encoded by the natural gene comprising the transgene.
Clause 26. The viral genome of any preceding clause, wherein the alternatively-spliced exon comprises an alternatively-spliced exon from a gene selected from the group consisting of ABCC1, AK125149, ASCC2, BAT2D1, BBX, BRD8, BRE, C17orf70, CAMKK2, CBFB, CCAR1, CCDC7, CD6, CHTF8, COL4A3BP, COL6A3, CUGBP1, CUGBP2, CXorf45, DENND3, DGUOK, DKFZp762G094, DNAJC7, DNASE1, EIF4A2, EIF4G2, EIF4H, EXOC7, EZH2, FAM120A, FAM136A, FAM36A, FARSB, FBXO38, FGFR1OP2, FIP1L1, FOXRED1, FUBP3, GALT, GATA3, GOLGA2, HIF1A, HMMR, HRB, IKZF1, ILF3, IRAK4, IRF1, KCTD13, LEF1, LUC7L, LYRM1, MALT1 e7, MAP2K7, MAP3K7, MAP4K2, MBNL2, MFF, NAE1, NCSTN, NR4A3, NRF1, NUP98, PARP6, PCM1, PLAUR, PLSCR3, PPIL5, PPP5C, PTPRC-E4, PTPRC-E6, PTS, RABL5, RAPH1, SEC16A, SFRS3, SFRS7, SLMAP, SNRNP70, STAT6, TBC1D1, TIMM8B, TIR8, TRA2A, TROVE2, UGCGL1, VAP-B, VAV1, ZNF384, ZNF496, CAMK2B, PKP2, LGMN, NRAP, VPS39, KSR1, PDLIM3, BIN1, ARFGAP2, KIF13A, and/or PICALM.
Clause 27. The viral genome of any preceding clause, wherein the alternatively-spliced exon comprises an alternatively-spliced exon comprising a polynucleotide sequence as set forth in any one of SEQ ID NOs: 23-44.
Clause 28. The viral genome of any preceding clause, wherein the flanking intron(s) is a native flanking intron(s) of the alternatively-spliced exon(s).
Clause 29. The viral genome of any preceding clause, wherein the flanking intron(s) comprises at its 5′ end a 5′ splice donor site.
Clause 30. The viral genome of any preceding clause, wherein the flanking intron(s) comprises at its 3′ end a 3′ splice acceptor site.
Clause 31. The viral genome of any preceding clause, wherein the flanking intron(s) comprises no modifications, relative to a naturally occurring intron.
Clause 32. The viral genome of any one of clauses 1-31, wherein the flanking intron(s) comprises at least one modification, relative to a naturally occurring intron.
Clause 33. The viral genome of clause 32, wherein the modification is a substitution or deletion of one or more nucleotides.
Clause 34. The viral genome of any preceding clause, wherein the flanking intron(s) is a regulated intron.
Clause 35. The viral genome of any preceding clause, wherein the flanking intron(s) comprises an intron from a gene selected from the group consisting of ABCC1, AK125149, ASCC2, BAT2D1, BBX, BRD8, BRE, C17orf70, CAMKK2, CBFB, CCAR1, CCDC7, CD6, CHTF8, COL4A3BP, COL6A3, CUGBP1, CUGBP2, CXorf45, DENND3, DGUOK, DKFZp762G094, DNAJC7, DNASE1, EIF4A2, EIF4G2, EIF4H, EXOC7, EZH2, FAM120A, FAM136A, FAM36A, FARSB, FBXO38, FGFR1OP2, FIP1L1, FOXRED1, FUBP3, GALT, GATA3, GOLGA2, HIF1A, HMMR, HRB, IKZF1, ILF3, IRAK4, IRF1, KCTD13, LEF1, LUC7L, LYRM1, MALT1 e7, MAP2K7, MAP3K7, MAP4K2, MBNL2, MFF, NAE1, NCSTN, NR4A3, NRF1, NUP98, PARP6, PCM1, PLAUR, PLSCR3, PPIL5, PPP5C, PTPRC-E4, PTPRC-E6, PTS, RABL5, RAPH1, SEC16A, SFRS3, SFRS7, SLMAP, SNRNP70, STAT6, TBC1D1, TIMM8B, TIR8, TRA2A, TROVE2, UGCGL1, VAP-B, VAV1, ZNF384, ZNF496, CAMK2B, PKP2, LGMN, NRAP, VPS39, KSR1, PDLIM3, BIN1, ARFGAP2, KIF13A, and/or PICALM.
Clause 36. The viral genome of any preceding clause, wherein the flanking intron(s) comprises an intron comprising a polynucleotide sequence as set forth in any one of SEQ ID NOs: 1-22, 103, and 104.
Clause 37. The viral genome of any one of clauses 3-36, wherein the constitutive exon is a native exon of the transgene.
Clause 38. The viral genome of any one of clauses 3-36, wherein the constitutive exon is not a native exon of the transgene.
Clause 39. The viral genome of any one of clauses 3-38, wherein the constitutive exon is from the same gene as the alternatively-spliced exon(s).
Clause 40. The viral genome of clause 39, wherein the gene is the transgene.
Clause 41. The viral genome of any one of clauses 3-38, wherein the constitutive exon is not from the same gene as the alternatively-spliced exon(s).
Clause 42. The viral genome of any one of clauses 39-41, wherein the gene is a gene selected from the group consisting of MBNL1, MBNL2, MBNL3, hnRNP A1, hnRNP A2B1, hnRNP C, hnRNP D, hnRNP DL, hnRNP F, hnRNP H, hnRNP K, hnRNP L, hnRNP M, hnRNP R, hnRNP U, FUS, TDP43, PABPN1, ATXN2, TAF15, EWSR1, MATR3, TIA1, FMRP, MTM1, MTMR2, LAMP2, KIF5A, a microdystrophin-encoding gene, C9ORF72, HTT, DNM2, BIN1, RYR1, NEB, ACTA, TPM3, TPM2, TNNT2, CFL2, KBTBD13, KLHL40, KLHL41, LMOD3, MYPN, SEPN1, TTN, SPEG, MYH7, TK2, POLG1, GAA, AGL, PYGM, SLC22A5, OCTN2, ETF, ETFH, PNPLA2, a cytochrome b oxidase-encoding gene, a cytochrome c oxidase-encoding gene, CLCN1, SCN4A, DMPK, CNBP, MYOT, LMNA, CAV3, DNAJB6, DES, TNPO3, HNRPDL, CAPN3, DYSF, an alpha-sarcoglycan-encoding gene, a beta-sarcoglycan-encoding gene, a gamma-sarcoglycan-encoding gene, a delta-sarcoglycan-encoding gene, TCAP, TRIM32, FKRP, FXN, POMT1, FKTN, POMT2, POMGnT1, DAG1, ANO5, PLEC1, TRAPPC11, GMPPB, ISPD, LIMS2, POPDC1, TOR1AIP1, POGLUT2, LAMA2, COL6A1, POMT1, POMT2, DUX4, EMD, PAX7, PMP22, MPZ, MFN2, SMCHD1, SMN, Lamin A/C (LAMN), GJB1, ABCC1, AK125149, ASCC2, BAT2D1, BBX, BRD8, BRE, C17orf70, CAMKK2, CBFB, CCAR1, CCDC7, CD6, CHTF8, COL4A3BP, COL6A3, CUGBP1, CUGBP2, CXorf45, DENND3, DGUOK, DKFZp762G094, DNAJC7, DNASE1, EIF4A2, EIF4G2, EIF4H, EXOC7, EZH2, FAM120A, FAM136A, FAM36A, FARSB, FBXO38, FGFR1OP2, FIP1L1, FOXRED1, FUBP3, GALT, GATA3, GOLGA2, HIF1A, HMMR, HRB, IKZF1, ILF3, IRAK4, IRF1, KCTD13, LEF1, LUC7L, LYRM1, MALT1 e7, MAP2K7, MAP3K7, MAP4K2, MBNL2, MFF, NAE1, NCSTN, NR4A3, NRF1, NUP98, PARP6, PCM1, PLAUR, PLSCR3, PPIL5, PPP5C, PTPRC-E4, PTPRC-E6, PTS, RABL5, RAPH1, SEC16A, SFRS3, SFRS7, SLMAP, SNRNP70, STAT6, TBC1D1, TIMM8B, TIR8, TRA2A, TROVE2, UGCGL1, VAP-B, VAV1, ZNF384, ZNF496, CAMK2B, PKP2, LGMN, NRAP, VPS39, KSR1, PDLIM3, BIN1, ARFGAP2, KIF13A, and/or PICALM.
Clause 43. The viral genome of any preceding clause, further comprising a promoter.
Clause 44. The viral genome of clause 43, wherein the promoter is a native promoter of the transgene.
Clause 45. The viral genome of clause 43, wherein the promoter is not a native promoter of the transgene.
Clause 46. The viral genome of any one of clauses 43-45, wherein the promoter is constitutive.
Clause 47. The viral genome of any one of clauses 43-45, wherein the promoter is inducible.
Clause 48. The viral genome of any one of clauses 43-47, wherein the promoter is a tissue-specific promoter.
Clause 49. The viral genome of any one of clauses 43-48, wherein the promoter is selected from the group consisting of an EF1 alpha promoter, beta actin promoter, CMV, muscle creatine kinase promoter, C5-12 muscle promoter, MHCK7, CBh, synapsin, MECP2, enolase, GFAP, Desmin, and CAG promoter.
Clause 50. The viral genome of any one of clauses 43-49, wherein the promoter drives expression of the transgene.
Clause 51. The viral genome of any one of clauses 1-50, wherein the coding region of the transgene comprises at least one modification, relative to a coding region of a naturally occurring gene.
Clause 52. The viral genome of clause 51, wherein the modification is a substitution or deletion of at least one nucleotide.
Clause 53. The viral genome of clause 51 or clause 52, wherein the coding region of the transgene comprises a deletion of a native start codon, or a portion thereof.
Clause 54. The viral genome of any preceding clause, wherein the transgene comprises one or more recombinant introns.
Clause 55. The viral genome of any one of clauses 51-54, wherein the naturally occurring gene is a gene selected from the group consisting of MBNL1, MBNL2, MBNL3, hnRNP A1, hnRNP A2B1, hnRNP C, hnRNP D, hnRNP DL, hnRNP F, hnRNP H, hnRNP K, hnRNP L, hnRNP M, hnRNP R, hnRNP U, FUS, TDP43, PABPN1, ATXN2, TAF15, EWSR1, MATR3, TIA1, FMRP, MTM1, MTMR2, LAMP2, KIF5A, a microdystrophin-encoding gene, C9ORF72, HTT, DNM2, BIN1, RYR1, NEB, ACTA, TPM3, TPM2, TNNT2, CFL2, KBTBD13, KLHL40, KLHL41, LMOD3, MYPN, SEPN1, TTN, SPEG, MYH7, TK2, POLG1, GAA, AGL, PYGM, SLC22A5, OCTN2, ETF, ETFH, PNPLA2, a cytochrome b oxidase-encoding gene, a cytochrome c oxidase-encoding gene, CLCN1, SCN4A, DMPK, CNBP, MYOT, LMNA, CAV3, DNAJB6, DES, TNPO3, HNRPDL, CAPN3, DYSF, an alpha-sarcoglycan-encoding gene, a beta-sarcoglycan-encoding gene, a gamma-sarcoglycan-encoding gene, a delta-sarcoglycan-encoding gene, TCAP, TRIM32, FKRP, FXN, POMT1, FKTN, POMT2, POMGnT1, DAG1, ANO5, PLEC1, TRAPPC11, GMPPB, ISPD, LIMS2, POPDC1, TOR1AIP1, POGLUT2, LAMA2, COL6A1, POMT1, POMT2, DUX4, EMD, PAX7, PMP22, MPZ, MFN2, SMCHD1, SMN, Lamin A/C (LAMN), and/or GJB1.
Clause 56. The viral genome of any preceding clause, wherein the viral genome is a genome from a recombinant adeno-associated virus (rAAV), lentivirus, retrovirus, or foamyvirus.
Clause 57. The viral genome of clause 56, wherein the viral genome is from an rAAV.
Clause 58. The viral genome of clause 56 or clause 57, wherein the transgene is flanked by AAV inverted terminal repeat (ITR) sequences.
Clause 59. The viral genome of clause 58, wherein the ITR sequences comprise AAV1, AAV2, AAV5, AAV7, AAV8, or AAV9 ITR sequences.
Clause 60. The viral genome of clause 56, wherein the viral genome is from a lentivirus.
Clause 61. The viral genome of clause 60, wherein the alternatively-spliced exon cassette is located on the minus strand of the lentivirus genome.
Clause 62. The viral genome of any preceding clause, further comprising a 3′ untranslated region (UTR) that is endogenous or exogenous to the transgene.
Clause 63. The viral genome of clause 62, wherein the exogenous 3′ UTR is the 3′ UTR from bovine growth hormone, SV40, EBV, or Myc.
Clause 64. A viral particle comprising a viral genome according to any preceding clause.
Clause 65. The viral particle of clause 64, wherein the viral particle is an rAAV particle.
Clause 66. The viral particle of clause 65, wherein the rAAV particle comprises AAV serotype 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10.
Clause 67. The viral particle of clause 65, wherein the rAAV particle comprises AAV derivative or pseudotype AAV2-AAV3 hybrid, AAVrh.10, AAVhu.14, AAV3a/3b, AAVrh32.33, AAV-HSC15, AAV-HSC17, AAVhu.37, AAVrh.8, CHt-P6, AAV2.5, AAV6.2, AAV2i8, AAV-HSC15/17, AAVM41, AAV9.45, AAV6(Y445F/Y731F), AAV2.5T, AAV-HAE1/2, AAV clone 32/83, AAV5hH10, AAV2 (Y→F), AAV8 (Y733F), AAV2.15, AAV2.4, AAVM41, and AAVr3.45.
Clause 68. The viral particle of any one of clauses 64-67, further comprising at least one helper plasmid.
Clause 69. The viral particle of clause 68, wherein the helper plasmid comprises a rep gene and a cap gene.
Clause 70. The viral particle of clause 69, wherein the rep gene encodes Rep78, Rep68, Rep52, or Rep40.
Clause 71. The viral particle of clause 69 or clause 70, wherein the cap gene encodes a VP1, VP2, and/or VP3 region of the viral capsid protein.
Clause 72. The viral particle of any one of clauses 68-71, wherein the viral particle comprises two helper plasmids.
Clause 73. The viral particle of clause 72, wherein the first helper plasmid comprises a rep gene and a cap gene and the second helper plasmid comprises a E1a gene, a E1b gene, a E4 gene, a E2a gene, and a VA gene.
Clause 74. The viral particle of clause 64, wherein the viral particle is a recombinant lentivirus particle.
Clause 75. The viral particle of clause 74, wherein the lentivirus is a human immunodeficiency virus (HIV1 or HIV2), a feline immunodeficiency virus (FIV), a bovine immunodeficiency virus (BIV), a caprine arthritis encephalitis virus, an equine infectious anemia virus, a jembrana disease virus, a puma lentivirus, a simian immunodeficiency virus, or a visna-maedi virus.
Clause 76. The viral particle of clause 74 or clause 75, further comprising a viral envelope.
Clause 77. A method of treating a disease or condition in a subject comprising administering a viral genome according to any one of clauses 1-63 or a viral particle according to any one of clauses 64-76 to the subject.
Clause 78. The method of clause 77, wherein the subject is a mammal.
Clause 79. The method of clause 78, wherein the mammal is a human.
Clause 80. The method of any one of clauses 77-79, wherein the viral genome or viral particle is administered to the subject at least one time.
Clause 81. The method of clause 80, wherein the viral genome or viral particle is administered to the subject 2, 3, 4, 5, 6, 7, 8, 9, or 10 times.
Clause 82. The method of any one of clauses 77-81, wherein the viral genome or viral particle is administered to the subject parenterally, subcutaneously, intraocularly, intravitreally, subretinally, intravenously (IV), intracerebro-ventricularly, intramuscularly, intrathecally (IT), intracisternally, intraperitoneally, enterally, via inhalation, topically, or by direct injection to one or more cells, tissues, or organs.
Clause 83. The method of any one of clauses 77-82, wherein the viral genome or viral particle is administered to the subject by intravenous injection, intramuscular injection, intrathecal injection, or intravitreal injection.
Clause 84. The method of any one of clauses 77-83, wherein the disease or condition is a disease or condition selected from the group consisting of Dentatorubral-pallido-luysian atrophy (DRPLA), myotonic dystrophy type 1 (DM1), myotonic dystrophy type 2 (DM2), Fragile X syndrome of mental retardation (FMR1), Fragile X tremor ataxia syndrome (FXTAS), FRAXE mental retardation (FMR2), Friedreich's ataxia (FRDA), Huntington disease (HD), Huntington disease-like 2 (HDL2), Oculopharyngeal muscular dystrophy (OPMD), Myoclonic epilepsy type 1, Alzheimer's disease, ALS/FTD, spinocerebellar ataxia type 1 (SCA1), spinocerebellar ataxia type 2 (SCA2), spinocerebellar ataxia type 3 (SCA3), spinocerebellar ataxia type 6 (SCA6), spinocerebellar ataxia type 7 (SCA7), spinocerebellar ataxia type 8 (SCA8), spinocerebellar ataxia type 10 (SCA10), spinocerebellar ataxia type 12 (SCA12), spinocerebellar ataxia type 17 (SCA17), Syndromic/non-syndromic X-linked mental retardation, Emery-Dreifuss muscular dystrophy type 2, familial partial lipodystrophy, limb girdle muscular dystrophy type 1B, dilated cardiomyopathy, familial partial lipodystrophy, Charcot-Marie-Tooth disorder type 2B1, mandibuloacral dysplasia, childhood progeria syndrome (Hutchinson-Gilford syndrome), Werner syndrome, Dilated cardiomyopathy (DCM), Hypertrophic cardiomyopathy (HCM), Restrictive cardiomyopathy (RCM), Left Ventricular Non-compaction (LVNC), Arrhythmogenic Right Ventricular Dysplasia (ARVD), takotsubo cardiomyopathy, Duchenne muscular dystrophy, Becker muscular dystrophy, Limb-girdle muscular dystrophy, Facioscapulohumeral muscular dystrophy, Congenital muscular dystrophy, Oculopharyngeal muscular dystrophy, Distal muscular dystrophy, Emery-Dreifuss muscular dystrophy, dementia, Parkinson's disease (PD), a PD-related disorder, Prion disease, a motor neuron disease (MND), Progressive bulbar palsy (PBP), Progressive muscular atrophy (PMA), Primary lateral sclerosis (PLS), Spinal muscular atrophy (SMA), a bladder cancer, a breast cancer, a colorectal cancer, a kidney cancer, a lung cancer, a lymphoma, a melanoma, an oral cancer, an ovarian cancer, an oropharyngeal cancer, a pancreatic cancer, a prostate cancer, a thyroid cancer, a uterine cancer, Down syndrome, Prader-Willi Syndrome (PWS), Bloom Syndrome, Cockayne Syndrome Type I-216400, Cockayne Syndrome Type III, Cockayne Syndrome Type I, Hutchinson-Gilford Progeria Syndrome, Mandibuloacral Dysplasia with Type A Lipodystrophy, Progeria, Adult Onset Progeroid Syndrome, Neonatal Rothmund-Thomson Syndrome, Seip Syndrome, Werner Syndrome, Replication Focus-Forming Activity 1, myotubular myopathy, Danon Disease, and/or centronuclear myopathy.
Clause 85. A method of regulating transgene expression using a viral vector comprising a viral genome, the method comprising:
These and other aspects of the application are illustrated by the following non-limiting examples.
Virally-mediated gene therapies that seek to deliver a protein cargo commonly package a coding region of interest along with a 5′ untranslated region, 3′ untranslated region, a promoter that will drive the gene of interest, and, sometimes, a constitutive intron to enhance nuclear export and RNA stability. However, almost all multi-exonic genes in the human genome (>95%) are alternatively spliced such that multiple isoforms are generated from a single gene locus. These isoforms may exhibit distinct functions or expression patterns in different cellular conditions. Therefore, they comprise an important aspect of gene regulation and allow multiple transcript species to be generated from a single locus.
There are many descriptions of tissue-specific exons in the literature; these types of data have been derived from microarray or RNA-seq analyses of human tissues, or other conditions in which a perturbation is made and the transcriptome is profiled. The inclusion level of an exon is commonly described by “percent spliced in” (psi), which is the percentage of mRNAs transcribed from a locus that are spliced to contain an alternatively-spliced exon of interest. For example, an exon that has a psi of 10% in a given tissue is included in the mature mRNA 10% of the time. Some examples of tissue-specific or tissue-biased exons include TPM1 exon 2 (<5% psi in heart but >95% psi in colon), or SLC25A3 exon 3 (>90% psi in heart but <5% psi in brain). Exons with a strong shift in psi between tissues are sometimes referred to as “switch-like” exons. Switch-like exons tend to exhibit greater phylogenetic conservation in their proximal introns, as compared to constitutively spliced exons or alternatively-spliced exons that do not exhibit switch-like behavior.
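By way of non-limiting illustration, the psi calculation described above may be sketched as follows. The sketch assumes that reads supporting inclusion and exclusion of the exon have already been counted; the read counts, the function names, and the 80-point threshold used to flag a “switch-like” exon are hypothetical and are not taken from the present disclosure. Dedicated splicing-quantification tools typically apply further normalization (e.g., for the number of junctions supporting each isoform), which is omitted here.

```python
def percent_spliced_in(inclusion_reads: int, exclusion_reads: int) -> float:
    """Return psi as a percentage: inclusion / (inclusion + exclusion) * 100."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no informative reads for this exon")
    return 100.0 * inclusion_reads / total


def is_switch_like(psi_a: float, psi_b: float, min_delta: float = 80.0) -> bool:
    """Flag an exon whose psi differs between two tissues by at least min_delta points.

    The 80-point default is an illustrative assumption, not a value from the text.
    """
    return abs(psi_a - psi_b) >= min_delta


# Hypothetical read counts mimicking an exon that is mostly skipped in heart
# and mostly included in colon.
heart_psi = percent_spliced_in(inclusion_reads=4, exclusion_reads=96)
colon_psi = percent_spliced_in(inclusion_reads=97, exclusion_reads=3)
switch_like = is_switch_like(heart_psi, colon_psi)
```

In this hypothetical case the exon has a low psi in heart and a high psi in colon, so it would be flagged as switch-like under the assumed threshold.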
To date, tissue-specific alternative splicing regulation has not been used to control virally-mediated gene therapies, and there has been no straightforward method for doing so. Described here are specific sequences that may confer tissue-specific regulation for virally-mediated gene therapies (e.g., AAV; lentivirus). In some embodiments, the virus is an adeno-associated virus (AAV). In embodiments where the virus is an AAV, the orientation of the cargo is invariant. This is because the AAV ITRs are symmetric. In some embodiments, the virus is a lentivirus. In embodiments where the virus is a lentivirus, a cargo with spliced introns must be placed on the minus strand. This is because lentivirus packaging proceeds through an RNA intermediate, and the introns must not be lost. Examples 1-6 describe an AAV-mediated gene therapy; however, it should be understood that either an AAV or a lentivirus may be utilized according to the methods described in the Examples.
Alternatively-spliced exons and their flanking introns can be incorporated into AAV cargoes by at least two distinct methods to confer similar tissue-specific behavior. Both approaches utilize a skipped exon “trio” where there are two flanking constitutive exons and the middle exon is alternative.
In the first approach, the exon trio is placed at the start of the AAV cargo and an ATG or part of an ATG translation start codon is introduced at the end of the middle (alternative) exon. The downstream (constitutive) exon is omitted, but the transgene cargo of interest sans ATG is inserted in its place, such that inclusion of the alternatively-spliced exon results in joining of the ATG from the alternatively-spliced exon with the rest of the transgene of interest upon splicing. ATGs that lie upstream of the intended start codon are mutated or removed. Thus, this results in translation of the transgene only in settings that include the alternatively-spliced exon.
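The logic of the first approach can be illustrated with a toy model, offered only as a sketch. All sequences below are hypothetical placeholders rather than sequences from the disclosure, and the model considers only the presence of a start codon in the spliced transcript (it ignores Kozak context, splice site strength, and reading frame beyond the first ATG). The start codon is split so that the alternative exon ends with its first k nucleotides and the cargo supplies the remainder.

```python
START = "ATG"

# Hypothetical toy sequences chosen so that no ATG occurs outside the engineered one.
UPSTREAM_EXON = "GGCCTCAGTC"
ALT_EXON_BODY = "CCTTCCAG"
CARGO_BODY = "GCTAGCAAAGGCTAA"  # cargo coding sequence without its ATG


def split_start_codon(k: int) -> tuple[str, str]:
    """Split ATG so the alternative exon ends with k nt and the cargo supplies the rest."""
    if k not in (1, 2, 3):
        raise ValueError("k must be 1, 2, or 3")
    return START[:k], START[k:]


def spliced_mrna(upstream_exon: str, alt_exon: str, cargo: str, include: bool) -> str:
    """Join exons as in the mature mRNA; introns are omitted because they are removed."""
    return upstream_exon + (alt_exon if include else "") + cargo


def isoforms(k: int) -> tuple[str, str]:
    """Return (included, skipped) mature mRNAs for a start codon split after k nt."""
    exon_tail, cargo_head = split_start_codon(k)
    alt_exon = ALT_EXON_BODY + exon_tail
    cargo = cargo_head + CARGO_BODY
    return (
        spliced_mrna(UPSTREAM_EXON, alt_exon, cargo, include=True),
        spliced_mrna(UPSTREAM_EXON, alt_exon, cargo, include=False),
    )


if __name__ == "__main__":
    for k in (1, 2, 3):
        included, skipped = isoforms(k)
        print(k, "included has ATG:", START in included, "| skipped has ATG:", START in skipped)
```

In this sketch the start codon is reconstituted only when the alternative exon is included, which mirrors the requirement that upstream ATGs be mutated or removed so that no other initiation site is available.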
In the second approach, the alternatively-spliced exon and flanking introns are placed within the coding region of the AAV cargo. A stop codon is introduced within the alternatively-spliced exon such that it follows nonsense-mediated decay (NMD) rules, and thus elicits NMD when included. This results in productive translation of the transgene only in settings that exclude the alternatively-spliced exon. If the exon is too short to elicit NMD, another constitutive intron can be placed downstream in the transgene such that NMD rules (e.g., the stop codon should be >50 nucleotides from the next splice junction) are satisfied.
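The NMD rule referred to above can be expressed as a simple positional check, sketched here under stated assumptions. Coordinates are 0-based positions in the spliced mRNA; `stop_codon_end` is the index immediately after the stop codon and each junction is the index of the first nucleotide of the downstream exon. These conventions, the function name, and the example coordinates are assumptions for illustration; only the >50 nucleotide threshold comes from the text.

```python
def elicits_nmd(stop_codon_end: int, junctions: list[int], min_distance: int = 50) -> bool:
    """Return True if an exon-exon junction lies more than min_distance nt
    downstream of the stop codon, the condition assumed here to trigger NMD."""
    return any(junction - stop_codon_end > min_distance for junction in junctions)


if __name__ == "__main__":
    # Short alternative exon: the only downstream junction is 30 nt past the stop codon.
    print(elicits_nmd(stop_codon_end=100, junctions=[80, 130]))
    # Adding a constitutive intron further downstream places a junction 160 nt away.
    print(elicits_nmd(stop_codon_end=100, junctions=[80, 130, 260]))
```

The second call shows why placing an additional constitutive intron downstream can satisfy the rule when the alternative exon alone is too short.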
These two approaches may be applied not only to tissue-specific exons, but also exons that respond to different cellular states or conditions. For example, it may be desirable to confer regulatory behavior that occurs in:
The general approach described herein is advantageous over protein-based regulatory strategies because no additional protein components are necessary to confer regulation; all regulation occurs using endogenous machinery, and no neo-antigens are generated that could be immunogenic. All of the regulation occurs at the RNA level.
Commonly used methods to regulate tissue-specific expression include tissue-specific promoters and microRNAs. However, these methods are not quite specific enough to provide the level of control needed for certain therapeutic interventions. In contrast, there are exons that show close to 0% psi in heart but >90% psi in skeletal muscle. A regulatory cassette is generated using alternatively-spliced exons that allows an AAV transgene cargo to be expressed in skeletal muscle, but not in the heart. The exons shown in Table 1 will be tested to evaluate differential expression in skeletal versus heart tissue. These exons are good candidates for this type of tissue-specific behavior because they show robust switch-like behavior between heart and muscle. Some of the exons shown in Table 1 are conserved between mouse and human, and, correspondingly, the switch-like behavior is conserved across species. In some embodiments, the intronic sequences that flank the exons shown in Table 1 are also included as part of the regulatory cassette.
These exons were chosen because of their switch-like behavior between heart and muscle, and because they are all <250 nucleotides in length, with reasonably conserved intronic sequences that flank the exons. Additionally, these exons are all amenable to being cloned out of their endogenous context and placed into a minigene to act as regulatory cassettes to control AAV cargo expression. It is expected that incorporation of these exons into an AAV-delivered transgene will enable production of a protein cargo in the skeletal muscle and will result in decreased production of that cargo in the heart.
A regulatory cassette (e.g., an alternatively-spliced exon cassette) is designed that exhibits dynamic behavior during T-cell activation. In some embodiments, such a cassette controls regulators of T-cell biology in the context of lentiviral-based cargoes (e.g., CAR-T approaches). For example, upon T-cell activation, a cargo produced using a regulatory cassette as described herein modulates the outcome of that T-cell. Exons from genes that have been previously shown to exhibit splicing changes upon T-cell activation, as published in the literature and shown in Table 2, will be tested.
In some embodiments, the intronic sequences flanking the exons shown in Table 2, along with the exons, will be introduced into a lentivirus splicing reporter and tested in resting and activated T-cells to assess activity. Sequence cassettes that exhibit behavior that is similar to their endogenous counterparts will be further developed to control heterologous cargoes. Exons from these genes were selected because they have been observed to change in splicing behavior following T-cell activation. It is expected that, when taken out of their endogenous context and placed within an AAV-delivered transgene, some of the exons will recapitulate behavior in activated T-cells.
A broad screen was performed to identify tissue-specific exon cassettes that exhibit similar behavior when placed within the context of an AAV cargo. Candidate exons were identified using RNAseq data; exons that are <200 nucleotides long and exhibit high conservation across multiple species were chosen. These alternatively-spliced exons and their proximal introns are packaged into a heterologous context such that their inclusion level can be assessed by RT-PCR or deep sequencing. Nucleotide barcodes are included in the 3′ untranslated region such that the identity of each exon cassette can be determined by deep sequencing the barcode. The exon cassettes are packaged as a pool into an AAV library and administered to mice. Tissues or cell types of interest are harvested, and RNA originating from the AAV transgenes is prepared for deep sequencing such that psi values can be associated with each barcode in each tissue. Exon cassettes that exhibit tissue-specific behaviors of interest are identified using this procedure.
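The association of psi values with barcodes in each tissue, as described above, might be sketched as follows. The input format (one record per sequencing read giving tissue, barcode, and whether the read supports inclusion), the function names, and the 50-point difference used to call a hit are all assumptions made for this illustration; the disclosure does not specify a data format or threshold.

```python
from collections import defaultdict


def psi_by_barcode(reads):
    """Map (tissue, barcode) -> psi from (tissue, barcode, included) read records."""
    counts = defaultdict(lambda: [0, 0])  # (tissue, barcode) -> [included, skipped]
    for tissue, barcode, included in reads:
        counts[(tissue, barcode)][0 if included else 1] += 1
    return {key: 100.0 * inc / (inc + skip) for key, (inc, skip) in counts.items()}


def tissue_biased(psi, high_tissue, low_tissue, min_delta=50.0):
    """Return barcodes whose psi in high_tissue exceeds psi in low_tissue by min_delta."""
    hits = []
    for barcode in sorted({bc for (_, bc) in psi}):
        high = psi.get((high_tissue, barcode))
        low = psi.get((low_tissue, barcode))
        if high is not None and low is not None and high - low >= min_delta:
            hits.append(barcode)
    return hits


if __name__ == "__main__":
    # Hypothetical reads for two barcoded cassettes in two tissues.
    reads = (
        [("muscle", "AAC", True)] * 9 + [("muscle", "AAC", False)] * 1
        + [("heart", "AAC", True)] * 1 + [("heart", "AAC", False)] * 9
        + [("muscle", "GGT", True)] * 5 + [("muscle", "GGT", False)] * 5
        + [("heart", "GGT", True)] * 5 + [("heart", "GGT", False)] * 5
    )
    psi = psi_by_barcode(reads)
    print(tissue_biased(psi, high_tissue="muscle", low_tissue="heart"))
```

A real pipeline would additionally need to extract the barcode and the junction call from each read and would likely filter barcodes with few reads; both steps are outside the scope of this sketch.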
Examples of datasets used to identify tissue specific exons can be found in Wang, et al. (2008), Alternative isoform regulation in human tissue transcriptomes, Nature 456(7221): 470-76; Li, et al. (2017), A Comprehensive Mouse Transcriptomic BodyMap across 17 Tissues by RNA-seq, Sci. Rep. 7(1): 4200; and the GEO dataset entitled “[E-MTAB-513] Illumina Human Body Map 2.0 Project” (Series GSE30611).
A general research operating procedure for how to develop gene therapies that take advantage of alternative regulation is also provided. This approach can be generalized to facilitate the identification of particular sequences that confer regulatory behavior that is desired. In some embodiments, it is desirable to prevent over-dosing or over-expression in a given tissue.
The procedure is as follows:
A major challenge in the gene therapy field is to develop strategies yielding precise cargo expression—in levels, location, and timing. Because functional transduction of many tissues and cell types by viral vectors remains relatively inefficient, existing cargo sequences often incorporate strong promoters and minimal 5′ and 3′ UTR elements that enhance RNA stability and translation efficiency, aiming to maximize gene expression levels. However, over-expression of some cargoes in certain cell types and tissues may lead to toxicity, thus narrowing or eliminating the therapeutic windows available to treat disease. Solutions to achieve cell type-specific expression include use of tissue-specific promoters, incorporation of regulatory elements within mRNA sequence (e.g., microRNA binding sites), and packaging of cargoes into capsid variants exhibiting cell type-specific tropisms. These approaches, however, provide limited control, and fail to incorporate certain basic mechanisms of gene regulation ubiquitously employed by the naturally-occurring genome. One of these mechanisms is alternative splicing, which has been relatively unexplored as a mechanism by which to regulate gene therapy cargo expression.
Alternative splicing occurs in ~95% of all multi-exonic human genes, with a major portion of regulated exons showing a tissue or cell type-specific bias (1). The most studied form of alternative splicing is the “skipped exon” or “cassette exon”, in which an alternative exon can be included or excluded between a pair of constitutive exons. The present inventors have identified a subset of “switch-like” cassette exons that show differences in inclusion level between tissues; these exons tend to preserve reading frame more frequently than other cassette exons and display increased phylogenetic conservation in the ~200 intronic nucleotides both upstream and downstream of these exons.
Regulation of alternative splicing is controlled by core spliceosomal machinery, along with RNA binding proteins (RBPs); many RBPs themselves show tissue-specific expression profiles (2). Mechanistic studies of alternative splicing regulation are often performed by cloning the cassette exon sequence (e.g., upstream intron, cassette exon, and downstream intron) into a heterologous context in which the flanking constitutive exons are taken from a separate gene (3). For example, beta globin exons 1-3 (4) and SMN1 exons 6-8 (5,6) are commonly employed exon/intron contexts into which cassette exon sequences have been incorporated for further study. In addition, the behavior of alternatively spliced exons can be recapitulated in heterologous contexts and has even been re-purposed to control fluorescent reporter expression (7). Similar concepts have been used to regulate AAV-mediated gene expression in vivo using alternative splicing, wherein expression of a target gene is controlled via exposure to, for example, an aptamer ligand, such as a small molecule (8,9). However, no attempts have been made to use exons displaying cell- or tissue-specific, or endogenous activity-dependent, splicing patterns to regulate gene therapy cargoes.
The current gene therapy landscape is focused on a multitude of disease indications, but several broad areas could benefit from improved cell or tissue type-specific regulation. Firstly, observed toxicities of AAV-delivered therapies in dorsal root ganglia suggest that minimization of heterologous cargo expression in this tissue could be beneficial, even if a major portion of the toxicity is capsid-mediated. Secondly, a great number of gene therapies are being developed for neuromuscular or cardiac indications; however, some cargoes that are therapeutic in one tissue may be toxic when over-expressed in the other, and there are limited approaches available to fully de-target either tissue.
Described herein is a general approach to re-purpose, engineer, and optimize alternative splicing cassettes to de-target specific tissues and cell types. Alternative splicing cassettes were engineered to control protein cargo expression in the context of AAV. These cassettes were designed such that incorporation of the AUG translation initiation codon within the cassette exon would lead to cargo production upon inclusion (
The approach described herein, Tissue-specific Alternative splicing to Restrict Globally Expressed Therapeutic (TARGET), is broadly applicable to any set of tissues or cell types and can be applied to any cargo that satisfies viral packaging limit restrictions in any virus that supports packaging of splicing-competent transgenes. Some viruses that undergo splicing during packaging (e.g., lentivirus) would require encoding of the transgene on the minus strand of the viral genome to avoid removal of introns during the packaging process.
RNAseq datasets were analyzed to identify candidate exons that display extreme “switch-like” behavior between human heart (10) and skeletal muscle (SRA project SRP082676). These candidates were further filtered for those that were also conserved in mouse, and those which displayed similar percent spliced in (psi) values in mouse heart (low psi) and skeletal muscle (high psi). A set of 11 cassette exons was selected and ~500 nucleotides of total sequence, including the cassette exon and immediately adjacent flanking introns, were cloned into the SMN1 exon 6/intron 7 context, which has been previously used to study alternative splicing regulation (11) (
For each of the alternative exon candidates, the final nucleotides of the exon were altered to either be “ATG”, “AT”, or “A” (depending on which nucleotides naturally occurred), such that initiation of translation could be achieved when the exon was included. Additionally, any upstream ATGs within the alternative exon were removed by substitution or deletion, to avoid translation initiation at an earlier location. In the case of exon skipping, downstream ATGs within the MTM1 coding sequence might lead to translation of unwanted protein fragments; thus, stop codons were introduced within 15 nucleotides of each of these ATGs, such that translation would terminate within just a few (<5) amino acids (
Because changes to the end of the alternative exon sequence can affect the strength of the alternative exon's 5′ splice site, both the original and altered 5′ splice sites of the alternative exon were scored using MaxEntScan (14) and compensatory mutations were made to the intronic bases of the alternative exon's 5′ splice site to compensate for any potential weakening of the splice site signal (
Barcoding Method to Uniquely Identify Each Alternative Exon within the Pool of Candidates
A unique nucleotide “barcode” sequence was introduced within the MTM1 coding sequence such that it preserved the amino acid composition of MTM1, but also uniquely identified the upstream alternative exon cassette (
All 11 alternative exon candidates (see Table 3 for exon coordinates, psi values, translational initiation scores, and sequence alterations) were packaged into AAV9 as a pool and administered to mice systemically (retro-orbital injection, 4 C57BL/6 mice and 2 FVB mice at 6 weeks of age, 2e13 vg/kg) and intramuscularly (4 C57BL/6 mice at 6 weeks of age, tibialis anterior, 2e11 vg total into one leg). Mice were sacrificed after 4 weeks; the heart and liver were harvested from the systemically injected animals and the tibialis anterior (TA) was harvested from the intramuscularly injected animals. Reverse transcription and polymerase chain reaction were performed using primers targeting the upstream SMN1 exon 6 and also a region in MTM1 3′ of the barcode. Illumina adapters with unique indexes to identify each sample were incorporated into the final amplicon libraries and then sequenced.
Psi values were computed by associating junction reads to barcodes and computing the frequency of inclusion versus exclusion of each exon (
To assess whether the time following intramuscular administration might influence the psi value assessment, the same library was administered into 7 additional mice intramuscularly (2e11 vg total into one tibialis anterior (TA) of each mouse). The TAs were harvested 1, 2, 3, or 4 weeks following dosing. Sequencing libraries were generated and the psi values were correlated for each exon candidate across all samples. The results were strongly concordant, regardless of what time point was analyzed (
Screens to Identify Sequences that Further De-Target Heart but Maximize Expression in Skeletal Muscle
Based upon the initial hits of the first screen, described above, alternative exon cassette sequences were identified which might further enhance the switch-like behavior in heart versus skeletal muscle. A higher throughput approach was taken to simultaneously screen many sequence variants of candidate alternative exon cassettes. Core splice site sequences as well as intronic/exonic sequences play important roles in splicing decisions, by modulating the ability of specific trans-factors to bind a pre-mRNA. The core splicing signals, which include the 3′ splice site, 5′ splice site, and branch point, can all influence the frequency with which an alternatively spliced exon is chosen. These core splicing signals are recognized by the U1, U5, and U2 snRNPs, among other components; but they may also be bound by other RNA binding proteins (RBPs), which play roles in modulating how well the basal splicing machinery can recognize the core signals. Furthermore, RBPs can bind to intronic or exonic sequence in the vicinity of these core splicing signals to affect overall splicing decisions. The abundance of certain RBPs in certain contexts can therefore influence splicing patterns in those contexts. To aid efforts to further optimize sequences that display switch-like alternative splicing in heart versus skeletal muscle, the expression level of RNA binding proteins in these 2 tissues was analyzed (
The high throughput screening approach described herein was first applied to BIN1 exon 11 because it showed the largest dynamic range in psi between heart and skeletal muscle (see
Given all of the above information, the 3′ splice site, 5′ splice site (
Viruses were generated using the eMyoAAV capsid (24) and administered to mice at a titer of 2.5e13 vg/kg. Heart, tibialis anterior, and triceps muscles were collected from mice sacrificed 3 weeks following administration. Sequencing libraries were prepared by RT-PCR and sequenced by Illumina sequencing. Psi values were computed for each barcode and a psi value for each variant was obtained by averaging the psi across every barcode for each variant. The psi value for each variant is shown for 2 heart samples in a scatter plot (
The same BIN1 exon 11 variants were also tested with a different cargo, CAPN3. A separate AAV library was generated in which all 672 BIN1 variants (Table 4) were cloned upstream of the CAPN3 coding sequence, analogously to how they were cloned upstream of the MTM1 coding sequence. Similarly, a 10 nucleotide barcode was embedded within the CAPN3 coding sequence to identify each splice variant. The mean psi values across heart, gastrocnemius, and tibialis anterior tissues from 4 animals were plotted as scatters (
Application of this High Throughput Screening Approach to Identify Alternative Exon Cassettes with Regulated Splicing Patterns in Additional Tissues
The ability to limit or augment gene expression in a variety of tissues would be useful for gene therapies, and some notable tissues include the liver, different brain regions, dorsal root ganglia (DRG), skeletal muscle, cardiac muscle, and smooth muscle. GTEx data and a human DRG-specific dataset (SRA runs SRR8533960-SRR8533986) were mined to identify 110 alternative exons that show differential inclusion in these tissues (Table 5), and 96 exon cassettes were selected to test for splicing behavior within these tissues. A procedure similar to that outlined above was followed; alternative exons that are <200 nucleotides in length were selected, all ATGs within the alternative exon body were removed, and the end of each alternative exon was modified to terminate in ATG. The 5′ splice sites of the new exons were scored and new variants for each alternative exon cassette were designed that were 1 bit weaker, similar, and 1 bit stronger than the endogenous 5′ splice site in the absence of adjustments to generate a new ATG. Approximately 500 nucleotides of total sequence were included from each alternative exon cassette, including the alternative exon itself and immediately flanking intronic regions, and were cloned into the SMN1 exon 6/intron 7 context (as above). EGFP was used as the downstream cargo (rather than MTM1). A similar 10 nucleotide barcode was incorporated into the EGFP coding sequence to allow for identification of each alternative exon cassette. Two versions of the library were generated; one driven by an MHCK7 promoter to bias expression towards cardiac, smooth, and skeletal muscles, and the other driven by a CBh promoter to drive ubiquitous expression. The MHCK7 promoter-driven construct will be packaged by the eMyoAAV capsid to bias delivery to muscle, whereas the CBh promoter-driven construct will be packaged by the PHP.eB capsid (25) to bias delivery to the nervous system, including DRG.
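The exon preparation steps described above (removing ATGs within the alternative exon body and making the exon terminate in ATG) can be sketched as a simple sequence edit. This is a minimal illustration under stated assumptions: internal ATGs are disrupted by a G-to-C substitution and the final three nucleotides are overwritten with ATG. The disclosure does not specify which substitutions or deletions were used, and this sketch does not rescore or compensate the 5′ splice site. The function name and example sequence are hypothetical.

```python
def prepare_alternative_exon(exon: str) -> str:
    """Return an exon of the same length with no internal ATG and a terminal ATG.

    Assumptions for illustration: each internal ATG is changed to ATC, and the
    last three nucleotides are replaced by ATG.
    """
    exon = exon.upper()
    if len(exon) < 3:
        raise ValueError("exon must be at least 3 nucleotides long")
    # Changing G to C cannot create a new ATG, because ATG contains no C.
    body = exon[:-3].replace("ATG", "ATC")
    return body + "ATG"


if __name__ == "__main__":
    original = "CCATGGATGTTCAG"  # hypothetical exon with two internal ATGs
    edited = prepare_alternative_exon(original)
    print(edited, edited.count("ATG"))
```

Because edits near the exon end alter the 5′ splice site, any sequence produced this way would still need to be rescored and, where appropriate, compensated as described in the text.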
Applications of this High Throughput Screening Approach to Identification of Alternative Exon Cassettes that can be Regulated by T-Cell Activation
The ability to increase or decrease exon inclusion in response to T-cell activation provides utility for various therapeutic purposes, such as CAR-T therapy or other immunotherapies. A major challenge in the context of CAR-T for solid tumors is T-cell exhaustion, a state in which the engineered T-cells no longer exhibit sufficient potency to eliminate tumor cells expressing the neoantigen due to co-expression of multiple inhibitory receptors. It has been previously shown that transcription factors such as T-bet can repress the expression of these inhibitory receptors and can instead sustain the activity of T-cells during chronic infection (26) and also enhance antitumor activity and limit T-cell exhaustion in CAR-T cells (27). However, constitutive over-expression of T-bet may also lead to undesired or autoimmune-like responses (28,29), and thus the addition of T-bet must be regulated in a context-dependent manner. Thus, an exon that is activated by T-cell activation might be engineered to control translation of T-bet or other cargoes that can modulate the state of the T-cell, thereby preventing or limiting T-cell exhaustion.
Publicly available transcriptome datasets in which T-cells were transcriptionally profiled before and after activation (30) were mined, and a set of 98 alternative exon cassettes were chosen (Table 6) to test for splicing behavior within the context of a lentivirus that can integrate into the T-cell genome. Two intronic regions, along with splice site-proximal exon fragments, were selected and fused together to form a new exon cassette. These alternative exon cassettes will be packaged into a lentivirus capsid and will be used to transduce naïve T-cells (31); T-cells will then be activated, RNA will be harvested, and deep sequencing libraries will be prepared and sequenced to identify alternative exon cassettes that show changes in splicing patterns upon activation.
All of the features disclosed in this specification may be combined in any combination. Each feature disclosed in this specification may be replaced by an alternative feature serving the same, equivalent, or similar purpose. Thus, unless expressly stated otherwise, each feature disclosed is only an example of a generic series of equivalent or similar features. From the above description, one skilled in the art can easily ascertain the essential characteristics of the present disclosure, and without departing from the spirit and scope thereof, can make various changes and modifications of the present disclosure to adapt it to various usages and conditions. Thus, other embodiments are also within the claims.
While several inventive embodiments have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the function and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the inventive embodiments described herein. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the inventive teachings is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific inventive embodiments described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, inventive embodiments may be practiced otherwise than as specifically described and claimed. Inventive embodiments of the present disclosure are directed to each individual feature, system, article, material, kit, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, kits, and/or methods, if such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent, is included within the inventive scope of the present disclosure.
All definitions, as defined and used herein, should be understood to control over dictionary definitions, definitions in documents incorporated by reference, and/or ordinary meanings of the defined terms.
All references, patents and patent applications disclosed herein are incorporated by reference with respect to the subject matter for which each is cited, which in some cases may encompass the entirety of the document.
The indefinite articles “a” and “an,” as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean “at least one.”
The phrase “and/or,” as used herein in the specification and in the claims, should be understood to mean “either or both” of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Multiple elements listed with “and/or” should be construed in the same fashion, i.e., “one or more” of the elements so conjoined. Other elements may optionally be present other than the elements specifically identified by the “and/or” clause, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, a reference to “A and/or B”, when used in conjunction with open-ended language such as “comprising” can refer, in one embodiment, to A only (optionally including elements other than B); in another embodiment, to B only (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.
As used herein in the specification and in the claims, “or” should be understood to have the same meaning as “and/or” as defined above. For example, when separating items in a list, “or” or “and/or” shall be interpreted as being inclusive, i.e., the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items. Only terms clearly indicated to the contrary, such as “only one of” or “exactly one of,” or, when used in the claims, “consisting of,” will refer to the inclusion of exactly one element of a number or list of elements. In general, the term “or” as used herein shall only be interpreted as indicating exclusive alternatives (i.e. “one or the other but not both”) when preceded by terms of exclusivity, such as “either,” “one of,” “only one of,” or “exactly one of.” “Consisting essentially of,” when used in the claims, shall have its ordinary meaning as used in the field of patent law.
As used herein in the specification and in the claims, the phrase “at least one,” in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase “at least one” refers, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, “at least one of A and B” (or, equivalently, “at least one of A or B,” or, equivalently “at least one of A and/or B”) can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.
It should also be understood that, unless clearly indicated to the contrary, in any methods claimed herein that include more than one step or act, the order of the steps or acts of the method is not necessarily limited to the order in which the steps or acts of the method are recited.
In the claims, as well as in the specification above, all transitional phrases such as “comprising,” “including,” “carrying,” “having,” “containing,” “involving,” “holding,” “composed of,” and the like are to be understood to be open-ended, i.e., to mean including but not limited to. Only the transitional phrases “consisting of” and “consisting essentially of” shall be closed or semi-closed transitional phrases, respectively, as set forth in the United States Patent Office Manual of Patent Examining Procedure, Section 2111.03. It should be appreciated that embodiments described in this document using an open-ended transitional phrase (e.g., “comprising”) are also contemplated, in alternative embodiments, as “consisting of” and “consisting essentially of” the feature described by the open-ended transitional phrase. For example, if the disclosure describes “a composition comprising A and B”, the disclosure also contemplates the alternative embodiments “a composition consisting of A and B” and “a composition consisting essentially of A and B”.
This application claims the benefit under 35 U.S.C. § 119(e) of the filing date of U.S. Provisional Application Ser. No. 63/151,402, filed Feb. 19, 2021, entitled “METHODS AND COMPOSITIONS TO CONFER REGULATION TO GENE THERAPY CARGOES BY HETEROLOGOUS USE OF ALTERNATIVE SPLICING CASSETTES”, the entire content of which is incorporated herein by reference in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2022/017015 | 2/18/2022 | WO |
Number | Date | Country | |
---|---|---|---|
63151402 | Feb 2021 | US |