The official copy of the sequence listing named “1412057 (DU7940US) Sequence Listing.xml”, created on Jan. 16, 2024, and having a size of 172 KB is submitted electronically via Patent Center in .xml format and is hereby incorporated by reference in its entirety.
The N6-methyladenosine (m6A) modification is found in thousands of cellular mRNAs and is a critical regulator of gene expression and cellular physiology. In pathological instances, m6A modifications may be dysregulated, contributing to several human diseases. For example, m6A dysregulation can lead to hypermethylation of oncogenic mRNAs, which in turn leads to increased translation and cancer progression. The m6A methyltransferase machinery has therefore emerged as a promising therapeutic target.
Ongoing study of m6A modifications enabled by new tools provides clues as to strategies for overcoming hypermethylation. Current strategies for overcoming hypermethylation are focused on developing drugs that inhibit methyltransferases such as, for example, the m6A methyltransferase METTL3. However, because they can impact the methylation of all mRNAs, these approaches can have unwanted effects. Thus, targeted approaches for decreasing m6A hypermethylation and expression of upregulated oncogenes caused by m6A dysregulation are necessary to avoid globally affecting m6A modifications in unwanted areas, such as non-pathological tissue or cells.
The Summary is provided to introduce a selection of concepts that are further described below in the Detailed Description. This Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used as an aid in limiting the scope of the claimed subject matter.
Provided herein is a N6-methyladenosine (m6A)-coupled effector protein expression system and methods of introducing same into a cell, tissue, and/or animal model to achieve m6A-dependent protein expression.
In some embodiments, the m6A-coupled effector protein expression system comprises (a) a nucleic acid sequence encoding a fusion protein, wherein the fusion protein comprises an m6A binding domain of a YT521-B homology (YTH) domain-containing protein fused to a catalytic domain of a cytidine deaminase, and (b) a nucleic acid sequence encoding an effector protein (e.g., a protein that modulates expression of one or more proteins in the cell) and dihydrofolate reductase (DHFR). In some embodiments, the m6A-coupled effector protein expression system comprises (a) a nucleic acid sequence encoding a fusion protein, wherein the fusion protein comprises an m6A binding domain of a YT521-B homology (YTH) domain-containing protein fused to a catalytic domain of a cytidine deaminase, and (b) a nucleic acid sequence encoding an effector protein (e.g., a protein that modulates expression of one or more proteins in the cell), an m6A sensing domain, and dihydrofolate reductase (DHFR).
In some embodiments, the expression system is a vector system wherein a first plasmid comprises the nucleic acid sequence encoding the fusion protein comprising an N6-methyladenosine (m6A) binding domain of a YT521-B homology (YTH) domain-containing protein fused to a catalytic domain of a cytidine deaminase, and a second plasmid comprises the nucleic acid sequence encoding an effector protein (e.g., a protein that modulates expression of one or more proteins in the cell or otherwise targets a component of an expression system to a cell or within a cell) and dihydrofolate reductase (DHFR).
In some embodiments, the catalytic domain of the cytidine deaminase is the catalytic domain of apolipoprotein B mRNA editing enzyme (APOBEC-1). In some embodiments, the effector protein is a tumor suppressor protein, for example, METTL3. In some embodiments, the effector protein is an RNA-guided endonuclease. In some embodiments, the RNA-guided endonuclease is a dead RNA-guided endonuclease, for example, dead Cas9 (dCas9). In some embodiments, the effector protein comprises dCas9 linked or fused to a transcriptional regulator, for example, a transcriptional repressor (e.g., KRAB).
In further embodiments, the expression system can comprise: a first DNA construct comprising a nucleic acid sequence encoding a fusion protein, wherein the fusion protein comprises (i) a catalytically-dead RNA-targeting CRISPR-Cas system enzyme fused to (ii) a catalytic domain of a cytidine deaminase fused to (iii) an N6-methyladenosine (m6A) binding domain of a YT521-B homology (YTH) domain-containing protein; a second DNA construct comprising a nucleic acid sequence encoding: an effector protein; a m6A sensor sequence; and a polypeptide encoding dihydrofolate reductase (DHFR); and a guide RNA configured to bind to the nucleic acid of the second DNA construct. In embodiments, the cytidine deaminase can be APOBEC-1. In embodiments, the effector protein can be a tumor suppressor protein. In some embodiments, the effector protein can be a p53 or a SOCS2. In some embodiments, the dead RNA-guided endonuclease can be a dead type VI dCas13. In some embodiments, the fusion protein can further comprise a nuclear localization sequence (NLS).
In embodiments, described herein is an expression system comprising: (a) a first DNA construct comprising a polynucleotide encoding a fusion protein, wherein the fusion protein comprises an N6-methyladenosine (m6A) binding domain of a YT521-B homology (YTH) domain-containing protein fused to a catalytic domain of a cytidine deaminase or a catalytic domain of an adenosine deaminase; and a second DNA construct comprising a polynucleotide encoding a heterologous polypeptide, the polynucleotide encoding a heterologous polypeptide comprising: polynucleotide encoding an effector protein; polynucleotide encoding a m6A sensor sequence; and a polynucleotide encoding a dihydrofolate reductase (DHFR).
In embodiments, the m6A binding domain comprises a sequence having at least 90% sequence identity to SEQ ID NOs: 66 or 108-116. In embodiments, the m6A binding domain is fused to the catalytic domain via a peptide linker. In embodiments, the catalytic domain comprises a polypeptide having at least 95% identity to SEQ ID NO: 78 or a catalytic fragment thereof, SEQ ID NO: 79 or a catalytic fragment thereof, SEQ ID NO: 80 or a catalytic fragment thereof, or SEQ ID NO: 81. In embodiments, a vector comprises the first DNA construct, the second DNA construct, or both. In embodiments, the nucleic acid sequence encoding a fusion protein, the nucleic acid sequence encoding a heterologous polypeptide and a polypeptide encoding dihydrofolate reductase (DHFR), or both, are operably linked to a first promoter. In embodiments, the system further comprises a nucleic acid sequence encoding a selectable marker operably linked to a second promoter. In embodiments, the first promoter is a constitutive or an inducible promoter. In embodiments, the cytidine deaminase is APOBEC-1. In embodiments, the effector protein is a tumor suppressor protein or a catalytically dead RNA-guided endonuclease. In embodiments, the tumor suppressor protein is suppressor of cytokine signaling 2 (SOCS2) or p53 or one of the proteins listed in Table 1. In embodiments, the catalytically dead RNA-guided endonuclease is a dCas9 or a dCas13.
In embodiments, described herein is a polynucleotide comprising a nucleic acid sequence encoding an effector protein polypeptide, a m6A sensor sequence, and a polypeptide encoding dihydrofolate reductase (DHFR). Also described herein are vectors and host cells comprising one or more components of expression systems as described herein, as well as non-human transgenic animals comprising one or more components of expression vectors as described herein. Described herein additionally are kits comprising any one or more components of expression systems as described herein.
Further described herein are methods. In embodiments, described herein are methods of increasing expression of a tumor suppressor protein in one or more cells, comprising introducing the expression system of claim 1 into the one or more cells, for example, hepatocellular carcinoma (HCC) cells. In embodiments, the tumor suppressor protein is SOCS2 or p53 or one of the proteins listed in Table 1.
Described herein are methods of reducing m6A effector regulator expression in a sample or a subject. In embodiments, described herein is a method of reducing m6A effector regulator expression, comprising: introducing an expression system into a subject having or suspected of having a cancer, wherein the expression system comprises: a first DNA construct comprising a polynucleotide encoding a fusion protein, wherein the fusion protein comprises an N6-methyladenosine (m6A) binding domain of a YT521-B homology (YTH) domain-containing protein fused to a catalytic domain of a cytidine deaminase or a catalytic domain of an adenosine deaminase; a second DNA construct comprising a polynucleotide encoding a heterologous polypeptide, the polynucleotide encoding a heterologous polypeptide comprising: a polynucleotide encoding a catalytically-dead RNA-guided endonuclease; a polynucleotide encoding an m6A sensor sequence; and a polynucleotide encoding a dihydrofolate reductase (DHFR); and an sgRNA configured to bind to an m6A regulator. In embodiments, the sgRNA is configured to bind to an m6A regulator listed in Table 2. In embodiments, the cancer comprises at least one of acute myeloid leukemia (AML), glioblastoma (GBM), lung cancer, endometrial cancer, cervical cancer, ovarian cancer, breast cancer, colorectal cancer (CRC), hepatocellular carcinoma (HCC), pancreatic cancer, gastric cancer, prostate cancer, or renal cell carcinoma. In embodiments, the cancer is a cancer listed in Table 1 or Table 2. In embodiments, the catalytically-dead RNA-guided endonuclease is a dCas9 or dCas13.
Described herein are methods of reducing m6A hypermethylation in a subject or sample. In embodiments, the methods comprise: introducing an expression system into a subject having or suspected of having a cancer, wherein the expression system comprises: a first DNA construct comprising a polynucleotide encoding a fusion protein, wherein the fusion protein comprises an N6-methyladenosine (m6A) binding domain of a YT521-B homology (YTH) domain-containing protein fused to a catalytic domain of a cytidine deaminase or a catalytic domain of an adenosine deaminase; a second DNA construct comprising a polynucleotide encoding a heterologous polypeptide, the polynucleotide encoding a heterologous polypeptide comprising: a polynucleotide encoding a catalytically-dead RNA-guided endonuclease; a polynucleotide encoding an m6A sensor sequence; and a polynucleotide encoding a dihydrofolate reductase (DHFR); and an sgRNA configured to bind to an m6A regulator. In embodiments, the sgRNA is configured to bind to an m6A regulator listed in Table 2. In embodiments, the cancer comprises at least one of acute myeloid leukemia (AML), glioblastoma (GBM), lung cancer, endometrial cancer, cervical cancer, ovarian cancer, breast cancer, colorectal cancer (CRC), hepatocellular carcinoma (HCC), pancreatic cancer, gastric cancer, prostate cancer, or renal cell carcinoma. In embodiments, the cancer is a cancer listed in Table 1 or Table 2. In embodiments, the catalytically-dead RNA-guided endonuclease is a dCas9 or dCas13.
In embodiments, described herein are methods of inhibiting cancer cells. In an embodiment, a method of inhibiting a cancer cell comprises: introducing the expression system as described herein into the cancer cell, wherein the cancer cell comprises m6A RNA hypermethylation, and wherein the second DNA construct comprises a polynucleotide encoding an effector protein, the effector protein comprising a tumor suppressor protein.
The cancer cell can comprise an acute myeloid leukemia (AML) cell, a glioblastoma (GBM) cell, a lung cancer cell, an endometrial cancer cell, a cervical cancer cell, an ovarian cancer cell, a breast cancer cell, a colorectal cancer (CRC) cell, a hepatocellular carcinoma (HCC) cell, a pancreatic cancer cell, a gastric cancer cell, a prostate cancer cell, or a renal cell carcinoma cell. In an embodiment, the lung cancer cell is a non-small cell lung carcinoma cell. In an embodiment, the cancer cell is a hepatocellular carcinoma cell. In an embodiment, the tumor suppressor protein comprises at least one of the tumor suppressor proteins listed in Table 1. In an embodiment, expression of the tumor suppressor protein upregulates downstream signaling targets. In an embodiment, the tumor suppressor protein comprises p53. In an embodiment, expression of p53 upregulates at least one of CDKN1A or GADD45A. In an embodiment, the tumor suppressor protein comprises suppressor of cytokine signaling 2 (SOCS2). In an embodiment, the expression system is introduced into the cancer cell by transfection, viral infection, or electroporation. In an embodiment of methods as described herein, inhibiting the cancer cell comprises decreasing at least one of cell proliferation, cell migration, or metastasis.
Described herein are methods of treating a subject having a cancer. In embodiments, methods of treating a subject having a cancer characterized by m6A RNA hypermethylation comprise inhibiting a cancer cell according to the methods as described above. In embodiments, the cancer comprises at least one of acute myeloid leukemia (AML), glioblastoma (GBM), lung cancer, endometrial cancer, cervical cancer, ovarian cancer, breast cancer, colorectal cancer (CRC), hepatocellular carcinoma (HCC), pancreatic cancer, gastric cancer, prostate cancer, or renal cell carcinoma. In embodiments, the cancer comprises hepatocellular carcinoma. In embodiments, expression of the tumor suppressor protein results in decreasing at least one of cell proliferation, cell migration, or metastasis of the cancer. In embodiments, the expression system is introduced into the subject by viral infection or electroporation.
The present application includes the following figures. The figures are intended to illustrate certain embodiments and/or features of the compositions and methods, and to supplement any description(s) of the compositions and methods. The figures do not limit the scope of the compositions and methods, unless the written description expressly indicates that such is the case.
The following description recites various aspects and embodiments of the present compositions and methods. No particular embodiment is intended to define the scope of the compositions and methods. Rather, the embodiments merely provide non-limiting examples of various compositions and methods that are at least included within the scope of the disclosed compositions and methods. The description is to be read from the perspective of one of ordinary skill in the art; therefore, information well known to the skilled artisan is not necessarily included.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. All patents, patent applications and publications referred to throughout the disclosure herein are incorporated by reference in their entirety.
Articles “a” and “an” are used herein to refer to one or to more than one (i.e. at least one) of the grammatical object of the article. By way of example, “an element” means at least one element and can include more than one element.
The use of any and all examples or exemplary language (e.g., “such as”) provided herein, is intended merely to better illustrate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed.
The terms “may,” “may be,” “can,” and “can be,” and related terms are intended to convey that the subject matter involved is optional (that is, the subject matter is present in some examples and is not present in other examples), not a reference to a capability of the subject matter or to a probability, unless the context clearly indicates otherwise.
“About” is used to provide flexibility to a numerical range endpoint by providing that a given value may be “slightly above” or “slightly below” the endpoint without affecting the desired result.
The use herein of the terms “including,” “comprising,” or “having” and variations thereof, is meant to encompass the elements listed thereafter and equivalents thereof as well as additional elements. Embodiments recited as “including,” “comprising,” or “having” certain elements are also contemplated as “consisting essentially of” and “consisting of” those certain elements. As used herein, “and/or” refers to and encompasses any and all possible combinations of one or more of the associated listed items, as well as the lack of combinations where interpreted in the alternative (“or”).
As used herein, the transitional phrase “consisting essentially of” (and grammatical variants) is to be interpreted as encompassing the recited materials or steps and those that do not materially affect the “basic and novel characteristic(s)” of the claimed invention. See In re Herz, 537 F.2d 549, 551-52, 190 U.S.P.Q. 461, 463 (CCPA 1976) (emphasis in the original); see also MPEP § 2111.03. Thus, the term “consisting essentially of” as used herein should not be interpreted as equivalent to “comprising.”
As used throughout, the term “nucleic acid” or “nucleotide” refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. A nucleic acid sequence can comprise combinations of deoxyribonucleic acids and ribonucleic acids. Such deoxyribonucleic acids and ribonucleic acids include both naturally occurring molecules and synthetic analogues. The polynucleotides of the invention also encompass all forms of sequences including, but not limited to, single-stranded forms, double-stranded forms, hairpins, stem-and-loop structures, and the like.
Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)).
As used throughout, the term “mRNA” or “mRNA transcript” refers to a single-stranded RNA having at least one open reading frame that can be translated by a cell to express a protein. The cell can be an in vitro cell or an in vivo cell.
“Polypeptide,” “peptide,” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. As used herein, the terms encompass amino acid chains of any length, including full-length proteins, wherein the amino acid residues are linked by covalent peptide bonds.
“Contacting” as used herein, e.g., as in “contacting a cell” refers to contacting a cell directly or indirectly in vitro, ex vivo, or in vivo (i.e., within a subject as defined herein). Contacting a cell may include addition of a compound (e.g., a genetically encoded m6A-coupled effector protein delivery system) to a cell, or administration to a subject. Contacting encompasses administration to a solution, cell, tissue, mammal, subject, patient, or human. Further, contacting a cell includes adding an agent to a cell culture.
As used herein, the terms “subject” and “patient” are used interchangeably herein and refer to both human and nonhuman animals. The term “nonhuman animals” of the disclosure includes all vertebrates, e.g., mammals and non-mammals, such as nonhuman primates, sheep, dog, cat, horse, cow, chickens, amphibians, reptiles, and the like, as well as animal models, such as transgenic animals, and the like. The methods and compositions disclosed herein can be used on a sample either in vitro (for example, on isolated cells or tissues) or in vivo in a subject (i.e., living organism, such as a patient or animal model). In embodiments of methods as described herein, the sample comprises a plurality of cells.
As used throughout, a catalytic domain of a cytidine deaminase is a polypeptide comprising a cytidine deaminase, for example, Apolipoprotein B mRNA Editing Enzyme Catalytic Subunit (APOBEC1), activation induced cytidine deaminase (AICDA), Apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3A (APOBEC3A), or a catalytic fragment of any thereof, that catalyzes deamination of cytidine (“C”) to uridine (“U”) in RNA molecules. As used throughout, a catalytic domain of an adenosine deaminase, is a polypeptide comprising an adenosine deaminase, for example, double-stranded RNA-specific adenosine deaminase (ADAR1), or a catalytic fragment thereof, that catalyzes deamination of adenosine (“A”) to inosine (“I”) in RNA molecules. In some embodiments, the catalytic domain retains at least about 75%, 80%, 90%, 95%, or 99% of the enzymatic activity of the wildtype deaminase from which the domain is derived.
As used throughout, the term “Cas9 polypeptide” means a Cas9 protein or a fragment thereof present in any bacterial species that encodes a Type II CRISPR/Cas9 system. See, for example, Makarova et al. Nature Reviews, Microbiology, 9: 467-477 (2011), including supplemental information, hereby incorporated by reference in its entirety. For example, the Cas9 protein or a fragment thereof can be from Streptococcus pyogenes. Full-length Cas9 is an endonuclease comprising a recognition domain and two nuclease domains (HNH and RuvC, respectively) that creates double-stranded breaks in DNA sequences. In the amino acid sequence of Cas9, HNH is linearly continuous, whereas RuvC is separated into three regions, one left of the recognition domain, and the other two right of the recognition domain flanking the HNH domain. Cas9 from Streptococcus pyogenes is targeted to a genomic site in a cell by interacting with a guide RNA that hybridizes to a 20-nucleotide DNA sequence that immediately precedes an NGG motif recognized by Cas9. This results in a double-strand break in the genomic DNA of the cell.
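The targeting rule described above (a 20-nucleotide DNA sequence immediately preceding an NGG motif) can be illustrated with a minimal Python sketch. The function name `find_spcas9_sites` and the example sequence are hypothetical and are not part of the disclosure. The sketch scans only the strand it is given and does not assess guide RNA quality or off-target binding.

```python
import re

def find_spcas9_sites(dna: str, protospacer_len: int = 20):
    """Return (start, protospacer, pam) for each candidate site on the given strand.

    A candidate site is a stretch of `protospacer_len` nucleotides that is
    immediately followed by an NGG motif (N = any of A, C, G, T).
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]

# Hypothetical example: a 20-nt stretch followed by the motif "AGG".
example = "ACGT" * 5 + "AGG" + "CC"
for start, protospacer, pam in find_spcas9_sites(example):
    print(start, protospacer, pam)
```

In practice the reverse-complement strand would be scanned in the same way, and candidate sites would be filtered further before guide RNA design.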
As used throughout, a dCas9 polypeptide is a deactivated or nuclease-dead Cas9 (dCas9) that has been modified to inactivate Cas9 nuclease activity. Modifications include, but are not limited to, altering one or more amino acids to inactivate the nuclease activity or the nuclease domain. For example, and not to be limiting, D10A and H840A mutations can be made in Cas9 from Streptococcus pyogenes to inactivate Cas9 nuclease activity. Other modifications include removing all or a portion of the nuclease domain of Cas9, such that the sequences exhibiting nuclease activity are absent from Cas9. Accordingly, a dCas9 may include polypeptide sequences modified to inactivate nuclease activity or removal of a polypeptide sequence or sequences to inactivate nuclease activity. The dCas9 retains the ability to bind to DNA even though the nuclease activity has been inactivated. Accordingly, dCas9 includes the polypeptide sequence or sequences required for DNA binding but includes modified nuclease sequences or lacks nuclease sequences responsible for nuclease activity. It is understood that similar modifications can be made to inactivate nuclease activity in other site-directed nucleases, for example in Cpf1 or C2c2.
In some examples, the dCas9 protein is a full-length Cas9 sequence from S. pyogenes lacking the polypeptide sequence of the RuvC nuclease domain and/or the HNH nuclease domain and retaining the DNA binding function. In other examples, the dCas9 protein sequences have at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 98% or 99% identity to Cas9 polypeptide sequences lacking the RuvC nuclease domain and/or the HNH nuclease domain and retain DNA binding function. In other examples, the dCas9 protein sequence is encoded by a polynucleotide that has at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 98% or 99% identity to SEQ ID NO: 59.
As used throughout, the term “Cas13 polypeptide” means a Cas13 protein or a fragment thereof present in any bacterial species that encodes a Type VI CRISPR/Cas13 system. Exemplary Cas13 polypeptides include dPspCas13b, dLwaCas13a, and dRfxCas13d. Additional Cas13 polypeptides are described, for example, in Abudayyeh et al., Science. 2016 August 5; 353(6299): aaf5573. doi:10.1126/science.aaf5573, including supplemental information, hereby incorporated by reference in its entirety; Cox et al., Science 358, 1019-1027 (2017) including supplemental information, hereby incorporated by reference in its entirety; and Tang et al., Front. Cell Dev. Biol., 27 Jul. 2021 Sec. Epigenomics and Epigenetics Volume 9-2021; doi: 10.3389/fcell.2021.677587. For example, the Cas13 protein or a fragment thereof with ssRNA targeting activity can be from Leptotrichia wadei, Leptotrichia shahii, Prevotella sp. P5-125 (PspCas13b), or Ruminococcus flavefaciens. Generally, Cas13 enzymes have two higher eukaryotes and prokaryotes nucleotide-binding (HEPN) endoRNase domains that mediate precise RNA cleavage with a preference for targets with protospacer flanking sites (PFSs) observed biochemically and in bacteria.
As used throughout, a dCas13 polypeptide is a deactivated or nuclease-dead Cas13 (dCas13) that has been modified to inactivate Cas13 nuclease activity. Modifications include, but are not limited to, altering one or more amino acids to inactivate the nuclease activity or the nuclease domain. For example, and not to be limiting, H133A and H1058A mutations can be made in Cas13 HEPN domains from Prevotella sp. P5-125 (PspCas13b) to inactivate Cas13 nuclease activity (see, for example, Cox et al., Science 358, 1019-1027 (2017) including supplemental information, hereby incorporated by reference in its entirety, and International Patent Publication WO 2019/005884, also incorporated by reference in its entirety). Other modifications include removing all or a portion of the nuclease domain of Cas13 (for example, Δ984-1090 H133A of Cas13b from Prevotella sp. P5-125; see, for example, Wilson C, Chen P J, Miao Z, Liu D R. Programmable m(6)A modification of cellular RNAs with a Cas13-directed methyltransferase. Nat Biotechnol. 2020 Jun. 29. doi: 10.1038/s41587-020-0572-6. PubMed 32601430), such that the sequences exhibiting nuclease activity are absent from Cas13. Exemplary dCas13 polypeptide mutations include R474A/R1046A in dCas13 from L. wadei and mutations R239A/H244A and R858A/H863A in dCas13 from Ruminococcus flavefaciens strain XPD3002. Accordingly, a dCas13 may include polypeptide sequences modified to inactivate nuclease activity or removal of a polypeptide sequence or sequences to inactivate nuclease activity. The dCas13 retains the ability to target ssRNA even though the nuclease activity has been inactivated. Accordingly, dCas13 includes the polypeptide sequence or sequences required for ssRNA targeting but includes modified nuclease sequences or lacks nuclease sequences responsible for nuclease activity.
In some examples, the dCas13 protein is a full-length Cas13 sequence from L. wadei, L. shahii, Prevotella sp. P5-125 (PspCas13b), or R. flavefaciens having one or more mutations in one or more HEPN domains and retaining the ssRNA targeting function. In other examples, the dCas13 protein sequences have at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 98% or 99% identity to Cas13 polypeptide sequences with HEPN mutations and retain RNA binding function. In other examples, the dCas13 protein sequence is encoded by a dCas13 polynucleotide coding fragment that has at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 98% or 99% identity to the corresponding dCas13 polynucleotide coding fragment present in SEQ ID NO: 60.
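By way of illustration only, a percent identity value of the kind recited above can be computed from a pairwise alignment. The following Python sketch assumes that the two sequences have already been aligned to equal length, with "-" marking gaps, and that identity is counted as identical columns divided by all aligned columns. Other conventions exist (for example, dividing by the length of the shorter sequence), and the disclosure does not specify one here. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: gaps are written as '-', and a column in which either
    sequence has a gap counts as a mismatch. Columns in which both
    sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns

# Hypothetical usage: check a candidate against a 90% threshold.
meets_threshold = percent_identity("ACGTACGTAC", "ACGTACGTAA") >= 90.0
print(meets_threshold)
```

A dedicated alignment tool would normally produce the aligned sequences; the sketch only covers the final counting step.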
N6-methyladenosine (m6A) is the most abundant internal mRNA modification and influences several steps of the RNA life cycle, including splicing, stability, and translation 1, 2. The majority of m6A sites in cells are deposited co-transcriptionally by a single methyltransferase, METTL3, which interacts with additional accessory proteins to target RNAs for methylation. In mammals, m6A occurs in a unique consensus sequence which at its core consists of RAC (R=G or A), and it is enriched in proximal 3′UTRs and in the vicinity of the stop codon 3, 4. m6A carries out its diverse RNA regulatory functions by recruiting m6A binding proteins, which mediate the ability of m6A to impact the expression of thousands of cellular mRNAs.
Consistent with the broad roles for m6A in gene expression control, m6A has emerged as an important regulator of cellular function. m6A is necessary for several physiological processes, including stem cell maintenance, development, innate immunity, and learning and memory 5-7. Additionally, dynamic regulation of m6A provides a mechanism for cells to fine-tune gene expression in response to changing cellular conditions. For instance, some forms of cellular stress can lead to hyper- or hypomethylated states which impact the expression of stress response genes 8-10, and synaptic activity alters mRNA methylation in the brain to control the expression of synaptic plasticity genes 5, 11-13. In addition, abnormal regulation of m6A levels in cells contributes to a variety of human diseases, including cardiovascular disease, the response to viral infection, and several cancers 14-16. METTL3 and other methyltransferase complex proteins are often upregulated in cancer, leading to elevated levels of m6A that promote the expression of genes that support cancer cell proliferation and migration. Thus, detecting changes in m6A levels across cell types or under certain cellular conditions is important for understanding how m6A contributes to cellular function in both healthy and disease states.
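As a minimal illustration of the core consensus described above, the following Python sketch lists the positions of adenosines that sit in an RAC context (R = G or A) within an RNA sequence. The function name `find_rac_adenosines` and the example sequence are hypothetical. A match indicates only that the sequence context is compatible with methylation, not that the site is methylated in a cell.

```python
import re

def find_rac_adenosines(rna: str):
    """Return 0-based positions of the central A in every RAC motif (R = G or A)."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    # A lookahead is used so that overlapping motifs are all reported;
    # the candidate adenosine is the second base of the motif.
    return [m.start() + 1 for m in re.finditer(r"(?=[GA]AC)", rna)]

# Hypothetical example sequence containing AAC and GAC motifs.
print(find_rac_adenosines("UUAACGACU"))
```

Restricting the output to proximal 3′UTRs or to the vicinity of the stop codon would require transcript annotations, which this sketch does not model.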
Much of the progress that has been made in understanding m6A regulation in cells has been through the development of new tools that have enabled m6A detection. Strategies for detecting global changes in cellular m6A levels have primarily used three approaches: m6A antibodies, thin-layer chromatography, or mass spectrometry. However, these methods suffer from several limitations, including high cost, the need for large amounts of RNA, and multiple sample processing steps. Moreover, antibody-based methods suffer from non-specificity, mass spectrometry requires specialized equipment, and TLC depends on radioactivity. More recently, alternatives to antibody-based global m6A mapping have been developed17-21, but these methods often require substantial amounts of input RNA. Importantly, all current strategies involve isolation of RNA from cells and therefore do not enable real-time monitoring of m6A methylation in living cells. These limitations have been a major barrier for understanding how cellular m6A is dynamically regulated. In addition, no method exists for providing a specific readout of cellular m6A methylation in a manner compatible with high-throughput screening (HTS). This has substantially limited drug discovery efforts aimed at identifying inhibitors of METTL3, and it has prevented other high-throughput studies designed to identify factors that regulate m6A in cells.
Based on the aforementioned deficiencies, there existed a great need to develop a simple, low-cost method for detecting adenosine methylation in living cells which is also compatible with HTS. As described at least in International Patent Application PCT/US2022/079709 (which is incorporated by reference as if fully set forth herein), progress has been made in this area. Genetically Encoded m6A Sensor technology (also referred to herein as “GEMS”) is described at least in International Patent Application PCT/US2022/079709, which can couple protein expression, such as a fluorescent signal, with cellular mRNA methylation. Sensors and methods as described therein can detect changes in m6A levels caused by pharmacological inhibition of the m6A methyltransferase, giving it potential utility for drug discovery efforts.
However, prior methods for studying m6A required RNA isolation and did not provide a real-time readout of mRNA methylation in living cells, leading to the development of technology such as the Genetically Encoded m6A Sensor technology (also referred to herein as “GEMS”). Other aspects of GEMS system components are additionally described, for example, at least in U.S. Pat. No. 11,680,109, which is incorporated by reference as if fully set forth herein.
Some of these prior approaches to date, however, may risk editing of off-target endogenous RNAs when fusion proteins comprising an N6-methyladenosine (m6A) binding domain of a YT521-B homology (YTH) domain-containing protein are utilized. Furthermore, while drug discovery efforts have been aimed at METTL3 inhibition, targeted delivery of therapeutics (such as tumor suppressors and cell cycle proteins, for example) remains an area of research that can see improvement.
Described herein are constructs, expression systems, methods, kits, animals, and cells relating to programmable sensors and methods which can be programmed to target cells so as to achieve m6A-dependent delivery of custom protein payloads in those cells. Thus, constructs, expression systems, and methods as described herein can provide a versatile platform based on m6A sensing, allowing for (at least): (1) a simple readout for m6A methylation; (2) a system for m6A-coupled protein expression; and (3) a system for targeted m6A-coupled protein expression. Furthermore, the GEMS systems as described can be modified for effector protein expression (e.g., expression of proteins related to tumor suppression or cell cycle regulation, such as p53 or suppressor of cytokine signaling 2 (SOCS2)) or for expression of an RNA-guided endonuclease that has been modified to remove cleavage activity (e.g., a “dead” Cas protein). Systems as described herein additionally can be employed in transgenic or knock-in animals or cells derived from animal models as described herein.
Disclosed herein are compositions, systems, and methods related to overcoming the aforementioned limitations.
Disclosed herein are genetically encoded sensors for m6A which can provide a fluorescent readout when m6A is deposited on mRNA. The sensor may be used for detecting mRNA methylation in a variety of cell types (for example, and without limitation, in immortalized or primary tumor cells in vitro), and for responding to small molecule inhibition of the m6A methyltransferase, METTL3, as discussed. In addition, as disclosed herein, the m6A sensor platform can be utilized to express effector proteins of interest, such as anti-tumor therapeutics or tumor suppression proteins, instead of a reporter protein (e.g., eGFP). For example, sensors as described herein can achieve m6A-coupled delivery of anti-tumor therapeutics (for example, tumor suppressor proteins to slow the growth of cancer cells through the expression of p53 or other tumor suppressor proteins) in cancer cells that have elevated m6A levels.
Additionally, components of the compositions, systems, and methods as described herein can be targeted to prevent off-target effects (such as unwanted editing of off-target RNAs in physiologically normal or otherwise healthy cells) utilizing catalytically-dead CRISPR-associated (Cas) enzymes, for example, of RNA-targeting (also referred to herein as “RNA-guided”) type III (i.e., Csm/Cmr), type VI (i.e., Cas13), or type II (i.e., Cas9) CRISPR-Cas systems. Altogether, the system provides a simple, highly versatile approach that can be used for sensing m6A in living cells and coupling mRNA methylation to effector protein expression.
Provided herein is an expression system comprising: (a) a first DNA construct comprising a nucleic acid sequence encoding a fusion protein, wherein the fusion protein comprises an N6-methyladenosine (m6A) binding domain of a YT521-B homology (YTH) domain-containing protein fused to a catalytic domain of a cytidine deaminase (e.g., APOBEC1) or a catalytic domain of an adenosine deaminase; and (b) a second DNA construct comprising (i) a nucleic acid sequence encoding an effector polypeptide; (ii) a m6A sensor sequence; and (iii) a nucleic acid sequence encoding dihydrofolate reductase (DHFR). The sequence comprising (i) the nucleic acid sequence encoding an effector polypeptide, (ii) the m6A sensor sequence, and (iii) the nucleic acid sequence encoding DHFR is also referred to as the mRNA reporter sequence or effector sequence. Also provided is a nucleic acid sequence comprising a nucleic acid sequence encoding an effector protein, a m6A sensor sequence, and a nucleic acid sequence encoding dihydrofolate reductase (DHFR).
The m6A methylation sensor system previously discovered by the inventors, as described in PCT/US2022/079709, U.S. Pat. No. 11,680,109, and Meyer, K. D., “DART-seq: an antibody-free method for global m(6)A detection,” Nat Methods. 2019 December, 16(12):1275-1280 (published online Sep. 23, 2019); doi: 10.1038/s41592-019-0570-0, the entire contents of all of which (including sequence information and any supplemental information) are incorporated by reference in their entirety as if fully set forth herein, includes at least two components: 1) expression of APO1-YTH, and 2) expression of a protein in the presence of m6A (
Although the m6A sensor system uses m6A-coupled GFP expression as a readout, any gene of interest can be cloned in place of GFP to achieve m6A-dependent protein expression. Such an m6A-coupled effector protein delivery system has several potential applications (e.g., in cancer therapy). Additional aspects of expression systems are provided in Sections I and II above.
The recombinant nucleic acids provided herein can be included in expression cassettes for expression in a host cell or an organism of interest. The cassette will include 5′ and 3′ regulatory sequences operably linked to a recombinant nucleic acid provided herein that allows for expression of the modified polypeptide. The cassette may additionally contain at least one additional gene or genetic element to be co-transformed into the organism. Where additional genes or elements are included, the components are operably linked. Alternatively, the additional gene(s) or element(s) can be provided on multiple expression cassettes. Such an expression cassette is provided with a plurality of restriction sites and/or recombination sites for insertion of the polynucleotides to be under the transcriptional regulation of the regulatory regions. The expression cassette may additionally contain a selectable marker gene. The expression cassette will include in the 5′ to 3′ direction of transcription: a transcriptional and translational initiation region (i.e., a promoter), a polynucleotide of the invention, and a transcriptional and translational termination region (i.e., termination region) functional in the cell or organism of interest. The promoters of the invention are capable of directing or driving expression of a coding sequence in a host cell. The regulatory regions (i.e., promoters, transcriptional regulatory regions, and translational termination regions) may be endogenous or heterologous to the host cell or to each other. As used herein, “heterologous” in reference to a sequence is a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention.
Additional regulatory signals include, but are not limited to, transcriptional initiation start sites, operators, activators, enhancers, other regulatory elements, ribosomal binding sites, an initiation codon, termination signals, and the like. See Sambrook et al. (1992) Molecular Cloning: A Laboratory Manual, ed. Maniatis et al. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) (hereinafter “Sambrook 11”); Davis et al., eds. (1980) Advanced Bacterial Genetics (Cold Spring Harbor Laboratory Press), Cold Spring Harbor, N.Y., and the references cited therein.
Further provided is a vector comprising a nucleic acid or expression cassette set forth herein. The vector is contemplated to have the necessary functional elements that direct and regulate transcription of the inserted nucleic acid. These functional elements include, but are not limited to, a promoter, regions upstream or downstream of the promoter, such as enhancers that may regulate the transcriptional activity of the promoter, an origin of replication, appropriate restriction sites to facilitate cloning of inserts adjacent to the promoter, antibiotic resistance genes or other markers which can serve to select for cells containing the vector or the vector containing the insert, RNA splice junctions, a transcription termination region, or any other region which may serve to facilitate the expression of the inserted gene or hybrid gene (See generally, Sambrook et al. Molecular Cloning: A Laboratory Manual, 4th ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 2012). The vector, for example, can be a plasmid.
The expression vectors described herein can also include the nucleic acids as described herein under the control of an inducible promoter such as the tetracycline inducible promoter or a glucocorticoid inducible promoter. The nucleic acids of the present invention can also be under the control of a tissue-specific promoter to promote expression of the nucleic acid in specific cells, tissues or organs. Any regulatable promoter, such as a metallothionein promoter, a heat-shock promoter, and other regulatable promoters, of which many examples are well known in the art are also contemplated. Furthermore, a Cre-loxP inducible system can also be used, as well as a Flp recombinase inducible promoter system, both of which are known in the art.
Provided herein is a m6A-coupled effector protein expression system and methods of introducing same into a cell, tissue, and/or animal model to achieve m6A-dependent protein expression. In some embodiments, the m6A-coupled effector protein expression system comprises (a) a nucleic acid sequence encoding a fusion protein, wherein the fusion protein comprises an N6-methyladenosine (m6A) binding domain of a YT521-B homology (YTH) domain-containing protein fused to a catalytic domain of a cytidine deaminase, and (b) a nucleic acid sequence encoding an effector protein and dihydrofolate reductase (DHFR). In some embodiments, the catalytic domain of the cytidine deaminase is the catalytic domain of apolipoprotein B mRNA editing enzyme (APOBEC-1). Also provided is a vector comprising any of the nucleic acid sequences described herein.
In some embodiments, the effector protein is a tumor suppressor protein, for example, p53 or SOCS2. In some embodiments, the effector protein is an RNA-guided endonuclease. In some embodiments, the RNA-guided endonuclease is a dead RNA-guided endonuclease, for example, dead Cas9 (dCas9). In some embodiments, the effector protein comprises dCas9 linked or fused to a transcriptional regulator, for example, a transcriptional repressor (e.g., KRAB). In some embodiments, the effector protein comprises dCas9 linked or fused to a transcriptional activator. In any of the methods described herein, one or more guide RNAs can be introduced into the cell to guide the dCas9 to a specific site in the genome of the cell.
Also provided is a DNA construct comprising a promoter operably linked to a recombinant nucleic acid described herein. A nucleic acid is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence. Numerous promoters can be used in the constructs described herein. A promoter is a region or a sequence located upstream and/or downstream from the start of transcription that is involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. The promoter can be a eukaryotic or a prokaryotic promoter. In some embodiments the promoter is an inducible promoter. In some embodiments, the promoter is a constitutive promoter.
Any of the nucleic acid sequences provided herein can be included in expression cassettes for expression in a host cell or an organism of interest. The cassette will include 5′ and 3′ regulatory sequences operably linked to a recombinant nucleic acid provided herein that allows for expression of the modified polypeptide.
In some embodiments, the nucleic acid sequence encoding a fusion protein comprising an N6-methyladenosine (m6A) binding domain of a YT521-B homology (YTH) domain-containing protein fused to a catalytic domain of a cytidine deaminase or a catalytic domain of an adenosine deaminase is operably linked to an inducible promoter, e.g., a tetracycline inducible promoter; and the nucleic acid construct encoding the mRNA reporter sequence is operably linked to a constitutive promoter (e.g., a CMV promoter).
A “constitutive” promoter is a promoter that is active under most environmental and developmental conditions. Examples of constitutive promoters include, but are not limited to, a CMV promoter, a U6 promoter, a PGK promoter, an EF-1α promoter, and an SV40 promoter.
An “inducible” promoter is a promoter that is active under environmental or developmental regulation, for example, regulated by the presence or absence of a drug. Examples of inducible promoters include, but are not limited to, the pL promoter (induced by an increase in temperature), the pBAD promoter (induced by the addition of arabinose to the growth medium), the tetracycline-controlled transcriptional activation system (Tet-On/Tet-Off, Gossen and Bujard, PNAS, 89(12):5547-5551 (1992)), the Lac switch inducible system (Wyborski et al., Environ Mol Mutagen, 28(4):447-58 (1996)), the ecdysone-inducible gene expression system (No et al., PNAS, 93(8):3346-3351 (1996)), the cumate gene-switch system (Mullick et al., BMC Biotechnology, 6:43 (2006)), and the tamoxifen-inducible gene expression system (Zhang et al., Nucleic Acids Research, 24:543-548 (1996)). Furthermore, a Cre-loxP inducible system can also be used, as well as a Flp recombinase inducible promoter system, both of which are known in the art.
In some embodiments, the promoter is a cell-specific or tissue-specific promoter. When using a cell- or tissue-specific promoter, expression occurs primarily, but not exclusively, in a particular cell or tissue. For example, expression can occur in at least 90%, 95%, or 99% of the targeted cell or tissue. It will be understood, however, that tissue-specific promoters may have a detectable amount of background or base activity in those tissues where they are mostly silent.
Examples of tissue-specific promoters include, but are not limited to, liver-specific promoters (e.g., APOA2, SERPINA1, CYP3A4, MIR122), pancreatic-specific promoters (e.g., insulin, insulin receptor substrate 2, pancreatic and duodenal homeobox 1, Aristaless-like homeobox 3, and pancreatic polypeptide), cardiac-specific promoters (e.g., myosin heavy chain 6, myosin light chain 2, troponin I type 3, natriuretic peptide precursor A, and solute carrier family 8), central nervous system promoters (e.g., glial fibrillary acidic protein, internexin neuronal intermediate filament protein, Nestin, myelin-associated oligodendrocyte basic protein, myelin basic protein, tyrosine hydroxylase, and Forkhead box A2), skin-specific promoters (e.g., Filaggrin, Keratin 14, and transglutaminase 3), and pluripotent and embryonic germ layer promoters (e.g., POU class 5 homeobox 1, Nanog homeobox, Nestin, and MicroRNA 122).
The cassette may additionally contain at least one additional gene or genetic element to be co-transformed into the organism (i.e., a cell, plurality of cells, tissue, or animal). Where additional genes or elements are included, the components are operably linked. Alternatively, the additional gene(s) or element(s) can be provided on multiple expression cassettes. Such an expression cassette is provided with a plurality of restriction sites and/or recombination sites for insertion of the polynucleotides to be under the transcriptional regulation of the regulatory regions. The expression cassette may additionally contain a selectable marker gene. The expression cassette will include in the 5′ to 3′ direction of transcription: a transcriptional and translational initiation region (i.e., a promoter), a polynucleotide of the invention, and a transcriptional and translational termination region (i.e., termination region) functional in the cell or organism of interest. The promoters of the invention are capable of directing or driving expression of a coding sequence in a host cell. The regulatory regions (i.e., promoters, transcriptional regulatory regions, and translational termination regions) may be endogenous or heterologous to the host cell or to each other. As used herein, the term “heterologous” refers to a nucleotide sequence or polypeptide not normally found in a given cell in nature. As such, a heterologous nucleotide sequence or heterologous polypeptide may be: (a) foreign to its host cell (i.e., exogenous to the cell); (b) naturally found in the host cell (i.e., endogenous) but present at an unnatural quantity in the cell (i.e., greater or lesser quantity than naturally found in the host cell); or (c) naturally found in the host cell but positioned outside of its natural locus.
In preparing the expression cassette, the various DNA fragments may be manipulated, to provide for the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this end, adapters or linkers may be employed to join the DNA fragments or other manipulations may be involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and transversions, may be involved.
In some embodiments, a vector comprises the first DNA construct. In some embodiments, a vector comprises the second DNA construct. In some embodiments, a vector comprises the first and second DNA constructs. In some embodiments, the vector is a plasmid. In some embodiments, a vector comprises the first DNA construct, the second DNA construct, and a nucleic acid encoding a selectable marker. In some embodiments, the first DNA construct and the second DNA construct are operably linked to a first promoter, and the nucleic acid sequence encoding a selectable marker is operably linked to a second promoter (i.e., a promoter that is different from the first promoter). In some embodiments, the selectable marker is a fluorescent protein that is different from the effector protein or the fluorescent protein encoded by the second DNA construct, for example, dsRed. An exemplary dual-promoter construct that can be modified to express effector proteins as described herein (for example, by exchanging the nucleic acid sequence encoding a fluorescent reporter for one encoding an effector protein) comprises: (1) a nucleic acid sequence encoding an effector protein, a m6A reporter sequence and DHFR; (2) a nucleic acid sequence encoding a fusion protein (APOBEC1-YTH); and (3) a nucleic acid sequence encoding dsRed (provided herein as SEQ ID NO: 107). In certain embodiments, the first DNA construct and second DNA construct do not contain nucleic acid sequences encoding a fluorescent protein.
There are numerous E. coli expression vectors known to one of ordinary skill in the art, which are useful for the expression of any of the nucleic acid sequences described herein (e.g., any of the fusion proteins described herein). Other microbial hosts suitable for use include bacilli, such as Bacillus subtilis, and other enterobacteriaceae, such as Salmonella, Serratia, and various Pseudomonas species. In these prokaryotic hosts, one can also make expression vectors, which will typically contain expression control sequences compatible with the host cell (e.g., an origin of replication). In addition, any number of a variety of well-known promoters will be present, such as the lactose promoter system, a tryptophan (Trp) promoter system, a beta-lactamase promoter system, or a promoter system from phage lambda. Additionally, yeast expression can be used.
“Polypeptide,” “peptide,” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. As used herein, the terms encompass amino acid chains of any length, including full-length proteins, wherein the amino acid residues are linked by covalent peptide bonds.
As used throughout, a “fusion protein” is a protein comprising two different polypeptide sequences, for example, a binding domain and a catalytic domain, that are joined or linked to form a single polypeptide. The two amino acid sequences are encoded by separate nucleic acid sequences that have been joined so that they are transcribed and translated to produce a single polypeptide. In some embodiments, the fusion protein comprises, in the following order, a m6A binding domain and a catalytic domain of a cytidine deaminase or an adenosine deaminase.
As used throughout, “m6A” refers to posttranscriptional methylation of an adenosine residue in the RNA of prokaryotes and eukaryotes (e.g., mammals, insects, plants and yeast).
As used throughout, an “m6A sensor sequence” is a sequence comprising one or more m6A methylation consensus motifs (GAC). The m6A sensor sequence can also comprise at least one sequence that can be converted to a stop codon when the m6A sensor sequence is methylated in the cell. In the constructs described herein, the m6A sensor sequence is in-frame with the nucleic acid encoding the heterologous protein, e.g., a reporter protein. The m6A sensor sequence is flanked by the nucleic acid sequence encoding the heterologous protein (e.g., reporter protein) and the nucleic acid sequence encoding a destabilization domain, e.g., DHFR. When the construct is methylated in the cell, a C to U modification generates a stop codon in the m6A sensor sequence. The stop codon prevents expression of the destabilization domain, thus preventing degradation of the heterologous protein. Exemplary m6A sensor sequences include, but are not limited to, a nucleic acid sequence comprising, consisting of, or consisting essentially of, SEQ ID NOs: 66 and 108-116. Nucleic acid sequences having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity with a nucleic acid sequence comprising, consisting essentially of, or consisting of SEQ ID NOs: 66 and 108-116 are also provided. One of skill in the art would understand that these sequences are merely exemplary because any m6A sensor sequence comprising at least one m6A methylation consensus motif (GAC) (e.g., one, two, three, four, etc.) can be used as a sensor sequence.
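The stop-codon logic of the sensor sequence can be illustrated with a minimal sketch. It relies only on the standard genetic code, in which CAA, CAG, and CGA are the codons that a single C to U change converts into a stop codon (UAA, UAG, and UGA, respectively). The function name, the `motif_only` flag, and the example sequence are hypothetical and chosen for illustration; they are not taken from SEQ ID NOs: 66 or 108-116.

```python
# Illustrative sketch only: the sequence below is hypothetical, not a disclosed sensor.
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code

def editable_codons(rna, frame=0, motif_only=True):
    """List in-frame codons that a single C-to-U change turns into a stop codon.

    Returns tuples of (0-based position of the C, original codon, edited codon).
    With motif_only=True, the C must be the last base of a GAC motif.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for start in range(frame, len(rna) - 2, 3):
        codon = rna[start:start + 3]
        for offset, base in enumerate(codon):
            if base != "C":
                continue
            edited = codon[:offset] + "U" + codon[offset + 1:]
            if edited not in STOP_CODONS:
                continue
            pos = start + offset
            in_gac = pos >= 2 and rna[pos - 2:pos] == "GA"
            if in_gac or not motif_only:
                hits.append((pos, codon, edited))
    return hits

# Hypothetical reading frame: AUG GGA CAG UUU (the GAC motif spans two codons).
print(editable_codons("AUGGGACAGUUU"))
```

In this hypothetical frame the C of the GAC motif is the first base of a CAG codon, so deamination yields UAG and translation would terminate before any downstream destabilization domain.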
As used throughout, an m6A binding domain of a YT521-B homology (YTH) domain-containing protein is a polypeptide fragment of a YTH domain-containing protein that binds to an m6A-containing sequence (e.g., an RNA, such as an mRNA or an m6A sensor sequence). The m6A binding domain derived from a YT521-B homology (YTH) domain-containing protein can be of any size as long as it retains binding activity and is not the full-length YTH domain-containing protein. In some embodiments, the binding domain retains at least about 75%, 80%, 90%, 95%, or 99% of the binding activity of the wildtype YTH domain-containing protein from which the binding domain is derived.
In some embodiments, the DNA construct encodes a m6A binding domain comprising a polypeptide having at least 95% identity, for example, at least about 95%, 96%, 97%, 98% or 99% identity, to SEQ ID NO: 67 (amino acid sequence of YTHDF2-YTH, a m6A binding domain of YTHDF2), SEQ ID NO: 68 (amino acid sequence of YTHDF2-YTH_W432A_W486A, a mutated m6A binding domain of YTHDF2), SEQ ID NO: 69 (amino acid sequence of YTHDF2-YTHmut, an amino acid sequence that includes the YTH domain of YTHDF2, and does not include the m6A-binding domain), SEQ ID NO: 70 (amino acid sequence of YTHDF2-YTHmut, an amino acid sequence comprising SEQ ID NO: 69, with a W432A mutation and a W486A mutation), SEQ ID NO: 71 (amino acid sequence of YTHDF2-YTH D422N, a mutated m6A binding domain of YTHDF2), SEQ ID NO: 72 (amino acid sequence of a m6A binding domain of YTHDF1), SEQ ID NO: 73 (amino acid sequence of YTHDF1mut, an amino acid sequence that includes the YTH domain of YTHDF2, and does not include the m6A-binding domain), SEQ ID NO: 74 (amino acid sequence of YTHDF1 D401N, a mutated m6A binding domain of YTHDF1), SEQ ID NO: 75 (amino acid sequence of a m6A binding domain of YTHDF3), SEQ ID NO: 76 (amino acid sequence of a m6A binding domain of YTHDC1), or SEQ ID NO: 77 (amino acid sequence of a m6A binding domain of YTHDC2).
As used throughout, a catalytic domain of a cytidine deaminase is a polypeptide comprising a cytidine deaminase, for example, Apolipoprotein B mRNA Editing Enzyme Catalytic Subunit (APOBEC1 or APO1), activation induced cytidine deaminase (AICDA) or Apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3A (APOBEC3A), or a catalytic fragment thereof, that catalyzes deamination of cytidine (“C”) to uridine (“U”) in RNA molecules. As used throughout, a catalytic domain of an adenosine deaminase is a polypeptide comprising an adenosine deaminase, for example, double-stranded RNA-specific adenosine deaminase (ADAR1), or a catalytic fragment thereof, that catalyzes deamination of adenosine (“A”) to inosine (“I”) in RNA molecules. In some embodiments, the catalytic domain retains at least about 75%, 80%, 90%, 95%, or 99% of the enzymatic activity of the wildtype deaminase from which the domain is derived.
In some embodiments, the catalytic domain comprises a polypeptide having at least 95% identity, for example, at least about 95%, 96%, 97%, 98% or 99% identity, to SEQ ID NO: 78 (amino acid sequence of rAPOBEC1) or its catalytic domain (SEQ ID NO: 120), SEQ ID NO: 13 (amino acid sequence of hAICDA) or its catalytic domain (SEQ ID NO: 79); SEQ ID NO: 80 (amino acid sequence of hAPOBEC3A) or its catalytic domain (SEQ ID NO: 128); SEQ ID NO: 81 (amino acid sequence of ADAR2) or its catalytic domain (SEQ ID NO: 119); or SEQ ID NO: 121 (amino acid sequence of ADAR1) or its catalytic domain (SEQ ID NO: 122).
The catalytic domain can also comprise a polypeptide having at least 95% identity to SEQ ID NO: 119 (amino acid sequence of catalytic domain of ADAR2), as set forth in U.S. Patent Application Publication No. 20190010478.
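The percent identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been globally aligned by a separate tool, with gaps written as "-", and it counts identical residues over all aligned columns. That denominator is only one of several conventions in use and is an assumption here, as are the function names; the text above does not specify how identity is to be calculated.

```python
# Illustrative sketch only: inputs must be pre-aligned, equal-length strings.
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

def meets_identity(aligned_a, aligned_b, threshold=95.0):
    """True when the pair reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold

# Two linker sequences from this document, aligned with one terminal gap.
print(percent_identity("SGSETPGTSESATPES", "SGSETPGTSESATPE-"))
```

For the pair shown, 15 of 16 aligned columns match, so the pair would fall below a 95% threshold under this particular convention.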
In some embodiments, the DNA construct encodes a m6A binding domain fused to the catalytic domain via a peptide linker. The peptide linker can be about 2 to about 150 amino acids in length. For example, the linker can be a linker of from about 5 to about 20 amino acids in length, from about 5 to about 25 amino acids in length, from about 10 to about 30 amino acids in length, from about 5 to about 35 amino acids in length, from about 5 to about 40 amino acids in length, from about 5 to about 45 amino acids in length, from about 5 to about 50 amino acids in length, from about 5 to about 55 amino acids in length, from about 5 to about 60 amino acids in length, from about 5 to about 65 amino acids in length, from about 5 to about 70 amino acids in length, from about 5 to about 75 amino acids in length, from about 5 to about 80 amino acids in length, from about 5 to about 85 amino acids in length, from about 5 to about 90 amino acids in length, from about 5 to about 95 amino acids in length, from about 5 to about 100 amino acids in length, from about 5 to about 105 amino acids in length, from about 5 to about 110 amino acids in length, from about 5 to about 115 amino acids in length, from about 5 to about 120 amino acids in length, from about 5 to about 125 amino acids in length, from about 5 to about 130 amino acids in length, from about 5 to about 135 amino acids in length, from about 5 to about 140 amino acids in length, from about 5 to about 145 amino acids in length, or from about 5 to about 150 amino acids in length.
Exemplary peptide linkers include, but are not limited to, peptide linkers comprising SEQ ID NO: 82 (SGSETPGTSESATPE), SEQ ID NO: 83 (SGSETPGTSESATPES), SEQ ID NO: 84 ((GGGGS)3), SEQ ID NO: 85 ((GGGGS)10), SEQ ID NO: 117 ((GGGGS)20), SEQ ID NO: 86 (A(EAAAK)3A), SEQ ID NO: 123 (A(EAAAK)10A), or SEQ ID NO: 124 (A(EAAAK)2MA).
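The repeat notation used for the linkers above can be expanded with a minimal sketch. It assumes that a number following a parenthesized unit is a repeat count (for example, (GGGGS)3 read as three GGGGS units); that reading, and the function name, are assumptions made for illustration, and the sequence listing remains the authoritative definition of each SEQ ID NO.

```python
# Illustrative sketch only: expands "(UNIT)n" shorthand into a full residue string.
import re

_REPEAT = re.compile(r"\(([A-Z]+)\)(\d+)")

def expand_linker(notation):
    """Expand every '(UNIT)n' in the notation into UNIT repeated n times."""
    return _REPEAT.sub(lambda m: m.group(1) * int(m.group(2)), notation)

for notation in ["(GGGGS)3", "(GGGGS)20", "A(EAAAK)3A", "A(EAAAK)2MA"]:
    full = expand_linker(notation)
    print(notation, len(full), full)
```

Under this reading, the longest linker listed, (GGGGS)20, expands to 100 residues, which falls within the range of about 2 to about 150 amino acids stated for peptide linkers.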
In some embodiments, the fusion protein further comprises a localization element. In some embodiments, the localization element is fused to the N-terminus or the C-terminus of the fusion protein. As used herein, a localization element targets or localizes the fusion protein to one or more subcellular compartments. Subcellular compartments include, but are not limited to, the nucleus, the endoplasmic reticulum, the mitochondria, chromatin, the cellular membrane, and RNA granules (for example, P-bodies, stress granules and transport granules). In some embodiments, the fusion protein can be targeted to the nuclear lamina, nuclear speckles, or nuclear paraspeckles in the nucleus of a cell. In some embodiments, the protein can be targeted to the outer mitochondrial membrane or the inner mitochondrial membrane.
Exemplary localization elements include, but are not limited to, a peptide comprising a nuclear localization signal, for example, SEQ ID NO: 89 (PKKKRKV), a peptide comprising a nuclear export signal, for example, SEQ ID NO: 90 (LPPLERLTL), a peptide comprising an endoplasmic reticulum targeting sequence, for example, SEQ ID NO: 91 (MDPVVVLGLCLSCLLLLSLWKQSYGGG), or SEQ ID NO: 92 (METDTLLLWVLLLWVPGSTGD), a peptide comprising a Myc tag, for example, SEQ ID NO: 93 (EQKLISEEDL), a peptide comprising a V5 tag, for example, SEQ ID NO: 94 (GKPIPNPLLGLDST) or SEQ ID NO: 95 (IPNPLLGLD), a peptide comprising a FLAG tag, for example, SEQ ID NO: 96 (DYKDDDDK), a peptide comprising a 3×FLAG tag, for example, SEQ ID NO: 97 (DYKDHDGDYKDHDIDYKDDDDK), and a peptide comprising a DHFR destabilization domain, for example, SEQ ID NO: 98 (ISLIAALAVDHVIGMETVMPWNLPADLAWFKRNTLNKPVIMGRHTWESIGRPLPGRKNIILSSQPSTDDRVTWVKSVDEAIAACGDVPEIMVIGGGRVYEQFLPKAQKLYLTHIDAEVEGDTHFPDYEPDDWESVFSEFHDADAQNSHSYCFEILERR). HA tags and NLS tags can also be utilized as known in the art.
Exemplary targeting effector proteins, such as catalytically inactive RNA-guided endonucleases (for example, dCas9 and dCas13), are provided in the definitions above.
Exemplary effector proteins that are tumor suppressor proteins include p53 and SOCS2. In some embodiments, p53 comprises a polypeptide (or a polynucleotide encoding a polypeptide) having at least 95% identity, for example, at least about 95%, 96%, 97%, 98% or 99% identity, to SEQ ID NO: 125. In some embodiments, human SOCS2 comprises a polypeptide (or a polynucleotide encoding a polypeptide) having at least 95% identity, for example, at least about 95%, 96%, 97%, 98% or 99% identity, to SEQ ID NO: 126. Other tumor suppressor proteins may be utilized, for example, those that affect the cell cycle or other proteins that are upstream or downstream of the JAK/STAT signaling pathway.
Provided herein are polypeptides that relate to N6-methyladenosine (m6A) sensors and systems for detecting m6A modifications, in addition to effector protein expression systems and systems for targeting sensing and/or effector expression. Polypeptides as described herein can comprise more than one coding sequence for a protein of interest that are translationally fused so as to create a fusion protein. Provided herein are polypeptides encoded by any of the polynucleotides as described herein.
Modifications to any of the polypeptides or proteins provided herein are made by known methods. By way of example, modifications are made by site-specific mutagenesis of nucleotides in a nucleic acid encoding the polypeptide, thereby producing a DNA encoding the modification, and thereafter expressing the DNA in recombinant cell culture to produce the encoded polypeptide. Techniques for making substitution mutations at predetermined sites in DNA having a known sequence are well known. For example, M13 primer mutagenesis and PCR-based mutagenesis methods can be used to make one or more substitution mutations. Any of the nucleic acid sequences provided herein can be codon-optimized to alter (for example, to maximize) expression in a host cell or organism.
The amino acids in the polypeptides described herein can be any of the 20 naturally occurring amino acids, D-stereoisomers of the naturally occurring amino acids, unnatural amino acids, and chemically modified amino acids. Unnatural amino acids (that is, those that are not naturally found in proteins) are also known in the art, as set forth in, for example, Zhang et al. “Protein engineering with unnatural amino acids,” Curr. Opin. Struct. Biol. 23(4): 581-587 (2013); Xie et al. “Adding amino acids to the genetic repertoire,” Curr. Opin. Chem. Biol. 9(6): 548-554 (2005); and all references cited therein. β and γ amino acids are known in the art and are also contemplated herein as unnatural amino acids.
As used herein, a chemically modified amino acid refers to an amino acid whose side chain has been chemically modified. For example, a side chain can be modified to comprise a signaling moiety, such as a fluorophore or a radiolabel. A side chain can also be modified to comprise a new functional group, such as a thiol, carboxylic acid, or amino group. Post-translationally modified amino acids are also included in the definition of chemically modified amino acids.
Also contemplated are conservative amino acid substitutions. By way of example, conservative amino acid substitutions can be made in one or more of the amino acid residues, for example, in one or more lysine residues of any of the polypeptides provided herein. One of skill in the art would know that a conservative substitution is the replacement of one amino acid residue with another that is biologically and/or chemically similar. The following eight groups each contain amino acids that are conservative substitutions for one another:
By way of example, when an arginine to serine substitution is mentioned, also contemplated is a conservative substitution for the serine (e.g., threonine). Nonconservative substitutions, for example, substituting a lysine with an asparagine, are also contemplated.
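For illustration, a conservative-substitution check can be sketched as a simple group lookup. The eight groups below are an assumption: they are the groups commonly cited in the literature (see, e.g., Creighton, Proteins (1984)) and are not taken from this text. The names CONSERVATIVE_GROUPS and is_conservative are hypothetical.

```python
# Assumed grouping (commonly cited; not reproduced from this document):
# residues within the same set are treated as conservative substitutions
# for one another. Methionine appears in two groups.
CONSERVATIVE_GROUPS = [
    {"A", "G"},            # alanine, glycine
    {"D", "E"},            # aspartic acid, glutamic acid
    {"N", "Q"},            # asparagine, glutamine
    {"R", "K"},            # arginine, lysine
    {"I", "L", "M", "V"},  # isoleucine, leucine, methionine, valine
    {"F", "Y", "W"},       # phenylalanine, tyrosine, tryptophan
    {"S", "T"},            # serine, threonine
    {"C", "M"},            # cysteine, methionine
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the two one-letter residues share at least one group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


if __name__ == "__main__":
    print(is_conservative("S", "T"))  # True: serine -> threonine
    print(is_conservative("K", "N"))  # False: lysine -> asparagine
```

Under this assumed grouping, the serine to threonine example above is classified as conservative and the lysine to asparagine example as nonconservative.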
Recombinant nucleic acids encoding any of the polypeptides described herein are also provided.
As used throughout, the term “nucleic acid” or “nucleotide” refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. It is understood that when an RNA is described, its corresponding cDNA is also described, wherein uridine is represented as thymidine. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. A nucleic acid sequence (i.e., a polynucleotide) can comprise combinations of deoxyribonucleic acids and ribonucleic acids. Such deoxyribonucleic acids and ribonucleic acids include both naturally occurring molecules and synthetic analogues. The polynucleotides of the invention also encompass all forms of sequences including, but not limited to, single-stranded forms, double-stranded forms, hairpins, stem-and-loop structures, and the like.
Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues. See Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994).
The term “identity” or “substantial identity,” as used in the context of a polynucleotide or polypeptide sequence described herein, refers to a sequence that has at least 60% sequence identity to a reference sequence. Alternatively, percent identity can be any integer from 60% to 100%. Exemplary embodiments include at least: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%, as compared to a reference sequence using the programs described herein; preferably BLAST using standard parameters, as described below. One of skill will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning, and the like.
For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.
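The percent identity calculation described above can be sketched as follows, assuming the two sequences have already been aligned and padded with "-" gap characters. The function name percent_identity and the choice of denominator (all alignment columns that are not gap-only) are assumptions made for illustration; alignment programs differ in how they count gapped columns, so values may not match a given tool exactly.

```python
def percent_identity(aligned_ref: str, aligned_test: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity = identical columns / alignment columns, where a
    column containing a gap counts as a non-match.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_ref, aligned_test)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignable positions")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    identity = percent_identity("AC-GT", "ACTGT")
    print(identity)          # 80.0
    print(identity >= 60.0)  # True: meets the 60% floor described above
```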
A “comparison window,” as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150, in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.) 85:2444 (1988), by computerized implementations of these algorithms (e.g., BLAST), or by manual alignment and visual inspection.
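As a non-limiting illustration of one of the alignment methods named above, the following is a minimal sketch of a Needleman and Wunsch style global alignment by dynamic programming. The scoring values (match = 1, mismatch = -1, linear gap = -1) and the function name needleman_wunsch are assumptions chosen for the example; practical implementations typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(ref, test, match=1, mismatch=-1, gap=-1):
    """Global alignment; returns (score, aligned_ref, aligned_test)."""
    n, m = len(ref), len(test)
    # score[i][j] = best score aligning ref[:i] with test[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ref[i - 1] == test[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in test
                score[i][j - 1] + gap,      # gap in ref
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and ref[i - 1] == test[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            a.append(ref[i - 1])
            b.append(test[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            a.append(ref[i - 1])
            b.append("-")
            i -= 1
        else:
            a.append("-")
            b.append(test[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(a)), "".join(reversed(b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))  # (2, 'ACGT', 'A-GT')
```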
Algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (NCBI) web site. The algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word size (W) of 28, an expectation (E) of 10, M=1, N=−2, and a comparison of both strands.
For amino acid sequences, the BLASTP program uses as defaults a word size (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix. See Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992).
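The word-hit extension step described above can be illustrated with a simplified, ungapped, one-directional sketch for nucleotide sequences. It uses the reward and penalty values stated above (M=1, N=−2) and applies the three halting conditions in the order they are listed. The drop-off value x_drop=5 and the function name extend_right are assumptions for the example; this is not the NCBI implementation.

```python
def extend_right(query, subject, q_start, s_start,
                 reward=1, penalty=-2, x_drop=5):
    """Ungapped rightward extension from a seed position.

    Returns (best_score, best_length): the maximum cumulative score reached
    and the number of aligned positions at which it was reached.
    """
    score = 0
    best_score = 0
    best_length = 0
    i = 0
    # Halting condition 3: the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        same = query[q_start + i] == subject[s_start + i]
        score += reward if same else penalty
        i += 1
        if score > best_score:
            best_score, best_length = score, i
        # Halting condition 1: score fell off by X from its maximum.
        if best_score - score >= x_drop:
            break
        # Halting condition 2: cumulative score went to zero or below.
        if score <= 0:
            break
    return best_score, best_length


if __name__ == "__main__":
    # Eight matching positions followed by mismatches.
    print(extend_right("ACGTACGTTTTT", "ACGTACGTAAAA", 0, 0))  # (8, 8)
```

A full implementation would extend in both directions from each seed and would use a scoring matrix such as BLOSUM62 for amino acid sequences.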
The BLAST algorithm also performs a statistical analysis of the similarity between two sequences. See, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.01, more preferably less than about 10⁻⁵, and most preferably less than about 10⁻²⁰.
A target-specific guide RNA (gRNA) can comprise a nucleotide sequence that is complementary to a polynucleotide or RNA target sequence as described herein (for example, one encoding a GEMS as described herein), and thereby mediates binding of the Cas-gRNA complex by hybridization at the target site. A target-specific guide RNA (gRNA) can also comprise a nucleotide sequence that is complementary to a polynucleotide or RNA target sequence as described herein (for example, METTL3, or another methylation target or therapeutic target in the cell, for example, a regulator of the cell cycle or a protein involved in the JAK/STAT signaling pathway), and thereby mediates binding of the Cas-gRNA complex by hybridization at the target site. In certain embodiments, the gRNA is 5-50 nucleotides, 10-30 nucleotides, 15-25 nucleotides, 18-22 nucleotides, or 19-21 nucleotides in length, or any length between the stated ranges, including, for example, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 nucleotides in length.
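By way of illustration, the complementarity and length constraints described above can be sketched as a function that returns the reverse complement of an RNA target site and checks it against the broadest stated range (5-50 nucleotides). The function name guide_spacer is hypothetical, and the sketch deliberately ignores factors that real guide design depends on, such as the particular Cas protein, scaffold or direct repeat sequences, flanking sequence requirements, and target accessibility.

```python
# Watson-Crick pairing for RNA.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def guide_spacer(target_site: str, min_len: int = 5, max_len: int = 50) -> str:
    """Return an RNA sequence complementary to target_site (written 5' to 3').

    DNA-style input is accepted: thymidine is read as uridine.
    """
    target = target_site.upper().replace("T", "U")
    if not min_len <= len(target) <= max_len:
        raise ValueError(
            f"target site must be {min_len}-{max_len} nucleotides long"
        )
    unknown = set(target) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    # Reverse the target, then complement each base, so that the returned
    # spacer is also written 5' to 3'.
    return "".join(COMPLEMENT[base] for base in reversed(target))


if __name__ == "__main__":
    print(guide_spacer("GGACU"))  # AGUCC
```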
Provided herein are DNA constructs comprising aspects of expression systems as described herein, for example, components as described in Section I and II above.
The recombinant nucleic acids provided herein can be included in expression cassettes for expression in a host cell or an organism of interest. The cassette will include 5′ and 3′ regulatory sequences operably linked to a recombinant nucleic acid provided herein that allows for expression of the modified polypeptide. The cassette may additionally contain at least one additional gene or genetic element to be co-transformed into the cell or organism. Where additional genes or elements are included, the components are operably linked. Alternatively, the additional gene(s) or element(s) can be provided on multiple expression cassettes. Such an expression cassette is provided with a plurality of restriction sites and/or recombination sites for insertion of the polynucleotides to be under the transcriptional regulation of the regulatory regions. The expression cassette may additionally contain a selectable marker gene. The expression cassette will include in the 5′ to 3′ direction of transcription: a transcriptional and translational initiation region (i.e., a promoter), a polynucleotide of the invention, and a transcriptional and translational termination region (i.e., termination region) functional in the cell or organism of interest. The promoters of the invention are capable of directing or driving expression of a coding sequence in a host cell. The regulatory regions (i.e., promoters, transcriptional regulatory regions, and translational termination regions) may be endogenous or heterologous to the host cell or to each other. As used herein, “heterologous” in reference to a sequence is a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention.
Additional regulatory signals include, but are not limited to, transcriptional initiation start sites, operators, activators, enhancers, other regulatory elements, ribosomal binding sites, an initiation codon, termination signals, and the like. See Sambrook et al. (1992) Molecular Cloning: A Laboratory Manual, ed. Maniatis et al. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.); Davis et al., eds. (1980) Advanced Bacterial Genetics (Cold Spring Harbor Laboratory Press), Cold Spring Harbor, N.Y., and the references cited therein.
The expression cassette can also comprise a selectable marker gene for the selection of transformed cells. Marker genes include genes conferring antibiotic resistance, such as those conferring hygromycin resistance, ampicillin resistance, gentamicin resistance, neomycin resistance, to name a few. Additional selectable markers are known and any can be used.
In preparing the expression cassette, the various DNA fragments may be manipulated, so as to provide for the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this end, adapters or linkers may be employed to join the DNA fragments or other manipulations may be involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and transversions, may be involved.
Further provided is a vector comprising a nucleic acid or expression cassette set forth herein. The vector is contemplated to have the necessary functional elements that direct and regulate transcription of the inserted nucleic acid. These functional elements include, but are not limited to, a promoter, regions upstream or downstream of the promoter, such as enhancers that may regulate the transcriptional activity of the promoter, an origin of replication, appropriate restriction sites to facilitate cloning of inserts adjacent to the promoter, antibiotic resistance genes or other markers which can serve to select for cells containing the vector or the vector containing the insert, RNA splice junctions, a transcription termination region, or any other region which may serve to facilitate the expression of the inserted gene or hybrid gene. See generally, Sambrook et al. Molecular Cloning: A Laboratory Manual, 4th ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 2012. The vector, for example, can be a plasmid.
There are numerous E. coli expression vectors known to one of ordinary skill in the art, which are useful for the expression of a nucleic acid. Other microbial hosts suitable for use include bacilli, such as Bacillus subtilis, and other Enterobacteriaceae, such as Salmonella, Serratia, and various Pseudomonas species. In these prokaryotic hosts, one can also make expression vectors, which will typically contain expression control sequences compatible with the host cell (e.g., an origin of replication). In addition, any number of a variety of well-known promoters will be present, such as the lactose promoter system, a tryptophan (Trp) promoter system, a beta-lactamase promoter system, or a promoter system from phage lambda. Additionally, yeast expression can be used. Provided herein is a nucleic acid encoding a polypeptide of the present invention, wherein the nucleic acid can be expressed by a yeast cell. More specifically, the nucleic acid can be expressed by Pichia pastoris or S. cerevisiae.
Mammalian cells also permit the expression of proteins in an environment that favors important post-translational modifications such as folding and cysteine pairing, addition of complex carbohydrate structures, and secretion of active protein. Vectors useful for the expression of active proteins in mammalian cells are known in the art and can contain genes conferring hygromycin resistance, geneticin or G418 resistance, or other genes or phenotypes suitable for use as selectable markers, or methotrexate resistance for gene amplification. A number of suitable host cell lines capable of secreting intact human proteins have been developed in the art, and include CHO cells, HeLa cells, HEK-293 cells, HEK-293T cells, U2OS cells, or any other primary or transformed cell line. Other suitable host cell lines include COS-7 cells, myeloma cell lines, Jurkat cells, etc. Expression vectors for these cells can include expression control sequences, such as an origin of replication, a promoter, an enhancer, and necessary information processing sites, such as ribosome binding sites, RNA splice sites, polyadenylation sites, and transcriptional terminator sequences. Preferred expression control sequences are promoters derived from immunoglobulin genes, SV40, Adenovirus, Bovine Papilloma Virus, etc.
The expression vectors described herein can also include the nucleic acids as described herein under the control of an inducible promoter such as the tetracycline inducible promoter or a glucocorticoid inducible promoter. The nucleic acids of the present invention can also be under the control of a tissue-specific promoter to promote expression of the nucleic acid in specific cells, tissues or organs. Any regulatable promoter, such as a metallothionein promoter, a heat-shock promoter, and other regulatable promoters, of which many examples are well known in the art are also contemplated. Furthermore, a Cre-loxP inducible system can also be used, as well as a Flp recombinase inducible promoter system, both of which are known in the art.
Insect cells also permit the expression of the polypeptides. Recombinant proteins produced in insect cells with baculovirus vectors undergo post-translational modifications similar to those of wild-type mammalian proteins.
Also provided herein is a vector comprising the polynucleotides as described herein. The vector may be a DNA vector or an RNA vector. In some embodiments, the vector is a non-viral vector (e.g., a plasmid or naked DNA) or a viral vector. In some embodiments, the vector is a viral vector. Examples of viral vectors include, but are not limited to, an adeno-associated virus (AAV) vector, a retroviral vector, a lentiviral vector, a herpes simplex viral vector, or an adenoviral vector. It is understood that any of the viral vectors described herein can be packaged into viral particles or virions for administration to the subject.
In some aspects, the disclosure provides a virus comprising the nucleic acid comprising a nucleotide sequence encoding a polypeptide as described herein or the viral vector as described herein. The virus may be an AAV, a lentivirus, or a retrovirus.
Non-viral vectors can also be used to deliver the polynucleotides described herein. Accordingly, in some embodiments, the vector is a non-viral vector. For example, non-viral systems, such as naked DNA formulated as a microparticle, may be used. In some embodiments, delivery may include using virus-like particles (VLPs), cationic liposomes, nanoparticles, cell-derived nanovesicles, direct nucleic acid injection, hydrodynamic injection, or nucleic acid condensing peptides and non-peptides. In one approach, VLPs are used to deliver the polypeptide(s). The VLP comprises an engineered version of a viral vector, where nucleic acids are packaged into VLPs through alternative mechanisms (e.g., mRNA recruitment, protein fusions, protein-protein binding). See Itaka and Kataoka, 2009, “Recent development of nonviral gene delivery systems with virus-like structures and mechanisms,” Eur J Pharma and Biopharma 71:475-483; and Keeler et al., 2017, “Gene Therapy 2017: Progress and Future Directions” Clin. Transl. Sci. (2017) 10, 242-248, incorporated by reference.
Aspects of this disclosure include host cells and transgenic animals comprising the nucleic acid sequences or constructs described herein as well as methods of making such cells and transgenic animals.
a. Host Cells
A host cell comprising a nucleic acid, a vector, or an expression cassette as described herein is provided. The host cell can be an in vitro, ex vivo, or in vivo host cell. Populations of any of the host cells described herein are also provided. A cell culture comprising one or more host cells described herein is also provided. Methods for the culture and production of many cells, including cells of bacterial (for example, E. coli and other bacterial strains), animal (especially mammalian), and archaebacterial origin, are available in the art. See, e.g., Sambrook, Ausubel, and Berger (all supra), as well as Freshney (1994) Culture of Animal Cells, a Manual of Basic Technique, 3rd Ed., Wiley-Liss, New York and the references cited therein; Doyle and Griffiths (1997) Mammalian Cell Culture: Essential Techniques John Wiley and Sons, NY; Humason (1979) Animal Tissue Techniques, 4th Ed. W.H. Freeman and Company; and Ricciardelli, et al., (1989) In vitro Cell Dev. Biol. 25:1016-1024.
The host cell can be a prokaryotic cell, including, for example, a bacterial cell. Alternatively, the cell can be a eukaryotic cell, for example, a mammalian cell. In some embodiments, the cell can be an HEK293T cell, a Chinese hamster ovary (CHO) cell, a COS-7 cell, a HeLa cell, an avian cell, a myeloma cell, a Pichia cell, an insect cell or a plant cell. A number of other suitable host cell lines have been developed and include myeloma cell lines, fibroblast cell lines, and a variety of tumor cell lines such as melanoma cell lines. The vectors containing the nucleic acid segments of interest can be transferred or introduced into the host cell by well-known methods, which vary depending on the type of cellular host. Host cells can be derived from any of the animal models discussed in (b) below.
In some embodiments, the provided cells express the protein stably or transiently by introducing an expression system (or any component thereof) into the cell. Stable expression of the protein in a cell refers to integration of any of the nucleic acids, DNA constructs, or vectors described herein into the genome of the cell, thereby allowing the cell to express the protein. Transient expression refers to expression of the protein directly from any of the nucleic acids, DNA constructs, and/or vectors following introduction into the cell (i.e., the gene encoding the protein is not integrated into the genome of the cell).
As used herein, the phrase “introducing” in the context of introducing a nucleic acid into a cell refers to the translocation of the nucleic acid sequence from outside a cell to inside the cell. In some cases, introducing refers to translocation of the nucleic acid from outside the cell to inside the nucleus of the cell. Various methods of such translocation are contemplated, including but not limited to, electroporation, nanoparticle delivery, viral delivery, contact with nanowires or nanotubes, receptor mediated internalization, translocation via cell penetrating peptides, liposome mediated translocation, DEAE dextran, lipofectamine, calcium phosphate or any method now known or identified in the future for introduction of nucleic acids into prokaryotic or eukaryotic cellular hosts. A targeted nuclease system (e.g., an RNA-guided nuclease, a transcription activator-like effector nuclease (TALEN), a zinc finger nuclease (ZFN), or a megaTAL (MT); see Li et al. Signal Transduction and Targeted Therapy 5, Article No. 1 (2020)) can also be used to introduce a nucleic acid, for example, a nucleic acid encoding a fusion protein and/or mRNA transcript (e.g., an m6A reporter mRNA) described herein, into a host cell.
In some embodiments, the provided cells express the protein constitutively or inducibly. Constitutive expression refers to ongoing, continuous expression of a gene (i.e., of a protein), whereas inducible expression refers to gene (protein) expression that is responsive to a stimulus. Inducible expression is generally regulated via an inducible promoter, a description of which is included above.
The CRISPR/Cas9 system, an RNA-guided nuclease system that employs a Cas9 endonuclease, can be used to edit the genome of a host cell or organism. Other RNA-guided Cas effector proteins can be used as well, for example, Cas13. The “CRISPR/Cas” system refers to a widespread class of bacterial systems for defense against foreign nucleic acid. CRISPR/Cas systems are found in a wide range of eubacterial and archaeal organisms. CRISPR/Cas systems include type I, II, and III sub-types. Wild-type type II CRISPR/Cas systems utilize an RNA-mediated nuclease, for example, Cas9, in complex with guide and activating RNA to recognize and cleave foreign nucleic acid. Guide RNAs having the activity of both a guide RNA and an activating RNA are also known in the art. In some cases, such dual activity guide RNAs are referred to as a single guide RNA (sgRNA).
Any of the components encoded by the nucleic acid constructs described herein, for example, fusion proteins or an m6A/effector protein fusion protein, can be purified or isolated from a host cell or population of host cells. For example, a recombinant nucleic acid encoding any of the fusion proteins described herein can be introduced into a host cell under conditions that allow expression of the fusion protein. In some embodiments, the recombinant nucleic acid is codon-optimized for expression. After expression in the host cell, the fusion protein can be isolated or purified. Similarly, any of the nucleic acids encoding an m6A reporter mRNA described herein can be introduced into a host cell under conditions that allow transcription of the m6A reporter mRNA. After expression in the host cell, the m6A reporter mRNA can be isolated or purified.
b. Animal Models
Also provided is a non-human transgenic animal comprising a mammalian host cell that comprises any of the nucleic acid sequences or constructs described herein. Methods for making transgenic animals include, but are not limited to, oocyte pronuclear DNA microinjection, intracytoplasmic sperm injection, embryonic stem cell manipulation, somatic nuclear transfer, recombinase systems (for example, Cre-LoxP systems, Flp-FRT systems and others), zinc finger nucleases (ZFNs), transcriptional activator-like effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeat/CRISPR-associated protein 9 (CRISPR/Cas9). See, for example, Volobueva et al. Braz. J. Med. Biol. Res. 52(5): e8108 (2019).
The term “transgenic animal” as used herein means an animal into which a genetic modification has been introduced by a genetic engineering procedure and in particular an animal into which has been introduced an exogenous nucleic acid, and may loosely also encompass “knock in” animals. That is, the animal comprises a nucleic acid sequence which is not normally present in the animal. Such animals can be created by a one-for-one substitution of DNA sequence information in a predetermined genetic locus or the insertion of sequence information not found within the locus.
A transgenic animal may be developed, for example, from embryonic cells into which the genetic modification (e.g., exogenous nucleic acid sequence) has been directly introduced or from the progeny of such cells. The exogenous nucleic acid is introduced artificially into the animal (e.g., into a founder animal). Animals that are produced by transfer of an exogenous nucleic acid through breeding of the animal comprising the nucleic acid (into whom the nucleic acid was artificially introduced), which are progeny animals, are also included. Representative examples of non-human animals include, but are not limited to, non-human primates, mice, rats, rabbits, pigs, goats, sheep, horses, zebrafish and cows. A cell or a population of cells from any of the non-human transgenic animals provided herein is also provided.
The exogenous nucleic acid may be integrated into the genome of the animal or it may be present in a non-integrated form, e.g., as an autonomously-replicating unit, for example, an artificial chromosome which does not integrate into the genome, but which is maintained and inherited substantially stably in the animal. In some embodiments, the exogenous nucleic acid is under the control of a cell-specific or tissue-specific promoter. For example, transgenic animals that express a fusion protein and an mRNA reporter sequence in specific cells or tissues can be produced by introducing one or more nucleic acids into fertilized eggs, embryonic stem cells or the germline of the animal, wherein the one or more nucleic acids are under the control of a specific promoter which allows expression of the fusion protein and mRNA reporter sequence in specific types of cells or tissues. As used herein, a protein or mRNA is expressed predominantly in a given tissue, cell type, cell lineage or cell, when 90% or greater of the observed expression occurs in the given tissue, cell type, cell lineage or cell.
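The 90% criterion for predominant expression stated above can be expressed as a simple calculation. The following sketch assumes that expression has been quantified per tissue in a common, additive unit (for example, normalized signal or read counts); the function name is_predominant and the example values are hypothetical.

```python
from typing import Mapping


def is_predominant(expression: Mapping[str, float], tissue: str,
                   threshold: float = 0.90) -> bool:
    """Return True if `tissue` accounts for >= threshold of total expression."""
    total = sum(expression.values())
    if total <= 0:
        raise ValueError("no expression observed")
    # Tissues absent from the mapping are treated as having zero expression.
    return expression.get(tissue, 0.0) / total >= threshold


if __name__ == "__main__":
    # Hypothetical measurements, arbitrary units.
    observed = {"liver": 920.0, "brain": 50.0, "muscle": 30.0}
    print(is_predominant(observed, "liver"))  # True (92% of total)
    print(is_predominant(observed, "brain"))  # False
```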
In some embodiments, the exogenous nucleic acid in the animal is under the control of a constitutive or an inducible promoter, as described above. Inducible systems can also be used to allow expression of the fusion protein and/or mRNA reporter sequence at designated times during development, expanding the temporal specificity of fusion protein and/or mRNA reporter expression in the transgenic animal.
Included are both progenitor and progeny animals. Progeny animals include animals which are descended from the progenitor as a result of sexual reproduction or cloning and which have inherited genetic material from the progenitor. Thus, the progeny animals comprise the genetic modification introduced into the parent.
Although the present disclosure is described primarily in a mouse, one of ordinary skill in the art would understand that other non-human mammals, for example, rodent, rabbit, bovine, ovine, canine, feline, equine, porcine, camelid, non-human primate, and other mammals, can also be engineered to express aspects of the present disclosure in a similar fashion, and these transgenic animals can also be used for applications as disclosed herein. A cell or a population of cells from any of the non-human transgenic animals provided herein is also provided.
Also provided herein are pharmaceutical compositions of the nucleic acids, the vectors, the viruses, or the cells described herein. The pharmaceutical compositions described herein are for delivery to subjects in need thereof by any suitable route or a combination of different routes. The pharmaceutical compositions can be delivered to a subject, so as to allow expression of the polypeptide in cells of the subject and produce an effective amount of the polypeptide that treats a condition in the subject. In some embodiments, the pharmaceutical composition comprising the nucleic acid, the vector, the virus, or the cell as described herein further comprises a pharmaceutically acceptable excipient or carrier.
The terms “pharmaceutically acceptable carrier” and “pharmaceutically acceptable excipient” are used interchangeably and refer to a substance or compound that aids or facilitates preparation, storage, administration, delivery, effectiveness, absorption by a subject, or any other feature of the composition for its intended use or purpose. Such pharmaceutically acceptable carrier is not biologically or otherwise undesirable and can be included in the compositions of the present invention without causing a significant adverse toxicological effect on the subject or interacting in a deleterious manner with the other components of the pharmaceutical composition.
In some approaches, sterile injectable solutions can be prepared with the vectors in the required amount and an excipient suitable for injection into a human patient. In some embodiments, the pharmaceutically and/or physiologically acceptable excipient is particularly suitable for administration to the cardiac muscle. For example, a suitable carrier may be buffered saline or other buffers, e.g., HEPES, to maintain pH at appropriate physiological levels, stabilizing agents, adjuvants, diluents, or surfactants. In some embodiments, the pharmaceutically acceptable excipient comprises a non-ionic detergent, such as, for example, Pluronic F-681. For injection, the excipient will typically be a liquid. Exemplary pharmaceutically acceptable excipients include sterile, pyrogen-free water and sterile, pyrogen-free, phosphate buffered saline. Pharmaceutically acceptable salts can be included therein, for example, mineral acid salts such as hydrochlorides, hydrobromides, phosphates, sulfates, and the like; and the salts of organic acids such as acetates, propionates, malonates, benzoates, and the like. The preparation of pharmaceutically acceptable carriers, excipients and formulations is described in, e.g., Remington: The Science and Practice of Pharmacy, 22nd edition, Loyd V. Allen et al, editors, Pharmaceutical Press (2012). See also Bennicelli et al., “Reversal of blindness in animal models of leber congenital amaurosis using optimized AAV2-mediated gene transfer,” Mol Ther. (2008); 16(3):458-65. A variety of known carriers are also provided in U.S. Pat. Nos. 7,629,322, and 6,764,845, incorporated herein by reference.
Provided herein are methods for inducing m6A methylation-dependent expression of a heterologous polypeptide (comprising an effector protein) in one or more cells, a biological sample (for example, a group of cells or a tissue biopsy from a mammalian subject), or a subject having or suspected of having a cancer as described herein (or otherwise derived from a cancer in the case of cells in vitro), comprising introducing or administering any of the expression systems described herein into the one or more cells, the sample, or the subject. As set forth above, when any of the expression systems described herein is introduced into a cell, sample, or subject, if m6A methylation occurs in the cell, the mRNA expressed by the expression system, i.e., an mRNA encoding a heterologous protein, an m6A sensor sequence, and a destabilization domain (e.g., DHFR), will be methylated (at the m6A sensor sequence). Upon methylation, C-to-U editing produces a stop codon in the m6A sensor sequence that prevents translation of DHFR, thus allowing the heterologous protein to be expressed without degradation.
In embodiments, the cell can be an in vitro, ex vivo or in vivo cell. The cell may be a mammalian cell or a rodent cell.
Provided herein are also methods for administering or introducing any of the expression systems described herein to a cell, sample, or subject as described herein. The administering or introducing to one or more cells, a sample, or a subject, can be by mechanisms known in the art to introduce exogenous nucleic acids into cells, for example, lipofection, nucleofection, or electroporation. Alternatively, the skilled artisan would understand that aspects of expression systems as described herein can be cloned into viral expression vectors and packaged into an adeno-associated viral (AAV) or lentiviral (LV) vector, and subsequently used to transduce the exogenous genetic material into the cell, sample, or subject.
Also provided is a virus (e.g., an AAV, a lentivirus, or a retrovirus) comprising any of the nucleic acids or vectors described in this disclosure.
Also provided is a cell comprising any of the nucleic acids, vectors, or viruses described in this disclosure.
Provided herein is also a pharmaceutical composition comprising any of the nucleic acids, vectors, viruses, or cells described herein, and a pharmaceutically acceptable excipient.
One aspect provided in this disclosure is a method of inhibiting a cancer cell, the method comprising introducing into the cancer cell the expression system as provided in this disclosure. In some embodiments, inhibiting the cancer cell by methods as described herein results in decreasing at least one of cell proliferation, cell migration, or metastasis.
In some embodiments of this method, the cancer cell can comprise m6A RNA hypermethylation. In some embodiments, the cancer cell comprises an acute myeloid leukemia (AML) cell, a glioblastoma (GBM) cell, a lung cancer cell, an endometrial cancer cell, a cervical cancer cell, an ovarian cancer cell, a breast cancer cell, a colorectal cancer (CRC) cell, a hepatocellular carcinoma (HCC) cell, a pancreatic cancer cell, a gastric cancer cell, a prostate cancer cell, or a renal cell carcinoma cell. In certain embodiments, the lung cancer cell is a non-small cell lung carcinoma cell. In certain embodiments, the cancer cell is a hepatocellular carcinoma cell.
In some embodiments, the second DNA construct comprises a polynucleotide encoding an effector protein, wherein the effector protein comprises a tumor suppressor protein. In some embodiments, expression of the tumor suppressor protein upregulates downstream signaling targets. The tumor suppressor protein may comprise at least one of the tumor suppressor proteins listed in Table 1. In certain embodiments, the tumor suppressor protein can comprise p53. In some embodiments, expression of p53 upregulates at least one of CDKN1A or GADD45A. In certain embodiments, the tumor suppressor protein comprises suppressor of cytokine signaling 2 (SOCS2).
The expression system may be introduced into the cancer cell by viral infection (in particular, adenoviral, lentiviral, or AAV infection).
In another aspect of this disclosure, provided herein is a method of treating a subject having a cancer characterized by m6A RNA hypermethylation, the method comprising introducing into a cancer cell in the subject the expression system as provided in this disclosure. In some embodiments, the method comprises inhibiting a cancer cell of the subject's cancer in the subject. In some embodiments, expression of the tumor suppressor protein results in decreasing at least one of cell proliferation, cell migration, or metastasis of the cancer.
In some embodiments, the cancer can comprise acute myeloid leukemia (AML), glioblastoma (GBM), lung cancer, endometrial cancer, cervical cancer, ovarian cancer, breast cancer, colorectal cancer (CRC), a hepatocellular carcinoma (HCC), pancreatic cancer, gastric cancer, prostate cancer, and/or renal cell carcinoma. In certain embodiments, the cancer comprises hepatocellular carcinoma.
In some embodiments, the expression system can be introduced into the subject by viral infection (adenoviral, lentiviral, AAV).
While SOCS2 and p53 are provided as examples, the skilled artisan would recognize that other tumor suppression proteins can be expressed depending on the type of cancer by cloning a coding sequence of any of the gene products from Table 1 into the expression system as the effector protein.
Effects of expression of tumor suppression proteins according to the present disclosure include: inhibition of mitogenic signaling pathways; inhibition of cell cycle progression; inhibition of “pro-growth” programs of metabolism and angiogenesis; inhibition of invasion and metastasis; stabilization of the genome; DNA repair; and induction of apoptosis.
Additional examples are provided in Table 1 below:
Described herein are methods of reducing m6A effector regulator expression in a sample or a subject. In particular, CRISPRi can be utilized to knock-down expression of m6A “writers”, which are proteins that are responsible for m6A dysregulation (in particular hypermethylation) observed in cancer cells.
In embodiments, described herein is a method of reducing m6A effector regulator expression, comprising: introducing an expression system into a subject having or suspected of having a cancer, wherein the expression system comprises: a first DNA construct comprising a polynucleotide encoding a fusion protein, wherein the fusion protein comprises an N6-methyladenosine (m6A) binding domain of a YT521-B homology (YTH) domain-containing protein fused to a catalytic domain of a cytidine deaminase or a catalytic domain of an adenosine deaminase; a second DNA construct comprising a polynucleotide encoding a heterologous polypeptide, the polynucleotide encoding a heterologous polypeptide comprising: a polynucleotide encoding a catalytically-dead RNA-guided endonuclease; a polynucleotide encoding a m6A sensor sequence; and a polynucleotide encoding a dihydrofolate reductase (DHFR); and an sgRNA configured to bind to an m6A regulator. In embodiments, the sgRNA is configured to bind to a m6A regulator listed in Table 2. In embodiments, the cancer comprises at least one of acute myeloid leukemia (AML), glioblastoma (GBM), lung cancer, endometrial cancer, cervical cancer, ovarian cancer, breast cancer, colorectal cancer (CRC), a hepatocellular carcinoma (HCC), pancreatic cancer, gastric cancer, prostate cancer, or renal cell carcinoma. In embodiments, the cancer is a cancer listed in Table 1 or Table 2. In embodiments, the catalytically-dead RNA-guided endonuclease is a dCas9 or dCas13.
Described herein are methods of reducing m6A hypermethylation in a subject or sample. In embodiments, the methods comprise: introducing an expression system into a subject having or suspected of having a cancer, wherein the expression system comprises: a first DNA construct comprising a polynucleotide encoding a fusion protein, wherein the fusion protein comprises an N6-methyladenosine (m6A) binding domain of a YT521-B homology (YTH) domain-containing protein fused to a catalytic domain of a cytidine deaminase or a catalytic domain of an adenosine deaminase; a second DNA construct comprising a polynucleotide encoding a heterologous polypeptide, the polynucleotide encoding a heterologous polypeptide comprising: a polynucleotide encoding a catalytically-dead RNA-guided endonuclease; a polynucleotide encoding a m6A sensor sequence; and a polynucleotide encoding a dihydrofolate reductase (DHFR); and an sgRNA configured to bind to an m6A regulator. In embodiments, the sgRNA is configured to bind to a m6A regulator listed in Table 2. In embodiments, the cancer comprises at least one of acute myeloid leukemia (AML), glioblastoma (GBM), lung cancer, endometrial cancer, cervical cancer, ovarian cancer, breast cancer, colorectal cancer (CRC), a hepatocellular carcinoma (HCC), pancreatic cancer, gastric cancer, prostate cancer, or renal cell carcinoma. In embodiments, the cancer is a cancer listed in Table 1 or Table 2. In embodiments, the catalytically-dead RNA-guided endonuclease is a dCas9 or dCas13.
A subject can be a subject having or suspected of having a cancer as described herein. Introducing to the subject can comprise viral infection or electroporation. Modulating m6A levels by affecting an m6A regulator can decrease at least one of cell proliferation, cell migration, or metastasis.
While METTL3 is provided as an example, the skilled artisan would recognize that other tumor suppression proteins can be expressed depending on the type of cancer by cloning a coding sequence of any of the gene products in Table 2 below into the expression system as the effector protein in order to reduce hypermethylation (or the effects thereof). It would be recognized that m6A “erasers” may also be expressed as tumor suppression proteins, while m6A “writers” can be targeted by the catalytically-dead RNA-guided endonuclease to block protein expression of the writers.
Additional aspects of CRISPRi generally can be found, for example, in Carroll & Giacca, CRISPR activation and interference as investigative tools in the cardiovascular system, Int. J. of Biochem. & Cell Bio., Volume 155, February 2023, 106348, the contents of which regarding CRISPRi are incorporated by reference as if fully set forth herein.
(Adapted from Pan J, Huang T, Deng Z, Zou C. Roles and therapeutic implications of m6A modification in cancer immunotherapy. Front Immunol. 2023 Mar. 7; 14:1132601. doi: 10.3389/fimmu.2023.1132601. PMID: 36960074; PMCID: PMC10028070). Additional m6A regulators can be found, for example, in Gu et al., RNA m6A Modification in Cancers: Molecular Mechanisms and Potential Clinical Applications, The Innovation, Vol. 1, Issue 3, Article 100066, Nov. 25, 2020, doi: 10.1016/j.xinn.2020.100066; and Chen, X. Y., Zhang, J. & Zhu, J. S., The role of m6A RNA methylation in human cancer, Mol Cancer 18, 103 (2019), doi: 10.1186/s12943-019-1033-z, the contents of all of which are incorporated by reference regarding m6A regulators, cancers, and effectors of the m6A regulators.
Also provided are methods of treating a disease or disorder in a subject in need thereof, wherein the method comprises administering any of the expression systems described herein to the subject. In some methods, the subject has cancer. In some methods, the subject is diagnosed with a disease or disorder (e.g., cancer).
As used herein, the term “administering”, “administration”, or “administer” means delivering the pharmaceutical composition as described herein to a target cell or a subject. Administration refers to the act of introducing, injecting or otherwise physically delivering a substance as it exists outside the body (e.g., one or more nucleic acids, vectors, viruses, cells, or pharmaceutical compositions described herein) into a subject. The compositions described herein can be delivered to subjects in need thereof by any suitable route or a combination of different routes, including systemic administration (e.g., intravenous, intravascular, or intra-arterial injection), local injection into the heart muscle, local injection into the CNS (e.g., intracranial injection, intracerebral injection, intracerebroventricular injection, or injection into the cerebrospinal fluid (CSF) via the cerebral ventricular system, cisterna magna, or intrathecal space), or local injection at other bodily sites (e.g., intraocular, intramuscular, subcutaneous, intradermal, or transdermal injection). In some embodiments, the compositions described herein are administered into the coronary arteries. In some embodiments, the compositions described herein are administered into the coronary sinus.
As used herein, the terms “treatment”, “treat”, or “treating” refer to a clinical intervention made in response to a disease, disorder or physiological condition manifested by a patient or to which a patient may be susceptible. The aim of treatment includes the reduction, alleviation, slowing, or stopping of the progression or worsening of a disease, disorder, or condition, including reducing or preventing one or more of the effects or symptoms of the disease, disorder, or condition and/or the remission of the disease, disorder or condition, for example, a cardiac condition, in the subject. Thus, in the disclosed methods, treatment can refer to a 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% reduction in the severity of a cardiac condition. For example, a method for treating a cardiac condition is considered to be a treatment if there is a 10% reduction in one or more symptoms of a cardiac condition in a subject as compared to a control. Thus, the reduction can be a 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, or any percent reduction in between 10% and 100% as compared to native or control levels. It is understood that treatment does not necessarily refer to a cure or complete ablation of the disease or symptoms of the disease.
Administration can be performed by injection, by use of an osmotic pump, by electroporation, or by other means. In some approaches, administration of the compositions of the present disclosure can be performed before, after, or simultaneously with surgical treatment.
Dosage values may depend on the nature of the product and the severity of the condition. It is to be understood that for any particular subject, specific dosage regimens can be adjusted over time and in course of the treatment according to the individual need and the professional judgment of the person administering or supervising the administration of the compositions. Accordingly, dosage ranges set forth herein are exemplary only and are not intended to limit the scope or practice of the claimed composition.
A therapeutically effective amount of such a composition may vary according to factors such as the disease state, age, sex, weight of the individual, and whether it is used concomitantly with other therapeutic agents. Dosage regimens may be adjusted to provide the optimum response. A suitable dose can also depend on the particular viral vector used, or the ability of the viral vector to elicit a desired response in the individual. A therapeutically effective amount is also one in which any toxic or detrimental effects of the viral vector are outweighed by the therapeutically beneficial effects. Other factors determining a dose can include, e.g., other medical disorders concurrently or previously affecting the subject, the general health of the subject, the genetic disposition of the subject, diet, time of administration, and any other additional therapeutics that are administered to the subject. It should also be understood that a specific dosage and treatment regimen for any particular subject also depends upon the judgment of the treating medical practitioner.
The effective amount of the compositions described herein can be determined by one of ordinary skill in the art. One of skill in the art will appreciate that an effective amount of a composition, for example, comprising an AAV or a lentivirus, can be empirically determined. An effective amount of any of the compositions described herein will vary and can be determined by one of skill in the art through experimentation and/or clinical trials. For example, quantification of genome copies (GC), vector genomes (VG), virus particles (VP), or infectious viral titer may be used as a measure of the dose contained in a formulation or suspension. Any method known in the art can be used to determine the GC, VG, VP or infectious viral titer of the virus compositions of the invention, including as measured by qPCR, digital droplet PCR (ddPCR), UV spectrophotometry, ELISA, next-generation sequencing, or fluorimetry as described, e.g., in Dobkin et al., “Accurate Quantification and Characterization of Adeno-Associated Viral Vectors.” Front Microbiol 10: 1570-1583 (2019); Lock et al., “Absolute determination of single-stranded and self-complementary adeno-associated viral vector genome titers by droplet digital PCR.” Hum Gene Ther Methods 25: 115-125 (2014); Sommer, et al., “Quantification of adeno-associated virus particles and empty capsids by optical density measurement.” Mol Ther 7: 122-128 (2003); Grimm, et al. “Titration of AAV-2 particles via a novel capsid ELISA: packaging of genomes can limit production of recombinant AAV-2.” Gene Ther 6: 1322-1330 (1999); Maynard et al., “Fast-Seq: A Simple Method for Rapid and Inexpensive Validation of Packaged Single-Stranded Adeno-Associated Viral Genomes in Academic Settings.” Hum Gene Ther 30(6): 195-205 (2019); Piedra, et al., “Development of a rapid, robust, and universal picogreen-based method to titer adeno-associated vectors.” Hum Gene Ther Methods 26: 35-42 (2015); which are incorporated herein by reference.
For intravenous injection, an exemplary human dosage range in vector particles (vp) may be between 5×10^13 and 10×10^14 vp per kilogram bodyweight (vp/kg) in a volume of 1-100,000 μl. In one embodiment, an exemplary human dose for intramuscular (cardiac muscle injection) or intracoronary delivery may be 1×10^14 to 5×10^14 vp per injection into the heart in a volume of 1-1000 μl.
In one approach, the composition is administered in a single dosage selected from those above listed. In another embodiment, the method involves administering the compositions in two or more dosages (e.g., split dosages). In another embodiment, multiple injections are made at different locations. In another embodiment, a second administration of the composition is performed at a later time point. Such time point may be weeks, months or years following the first administration. In some embodiments, multiple treatments may be required in any given subject over a lifetime.
As mentioned above, a targeted m6A-coupled effector protein delivery system can be used in cancer therapy. For example, METTL3 is elevated in many cancers, and hypermethylation of oncogenic mRNAs leads to increased translation and cancer progression (Vu et al. “The N(6)-methyladenosine (m(6)A)-forming enzyme METTL3 controls myeloid differentiation of normal hematopoietic and leukemia cells,” Nat Med. 2017; 23(11):1369-76). Current strategies for overcoming this have focused on developing drugs that inhibit METTL3. However, this approach can have unwanted effects since it can impact the methylation of all mRNAs. Thus, using the m6A sensor system to express a tumor suppressor protein or to deliver CRISPR systems targeting upregulated oncogenes offers a more targeted approach.
Additionally, the m6A sensor system can be used to develop an m6A-coupled effector protein expression system. To demonstrate the utility and versatility of this technology, m6A sensor systems can be engineered and utilized to deliver a tumor suppressor protein to counteract the effects of hypermethylation in cancer cells, and (in embodiments) to express METTL3-targeting CRISPRi tools to maintain cellular m6A levels through a METTL3 feedback mechanism or to express other tumor suppression proteins (such as cell cycle proteins, for example, p53). The utility of the system to influence physiological outcomes can also be studied.
Cell culture. All cell types used in this study were cultured at 37° C. and 5% CO2 using the recommended cell type-specific growth medium. HEK293T cells (ATCC, CRL-3216), HeLa cells (ATCC, CRM-CCL-2), and NIH/3T3 cells (ATCC, CRL-1658) were cultured in Dulbecco's Modified Eagle's Medium (DMEM, Corning). A549 cells (ATCC, CCL-185) and CHO-K1 cells (ATCC, CCL-61) were cultured in Ham's F-12K (Kaighn's) Medium (Gibco). Huh-7 cells (obtained through the Duke University Cell Culture Facility) were cultured in Dulbecco's Modified Eagle's Medium (DMEM, Corning) with the addition of 12.5 mL of 1M HEPES (Fisher Scientific). HepG2 cells (ATCC, HB-8065) were cultured in Gibco Minimum Essential Media (MEM, Gibco) with the addition of 1% Sodium Pyruvate (Fisher Scientific) and 1% NEAA (Fisher Scientific). METTL3 degron cells were cultured as for HEK293T cells. All cell lines were cultured with the addition of 10% fetal bovine serum (Avantor) and 10 units/mL Penicillin/10 μg/mL Streptomycin (Gibco) to their respective growth media. HEK293T cells were tested for mycoplasma infection by the Duke University Cell Culture Facility and were confirmed to be mycoplasma-free.
Plasmids and cloning. The sequence for the EGFP-DHFR reporter mRNA was synthesized using custom gene synthesis (IDT gblock). All RAC consensus motifs within the EGFP sequence were mutated to avoid m6A methylation and potential editing of the EGFP coding sequence. Synonymous/codon-optimized mutations were used when possible. The EGFP and DHFR coding sequences are separated by a linker region which contains the first 81 nt of the human ACTB 3′UTR with some modifications (Table 2). The “m6A sensor sequence” consists of 5′-GCGGACUUACGACAG-3′ and contains the m6A sites at positions 1216 and 1222 of ACTB, with mutations of some nearby residues to enable C-to-U editing sites that produce in-frame stop codons (Table 3). The DHFR sequence contains the E. coli DHFR gene as previously described25. This EGFP-DHFR gblock sequence was cloned into the pCMV-APOBEC1-YTH plasmid20 at NotI/XhoI sites. The resulting plasmid, pCMV-EGFP-DHFR, was used for experiments involving expression of EGFP-DHFR alone or co-transfected with APOBEC1-YTH or APOBEC1-YTHmut. For all other experiments, the pGEMS plasmid was used, which contains CMV-EGFP-DHFR and EF1a-APOBEC1-YTH. pGEMSmut is the same plasmid but contains EF1a-APOBEC1-YTHmut. To generate pGEMS and pGEMSmut, we first generated iDuet101A-APOBEC1-YTH and iDuet101A-APOBEC1-YTHmut by cloning APOBEC1-YTH/YTHmut from pCMV-APOBEC1-YTH/YTHmut into iDuet101A (a gift from Linzhao Cheng, Addgene plasmid #17629) using XbaI/ClaI sites. EGFP-DHFR was then cloned into iDuet101A-APOBEC1-YTH and -YTHmut at NruI and SanDI sites to generate pGEMS and pGEMSmut, both of which contain the EGFP-DHFR sequence under control of the CMV promoter and APOBEC1-YTH or APOBEC1-YTHmut under control of the EF1a promoter. The puromycin-P2A-rtTA from TLCV2 (a gift from Adam Karpf, Addgene #87360) was also inserted into pGEMS and pGEMSmut using Gibson assembly.
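By way of non-limiting illustration, the way single C-to-U edits in the m6A sensor sequence can yield in-frame stop codons may be sketched as follows. The sketch assumes that the reading frame begins at the first nucleotide of the 15-nt sensor sequence quoted above; that frame, the function name `stop_creating_edits`, and the constant names are assumptions made for illustration and are not taken from the constructs themselves.

```python
# Illustrative sketch only. The reading frame (frame=0) is an assumption;
# the source gives the sensor sequence but not its frame within the linker.
STOP_CODONS = {"UAA", "UAG", "UGA"}
SENSOR = "GCGGACUUACGACAG"  # m6A sensor sequence as quoted in the text


def stop_creating_edits(rna, frame=0):
    """List single C-to-U edits that turn a sense codon into a stop codon.

    Returns tuples of (1-based nucleotide position, original codon, edited codon).
    """
    hits = []
    for i, base in enumerate(rna):
        if base != "C" or i < frame:
            continue
        start = i - ((i - frame) % 3)      # first nucleotide of the codon holding i
        codon = rna[start:start + 3]
        if len(codon) < 3:                 # skip an incomplete trailing codon
            continue
        offset = i - start
        edited = codon[:offset] + "U" + codon[offset + 1:]
        if edited in STOP_CODONS and codon not in STOP_CODONS:
            hits.append((i + 1, codon, edited))
    return hits


if __name__ == "__main__":
    for position, before, after in stop_creating_edits(SENSOR):
        print(position, before, "->", after)
```

Under the assumed frame, the two candidate edits convert CGA to UGA and CAG to UAG, which is consistent with the description of editing sites that produce in-frame stop codons upstream of DHFR.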
The hPGK-DsRed-Express2 construct was cloned out of LVDP-CArG-RE-GPR (a gift from Stelios Andreadis, Addgene plasmid #89762) and subcloned into pGEMS and pGEMSmut by Gibson assembly to generate pGEMS-II and pGEMSmut-II. For experiments using GEMS-EGFP-PEST, the PEST destabilization domain was subcloned from the pCAG-GFP-PEST plasmid (a gift from Debra Silver) and inserted at the C-terminus of EGFP to produce pGEMS-II-PEST. To generate pGEMS-SOCS2 and pGEMS-p53, human SOCS2 and p53 CDSs were amplified from a cDNA library prepared from HeLa cells and subcloned into pGEMS-II in place of EGFP.
The GEMS-dCas13 system (dCas13-NLS-APO1-YTH) was adapted from pCMV-dCas13-M3nls (a gift from David Liu, plasmid #155366), by subcloning dCas13-NLS upstream of APO1-YTH in the pGEMS-II-PEST plasmid. dCas13 gRNA sequences (listed in Table 3) were subcloned into the pC016 plasmid (a gift from Feng Zhang, plasmid #91906).
Plasmid Transfection. Transfections were performed using Fugene HD (Promega) according to the manufacturer's instructions. For METTL3 inhibition experiments, cells were treated with 10 μM or 30 μM of STM2457 (MedChemExpress) for 16 hours prior to transfection. Cells were treated with 0.1% DMSO (VWR Life Science) as a control. For experiments using METTL3 degron cells, 0.1 mg/mL of auxin or equivalent volume of H2O (control) was added to the cells for 24 hours prior to plasmid transfection.
Microscopy. All images were obtained using a Leica DMi8 inverted fluorescence microscope. Images were processed using the Leica LAS X software. 4-5 fields of view were obtained per sample, and representative images were selected for each experiment.
Quantitative Microscopy. HEK293T cells were plated in 10 cm cell culture-treated dishes at a density of 2.2×10^6 cells per plate and allowed to grow overnight. Cells were then treated with 30 μM of STM2457 or DMSO control for 16 hours. Cells were then transfected with the GEMS-PEST-DsRed plasmid and allowed to grow for an additional 16 hours. Prior to imaging, media was replaced with 1×PBS with 1 μg/mL Hoechst nuclear fluorescent stain (ThermoFisher Scientific) for 30 minutes. Cells were imaged using a Leica DMi8 fluorescence microscope. Images were analyzed by ImageJ image analysis software. First, fluorescence channels were separated and RGB fluorescence channels were converted to grayscale (16-bit). Binary image thresholds were set in relation to DsRed fluorescence signal and background noise was subtracted. Individual cells were selected, and pixel intensities were generated for each individual cell within the field of view. Similar binary image thresholding, cell selection, and intensity datasets were generated for the EGFP image channel. Data were analyzed by dividing EGFP signal intensity for each cell by average DsRed signal intensity. Data were plotted in JMP software (JMP 17.0, SAS Institute Inc.) by normalized EGFP signal intensity for each treatment. Boxplots represent interquartile range of the data and whiskers represent minimum and maximum data points excluding outliers. Statistical significance was calculated using a two-way t-test assuming unequal variance.
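As a non-limiting sketch of the normalization and comparison described above, per-cell EGFP intensities can be divided by the average DsRed intensity and two treatment groups compared with a t-test that does not assume equal variance. The sketch assumes the unequal-variance test corresponds to Welch's t-test; the function names and all intensity values are hypothetical and are not data from this disclosure.

```python
from math import sqrt
from statistics import mean, variance


def normalize_egfp(egfp_per_cell, dsred_per_cell):
    """Divide each cell's EGFP intensity by the average DsRed intensity."""
    dsred_average = mean(dsred_per_cell)
    return [egfp / dsred_average for egfp in egfp_per_cell]


def welch_t(group_a, group_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    se_a = variance(group_a) / len(group_a)   # squared standard error, group A
    se_b = variance(group_b) / len(group_b)   # squared standard error, group B
    t = (mean(group_a) - mean(group_b)) / sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(group_a) - 1) + se_b ** 2 / (len(group_b) - 1)
    )
    return t, df


if __name__ == "__main__":
    # Hypothetical per-cell intensities (arbitrary units), for illustration only.
    dmso = normalize_egfp([820.0, 760.0, 905.0, 870.0], [400.0, 420.0, 390.0, 410.0])
    inhibitor = normalize_egfp([310.0, 280.0, 350.0, 295.0], [405.0, 395.0, 415.0, 400.0])
    t_stat, dof = welch_t(dmso, inhibitor)
    print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A p-value can then be obtained from the t distribution with the computed degrees of freedom using any standard statistics package.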
Western blotting. Cells were dissociated in culture plates using TrypLE (Gibco) and collected by centrifugation (6,000 rpm, 10 minutes, 4° C.). Cell pellets were resuspended in 150 μL chilled standard RIPA buffer (1% Triton X-100, 0.1% Sodium Deoxycholate, 0.1% SDS) prepared with the addition of Complete Mini protease inhibitor (Sigma Aldrich) immediately before use. Cells were resuspended and incubated on ice in RIPA buffer for 30 minutes. Cell lysates were cleared by centrifugation (13,000 rpm, 10 minutes, 4° C.) and mixed in a 1:1 ratio with NuPage LDS sample buffer (Invitrogen) with 5% Beta-Mercaptoethanol (Sigma). Samples were separated by gel electrophoresis on NuPage 4-12% Bis-Tris SDS-PAGE gels (Invitrogen), then transferred onto PVDF membranes (Amersham) using semi-dry transfer (Trans-Blot Turbo, Biorad). Membranes were blocked with 4% milk powder in 0.1% PBST and incubated with the appropriate primary antibody overnight at 4° C. Membranes were then washed with 0.1% PBST and incubated for 1 hour at room temperature in secondary antibody. Blots were washed and incubated in chemiluminescent ECL reagent solution (Amersham) then imaged in a ChemiDoc MP imaging system (BioRad) under chemiluminescent and colorimetric light. In western blot images, cyclophilin A is shown as a loading control and all western blots are representative of a minimum of 3 independent biological replicates. The following antibodies were used in this study: GFP tag Polyclonal antibody (Proteintech, 50430-2-AP); Anti-HA rabbit monoclonal antibody (Cell Signaling, 3724); Anti-Cyclophilin A antibody (Cell Signaling, 2175S); Anti-DsRed-Express2 Monoclonal Antibody (Fisher Scientific, CF180014); Anti-METTL3 antibody (abcam, ab195352); Anti-SOCS2 antibody (abcam, ab109245); Anti-p53 (7F5) Rabbit monoclonal antibody (Cell Signaling, 2527S); Recombinant Anti-JAK2 Antibody (abcam, ab108596); Recombinant Anti-STAT5 (phosphorylated Y694) (abcam, ab32364); Anti-STAT5 antibody (Cell Signaling, 25656S). Secondary antibodies used in this study: Goat Anti-Rabbit IgG HRP (Abcam, ab6721), Goat anti-mouse IgG HRP (Fisher Scientific, 62-6520).
For densitometry analysis, western blot images were quantified using ImageJ software 58. EGFP band intensity was normalized to either EGFP-DHFR or (EGFP+EGFP-DHFR) as indicated in each experiment. Similar densitometry analysis was used to measure total protein production (EGFP+EGFP-DHFR), normalized to the Cyclophilin A loading control. At least two replicates were used for each western blot quantification analysis. Bars are plotted as mean intensity ratio and error bars represent standard deviation.
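The densitometry normalization above amounts to a ratio per replicate followed by a mean and standard deviation. A minimal sketch, assuming band intensities have already been exported from ImageJ as plain numbers; the function names and the example intensities are hypothetical:

```python
from statistics import mean, stdev


def egfp_fraction(egfp, egfp_dhfr):
    """EGFP band intensity normalized to total reporter (EGFP + EGFP-DHFR)."""
    return egfp / (egfp + egfp_dhfr)


def total_reporter(egfp, egfp_dhfr, cyclophilin_a):
    """Total protein production normalized to the Cyclophilin A loading control."""
    return (egfp + egfp_dhfr) / cyclophilin_a


def summarize_replicates(bands):
    """Return (mean, sample standard deviation) of the EGFP fraction.

    `bands` is a list of (EGFP, EGFP-DHFR) intensity pairs, one per replicate;
    at least two replicates are needed for the standard deviation.
    """
    ratios = [egfp_fraction(e, d) for e, d in bands]
    return mean(ratios), stdev(ratios)


# Hypothetical intensities from two replicate blots.
mean_ratio, sd_ratio = summarize_replicates([(30.0, 70.0), (50.0, 50.0)])
```

The mean would be plotted as the bar height and the standard deviation as the error bar.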
Sanger sequencing and RT-qPCR. Cells were dissociated in culture using TrypLE (Gibco) treatment for 2 minutes and collected by centrifugation (6,000 rpm, 10 minutes, 4° C.). RNA was extracted using the RNeasy Plus Mini kit (Qiagen) according to the manufacturer's protocol. Extracted RNA was purified for genomic and plasmid DNA contamination by incubation with 1 μL DNase I at 37° C. for 30 minutes, and purified RNA was precipitated in 2.5 volumes of isopropanol overnight. RNA quantification was performed using the Qubit 4.0 fluorometer following the Qubit RNA broad range assay kit (Invitrogen). 500 ng of RNA was used for reverse transcription with the iScript reverse transcription supermix (Biorad) following the manufacturer's protocol. PCR was then used to amplify the region of interest in the m6A sensor mRNA sequence (see Table 2). PCR products were column purified before Sanger sequencing using QiaQuick spin columns (Qiagen) and 10 μL reactions were submitted for standard amplicon sequencing (Azenta Life Sciences). C-to-U editing percentage was calculated using the EditR web server 59. To measure gene expression using gene-specific oligos, qPCR was performed using iTaq universal SYBR green Supermix (Biorad). 20 μL reactions were set up with 1 μL of cDNA for each sample, in 3 technical replicates. qPCR was performed on a Biorad CFX Duet real-time PCR instrument, and results were analyzed by normalizing threshold cycle of each target gene to 18S rRNA according to established methods60. At least 2 biological replicates were used for each sample, and results are plotted as mean relative fold expression comparing treatment to the control group as indicated. Error bars represent standard deviation, and statistical significance was calculated using a two-way t-test assuming unequal variance.
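The relative fold expression calculation can be illustrated as follows. This sketch assumes that the "established methods" cited above correspond to the widely used 2^−ΔΔCt calculation; that is an assumption, not something the text states. The function name and the Ct values are hypothetical.

```python
from statistics import mean


def relative_fold_expression(ct_target_treated, ct_18s_treated,
                             ct_target_control, ct_18s_control):
    """Fold change of a target gene in treated vs. control samples.

    Each argument is a list of technical-replicate Ct values.
    Assumes the 2^-ddCt method with 18S rRNA as the reference gene.
    """
    # dCt: target normalized to the 18S rRNA reference within each sample
    d_ct_treated = mean(ct_target_treated) - mean(ct_18s_treated)
    d_ct_control = mean(ct_target_control) - mean(ct_18s_control)
    # ddCt: treated relative to control
    dd_ct = d_ct_treated - d_ct_control
    return 2 ** (-dd_ct)


# Hypothetical Ct values from 3 technical replicates per reaction.
fold = relative_fold_expression(
    [24.0, 24.1, 23.9], [10.0, 10.0, 10.0],
    [25.0, 25.1, 24.9], [10.0, 10.0, 10.0],
)
```

A target that crosses threshold one cycle earlier in the treated sample, with the reference unchanged, gives a fold change of about 2.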
m6A detection using RT-qPCR. RNA was extracted and treated with DNase I as described above, and a relative quantification of m6A by RT-qPCR was adapted from a previously described method28. 4 reverse transcription reactions were set up using 150 ng of RNA for each: 2 reactions using BstI polymerase (NEB), and 2 reactions using Superscript reverse transcriptase enzyme (Fisher Scientific). For each reverse transcriptase, one reaction included a primer adjacent (+) to the site being tested (reverse complement reverse oligo immediately downstream of the site), and the other reaction included a primer that is non-adjacent (−) to the site. The BstI reaction consisted of 10 U BstI polymerase, 50 mM dNTPs, 500 nM oligos (adjacent (+) or non-adjacent (−)), in 1× ThermoPol Buffer. Reactions were incubated in a thermal cycler with the following cycling settings: 3 minutes at 25° C., 30 minutes at 50° C., and 3 minutes at 85° C. Superscript III (SSIII) reactions consisted of 200 U Superscript III, 1 M DTT, 25 mM MgCl2, 10 mM dNTPs, 500 nM oligos (adjacent or non-adjacent), 2 μL 10× FS Buffer, and water up to 20 μL. SSIII thermal cycler settings were set according to manufacturer's protocol. All 4 reactions were used as a template in a qPCR reaction, in 3 technical replicates, using primers that flank the m6A site being tested. Threshold cycle values were obtained and relative m6A was calculated using the formula: 2^−[(Ct Bst(−) − Ct SSIII(−))/(Ct Bst(+) − Ct SSIII(+))]. At least 2 biological replicates were used for each sample. Error bars represent standard deviation, and statistical significance was calculated using a two-way t-test assuming unequal variance.
Flow cytometry. After 24 hours, the culture media was replaced with media containing 2 μM puromycin for an additional 72 hours to select against the non-infected cells. Cells were then cultured in puromycin-free media for 48 hours, followed by transfection with the GEMS-EGFP system. After 24 hours, cells were dissociated by TrypLE (Gibco) treatment for 10 minutes at 37° C. and 5% CO2. Trypsinized cells were resuspended in 5 mL of growth media containing 1% FBS and passed through a 4 μm cell filter to further separate the culture into single cells. Flow cytometry analysis was performed on a Sony MA900 cell sorter. Cell suspensions were first sorted by size and forward scatter to gate on live cells (BSC-A vs. FSC-A) and to eliminate doublets (FSC-H vs. FSC-A). 2 lasers were used to sort EGFP-positive cells (laser excitation 488 nm) and DsRed-positive cells (laser excitation 561 nm). Cells were sorted in a 4-way channel and collected in 5 mL conical tubes containing 1 mL of 1×PBS. Thresholds for EGFP and DsRed negative fluorescence were pre-calibrated using non-transfected HEK293T cells. Collection stopped when 500,000 cells were collected in the target populations (DsRed+/EGFP− and DsRed+/EGFP+). Sorted cells were collected by centrifugation (6,000 rpm, 10 minutes, 4° C.), and genomic DNA was extracted and analyzed by sequencing using METTL3 locus-specific primers. Indels at the METTL3 locus were identified by aligning the obtained sequences with the genomic sequence of endogenous METTL3 (GRCh38; chr14:21503198-21503835). Results were reported as the percentage of cells containing METTL3 indels out of the total number of cells obtained in each sorted cell population.
For APO1-YTH vs. APO1-YTHmut analysis, HEK293T cells were co-transfected with EGFP-DHFR and either the APO1-YTH or APO1-YTHmut plasmid. Cells were collected 24 hours later and samples were prepared for flow cytometry analysis as described above. Samples were analyzed using a Sony MA900 cell sorter and 1 million cells were recorded to measure EGFP fluorescence (laser excitation 488 nm). FCS files were analyzed using Floreada.io software and plotted on a density plot as the frequency of events vs. EGFP fluorescence for each sample.
Mass spectrometry analysis. Total RNA was extracted as described above, and mRNA was purified using two rounds of oligo(dT) purification with Dynabeads oligo-(dT) mRNA purification kit (Invitrogen), followed by 2 rounds of rRNA depletion using NEBNext rRNA Depletion Kit (V2.0, NEB), and an additional two rounds of oligo(dT) purification with Dynabeads oligo-(dT) mRNA purification kit (Invitrogen). All mRNA purification steps were performed following the manufacturer's instructions. Purified mRNA quality was checked using a Bioanalyzer high sensitivity RNA analysis 6000 pico kit (Agilent). For mass spectrometry analysis, 100-200 ng of purified mRNA was incubated with 2 U of Nuclease P1 (Sigma) with 2.5 mM ZnCl2 and 25 mM NaCl at 37° C. for 2 hours. mRNA samples were then treated with 5 U of Antarctic Phosphatase (NEB) for 2 hours at 37° C. Samples were then processed using the Xevo TQ-S mass spectrometry system. All nucleosides were quantified by retention time and ion mass transitions of 268.2 to 133.2 (A) and 282.2 to 150.1 (m6A). Data were plotted as a percentage of m6A relative to A. At least 2 biological replicates were performed for each sample.
Cell growth assays. Huh-7 and HepG2 cells were plated in 6-well culture plates and transiently transfected with the indicated plasmids. 12 hours after transfection, cells were dissociated using TrypLE (Gibco) treatment for 2 minutes at 37° C. and 5% CO2. Trypsinized cells were resuspended in 5 mL of cell type-specific growth media, and an aliquot was used to count the number of cells in the culture using a hemocytometer. 10,000 cells for each sample were plated in one well of a 6-well culture plate for a total of 6 wells per condition. A hemocytometer was used to count the number of cells in each well every 24 hours for 5 days. Counts were performed in 3 technical replicates, and the average number of cells was used to calculate the ratio as indicated in each experiment.
Migration assays. Huh-7 cells were plated in 6-well culture plates and transiently transfected with the indicated plasmids. 12 hours after transfection, cells were dissociated using TrypLE (Gibco) treatment for 2 minutes at 37° C. and 5% CO2. 500 cells from each sample were plated on the top of a 6.5 mm transwell membrane with 8 μm pores (Corning). Culture medium inside the transwell chamber was formulated without the addition of FBS, while the culture medium at the bottom of the well included 10% FBS as a chemoattractant. 24 hours after plating the cells in the transwells, transwells were washed twice with 1×PBS, and the non-migrated cells were cleared using a cotton swab on the top of the transwell membrane. The membrane was fixed with methanol for 30 minutes, then washed with 1×PBS. Membranes were then stained with 5% Crystal Violet (VWR) for 30 minutes, then washed 3 times with 1×PBS. Transwell membranes were placed on a microscope glass slide and imaged under a brightfield 20× objective. At least 4 images were obtained for each condition and representative images were selected.
A system for sensing m6A in cellular mRNAs was envisioned. That system has three main features: 1) it is genetically encoded to enable m6A sensing in living cells, 2) it is versatile and capable of being used in a variety of cell and tissue types, and 3) it provides a simple readout compatible with high-throughput studies. To achieve these goals, a system was designed that uses a reporter mRNA which produces a fluorescent protein (EGFP) only when the mRNA is methylated. This simple system, referred to as GEMS (genetically encoded m6A sensor), therefore couples cellular fluorescence with m6A methylation.
To achieve m6A-dependent production of EGFP in the GEMS system, DART-seq was used. DART-seq is a method that was previously developed for m6A detection20. DART-seq identifies m6A residues in cells by using a fusion protein consisting of the YTH domain, which directly binds to m6A sites, tethered to the cytidine deaminase APOBEC1. When the APOBEC1-YTH fusion protein is expressed in cells, it binds to m6A and catalyzes C-to-U editing of nearby cytidine residues (
The GEMS system contains two components: APO1-YTH and an m6A reporter mRNA (
To determine whether the GEMS system can sense cellular mRNA methylation, the system was transfected into HEK293T cells and cellular fluorescence was assessed 24 hours later. Cells expressing APO1-YTH together with the m6A reporter mRNA exhibit robust EGFP fluorescence, whereas cells only expressing the m6A reporter mRNA are dark (
As an additional control to demonstrate that EGFP production and cellular fluorescence are due to recognition of m6A by APO1-YTH, cells were transfected with the m6A reporter mRNA and APO1-YTHmut, a mutant version of the APO1-YTH fusion protein which lacks the full m6A binding region of the YTH domain and exhibits greatly reduced m6A-binding activity20. This resulted in loss of EGFP fluorescence and EGFP protein production as well as decreased editing of the m6A sensor sequence (
Both the EGFP and EGFP-DHFR protein products are detected in cells expressing the GEMS system (
Altogether, these data demonstrate that the GEMS system produces robust EGFP fluorescence that depends both on the m6A-binding ability of APO1-YTH and on methylation and C-to-U editing of the m6A sensor sequence. Furthermore, methylation of the GEMS reporter mRNA mirrors the m6A level seen in a similar region of the ACTB mRNA, indicating that GEMS is an accurate representation of endogenous cellular mRNA methylation.
This example demonstrates that the m6A sensor is METTL3-dependent. The GEMS system was expressed in HEK293T cells that contain an auxin-inducible degradation tag at the endogenous METTL3 locus and which exhibit decreased levels of m6A in the presence of auxin (
To determine whether the GEMS system can also detect elevated methylation caused by increased levels of METTL3, GEMS was introduced into HEK293T cells together with exogenous expression of METTL3. This led to increased EGFP fluorescence, a higher EGFP:EGFP-DHFR ratio, and increased C-to-U editing of the sensor sequence (
Since GEMS uses EGFP fluorescence as a readout for m6A, factors that inhibit general transcription, translation, or fluorescent protein (FP) production could potentially lead to a false readout and limit the utility of GEMS for some applications. To address this, the GEMS system was modified to include DsRed under the control of a separate promoter to control for transcription and general FP production (
HEK293T cells were infected with a Cas9-expressing lentivirus and sgRNAs targeting either METTL3 or the AAVS1 safe harbor gene locus followed by transfection with the GEMS system and flow cytometry to isolate cells based on red/green fluorescence. Cells were then subjected to targeted sequencing of the METTL3 locus to determine whether CRISPR-induced indels are enriched in DsRed+/EGFP− cells, which would be expected if selective reduction of EGFP fluorescence reflects METTL3 disruption. Indeed, METTL3 indels are substantially higher in DsRed+/EGFP− cells compared to DsRed+/EGFP+ cells (
This example discusses the utility of GEMS for sensing m6A across diverse cell types by expressing the system in a variety of mouse and human cell lines. For each cell type, EGFP protein production and fluorescence were observed, as well as editing of the m6A sensor sequence, indicating that the GEMS system is active (
Interestingly, some cell types were observed to have higher levels of EGFP fluorescence and sensor sequence editing than others. This could reflect different levels of m6A and perhaps GEMS can be used to report differential mRNA methylation across distinct cell types. To test this, the system was expressed in three commonly used human cell lines (HEK293T, HeLa, and Huh-7) with DsRed as an internal control. It was found that EGFP fluorescence, EGFP:EGFP-DHFR ratio, and m6A sensor sequence editing are highly similar in HEK293T and HeLa cells but substantially reduced in Huh-7 cells (
GEMS uses a single mRNA to sense m6A. To validate that GEMS activity reflects mRNA methylation levels globally, mRNA was purified from HEK293T, HeLa, and Huh-7 cells and mass spectrometry was performed to quantify m6A levels (
The m6A methyltransferase machinery has recently emerged as a promising therapeutic target for the potential treatment of cancer and other diseases29-32. However, efforts to identify METTL3 inhibitors have been hampered by the lack of methods that provide a simple readout for m6A methyltransferase activity in living cells on a scale that is compatible with HTS. Since GEMS couples m6A methylation with cellular fluorescence, it has potential utility as a HTS-compatible technology for determining the effects of drugs or small molecules on m6A levels in cells.
To explore whether the GEMS system can detect pharmacological inhibition of METTL3, HEK293T cells expressing GEMS were treated with STM2457, a small molecule inhibitor of METTL329, and quantitative microscopy was performed. A significant decrease in EGFP fluorescence was observed following STM2457 treatment (
The ability of the GEMS system to report m6A reduction depends in part on the half-life of EGFP: if cellular mRNA methylation decreases, this can potentially be difficult to detect due to the presence of pre-existing EGFP protein. It may be that an improved GEMS system could be developed by tagging EGFP with a destabilizing domain to reduce its half-life in cells. A PEST degradation sequence was therefore added to the EGFP coding sequence in the GEMS reporter mRNA; this modified system was tested for its ability to respond to METTL3 inhibition with STM2457. Indeed, the EGFP-PEST reporter enabled improved detection of m6A depletion compared to the original EGFP version (
Because the APO1-YTH protein edits cellular methylated mRNAs in addition to the GEMS reporter mRNA, it could potentially lead to unwanted effects in cells. Therefore, an alternative approach was developed to target APO1-YTH specifically to the reporter mRNA and reduce editing of endogenous cellular transcripts. An additional application of the m6A-coupled effector protein delivery system is driving expression of CRISPR tools that target METTL3. This can provide an m6A-dependent feedback mechanism which reduces METTL3 expression when m6A levels become too high and could therefore serve as a way to maintain m6A homeostasis in cells. This can be tested by developing a system that expresses m6A-coupled CRISPRi tools to inhibit METTL3 transcription (
dCas9
In the present embodiment, the GFP sequence of the m6A sensor system can be replaced with dCas9-KRAB, which is a fusion protein consisting of inactive Cas9 tethered to the Kruppel-associated box (KRAB) transcriptional repressor (Alerasool et al., An efficient KRAB domain for CRISPRi applications in human cells. Nat Methods. 2020; 17(11):1093-6). Then, a U6-METTL3 sgRNA cassette can be introduced into this plasmid. The result will be constitutive expression of the METTL3 sgRNA but only m6A-dependent dCas9-KRAB expression in the presence of doxycycline, which induces APO1-YTH (
Then, a lentivirus expressing this “m6A feedback system” can be packaged and used to infect HEK293T cells. RNA and protein can be isolated at various timepoints over the course of 72 hours (this can be expanded to longer times as needed). Sensor sequence methylation can be measured using SELECT. Sensor sequence editing can be evaluated with Sanger sequencing. Western blot can be used to assess METTL3, APO1-YTH, and dCas9-KRAB/dCas9-KRAB-DHFR protein levels. Global m6A levels in cellular mRNA can also be measured using UPLC-MS/MS. Collectively, these readouts can provide important quantitative metrics of how the m6A feedback system responds to gain/loss of m6A and how effective the feedback system is at maintaining m6A levels as the cell cycles between high and low levels of METTL3. As an additional approach, the cycling of m6A levels can be assessed using the m6A sensor system. Cells infected with the m6A feedback system can be transfected with the GFP-encoding m6A reporter mRNA. Live-cell imaging can be used to monitor GFP fluorescence over the course of 72 hours (or longer, as needed).
To determine whether a new protein of interest can be produced in place of GFP in the m6A sensor system, dCas9-KRAB was cloned in place of GFP. Robust dCas9-KRAB expression was detected using western blot (
The dCas9-KRAB effector protein delivery system can be expressed in METTL3 degron cells to show that auxin treatment (which leads to METTL3 degradation) reduces dCas9-KRAB expression. The system can also be used to show that STM2457, a METTL3 inhibitor, reduces dCas9-KRAB expression when the system is expressed in wildtype cells. Finally, experiments can be done to show that METTL3 overexpression increases dCas9-KRAB expression.
gRNAs that target other genes of interest (e.g., oncogenes) can be evaluated using this system. It is understood that any gene in the genome of a cell can be targeted by the dCas9-KRAB effector protein, as long as one or more gRNAs guide the dCas9-KRAB to the gene of interest.
Any polypeptide can be expressed in a cell by using the m6A-coupled effector protein expression system. Other proteins of interest include, but are not limited to SOCS2 and other tumor suppressors, which can be expressed in cancer cells to determine if expression of a tumor suppressor can reduce cancer cell proliferation, migration, and colony formation.
dCas13
Previous studies have shown that dCas13 can be tethered to the m6A methyltransferase machinery and coupled with guide RNA (gRNA)-mediated targeting to achieve methylation of cellular mRNAs of interest33. A similar approach of fusing dCas13 to APO1-YTH might enable targeted m6A recognition and C-to-U editing of the sensor sequence in the GEMS reporter mRNA. Thus, APO1-YTH in the GEMS system was replaced with dCas13-APO1-YTH, which was co-expressed in cells together with a gRNA targeting the m6A sensor sequence (
The m6A-coupled payload delivery system can be used to influence cellular function. METTL3 expression is elevated in several cancers, and m6A hypermethylation has been shown to promote cancer cell proliferation and tumorigenesis4. Although pharmacological inhibition of METTL3 is a promising strategy for counteracting the effects of hypermethylation of transcripts associated with cancer progression, such approaches may have unwanted consequences because they also influence methylation of other RNAs in the cell. Since the GEMS system can couple protein expression with m6A methylation, it could replace EGFP with effector proteins of interest to overcome the oncogenic effects of mRNA hypermethylation in cancer cells.
A protein expression system described herein can be tested by infecting Huh-7 cells with a lentivirus expressing the system (or other means of introducing exogenous polynucleotides into a cell, for example, lipofection, nucleofection, or electroporation). The GFP-expressing sensor system can be used as a control, with both systems including APO1-YTH under an inducible promoter. Cells can be treated with doxycycline, and cell proliferation and colony formation will be tested over the course of 72 hours using established protocols (Chen et al.). METTL3 inhibition with STM2457 will be used in wildtype cells as a control to confirm METTL3-dependent effects on proliferation and colony formation.
The m6A feedback system can be tested using a similar approach with Huh-7 cells as well as MOLM-13 cells, an AML cell line with high levels of METTL3 that exhibits reduced proliferation and colony formation in response to STM2457. Thus, in both cell types, it is expected that expression of the m6A feedback system can lead to high levels of m6A sensor sequence methylation and editing. Effector protein or other therapeutic delivery can lead to METTL3 transcription inhibition, reduced cell proliferation and colony formation as the cell cycles from high to low m6A.
Huh-7 cells are a hepatocyte-derived carcinoma cell line frequently used to model hepatocellular carcinoma (HCC). Previous studies have shown that METTL3 and other methyltransferase complex components are upregulated in HCC and associated with increased disease severity and cancer progression38,39. One mechanism for this is through hypermethylation of the SOCS2 mRNA, which acts as a tumor suppressor in HCC40, 41. Elevated m6A methylation of SOCS2 promotes its degradation and reduces SOCS2 protein levels to accelerate cancer cell growth36, 42. This system was chosen because METTL3-induced hypermethylation of SOCS2 mRNA leads to m6A-dependent transcript degradation and a reduction in SOCS2 protein, which in turn promotes HCC cell proliferation, migration, and colony formation (Chen et al. “RNA N6-methyladenosine methyltransferase-like 3 promotes liver cancer progression through YTHDF2-dependent posttranscriptional silencing of SOCS2,” Hepatology. 2018; 67(6):2254-70.) One could use the GEMS system to couple cellular m6A levels with SOCS2 protein expression and effectively rescue the loss of SOCS2 protein that is caused by hypermethylation of the SOCS2 mRNA (
To test this, the EGFP sequence in the GEMS reporter mRNA was replaced with the coding sequence for SOCS2. Expression of GEMS-SOCS2 in Huh-7 cells led to robust expression of SOCS2 protein and m6A sensor sequence editing (
To confirm activity of the GEMS-delivered SOCS2 protein, the JAK-STAT signaling pathway, which is inhibited by SOCS family proteins43-45, was examined. Reduced levels of phosphorylated JAK2 and STAT5 were observed; both JAK2 and STAT5 are known targets of SOCS2-mediated inhibition 45 (
Previous studies have shown that m6A-mediated SOCS2 depletion promotes cancer cell proliferation and migration40, 41. To determine whether GEMS-SOCS2 expression can reverse these effects, Huh-7 cells were transfected with GEMS-SOCS2 and cell growth was measured over the course of 5 days. Expression of GEMS-SOCS2 significantly reduced Huh-7 cell growth, indicating that SOCS2 delivery with the GEMS system can counteract the effects of m6A hypermethylation on cancer cell proliferation (
In other embodiments, the SOCS2 coding sequence can be cloned in place of GFP in the m6A sensor system described above using a lentiviral backbone. Huh-7 cells can then be infected with the system. APO1-YTH expression can be induced with doxycycline treatment. RNA and protein can be collected at various timepoints over the course of 72 hours. Sensor sequence methylation and editing can be measured with SELECT and Sanger sequencing, respectively, and SOCS2/SOCS2-DHFR and APO1-YTH levels can be assessed by Western blot. These studies can establish the timing and amount of SOCS2 protein expression that can be achieved by the system. To confirm that SOCS2 expression is m6A-dependent, the experiments can be repeated in cells treated with STM2457 to inhibit METTL3. Expression of the GFP-encoding m6A sensor system can be used in parallel as a control.
Next, the GEMS system was analyzed as a possible general strategy to deliver tumor suppressors to inhibit cancer cell growth. The tumor suppressor protein p53 regulates transcriptional programs involved in cell cycle arrest, apoptosis, and DNA repair and plays a critical role in the prevention of cancer progression49, 50. Consistent with this, the TP53 gene is mutated in nearly half of human cancers51, 52. Huh-7 cells express mutated p53 (Y220C) which is stable but has impaired DNA binding and transcriptional activity53, 54. Therefore, the wild type TP53 coding sequence was cloned into the GEMS system and introduced into Huh-7 cells. This led to robust p53 expression and upregulation of downstream p53 transcription targets, including CDKN1A and GADD45A (
A major advantage of the GEMS platform is that it enables protein output to be tuned to m6A levels. Indeed, the amounts of SOCS2 and p53 protein delivered by the GEMS system were compared in HepG2 and Huh-7 cells—elevated levels of both proteins were found in HepG2 cells, which have higher m6A41 (
Disclosed herein is a genetically encoded m6A sensor system which provides a fluorescent readout in cells when m6A is deposited on mRNA. This disclosure offers a simple, low-cost method for cellular m6A sensing which can be implemented in virtually any cell or tissue type and easily carried out by a standard molecular biology lab. The ability of GEMS to sense changes in m6A methylation in living cells makes it an attractive system for monitoring m6A dynamics in a variety of cell types and conditions. Indeed, as disclosed herein, GEMS can be used as a readout for m6A in a variety of mouse and human cell lines, and relative differences in EGFP reporter fluorescence can be used to identify differences in methylation levels between cell types. GEMS may also have wide utility for studies of m6A dynamics in cells. Since the sensitivity of GEMS for reporting changes in mRNA methylation depends in part on the half-life of EGFP, using a reporter protein with a short half-life can improve the sensitivity of GEMS for sensing dynamic regulation of m6A. Consistent with this, as discussed above, adding a PEST sequence to EGFP substantially reduces sensor protein longevity and improves the ability to detect changes in m6A caused by pharmacological inhibition of METTL3. Depending on the application, photoconvertible proteins or other reporter proteins could also be substituted for EGFP to further improve detection of m6A dynamics.
Also, GEMS may be utilized for in vivo monitoring of m6A. This could be achieved either through the generation of transgenic animals expressing the two main components of the GEMS system or by introducing GEMS into a desired tissue of interest using viral-mediated or other delivery methods. Such studies might be useful for monitoring the in vivo effects of m6A methylation inhibitors, for examining how certain conditions or stresses alter m6A, or for understanding tissue-specific differences in methylation.
Due to its simple design and ability to sense m6A in living cells, the GEMS system may be useful for a variety of HTS-based approaches. For instance, the factors that control m6A methylation in cells are still not completely understood, so GEMS may be useful for global knockout screens designed to identify cellular proteins that influence m6A. Additionally, GEMS will be highly enabling for drug discovery efforts, as it provides a simple method for screening drug or small molecule libraries to identify novel inhibitors of METTL3. Other methyltransferase complex proteins such as METTL14 and WTAP have also been implicated in human disease and are upregulated in several cancers56, 57 so such screens have the potential to uncover inhibitors of these proteins as well.
Although GEMS opens up several new avenues for both low- and high-throughput studies of m6A, there are some important considerations when using the system. For instance, factors that influence proteasomal degradation could impact EGFP-DHFR stability and therefore cellular fluorescence. Additionally, changes in transcription or translation rates could influence FP production, although the use of m6A-uncoupled internal reporters such as DsRed can help mitigate this. Lastly, since GEMS requires APO1-YTH expression, factors that influence the fusion protein's activity or m6A recognition could impact the system. APO1-YTH also edits cellular methylated RNAs in addition to the GEMS reporter mRNA, which could influence other processes in the cell. Importantly, as discussed above, tethering APO1-YTH to dCas13 enables targeted editing of the GEMS reporter mRNA and reduces editing of cellular mRNAs. Additional refining of the GEMS system based on this approach may further improve its functionality by limiting unwanted effects of APO1-YTH-mediated editing of endogenous methylated RNAs.
In addition to its utility as an m6A-coupled fluorescent reporter, the GEMS system can be programmed to deliver protein payloads of interest in an m6A-dependent manner. As discussed above, GEMS may be used to express SOCS2 and p53 in liver cancer cells, leading to slowed cell growth and reduced migration capacity. Thus, the GEMS system can be used both to rescue the expression of proteins whose production is decreased by mRNA hypermethylation, as in the case of SOCS2, or as a general strategy for tumor suppressor protein expression in cells with elevated m6A, as with p53. In theory, any protein of interest can be expressed using the GEMS system, opening up numerous possibilities for m6A-coupled effector protein expression as a means of achieving desired cellular outcomes or counteracting the effects of high or low levels of m6A. For instance, GEMS could be used to deliver CRISPR/Cas9 tools targeting METTL3 itself, which could be used to activate or inhibit METTL3 expression in response to changing levels of m6A and therefore maintain m6A homeostasis. Given the numerous associations between m6A dysregulation and human disease, the GEMS system has potential utility as a novel therapeutic strategy.
Additional aspects of sequences relating to polynucleotides and polypeptides as described herein and in Table 3 below can be found, for example, in PCT/US2022/079709, U.S. Pat. No. 11,680,109, and Meyer, K. D., “DART-seq: an antibody-free method for global m(6)A detection,” Nat Methods. 2019 December, 16(12):1275-1280 (published online Sep. 23, 2019); doi: 10.1038/s41592-019-0570-0, the entire contents of all of which (including sequence information and any supplemental information) are incorporated by reference in their entirety as fully set forth herein.
Disclosed are materials, compositions, and components that can be used for, can be used in conjunction with, can be used in preparation for, or are products of the disclosed embodiments. These and other materials are disclosed herein, and it is understood that when combinations, subsets, interactions, groups, etc. of these materials are disclosed that while specific reference of each various individual and collective combinations and permutations of these compositions may not be explicitly disclosed, each is specifically contemplated and described herein. For example, if a method is disclosed and discussed and a number of modifications that can be made to a number of molecules included in the method are discussed, each and every combination and permutation of the method, and the modifications that are possible are specifically contemplated unless specifically indicated to the contrary. Likewise, any subset or combination of these is also specifically contemplated and disclosed. This concept applies to all aspects of this disclosure including, but not limited to, steps in methods using the disclosed compositions. Thus, if there are a variety of additional steps that can be performed, it is understood that each of these additional steps can be performed with any specific method steps or combination of method steps of the disclosed methods, and that each such combination or subset of combinations is specifically contemplated and should be considered disclosed.
One skilled in the art will readily appreciate that the present disclosure is well adapted to carry out the objects and obtain the ends and advantages mentioned, as well as those inherent therein. The embodiments described herein are presently representative of preferred embodiments, are exemplary, and are not intended as limitations on the scope of the present disclosure. Changes therein and other uses that are encompassed within the spirit of the present disclosure, as defined by the scope of the claims, will occur to those skilled in the art.
No admission is made that any reference, including any non-patent or patent document cited in this specification, constitutes prior art. In particular, it will be understood that, unless otherwise stated, reference to any document herein does not constitute an admission that any of these documents forms part of the common general knowledge in the art in the United States or in any other country. Any discussion of the references states what their authors assert, and the applicant reserves the right to challenge the accuracy and pertinence of any of the documents cited herein. All references cited herein are fully incorporated by reference, unless explicitly indicated otherwise. The present disclosure shall control in the event there are any disparities between the present disclosure and any definitions and/or description found in the cited references.
This application claims priority to, and the benefit of, U.S. Provisional Application No. 63/415,395, filed on Oct. 12, 2022, and U.S. Provisional Application No. 63/531,948, filed on Aug. 10, 2023, the entire contents of both of which are incorporated by reference as if fully set forth herein.
This invention was made with government support under Grant Nos. DP1DA046584 and R01MH118366 awarded by the National Institutes of Health/National Institute on Drug Abuse and National Institutes of Health/National Institute of Mental Health, respectively. The government has certain rights in the invention.
Number | Date | Country
---|---|---
63/531,948 | Aug 2023 | US
63/415,395 | Oct 2022 | US