Acute myeloid leukemia (AML) is a common acute leukemia in adults and children. Targeted therapy has gradually become one of the major approaches to AML treatment. However, AML cell surface antigens are shared with normal myeloid progenitors; in other words, some surface markers found on AML cells are also found on normal cells. Targeting AML tumor cells based on surface markers therefore also results in toxicity to the myeloid system, limiting clinical use. New targeted-therapy approaches are required that target AML tumor cells while leaving the normal hematopoietic system unaffected.
Currently, there is a new paradigm for antigen-specific targeted therapeutics: regenerating a surface antigen-negative myeloid system that is resistant to targeted therapy by using genome-edited hematopoietic stem and progenitor cells (HSPCs).
CD33 (Siglec-3) is a member of the sialic acid-binding immunoglobulin-like lectin family. CD123 (IL-3Ra) is a receptor for interleukin-3. Both CD33 and CD123 are major AML cell-specific antigens and therapeutic targets for AML, but their expression on normal myeloid cells limits the therapeutic window. Besides CD33 and CD123, many other cell surface antigens are expressed on both normal stem cells and leukemic stem cells, including CD47/IAP (integrin associated protein), CD45 (common leukocyte antigen) and CLL-1 (C-type lectin-like molecule-1). Therefore, targeting AML based on these surface markers often carries a risk of myelosuppression, and the resulting life-threatening adverse reactions limit clinical application. Cas9 nucleases have been applied to disrupt the CD33 gene in primary cells. CD33-null human HSPCs remain functional and proliferative while being resistant to CD33-targeted AML therapy, e.g., antibody-drug conjugate (ADC) therapeutics. Gene editing tools are therefore likely to make these surface antigens promising AML therapy targets.
The combination of CRISPR-Cas9 and cytidine deaminases has led to cytosine base editors (CBEs) for programmable cytosine-to-thymine (C-to-T) substitution, which have been applied successfully to achieve efficient editing in various species and hold great potential for clinical applications. Because base editors avoid inducing DNA double-strand breaks (DSBs), unwanted nucleotide insertions/deletions (indels) and DNA damage responses (DDRs) can be largely avoided.
The safety and efficiency of gene editing tools are of great importance in clinical applications. Previous studies have reported that the DSBs induced by Cas9 nuclease can activate a p53-mediated DDR pathway and then lead to cell death. Moreover, APOBEC/AID family members can trigger C-to-T base substitutions in single-stranded DNA (ssDNA) regions, which are formed randomly during various cellular processes including DNA replication, repair and transcription. Thus, the specificity of previous base editing systems is compromised, limiting the applications of BEs for therapeutic purposes.
The instant disclosure, in some embodiments, describes gene editing technologies, including specifically designed and tested guide RNA sequences for improved base editors, useful for disrupting the expression of genes, such as CD33, CD123, CD47, CD45 and CLL1, in a cell. Such methods and edited cells are useful in reducing the toxicity associated with therapies targeting such cell surface antigens, such as those for treating acute myeloid leukemia.
One embodiment of the present disclosure, accordingly, provides a method for reducing the biological activity of a gene in a cell, comprising introducing into the cell a CRISPR-associated (Cas) protein, a nucleobase deaminase, a single-guide RNA (sgRNA), and a helper single-guide RNA (hsgRNA), wherein the Cas protein, the nucleobase deaminase, the sgRNA, and the hsgRNA are preferably introduced into the cell by one or more encoding polynucleotides.
The gene, in some embodiments, is a surface antigen expressed on a cancer cell but is also expressed in a non-cancerous cell, such as CD33, CD123, CD47, CD45, and CLL1. Example sgRNA and the hsgRNA are provided in Tables 1A-1M. In some embodiments, the hsgRNA comprises a corresponding 10-nt sequence listed therein. In some embodiments, the hsgRNA comprises a corresponding 20-nt sequence listed therein.
In some embodiments, the nucleobase deaminase is a cytidine deaminase, such as APOBEC3B (A3B), APOBEC3C (A3C), APOBEC3D (A3D), APOBEC3F (A3F), APOBEC3G (A3G), APOBEC3H (A3H), APOBEC1 (A1), APOBEC3 (A3), APOBEC2 (A2), APOBEC4 (A4) and AICDA (AID).
In some embodiments, the method further comprises introducing into the cell a nucleobase deaminase inhibitor fused to the nucleobase deaminase via a protease cleavage site. In some embodiments, the nucleobase deaminase inhibitor is an inhibitory domain of a nucleobase deaminase. In some embodiments, the nucleobase deaminase inhibitor is an inhibitory domain of a cytidine deaminase.
In some embodiments, the method further comprises introducing into the cell a protease that is capable of cleaving at the protease cleavage site. In some embodiments, the protease is selected from the group consisting of TuMV protease, PPV protease, PVY protease, ZIKV protease and WNV protease.
Example Cas proteins include SpCas9, FnCas9, St1Cas9, St3Cas9, NmCas9, SaCas9, AsCpf1, LbCpf1, FnCpf1, VQR SpCas9, EQR SpCas9, VRER SpCas9, SpCas9-NG, xSpCas9, RHA FnCas9, KKH SaCas9, NmeCas9, StCas9, CjCas9, AsCpf1, FnCpf1, SsCpf1, PcCpf1, BpCpf1, CmtCpf1, LiCpf1, PmCpf1, Pb3310Cpf1, Pb4417Cpf1, BsCpf1, EeCpf1, BhCas12b, AkCas12b, EbCas12b, LsCas12b, RfCas13d, LwaCas13a, PspCas13b, PguCas13b, and RanCas13b. In some embodiments, the Cas protein is catalytically impaired, such as nCas9 or dCpf1.
The cell being targeted here, in some embodiments, is a blood cell, such as a myeloid cell, in particular non-cancerous blood cells. In some embodiments, the cell is in vitro, ex vivo, or in vivo in a human patient. In some embodiments, the patient suffers from a cancer.
Also provided, in some embodiments, are one or more polynucleotides encoding a CRISPR-associated (Cas) protein, a nucleobase deaminase, a single-guide RNA (sgRNA), and a helper single-guide RNA (hsgRNA), wherein the sgRNA and the hsgRNA are selected from the sequences of Tables 1A-1M.
Also provided, in some embodiments, is a cell prepared by the method of the present disclosure, and methods of using the cell. One embodiment provides a method of reducing toxicity in a patient undergoing a therapy targeting a cell surface antigen on a cancer cell, comprising administering to the patient the cell. Another embodiment provides a method of reducing toxicity in a patient undergoing a therapy targeting a cell surface antigen on a cancer cell, comprising administering to the patient the polynucleotides.
Also provided are genomic sequences, mRNA sequences and protein sequences that can be prepared by the disclosed base editing technologies and guide RNA sequences.
It is to be noted that the term “a” or “an” entity refers to one or more of that entity; for example, “an antibody,” is understood to represent one or more antibodies. As such, the terms “a” (or “an”), “one or more,” and “at least one” can be used interchangeably herein.
As used herein, the term “polypeptide” is intended to encompass a singular “polypeptide” as well as plural “polypeptides,” and refers to a molecule composed of monomers (amino acids) linearly linked by amide bonds (also known as peptide bonds). The term “polypeptide” refers to any chain or chains of two or more amino acids, and does not refer to a specific length of the product. Thus, peptides, dipeptides, tripeptides, oligopeptides, “protein”, “amino acid chain” or any other term used to refer to a chain or chains of two or more amino acids, are included within the definition of “polypeptide,” and the term “polypeptide” may be used instead of, or interchangeably with any of these terms. The term “polypeptide” is also intended to refer to the products of post-expression modifications of the polypeptide, including without limitation glycosylation, acetylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, or modification by non-naturally occurring amino acids. A polypeptide may be derived from a natural biological source or produced by recombinant technology, but is not necessarily translated from a designated nucleic acid sequence. It may be generated in any manner, including by chemical synthesis.
“Homology” or “identity” or “similarity” refers to sequence similarity between two peptides or between two nucleic acid molecules. Homology can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences. An “unrelated” or “non-homologous” sequence shares less than 40% identity, though preferably less than 25% identity, with one of the sequences of the present disclosure.
When a polynucleotide or polynucleotide region (or a polypeptide or polypeptide region) is said to have a certain percentage (for example, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%) of “sequence identity” to another sequence, it means that, when aligned, that percentage of bases (or amino acids) are the same in comparing the two sequences. This alignment and the percent homology or sequence identity can be determined using software programs known in the art, for example those described in Ausubel et al. eds. (2007) Current Protocols in Molecular Biology. Preferably, default parameters are used for alignment. One alignment program is BLAST, using default parameters.
The term “an equivalent nucleic acid or polynucleotide” refers to a nucleic acid having a nucleotide sequence having a certain degree of homology, or sequence identity, with the nucleotide sequence of the nucleic acid or complement thereof. A homolog of a double stranded nucleic acid is intended to include nucleic acids having a nucleotide sequence which has a certain degree of homology with the nucleic acid or with the complement thereof. In one aspect, homologs of nucleic acids are capable of hybridizing to the nucleic acid or complement thereof. Likewise, “an equivalent polypeptide” refers to a polypeptide having a certain degree of homology, or sequence identity, with the amino acid sequence of a reference polypeptide. In some aspects, the sequence identity is at least about 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99%. In some aspects, the equivalent polypeptide or polynucleotide has one, two, three, four or five additions, deletions, substitutions, or combinations thereof as compared to the reference polypeptide or polynucleotide. In some aspects, the equivalent sequence retains the activity (e.g., epitope-binding) or structure (e.g., salt-bridge) of the reference sequence.
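By way of illustration only, the percentage of identical positions in an existing pairwise alignment may be computed as sketched below. The function name and the handling of gap columns are assumptions made for this example; alignment programs such as BLAST use their own conventions for the denominator, so the value returned here need not match the value reported by any particular program.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must already be aligned (equal length, gaps written as `gap`).
    Columns in which both sequences carry a gap are skipped; a column with a
    gap in only one sequence counts as a mismatch. This is one convention
    among several and is chosen here purely for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # uninformative column
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Toy sequences, not taken from the sequence listing.
    print(percent_identity("ACGT", "ACGA"))
    print(percent_identity("AC-T", "ACGT"))
```

The example leaves the alignment step itself to an external tool, since the choice of scoring parameters affects which positions are compared.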
The term “encode” as it is applied to polynucleotides refers to a polynucleotide which is said to “encode” a polypeptide if, in its native state or when manipulated by methods well known to those skilled in the art, it can be transcribed and/or translated to produce the mRNA for the polypeptide and/or a fragment thereof. The antisense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.
As provided, the expression of surface antigens on a normal cell can cause serious toxicities in cancer patients that are treated with therapies targeting such antigens. Generating non-cancerous cells not targeted by such therapies can help reduce the toxicities. The instant inventors have developed a new base editing system, transformer base editor (tBE), which can specifically edit cytosine in target regions with no observable off-target mutations. The tBE technology can be suitably employed to generate surface antigen-negative cells.
The tBE system is composed of a cytidine deaminase inhibitor (dCDI) and a split-TEV system. At off-target sites, tBE remains inactive because the dCDI domain stays attached through a cleavable fusion, thus eliminating unintended mutations. Only when bound at on-target sites is tBE transformed: the dCDI domain is cleaved off and the deaminase catalyzes targeted deamination for precise editing. More specifically, tBE uses a sgRNA (normally 20 nt) to bind at the target genomic site and a helper sgRNA (hsgRNA, normally 10 or 20 nt) to bind at a nearby region upstream of the target genomic site. The binding of the two sgRNAs guides the components of the tBE system to assemble correctly at the target genomic site for base editing.
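The gating behavior described above can be pictured, in a deliberately simplified form, as a logical AND of the two guide-binding events. The following sketch is a toy model only: the class and function names are hypothetical, and the model does not represent which tBE component is recruited by which guide, kinetics, or partial binding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteBinding:
    """Binding state at one genomic site (hypothetical toy model)."""
    sgrna_bound: bool   # main sgRNA paired with the target site
    hsgrna_bound: bool  # helper sgRNA paired with the nearby upstream site


def inhibitor_cleaved(site: SiteBinding) -> bool:
    # Assumption: the split protease is reconstituted only when both
    # guides are bound at the same locus, so cleavage of the inhibitory
    # domain requires both binding events.
    return site.sgrna_bound and site.hsgrna_bound


def deaminase_active(site: SiteBinding) -> bool:
    # The deaminase stays inhibited until the inhibitory domain is removed.
    return inhibitor_cleaved(site)


if __name__ == "__main__":
    for sg in (False, True):
        for hsg in (False, True):
            state = SiteBinding(sg, hsg)
            print(state, "->", deaminase_active(state))
```

In this picture, a site bound by only one of the two guides, as may occur at an off-target location, leaves the deaminase inhibited.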
As demonstrated in the accompanying examples, the tBE technology can be used to perform highly specific and efficient base editing in living organisms and enables potential clinical applications, e.g., inducing a premature stop codon to repress CD33 or CD123 protein expression or breaking the GU-AG rule to disrupt splicing sites. Generation of CD33 or CD123-negative cells can widen the targeted therapeutic index of a variety of treatment modalities in AML, including monoclonal antibodies, antibody-drug conjugates, and bi-specific T cell engagers.
In accordance with one embodiment of the present disclosure, therefore, provided is a base editing system, or one or more polynucleotides encoding the base editing system, useful for reducing the biological activity of a cell surface antigen in a cell.
In some embodiments, the base editing system includes a CRISPR-associated (Cas) protein, a nucleobase deaminase, a single-guide RNA (sgRNA)/helper single-guide RNA (hsgRNA) pair targeting the cell surface antigen to introduce a premature stop codon or disrupt a splicing site.
“Guide RNAs” are non-coding short RNA sequences which bind to the complementary target DNA sequences. A guide RNA first binds to the Cas enzyme, and the gRNA sequence guides the complex via pairing to a specific location on the DNA, where Cas performs its endonuclease activity by cutting the target DNA strand. A “single guide RNA,” frequently simply referred to as “guide RNA”, refers to a synthetic or expressed single guide RNA (sgRNA) that consists of both the crRNA and tracrRNA as a single construct. The tracrRNA portion is responsible for Cas endonuclease activity and the crRNA portion binds to the target-specific DNA region. Therefore, the trans-activating RNA (tracrRNA, or scaffold region) and the crRNA are the two key components, and they are joined by a tetraloop to form the sgRNA. The guide RNA targets the complementary sequences by simple Watson-Crick base pairing. The tracrRNA forms base-paired stem-loop structures and attaches to the endonuclease enzyme. The crRNA includes a spacer, complementary to the target sequence, flanked by a region derived from the repeat sequences.
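As a non-limiting illustration of Watson-Crick pairing between a spacer and its target, the sketch below locates exact matches of a spacer on either strand of a DNA sequence. The function names are hypothetical, the spacer is written in its DNA form (T in place of U), and the example deliberately ignores PAM requirements and mismatch tolerance, both of which depend on the Cas protein used.

```python
# Translation table for Watson-Crick complements of DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_spacer_sites(genome: str, spacer: str) -> list[tuple[int, str]]:
    """Return (0-based start on the given strand, strand) for exact matches.

    '+' means the given strand reads the same as the spacer, so the spacer
    pairs with the opposite strand; '-' is the converse case.
    """
    genome = genome.upper()
    spacer = spacer.upper().replace("U", "T")
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        start = genome.find(query)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(query, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequences, not taken from the sequence listing.
    print(find_spacer_sites("TTACGGATCCAA", "ACGGA"))
    print(find_spacer_sites("TTACGGATCCAA", "TTGGAT"))
```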
In one embodiment, the cell surface antigen is CD33. Example sgRNA/hsgRNA pairs are shown in Tables 1A-1C. In Table 1A, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make C-to-T editing to create a stop codon at a CAG or CAA codon in the CD33 gene. In Table 1B, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make G-to-A editing to create a stop codon at a TGG codon in the CD33 gene. In Table 1C, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make G-to-A editing to disrupt a GU-AG splicing site in the CD33 gene. For each hsgRNA, the tables provide a 20-nt sequence and a shorter 10-nt version within the 20-nt sequence, either of which is sufficient for the editing.
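The codon changes referred to above can be checked with a short sketch such as the following, which scans an in-frame coding sequence for codons that become stop codons through C-to-T edits (CAG, CAA, CGA) or through G-to-A edits as read on the coding strand (TGG). The function names are hypothetical and the example sequence is a toy sequence, not a CD33 sequence; the sketch does not consider the editing window, the PAM, or which strand the guide targets.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def edited_outcomes(codon: str) -> set[str]:
    """Stop codons reachable from `codon` by C-to-T edits or by G-to-A edits.

    The two edit types are evaluated separately, since on the coding strand
    they correspond to deamination on opposite strands.
    """
    codon = codon.upper()
    outcomes = set()
    for src, dst in (("C", "T"), ("G", "A")):
        positions = [i for i, base in enumerate(codon) if base == src]
        # Try every non-empty subset of the editable positions.
        for mask in range(1, 2 ** len(positions)):
            bases = list(codon)
            for bit, pos in enumerate(positions):
                if (mask >> bit) & 1:
                    bases[pos] = dst
            edited = "".join(bases)
            if edited in STOP_CODONS:
                outcomes.add(edited)
    return outcomes


def stop_codon_candidates(cds: str) -> list[tuple[int, str, list[str]]]:
    """List (codon index, codon, reachable stop codons) for an in-frame CDS."""
    cds = cds.upper()
    hits = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon in STOP_CODONS:
            continue  # already a stop codon
        outcomes = edited_outcomes(codon)
        if outcomes:
            hits.append((i // 3, codon, sorted(outcomes)))
    return hits


if __name__ == "__main__":
    print(stop_codon_candidates("ATGCAGTGGCGATAA"))
```

For TGG, editing either G or both Gs yields a stop codon (TAG, TGA, or TAA), whereas for CAG, CAA, and CGA only the first-position C is relevant.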
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:1, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:51 to 62. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:2, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:63-74. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:3, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:75-86. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:4, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:87-98. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:5, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:99-108. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:6, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:109-114.
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:7, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:115-124. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:8, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:125-130. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:9, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:131-136. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:10, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:137-154. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:11, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:155-168.
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:12, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:169-178. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:13, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:179-194. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:14, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:195-198. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:15, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:199-208. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:16, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:209-210. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:17, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:211-216. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:18, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:217-224.
In Table 1D, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make C-to-T editing to create a stop codon at a CGA, CAG or CAA codon in the CD123 gene. In Table 1E, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make G-to-A editing to create a stop codon at a TGG codon in the CD123 gene. In Table 1F, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make G-to-A editing to disrupt a GU-AG splicing site in the CD123 gene. For each hsgRNA, the tables provide a 20-nt sequence and a shorter 10-nt version within the 20-nt sequence, either of which is sufficient for the editing.
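The effect of a G-to-A edit on a GU-AG splicing site can be sketched as follows. The intron is written as sense-strand DNA, so the canonical boundaries read GT at the 5' (donor) end and AG at the 3' (acceptor) end. The function names and the toy intron are assumptions for illustration, and the sketch makes no prediction about cryptic splice site usage after the edit.

```python
def is_canonical_intron(intron: str) -> bool:
    """True if the intron starts with GT (GU in RNA) and ends with AG."""
    intron = intron.upper().replace("U", "T")
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def disrupt_splice_site(intron: str, site: str) -> str:
    """Apply a G-to-A edit to the donor or the acceptor G of a canonical intron.

    `site` is 'donor' (GT becomes AT) or 'acceptor' (AG becomes AA).
    """
    intron = intron.upper().replace("U", "T")
    if not is_canonical_intron(intron):
        raise ValueError("intron does not follow the GU-AG rule")
    if site == "donor":
        return "A" + intron[1:]
    if site == "acceptor":
        return intron[:-1] + "A"
    raise ValueError("site must be 'donor' or 'acceptor'")


if __name__ == "__main__":
    toy_intron = "GTAAGTCCTTTCAG"  # toy sequence, not from the sequence listing
    for site in ("donor", "acceptor"):
        edited = disrupt_splice_site(toy_intron, site)
        print(site, edited, is_canonical_intron(edited))
```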
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:19, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:225-230. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:20, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:231-236. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:21, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:237-240. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:22, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:241-244. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:23, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:245-254. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:24, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:255-260.
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:25, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:261-270. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:26, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:271-274.
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:27, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:275-278. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:28, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:279-284. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:29, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:285-286. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:30, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:287-290. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:31, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:291-300.
In Table 1G, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make G-to-A editing to create a stop codon at a TGG codon in the CD47 gene. In Table 1H, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make G-to-A editing to disrupt a GU-AG splicing site in the CD47 gene. For each hsgRNA, the tables provide a 20-nt sequence and a shorter 10-nt version within the 20-nt sequence, either of which is sufficient for the editing.
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:32, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:301-306.
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:33, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:307-310. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:34, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:311-318.
In Table 1I, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make C-to-T editing to create a stop codon at a CGA, CAG or CAA codon in the CD45 gene. In Table 1J, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make G-to-A editing to create a stop codon at a TGG codon in the CD45 gene. In Table 1K, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make G-to-A editing to disrupt a GU-AG splicing site in the CD45 gene. For each hsgRNA, the tables provide a 20-nt sequence and a shorter 10-nt version within the 20-nt sequence, either of which is sufficient for the editing.
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:35, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:319-322. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:36, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:323-324. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:37, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:325-326. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:38, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:327-328. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:39, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:329-332. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:40, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:333-338. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:41, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:339-340.
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:42, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:341-342. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:43, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:343-344.
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:44, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:345-354. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:45, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:355-372. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:46, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:373-382. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:47, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:383-384.
In Table 1L, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make C-to-T editing to create a stop codon at a CAG codon in the CLL1 gene. In Table 1M, each sgRNA can be paired with any one of the corresponding hsgRNA, and they can be used to make G-to-A editing to create a stop codon at a TGG codon in the CLL1 gene. For each hsgRNA, the tables provide a 20-nt sequence and a shorter 10-nt version within the 20-nt sequence, either of which is sufficient for the editing.
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:48, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:385-392.
In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:49, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:393-398. In some embodiments, the sgRNA includes the nucleic acid sequence of SEQ ID NO:50, and the hsgRNA includes any of the nucleic acid sequences of SEQ ID NO:399-404.
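For convenience, the sgRNA/hsgRNA pairings recited above may be held in a simple lookup structure. The sketch below is illustrative only: the variable and function names are hypothetical, the numbers are transcribed from the SEQ ID NO ranges recited in the preceding paragraphs, and the grouping comments assume that the paragraphs follow the order of the antigens as described (CD33, CD123, CD47, CD45, CLL1). The recited hsgRNA ranges run contiguously from SEQ ID NO:51 to SEQ ID NO:404, so only the last identifier of each range is stored.

```python
# Last hsgRNA SEQ ID NO recited for sgRNA SEQ ID NO:1 through NO:50, in order.
HSG_RANGE_ENDS = [
    62, 74, 86, 98, 108, 114, 124, 130, 136, 154,
    168, 178, 194, 198, 208, 210, 216, 224,                             # sgRNA 1-18 (CD33)
    230, 236, 240, 244, 254, 260, 270, 274, 278, 284, 286, 290, 300,    # sgRNA 19-31 (CD123)
    306, 310, 318,                                                      # sgRNA 32-34 (CD47)
    322, 324, 326, 328, 332, 338, 340, 342, 344, 354, 372, 382, 384,    # sgRNA 35-47 (CD45)
    392, 398, 404,                                                      # sgRNA 48-50 (CLL1)
]
FIRST_HSG_ID = 51  # first hsgRNA SEQ ID NO recited for sgRNA SEQ ID NO:1


def build_pairings(ends: list[int], first: int = FIRST_HSG_ID) -> dict[int, range]:
    """Map each sgRNA SEQ ID NO to its contiguous range of hsgRNA SEQ ID NOs."""
    pairings = {}
    start = first
    for sg_id, end in enumerate(ends, start=1):
        pairings[sg_id] = range(start, end + 1)
        start = end + 1
    return pairings


PAIRINGS = build_pairings(HSG_RANGE_ENDS)


def is_listed_pair(sg_id: int, hsg_id: int) -> bool:
    """True if the hsgRNA SEQ ID NO is recited for the given sgRNA SEQ ID NO."""
    return hsg_id in PAIRINGS.get(sg_id, range(0))


if __name__ == "__main__":
    print(list(PAIRINGS[16]))
    print(is_listed_pair(16, 210), is_listed_pair(16, 211))
```

Such a structure only records which identifiers are recited together; the sequences themselves are those provided in the tables.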
The term “nucleobase deaminase” as used herein, refers to a group of enzymes that catalyze the hydrolytic deamination of nucleobases such as cytidine, deoxycytidine, adenosine and deoxyadenosine. Non-limiting examples of nucleobase deaminases include cytidine deaminases and adenosine deaminases.
“Cytidine deaminase” refers to enzymes that catalyze the irreversible hydrolytic deamination of cytidine and deoxycytidine to uridine and deoxyuridine, respectively. Cytidine deaminases maintain the cellular pyrimidine pool. A family of cytidine deaminases is APOBEC (“apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like”). Members of this family are C-to-U editing enzymes. Some APOBEC family members have two domains: one is the catalytic domain, while the other is a pseudocatalytic domain. More specifically, the catalytic domain is a zinc-dependent cytidine deaminase domain and is important for cytidine deamination.
Non-limiting examples of APOBEC proteins include APOBEC1, APOBEC2, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3F, APOBEC3G, APOBEC3H, APOBEC4, and activation-induced (cytidine) deaminase (AID).
Various mutants of the APOBEC proteins are also known that bring about different editing characteristics for base editors. For instance, for human APOBEC3A, certain mutants (e.g., W98Y, Y130F, Y132D, W104A, D131Y and P134Y) even outperform the wildtype human APOBEC3A in terms of editing efficiency or editing window. Accordingly, the term APOBEC and each of its family members also encompass variants and mutants that have a certain level (e.g., 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99%) of sequence identity to the corresponding wildtype APOBEC protein or the catalytic domain and retain the cytidine deaminating activity. The variants and mutants can be derived with amino acid additions, deletions and/or substitutions. Such substitutions, in some embodiments, are conservative substitutions.
“Adenosine deaminase”, also known as adenosine aminohydrolase, or ADA, is an enzyme (EC 3.5.4.4) involved in purine metabolism. It is needed for the breakdown of adenosine from food and for the turnover of nucleic acids in tissues.
Non-limiting examples of adenosine deaminases include tRNA-specific adenosine deaminase (TadA), adenosine deaminase tRNA specific 1 (ADAT1), adenosine deaminase tRNA specific 2 (ADAT2), adenosine deaminase tRNA specific 3 (ADAT3), adenosine deaminase RNA specific B1 (ADARB1), adenosine deaminase RNA specific B2 (ADARB2), adenosine monophosphate deaminase 1 (AMPD1), adenosine monophosphate deaminase 2 (AMPD2), adenosine monophosphate deaminase 3 (AMPD3), adenosine deaminase (ADA), adenosine deaminase 2 (ADA2), adenosine deaminase like (ADAL), adenosine deaminase domain containing 1 (ADAD1), adenosine deaminase domain containing 2 (ADAD2), and adenosine deaminase RNA specific (ADAR).
Some of the nucleobase deaminases have a single, catalytic domain, while others also have other domains, such as an inhibitory domain as currently discovered by the instant inventors.
In some embodiments, therefore, the first fragment only includes the catalytic domain, such as mA3-CDA1, hA3F-CDA2 and hA3B-CDA2. In some embodiments, the first fragment includes at least a catalytic core of the catalytic domain.
The term “Cas protein” or “clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) protein” refers to RNA-guided DNA endonuclease enzymes associated with the CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) adaptive immunity system in Streptococcus pyogenes, as well as other bacteria. Cas proteins include Cas9 proteins, Cas12a (Cpf1) proteins, Cas12b (formerly known as C2c1) proteins, Cas13 proteins and various engineered counterparts. Example Cas proteins include SpCas9, FnCas9, St1Cas9, St3Cas9, NmCas9, SaCas9, AsCpf1, LbCpf1, FnCpf1, VQR SpCas9, EQR SpCas9, VRER SpCas9, SpCas9-NG, xSpCas9, RHA FnCas9, KKH SaCas9, NmeCas9, StCas9, CjCas9, AsCpf1, FnCpf1, SsCpf1, PcCpf1, BpCpf1, CmtCpf1, LiCpf1, PmCpf1, Pb3310Cpf1, Pb4417Cpf1, BsCpf1, EeCpf1, BhCas12b, AkCas12b, EbCas12b, LsCas12b, RfCas13d, LwaCas13a, PspCas13b, PguCas13b, and RanCas13b.
In some embodiments, the base editing system further includes a nucleobase deaminase inhibitor fused to the nucleobase deaminase. A “nucleobase deaminase inhibitor,” accordingly, refers to a protein or a protein domain that inhibits the deaminase activity of a nucleobase deaminase. In some embodiments, the second fragment includes at least an inhibitory core of the inhibitory protein/domain.
Three example nucleobase deaminase inhibitors are mA3-CDA2, hA3F-CDA1 and hA3B-CDA1, which are the inhibitory domains of the corresponding nucleobase deaminases. Additional nucleobase deaminase inhibitors have been identified in the protein databases as homologues of mA3-CDA2, hA3F-CDA1 and hA3B-CDA1 (see, e.g., WO2020156575A1). Their biological equivalents (e.g., having at least about 80%, 85%, 90%, 95%, 97%, 98%, 99%, 99.5% sequence identity, or having one, two, or three amino acid additions/deletions/substitutions, and having nucleobase deaminase inhibitor activity) can also be prepared with methods known in the art, such as conservative amino acid substitutions.
When the nucleobase deaminase inhibitor is included, it is fused to the nucleobase deaminase but is separated by a protease cleavage site. In some embodiments, the base editing system further includes the protease that is capable of cleaving the protease cleavage site.
The protease cleavage site can be any known protease cleavage site (peptide) for any proteases. Non-limiting examples of proteases include TEV protease, TuMV protease, PPV protease, PVY protease, ZIKV protease and WNV protease.
In some embodiments, the protease cleavage site is a self-cleaving peptide, such as a 2A peptide. “2A peptides” are 18-22 amino-acid-long viral oligopeptides that mediate “cleavage” of polypeptides during translation in eukaryotic cells. The designation “2A” refers to a specific region of the viral genome, and different viral 2As have generally been named after the virus they were derived from. The first discovered 2A was F2A (foot-and-mouth disease virus), after which E2A (equine rhinitis A virus), P2A (porcine teschovirus-1 2A), and T2A (Thosea asigna virus 2A) were also identified.
In some embodiments, the protease cleavage site is a cleavage site for the TEV protease. In some embodiments, the TEV protease provided in the base editing system includes two separate fragments, each of which on its own is not active. However, in the presence of the remaining fragment of the TEV protease, the fragments together are able to execute the cleavage. Such an arrangement provides additional control and flexibility of the base editing capabilities. The TEV fragments may be the TEV N-terminal domain or the TEV C-terminal domain.
Such fusion proteins may include other fragments, such as a uracil DNA glycosylase inhibitor (UGI) and nuclear localization sequences (NLS). A “nuclear localization signal or sequence” (NLS) is an amino acid sequence that tags a protein for import into the cell nucleus by nuclear transport. Typically, this signal consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface. Different nuclear localized proteins may share the same NLS. An NLS has the opposite function of a nuclear export signal (NES), which targets proteins out of the nucleus. A non-limiting example of an NLS is the internal SV40 nuclear localization sequence (iNLS).
The “Uracil Glycosylase Inhibitor” (UGI), which can be prepared from Bacillus subtilis bacteriophage PBS1, is a small protein (9.5 kDa) which inhibits E. coli uracil-DNA glycosylase (UDG) as well as UDG from other species. Inhibition of UDG occurs by reversible protein binding with a 1:1 UDG:UGI stoichiometry. UGI is capable of dissociating UDG-DNA complexes.
In some embodiments, a peptide linker is optionally provided between each of the fragments in the fusion protein. In some embodiments, the peptide linker has from 1 to 100 amino acid residues (or 3-20, 4-15, without limitation). In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% of the amino acid residues of the peptide linker are amino acid residues selected from the group consisting of alanine, glycine, cysteine, and serine.
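As a rough illustration of the linker criteria above, the following Python sketch checks a candidate linker's length and the fraction of its residues drawn from alanine, glycine, cysteine and serine. The function names, the default thresholds and the example linker sequences are assumptions chosen for illustration and are not taken from the disclosure.

```python
# One-letter codes for alanine, glycine, cysteine and serine.
LISTED_RESIDUES = frozenset("AGCS")


def listed_fraction(linker: str) -> float:
    """Fraction of linker residues that are A, G, C or S."""
    linker = linker.upper()
    if not linker:
        raise ValueError("linker must contain at least one residue")
    return sum(residue in LISTED_RESIDUES for residue in linker) / len(linker)


def linker_meets_criteria(linker: str, min_len: int = 1, max_len: int = 100,
                          min_fraction: float = 0.5) -> bool:
    """Check length bounds and the minimum A/G/C/S fraction.

    The defaults (1-100 residues, at least 50%) pick one combination of
    the ranges recited in the text; other combinations are equally valid.
    """
    return (min_len <= len(linker) <= max_len
            and listed_fraction(linker) >= min_fraction)


# Hypothetical example linkers.
for candidate in ("GGGGS", "GSAEAAAK", "EEKKEEKK"):
    print(candidate, listed_fraction(candidate), linker_meets_criteria(candidate))
```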
The disclosed base editing system can be used to engineer a target cell. The editing approach can disrupt the expression of a normal cell surface antigen in the target cell, which can be carried out in vitro, ex vivo, or in vivo. The engineered target cell would be resistant to therapies designed to destroy cells, such as tumor cells, that express such surface antigens. For instance, if a patient receiving an anti-CD33 immunotherapy suffers from dysfunction of CD33-expressing myeloid cells, the present editing technology can lead to production of myeloid cells not targeted by the anti-CD33 therapy and thus restore the function of regular myeloid cells.
In some embodiments, each component of the base editing system can be introduced to the target cell individually, or in combination. For instance, a fusion protein may be packaged into a nanoparticle such as a liposome. In another example, a guide RNA and a protein may be combined into a complex for introduction.
In some embodiments, some or all of the components of the base editing system can be introduced as one or more polynucleotides encoding them. These polynucleotides may be constructed as plasmids or viral vectors, without limitation.
In an example ex vivo approach, CD34+ hematopoietic stem and progenitor cells (HSPCs) can be collected from a patient. The HSPCs can then be edited with the disclosed gene editing technology, along with the designed sgRNA/hsgRNA, to produce edited cells. DNA sequencing can be used to evaluate the percentage of allelic editing at the on-target site. The edited cells can be infused back into the patient, which can help reduce toxicities mediated by surface antigen-targeted therapy. Prior to infusion of the edited cells, the patient can be given a pharmacokinetically adjusted busulfan myeloablation. The edited cells can be administered through intravenous infusion.
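The evaluation of allelic editing by DNA sequencing can be sketched as a simple count over sequencing reads. The Python example below assumes amplicon reads that have already been aligned so that index 0 is the same genomic position in every read, and it reports the percentage of reads carrying the edited base at one position. The function name, the read format and the toy reads are hypothetical; a real analysis would typically also handle indels, base quality and sequencing error.

```python
from collections import Counter


def editing_percentage(reads, position, edited_base="T"):
    """Percentage of reads showing `edited_base` at `position`.

    Assumptions: reads are pre-aligned strings sharing the same start
    coordinate; reads too short to cover `position` are skipped.
    """
    counts = Counter(
        read[position].upper() for read in reads if position < len(read)
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no read covers the requested position")
    return 100.0 * counts[edited_base.upper()] / total


# Toy reads: the reference has C at index 1; two of four reads show T.
toy_reads = ["ACGT", "ATGT", "ATGT", "ACGT"]
print(editing_percentage(toy_reads, position=1))
```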
Cells, genomic sequences, mRNA sequences, and proteins that can be prepared by the instant base editing technologies and designed sgRNA/hsgRNA sequences are also provided, in some embodiments.
In some embodiments, the genomic sequence originally encodes the human CD33 protein, but has been edited by the instant base editing system such that the normal expression of the CD33 protein is disrupted. The disrupted expression, in some embodiments, is due to introduction of a premature stop codon, a frame shift mutation or an altered splicing site. In some embodiments, a mutated mRNA encoded by the edited genomic sequence is provided. In some embodiments, a mutated CD33 protein encoded by the edited genomic sequence is provided. In some embodiments, a cell that contains the genomic sequence, the mRNA or the protein is provided.
Likewise, in some embodiments, a genomic sequence that encodes a disrupted CD123, CD47, CD45 or CLL1 protein is also provided. In some embodiments, a mutated mRNA encoded by the edited genomic sequence is provided. In some embodiments, a mutated CD123, CD47, CD45 or CLL1 protein encoded by the edited genomic sequence is provided. In some embodiments, a cell that contains the genomic sequence, the mRNA or the protein is provided.
This example employed a transformer Base Editor (tBE) to disrupt certain genes which can be useful for treating acute myeloid leukemia (AML).
The transformer Base Editor (tBE) is a new base editor that specifically edits cytosine in a target region with no observable off-target mutations. In the tBE system, a cytidine deaminase is fused with a nucleobase deaminase inhibitor to inhibit the activity of the nucleobase deaminase until the tBE complex is assembled at the target genomic site. In some instances, the tBE employs an sgRNA to bind at the target genomic site and a helper sgRNA (hsgRNA) to bind at a nearby region upstream of the target genomic site. The binding of the two sgRNAs can guide the components of tBE to correctly assemble at the target genomic site for efficient base editing. Upon such assembly, a protease in the tBE system is activated and cleaves the nucleobase deaminase inhibitor off from the nucleobase deaminase, which thereby becomes activated.
To apply the tBE system to generate stop codons or disrupt splice sites in the CD33 gene, this example designed 87 pairs of sgRNA/hsgRNAs that target the CD33 gene (Table 1A-1M).
First, this example used tBE to induce C-to-T base editing in the codons of CAG (Gln) and CAA (Gln) in the CD33 gene to create TAG and TAA stop codons (Table 1,
Next, this example used tBE to induce G-to-A (C-to-T on the opposite strand) base editing in the codon of TGG (Trp) in CD33 to create the TGA, TAG or TAA stop codon (Table 1B,
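The two stop-codon strategies described above (C-to-T in CAG/CAA, and G-to-A in TGG via C-to-T on the opposite strand) can be illustrated with a short Python sketch that scans an in-frame coding sequence for convertible codons. This is a simplified sketch under stated assumptions: it ignores the editing window, PAM requirements and the sgRNA/hsgRNA designs of the tables, and the function name and toy sequence are hypothetical.

```python
# Codons named in the text and the stop codons they can be converted to.
STOP_CONVERSIONS = {
    # C-to-T on the coding strand
    "CAG": ("C-to-T", ["TAG"]),
    "CAA": ("C-to-T", ["TAA"]),
    # G-to-A on the coding strand, i.e. C-to-T on the opposite strand;
    # editing the third G, the second G, or both gives TGA, TAG or TAA.
    "TGG": ("G-to-A", ["TGA", "TAG", "TAA"]),
}


def find_stop_candidates(cds: str):
    """Return (codon_index, codon, edit_type, possible_stops) tuples.

    Assumption: `cds` is an in-frame coding sequence on the coding
    strand, so its length must be a multiple of three.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    hits = []
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        if codon in STOP_CONVERSIONS:
            edit_type, stops = STOP_CONVERSIONS[codon]
            hits.append((start // 3, codon, edit_type, stops))
    return hits


# Toy sequence (not CD33): Met-Gln-Trp-Lys-Gln.
for hit in find_stop_candidates("ATGCAGTGGAAACAA"):
    print(hit)
```

In practice, each candidate codon would additionally need a suitable sgRNA/hsgRNA pair that places the target base within the editable region.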
Then, this example used tBE to induce G-to-A base editing at the 5′ GU or 3′ AG splice site to disrupt the canonical GU-AG splicing rule (Table 1C,
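The splice-site strategy can be sketched in the same way. Given a genomic sequence on the coding strand and intron coordinates, the Python example below locates the G of the 5′ GT (GU in RNA) donor and the G of the 3′ AG acceptor, which are the positions where a G-to-A change would break the canonical GU-AG rule. The coordinate convention (0-based, half-open), the function name and the toy sequence are assumptions made for illustration only.

```python
def splice_site_targets(genomic: str, introns):
    """Return (site_type, genomic_index) for each canonical splice-site G.

    Assumptions: `genomic` is the coding-strand DNA sequence and each
    intron is a (start, end) pair with 0-based, half-open coordinates.
    Non-canonical introns (not GT...AG) are simply not reported.
    """
    genomic = genomic.upper()
    targets = []
    for start, end in introns:
        intron = genomic[start:end]
        if intron[:2] == "GT":
            targets.append(("donor", start))        # G of the 5' GT
        if intron[-2:] == "AG":
            targets.append(("acceptor", end - 1))   # G of the 3' AG
    return targets


# Toy gene (not CD33): exon "ATGCAG", intron "GTAAGTTTTCAG", exon "GGATAA".
toy_genomic = "ATGCAG" + "GTAAGTTTTCAG" + "GGATAA"
print(splice_site_targets(toy_genomic, [(6, 18)]))
```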
Similar experiments were conducted for the CD123, CD47, CD45 and CLL1 genes. The sgRNA/hsgRNA sequences are provided in Tables 1D-1M. As shown in
The base editors and base editing method, along with the designed sgRNA/hsgRNA sequences, therefore, can perform high-specificity and high-efficiency base editing in the genome of various eukaryotes. Furthermore, the tBE system, which contains Cas9 nickase (D10A), is less toxic to cells than Cas9 nuclease as Cas9 nickase activates a lower level of p53-mediated DDR.
The present disclosure is not to be limited in scope by the specific embodiments described which are intended as single illustrations of individual aspects of the disclosure, and any compositions or methods which are functionally equivalent are within the scope of this disclosure. It will be apparent to those skilled in the art that various modifications and variations can be made in the methods and compositions of the present disclosure without departing from the spirit or scope of the disclosure. Thus, it is intended that the present disclosure cover the modifications and variations of this disclosure provided they come within the scope of the appended claims and their equivalents.
All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Number | Date | Country | Kind |
---|---|---|---|
PCT/CN2021/131565 | Nov 2021 | WO | international |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2022/132953 | 11/18/2022 | WO | |