The instant application contains a Sequence Listing which has been submitted in ASCII format via EFS-Web and is hereby incorporated by reference in its entirety. Said ASCII copy, created on ______, is named IDT01-021-US_ST25.xml, and is ______ bytes in size.
This invention pertains to ubiquitin polypeptide variants with increased affinity for 53BP1 and improved efficacy for enhancing homology directed repair rates.
Double-strand breaks (DSBs) of DNA are predominantly repaired through two mechanisms: non-homologous end joining (NHEJ), in which broken ends are rejoined, often imprecisely, and homology directed repair (HDR), which typically involves a sister chromatid or homologous chromosome being used as a repair template. HDR is facilitated by the presence of a sister chromatid, and there are cellular mechanisms in place biasing repair towards NHEJ during the G1 phase of the cell cycle [1]. A key determinant of repair pathway choice is 53BP1. 53BP1 was first described as a binding partner of the tumor suppressor protein p53 and was later shown to be a key protein in NHEJ [2]. 53BP1 rapidly accumulates at sites of double-strand breaks. In G1, 53BP1 recruits RIF1 and inhibits end resection [3, 4]. End resection is a critical step in repair pathway choice, as it is necessary for HDR and inhibits NHEJ [1]. By inhibiting end resection, 53BP1 biases repair towards NHEJ, and consequently loss of 53BP1 results in increased HDR [5]. Targeted nucleases can be introduced into cells in conjunction with a DNA repair template with homology to a targeted cut site to facilitate precise genome editing via HDR [6]. A strong inhibitor of 53BP1 is therefore useful for precise genome editing.
The recruitment of 53BP1 to DSB sites is dependent upon both H4K20 methylation and H2AK15 ubiquitination. 53BP1 has tandem Tudor domains that have been shown to specifically bind mono- and dimethylated H4K20, and H4K20 methylation was shown to be important for 53BP1 recruitment to double-strand breaks [7, 8]. Introducing D1521R, a mutation that disrupts the activity of the Tudor domain, impairs the ability of 53BP1 to form ionizing radiation-induced foci [9]. The minimal focus-forming region of 53BP1 consists of the Tudor domain flanked by an N-terminal oligomerization region and a C-terminal extension. Notably, 53BP1 accumulation at DSBs requires the E3 ubiquitin ligase RNF168, which mediates H2AK13 and H2AK15 ubiquitination [10]. The C-terminal extension was shown to contain a ubiquitination-dependent recruitment motif (UDR) that binds specifically to H2AK15ub and is required for 53BP1 recruitment to DSB sites [9].
Thus, the ubiquitin polypeptide (SEQ ID NO:1) and its interaction with 53BP1 influence the repair pathway choice at DSB sites.
Due to the affinity of 53BP1 for ubiquitinated H2A, a screen of ubiquitin polypeptide variants for interaction with 53BP1 was conducted recently by Canny et al., in which they discovered and modified a ubiquitin polypeptide variant with selective binding to 53BP1 that they named i53 (inhibitor of 53BP1; SEQ ID NO: 2) [11]. The top five hits from the ubiquitin polypeptide variant screen were A10, A11, C08, G08, and H04, with G08 having the highest affinity. In contrast to what might be expected, the interaction of 53BP1 with G08 did not require the UDR, and the interaction was shown to be between G08 and the 53BP1 Tudor domain. To generate i53, G08 was modified by introducing an I44A mutation that disrupts a solvent-exposed hydrophobic patch on ubiquitin with which most ubiquitin-binding proteins interact [9, 12]. Notably, this mutation in the context of H2AKc15ub(I44A) interferes with 53BP1 interaction with ubiquitinated H2A, yet does not interfere with the ability of i53 to enhance HDR, consistent with i53 enhancing HDR through interaction with the 53BP1 Tudor domain and not the UDR domain [9, 11]. Additionally, i53 was modified relative to G08 through the removal of the C-terminal di-glycine motif. Introduction of i53, but not a 53BP1 binding-deficient i53 variant DM (i53 P69L+L70V), into cells inhibited the formation of ionizing radiation-induced 53BP1 foci. Introduction of i53 via plasmid delivery, adeno-associated virus-mediated gene delivery, or delivery of mRNA were all shown to improve the rates of HDR. Rates of HDR were improved with the introduction of i53 using both double-stranded DNA donors and single-stranded DNA donors, which have been shown to use different HDR mechanisms [11, 13, 14].
The present disclosure pertains to ubiquitin polypeptide variants (Ubvs) with increased affinity for 53BP1 and improved efficacy for enhancing HDR rates, and in particular, candidate amino acid changes in i53 that improve its affinity for 53BP1. Methods to identify such variants from a population of mutagenized ubiquitin polypeptides are provided, as well as the identification of additional beneficial mutations at specific amino acid positions. Improving the rate of HDR allows for increased rates of successful genome editing using the CRISPR/Cas9 system or other targeted nucleases in conjunction with supplying a repair template to direct precise genome editing events.
In a first aspect, an isolated polypeptide comprising a ubiquitin polypeptide variant is provided. The isolated polypeptide comprises at least one member selected from one of the following groups:
SEQ ID NO:450, wherein X1 is selected from M, H, Y, W, Q, T, F, S, R, I, and N; X2 is selected from Q, L, I, and M; X6 is selected from K and R; X7 is selected from T, M, I, C, L, and V; X9 is selected from T, I, S, E and V; X12 is selected from T, M, and Y; X13 is selected from I, F, H and P; X14 is selected from T, E, D, H, and N; X16 is selected from E, M, T, N, Y, D, and H; X17 is selected from V and C; X18 is selected from E, M, Y, L, H, F, W, S, Q, T, C, N, R, and D; X19 is selected from P and K; X20 is selected from S, D, N, C, A, and W; X21 is selected from D and E; X25 is selected from N, V, I, E, G, M, Q, D, A, L, R, S, K, T, C, and F; X26 is selected from I, V, and L; X28 is selected from A, E, Q, W, I, M, and D; X29 is selected from K, M, L, R, Q, and H; X31 is selected from Q, C, F, W, H, Y, L, R, and M; X32 is selected from D, A, E, and R; X33 is selected from K, H, A, Q, S, V, L, E, M, T, I, F, C, Y, R, N, and W; X34 is selected from E and T; X38 is selected from P, L, C, F, I, V, Y, T, M, H, S, Q, A, W, N, and K; X39 is selected from D, W, E, G, S, L, and Q; X40 is selected from Q, E, and D; X41 is selected from Q, Y, I, C, and V; X42 is selected from R, W, F, H, Y, N, C, and S; X44 is selected from I, A and T; X46 is selected from A, Q, and G; X48 is selected from K, T, M, I, Q, V, R, L, and N; X49 is selected from Q, S, L, M, P, E, V, A, D, I, C, G, and N; X51 is selected from E and D; X52 is selected from D and E; X54 is selected from R, Y, M, T, H, F, N, Q, K, and C; X55 is selected from T and R; X57 is selected from S, G, D, N, H, E, A, Q, M, R, and K; X58 is selected from D and S; X60 is selected from N, E, and Q; X61 is selected from I and L; X62 is selected from Q, L, T, V, C, A, M, I and S; X63 is selected from K, I, M, F, and V; X64 is selected from E, D, and S; X65 is selected from S, P, E, K, H, R, A, D, N, and Q; X66 is selected from T, K, R, and E; X67 is selected from L, H, K, R, S, M, C, Y, and T; X68 is selected from H, M, Q, and E; X69 is selected from L, P, R, A, G, C, F, M, and S; X70 is selected from V, L, M, F, and C; X73 is selected from L and M; and X74 is selected from R, Q, V, L, M, C, I, T, E, and K, and combinations thereof, provided that SEQ ID NOS:1-3 are excluded; and
at least one member selected from the group of SEQ ID NOs:452-665.
In a second aspect, an isolated polypeptide comprising an isolated fusion polypeptide having an Ubv amino acid sequence with an N-terminal His6-tag is provided. The isolated fusion polypeptide comprises at least one member selected from the following: an isolated fusion polypeptide comprising SEQ ID NO:1100, wherein X12 is selected from M, H, Y, W, Q, T, F, S, R, I, and N; X13 is selected from Q, L, I, and M; X17 is selected from K and R; X18 is selected from T, M, I, C, L, and V; X20 is selected from T, I, S, E and V; X23 is selected from T, M, and Y; X24 is selected from I, F, H and P; X25 is selected from T, E, D, H, and N; X27 is selected from E, M, T, N, Y, D, and H; X28 is selected from V and C; X29 is selected from E, M, Y, L, H, F, W, S, Q, T, C, N, R, and D; X30 is selected from P and K; X31 is selected from S, D, N, C, A, and W; X32 is selected from D and E; X36 is selected from N, V, I, E, G, M, Q, D, A, L, R, S, K, T, C, and F; X37 is selected from I, V, and L; X39 is selected from A, E, Q, W, I, M, and D; X40 is selected from K, M, L, R, Q, and H; X42 is selected from Q, C, F, W, H, Y, L, R, and M; X43 is selected from D, A, E, and R; X44 is selected from K, H, A, Q, S, V, L, E, M, T, I, F, C, Y, R, N, and W; X45 is selected from E and T; X49 is selected from P, L, C, F, I, V, Y, T, M, H, S, Q, A, W, N, and K; X50 is selected from D, W, E, G, S, L, and Q; X51 is selected from Q, E, and D; X52 is selected from Q, Y, I, C, and V; X53 is selected from R, W, F, H, Y, N, C, and S; X55 is selected from I, A and T; X57 is selected from A, Q, and G; X59 is selected from K, T, M, I, Q, V, R, L, and N; X60 is selected from Q, S, L, M, P, E, V, A, D, I, C, G, and N; X62 is selected from E and D; X63 is selected from D and E; X65 is selected from R, Y, M, T, H, F, N, Q, K, and C; X66 is selected from T and R; X68 is selected from S, G, D, N, H, E, A, Q, M, R, and K; X69 is selected from D and S; X71 is selected from N, E, and Q; X72 is selected from I and L; X73 is selected from Q, L, T, V, C, A, M, I and S; X74 is selected from K, I, M, F, and V; X75 is selected from E, D, and S; X76 is selected from S, P, E, K, H, R, A, D, N, and Q; X77 is selected from T, K, R, and E; X78 is selected from L, H, K, R, S, M, C, Y, and T; X79 is selected from H, M, Q, and E; X80 is selected from L, P, R, A, G, C, F, M, and S; X81 is selected from V, L, M, F, and C; X84 is selected from L and M; and X85 is selected from R, Q, V, L, M, C, I, T, E, and K, and combinations thereof, provided that SEQ ID NO: 3 is excluded; and an isolated fusion polypeptide comprising at least one member selected from SEQ ID NOS:235-244 and 246-449.
In a third aspect, an isolated polypeptide that enhances HDR activity through interactions with 53BP1 in a manner to influence repair mechanisms at DSB sites is provided. The isolated polypeptide includes a Ubv having at least 40% amino acid sequence identity to amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having at least 40% amino acid sequence identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. The isolated polypeptide provides enhanced HDR activity through interactions with 53BP1 in a manner to influence repair mechanisms at DSB sites relative to SEQ ID NO:1 under identical conditions.
In a fourth aspect, an isolated polynucleotide is provided. The isolated polynucleotide encodes the isolated polypeptide of any of the first, second, or third aspects.
In a fifth aspect, an isolated polynucleotide encoding a ubiquitin polypeptide variant is provided. The isolated polynucleotide comprises at least one member selected from SEQ ID NOS:669-682, 885-890, and 892-1099, and the corresponding RNA counterparts thereof.
In a sixth aspect, a vector comprising an isolated polynucleotide encoding a ubiquitin polypeptide variant is provided. The isolated polynucleotide comprises at least one member selected from SEQ ID NOS:669-682, 885-890, and 892-1099, and the corresponding RNA counterparts thereof.
In a seventh aspect, a cell or cell line comprising the isolated polypeptide of the first, second, or third aspects, the isolated polynucleotide of the fourth or fifth aspects, or the vector of the sixth aspect is provided.
In an eighth aspect, a method of suppressing 53BP1 recruitment to DNA double-strand break sites in a cell is provided. The method includes a step of administering to the cell the isolated polypeptide of the first, second, or third aspects, the isolated polynucleotide of the fourth or fifth aspects, or the vector of the sixth aspect.
In a ninth aspect, a method of increasing homology-directed repair in a cell is provided. The method includes a step of administering to the cell the isolated polypeptide of the first, second, or third aspects, the isolated polynucleotide of the fourth or fifth aspects, or the vector of the sixth aspect.
In a tenth aspect, a method of editing a gene in a cell using a CRISPR system is provided. The method includes a step of administering to the cell the isolated polypeptide of the first, second, or third aspects, the isolated polynucleotide of the fourth or fifth aspects, or the vector of the sixth aspect.
In an eleventh aspect, a method of gene targeting in a cell is provided. The method includes a step of administering to the cell the isolated polypeptide of the first, second, or third aspects, the isolated polynucleotide of the fourth or fifth aspects, or the vector of the sixth aspect.
In a twelfth aspect, a composition comprising the isolated polypeptide of the first, second, or third aspects is provided.
In a thirteenth aspect, a kit comprising the isolated polypeptide of the first, second, or third aspects, the isolated polynucleotide of the fourth or fifth aspects, or the vector of the sixth aspect is provided.
In a fourteenth aspect, a method of performing a medically therapeutic procedure is provided. The method includes the step of performing genome editing according to any of the tenth or eleventh aspects.
In a fifteenth aspect, a method of screening for amino acid changes in a first polypeptide that improve affinity of the first polypeptide for a second polypeptide is provided. The method includes a step of using the BACTH system with a reporter gene under control of a cAMP-regulated promoter to allow fluorescence activated cell sorting based on protein-protein interaction affinity between the first polypeptide and the second polypeptide to screen for improved affinity variants of the first polypeptide.
The current invention provides novel ubiquitin variants (Ubvs) with increased affinity for 53BP1 and improved efficacy for enhancing HDR rates. The identified Ubvs include variants carrying candidate amino acid changes in i53 that improve its affinity for 53BP1, as well as Ubvs that do not include any of the mutations present in the published i53 sequence. Methods to identify such variants from a population of mutagenized ubiquitin polypeptides are provided, as well as the identification of additional beneficial mutations at specific amino acid positions. Methods are provided that improve the rate of HDR and allow for increased rates of successful genome editing using the CRISPR/Cas9 system or other targeted nucleases in conjunction with supplying a repair template to direct precise genome editing events.
Screening methods to identify novel ubiquitin polypeptide variants
An initial filing identified ubiquitin variants (Ubvs) with increased affinity for 53BP1 and improved efficacy for enhancing HDR rates. To identify mutations that improve the affinity of i53 for 53BP1, a two-hybrid screen was conducted. We engineered the screen such that interaction of two candidate proteins is tied to expression of a reporter gene that can be measured by fluorescence activated cell sorting (FACS). That disclosure described the results of a screen that interrogated the effect of all possible single amino acid substitutions individually at every position in i53 (a.a. 1-74) on the expression of a reporter gene in a two-hybrid assay in E. coli. From that screening method, about 230 amino acid changes were identified as candidates for improving the affinity of i53 for 53BP1. Of the 24 amino acid changes tested individually, 16 resulted in a statistically significant increase in the percent of cells that were positive for reporter expression relative to i53. See Example 1 for details. See U.S. Provisional Patent Application Ser. No. 63/248,300, filed Sep. 24, 2021, and entitled “UBIQUITIN VARIANTS WITH IMPROVED AFFINITY FOR 53BP1” (Attorney Docket No. IDT01-021-PRO), the contents of which are incorporated by reference in their entirety.
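The scale of a screen covering every single amino acid substitution at each of 74 positions can be sketched as follows. This is only an illustration: the widely known human ubiquitin sequence (positions 1-74) is used as a stand-in because the i53 sequence is not reproduced in this passage, and the names `AMINO_ACIDS`, `UBIQUITIN_1_74`, and `single_substitutions` are hypothetical.

```python
# Illustrative sketch: enumerate all single amino acid substitutions of a
# 74-residue sequence. Human ubiquitin (positions 1-74) is an assumed stand-in
# for i53; it is not taken from the sequence listing of this disclosure.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

UBIQUITIN_1_74 = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQ"
    "QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLR"
)


def single_substitutions(seq):
    """Yield (label, variant) pairs such as ('I44A', <variant sequence>)."""
    for index, wild_type in enumerate(seq):
        for replacement in AMINO_ACIDS:
            if replacement != wild_type:
                label = f"{wild_type}{index + 1}{replacement}"
                yield label, seq[:index] + replacement + seq[index + 1:]


library = dict(single_substitutions(UBIQUITIN_1_74))

# 74 positions x 19 alternative residues per position
print(len(UBIQUITIN_1_74), "positions,", len(library), "single-substitution variants")
```

Under these assumptions the library holds 19 variants per position, which gives a sense of the search space from which the roughly 230 candidate changes were drawn.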
A subsequent filing described the testing of a subset of those mutations individually and in combination for their effects on the affinity of the two proteins in vitro and on the ability to enhance HDR. From this testing, several individual mutations that change amino acids at the surface of i53 that interacts with 53BP1 were found to significantly improve the affinity of i53 for 53BP1. When mutations were combined, the highest affinity Ubv (CM1) had a 50- to 100-fold improvement in the affinity for a fragment of 53BP1 relative to the published i53 sequence. Two of the Ubvs that contain multiple mutations relative to i53 were tested for their ability to improve HDR in HEK293 cells. These tests revealed that the improved affinity ubiquitin variants require about a 10-fold lower dose for maximum effectiveness and that HDR rates were improved beyond what could be achieved with the i53 peptide. See U.S. Provisional Patent Application Ser. No. 63/278,155, filed Nov. 11, 2021, and entitled “UBIQUITIN VARIANTS WITH IMPROVED AFFINITY FOR 53BP1” (Attorney Docket No. IDT01-021-PRO2), the contents of which are incorporated by reference in their entirety.
A subsequent filing evaluated additional individual mutations in the context of i53 and CM1 and identified novel combinations of mutations that further improve affinity beyond that of CM1. Additionally, novel beneficial mutations beyond those identified in the screen at specific amino acid positions were identified. Combining the novel beneficial mutations with screen-identified mutations resulted in the generation of Ubvs that do not include any of the mutations present in the published i53 sequence and have dramatically improved affinity for 53BP1 compared to i53. See U.S. Provisional Patent Application Ser. No. 63/321,384, filed Mar. 18, 2022 and entitled “UBIQUITIN VARIANTS WITH IMPROVED AFFINITY FOR 53BP1” (Attorney Docket No. IDT01-021-PRO3), the contents of which are incorporated by reference in their entirety.
Using a combination of amino acid changes from the two-hybrid screen and changes identified through specific position screens (see Example 4), a ubiquitin variant (CM455) was identified that does not contain any of the mutations present in i53 yet maintains affinity comparable to CM1. Additional individual mutations at position 2 were evaluated in the context of CM455, and a novel mutation was identified that results in a variant (CM487) with improved affinity beyond that of CM455 (see Example 6).
Referring to
wherein X1 is selected from M, H, Y, W, Q, T, F, S, R, I, and N; X2 is selected from Q, L, I, and M; X6 is selected from K and R; X7 is selected from T, M, I, C, L, and V; X9 is selected from T, I, S, E and V; X12 is selected from T, M, and Y; X13 is selected from I, F, H and P; X14 is selected from T, E, D, H, and N; X16 is selected from E, M, T, N, Y, D, and H; X17 is selected from V and C; X18 is selected from E, M, Y, L, H, F, W, S, Q, T, C, N, R, and D; X19 is selected from P and K; X20 is selected from S, D, N, C, A, and W; X21 is selected from D and E; X25 is selected from N, V, I, E, G, M, Q, D, A, L, R, S, K, T, C, and F; X26 is selected from I, V, and L; X28 is selected from A, E, Q, W, I, M, and D; X29 is selected from K, M, L, R, Q, and H; X31 is selected from Q, C, F, W, H, Y, L, R, and M; X32 is selected from D, A, E, and R; X33 is selected from K, H, A, Q, S, V, L, E, M, T, I, F, C, Y, R, N, and W; X34 is selected from E and T; X38 is selected from P, L, C, F, I, V, Y, T, M, H, S, Q, A, W, N, and K; X39 is selected from D, W, E, G, S, L, and Q; X40 is selected from Q, E, and D; X41 is selected from Q, Y, I, C, and V; X42 is selected from R, W, F, H, Y, N, C, and S; X44 is selected from I, A and T; X46 is selected from A, Q, and G; X48 is selected from K, T, M, I, Q, V, R, L, and N; X49 is selected from Q, S, L, M, P, E, V, A, D, I, C, G, and N; X51 is selected from E and D; X52 is selected from D and E; X54 is selected from R, Y, M, T, H, F, N, Q, K, and C; X55 is selected from T and R; X57 is selected from S, G, D, N, H, E, A, Q, M, R, and K; X58 is selected from D and S; X60 is selected from N, E, and Q; X61 is selected from I and L; X62 is selected from Q, L, T, V, C, A, M, I and S; X63 is selected from K, I, M, F, and V; X64 is selected from E, D, and S; X65 is selected from S, P, E, K, H, R, A, D, N, and Q; X66 is selected from T, K, R, and E; X67 is selected from L, H, K, R, S, M, C, Y, and T; X68 is selected from H, M, Q, and E; X69 is selected from L, P, R, A, G, C, F, M, and S; X70 is selected from V, L, M, F, and C; X73 is selected from L and M; and X74 is selected from R, Q, V, L, M, C, I, T, E, and K, and combinations thereof. These polypeptides of SEQ ID NO:450 are highly preferred, provided that SEQ ID NOS:1-3 are excluded.
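Whether a candidate sequence satisfies per-position residue sets of this kind can be checked mechanically. The sketch below is a hedged illustration only: it encodes just six of the variable positions listed above (X1, X2, X6, X44, X69, X70), does not check the fixed positions of SEQ ID NO:450 (which are not reproduced in this passage), and does not apply the exclusion of SEQ ID NOS:1-3. The names `ALLOWED`, `violations`, and `UBIQUITIN_1_74` are hypothetical, and the human ubiquitin sequence is used purely as a convenient test input.

```python
# Illustrative sketch: test residues at selected variable positions against
# the allowed sets recited above. Only a subset of positions is encoded.
ALLOWED = {
    1: set("MHYWQTFSRIN"),   # X1
    2: set("QLIM"),          # X2
    6: set("KR"),            # X6
    44: set("IAT"),          # X44
    69: set("LPRAGCFMS"),    # X69
    70: set("VLMFC"),        # X70
}

# Human ubiquitin, positions 1-74 (assumed stand-in input for the check).
UBIQUITIN_1_74 = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQ"
    "QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLR"
)


def violations(seq, allowed=ALLOWED):
    """Return {position: residue} for encoded positions outside the allowed set."""
    return {
        position: seq[position - 1]
        for position, residues in allowed.items()
        if position <= len(seq) and seq[position - 1] not in residues
    }


# A hypothetical variant with tryptophan at position 44, which is not in {I, A, T}.
i44w_variant = UBIQUITIN_1_74[:43] + "W" + UBIQUITIN_1_74[44:]

print(violations(UBIQUITIN_1_74))
print(violations(i44w_variant))
```

A complete check would encode every recited position and then apply the stated exclusions separately, since a sequence can satisfy all residue sets and still be one of the excluded sequences.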
Fusion Polypeptides with Ubvs Polypeptides Fused to Affinity Tag Motifs
Preferred Ubvs amino acid sequences include fusion polypeptides. Fusion polypeptides typically include extra amino acid information that is not native to the polypeptide to which the extra amino acid information is covalently attached. Such extra amino acid information may include tags that enable purification or identification of the fusion protein. Such extra amino acid information may also include peptides added to facilitate protein translation. Examples of such peptides include an added methionine, or a methionine plus a short flexible linker (GGSG) (MGGSG; SEQ ID NO:1113), to facilitate translation of protein variants in which X1 is not M, such as CM142 (SEQ ID NO: 557). Such extra amino acid information may include peptides that enable the fusion proteins to be transported into cells and/or transported to specific locations within cells, such as peptides that act as nuclear localization signals. Examples of tags for these purposes include the following: AviTag, which is a peptide allowing biotinylation by the enzyme BirA so the protein can be isolated by streptavidin (GLNDIFEAQKIEWHE; SEQ ID NO:1114); Calmodulin-tag, which is a peptide bound by the protein calmodulin (KRRWKKNFIAVSAANRFKKISSSGAL; SEQ ID NO:1115); polyglutamate tag, which is a peptide binding efficiently to anion-exchange resin such as Mono-Q (EEEEEE; SEQ ID NO:1116); E-tag, which is a peptide recognized by an antibody (GAPVPYPDPLEPR; SEQ ID NO:1117); FLAG-tag, which is a peptide recognized by an antibody (DYKDDDDK; SEQ ID NO:1118); HA-tag, which is a peptide from hemagglutinin recognized by an antibody (YPYDVPDYA; SEQ ID NO:1119); His-tag, which is typically 5-10 histidines and can direct binding to a nickel or cobalt chelate (HHHHH; SEQ ID NO:1120); Myc-tag, which is a peptide derived from c-myc recognized by an antibody (EQKLISEEDL; SEQ ID NO:1121); NE-tag, which is a novel 18-amino-acid synthetic peptide (TKENPRSNQEESYDDNES; SEQ ID NO:1122) recognized by a monoclonal IgG1 antibody, which is useful in a wide spectrum of applications including Western blotting, ELISA, flow cytometry, immunocytochemistry, immunoprecipitation, and affinity purification of recombinant proteins; S-tag, which is a peptide derived from Ribonuclease A (KETAAAKFERQHMDS; SEQ ID NO:1123); SBP-tag, which is a peptide that binds to streptavidin (MDEKTTGWRGGHVVEGLAGELEQLRARLEHHPQGQREP; SEQ ID NO:1124); Softag 1, which is intended for mammalian expression (SLAELLNAGLGGS; SEQ ID NO:1125); Softag 3, which is intended for prokaryotic expression (TQDPSRVG; SEQ ID NO:1126); Strep-tag, which is a peptide that binds to streptavidin or the modified streptavidin called streptactin (Strep-tag II: WSHPQFEK; SEQ ID NO:1127); TC tag, which is a tetracysteine tag that is recognized by FlAsH and ReAsH biarsenical compounds (CCPGCC; SEQ ID NO:1128); V5 tag, which is a peptide recognized by an antibody (GKPIPNPLLGLDST; SEQ ID NO:1129); VSV-tag, which is a peptide recognized by an antibody (YTDIEMNRLGK; SEQ ID NO:1130); Xpress tag (DLYDDDDK; SEQ ID NO:1131); Isopeptag, which is a peptide that binds covalently to pilin-C protein (TDKDMTITFTNKKDAE; SEQ ID NO:1132); SpyTag, which is a peptide that binds covalently to SpyCatcher protein (AHIVMVDAYKPTK; SEQ ID NO:1133); and SnoopTag, which is a peptide that binds covalently to SnoopCatcher protein (KLGDIEFIKVNK; SEQ ID NO:1134).
An affinity tag can include flanking amino acids when the affinity tag is located at the N-terminus of the fusion polypeptide. Such flanking amino acids include an initiator methionine and flexible linker sequences.
A highly preferred affinity tag includes a His-tag (SEQ ID NO:1135). A highly preferred affinity tag includes an N-terminal His-tag (MHHHHHHGGSG; SEQ ID NO:1136). Highly preferred fusion polypeptides include Ubvs, such as SEQ ID NO: 3, fused to an N-terminal His-tag (e.g., SEQ ID NO:1136), as well as other preferred Ubvs amino acid sequences that include an N-terminal His-tag. A highly preferred translation tag includes an N-terminal methionine (M) or M plus a short flexible linker (i.e., MGGSG; SEQ ID NO:1113).
A highly preferred fusion polypeptide of Ubvs comprises SEQ ID NO:1100:
wherein X12 is selected from M, H, Y, W, Q, T, F, S, R, I, and N; X13 is selected from Q, L, I, and M; X17 is selected from K and R; X18 is selected from T, M, I, C, L, and V; X20 is selected from T, I, S, E and V; X23 is selected from T, M, and Y; X24 is selected from I, F, H and P; X25 is selected from T, E, D, H, and N; X27 is selected from E, M, T, N, Y, D, and H; X28 is selected from V and C; X29 is selected from E, M, Y, L, H, F, W, S, Q, T, C, N, R, and D; X30 is selected from P and K; X31 is selected from S, D, N, C, A, and W; X32 is selected from D and E; X36 is selected from N, V, I, E, G, M, Q, D, A, L, R, S, K, T, C, and F; X37 is selected from I, V, and L; X39 is selected from A, E, Q, W, I, M, and D; X40 is selected from K, M, L, R, Q, and H; X42 is selected from Q, C, F, W, H, Y, L, R, and M; X43 is selected from D, A, E, and R; X44 is selected from K, H, A, Q, S, V, L, E, M, T, I, F, C, Y, R, N, and W; X45 is selected from E and T; X49 is selected from P, L, C, F, I, V, Y, T, M, H, S, Q, A, W, N, and K; X50 is selected from D, W, E, G, S, L, and Q; X51 is selected from Q, E, and D; X52 is selected from Q, Y, I, C, and V; X53 is selected from R, W, F, H, Y, N, C, and S; X55 is selected from I, A and T; X57 is selected from A, Q, and G; X59 is selected from K, T, M, I, Q, V, R, L, and N; X60 is selected from Q, S, L, M, P, E, V, A, D, I, C, G, and N; X62 is selected from E and D; X63 is selected from D and E; X65 is selected from R, Y, M, T, H, F, N, Q, K, and C; X66 is selected from T and R; X68 is selected from S, G, D, N, H, E, A, Q, M, R, and K; X69 is selected from D and S; X71 is selected from N, E, and Q; X72 is selected from I and L; X73 is selected from Q, L, T, V, C, A, M, I and S; X74 is selected from K, I, M, F, and V; X75 is selected from E, D, and S; X76 is selected from S, P, E, K, H, R, A, D, N, and Q; X77 is selected from T, K, R, and E; X78 is selected from L, H, K, R, S, M, C, Y, and T; X79 is selected from H, M, Q, and E; X80 is selected from L, P, R, A, G, C, F, M, and S; X81 is selected from V, L, M, F, and C; X84 is selected from L and M; and X85 is selected from R, Q, V, L, M, C, I, T, E, and K, and combinations thereof, provided that SEQ ID NO: 3 is excluded.
Additional preferred fusion polypeptides of Ubvs include SEQ ID NOS:235-244 and 246-449.
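The position numbering of the fusion polypeptide (X12 through X85) appears to be the Ubv numbering (X1 through X74) shifted by the length of the N-terminal tag. The sketch below illustrates that arithmetic under the assumption that the tag is the 11-residue sequence MHHHHHHGGSG (SEQ ID NO:1136) given above; the function names are hypothetical and the mapping is an inference from the recited positions, not a statement from the sequence listing.

```python
# Illustrative sketch: convert between Ubv numbering (1-74) and fusion
# numbering (12-85), assuming an 11-residue N-terminal tag precedes the Ubv.
N_TERMINAL_TAG = "MHHHHHHGGSG"  # M + six H + GGSG linker, as given in the text


def fusion_position(ubv_pos, tag=N_TERMINAL_TAG):
    """Map a Ubv position to the corresponding fusion polypeptide position."""
    if ubv_pos < 1:
        raise ValueError("Ubv positions are 1-based")
    return ubv_pos + len(tag)


def ubv_position(fusion_pos, tag=N_TERMINAL_TAG):
    """Map a fusion polypeptide position back to the Ubv position."""
    position = fusion_pos - len(tag)
    if position < 1:
        raise ValueError("position lies within the N-terminal tag")
    return position


for ubv_pos in (1, 44, 74):
    print(f"Ubv X{ubv_pos} corresponds to fusion X{fusion_position(ubv_pos)}")
```

With this offset, the residue set recited for X44 of the Ubv (I, A and T) lines up with the set recited for X55 of the fusion polypeptide.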
An isolated polypeptide that enhances rates of HDR through interactions with 53BP1 in a manner to influence repair mechanisms at DSB sites is provided. The isolated polypeptide comprises a Ubv having at least 40% amino acid sequence identity to amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having at least 40% amino acid sequence identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. Such an isolated polypeptide provides enhanced HDR activity through interactions with 53BP1 in a manner to influence repair mechanisms at DSB sites relative to SEQ ID NO:1 under identical conditions.
Preferred isolated polypeptides include those having amino acid sequence identity in the range of at least 50% to 100% identity with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having amino acid sequence identity in the range of at least 50% to 100% identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. Even more preferably, preferred isolated polypeptides include those having amino acid sequence identity in the range of at least 60% to 100% identity with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having amino acid sequence identity in the range of at least 60% to 100% identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. Even more preferably, preferred isolated polypeptides include those having amino acid sequence identity in the range of at least 70% to 100% identity with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having amino acid sequence identity in the range of at least 70% to 100% identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. Even more preferably, preferred isolated polypeptides include those having amino acid sequence identity in the range of at least 80% to 100% identity with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having amino acid sequence identity in the range of at least 80% to 100% identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. 
Even more preferably, preferred isolated polypeptides include those having amino acid sequence identity in the range of at least 90% to 100% identity with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having amino acid sequence identity in the range of at least 90% to 100% identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. Even more preferably, preferred isolated polypeptides include those having amino acid sequence identity in the range of at least 95% to 100% identity with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having amino acid sequence identity in the range of at least 95% to 100% identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded.
A preferred polypeptide sequence in the aforementioned ranges with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded, further provides a functional benefit of enhanced HDR rates when compared to HDR rates achieved when introducing human ubiquitin SEQ ID NO:1 into cells under identical conditions.
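The percent identity thresholds above can be illustrated with a simple position-by-position comparison. This is a minimal sketch that assumes the two sequences are already aligned and of equal length (no gaps); the disclosure does not specify an alignment method in this passage, and practical comparisons would normally use an alignment tool. Human ubiquitin (positions 1-74) and a hypothetical I44A variant of it serve only as example inputs, and all names are hypothetical.

```python
# Illustrative sketch: ungapped percent identity between two equal-length
# sequences, plus a threshold check such as "at least 90% identity".
UBIQUITIN_1_74 = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQ"
    "QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLR"
)


def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues (sequences pre-aligned)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a, seq_b, minimum_percent):
    """True if the identity is at least the stated minimum percentage."""
    return percent_identity(seq_a, seq_b) >= minimum_percent


# Hypothetical example input: one substitution (I44A) in 74 positions.
i44a_variant = UBIQUITIN_1_74[:43] + "A" + UBIQUITIN_1_74[44:]

identity = percent_identity(UBIQUITIN_1_74, i44a_variant)
print(f"{identity:.1f}% identity; at least 95%: {meets_threshold(UBIQUITIN_1_74, i44a_variant, 95)}")
```

Identity alone does not establish the functional benefit; the enhanced HDR property recited above would still need to be assessed experimentally.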
A preferred isolated polynucleotide encoding such isolated polypeptides within the stated ranges of % amino acid sequence identity to the aforementioned reference polypeptide sequence(s) further provides a functional benefit of enhanced HDR rates when compared to HDR rates achieved when introducing human ubiquitin SEQ ID NO:1 into cells under identical conditions. Such enhanced HDR rates can be readily assessed by one of skill in the art based upon the teachings disclosed herein, including tests for at least one of the following functional properties: (1) a higher Ka (lower Kd) for binding a fragment of 53BP1 (amino acids 1484-1603) (see, for example, SEQ ID NO: 245) than is measured for human ubiquitin (SEQ ID NO:1) under identical conditions as measured in vitro using BLI, even more preferably a higher measured Ka (lower Kd) for binding a fragment of 53BP1 (amino acids 1484-1603) (see SEQ ID NO: 245) than is measured for i53 (SEQ ID NO:2) under identical conditions as measured in vitro using BLI; (2) delivery of the polypeptide in the form of mRNA, plasmid, or protein results in improved HDR rates for introduction of an EcoRI cut site insert at the HPRT1 or SERPINC1 cut sites, as specified by the sgRNA and ssDNA donor sequences in Table 7, as compared to delivery of human ubiquitin (SEQ ID NO: 1) under the same conditions. See Examples 3, 4, 7, and 8 for details.
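The relationship between Ka and Kd used in the first functional test can be made concrete with a short calculation. The sketch below relies on the standard definition Ka = 1/Kd and expresses an affinity improvement as the ratio of the reference Kd to the variant Kd. The Kd values are hypothetical placeholders chosen only to show the arithmetic; they are not measurements from this disclosure, and the function names are assumptions.

```python
# Illustrative sketch: compare binding affinities via dissociation constants.
# All Kd values here are hypothetical placeholders, not measured data.


def association_constant(kd_molar):
    """Ka (1/M) from Kd (M), using Ka = 1 / Kd."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return 1.0 / kd_molar


def fold_improvement(kd_reference_molar, kd_variant_molar):
    """How many times tighter the variant binds than the reference."""
    return kd_reference_molar / kd_variant_molar


def binds_tighter(kd_reference_molar, kd_variant_molar):
    """True if the variant has a higher Ka (lower Kd) than the reference."""
    return association_constant(kd_variant_molar) > association_constant(kd_reference_molar)


hypothetical_kd_reference = 100e-9  # 100 nM, placeholder for a reference protein
hypothetical_kd_variant = 2e-9      # 2 nM, placeholder for an improved variant

print(f"fold improvement: {fold_improvement(hypothetical_kd_reference, hypothetical_kd_variant):.0f}")
print("variant binds tighter:", binds_tighter(hypothetical_kd_reference, hypothetical_kd_variant))
```

A lower Kd and a higher Ka are two statements of the same comparison, so either constant can be used as long as both proteins are measured under identical conditions.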
Isolated nucleic acids encoding preferred Ubvs amino acid sequences are provided. One preferred isolated nucleic acid encodes SEQ ID NO:450:
wherein X1 is selected from M, H, Y, W, Q, T, F, S, R, I, and N; X2 is selected from Q, L, I, and M; X6 is selected from K and R; X7 is selected from T, M, I, C, L, and V; X9 is selected from T, I, S, E and V; X12 is selected from T, M, and Y; X13 is selected from I, F, H and P; X14 is selected from T, E, D, H, and N; X16 is selected from E, M, T, N, Y, D, and H; X17 is selected from V and C; X18 is selected from E, M, Y, L, H, F, W, S, Q, T, C, N, R, and D; X19 is selected from P and K; X20 is selected from S, D, N, C, A, and W; X21 is selected from D and E; X25 is selected from N, V, I, E, G, M, Q, D, A, L, R, S, K, T, C, and F; X26 is selected from I, V, and L; X28 is selected from A, E, Q, W, I, M, and D; X29 is selected from K, M, L, R, Q, and H; X31 is selected from Q, C, F, W, H, Y, L, R, and M; X32 is selected from D, A, E, and R; X33 is selected from K, H, A, Q, S, V, L, E, M, T, I, F, C, Y, R, N, and W; X34 is selected from E and T; X38 is selected from P, L, C, F, I, V, Y, T, M, H, S, Q, A, W, N, and K; X39 is selected from D, W, E, G, S, L, and Q; X40 is selected from Q, E, and D; X41 is selected from Q, Y, I, C, and V; X42 is selected from R, W, F, H, Y, N, C, and S; X44 is selected from I, A and T; X46 is selected from A, Q, and G; X48 is selected from K, T, M, I, Q, V, R, L, and N; X49 is selected from Q, S, L, M, P, E, V, A, D, I, C, G, and N; X51 is selected from E and D; X52 is selected from D and E; X54 is selected from R, Y, M, T, H, F, N, Q, K, and C; X55 is selected from T and R; X57 is selected from S, G, D, N, H, E, A, Q, M, R, and K; X58 is selected from D and S; X60 is selected from N, E, and Q; X61 is selected from I and L; X62 is selected from Q, L, T, V, C, A, M, I and S; X63 is selected from K, I, M, F, and V; X64 is selected from E, D, and S; X65 is selected from S, P, E, K, H, R, A, D, N, and Q; X66 is selected from T, K, R, and E; X67 is selected from L, H, K, R, S, M, C, Y, and T; X68 is selected from H, M, Q, and E; X69 is selected from L, P, R, A, G, C, F, M, and S; X70 is selected from V, L, M, F, and C; X73 is selected from L and M; and X74 is selected from R, Q, V, L, M, C, I, T, E, and K, and combinations thereof, provided that polynucleotides encoding SEQ ID NOS:1-3 are excluded (i.e., SEQ ID NOS: 666, 667 and 883).
Another preferred isolated nucleic acid encodes SEQ ID NO:1100:
wherein X12 is selected from M, H, Y, W, Q, T, F, S, R, I, and N; X13 is selected from Q, L, I, and M; X17 is selected from K and R; X18 is selected from T, M, I, C, L, and V; X20 is selected from T, I, S, E and V; X23 is selected from T, M, and Y; X24 is selected from I, F, H and P; X25 is selected from T, E, D, H, and N; X27 is selected from E, M, T, N, Y, D, and H; X28 is selected from V and C; X29 is selected from E, M, Y, L, H, F, W, S, Q, T, C, N, R, and D; X30 is selected from P and K; X31 is selected from S, D, N, C, A, and W; X32 is selected from D and E; X36 is selected from N, V, I, E, G, M, Q, D, A, L, R, S, K, T, C, and F; X37 is selected from I, V, and L; X39 is selected from A, E, Q, W, I, M, and D; X40 is selected from K, M, L, R, Q, and H; X42 is selected from Q, C, F, W, H, Y, L, R, and M; X43 is selected from D, A, E, and R; X44 is selected from K, H, A, Q, S, V, L, E, M, T, I, F, C, Y, R, N, and W; X45 is selected from E and T; X49 is selected from P, L, C, F, I, V, Y, T, M, H, S, Q, A, W, N, and K; X50 is selected from D, W, E, G, S, L, and Q; X51 is selected from Q, E, and D; X52 is selected from Q, Y, I, C, and V; X53 is selected from R, W, F, H, Y, N, C, and S; X55 is selected from I, A and T; X57 is selected from A, Q, and G; X59 is selected from K, T, M, I, Q, V, R, L, and N; X60 is selected from Q, S, L, M, P, E, V, A, D, I, C, G, and N; X62 is selected from E and D; X63 is selected from D and E; X65 is selected from R, Y, M, T, H, F, N, Q, K, and C; X66 is selected from T and R; X68 is selected from S, G, D, N, H, E, A, Q, M, R, and K; X69 is selected from D and S; X71 is selected from N, E, and Q; X72 is selected from I and L; X73 is selected from Q, L, T, V, C, A, M, I and S; X74 is selected from K, I, M, F, and V; X75 is selected from E, D, and S; X76 is selected from S, P, E, K, H, R, A, D, N, and Q; X77 is selected from T, K, R, and E; X78 is selected from L, H, K, R, S, M, C, Y, and T; X79 is selected from H, M, Q, and E; X80 is selected from L, P, R, A, G, C, F, M, and S; X81 is selected from V, L, M, F, and C; X84 is selected from L and M; and X85 is selected from R, Q, V, L, M, C, I, T, E, and K, and combinations thereof, provided that SEQ ID NO: 3 is excluded.
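The variable positions recited for SEQ ID NO:1100 appear to be the positions recited for SEQ ID NO:450 shifted by 11 residues (X1 corresponds to X12, X74 to X85), which is consistent with an N-terminal tag preceding the Ubv sequence. The following is a minimal sketch of how a candidate sequence could be checked against the recited residue sets. The names `TAG_OFFSET`, `ALLOWED_450_SUBSET`, `shift_positions`, and `find_violations` are hypothetical, only a handful of positions are copied from the text for brevity, and the Sequence Listing remains the authoritative definition.

```python
# Offset between the untagged numbering (SEQ ID NO:450) and the tagged
# numbering (SEQ ID NO:1100). Assumption inferred from the text:
# X1 -> X12, ..., X74 -> X85.
TAG_OFFSET = 11

# Illustrative subset of the allowed residues recited for SEQ ID NO:450.
# Keys are 1-based positions; values are the permitted one-letter codes.
ALLOWED_450_SUBSET = {
    1: set("MHYWQTFSRIN"),
    2: set("QLIM"),
    6: set("KR"),
    7: set("TMICLV"),
    9: set("TISEV"),
    73: set("LM"),
    74: set("RQVLMCITEK"),
}


def shift_positions(allowed, offset=TAG_OFFSET):
    """Re-key a position -> residue-set map into the tagged numbering."""
    return {pos + offset: residues for pos, residues in allowed.items()}


def find_violations(sequence, allowed):
    """Return (position, residue) pairs that fall outside the allowed sets.

    Positions are 1-based. A position beyond the end of the sequence is
    reported with residue None.
    """
    violations = []
    for pos in sorted(allowed):
        residue = sequence[pos - 1] if pos <= len(sequence) else None
        if residue not in allowed[pos]:
            violations.append((pos, residue))
    return violations
```

An empty list from `find_violations` means the candidate is consistent with the residue sets that were supplied; it says nothing about positions omitted from the map or about the exclusion provisos.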
Preferred isolated polynucleotides (e.g., DNA and their corresponding RNA counterparts) include those that encode Ubvs having an amino acid sequence identity in the range of at least 70% to 100% identity of SEQ ID NOS: 450 and 1100, respectively. Even more preferably, isolated polynucleotides include those that encode Ubvs having an amino acid sequence identity in the range of at least 80% to 100% identity of SEQ ID NOS: 450 and 1100, respectively. Even more preferably, preferred isolated polynucleotides include those that encode Ubvs having an amino acid sequence identity in the range of at least 90% to 100% identity of SEQ ID NOS: 450 and 1100, respectively. Even more preferably, preferred isolated polynucleotides include those that encode Ubvs having an amino acid sequence identity in the range of at least 95% to 100% identity of SEQ ID NOS: 450 and 1100, respectively.
An isolated polynucleotide that encodes an isolated polypeptide with enhanced HDR activity through interactions with 53BP1 in a manner to influence repair mechanisms at DSB sites is provided. The encoded isolated polypeptide comprises a Ubv having at least 40% amino acid sequence identity to amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having at least 40% amino acid sequence identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. Such an isolated polypeptide identity provides enhanced HDR activity through interactions with 53BP1 in a manner to influence repair mechanisms at DSB sites relative to SEQ ID NO:1 under identical conditions.
Preferred isolated polynucleotides encoding such isolated polypeptides include those having amino acid sequence identity in the range of at least 50% to 100% identity with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having amino acid sequence identity in the range of at least 50% to 100% identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. Even more preferably, isolated polynucleotides encoding such isolated polypeptides include those having amino acid sequence identity in the range of at least 60% to 100% identity with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having amino acid sequence identity in the range of at least 60% to 100% identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. Even more preferably, isolated polynucleotides encoding such isolated polypeptides include those having amino acid sequence identity in the range of at least 70% to 100% identity with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having amino acid sequence identity in the range of at least 70% to 100% identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. Even more preferably, isolated polynucleotides encoding such isolated polypeptides include those having amino acid sequence identity in the range of at least 80% to 100% identity with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having amino acid sequence identity in the range of at least 80% to 100% identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded.
Even more preferably, preferred isolated polynucleotides encoding such isolated polypeptides include those having amino acid sequence identity in the range of at least 90% to 100% identity with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having amino acid sequence identity in the range of at least 90% to 100% identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. Even more preferably, preferred isolated polynucleotides encoding such isolated polypeptides include those having amino acid sequence identity in the range of at least 95% to 100% identity with amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having amino acid sequence identity in the range of at least 95% to 100% identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded.
A preferred isolated polynucleotide encoding such isolated polypeptides within the stated ranges of % amino acid sequence identity to the aforementioned reference polypeptide sequence(s) in the aforementioned ranges further provides a functional benefit of enhanced HDR rates when compared to HDR rates of an isolated polynucleotide encoding SEQ ID NO:1 under identical conditions. Such enhanced HDR rates can be readily assessed by one of skill in the art based upon the teachings disclosed herein, including evaluations as described previously herein.
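The percent identity thresholds recited above are evaluated over a fixed window of the reference sequence (positions 1-74 for the untagged sequences, positions 12-85 for the tagged sequences). The following is a minimal sketch of such a comparison, assuming the candidate and reference are already in the same numbering and have no insertions or deletions within the window, so that a position-by-position comparison suffices. The function names `percent_identity` and `meets_threshold` are hypothetical, and a sequence alignment tool would be needed where those assumptions do not hold.

```python
def percent_identity(candidate, reference, start=1, end=74):
    """Position-wise percent identity over a 1-based, inclusive window.

    Assumes both sequences share the same numbering and contain no
    insertions or deletions inside the window (no alignment is performed).
    """
    if start < 1 or end < start:
        raise ValueError("window must satisfy 1 <= start <= end")
    if len(candidate) < end or len(reference) < end:
        raise ValueError("both sequences must cover the whole window")
    cand_window = candidate[start - 1:end]
    ref_window = reference[start - 1:end]
    matches = sum(a == b for a, b in zip(cand_window, ref_window))
    return 100.0 * matches / len(ref_window)


def meets_threshold(candidate, reference, threshold, start=1, end=74):
    """True if identity over the window is at least `threshold` percent."""
    return percent_identity(candidate, reference, start, end) >= threshold
```

For a tagged sequence the same functions can be called with `start=12, end=85`. Over a 74-residue window, a single substitution gives 73/74 identity (about 98.6%), and four substitutions give 70/74 (about 94.6%), which falls just below a 95% threshold.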
Applications
It will be generally understood that the disclosed amino acid substitutions within the ubiquitin polypeptide variants that result in improved affinity for 53BP1 can be generated in the context of the wild-type ubiquitin polypeptide (SEQ ID NO:1) or the i53 ubiquitin polypeptide (SEQ ID NO:2), including tag-free polypeptides and fusion polypeptides having an affinity tag included as part of the ubiquitin polypeptide variants. For example, one skilled in the art will appreciate that untagged versions or differently tagged versions fall within the scope of the disclosed ubiquitin polypeptide variants, including those ubiquitin polypeptide variants having a polyhistidine motif (e.g., a His6 tag). Accordingly, alternative versions of ubiquitin polypeptide variants may be constructed and function either with or without an affinity tag, such as a polyhistidine tag.
In a first aspect, an isolated polypeptide comprising a ubiquitin polypeptide variant is provided. The isolated polypeptide comprises at least one member selected from one of the following groups:
SEQ ID NO:450, wherein X1 is selected from M, H, Y, W, Q, T, F, S, R, I, and N; X2 is selected from Q, L, I, and M; X6 is selected from K and R; X7 is selected from T, M, I, C, L, and V; X9 is selected from T, I, S, E and V; X12 is selected from T, M, and Y; X13 is selected from I, F, H and P; X14 is selected from T, E, D, H, and N; X16 is selected from E, M, T, N, Y, D, and H; X17 is selected from V and C; X18 is selected from E, M, Y, L, H, F, W, S, Q, T, C, N, R, and D; X19 is selected from P and K; X20 is selected from S, D, N, C, A, and W; X21 is selected from D and E; X25 is selected from N, V, I, E, G, M, Q, D, A, L, R, S, K, T, C, and F; X26 is selected from I, V, and L; X28 is selected from A, E, Q, W, I, M, and D; X29 is selected from K, M, L, R, Q, and H; X31 is selected from Q, C, F, W, H, Y, L, R, and M; X32 is selected from D, A, E, and R; X33 is selected from K, H, A, Q, S, V, L, E, M, T, I, F, C, Y, R, N, and W; X34 is selected from E and T; X38 is selected from P, L, C, F, I, V, Y, T, M, H, S, Q, A, W, N, and K; X39 is selected from D, W, E, G, S, L, and Q; X40 is selected from Q, E, and D; X41 is selected from Q, Y, I, C, and V; X42 is selected from R, W, F, H, Y, N, C, and S; X44 is selected from I, A and T; X46 is selected from A, Q, and G; X48 is selected from K, T, M, I, Q, V, R, L, and N; X49 is selected from Q, S, L, M, P, E, V, A, D, I, C, G, and N; X51 is selected from E and D; X52 is selected from D and E; X54 is selected from R, Y, M, T, H, F, N, Q, K, and C; X55 is selected from T and R; X57 is selected from S, G, D, N, H, E, A, Q, M, R, and K; X58 is selected from D and S; X60 is selected from N, E, and Q; X61 is selected from I and L; X62 is selected from Q, L, T, V, C, A, M, I and S; X63 is selected from K, I, M, F, and V; X64 is selected from E, D, and S; X65 is selected from S, P, E, K, H, R, A, D, N, and Q; X66 is selected from T, K, R, and E; X67 is selected from L, H, K, R, S, M, C, Y, and T; X68 is selected from H, M, Q, and E; X69 is selected from L, P, R, A, G, C, F, M, and S; X70 is selected from V, L, M, F, and C; X73 is selected from L and M; and X74 is selected from R, Q, V, L, M, C, I, T, E, and K, and combinations thereof, provided that SEQ ID NOS:1-3 are excluded; and
at least one member selected from the group of SEQ ID NOs:452-665.
In a first respect, the isolated polypeptide comprises a ubiquitin polypeptide variant selected from SEQ ID NO:450, wherein X1 is selected from M, H, Y, W, Q, T, F, S, R, I, and N; X2 is selected from Q, L, I, and M; X6 is selected from K and R; X7 is selected from T, M, I, C, L, and V; X9 is selected from T, I, S, E and V; X12 is selected from T, M, and Y; X13 is selected from I, F, H and P; X14 is selected from T, E, D, H, and N; X16 is selected from E, M, T, N, Y, D, and H; X17 is selected from V and C; X18 is selected from E, M, Y, L, H, F, W, S, Q, T, C, N, R, and D; X19 is selected from P and K; X20 is selected from S, D, N, C, A, and W; X21 is selected from D and E; X25 is selected from N, V, I, E, G, M, Q, D, A, L, R, S, K, T, C, and F; X26 is selected from I, V, and L; X28 is selected from A, E, Q, W, I, M, and D; X29 is selected from K, M, L, R, Q, and H; X31 is selected from Q, C, F, W, H, Y, L, R, and M; X32 is selected from D, A, E, and R; X33 is selected from K, H, A, Q, S, V, L, E, M, T, I, F, C, Y, R, N, and W; X34 is selected from E and T; X38 is selected from P, L, C, F, I, V, Y, T, M, H, S, Q, A, W, N, and K; X39 is selected from D, W, E, G, S, L, and Q; X40 is selected from Q, E, and D; X41 is selected from Q, Y, I, C, and V; X42 is selected from R, W, F, H, Y, N, C, and S; X44 is selected from I, A and T; X46 is selected from A, Q, and G; X48 is selected from K, T, M, I, Q, V, R, L, and N; X49 is selected from Q, S, L, M, P, E, V, A, D, I, C, G, and N; X51 is selected from E and D; X52 is selected from D and E; X54 is selected from R, Y, M, T, H, F, N, Q, K, and C; X55 is selected from T and R; X57 is selected from S, G, D, N, H, E, A, Q, M, R, and K; X58 is selected from D and S; X60 is selected from N, E, and Q; X61 is selected from I and L; X62 is selected from Q, L, T, V, C, A, M, I and S; X63 is selected from K, I, M, F, and V; X64 is selected from E, D, and S; X65 is selected from S, P, E, K, H, R, A, D, N, and Q; X66 is selected from T, K, R, and E; X67 is selected from L, H, K, R, S, M, C, Y, and T; X68 is selected from H, M, Q, and E; X69 is selected from L, P, R, A, G, C, F, M, and S; X70 is selected from V, L, M, F, and C; X73 is selected from L and M; and X74 is selected from R, Q, V, L, M, C, I, T, E, and K, and combinations thereof, provided that SEQ ID NOS:1-3 are excluded. In a second respect, the isolated polypeptide shares amino acid sequence identity in the range of at least 40% to 100% identity of SEQ ID NO:1. In a third respect, the isolated polypeptide shares amino acid sequence identity in the range of at least 50% to 100% identity of SEQ ID NO:1. In a fourth respect, the isolated polypeptide shares amino acid sequence identity in the range of at least 60% to 100% identity of SEQ ID NO:1. In a fifth respect, the isolated polypeptide shares amino acid sequence identity in the range of at least 70% to 100% identity of SEQ ID NO:1. In a sixth respect, the isolated polypeptide shares amino acid sequence identity in the range of at least 80% to 100% identity of SEQ ID NO:1. In a seventh respect, the isolated polypeptide shares amino acid sequence identity in the range of at least 90% to 100% identity of SEQ ID NO:1. In an eighth respect, the isolated polypeptide shares amino acid sequence identity in the range of at least 95% to 100% identity of SEQ ID NO:1.
In a second aspect, an isolated polypeptide comprising an isolated fusion polypeptide having an Ubv amino acid sequence with an N-terminal His6-tag is provided. The isolated fusion polypeptide comprises at least one member selected from the following: an isolated fusion polypeptide comprising SEQ ID NO: 1100, wherein X12 is selected from M, H, Y, W, Q, T, F, S, R, I, and N; X13 is selected from Q, L, I, and M; X17 is selected from K and R; X18 is selected from T, M, I, C, L, and V; X20 is selected from T, I, S, E and V; X23 is selected from T, M, and Y; X24 is selected from I, F, H and P; X25 is selected from T, E, D, H, and N; X27 is selected from E, M, T, N, Y, D, and H; X28 is selected from V and C; X29 is selected from E, M, Y, L, H, F, W, S, Q, T, C, N, R, and D; X30 is selected from P and K; X31 is selected from S, D, N, C, A, and W; X32 is selected from D and E; X36 is selected from N, V, I, E, G, M, Q, D, A, L, R, S, K, T, C, and F; X37 is selected from I, V, and L; X39 is selected from A, E, Q, W, I, M, and D; X40 is selected from K, M, L, R, Q, and H; X42 is selected from Q, C, F, W, H, Y, L, R, and M; X43 is selected from D, A, E, and R; X44 is selected from K, H, A, Q, S, V, L, E, M, T, I, F, C, Y, R, N, and W; X45 is selected from E and T; X49 is selected from P, L, C, F, I, V, Y, T, M, H, S, Q, A, W, N, and K; X50 is selected from D, W, E, G, S, L, and Q; X51 is selected from Q, E, and D; X52 is selected from Q, Y, I, C, and V; X53 is selected from R, W, F, H, Y, N, C, and S; X55 is selected from I, A and T; X57 is selected from A, Q, and G; X59 is selected from K, T, M, I, Q, V, R, L, and N; X60 is selected from Q, S, L, M, P, E, V, A, D, I, C, G, and N; X62 is selected from E and D; X63 is selected from D and E; X65 is selected from R, Y, M, T, H, F, N, Q, K, and C; X66 is selected from T and R; X68 is selected from S, G, D, N, H, E, A, Q, M, R, and K; X69 is selected from D and S; X71 is selected from N, E, and Q; X72 is selected from I and L; X73 is selected from Q, L, T, V, C, A, M, I and S; X74 is selected from K, I, M, F, and V; X75 is selected from E, D, and S; X76 is selected from S, P, E, K, H, R, A, D, N, and Q; X77 is selected from T, K, R, and E; X78 is selected from L, H, K, R, S, M, C, Y, and T; X79 is selected from H, M, Q, and E; X80 is selected from L, P, R, A, G, C, F, M, and S; X81 is selected from V, L, M, F, and C; X84 is selected from L and M; and X85 is selected from R, Q, V, L, M, C, I, T, E, and K, and combinations thereof, provided that SEQ ID NO: 3 is excluded; and an isolated fusion polypeptide comprising at least one member selected from SEQ ID NOS:235-244 and 246-449.
In a first respect, an isolated polypeptide comprising an isolated fusion polypeptide having an Ubv amino acid sequence with an N-terminal His6-tag is provided. The isolated fusion polypeptide comprises at least one member selected from the following: an isolated fusion polypeptide comprising SEQ ID NO: 1100, wherein X12 is selected from M, H, Y, W, Q, T, F, S, R, I, and N; X13 is selected from Q, L, I, and M; X17 is selected from K and R; X18 is selected from T, M, I, C, L, and V; X20 is selected from T, I, S, E and V; X23 is selected from T, M, and Y; X24 is selected from I, F, H and P; X25 is selected from T, E, D, H, and N; X27 is selected from E, M, T, N, Y, D, and H; X28 is selected from V and C; X29 is selected from E, M, Y, L, H, F, W, S, Q, T, C, N, R, and D; X30 is selected from P and K; X31 is selected from S, D, N, C, A, and W; X32 is selected from D and E; X36 is selected from N, V, I, E, G, M, Q, D, A, L, R, S, K, T, C, and F; X37 is selected from I, V, and L; X39 is selected from A, E, Q, W, I, M, and D; X40 is selected from K, M, L, R, Q, and H; X42 is selected from Q, C, F, W, H, Y, L, R, and M; X43 is selected from D, A, E, and R; X44 is selected from K, H, A, Q, S, V, L, E, M, T, I, F, C, Y, R, N, and W; X45 is selected from E and T; X49 is selected from P, L, C, F, I, V, Y, T, M, H, S, Q, A, W, N, and K; X50 is selected from D, W, E, G, S, L, and Q; X51 is selected from Q, E, and D; X52 is selected from Q, Y, I, C, and V; X53 is selected from R, W, F, H, Y, N, C, and S; X55 is selected from I, A and T; X57 is selected from A, Q, and G; X59 is selected from K, T, M, I, Q, V, R, L, and N; X60 is selected from Q, S, L, M, P, E, V, A, D, I, C, G, and N; X62 is selected from E and D; X63 is selected from D and E; X65 is selected from R, Y, M, T, H, F, N, Q, K, and C; X66 is selected from T and R; X68 is selected from S, G, D, N, H, E, A, Q, M, R, and K; X69 is selected from D and S; X71 is selected from N, E, and Q; X72 is selected from I and L; X73 is selected from Q, L, T, V, C, A, M, I and S; X74 is selected from K, I, M, F, and V; X75 is selected from E, D, and S; X76 is selected from S, P, E, K, H, R, A, D, N, and Q; X77 is selected from T, K, R, and E; X78 is selected from L, H, K, R, S, M, C, Y, and T; X79 is selected from H, M, Q, and E; X80 is selected from L, P, R, A, G, C, F, M, and S; X81 is selected from V, L, M, F, and C; X84 is selected from L and M; and X85 is selected from R, Q, V, L, M, C, I, T, E, and K, and combinations thereof, provided that SEQ ID NO: 3 is excluded. In a second respect, the isolated polypeptide of SEQ ID NO:1100 encompassing amino acid positions 12-85 shares amino acid sequence identity in the range of at least 40% to 100% identity of SEQ ID NO:1. In a third respect, the isolated polypeptide of SEQ ID NO:1100 encompassing amino acid positions 12-85 shares amino acid sequence identity in the range of at least 50% to 100% identity of SEQ ID NO:1. In a fourth respect, the isolated polypeptide of SEQ ID NO:1100 encompassing amino acid positions 12-85 shares amino acid sequence identity in the range of at least 60% to 100% identity of SEQ ID NO:1. In a fifth respect, the isolated polypeptide of SEQ ID NO:1100 encompassing amino acid positions 12-85 shares amino acid sequence identity in the range of at least 70% to 100% identity of SEQ ID NO:1. In a sixth respect, the isolated polypeptide of SEQ ID NO:1100 encompassing amino acid positions 12-85 shares amino acid sequence identity in the range of at least 80% to 100% identity of SEQ ID NO:1. In a seventh respect, the isolated polypeptide of SEQ ID NO:1100 encompassing amino acid positions 12-85 shares amino acid sequence identity in the range of at least 90% to 100% identity of SEQ ID NO:1. In an eighth respect, the isolated polypeptide of SEQ ID NO:1100 encompassing amino acid positions 12-85 shares amino acid sequence identity in the range of at least 95% to 100% identity of SEQ ID NO:1.
In a third aspect, an isolated polypeptide that enhances rates of HDR through interactions with 53BP1 in a manner to influence repair mechanisms at DSB sites is provided. The isolated polypeptide includes a Ubv having at least 40% amino acid sequence identity to amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having at least 40% amino acid sequence identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. The isolated polypeptide provides enhanced HDR activity through interactions with 53BP1 in a manner to influence repair mechanisms at DSB sites relative to SEQ ID NO:1 under identical conditions.
In a first respect, the isolated polypeptide includes a Ubv having at least 50% amino acid sequence identity to amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having at least 50% amino acid sequence identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. In a second respect, the isolated polypeptide includes a Ubv having at least 60% amino acid sequence identity to amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having at least 60% amino acid sequence identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. In a third respect, the isolated polypeptide includes a Ubv having at least 70% amino acid sequence identity to amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having at least 70% amino acid sequence identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. In a fourth respect, the isolated polypeptide includes a Ubv having at least 80% amino acid sequence identity to amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having at least 80% amino acid sequence identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. In a fifth respect, the isolated polypeptide includes a Ubv having at least 90% amino acid sequence identity to amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having at least 90% amino acid sequence identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded. 
In a sixth respect, the isolated polypeptide includes a Ubv having at least 95% amino acid sequence identity to amino acid positions 1-74 of SEQ ID NOS:1, 2, 482, 633, or 450, provided that SEQ ID NOS:1 and 2 are excluded, and those having at least 95% amino acid sequence identity with amino acid positions 12-85 of SEQ ID NOS: 3, 241, 417, or 1100, provided that SEQ ID NO:3 is excluded.
In a fourth aspect, an isolated polynucleotide is provided. The isolated polynucleotide encodes the isolated polypeptide of any of the first, second, or third aspects.
In a fifth aspect, an isolated polynucleotide encoding a ubiquitin polypeptide variant is provided. The isolated polynucleotide comprises at least one member selected from SEQ ID NOS:669-682, 885-890, and 892-1099, and the corresponding RNA counterparts thereof.
In a sixth aspect, a vector comprising an isolated polynucleotide encoding a ubiquitin polypeptide variant is provided. The isolated polynucleotide comprises at least one member selected from SEQ ID NOS:669-682, 885-890, and 892-1099, and the corresponding RNA counterparts thereof.
In a seventh aspect, a cell or cell line comprising the isolated polypeptide of the first, second, or third aspects, the isolated polynucleotide of the fourth or fifth aspects, or the vector of the sixth aspect is provided.
In an eighth aspect, a method of suppressing 53BP1 recruitment to DNA double-strand break sites in a cell is provided. The method includes a step of administering to the cell the isolated polypeptide of the first, second, or third aspects, the isolated polynucleotide of the fourth or fifth aspects, or the vector of the sixth aspect.
In a ninth aspect, a method of increasing homologous recombination in a cell is provided. The method includes a step of administering to the cell the isolated polypeptide of the first, second, or third aspects, the isolated polynucleotide of the fourth or fifth aspects, or the vector of the sixth aspect.
In a tenth aspect, a method of editing a gene in a cell using a CRISPR system is provided. The method includes a step of administering to the cell the isolated polypeptide of the first, second, or third aspects, the isolated polynucleotide of the fourth or fifth aspects, or the vector of the sixth aspect.
In an eleventh aspect, a method of gene targeting in a cell is provided. The method includes a step of administering to the cell the isolated polypeptide of the first, second, or third aspects, the isolated polynucleotide of the fourth or fifth aspects, or the vector of the sixth aspect.
In a twelfth aspect, a composition comprising the isolated polypeptide of the first, second, or third aspects is provided.
In a thirteenth aspect, a kit comprising the isolated polypeptide of the first, second, or third aspects, the isolated polynucleotide of the fourth or fifth aspects, or the vector of the sixth aspect is provided. In a first respect, the kit additionally includes one or more components of a gene editing system. In this regard, the gene editing system is a CRISPR system.
In a fourteenth aspect, a method of performing a medically therapeutic procedure is provided. The method includes the step of performing genome editing according to any of the tenth or eleventh aspects.
In a fifteenth aspect, a method of screening for amino acid changes in a first polypeptide that improve affinity of the first polypeptide for a second polypeptide is provided. The method includes a step of using the BACTH system with a reporter gene under control of a cAMP-regulated promoter to allow fluorescence-activated cell sorting based on protein-protein interaction affinity between the first polypeptide and the second polypeptide to screen for improved affinity variants of the first polypeptide.
The polypeptides and polynucleotides disclosed herein may be used in a broad spectrum of applications. The polypeptides and polynucleotides disclosed herein may be used for the detection and quantitative determination as well as for the separation and isolation of 53BP1. The polypeptides and polynucleotides disclosed herein may be used in genomic engineering, epigenomic engineering, genome targeting, and genome editing. The polypeptides and polynucleotides disclosed herein may be used to modify repair pathways, activate or stimulate HDR or homology-based genome editing, inhibit 53BP1 recruitment to DSB sites or damaged chromatin in a cell or modulate DNA end resection. In an aspect, the polypeptides and polynucleotides disclosed herein are used in combination with a gene editing system. The disclosure also provides the use of the polypeptides and polynucleotides disclosed herein as medicaments.
In order to identify mutations that improve the affinity of i53 for 53BP1, the bacterial adenylate cyclase two-hybrid system (BACTH system) was used to screen for interaction between the two proteins. This method makes use of a B. pertussis calmodulin-dependent adenylate cyclase toxin. The catalytic domain of the toxin can be separated into two fragments (T18 and T25) that are able to associate in the presence of calmodulin but have minimal activity in its absence [21, 22]. If bait and prey proteins fused to T18 and T25 interact, then the catalytic activity is restored and cAMP is produced. In E. coli, cAMP binds to catabolite activator protein (CAP), which acts as a transcriptional activator for several genes. By expressing these fusion proteins in an E. coli strain that lacks endogenous adenylate cyclase and naturally lacks calmodulin, cAMP-regulated protein expression can be used as a readout of bait-prey interaction [23]. We engineered the screen so that eGFP is expressed under the control of a cAMP-regulated promoter. The coding sequences for a fragment of 53BP1 (a.a. 1221-1718) containing the i53-interacting regions and for i53 were cloned into T18 and T25 adenylate cyclase expression plasmids such that fusion proteins of each would be expressed. If a Ubv interacts with 53BP1, the T18 and T25 fragments will be brought together, adenylate cyclase activity will be restored, cAMP will be produced, and some portion of the bacterial population will be GFP positive.
A plasmid library was made consisting of Ubv-adenylate cyclase fragment fusion protein plasmids that had, on average, a single codon within the i53 coding region exchanged for a random NNK codon. Plasmids were transformed into DHM1 cells that lack endogenous adenylate cyclase and contain the plasmid for expression of the 53BP1 fragment fused to one of the adenylate cyclase fragments. Expression of eGFP was used as a readout of bait-prey interaction using fluorescence activated cell sorting (FACS) to sort for GFP positive bacteria. Plasmid DNA was isolated from both the sorted GFP positive bacteria (Positive) and from the original pre-sort population (Input) and was sequenced using NGS. Counts were merged for mutations that result in the same amino acid change using Enrich2 [25]. Enrichment was calculated as enrichment = log2((read count for an amino acid change in the positive population / read count for that amino acid change in the input) / (synonymous change read count in the positive population / synonymous change read count in the input)). A positive enrichment value indicates that mutations resulting in a particular amino acid substitution result in a higher percent of GFP positive bacteria than synonymous mutations and therefore indicates that the amino acid change may improve i53 affinity for 53BP1. For each experiment, DHM1 cells were transformed with the Ubv fusion protein plasmid library in two separate replicates using a gene pulser (Bio-Rad). The i53-adenylate cyclase fragment fusion protein (published i53 peptide, SEQ ID NO:2) plasmid was also introduced separately as a control to estimate selection pressure. Cells were then grown and sorted using FACS and GFP positive cells were collected. Two separate experiments were conducted on separate days using different levels of selection pressure, resulting in a different percent GFP positive for the i53 population (i.e., for cells that express the published i53 peptide (SEQ ID NO:2) fused to one of the adenylate cyclase fragments). Experiment one had an i53 percent GFP positive of approximately 30 and experiment two had an i53 percent GFP positive of approximately 1700.
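As a non-limiting illustration, the enrichment calculation described above may be sketched as follows. The function name and the read counts are hypothetical and are provided only to show the arithmetic; they are not data from the screen, and this sketch is not the Enrich2 implementation.

```python
import math

def enrichment(pos_count, input_count, syn_pos_count, syn_input_count):
    """Return log2 of the variant positive/input ratio, normalized to the
    positive/input ratio observed for synonymous changes."""
    variant_ratio = pos_count / input_count
    synonymous_ratio = syn_pos_count / syn_input_count
    return math.log2(variant_ratio / synonymous_ratio)

# Hypothetical read counts, for illustration only.
score = enrichment(pos_count=400, input_count=100,
                   syn_pos_count=200, syn_input_count=100)
print(score)  # 1.0, i.e. two-fold enriched relative to synonymous changes
```

A value above zero corresponds to an amino acid change that is recovered in the GFP positive population more often than synonymous changes, and a value below zero corresponds to a change that is recovered less often.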
There was a high degree of correlation between the two experiments and between replicates (
152 | A46G
155 | K48M
171 | S49D
216 | L67K
227 | R74L
aNS means not significant;
In order to assess the effect of mutations identified from the two-hybrid screen on the affinity of the Ubvs for 53BP1, Ubvs consisting of the i53 sequence with an N-terminal His tag and short flexible linker plus individual or combinations of screen-identified mutations were purified from E. coli (Table 3). Biolayer interferometry was used to measure the affinity of the purified proteins. Briefly, a purified Ubv was diluted in reaction buffer (1×PBS pH 7.4, 0.1 mg/mL BSA, 0.001% Tween 20) to 2 μg/mL. Purified 53BP1 (amino acids 1484-1603) fused to MBP was diluted in reaction buffer to between 20 μM and 10 nM (Table 3, Table 4). For each Ubv, 8 Ni-NTA sensor tips were hydrated and then loaded with the Ubv at 2 μg/mL for 30 seconds. Sensor tips were then incubated in reaction buffer for 45 seconds to obtain a baseline. Tips were then moved into either empty buffer or one of seven different concentrations of purified 53BP1 and the association was measured. Tips were then moved back into reaction buffer and the dissociation was measured. Kon, Koff, and Kd were calculated using a 1:1 binding model using a global fit (Table 4).
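As a non-limiting illustration, the relationship among Kon, Koff, and Kd in a 1:1 binding model may be sketched as follows. This is the standard 1:1 (Langmuir) model written under stated assumptions; it is not the fitting routine of the instrument software, and all function names and rate constants are hypothetical.

```python
import math

def kd_from_rates(kon, koff):
    """Equilibrium dissociation constant (M) from kon (1/(M*s)) and koff (1/s)."""
    return koff / kon

def association(t, conc, kon, koff, rmax):
    """Sensor response at time t (s) during association at analyte
    concentration conc (M), starting from zero response."""
    kobs = kon * conc + koff                  # observed rate constant
    req = rmax * conc / (conc + koff / kon)   # equilibrium response
    return req * (1.0 - math.exp(-kobs * t))

def dissociation(t, r0, koff):
    """Sensor response at time t (s) after the tip returns to buffer,
    starting from response r0."""
    return r0 * math.exp(-koff * t)

# Hypothetical rate constants, for illustration only.
kon, koff = 1e5, 1e-3
print(kd_from_rates(kon, koff))
print(association(t=60.0, conc=1e-8, kon=kon, koff=koff, rmax=1.0))
```

In a global fit, a single Kon and Koff pair is fitted simultaneously to the association and dissociation curves recorded at all analyte concentrations, rather than fitting each concentration separately.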
The effect of individual mutations on the affinity of the Ubv for 53BP1 was found to correlate with the percent reporter positive cells measured from the high throughput screen (
aThe SEQ ID NOS shown in brackets correspond to the protein amino acid SEQ ID NO, followed by the DNA nucleic acid SEQ ID NO.
In order to test the effects of the improved affinity of the combination mutant Ubvs for 53BP1 on HDR, i53, CM1, and CM7 Ubvs were purified and used for testing in human cells (Table 3). The Ubvs were delivered alongside Cas9 V3 (IDT) RNP targeting a site in SERPINC1 with single-stranded Alt-R HDR Donor Oligos (IDT) to introduce an EcoR1 cut site sequence (GAATTC) at the Cas9 cut site upon successful HDR (Table 5, see methods described below). A range of Ubv doses was tested from 12.5 to 200 μM. The improved affinity ubiquitin variants required an approximately 10-fold lower dose for maximum effectiveness and the HDR rates were improved beyond what could be achieved with the i53 peptide (
Genome editing was mediated via IDT Alt-R Cas9 ribonucleoprotein (RNP) complexes delivered by Lonza nucleofection in concert with single-stranded oligodeoxynucleotide (ssODN) HDR repair templates. The specific repair event was the insertion of the 6-nt EcoR1 sequence (5′-GAATTC-3′) directly at the canonical Sp Cas9 cut site (between bases 3 and 4 in the 5′-direction from the PAM sequence). HDR complexes were formed with a nuclease-specific guide for the SERPINC1 gene (Table 5). The HDR template consisted of a chemically modified ssODN synthesized as an IDT Alt-R HDR Donor Oligo with the Alt-R modification. The sequence contains a 40-nt homology arm (HA) on the 5′-end, the 6-nt EcoR1 sequence in the center of the oligo, and a 40-nt HA on the 3′-end (Table 5). The 86-nt repair template was homologous to the non-targeting strand of dsDNA, where targeting/non-targeting is defined with respect to the guide RNA sequence and the presence of the PAM sequence identifying the targeting strand. The RNPs were generated by complexing IDT Alt-R Cas9 to IDT Alt-R sgRNA at a 1:1.2 ratio of protein to guide to give a final concentration of 2 μM Cas9 with 2.4 μM guide RNA, where final concentration refers to the concentration in the final mix of cells, protein, RNA, and DNA. The Ubv protein was added to the Cas9 RNP at varying amounts (200 μM down to 12.5 μM final concentration) along with donor DNA at a final concentration of 2 μM. Cas9 RNP, donor, and Ubv protein were delivered into HEK293 cells using the Lonza 96-well Shuttle and nucleofection protocol 96-DS-150. The cells were allowed to grow for 48 hours, after which genomic DNA was isolated using QuickExtract (Epicentre). HDR was measured by NGS.
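As a non-limiting illustration, the layout of the 86-nt repair template (a 40-nt homology arm, the 6-nt EcoR1 sequence, and a second 40-nt homology arm around a cut site located 3 nt 5′ of the PAM) may be sketched as follows. The function names, the placeholder sequence, and the PAM index are hypothetical; the placeholder is not the SERPINC1 locus, and strand selection and chemical modifications are not modeled.

```python
def cut_index_from_pam(pam_start: int) -> int:
    """0-based index of the first base 3' of the cut, for a protospacer on the
    given strand whose PAM begins at pam_start (cut is 3 nt 5' of the PAM)."""
    return pam_start - 3

def build_donor(genomic: str, cut_index: int,
                insert: str = "GAATTC", arm_len: int = 40) -> str:
    """Join the left arm, the insert, and the right arm around the cut site."""
    if cut_index - arm_len < 0 or cut_index + arm_len > len(genomic):
        raise ValueError("not enough flanking sequence for the requested arm length")
    left_arm = genomic[cut_index - arm_len:cut_index]
    right_arm = genomic[cut_index:cut_index + arm_len]
    return left_arm + insert + right_arm

# Hypothetical placeholder sequence and PAM position, for illustration only.
genomic = "ACGT" * 30
cut = cut_index_from_pam(pam_start=63)
donor = build_donor(genomic, cut)
print(len(donor))  # 86
```

Placing the insert exactly at the cut site keeps both homology arms contiguous with the sequence on either side of the break.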
Testing of additional combinations of mutations identified variants with improved affinity over the previous best variant, CM1. In order to further validate the amino acid changes identified in the two-hybrid screen as candidates for improving the affinity of our Ubvs for 53BP1, a subset of the top hits from the screen were individually added to i53. The results of this screen are shown in
The results of that experiment are shown in
To narrow down which variant may have the best activity in cells, CM138, CM142, CM143, CM147, CM149, and CM158 were selected for additional testing. The 53BP1-binding deficiency mutant amino acid substitutions (P69L and L70V) were added to CM142, CM143, CM147, CM149, and CM158 and the effect on affinity was measured using BLI11. The results are shown in
Screening of possible alternative mutations at positions mutated in i53 resulted in the identification of high affinity ubiquitin variants that do not include any of the mutations present in i53. Given the tolerance of CM142 for the DM mutations (
To determine if CM455 is able to enhance rates of HDR, we tested its ability to improve rates of HDR measured by introduction of an EcoR1 cut site sequence at SERPINC1 as described in Example 3 with the exception that editing was measured using next generation sequencing. The results are shown in
aThe SEQ ID NOS shown in brackets correspond to the protein amino acid SEQ ID NO, followed by the DNA nucleic acid SEQ ID NO.
To test whether ubiquitin variants targeting 53BP1 provide a benefit when used in conjunction with small molecule inhibitors reported to boost HDR, we tested whether the rate of HDR obtained using a DNA-dependent protein kinase (DNA-PK) inhibitor, IDT Enhancer (IDT-E or Alt-R HDR Enhancer), was further increased by using it in combination with CM1. DNA-PK is a critical protein complex in the NHEJ pathway. By inhibiting DNA-PK, these small molecules bias the cell towards use of homologous recombination instead of NHEJ to repair double-strand breaks induced by CRISPR/Cas9 and other nucleases, thereby facilitating gene editing. Notably, 53BP1 recruitment is not dependent on the kinase activity of DNA-PK; 53BP1 is instead recruited through an ATM-dependent pathway [29, 30]. Further, 53BP1 recruitment and formation of 53BP1 foci is often used to visualize the presence of double-strand breaks, including in the presence of DNA-PK inhibitors, which can cause 53BP1 foci to persist for a greater period due to inhibition of the normally rapid repair through the NHEJ pathway [27, 31]. We hypothesized that inhibition of 53BP1 may provide an additional benefit when used in conjunction with inhibitors of common NHEJ pathway targets such as DNA-PK and DNA-ligase IV, due to the ability of inhibitors of 53BP1 to enhance HDR not just through a negative effect on NHEJ but also by promoting HDR through facilitation of end resection.
We tested if our ubiquitin variants provided a further benefit over inhibition of common NHEJ pathway targets alone by using the DNA-PK inhibitor IDT enhancer (IDT-E) in combination with CM1 in the context of both large and small inserts (Table 7). The results are shown in
aThe SEQ ID NOS shown in brackets correspond to the protospacer SEQ ID NO, followed by the Donor Sequence SEQ ID NO.
Testing of additional mutations identified a variant with improved affinity over that of the previously described CM455. In order to determine if the amino acid change made at position 2 (L2M) in CM455 relative to i53 was the optimal amino acid change at that position, we screened additional amino acid changes for their effect on the affinity for binding 53BP1. The results are shown in
A tag-free version of CM1 (CM1tf, SEQ ID NO:482) was compared with the His6-tagged version of CM1 (SEQ ID NO:241) for the ability to enhance HDR in HEK293 cells, as described in previous examples. Briefly, 2 μM Cas9 RNP targeting a site in HPRT1 and 2 μM ssDNA donor containing 40 bp homology arms flanking a 6 bp EcoR1 cut site insert sequence were delivered into HEK293 cells with varying amounts of CM1tf (SEQ ID NO:482) or His-tagged CM1 (SEQ ID NO:241) using Lonza nucleofection. Genomic DNA was isolated after 48 hours, and editing was measured using an EcoR1 cleavage assay. The results are shown in
In order to test if CM1 is effective at increasing HDR rates when delivered in other forms, plasmid or mRNA encoding CM1 was introduced into cells and the effects on HDR rates were analyzed. To test the effectiveness of CM1 delivered as plasmid, 154 ng of plasmid encoding His-tagged i53, His-tagged CM1, or a crRNA for LbCas12a was co-delivered with 154 ng of plasmid encoding sgRNA targeting HPRT1 into Jurkat cells by Lonza nucleofection using SF buffer and program DS-150. After 72 hours, genomic DNA was extracted using QuickExtract (Lucigen) and editing was analyzed by PCR amplification of the HPRT1 target site followed by EcoR1 restriction enzyme digestion. Digested product was run on a Fragment Analyzer (AATI). The results are shown in
Use of plasmid encoding i53 or CM1 resulted in an increase in HDR rates, with CM1 causing a larger increase in HDR rate. In order to test if CM1 is effective when delivered as mRNA, mRNA encoding CM1tf or CM1tf protein (12.5 μM) was delivered with 2 μM Cas9 RNP targeting HPRT1 and 2 μM HPRT1 EcoR1 cut site ssDNA donor by Lonza nucleofection (SE solution, pulse code CL-120). The indicated mRNA concentration (6.56 nM) was calculated using the commonly used estimate of 40 μg/mL per OD260 unit for ssRNA. Using a sequence-specific extinction coefficient, the concentration was calculated as 4.61 nM. After 48 hours, genomic DNA was extracted and the rate of HDR was analyzed as described previously. The results are shown in
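As a non-limiting illustration, the two ways of estimating mRNA molar concentration mentioned above (a generic 40 μg/mL per OD260 unit conversion, and a sequence-specific extinction coefficient) may be sketched as follows. The function names are hypothetical, the average nucleotide mass of 320.5 g/mol is an assumed approximation that ignores end groups, and the transcript length and extinction coefficient must be supplied by the user because they are not stated here.

```python
def molar_from_mass_estimate(a260, length_nt,
                             ug_per_ml_per_od=40.0, avg_nt_mass=320.5):
    """Molar concentration (mol/L) from A260 using a generic mass conversion.

    avg_nt_mass is an assumed average mass per nucleotide (g/mol).
    """
    grams_per_liter = a260 * ug_per_ml_per_od * 1e-3  # ug/mL equals mg/L
    molecular_weight = length_nt * avg_nt_mass        # g/mol, end groups ignored
    return grams_per_liter / molecular_weight

def molar_from_extinction(a260, epsilon, path_cm=1.0):
    """Molar concentration (mol/L) from the Beer-Lambert law, with epsilon
    given in L/(mol*cm) for the specific sequence."""
    return a260 / (epsilon * path_cm)

# Hypothetical inputs, for illustration only.
print(molar_from_mass_estimate(a260=1.0, length_nt=1000))
print(molar_from_extinction(a260=1.0, epsilon=1.0e7))
```

The two estimates generally differ because the generic conversion assumes an average base composition, whereas a sequence-specific extinction coefficient reflects the actual composition of the transcript.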
Introduction of CM1tf as either protein or mRNA provided a similar level of boost in HDR rates over the no enhancer control. No additional benefit was observed when CM1tf mRNA and protein were added together; however, there may be some benefit to adding them in combination in other cell types or with other types of donor DNA. The CM1tf mRNA was generated from PCR product from a human codon optimized CM1tf expression vector (made by IDT) using the HiScribe T7 ARCA kit (NEB) and Monarch RNA cleanup columns (NEB). The poly-A tail was encoded in the PCR product by addition of a poly-T sequence to the reverse primer (Table 8).
GAAGTGGAACCCAGCGACACCATCGAGAACGTGAAGGCCAAAATCCAG
GACCACGAGGGCATCCCTCCTGACCAGCAGAGACTGGCCTTTCAGGGA
AAGTCCCTGGAAGATGGAAGAACCCTGAGCGACTACAACATCCTGAAG
GACCCTAAGAAGATGCCACTGCTGAGACTGAGATGATCAGCCTCGACT
A summary of amino acid and DNA sequences is presented in Table 9.
aThe SEQ ID NOS shown in brackets correspond to the protein amino acid SEQ ID NO, followed by the DNA nucleic acid SEQ ID NO.
To aid in understanding the invention, several terms are defined below.
The use of the terms “a” and “an” and “the” and similar referents in the context of describing the invention (especially in the context of the following claims) is to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to,”) unless otherwise noted. Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.
The term “CRISPR” refers to Clustered Regularly Interspaced Short Palindromic Repeat bacterial adaptive immune system.
The terms “Cas” and “Cas endonuclease” generally refer to a CRISPR-associated endonuclease.
The term “Cas protein” generally refers to a wild-type protein, including a variant thereof, of a CRISPR-associated endonuclease (including the interchangeable terms Cas and Cas endonuclease).
The term “Cas nucleic acid” generally refers to a nucleic acid of a CRISPR-associated endonuclease, including a guide RNA, sgRNA, crRNA, or tracrRNA.
The terms “Cas9” and “CRISPR/Cas9” refer to the CRISPR-associated bacterial adaptive immune system of Streptococcus pyogenes. Examples of this system are disclosed in U.S. patent application Ser. Nos. 15/729,491 and 15/964,041, filed Oct. 10, 2017 and Apr. 26, 2018, respectively (Attorney Docket Nos. IDT01-009-US and IDT01-009-US-CIP, respectively), the contents of which are incorporated by reference herein.
The terms “AsCas12a” and “CRISPR/AsCas12a” refer to the CRISPR-associated bacterial adaptive immune system of Acidaminococcus sp. Examples of this system are disclosed in U.S. patent application Ser. No. 16/536,256, filed Aug. 8, 2019, (Attorney Docket No. IDT01-013-US), the contents of which are incorporated by reference herein.
The terms “LbCas12a” and “CRISPR/LbCas12a” refer to the CRISPR-associated bacterial adaptive immune system of Lachnospiraceae bacterium. Examples of this system are disclosed in U.S. Patent Application Ser. No. 63/018,592, filed May 1, 2020, (Attorney Docket No. IDT01-017-PRO), the contents of which are incorporated by reference herein.
The term “variant,” as that term modifies a protein (for example, ubiquitin), refers to a protein that includes at least one amino acid substitution of the reference, typically wild-type, protein amino acid sequence, additional amino acids (for example, such as an affinity tag or nuclear localization signal), or a combination thereof.
The term “polypeptide” refers to any linear or branched peptide comprising more than one amino acid. Polypeptide includes protein or fragment thereof or fusion thereof, provided such protein, fragment or fusion retains a useful biochemical or biological activity. In terms of manufacturing methods, “polypeptide” refers to synthetic polypeptides that may be produced by chemical means as well as polypeptides expressed from translation in vitro or in vivo.
The terms “fusion protein” and “fusion polypeptide” are interchangeable and typically include extra amino acid information that is not native to the protein to which the extra amino acid information is covalently attached. Such extra amino acid information may include tags that enable purification or identification of the fusion protein. Such extra amino acid information may include peptides that enable the fusion proteins to be transported into cells and/or transported to specific locations within cells. Examples of tags for these purposes include affinity tags and nuclear localization signals (NLS), such as those obtained from SV40, which allow proteins to be transported to the nucleus immediately upon entering the cell. Given that the native Cas9 protein is bacterial in origin and therefore does not naturally comprise an NLS motif, addition of one or more NLS motifs to the recombinant Cas9 protein is expected to show improved genome editing activity when used in eukaryotic cells where the target genomic DNA substrate resides in the nucleus. One skilled in the art would appreciate these various fusion tag technologies, as well as how to make and use fusion proteins that include them.
The terms “Ubiquitin” and “human Ubiquitin” refer to the wild-type Ubiquitin polypeptide amino acid sequence (SEQ ID NO:1).
The terms “i53,” “i53 Ubiquitin,” and “Ubiquitin i53” refer to a ubiquitin variant polypeptide amino acid sequence (SEQ ID NO:2) that lacks the carboxy terminal di-glycine of the wild-type Ubiquitin polypeptide and includes several amino acid substitutions (Q2L, I44A, Q49S, Q62L, E64D, T66K, L69P, and V70L) relative to the wild-type Ubiquitin polypeptide.
The terms “polynucleotide” and “nucleic acid” are interchangeable and refer to synthetic DNA or synthetic RNA, including synthetic mRNA, as well as RNA, including mRNA that may be expressed from DNA or from a vector in vitro or in vivo. The SEQ ID NOS of polynucleotides have been presented in DNA form, without limitation, as the corresponding RNA versions, including mRNA versions of those sequences, may be readily deduced by one skilled in the art. Accordingly, while the SEQ ID NOS of polynucleotides formally define DNA sequences, such SEQ ID NOS implicitly encompass the RNA sequence counterparts of those DNA sequences as well.
One of ordinary skill in the art would appreciate that an isolated polypeptide or isolated polynucleotide comprising a particular SEQ ID NO will encompass the particular amino acid or nucleotide sequence defined by the SEQ ID NO as well as include any additional amino acid or nucleotide information not included within the given SEQ ID NO.
All references, including publications, patent applications, and patents, cited herein are hereby incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and were set forth in its entirety herein.
Preferred embodiments of this invention are described herein, including the best mode known to the inventors for carrying out the invention. Variations of those preferred embodiments may become apparent to those of ordinary skill in the art upon reading the foregoing description.
The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than as specifically described herein. Accordingly, this invention includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.
This application claims benefit of priority under 35 U.S.C. 119 to U.S. Provisional Patent Application Ser. No. 63/248,300, filed Sep. 24, 2021, U.S. Provisional Patent Application Ser. No. 63/278,155, filed Nov. 11, 2021, and U.S. Provisional Patent Application Ser. No. 63/321,384, filed Mar. 18, 2022, wherein each application is entitled “UBIQUITIN VARIANTS WITH IMPROVED AFFINITY FOR 53BP1,” the contents of each of which are herein incorporated by reference in their entirety.
Number | Date | Country | |
---|---|---|---|
63321384 | Mar 2022 | US | |
63278155 | Nov 2021 | US | |
63248300 | Sep 2021 | US |