The invention relates to a method for engineering I-CreI homing endonuclease variants able to cleave mutant I-CreI sites having variation in positions ±8 to ±10. The invention relates also to an I-CreI homing endonuclease variant obtainable by said method, to a vector encoding said variant, to a cell, an animal or a plant modified by said vector and to the use of said I-CreI endonuclease variant and derived products for genetic engineering, genome therapy and antiviral therapy.
Meganucleases are by definition sequence-specific endonucleases with large (>14 bp) cleavage sites that can deliver DNA double-strand breaks (DSBs) at specific loci in living cells (Thierry and Dujon, Nucleic Acids Res., 1992, 20, 5625-5631). Meganucleases have been used to stimulate homologous recombination in the vicinity of their target sequences in cultured cells and plants (Rouet et al., Mol. Cell. Biol., 1994, 14, 8096-106; Choulika et al., Mol. Cell. Biol., 1995, 15, 1968-73; Donoho et al., Mol. Cell. Biol, 1998, 18, 4070-8; Elliott et al., Mol. Cell. Biol., 1998, 18, 93-101; Sargent et al., Mol. Cell. Biol., 1997, 17, 267-77; Puchta et al., Proc. Natl. Acad. Sci. USA, 1996, 93, 5055-60; Chiurazzi et al., Plant Cell, 1996, 8, 2057-2066), making meganuclease-induced recombination an efficient and robust method for genome engineering. The use of meganuclease-induced recombination has long been limited by the repertoire of natural meganucleases, and the major limitation of the current technology is the requirement for the prior introduction of a meganuclease cleavage site in the locus of interest.
Thus, the making of artificial meganucleases with tailored substrate specificities is under intense investigation. Such proteins could be used to cleave genuine chromosomal sequences and open new perspectives for genome engineering in a wide range of applications. For example, meganucleases could be used to induce the correction of mutations linked with monogenic inherited diseases, and bypass the risk due to the randomly inserted transgenes used in current gene therapy approaches (Hacein-Bey-Abina et al., Science, 2003, 302, 415-419).
Recently, Zinc-Finger DNA binding domains of Cys2-His2 type Zinc-Finger Proteins (ZFP) could be fused with the catalytic domain of the FokI endonuclease, to induce recombination in various cell types, including human lymphoid cells (Smith et al., Nucleic Acids Res, 1999, 27, 674-81; Pabo et al., Annu. Rev. Biochem, 2001, 70, 313-40; Porteus and Baltimore, Science, 2003, 300, 763; Urnov et al., Nature, 2005, 435, 646-651; Bibikova et al., Science, 2003, 300, 764). The binding specificity of ZFPs is relatively easy to manipulate, and a repertoire of novel artificial ZFPs, able to bind many (g/a)nn(g/a)nn(g/a)nn sequences is now available (Pabo et al., precited; Segal and Barbas, Curr. Opin. Biotechnol., 2001, 12, 632-7; Isalan et al., Nat. Biotechnol., 2001, 19, 656-60). However, preserving a very narrow substrate specificity is one of the major issues for genome engineering applications, and presently it is unclear whether ZFPs would fulfill the very strict requirements for therapeutic applications. Furthermore, these fusion proteins have demonstrated high toxicity in cells (Porteus and Baltimore, precited; Bibikova et al., Genetics, 2002, 161, 1169-1175), probably due to a low level of specificity.
In nature, meganucleases are essentially represented by homing endonucleases (HEs), a family of endonucleases encoded by mobile genetic elements, whose function is to initiate DNA double-strand break (DSB)-induced recombination events in a process referred to as homing (Chevalier and Stoddard, Nucleic Acids Res., 2001, 29, 3757-74; Kostriken et al., Cell, 1983, 35, 167-74; Jacquier and Dujon, Cell, 1985, 41, 383-94). Several hundreds of HEs have been identified in bacteria, eukaryotes, and archaea (Chevalier and Stoddard, precited); however the probability of finding a HE cleavage site in a chosen gene is very low.
Given their biological function and their exceptional cleavage properties in terms of efficacy and specificity, HEs provide ideal scaffolds to derive novel endonucleases for genome engineering. Data have been accumulated over the last decade, characterizing the LAGLIDADG family, the largest of the four HE families (Chevalier and Stoddard, precited). LAGLIDADG refers to the only sequence actually conserved throughout the family and is found in one or (more often) two copies in the protein. Proteins with a single motif, such as I-CreI, form homodimers and cleave palindromic or pseudo-palindromic DNA sequences, whereas the larger, double motif proteins, such as I-SceI, are monomers and cleave non-palindromic targets. Seven different LAGLIDADG proteins have been crystallized, and they exhibit a very striking conservation of the core structure, that contrasts with the lack of similarity at the primary sequence level (Jurica et al., Mol. Cell., 1998, 2, 469-76; Chevalier et al., Nat. Struct. Biol., 2001, 8, 312-6; Chevalier et al. J. Mol. Biol., 2003, 329, 253-69; Moure et al., J. Mol. Biol, 2003, 334, 685-95; Moure et al., Nat. Struct. Biol., 2002, 9, 764-70; Ichiyanagi et al., J. Mol. Biol., 2000, 300, 889-901; Duan et al., Cell, 1997, 89, 555-64; Bolduc et al., Genes Dev., 2003, 17, 2875-88; Silva et al., J. Mol. Biol., 1999, 286, 1123-36). In this core structure, two characteristic αββαββα folds, also called LAGLIDADG homing endonuclease core domains, contributed by two monomers, or by two domains in double LAGLIDADG proteins, are facing each other with a two-fold symmetry. DNA binding depends on the four β strands from each domain, folded into an antiparallel β-sheet, and forming a saddle on the DNA helix major groove.
Analysis of I-CreI structure bound to its natural target shows that in each monomer, eight residues (Y33, Q38, N30, K28, Q26, Q44, R68 and R70) establish direct interaction with seven bases at positions ±3, 4, 5, 6, 7, 9 and 10 (Jurica et al., 1998, precited). In addition, some residues establish water-mediated contact with several bases; for example S40, K28 and N30 with the base pair at position +8 and −8 (Chevalier et al., 2003, precited). The catalytic core is central, with a contribution of both symmetric monomers/domains. In addition to this core structure, other domains can be found: for example, PI-SceI, an intein, has a protein splicing domain, and an additional DNA-binding domain (Moure et al., 2002, precited; Grindl et al., Nucleic Acids Res., 1998, 26, 1857-62).
Two approaches for deriving novel meganucleases from homing endonucleases are under investigation:
Hybrid or Chimeric Single-Chain Proteins
New meganucleases could be obtained by swapping LAGLIDADG homing endonuclease core domains of different monomers (Epinat et al., Nucleic Acids Res., 2003, 31, 2952-62; Chevalier et al., Mol. Cell., 2002, 10, 895-905; Steuer et al., Chembiochem., 2004, 5, 206-13; International PCT Applications WO 03/078619 and WO 2004/031346). These single-chain chimeric meganucleases, wherein the two LAGLIDADG homing endonuclease core domains from different meganucleases are linked by a spacer, are able to cleave the hybrid target corresponding to the fusion of the two half parent DNA target sequences. These results mean that the two DNA binding domains of an I-CreI dimer behave independently; each DNA binding domain binds a different half of the DNA target site. The construction of chimeric and single chain artificial HEs has suggested that a combinatorial approach could be used to obtain novel meganucleases cleaving novel (non-palindromic) target sequences: different monomers or core domains could be fused in a single protein, to achieve novel specificities.
However, this approach does not enrich considerably the number of DNA sequences that can be targeted with homing endonucleases since the novel targets which are generated result from the combination of two different DNA target half-sites.
Protein Variants
Altering the substrate specificity of DNA binding proteins by mutagenesis and screening/selection has often proven to be difficult (Lanio et al., Protein Eng., 2000, 13, 275-281; Voziyanov et al., J. Mol. Biol., 2003, 326, 65-76; Santoro et al., P.N.A.S., 2002, 99, 4185-4190; Buchholz and Stewart, Nat. Biotechnol., 2001, 19, 1047-1052), and more particularly, engineering HEs DNA binding domain has long been considered a daunting task (Ashworth et al., Nature 2006, 441, 656-659; Gimble et al., J. Mol. Biol., 2003, 334, 993-1008; Arnould et al., J. Mol. Biol., 2006, 355, 443-458; Doyon et al., J. Am. Chem. Soc., 2006, 128, 2477-2484; Steuer et al., precited; Seligman et al., Nucleic Acids Res., 2002, 30, 3870-3879).
Analysis of the I-CreI/DNA crystal structure indicates that 9 amino acids make direct contacts with the homing site (Chevalier et al., 2003; Jurica et al., precited), whose randomization would result in 20⁹ combinations, a number beyond any screening capacity today.
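The scale of such a full randomization can be checked with a short calculation. The sketch below assumes 20 standard amino acids at each randomized position; the function name `library_size` and the three-position comparison are illustrative and are not taken from the cited references.

```python
# Number of distinct protein variants when `positions` residues are each
# randomized over `alphabet` amino acids (20 standard amino acids assumed).
def library_size(positions: int, alphabet: int = 20) -> int:
    return alphabet ** positions


# Full randomization of the 9 DNA-contacting residues.
full_space = library_size(9)
print(f"{full_space:,}")  # 512,000,000,000

# Hypothetical comparison: restricting the library to 3 chosen residues.
restricted_space = library_size(3)
print(f"{restricted_space:,}")  # 8,000
```

The gap between the two figures illustrates why restricting the number of randomized residues is a natural way to keep a library within screening capacity.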
Therefore, several laboratories have relied on a semi-rational approach (Chica et al., Curr. Opin. Biotechnol., 2005, 16, 378-384) to limit the diversity of the mutant libraries to be handled: a small set of relevant residues is chosen according to structural data. Nevertheless, this was still not sufficient to create redesigned endonucleases cleaving chosen sequences.
To reach a larger number of sequences, it would be extremely valuable to be able to generate other homing endonuclease variants with novel substrate specificity, i.e., able to cleave DNA targets which are not cleaved by the parent homing endonuclease or the few variants which have been isolated so far.
In particular, it would be extremely valuable to generate homing endonuclease variants able to cleave novel DNA targets wherein several nucleotides of the wild-type meganuclease DNA target have been mutated simultaneously.
However, this approach is not easy since the HE DNA binding interface is very compact and the two different ββ hairpins which are responsible for virtually all base-specific interactions are part of a single fold. Thus, the mutation of several amino acids placed in close vicinity, which is required for binding a target mutated at several positions, may disrupt the structure of the binding interface.
The Inventor has engineered hundreds of novel I-CreI variants which, altogether, target all of the 64 possible mutant I-CreI sites differing at positions ±10, ±9, and ±8. These variants having new substrate specificity towards nucleotides ±8, ±9, and/or ±10, increase the number of DNA sequences that can be targeted with meganucleases. Potential applications include genetic engineering, genome engineering, gene therapy and antiviral therapy.
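The figure of 64 corresponds to the number of nucleotide triplets that can occupy positions −10 to −8 (three positions, four nucleotides each). The following sketch enumerates them; it assumes, consistently with the I-CreI site described in this document, that the unmutated triplet is aaa, so that 63 of the 64 triplets actually differ from it. All names are illustrative.

```python
from itertools import product

NUCLEOTIDES = "acgt"


def all_triplets():
    """Return every nucleotide triplet for positions -10, -9 and -8."""
    return ["".join(p) for p in product(NUCLEOTIDES, repeat=3)]


triplets = all_triplets()

# Assumption: "aaa" is the unmutated triplet at positions -10 to -8.
mutant_triplets = [t for t in triplets if t != "aaa"]

print(len(triplets))         # 64
print(len(mutant_triplets))  # 63
```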
Thus, the invention concerns a method for engineering an I-CreI homing endonuclease variant having a modified cleavage specificity, comprising at least the steps of:
(a) replacing at least one of the amino acids K28, N30, Y33, Q38 and/or S40 from the β1β2 hairpin of I-CreI, with an amino acid selected from the group consisting of A, C, D, E, G, H, K, N, P, Q, R, S, T, L, V, W, and Y, and
(b) selecting and/or screening the I-CreI variants from step (a) which are able to cleave a DNA target sequence consisting of a mutant I-CreI site wherein at least the aa nucleotide doublet in positions −9 to −8 and/or the tt nucleotide doublet in positions +8 to +9 has been replaced with a different nucleotide doublet.
According to the method of the invention, the amino acid mutation(s) in step a) are introduced in either wild-type I-CreI or a functional variant thereof. Step a) may comprise the introduction of additional mutations, particularly at other positions contacting the DNA target sequence or interacting directly or indirectly with said DNA target. Functional variants comprise mutations that do not affect the protein structure. For example, the amino acid mutations in step (a) may be introduced in an I-CreI variant comprising one or more mutations selected from the group consisting of:
Step a) may be performed by generating a library of variants as described in the International PCT Application WO 2004/067736.
The selection and/or screening in step (b) may be performed by using a cleavage assay in vitro or in vivo, as described in the International PCT Application WO 2004/067736.
According to an advantageous embodiment of said method, the DNA target in step b) derives from an I-CreI site which is selected from C1234, C4334 and C1221 (SEQ ID NO: 1 to 3).
According to another advantageous embodiment of said method, the DNA target in step b) comprises a sequence having the formula:
c₋₁₁n₋₁₀n₋₉n₋₈m₋₇y₋₆n₋₅n₋₄n₋₃k₋₂y₋₁r₊₁m₊₂n₊₃n₊₄n₊₅n₊₆k₊₇n₊₈n₊₉n₊₁₀g₊₁₁ (I),
wherein n is a, t, c, or g, m is a or c, y is c or t, k is g or t, r is a or g (SEQ ID NO: 75), provided that when n₋₉n₋₈ is aa then n₊₈n₊₉ is different from tt and when n₊₈n₊₉ is tt, then n₋₉n₋₈ is different from aa.
According to a preferred embodiment of said method, n₋₅n₋₄n₋₃ is gtc and/or n₊₃n₊₄n₊₅ is gac.
The DNA target in step b) may be palindromic, non-palindromic or pseudo-palindromic. Preferably, the nucleotide sequence from positions −11 to −8 and +8 to +11 and/or the nucleotide sequence from positions −5 to −3 and/or +3 to +5 are palindromic.
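As an illustration of how a candidate 22-nucleotide sequence can be checked against formula (I), a minimal sketch is given below. It encodes the degenerate symbols n, m, y, k and r as defined for formula (I) and applies the proviso on the doublets at positions −9 to −8 and +8 to +9. The function name and the example sequences, including the central scaffold `core`, are assumptions chosen for illustration only.

```python
# Degenerate nucleotide symbols as defined for formula (I).
CODES = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "n": "acgt", "m": "ac", "y": "ct", "k": "gt", "r": "ag",
}

# Formula (I): positions -11..-1 followed by positions +1..+11.
FORMULA_I = "cnnnmynnnky" + "rmnnnnknnng"


def matches_formula_i(seq: str) -> bool:
    """True if seq fits formula (I), including the aa/tt proviso."""
    seq = seq.lower()
    if len(seq) != len(FORMULA_I):
        return False
    if any(base not in CODES[code] for base, code in zip(seq, FORMULA_I)):
        return False
    left = seq[2:4]     # positions -9 and -8
    right = seq[18:20]  # positions +8 and +9
    # Proviso: aa at -9/-8 and tt at +8/+9 may not occur together.
    return not (left == "aa" and right == "tt")


# Illustrative sequences (assumptions): a scaffold for positions -7..+7
# with gtc at -5..-3 and gac at +3..+5.
core = "acgtcgtacgacgt"
unmutated = "c" + "aaa" + core + "ttt" + "g"
mutated_both = "c" + "ggg" + core + "ccc" + "g"
mutated_left = "c" + "agg" + core + "ttt" + "g"

print(matches_formula_i(unmutated))     # False (excluded by the proviso)
print(matches_formula_i(mutated_both))  # True
print(matches_formula_i(mutated_left))  # True
```

A target mutated on one side only still satisfies the proviso, which is consistent with the statement that the target may be non-palindromic.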
According to another advantageous embodiment of said method, the DNA target in step b) comprises a nucleotide doublet in positions −9 to −8, which is selected from the group consisting of: ag, at, ac, ga, gg, gt, gc, ta, tg, tt, cg, ct, or cc, and/or a nucleotide doublet in positions +8 to +9, which is the reverse complementary sequence of said nucleotide doublet in positions −9 to −8, i.e., ct, ta, gt, tc, cc, ac, gc, at, ca, aa, cg, ag or gg.
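The relationship between the two doublet lists can be verified with a small reverse-complement helper, sketched below with illustrative names. Computed element by element, at and ta are each their own reverse complement, so the computed order differs from the listed order at those two entries, while the two groups contain the same thirteen doublets.

```python
# Translation table for complementing lowercase DNA.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Doublets listed for positions -9 to -8.
MINUS_DOUBLETS = ["ag", "at", "ac", "ga", "gg", "gt", "gc",
                  "ta", "tg", "tt", "cg", "ct", "cc"]
# Doublets listed for positions +8 to +9.
PLUS_DOUBLETS = ["ct", "ta", "gt", "tc", "cc", "ac", "gc",
                 "at", "ca", "aa", "cg", "ag", "gg"]

computed = [reverse_complement(d) for d in MINUS_DOUBLETS]
print(computed)
print(set(computed) == set(PLUS_DOUBLETS))  # True
```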
According to another advantageous embodiment of said method, the DNA target in step b) further comprises the replacement of the a nucleotide in position −10 and/or the t nucleotide in position +10 of the I-CreI site, with a different nucleotide.
Preferably, said DNA target comprises a nucleotide triplet in positions −10 to −8, which is selected from the group consisting of: aac, aag, aat, acc, acg, act, aga, agc, agg, agt, ata, atg, cag, cga, cgg, ctg, gac, gag, gat, gcc, gga, ggc, ggg, ggt, gta, gtg, gtt, tac, tag, tat, tcc, tga, tgc, tgg, tgt or ttg, and/or a nucleotide triplet in positions +8 to +10, which is the reverse complementary sequence of said nucleotide triplet in positions −10 to −8.
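By way of illustration, a palindromic 22-nucleotide target can be assembled from a chosen triplet at positions −10 to −8 and its reverse complement at positions +8 to +10. In the sketch below, the flanking c and g and the central scaffold `CORE` (positions −7 to +7) are assumptions chosen so that the result is palindromic and carries gtc at −5 to −3 and gac at +3 to +5; they are not quoted from the sequence listing.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical palindromic scaffold for positions -7 to +7 (assumption).
CORE = "acgtcgtacgacgt"


def palindromic_target(triplet: str) -> str:
    """Build c + triplet(-10..-8) + CORE + revcomp(triplet)(+8..+10) + g."""
    if len(triplet) != 3 or set(triplet) - set("acgt"):
        raise ValueError("triplet must be three of a, c, g, t")
    return "c" + triplet + CORE + reverse_complement(triplet) + "g"


target = palindromic_target("ggg")
print(target)                                # cgggacgtcgtacgacgtcccg
print(target == reverse_complement(target))  # True
```

Because the scaffold is its own reverse complement, any triplet yields a target that reads identically on both strands.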
According to another advantageous embodiment of said method, step (b) is performed in vivo, under conditions where the double-strand break in the mutated DNA target sequence which is generated by said variant leads to the activation of a positive selection marker or a reporter gene, or the inactivation of a negative selection marker or a reporter gene, by recombination-mediated repair of said DNA double-strand break.
For example, the cleavage activity of the I-CreI variant of the invention may be measured by a direct repeat recombination assay, in yeast or mammalian cells, using a reporter vector, as described in the PCT Application WO 2004/067736. The reporter vector comprises two truncated, non-functional copies of a reporter gene (direct repeats) and a DNA target sequence within the intervening sequence, cloned in a yeast or a mammalian expression vector.
According to another advantageous embodiment of said method, it comprises a further step c1) of expressing one variant obtained in step b), so as to allow the formation of homodimers.
According to another advantageous embodiment of said method, it comprises a further step c2) of co-expressing one variant obtained in step b) and I-CreI or a functional variant thereof, so as to allow the formation of heterodimers. Preferably, two different variants obtained in step b) are co-expressed.
For example, host cells may be modified by one or two recombinant expression vector(s) encoding said variant(s). The cells are then cultured under conditions allowing the expression of the variant(s) and the homodimers/heterodimers which are formed are then recovered from the cell culture.
According to the method of the invention, single-chain chimeric endonucleases may be constructed by the fusion of one variant obtained in step b) with a homing endonuclease domain/monomer. Said domain/monomer may be from a wild-type homing endonuclease or a functional variant thereof.
Methods for constructing single-chain chimeric molecules derived from homing endonucleases are well-known in the art (Epinat et al., Nucleic Acids Res., 2003, 31, 2952-62; Chevalier et al., Mol. Cell., 2002, 10, 895-905; Steuer et al., Chembiochem., 2004, 5, 206-13; International PCT Applications WO 03/078619 and WO 2004/031346). Any of such methods may be applied for constructing single-chain chimeric molecules derived from the variants as defined in the present invention.
The subject matter of the present invention is also a I-CreI meganuclease variant obtainable by the method as defined above, said variant being able to cleave a DNA target sequence consisting of a mutant I-CreI site wherein at least one of the a nucleotides in position −9 and −8, or one of the t nucleotides in position +8 and +9 has been replaced with a different nucleotide.
According to an advantageous embodiment of said I-CreI variant, it has an arginine (R) or a lysine (K) in position 38; the variants having R or K in position 38 are able to cleave a DNA target comprising a guanine in position −9 or a cytosine in position +9.
Said variant may be selected from the variants having amino acid residues in positions 28, 30, 33, 38 and 40 respectively, which are selected from the group consisting of: Q28/N30/Y33/K38/R40, R28/N30/K33/R38/Q40, Q28/N30/R33/R38/R40, Q28/N30/Y33/K38/K40, K28/N30/T33/R38/Q40, K28/N30/S33/R38/E40, S28/N30/Y33/R38/K40, K28/N30/S33/R38/D40, K28/N30/S33/R38/S40, Q28/N30/Y33/R38/K40, Q28/N30/K33/R38/T40, N28/N30/S33/R38/K40, N28/N30/S33/R38/R40, E28/N30/R33/R38/K40, R28/N30/T33/R38/A40, Q28/N30/Y33/R38/A40, Q28/N30/Y33/R38/S40, K28/N30/R33/K38/A40, R28/N30/A33/K38/S40, A28/N30/N33/R38/K40, Q28/N30/S33/R38/K40, K28/A30/H33/R38/S40, K28/H30/H33/R38/S40, K28/E30/S33/R38/S40, K28/N30/H33/R38/S40, K28/D30/H33/K38/S40, K28/K30/H33/R38/S40, K28/S30/H33/R38/S40, and K28/G30/V33/R38/S40.
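The variant notation used here (one-letter residue followed by its position, separated by slashes) lends itself to a simple check of the residue in position 38. The following sketch, with illustrative function names, parses such a label, rejects malformed tokens, and tests for arginine (R) or lysine (K) in position 38. The third example uses the residues K28, N30, Y33, Q38 and S40 named in step (a) as those of I-CreI.

```python
import re

# One token of the notation: a one-letter residue followed by a position.
TOKEN = re.compile(r"([A-Z])(\d+)")


def parse_variant(label: str) -> dict:
    """Map each position to its residue; raise ValueError if malformed."""
    residues = {}
    for token in label.split("/"):
        match = TOKEN.fullmatch(token.strip())
        if match is None:
            raise ValueError(f"malformed token: {token!r}")
        residues[int(match.group(2))] = match.group(1)
    return residues


def has_r_or_k_at_38(label: str) -> bool:
    """True if the variant carries R or K in position 38."""
    return parse_variant(label).get(38) in ("R", "K")


examples = [
    "Q28/N30/Y33/K38/R40",
    "R28/N30/K33/R38/Q40",
    "K28/N30/Y33/Q38/S40",  # residues of I-CreI as named in step (a)
]
selected = [v for v in examples if has_r_or_k_at_38(v)]
print(selected)  # ['Q28/N30/Y33/K38/R40', 'R28/N30/K33/R38/Q40']
```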
According to another advantageous embodiment of said I-CreI variant, it has amino acid residues in positions 28, 30, 33, 38 and 40 respectively, which are selected from the group consisting of: Q28/N30/Y33/K38/R40, R28/N30/K33/R38/Q40, Q28/N30/R33/R38/R40, Q28/N30/Y33/K38/K40, Q28/N30/T33/Q38/K40, Q28/N30/R33/R38/K40, K28/N30/T33/Q38/R40, S28/N30/R33/S38/R40, N28/N30/Y33/Q38/R40, K28/N30/T33/R38/Q40, K28/N30/S33/R38/E40, Q28/N30/N33/Q38/K40, S28/N30/Y33/R38/K40, K28/N30/S33/R38/D40, K28/N30/R33/E38/R40, K28/N30/S33/R38/K40, R28/N30/R33/D38/R40, A28/N30/S33/Q38/R40, Q28/N30/Y33/R38/K40, Q28/N30/K33/R38/T40, R28/N30/A33/Y38/Q40, K28/N30/R33/Q38/E40, N28/N30/S33/R38/K40, N28/N30/S33/R38/R40, Q28/N30/Y33/Q38/K40, Q28/N30/Y33/Q38/R40, S28/N30/R33/Q38/R40, Q28/N30/R33/Q38/K40, E28/N30/R33/R38/K40, K28/N30/N33/Q38/A40, S28/N30/Y33/Q38/K40, T28/N30/R33/Q38/R40, Q28/N30/T33/Q38/R40, K28/N30/R33/T38/Q40, K28/N30/R33/T38/R40, Q28/N30/E33/D38/H40, R28/N30/Y33/N38/A40, Q28/N30/Y33/T38/R40, R28/N30/T33/R38/A40, H28/N30/Y33/D38/S40, Q28/N30/Y33/R38/A40, Q28/N30/Y33/A38/R40, S28/N30/Q33/A38/A40, Q28/N30/Y33/E38/K40, T28/N30/N33/Q38/R40, Q28/N30/Y33/R38/S40, K28/N30/R33/Q38/R40, Q28/N30/R33/A38/R40, Q28/N30/N33/Q38/R40, R28/N30/R33/E38/R40, K28/N30/R33/A38/R40, K28/N30/T33/A38/A40, K28/N30/R33/K38/A40, R28/N30/A33/K38/S40, K28/N30/R33/N38/A40, T28/N30/E33/S38/D40, R28/N30/N33/Q38/K40, R28/N30/R33/Y38/Q40, K28/N30/Y33/Q38/N40, K28/N30/R33/S38/S40, K28/N30/R33/Y38/A40, A28/N30/N33/R38/K40, K28/N30/R33/A38/T40, K28/N30/R33/N38/Q40, T28/N30/T33/Q38/R40, K28/N30/R33/Q38/Y40, Q28/N30/S33/R38/K40, R28/N30/Y33/Q38/S40, Q28/N30/R33/Q38/R40, K28/N30/R33/A38/Q40, A28/N30/R33/Q38/R40, K28/N30/R33/Q38/Q40, K28/N30/R33/Q38/A40, K28/N30/T33/A38/S40, K28/A30/H33/R38/S40, K28/H30/H33/R38/S40, K28/D30/N33/H38/S40, K28/E30/S33/R38/S40, K28/H30/T33/P38/S40, K28/G30/H33/Y38/S40, K28/A30/R33/Q38/S40, K28/S30/R33/G38/S40, K28/S30/H33/H38/S40, K28/N30/H33/R38/S40, K28/R30/R33/E38/S40, K28/Q30/T33/H38/S40,
K28/R30/H33/G38/S40, K28/A30/N33/Q38/S40, Q28/Q30/H33/K38/S40, K28/K30/H33/R38/S40, K28/Q30/N33/Q38/S40, K28/Q30/T33/Q38/S40, K28/Q30/R33/Q38/S40, K28/R30/P33/G38/S40, K28/R30/G33/N38/S40, K28/N30/A33/Q38/S40, K28/N30/H33/N38/S40, K28/H30/H33/A38/S40, K28/R30/G33/S38/S40, K28/S30/R33/Q38/S40, K28/T30/D33/H38/S40, K28/H30/H33/Q38/S40, K28/A30/D33/H38/S40, K28/S30/H33/R38/S40, K28/N30/R33/A38/S40, K28/S30/H33/Q38/S40, K28/D30/A33/H38/S40, K28/N30/H33/E38/S40, K28/D30/R33/T38/S40, K28/D30/R33/S38/S40, K28/A30/H33/Q38/S40, K28/R30/G33/T38/S40, K28/N30/H33/S38/S40, K28/Q30/H33/Q38/S40, K28/N30/H33/G38/S40, K28/N30/N33/Q38/S40, K28/N30/D33/Q38/S40, K28/D30/R33/G38/S40, K28/N30/H33/A38/S40, K28/H30/M33/A38/S40, K28/S30/S33/H38/S40, K28/G30/V33/A38/S40, K28/S30/V33/Q38/S40, K28/D30/V33/H38/S40, R28/D30/V33/Q38/S40, K28/G30/V33/Q38/S40, K28/G30/V33/T38/S40, K28/G30/V33/H38/S40, K28/G30/V33/R38/S40, K28/G30/V33/G38/S40, R28/A30/V33/G38/S40, R28/D30/V33/R38/S40, R28/N30/V33/Q38/S40, and N28/T30/V33/D38/S40.
According to a more preferred embodiment, said I-CreI variant is a variant able to cleave at least a DNA target sequence which is not cleaved by the parent homing endonuclease (I-CreI D75N), said variant having amino acid residues in positions 28, 30, 33, 38 and 40 respectively, which are selected from the group consisting of: Q28/N30/Y33/K38/R40, R28/N30/K33/R38/Q40, Q28/N30/R33/R38/R40, Q28/N30/Y33/K38/K40, Q28/N30/T33/Q38/K40, Q28/N30/R33/R38/K40, K28/N30/T33/Q38/R40, S28/N30/R33/S38/R40, K28/N30/T33/R38/Q40, K28/N30/S33/R38/E40, S28/N30/Y33/R38/K40, K28/N30/S33/R38/D40, K28/N30/R33/E38/R40, K28/N30/S33/R38/S40, R28/N30/R33/D38/R40, A28/N30/S33/Q38/R40, Q28/N30/Y33/R38/K40, Q28/N30/K33/R38/T40, R28/N30/A33/Y38/Q40, K28/N30/R33/Q38/E40, N28/N30/S33/R38/K40, N28/N30/S33/R38/R40, S28/N30/R33/Q38/R40, Q28/N30/R33/Q38/K40, E28/N30/R33/R38/K40, K28/N30/N33/Q38/A40, S28/N30/Y33/Q38/K40, T28/N30/R33/Q38/R40, Q28/N30/T33/Q38/R40, K28/N30/R33/T38/Q40, K28/N30/A33/T38/R40, Q28/N30/E33/D38/H40, R28/N30/T33/R38/A40, H28/N30/Y33/D38/S40, S28/N30/Q33/A38/A40, K28/N30/R33/Q38/R40, Q28/N30/R33/A38/R40, R28/N30/R33/E38/R40, K28/N30/R33/A38/R40, K28/N30/T33/A38/A40, K28/N30/R33/K38/A40, K28/N30/R33/N38/A40, T28/N30/E33/S38/D40, R28/N30/R33/Y38/Q40, K28/N30/R33/S38/S40, K28/N30/R33/Y38/A40, A28/N30/N33/R38/K40, K28/N30/R33/A38/T40, K28/N30/R33/N38/Q40, T28/N30/T33/Q38/R40, K28/N30/R33/Q38/Y40, Q28/N30/S33/R38/K40, R28/N30/Y33/Q38/S40, Q28/N30/R33/Q38/R40, K28/N30/R33/A38/Q40, A28/N30/R33/Q38/R40, K28/N30/R33/Q38/Q40, K28/N30/R33/Q38/A40, K28/N30/T33/A38/S40, K28/A30/H33/R38/S40, K28/H30/H33/R38/S40, K28/D30/N33/H38/S40, K28/E30/S33/R38/S40, K28/H30/T33/P38/S40, K28/G30/H33/Y38/S40, K28/A30/R33/Q38/S40, K28/S30/R33/G38/S40, K28/S30/H33/H38/S40, K28/N30/H33/R38/S40, K28/R30/R33/E38/S40, K28/D30/G33/H38/S40, K28/R30/H33/G38/S40, K28/A30/N33/Q38/S40, K28/D30/H33/K38/S40, K28/H30/H33/R38/S40, K28/Q30/N33/Q38/S40, K28/Q30/T33/Q38/S40, K28/G30/R33/Q38/S40 K28/R30/P33/G38/S40, 
K28/R30/G33/N38/S40, K28/N30/A33/Q38/S40, K28/N30/H33/N38/S40, K28/H30/H33/A38/S40, K28/R30/G33/S38/S40, K28/S30/R33/Q38/S40, K28/T30/D33/H38/S40, K28/H30/H33/Q38/S40, K28/A30/D33/H38/S40, K28/S30/H33/R38/S40, K28/N30/R33/A38/S40, K28/S30/H33/Q38/S40, K28/D30/A33/H38/S40, K28/N30/H33/E38/S40, K28/D30/R33/T38/S40, K28/D30/H33/S38/S40, K28/A30/H33/Q38/S40, K28/R30/G33/T38/S40, K28/N30/H33/S38/S40, K28/Q30/H33/Q38/S40, K28/N30/H33/G38/S40, K28/N30/N33/Q38/S40, K28/N30/D33/Q38/S40, K28/D30/R33/G38/S40, K28/N30/H33/A38/S40, K28/H30/M33/A38/S40, K28/S30/S33/H38/S40, K28/G30/V33/A38/S40, K28/D30/V33/H38/S40, R28/D30/V33/Q38/S40, K28/G30/V33/T38/S40, K28/G30/V33/H38/S40, K28/G30/V33/R38/S40, and K28/G30/V33/G38/S40.
According to another more preferred embodiment, said I-CreI variant is a variant able to cleave a DNA target sequence consisting of a mutant I-CreI site wherein at least the a in position −8 and/or the t in position +8 has been replaced with a different nucleotide, said variant having amino acid residues in positions 28, 30, 33, 38 and 40 respectively, which are selected from the group consisting of: Q28/N30/Y33/K38/R40, R28/N30/K33/R38/Q40, Q28/N30/R33/R38/R40, Q28/N30/Y33/K38/K40, Q28/N30/T33/Q38/K40, Q28/N30/R33/R38/R40, K28/N30/T33/Q38/R40, S28/N30/R33/S38/R40, N28/N30/Y33/Q38/R40, K28/N30/S33/R38/E40, Q28/N30/N33/Q38/K40, S28/N30/Y33/R38/K40, K28/N30/S33/R38/D40, K28/N30/R33/E38/R40, K28/N30/S33/R38/S40, R28/N30/R33/D38/R40, A28/N30/S33/Q38/R40, Q28/N30/Y33/R38/K40, Q28/N30/K33/R38/T40, R28/N30/A33/Y38/Q40, K28/N30/R33/Q38/E40, N28/N30/S33/R38/K40, N28/N30/S33/R38/R40, Q28/N30/Y33/Q38/K40, Q28/N30/Y33/Q38/R40, S28/N30/R33/Q38/R40, Q28/N30/R33/Q38/K40, E28/N30/R33/R38/K40, K28/N30/N33/Q38/A40, S28/N30/Y33/Q38/K40, T28/N30/R33/Q38/R40, Q28/N30/T33/Q38/R40, K28/N30/R33/T38/R40, Q28/N30/E33/D38/H40, R28/N30/Y33/N38/A40, Q28/N30/Y33/T38/R40, R28/N30/T33/R38/A40, H28/N30/Y33/D38/S40, Q28/N30/Y33/R38/A40, Q28/N30/Y33/A38/R40, S28/N30/Q33/A38/A40, Q28/N30/Y33/E38/K40, T28/N30/N33/Q38/R40, Q28/N30/Y33/R38/S40, K28/N30/R33/Q38/R40, Q28/N30/R33/A38/R40, Q28/N30/N33/Q38/R40, R28/N30/R33/E38/R40, K28/N30/R33/A38/R40, K28/N30/T33/A38/A40, K28/N30/R33/K38/A40, R28/N30/A33/K38/S40, K28/N30/R33/N38/A40, T28/N30/E33/S38/D40, R28/N30/N33/Q38/D40, R28/N30/R33/Y38/Q40, K28/N30/Y33/Q38/N40, K28/N30/R33/S38/S40, K28/N30/R33/Y38/A40, A28/N30/N33/R38/K40, K28/N30/R33/A38/T40, T28/N30/T33/Q38/R40, K28/N30/R33/Q38/Y40, Q28/N30/S33/R38/K40, R28/N30/Y33/Q38/S40, Q28/N30/R33/Q38/R40, A28/N30/R33/Q38/R40, K28/N30/R33/Q38/Q40, K28/N30/R33/Q38/A40, K28/N30/T33/A38/S40, K28/A30/H33/R38/S40, K28/H30/H33/R38/S40, K28/D30/N33/H38/S40, K28/E30/S33/R38/S40, K28/H30/T33/P38/S40, K28/G30/H33/Y38/S40, 
K28/A30/R33/Q38/S40, K28/S30/R33/G38/S40, K28/S30/H33/H38/S40, K28/N30/H33/R38/S40, K28/R30/R33/E38/S40, K28/D30/G33/H38/S40, K28/R30/H33/G38/S40, K28/A30/N33/Q38/S40, K28/D30/H33/K38/S40, K28/K30/H33/R38/S40, K28/Q30/N33/Q38/S40, K28/Q30/T33/Q38/S40, K28/G30/R33/Q38/S40 K28/R30/P33/G38/S40, K28/R30/G33/N38/S40, K28/N30/A33/Q38/S40, K28/N30/H33/N38/S40, K28/H30/H33/A38/S40, K28/R30/G33/S38/S40, K28/S30/R33/Q38/S40, K28/T30/D33/H38/S40, K28/H30/H33/Q38/S40, K28/A30/D33/H38/S40, K28/S30/H33/R38/S40, K28/N30/R33/A38/S40, K28/S30/H33/Q38/S40, K28/D30/A33/H38/S40, K28/N30/H33/E38/S40, K28/D30/R33/T38/S40, K28/D30/R33/S38/S40, K28/A30/H33/Q38/S40, K28/R30/G33/T38/S40, K28/N30/H33/S38/S40, K28/Q30/H33/Q38/S40, K28/N30/H33/G38/S40, K28/N30/N33/Q38/S40, K28/N30/D33/Q38/S40, K28/D30/R33/G38/S40, K28/N30/H33/A38/S40, K28/H30/M33/A38/S40, K28/S30/S33/H38/S40, K28/G30/V33/A38/S40, K28/S30/V33/Q38/S40, K28/D30/V33/H38/S40, R28/D30/V33/Q38/S40, K28/G30/V33/Q38/S40, K28/Q30/V33/T38/S40, K28/G30/V33/H38/S40, K28/G30/V33/R38/S40, K28/G30/V33/G38/S40, R28/A30/V33/G38/S40, R28/D30/V33/R38/S40, R28/N30/V33/Q38/S40, and N28/T30/V33/D38/S40.
According to yet another more preferred embodiment, said I-CreI variant is a variant able to cleave a DNA target sequence consisting of a mutant I-CreI site wherein at least the a in position −9 and/or the t in positions +9 has been replaced with a different nucleotide, said variant having amino acid residues in positions 28, 30, 33, 38 and 40 respectively, which are selected from the group consisting of Q28/N30/Y33/K38/R40, R28/N30/K33/R38/Q40, Q28/N30/R33/R38/R40, Q28/N30/Y33/K38/K40, Q28/N30/T33/Q38/K40, Q28/N30/R33/R38/K40, K28/N30/T33/Q38/R40, S28/N30/R33/S38/R40, K28/N30/T33/R38/Q40, K28/N30/S33/R38/E40, S28/N30/Y33/R38/K40, K28/N30/S33/R38/D40, K28/N30/R33/E38/R40, K28/N30/S33/R38/S40, R28/N30/R33/D38/R40, Q28/N30/Y33/R38/K40, R28/N30/A33/Y38/Q40, N28/N30/S33/R38/K40, N28/N30/S33/R38/R40, E28/N30/R33/R38/K40, K28/N30/N33/Q38/A40, K28/N30/R33/T38/Q40, K28/N30/R33/T38/R40, Q28/N30/E33/D38/H40, R28/N30/T33/R38/A40, H28/N30/Y33/D38/S40, K28/N30/R33/Q38/R40, Q28/N30/R33/A38/R40, R28/N30/R33/E38/R40, K28/N30/R33/A38/R40, K28/N30/T33/A38/A40, K28/N30/R33/K38/A40, K28/N30/R33/N38/A40, R28/N30/R33/Y38/Q40, K28/N30/R33/S38/S40, A28/N30/N33/R38/K40, K28/N30/R33/A38/T40, K28/N30/R33/N38/Q40, T28/N30/T33/Q38/R40, K28/N30/R33/Q38/Y40, Q28/N30/S33/R38/K40, K28/N30/R33/A38/Q40, A28/N30/R33/Q38/R40, K28/N30/R33/Q38/A40, K28/N30/T33/A38/S40, K28/A30/H33/R38/S40, K28/H30/H33/R38/S40, K28/D30/N33/H38/S40, K28/E30/S33/R38/S40, K28/S30/R33/G38/S40, K28/S30/H33/H38/S40, K28/N30/H33/R38/S40, K28/R30/R33/E38/S40, K28/D30/G33/H38/S40, K28/D30/H33/K38/S40, K28/K30/H33/R38/S40, K28/Q30/N33/Q38/S40, K28/R30/G33/N38/S40, K28/N30/H33/N38/S40, K28/H30/H33/A38/S40, K28/R30/G33/S38/S40, K28/T30/D33/H38/S40, K28/A30/D33/H38/S40, K28/S30/H33/R38/S40, K28/N30/R33/A38/S40, K28/D30/A33/H38/S40, K28/D30/R33/T38/S40, K28/D30/R33/S38/S40, K28/R30/G33/T38/S40, K28/N30/H33/S38/S40, K28/N30/H33/G38/S40, K28/N30/N33/Q38/S40, K28/D30/R33/G38/S40, K28/N30/H33/A38/S40, K28/H30/M33/A38/S40, 
K28/S30/S33/H38/S40, K28/G30/V33/A38/S40, K28/D30/V33/H38/S40, K28/G30/V33/T38/S40, K28/G30/V33/H38/S40, K28/G30/V33/R38/S40, and K28/G30/V33/G38/S40.
According to another advantageous embodiment of said I-CreI variant, it comprises one or more additional mutation(s).
The residues which are mutated may advantageously be at positions contacting the DNA target sequence or interacting directly or indirectly with said DNA target. Preferably, said mutations are in positions selected from the group consisting of: I24, Q26, S32, Q44, R68, R70, D75, I77, and T140. Preferably, said I-CreI variant comprises one or more mutations selected from the group consisting of:
Furthermore, other residues may be mutated on the entire I-CreI sequence, and in particular in the C-terminal half of I-CreI (positions 80 to 163). The substitutions in the C-terminal half of I-CreI are preferably in positions: 80, 82, 85, 86, 87, 94, 96, 100, 103, 114, 115, 117, 125, 129, 131, 132, 147, 151, 153, 154, 155, 157, 159 and 160 of I-CreI.
In addition, said variant may include one or more residues inserted at the NH2 terminus and/or COOH terminus. For example, a methionine residue is introduced at the NH2 terminus, a tag (epitope or polyhistidine sequence) is introduced at the NH2 terminus and/or COOH terminus; said tag is useful for the detection and/or the purification of said variant.
The I-CreI variant of the invention may be a homodimer or a heterodimer.
According to another advantageous embodiment of said I-CreI variant, it is a heterodimer comprising monomers from two different variants.
The present invention encompasses also a single-chain chimeric endonuclease comprising a monomer from an I-CreI variant, as defined above.
The subject-matter of the present invention is also a polynucleotide fragment encoding an I-CreI variant or a single-chain chimeric endonuclease derived from said variant, as defined above.
The subject-matter of the present invention is also a recombinant vector comprising at least one polynucleotide fragment encoding a variant or a single-chain chimeric endonuclease derived from said variant, as defined above. Said vector may comprise a polynucleotide fragment encoding one monomer of a homodimeric variant, two monomers or one monomer and one domain of a single-chain molecule. Alternatively, said vector may comprise two different polynucleotide fragments, each encoding one of the monomers of a heterodimeric variant.
One type of preferred vector is an episome, i.e., a nucleic acid capable of extra-chromosomal replication. Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they are linked. Vectors capable of directing the expression of genes to which they are operatively linked are referred to herein as “expression vectors”.
A vector according to the present invention comprises, but is not limited to, a YAC (yeast artificial chromosome), a BAC (bacterial artificial chromosome), a baculovirus vector, a phage, a phagemid, a cosmid, a viral vector, a plasmid, an RNA vector or a linear or circular DNA or RNA molecule which may consist of chromosomal, non chromosomal, semi-synthetic or synthetic DNA. In general, expression vectors of utility in recombinant DNA techniques are often in the form of “plasmids” which refer generally to circular double-stranded DNA loops which, in their vector form are not bound to the chromosome. Large numbers of suitable vectors are known to those of skill in the art.
Viral vectors include retrovirus, adenovirus, parvovirus (e.g. adeno-associated viruses), coronavirus, negative strand RNA viruses such as orthomyxovirus (e.g., influenza virus), rhabdovirus (e.g., rabies and vesicular stomatitis virus), paramyxovirus (e.g. measles and Sendai), positive strand RNA viruses such as picornavirus and alphavirus, and double-stranded DNA viruses including adenovirus, herpesvirus (e.g., Herpes Simplex virus types 1 and 2, Epstein-Barr virus, cytomegalovirus), and poxvirus (e.g., vaccinia, fowlpox and canarypox). Other viruses include Norwalk virus, togavirus, flavivirus, reoviruses, papovavirus, hepadnavirus, and hepatitis virus, for example.
Vectors can comprise selectable markers, for example: neomycin phosphotransferase, histidinol dehydrogenase, dihydrofolate reductase, hygromycin phosphotransferase, herpes simplex virus thymidine kinase, adenosine deaminase, glutamine synthetase, and hypoxanthine-guanine phosphoribosyl transferase for eukaryotic cell culture; TRP1 for S. cerevisiae; tetracycline, rifampicin or ampicillin resistance in E. coli.
Preferably said vectors are expression vectors, wherein the sequence(s) encoding the variant of the invention is placed under control of appropriate transcriptional and translational control elements to permit production or synthesis of said variant. Therefore, said polynucleotide is comprised in an expression cassette. More particularly, the vector comprises a replication origin, a promoter operatively linked to said encoding polynucleotide, a ribosome-binding site, an RNA-splicing site (when genomic DNA is used), a polyadenylation site and a transcription termination site. It also can comprise an enhancer. Selection of the promoter will depend upon the cell in which the polypeptide is expressed. Preferably, when said variant is a heterodimer, the two polynucleotides encoding each of the monomers are included in one vector which is able to drive the expression of both polynucleotides, simultaneously.
According to another advantageous embodiment of said vector, it includes a targeting construct comprising sequences sharing homologies with the region surrounding the DNA target sequence as defined above.
More preferably, said targeting DNA construct comprises:
a) sequences sharing homologies with the region surrounding the DNA target sequence as defined above, and
b) sequences to be introduced flanked by sequence as defined in a).
The invention also concerns a prokaryotic or eukaryotic host cell which is modified by a polynucleotide or a vector as defined above, preferably an expression vector.
The invention also concerns a non-human transgenic animal or a transgenic plant, characterized in that all or part of their cells are modified by a polynucleotide or a vector as defined above.
As used herein, a cell refers to a prokaryotic cell, such as a bacterial cell, or eukaryotic cell, such as an animal, plant or yeast cell.
The subject-matter of the present invention is also a composition comprising at least one I-CreI variant, one single-chain chimeric endonuclease derived from said variant, one or two polynucleotide(s), preferably included in expression vector(s), as defined above.
In a preferred embodiment of said composition, it contains a targeting DNA construct comprising the sequence which repairs the site of interest flanked by sequences sharing homologies with the targeted locus.
The polynucleotide sequence(s) encoding the variant or the single-chain chimeric endonuclease derived from said variant as defined in the present invention, may be prepared by any method known by the man skilled in the art. For example, they are amplified from a cDNA template, by polymerase chain reaction with specific primers. Preferably the codons of said cDNA are chosen to favour the expression of said protein in the desired expression system.
The recombinant vector comprising said polynucleotides may be obtained and introduced in a host cell by the well-known recombinant DNA and genetic engineering techniques.
The variant of the invention is produced by expressing the polypeptide(s) as defined above; preferably said polypeptide(s) are expressed or co-expressed in a host cell modified by one or two expression vector(s), under conditions suitable for the expression or co-expression of the polypeptides, and the variant is recovered from the host cell culture.
The subject-matter of the present invention is also a method of genetic engineering comprising a step of double-strand nucleic acid breaking in a site of interest located on a vector comprising a DNA target as defined hereabove, by contacting said vector with an I-CreI variant or a single-chain chimeric endonuclease comprising said variant as defined above, thereby inducing a homologous recombination with another vector presenting homology with the sequence surrounding the cleavage site of said variant.
The subject-matter of the present invention is also a method of genome engineering comprising the steps of: 1) double-strand breaking a genomic locus comprising at least one DNA target sequence as defined above, by contacting said target with an I-CreI variant or a single-chain chimeric endonuclease comprising said variant as defined above; 2) maintaining said broken genomic locus under conditions appropriate for homologous recombination with a targeting DNA construct comprising the sequence to be introduced in said locus, flanked by sequences sharing homologies with the targeted locus.
The subject-matter of the present invention is also a method of genome engineering, characterized in that it comprises the following steps: 1) double-strand breaking a genomic locus comprising at least one DNA target sequence as defined above, by contacting said cleavage site with an I-CreI variant or a single-chain chimeric endonuclease comprising said variant as defined above; 2) maintaining said broken genomic locus under conditions appropriate for homologous recombination with chromosomal DNA sharing homologies to regions surrounding the cleavage site.
The subject-matter of the present invention is also the use of an I-CreI endonuclease variant obtainable by the method as described above for molecular biology, in vivo or in vitro genetic engineering, and in vivo or in vitro genome engineering for non-therapeutic purposes, for cleaving a DNA target sequence as defined above.
Molecular biology includes with no limitations, DNA restriction and DNA mapping. Genetic and genome engineering for non therapeutic purposes include for example (i) gene targeting of specific loci in cell packaging lines for protein production, (ii) gene targeting of specific loci in crop plants, for strain improvements and metabolic engineering, (iii) targeted recombination for the removal of markers in genetically modified crop plants, (iv) targeted recombination for the removal of markers in genetically modified microorganism strains (for antibiotic production for example).
According to an advantageous embodiment of said use, it is for inducing a double-strand break in a site of interest comprising a DNA target sequence cleaved by a variant as defined above, thereby inducing a DNA recombination event, a DNA loss or cell death.
In a particular embodiment, an I-CreI variant having an arginine (R) or a lysine (K) in position 38 is used for cleaving a DNA target comprising a guanine in position −9 or a cytosine in position +9.
According to the invention, said double-strand break is for: repairing a specific sequence, modifying a specific sequence, restoring a functional gene in place of a mutated one, attenuating or activating an endogenous gene of interest, introducing a mutation into a site of interest, introducing an exogenous gene or a part thereof, inactivating or detecting an endogenous gene or a part thereof, translocating a chromosomal arm, or leaving the DNA unrepaired and degraded.
The subject-matter of the present invention is also the use of at least one I-CreI variant as defined above, for the preparation of a medicament for preventing, improving or curing a genetic disease in an individual in need thereof, said medicament being administered by any means to said individual.
The subject-matter of the present invention is also a method for preventing, improving or curing a genetic disease in an individual in need thereof, said method comprising at least the step of administering to said individual a composition as defined above, by any means.
In a particular embodiment, an I-CreI variant having an arginine (R) or a lysine (K) in position 38 is used for cleaving a genomic DNA target comprising a guanine in position −9 or a cytosine in position +9.
The subject-matter of the present invention is also the use of at least one I-CreI variant as defined above, for the preparation of a medicament for preventing, improving or curing a disease caused by an infectious agent that presents a DNA intermediate, in an individual in need thereof, said medicament being administered by any means to said individual.
The subject-matter of the present invention is also a method for preventing, improving or curing a disease caused by an infectious agent that presents a DNA intermediate, in an individual in need thereof, said method comprising at least the step of administering to said individual a composition as defined above, by any means.
The subject-matter of the present invention is also the use of at least one I-CreI variant, as defined above, in vitro, for inhibiting the propagation, inactivating or deleting an infectious agent that presents a DNA intermediate, in biological derived products or products intended for biological uses or for disinfecting an object.
The subject matter of the present invention is also a method for decontaminating a product or a material from an infectious agent that presents a DNA intermediate, said method comprising at least the step of contacting a biological derived product, a product intended for biological use or an object, with a composition as defined above, for a time sufficient to inhibit the propagation, inactivate or delete said infectious agent.
In a particular embodiment, an I-CreI variant having an arginine (R) or a lysine (K) in position 38 is used for cleaving a DNA target from said infectious agent that comprises a guanine in position −9 or a cytosine in position +9.
In another particular embodiment, said infectious agent is a virus. For example said virus is an adenovirus (Ad11, Ad21), herpesvirus (HSV, VZV, EBV, CMV, herpesvirus 6, 7 or 8), hepadnavirus (HBV), papovavirus (HPV), poxvirus or retrovirus (HTLV, HIV).
The subject-matter of the present invention is also the use of at least one I-CreI variant, as defined above, as a scaffold for making other meganucleases. For example a second round of mutagenesis and selection/screening can be performed on said I-CreI variant, for the purpose of making novel, second generation homing endonucleases.
According to another advantageous embodiment of said uses, said I-CreI variant is associated with a targeting DNA construct as defined above.
According to another advantageous embodiment of said uses, said I-CreI variant has amino acid residues in positions 28, 30, 33, 38 and 40 respectively, which are selected from the group consisting of: KNSQS, KNRQS, KNTQS, KNHQS.
The use of the I-CreI meganuclease variant and the methods of using said I-CreI meganuclease variant according to the present invention include also the use of the single-chain chimeric endonuclease derived from said variant, the polynucleotide(s), vector, cell, transgenic plant or non-human transgenic mammal encoding said variant or single-chain chimeric endonuclease, as defined above.
In addition to the preceding features, the invention further comprises other features which will emerge from the description which follows, which refers to examples illustrating the I-CreI homing endonuclease variants and their uses according to the invention, as well as to the appended drawings in which:
The method for producing meganuclease variants and the assays based on cleavage-induced recombination in yeast cells, which are used for screening variants with altered specificity, are described in the International PCT Application WO 2004/067736 and Epinat et al., N.A.R., 2003, 31, 2952-2962. These assays result in a functional LacZ reporter gene which can be monitored by standard methods (
I-CreI wt (I-CreI D75), I-CreI D75N (I-CreI N75) and I-CreI S70 N75 open reading frames were synthesized, as described previously (Epinat et al., N.A.R., 2003, 31, 2952-2962). Combinatorial libraries were derived from the I-CreI N75, I-CreI D75 or I-CreI S70 N75 scaffold, by replacing two or three different combinations of residues, potentially involved in the interactions with the bases in positions ±8 to 10 of one DNA target half-site. The diversity of the meganuclease libraries was generated by PCR using degenerated primers harboring a unique degenerated codon at each of the selected positions.
Mutation D75N was introduced by replacing codon 75 with aac. Then, three codons at positions N30, Y33 and Q38 (Ulib4 library) or K28, N30 and Q38 (Ulib5 library) were replaced by a degenerated codon VVK (18 codons) coding for 12 different amino acids (A,D,E,G,H,K,N,P,Q,R,S,T). In consequence, the maximal (theoretical) diversity of these protein libraries was 12³ or 1728. However, in terms of nucleic acids, the diversity was 18³ or 5832.
In Lib4, ordered from BIOMETHODES, an arginine in position 70 of the I-CreI N75 scaffold was first replaced with a serine (R70S). Then positions 28, 33, 38 and 40 were randomized. The regular amino acids (K28, Y33, Q38 and S40) were replaced with one out of 10 amino acids (A,D,E,K,N,Q,R,S,T,Y). The resulting library has a theoretical complexity of 10000 in terms of proteins.
In addition, small libraries of complexity 225 (15²) resulting from the randomization of only two positions were constructed in an I-CreI N75 or I-CreI D75 scaffold, using an NVK degenerate codon (24 codons, amino acids ACDEGHKNPQRSTWY).
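The codon and amino acid counts quoted for the VVK and NVK libraries can be checked by enumerating the degenerate codons against the standard genetic code. The following is a minimal sketch, not the procedure used to design the libraries; the helper names `expand` and `amino_acids` are hypothetical, and stop codons are assumed to be excluded from the amino acid count (NVK includes the amber codon TAG).

```python
from itertools import product

# IUPAC degenerate nucleotide codes used by the VVK and NVK codons.
IUPAC = {"N": "ACGT", "V": "ACG", "K": "GT"}

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate_codon):
    """Return every concrete codon covered by a degenerate codon such as 'VVK'."""
    return ["".join(p) for p in product(*(IUPAC[x] for x in degenerate_codon))]


def amino_acids(degenerate_codon):
    """Return the sorted amino acids encoded, with stop codons ('*') left out."""
    return sorted({CODON_TABLE[c] for c in expand(degenerate_codon)} - {"*"})


# (degenerate codon, number of randomized positions) as described in the text.
for codon, positions in (("VVK", 3), ("NVK", 2)):
    codons = expand(codon)
    residues = amino_acids(codon)
    print(codon, len(codons), "codons,", len(residues), "amino acids:", "".join(residues))
    print("  protein diversity:", len(residues) ** positions,
          "| nucleic acid diversity:", len(codons) ** positions)
```

Under these assumptions, the enumeration reproduces the 18 codons and 12 amino acids of VVK and the 24 codons and 15 amino acids of NVK, and hence the protein diversities of 1728 and 225.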
Fragments carrying combinations of the desired mutations were obtained by PCR, using a pair of degenerated primers coding for 10, 12 or 15 different amino acids, and as DNA template, the I-CreI N75 (
The C1221 twenty-four bp palindrome (tcaaaacgtcgtacgacgttttga, SEQ ID NO: 3) is a repeat of the half-site of the nearly palindromic natural I-CreI target (tcaaaacgtcgtgagacagtttgg, SEQ ID NO: 1). C1221 is cleaved as efficiently as the I-CreI natural target in vitro and ex vivo in both yeast and mammalian cells. The 64 palindromic targets were derived as follows: 64 pairs of oligonucleotides (ggcatacaagtttacnnnacgtcgtacgacgtnnngacaatcgtctgtca (SEQ ID NO: 72) and reverse complementary sequences) were ordered from Sigma, annealed and cloned into pGEM-T Easy (PROMEGA) in the same orientation. Next, a 400 bp PvuII fragment was excised and cloned into the yeast vector pFL39-ADH-LACURAZ, also called pCLS0042, described previously (Epinat et al., precited,
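The derivation of the 64 palindromic targets from C1221 can be sketched as follows. This is an illustration only: it assumes that the half-site tcaaaacgtcgt is numbered from −12 to −1 reading left to right, so that positions −10, −9 and −8 are the three bases following the leading "tc", and that each target is the varied half-site followed by its reverse complement. The names `reverse_complement` and `palindromic_target` are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("acgt", "tgca")

# Left half-site of C1221, assumed to span positions -12 to -1.
C1221_HALF = "tcaaaacgtcgt"


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def palindromic_target(triplet):
    """Build the 24 bp palindrome carrying `triplet` at positions -10, -9, -8."""
    half = C1221_HALF[:2] + triplet + C1221_HALF[5:]
    return half + reverse_complement(half)


# One palindromic target per nnn triplet: 4 ** 3 = 64 targets in total.
targets = {
    "".join(t): palindromic_target("".join(t))
    for t in product("acgt", repeat=3)
}

print(len(targets))
print(targets["aaa"])  # the C1221 palindrome itself
print(targets["aga"])  # guanine at -9 is mirrored by cytosine at +9
```

Because each target is built as a half-site plus its reverse complement, a base at position −9 is necessarily paired with its complement at position +9, which is why a guanine in −9 and a cytosine in +9 describe the same palindromic target.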
The three libraries of meganuclease expression variants were transformed into the leu2 mutant haploid yeast strain FYC2-6A: MATalpha, trp1 Δ63, leu2 Δ1, his3 Δ200. A classical chemical/heat shock protocol derived from Gietz and Woods (Methods Enzymol., 2002, 350, 87-96), which routinely gives 10⁶ independent transformants per μg of DNA, was used for transformation. Individual transformant (Leu+) clones were picked into 96-well microplates. The 64 target plasmids were transformed using the same protocol into the haploid yeast strain FYBL2-7B: MATα, ura3 Δ851, trp1 Δ63, leu2 Δ1, lys2 Δ202, resulting in 64 tester strains.
Meganuclease expressing clones were mated with each of the 64 target strains, and diploids were tested for beta-galactosidase activity, by using the screening assay illustrated on
I-CreI variant clones as well as yeast reporter strains were stocked in glycerol (20%) and replicated in new microplates. Mating was performed using a colony gridder (QpixII, GENETIX). Mutants were spotted on nylon filters covering YPD plates, using a high density (about 20 spots/cm²). A second spotting process was performed on the same filters to spot a second layer consisting of 64 different reporter-harboring yeast strains for each variant. Membranes were placed on solid agarose YEPD rich medium, and incubated at 30° C. for one night, to allow mating. Next, filters were transferred to synthetic medium, lacking leucine and tryptophan, with galactose (2%) as a carbon source (and with G418 for coexpression experiments), and incubated for five days at 37° C., to select for diploids, allow for meganuclease expression, reporter plasmid cleavage and recombination, and expression of beta-galactosidase. After 5 days, filters were placed on solid agarose medium with 0.02% X-Gal in 0.5 M sodium phosphate buffer, pH 7.0, 0.1% SDS, 6% dimethyl formamide (DMF), 7 mM β-mercaptoethanol, 1% agarose, and incubated at 37° C., to monitor β-galactosidase activity. After two days of incubation, positive clones were identified by scanning. The β-galactosidase activity of the clones was quantified using appropriate software.
The clones showing an activity against at least one target were isolated (first screening). The spotting density was then reduced to 4 spots/cm² and each positive clone was tested against the 64 reporter strains in quadruplicate, thereby creating complete profiles (secondary screening).
The open reading frame (ORF) of positive clones identified during the first and/or secondary screening in yeast was amplified by PCR on yeast colonies using primers: PCR-Gal10-F (gcaactttagtgctgacacatacagg, SEQ ID NO: 73) and PCR-Gal10-R (acaaccttgattgcagacttgacc, SEQ ID NO: 74) from PROLIGO. Briefly, a yeast colony was picked, resuspended in 100 μl of LGlu liquid medium and cultured overnight. After centrifugation, the yeast pellet was resuspended in 10 μl of sterile water and used to perform a PCR reaction in a final volume of 50 μl containing 1.5 μl of each specific primer (100 pmol/μl). The PCR conditions were one cycle of denaturation for 10 minutes at 94° C., 35 cycles of denaturation for 30 s at 94° C., annealing for 1 min at 55° C., extension for 1.5 min at 72° C., and a final extension for 5 min. The resulting PCR products were then sequenced.
All analyses of protein structures were carried out using Pymol. The structures from I-CreI correspond to pdb entry 1g9y. Residue numbering in the text always refers to these structures, except for residues in the second I-CreI protein domain of the homodimer where residue numbers were set as for the first domain.
I-CreI is a dimeric homing endonuclease that cleaves a 22 bp pseudo-palindromic target. Analysis of I-CreI structure bound to its natural target has shown that in each monomer, eight residues establish direct interactions with seven bases (Jurica et al., 1998, precited). According to these structural data, the bases of the nucleotides in positions ±8 to 10 establish direct contacts with I-CreI amino-acids N30, Y33, Q38 and indirect contacts with I-CreI amino-acids K28 and S40 (
An exhaustive protein library vs. target library approach was undertaken to locally engineer this part of the DNA binding interface. Randomization of 5 amino acid positions would lead to a theoretical diversity of 20⁵ = 3.2×10⁶. However, libraries with lower diversity were generated by randomizing 2, 3 or 4 residues at a time, resulting in a diversity of 225 (15²), 1728 (12³) or 10,000 (10⁴). This strategy allowed an extensive screening of each of these libraries against the 64 palindromic 10NNN DNA targets using a yeast based assay described previously (Epinat et al., 2003, precited and International PCT Application WO 2004/067736) and whose principle is described in
First, the I-CreI scaffold was mutated from D75 to N. The D75N mutation did not affect the protein structure, but decreased the toxicity of I-CreI in overexpression experiments.
Next the Ulib4 library was constructed: residues 30, 33 and 38 (
Then, two other libraries were constructed: Ulib5 and Lib4. In Ulib5, residues 28, 30 and 38 (
In a primary screening experiment, 20000 clones from Ulib4, 10000 clones from Ulib5 and 20000 clones from Lib4 were mated with each one of the 64 tester strains, and diploids were tested for beta-galactosidase activity. All clones displaying cleavage activity with at least one out of the 64 targets were tested in a second round of screening against the 64 targets, in quadruplicate, and each cleavage profile was established, as shown on
After secondary screening and sequencing of positives over the entire coding region, a total of 1484 unique mutants were isolated showing a cleavage activity against at least one target. Different patterns could be observed.
Altogether, this large collection of mutants allowed the targeting of all of the 64 possible DNA sequences differing at positions ±10, ±9, and ±8 (
Thus, hundreds of novel variants were obtained, including mutants with novel substrate specificity; these variants can keep high levels of activity and the specificity of the novel proteins can be even narrower than that of the wild-type protein for its target.
Hierarchical clustering was used to establish potential correlations between specific protein residues and target bases, as previously described (Arnould et al., J. Mol. Biol., 2006, 355, 443-458). Clustering was done on the quantitative data from the secondary screening, using the hclust function of R. Variants were clustered using standard hierarchical clustering with Euclidean distance and Ward's method (Ward, J. H., American Statist. Assoc., 1963, 58, 236-244). The mutant dendrogram was cut at a height of 17 to define the clusters. For the analysis, the cumulated intensity of cleavage of a target within a cluster was calculated as the sum of the cleavage intensities of all the cluster's mutants with this target, normalized to the sum of the cleavage intensities of all the cluster's mutants with all targets.
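The normalization described above can be illustrated with a short sketch. It assumes that the cleavage profiles are available as a mapping from mutant name to per-target intensity and that cluster membership has already been obtained from the dendrogram; the function names and the toy values are hypothetical and are not data from this study. Summing the cumulated intensities of all targets carrying a given base at one position is one plausible way to derive a base frequency, and is an assumption here.

```python
def cumulated_intensities(profiles, cluster):
    """Normalized cumulated cleavage intensity per target within one cluster.

    profiles: dict mapping mutant name -> {target: cleavage intensity}
    cluster:  iterable of mutant names belonging to the cluster
    """
    totals = {}
    for mutant in cluster:
        for target, value in profiles[mutant].items():
            totals[target] = totals.get(target, 0.0) + value
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {target: 0.0 for target in totals}
    return {target: value / grand_total for target, value in totals.items()}


def base_frequencies(cumulated, index):
    """Sum cumulated intensities over targets sharing the base at `index`.

    Assumption: targets are keyed by their nnn triplet, so index 0 is the
    first varied base of the triplet.
    """
    freqs = {}
    for target, value in cumulated.items():
        freqs[target[index]] = freqs.get(target[index], 0.0) + value
    return freqs


# Toy profiles for two hypothetical mutants (illustrative values only).
toy_profiles = {
    "mutant_1": {"aaa": 3.0, "gaa": 1.0, "taa": 0.0},
    "mutant_2": {"aaa": 1.0, "gaa": 3.0, "taa": 0.0},
}
toy_cumulated = cumulated_intensities(toy_profiles, ["mutant_1", "mutant_2"])
print(toy_cumulated)
print(base_frequencies(toy_cumulated, 0))
```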
Ten different mutant clusters were identified (Table I).
¹ Target and base frequencies correspond to cumulated intensity of cleavage, as described in Materials and Methods.
² In each position, residues present in more than 15% of the cluster are indicated.
Analysis of the residues found in each cluster showed strong biases for all randomized positions. None of the residues is mutated in all libraries used in this study, and the residues found in the I-CreI scaffold were expected to be overrepresented. Indeed, K28, N30 and S40 were the most frequent residues in all 10 clusters, and no conclusion for DNA/protein interactions can really be inferred. However, Y33 was the most represented residue only in clusters 7, 8 and 10, whereas strong occurrence of other residues, such as H, R, G, T, C, P or S, was observed in the seven other clusters. The wild type Q38 residue was overrepresented in all clusters but one, R and K being more frequent in cluster 4.
Meanwhile, strong correlations were observed between the nature of residues 33 and 38 and substrate discrimination at positions ±10 and ±9 of the target.
Prevalence of Y33 was associated with high frequencies of adenine (74.9% and 64.3% in clusters 7 and 10, respectively), and this correlation was also observed, although to a lesser extent in clusters 4, 5 and 8. H33 or R33 were correlated with a guanine (63.0%, 56.3% and 58.5%, in clusters 1, 4 and 5, respectively) and T33, C33 or S33 with a thymine (45.6% and 56.3% in clusters 3 and 9, respectively). G33 was relatively frequent in cluster 2, the cluster with the most even base representation in ±10. These results are consistent with the observations of Seligman and collaborators (Nucleic Acids Res., 2002, 30, 3870-3879), who showed previously that a Y33R or Y33H mutation shifted the specificity of I-CreI toward a guanine and Y33C, Y33T, Y33S (and also Y33L) towards a thymine in position ±10.
In addition, R38 and K38 were associated with an exceptionally high frequency of guanine in cluster 4, while in all the other clusters, the wild type Q38 residue was overrepresented, as well as an adenine in ±9 of the target.
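The correlations reported above between residues 33 and 38 and the bases in ±10 and ±9 can be restated as simple lookup tables. This is only a compact summary of the trends described in the text, under the assumption that each residue is reduced to its most frequently associated base; it is not a predictive model, and the names used are hypothetical.

```python
# Residue in position 33 -> base most frequently associated in position -10.
BASE_AT_10_BY_RESIDUE_33 = {
    "Y": "a",                       # wild type residue, adenine
    "H": "g", "R": "g",             # guanine
    "T": "t", "C": "t", "S": "t",   # thymine
}

# Residue in position 38 -> base most frequently associated in position -9.
BASE_AT_9_BY_RESIDUE_38 = {
    "Q": "a",                       # wild type residue, adenine
    "R": "g", "K": "g",             # guanine (cluster 4)
}


def favoured_bases(residue_33, residue_38):
    """Return the (base at -10, base at -9) trend, or None where none is reported."""
    return (
        BASE_AT_10_BY_RESIDUE_33.get(residue_33),
        BASE_AT_9_BY_RESIDUE_38.get(residue_38),
    )


print(favoured_bases("Y", "Q"))  # wild type pair
print(favoured_bases("R", "K"))
print(favoured_bases("G", "Q"))  # no clear trend reported for G33
```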
The structure of I-CreI bound to its target (Chevalier et al., 2003, precited; Jurica et al., 1998, precited) has shown that Y33 and Q38 contact two adenines in −10 and −9 (
Number | Date | Country | |
---|---|---|---|
Parent | 12091216 | Apr 2008 | US |
Child | 13744068 | US |
Number | Date | Country | |
---|---|---|---|
Parent | PCT/IB2005/003564 | Oct 2005 | US |
Child | 12091216 | US |