Heterodimeric Meganucleases and Use Thereof

Abstract
Heterodimeric meganuclease comprising two domains of different meganucleases which are in two separate polypeptides, said heterodimeric meganuclease being able to cleave a chimeric DNA target sequence comprising one different half of each parent meganuclease DNA target sequence.
Description

The invention relates to a heterodimeric meganuclease comprising two domains of different meganucleases which are in two separate polypeptides, said heterodimeric meganuclease being able to cleave a chimeric DNA target sequence comprising one different half of each parent meganuclease DNA target sequence.


The invention also relates to a vector encoding said heterodimeric meganuclease, to a cell, an animal or a plant modified by said vector, and to the use of said heterodimeric meganuclease and derived products for genetic engineering, genome therapy and antiviral therapy.


Meganucleases are by definition sequence-specific endonucleases with large (>12 bp) cleavage sites, and they can be used to achieve very high gene targeting efficiencies in mammalian cells and plants (Rouet et al., Mol. Cell. Biol., 1994, 14, 8096-106; Choulika et al., Mol. Cell. Biol., 1995, 15, 1968-73; Donoho et al., Mol. Cell. Biol., 1998, 18, 4070-8; Elliott et al., Mol. Cell. Biol., 1998, 18, 93-101; Sargent et al., Mol. Cell. Biol., 1997, 17, 267-77; Puchta et al., Proc. Natl. Acad. Sci. USA, 1996, 93, 5055-60), making meganuclease-induced recombination an efficient and robust method for genome engineering. The major limitation of the current technology is the requirement for the prior introduction of a meganuclease cleavage site in the locus of interest. Thus, the generation of novel meganucleases with tailored specificities is under intense investigation. Such proteins could be used to cleave genuine chromosomal sequences and open a wide range of applications, including the correction of mutations responsible for inherited monogenic diseases.


Recently, fusions of Cys2-His2 type Zinc-Finger Proteins (ZFP) with the catalytic domain of the FokI nuclease were used to make functional sequence-specific endonucleases (Smith et al., Nucleic Acids Res., 1999, 27, 674-81; Urnov et al., Nature, 2005, 435, 646-651). The binding specificity of ZFPs is relatively easy to manipulate, and a repertoire of novel artificial ZFPs, able to bind many (g/a)nn(g/a)nn(g/a)nn sequences, is now available (Pabo et al., Annu. Rev. Biochem., 2001, 70, 313-40; Segal and Barbas, Curr. Opin. Biotechnol., 2001, 12, 632-7; Isalan et al., Nat. Biotechnol., 2001, 19, 656-60). Nevertheless, preserving a very narrow specificity is one of the major issues for genome engineering applications, and presently it is unclear whether ZFPs would fulfill the very strict requirements for therapeutic applications.


Homing Endonucleases (HEs) are a widespread family of natural meganucleases including hundreds of proteins (Chevalier and Stoddard, Nucleic Acids Res., 2001, 29, 3757-74). These proteins are encoded by mobile genetic elements which propagate by a process called “homing”: the endonuclease cleaves a cognate allele from which the mobile element is absent, thereby stimulating a homologous recombination event that duplicates the mobile DNA into the recipient locus (Kostriken et al., Cell, 1983, 35, 167-74; Jacquier and Dujon, Cell, 1985, 41, 383-94). Given their natural function and their exceptional cleavage properties in terms of efficacy and specificity, HEs provide ideal scaffolds to derive novel endonucleases for genome engineering. Data have been accumulated over the last decade, characterizing the LAGLIDADG family, the largest of the four HE families (Chevalier and Stoddard, precited). LAGLIDADG refers to the only sequence actually conserved throughout the family and is found in one or (more often) two copies in the protein. Proteins with a single motif, such as I-CreI, form homodimers and cleave palindromic or pseudo-palindromic DNA sequences (FIG. 1), whereas the larger, double motif proteins, such as I-SceI, are monomers and cleave non-palindromic targets. Seven different LAGLIDADG proteins have been crystallized, and they exhibit a very striking conservation of the core structure that contrasts with the lack of similarity at the primary sequence level (Jurica et al., Mol. Cell., 1998, 2, 469-76; Chevalier et al., Nat. Struct. Biol., 2001, 8, 312-6; Chevalier et al. J. Mol. Biol., 2003, 329, 253-69; Moure et al., J. Mol. Biol., 2003, 334, 685-95; Moure et al., Nat. Struct. Biol., 2002, 9, 764-70; Ichiyanagi et al., J. Mol. Biol., 2000, 300, 889-901; Duan et al., Cell, 1997, 89, 555-64; Bolduc et al., Genes Dev., 2003, 17, 2875-88; Silva et al., J. Mol. Biol., 1999, 286, 1123-36).
In this core structure, two characteristic αββαββα folds, also called LAGLIDADG Homing Endonuclease Core Domains, contributed by two monomers, or by two domains in double LAGLIDADG proteins, are facing each other with a two-fold symmetry. DNA binding depends on the four β strands from each domain, folded into an antiparallel β-sheet, and forming a saddle on the DNA helix major groove. Analysis of I-CreI structure bound to its natural target shows that in each monomer, eight residues establish direct interactions with seven bases (Jurica et al., 1998, precited). Residues Q44, R68 and R70 contact three consecutive base pairs at positions 3 to 5 and −3 to −5 (FIG. 1). The catalytic core is central, with a contribution of both symmetric monomers/domains. In addition to this core structure, other domains can be found: for example, PI-SceI, an intein, has a protein splicing domain, and an additional DNA-binding domain (Moure et al., 2002, precited; Grindl et al., Nucleic Acids Res., 1998, 26, 1857-62).


Two approaches have been used to derive novel endonucleases with new specificities, from Homing Endonucleases:


protein variants


Seligman and co-workers used a rational approach to substitute specific individual residues of the I-CreI αββαββα fold (Sussman et al., J. Mol. Biol., 2004, 342, 31-41; Seligman et al., Genetics, 1997, 147, 1653-64); substantial cleavage of novel targets was observed, but only for a few I-CreI variants.


In a similar way, Gimble et al. modified the additional DNA binding domain of PI-SceI (J. Mol. Biol., 2003, 334, 993-1008); they obtained variant proteins with altered binding specificity but no altered cleavage specificity, and most of the proteins maintained substantial affinity for the wild-type target sequence.


hybrid or chimeric single-chain proteins


New meganucleases could be obtained by swapping LAGLIDADG Homing Endonuclease Core Domains of different monomers (Epinat et al., Nucleic Acids Res., 2003, 31, 2952-62; Chevalier et al., Mol. Cell., 2002, 10, 895-905; Steuer et al., Chembiochem., 2004, 5, 206-13; International PCT Applications WO 03/078619 and WO 2004/031346). These single-chain chimeric meganucleases, wherein the two LAGLIDADG Homing Endonuclease Core Domains from different meganucleases are linked by a spacer, were able to cleave the hybrid target corresponding to the fusion of the two half parent DNA target sequences.


By coexpressing two domains from different meganucleases, the inventors have engineered functional heterodimeric meganucleases, which are able to cleave chimeric targets. This new approach, which can be applied to any meganuclease (monomer with two domains or homodimer), including the variants derived from wild-type meganucleases, considerably enriches the number of DNA sequences that can be targeted, resulting in the generation of dedicated meganucleases able to cleave sequences from many genes of interest. Potential applications include the cleavage of viral genomes specifically or the correction of genetic defects via double-strand break induced recombination, both of which lead to therapeutics.


Therefore, the invention concerns a heterodimeric meganuclease comprising two domains of different meganucleases (parent meganucleases), wherein said domains are in two separate polypeptides which are able to assemble and to cleave a chimeric DNA target sequence comprising one different half of each parent meganuclease DNA target sequence.


As opposed to the hybrid or chimeric meganucleases, wherein the two meganuclease subunits, each of which interacts with a different half of a meganuclease target sequence, are in a single polypeptide, in the heterodimeric meganuclease according to the invention each subunit is expressed as a separate polypeptide. The two polypeptides, which are different and originate from different meganucleases, assemble to form a functional heterodimeric meganuclease.


Definitions


Amino acid residues in a polypeptide sequence are designated herein according to the one-letter code, in which, for example, Q means Gln or Glutamine residue, R means Arg or Arginine residue and D means Asp or Aspartic acid residue.


Nucleotides are designated as follows: one-letter code is used for designating the base of a nucleoside: a is adenine, t is thymine, c is cytosine, and g is guanine. For the degenerated nucleotides, r represents g or a (purine nucleotides), k represents g or t, s represents g or c, w represents a or t, m represents a or c, y represents t or c (pyrimidine nucleotides), d represents g, a or t, v represents g, a or c, b represents g, t or c, h represents a, t or c, and n represents g, a, t or c.
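By way of illustration only, the degenerate nucleotide code defined above can be written as a lookup table. The following Python sketch is not part of the original disclosure; the names DEGENERATE, matches and expand are hypothetical helpers introduced for this example.

```python
from itertools import product

# Degenerate nucleotide codes exactly as listed in the definition above.
DEGENERATE = {
    "a": "a", "t": "t", "c": "c", "g": "g",
    "r": "ga", "k": "gt", "s": "gc", "w": "at",
    "m": "ac", "y": "tc", "d": "gat", "v": "gac",
    "b": "gtc", "h": "atc", "n": "gatc",
}

def matches(pattern, sequence):
    """Return True if `sequence` is one concrete instance of the degenerate `pattern`."""
    pattern, sequence = pattern.lower(), sequence.lower()
    if len(pattern) != len(sequence):
        return False
    return all(base in DEGENERATE[code] for code, base in zip(pattern, sequence))

def expand(pattern):
    """List every concrete sequence covered by a (short) degenerate pattern."""
    choices = [DEGENERATE[code] for code in pattern.lower()]
    return ["".join(bases) for bases in product(*choices)]
```

For instance, matches("rnn", "gtc") holds because r covers g, whereas matches("rnn", "ctc") does not.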


by “meganuclease”, is intended an endonuclease having a double-stranded DNA target sequence of 14 to 40 bp. Said meganuclease is either a dimeric enzyme, wherein each domain is on a monomer, or a monomeric enzyme comprising the two domains on a single polypeptide.


by “meganuclease domain” is intended the region which interacts with one half of the DNA target of a meganuclease and is able to associate with the other domain of the same meganuclease which interacts with the other half of the DNA target to form a functional meganuclease able to cleave said DNA target.


by “meganuclease variant” is intended a meganuclease obtained by replacement of at least one residue in the amino acid sequence of the wild-type meganuclease (natural meganuclease) with a different amino acid.


by “functional variant” is intended a variant which is able to cleave a DNA target sequence, preferably said target is a new target which is not cleaved by the parent meganuclease.


by “LAGLIDADG Homing Endonuclease Core Domain”, is intended the characteristic αββαββα fold of the homing endonuclease of the LAGLIDADG family, corresponding to a sequence of about one hundred amino acid residues. For example, in the case of the dimeric homing endonuclease I-CreI (163 amino acids), the LAGLIDADG Homing Endonuclease Core Domain corresponds to the residues 6 to 94. In the case of monomeric homing endonuclease, two such domains are found in the sequence of the endonuclease; for example in I-DmoI (194 amino acids), the first domain (residues 7 to 99) and the second domain (residues 104 to 194) are separated by a short linker (residues 100 to 103).


by “DNA target sequence”, “DNA target”, “target sequence”, “target”, “recognition site”, “recognition sequence”, “homing recognition site”, “homing site”, “cleavage site” is intended a 14 to 40 bp double-stranded palindromic, partially palindromic (pseudo-palindromic) or non-palindromic polynucleotide sequence that is recognized and cleaved by a meganuclease. These terms refer to a distinct DNA location, preferably a genomic location, at which a double stranded break (cleavage) is to be induced by the meganuclease. The DNA target is defined by the 5′ to 3′ sequence of one strand of the double-stranded polynucleotide. For example, the palindromic DNA target sequence cleaved by wild-type I-CreI presented in FIG. 1 is defined by the sequence 5′-t₋₁₂c₋₁₁a₋₁₀a₋₉a₋₈a₋₇c₋₆g₋₅t₋₄c₋₃g₋₂t₋₁a₊₁c₊₂g₊₃a₊₄c₊₅g₊₆t₊₇t₊₈t₊₉t₊₁₀g₊₁₁a₊₁₂ (SEQ ID NO:1), wherein the bases interacting with R68, Q44 and R70 are from positions −5 to −3 and +5 to +3.
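The position numbering and the palindromic character of this example target can be checked with a short Python sketch. It is given for illustration only; the function names are hypothetical, and the numbering convention (positions −12 to −1 followed by +1 to +12, with no position 0) is read from the sequence as written above.

```python
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}

def reverse_complement(seq):
    """Return the 5' to 3' sequence of the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))

def is_palindromic(seq):
    """A DNA target is palindromic when both strands read the same 5' to 3'."""
    return seq.lower() == reverse_complement(seq)

def base_at(seq, position):
    """Base at a signed position (-n..-1, +1..+n) of an even-length target."""
    half = len(seq) // 2
    if position == 0 or abs(position) > half:
        raise ValueError("position out of range for this target")
    index = half + position if position < 0 else half + position - 1
    return seq[index]

# SEQ ID NO: 1 written without the position indices.
SEQ_ID_NO_1 = "tcaaaacgtcgtacgacgttttga"

left_contacts = "".join(base_at(SEQ_ID_NO_1, p) for p in (-5, -4, -3))
right_contacts = "".join(base_at(SEQ_ID_NO_1, p) for p in (3, 4, 5))
```

With this sequence, is_palindromic(SEQ_ID_NO_1) is true, and the bases at positions −5 to −3 read gtc on the given strand.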


by “chimeric DNA target” or “hybrid DNA target” is intended the fusion of a different half of each parent meganuclease DNA target sequence.


by “vector” is intended a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked.


by “homologous” is intended a sequence with enough identity to another one to lead to a homologous recombination between sequences, more particularly having at least 95% identity, preferably 97% identity and more preferably 99%.


“Identity” refers to sequence identity between two nucleic acid molecules or polypeptides. Identity can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base, then the molecules are identical at that position. A degree of similarity or identity between nucleic acid or amino acid sequences is a function of the number of identical or matching nucleotides at positions shared by the nucleic acid sequences. Various alignment algorithms and/or programs may be used to calculate the identity between two sequences, including FASTA, or BLAST which are available as a part of the GCG sequence analysis package (University of Wisconsin, Madison, Wis.), and can be used with, e.g., default settings.
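As a minimal sketch of the identity calculation described above, the following Python example counts matching positions in two sequences that have already been aligned. It is illustrative only: it does not reproduce FASTA or BLAST, the function names are hypothetical, and skipping gapped columns is an assumption of this example rather than a rule stated in the definition.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of compared positions occupied by the same base or residue.

    Both inputs must be pre-aligned and of equal length. Columns containing
    a gap in either sequence are skipped (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.lower(), aligned_b.lower()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared

def is_homologous(aligned_a, aligned_b, threshold=95.0):
    """Apply the 'at least 95% identity' criterion given for 'homologous'."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

The threshold argument can be raised to 97 or 99 to reflect the preferred and more preferred identity levels.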


The polypeptides forming the heterodimeric meganuclease of the invention may derive from a natural (wild-type) meganuclease or a functional variant thereof.


Preferred variants are variants having a modified specificity, i.e., variants able to cleave a DNA target sequence which is not cleaved by the wild-type meganuclease. For example, such variants may have amino acid variation at positions contacting the DNA target sequence or interacting directly or indirectly with said DNA target.


The polypeptides forming the heterodimeric meganuclease of the invention may comprise, consist essentially of or consist of, one domain as defined above. In the case of dimeric meganuclease, said polypeptide may consist of the entire open reading frame of the meganuclease (full-length amino acid sequence).


Said polypeptides may include one or more residues inserted at the NH2 terminus and/or COOH terminus of said domain. For example, a methionine residue may be introduced at the NH2 terminus, and a tag (epitope or polyhistidine sequence) may be introduced at the NH2 terminus and/or COOH terminus; said tag is useful for the detection and/or the purification of said polypeptide.


The cleavage activity of the heterodimeric meganuclease of the invention may be measured by a direct repeat recombination assay, in yeast or mammalian cells, using a reporter vector, as described in the PCT Application WO 2004/067736. The reporter vector comprises two truncated, non-functional copies of a reporter gene (direct repeats) and a chimeric DNA target sequence within the intervening sequence, cloned in a yeast or a mammalian expression vector (FIG. 2). The chimeric DNA target sequence is made of one different half of each parent meganuclease (FIG. 5). Coexpression of the two polypeptides results in the assembly of a functional heterodimer which is able to cleave the chimeric DNA target sequence. This cleavage induces homologous recombination between the direct repeats, resulting in a functional reporter gene, whose expression can be monitored by appropriate assay.


According to an advantageous embodiment of said heterodimeric meganuclease, each polypeptide comprises the LAGLIDADG Homing Endonuclease Core Domain of a different LAGLIDADG homing endonuclease or a variant thereof; said LAGLIDADG homing endonuclease may be either a homodimeric enzyme such as I-CreI, or a monomeric enzyme such as I-DmoI.


The LAGLIDADG homing endonuclease may be selected from the group consisting of: I-SceI, I-ChuI, I-CreI, I-CsmI, PI-SceI, PI-TliI, PI-MtuI, I-CeuI, I-SceII, I-SceIII, HO, PI-CivI, PI-CtrI, PI-AaeI, PI-BsuI, PI-DhaI, PI-DraI, PI-MavI, PI-MchI, PI-MfuI, PI-MflI, PI-MgaI, PI-MgoI, PI-MinI, PI-MkaI, PI-MleI, PI-MmaI, PI-MshI, PI-MsmI, PI-MthI, PI-MtuI, PI-MxeI, PI-NpuI, PI-PfuI, PI-RmaI, PI-SpbI, PI-SspI, PI-FacI, PI-MjaI, PI-PhoI, PI-TagI, PI-ThyI, PI-TkI, PI-TspI, I-MsoI, and I-AniI; preferably, I-CreI, I-SceI, I-ChuI, I-DmoI, I-CsmI, PI-SceI, PI-PfuI, PI-TliI, PI-MtuI, and I-CeuI; more preferably, I-CreI, I-MsoI, I-SceI, I-AniI, I-DmoI, PI-SceI, and PI-PfuI; still more preferably I-CreI.


In a preferred embodiment, one of the polypeptides comprises the LAGLIDADG Homing Endonuclease Core Domain of an I-CreI variant having at least one substitution in positions 44, 68, and/or 70 of I-CreI, by reference to the amino acid numbering of the I-CreI sequence SWISSPROT P05725.


Said polypeptide may for example consist of the entire open reading frame of said I-CreI variant.


In a more preferred embodiment, said residues in positions 44, 68, and/or 70 of I-CreI are replaced with an amino acid selected in the group consisting of: A, D, E, G, H, K, N, P, Q, R, S, T, and Y.


In another more preferred embodiment, said I-CreI variant further comprises the mutation of the aspartic acid in position 75 to an uncharged amino acid, preferably an asparagine (D75N) or a valine (D75V).


In another more preferred embodiment, said heterodimeric LAGLIDADG homing endonuclease, comprising two polypeptides derived from I-CreI and/or I-CreI variant(s) having at least one substitution in positions 44, 68, and/or 70 of I-CreI, cleaves a chimeric DNA target comprising the sequence: c₋₁₁a₋₁₀a₋₉a₋₈a₋₇c₋₆n₋₅n₋₄n₋₃n₋₂n₋₁n₊₁n₊₂n₊₃n₊₄n₊₅g₊₆t₊₇t₊₈t₊₉t₊₁₀g₊₁₁, wherein n is a, t, c, or g (SEQ ID NO: 2).
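For illustration, the consensus of SEQ ID NO: 2 (six fixed bases, ten variable bases, six fixed bases) can be turned into a pattern for locating candidate sites in a longer sequence. The Python sketch below is an assumption-laden example and not part of the disclosure: the function name is hypothetical, and a match only indicates conformity with the consensus, not that a given heterodimer actually cleaves the site.

```python
import re

# SEQ ID NO: 2 written without position indices: positions -11 to +11.
SEQ_ID_NO_2 = "caaaac" + "n" * 10 + "gttttg"

# A lookahead is used so that overlapping candidate sites are all reported.
SITE_RE = re.compile(r"(?=(caaaac[acgt]{10}gttttg))")

def find_candidate_sites(dna):
    """Return (0-based start, site) pairs for every stretch matching SEQ ID NO: 2."""
    return [(m.start(1), m.group(1)) for m in SITE_RE.finditer(dna.lower())]
```

Because the reverse complement of caaaac is gttttg, the consensus is identical on both strands, so scanning one strand is sufficient to find every conforming site.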


More preferably, for cleaving a chimeric DNA target, wherein n₋₄ is t or n₊₄ is a, one of the polypeptides has a glutamine (Q) in position 44.


More preferably, for cleaving a chimeric DNA target, wherein n₋₄ is a or n₊₄ is t, one of the polypeptides has an alanine (A) or an asparagine (N) in position 44; the I-CreI variants A44, R68, S70 and A44, R68, S70, N75 are examples of such a polypeptide.


More preferably, for cleaving a chimeric DNA target, wherein n₋₄ is c or n₊₄ is g, one of the polypeptides has a lysine (K) in position 44; the I-CreI variants K44, R68, E70 and K44, R68, E70, N75 are examples of such a polypeptide.
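The three preferences above relate the base at position −4 (or its complement at position +4) to the residue at position 44. A possible way to tabulate them is sketched below in Python. The table contains only the pairings stated above; the function name is hypothetical, the input is assumed to be a concrete 22 bp sequence spanning positions −11 to +11, and no pairing is given for g at −4, for which the sketch returns None.

```python
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}

# Base at position -4 of a half-site -> preferred residue(s) at position 44.
# No preference is stated for g at -4 (c at +4), so that base is absent.
RESIDUE_44_FOR_MINUS4_BASE = {"t": ("Q",), "a": ("A", "N"), "c": ("K",)}

def residue_44_per_half(target_22bp):
    """Return (left, right) residue-44 options for a target spanning -11..+11.

    Each entry is a tuple of one-letter codes, or None when no rule is stated.
    """
    target = target_22bp.lower()
    if len(target) != 22:
        raise ValueError("expected positions -11 to +11 (22 bases)")
    minus4 = target[7]    # position -4 on the given strand
    plus4 = target[14]    # position +4 on the given strand
    # Reading the right half on the opposite strand turns +4 into -4.
    right_as_minus4 = COMPLEMENT[plus4]
    return (RESIDUE_44_FOR_MINUS4_BASE.get(minus4),
            RESIDUE_44_FOR_MINUS4_BASE.get(right_as_minus4))
```

For the wild-type I-CreI site, both halves map to Q, in agreement with the wild-type residue Q44.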


The subject-matter of the present invention is also a recombinant vector comprising two polynucleotide fragments, each encoding a different polypeptide as defined above.


One type of preferred vector is an episome, i.e., a nucleic acid capable of extra-chromosomal replication. Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they are linked. Vectors capable of directing the expression of genes to which they are operatively linked are referred to herein as “expression vectors”.


A vector according to the present invention comprises, but is not limited to, a YAC (yeast artificial chromosome), a BAC (bacterial artificial chromosome), a baculovirus vector, a phage, a phagemid, a cosmid, a viral vector, a plasmid, an RNA vector or a linear or circular DNA or RNA molecule which may consist of chromosomal, non chromosomal, semi-synthetic or synthetic DNA. In general, expression vectors of utility in recombinant DNA techniques are often in the form of “plasmids” which refer generally to circular double-stranded DNA loops which, in their vector form are not bound to the chromosome. Large numbers of suitable vectors are known to those of skill in the art.


Viral vectors include retrovirus, adenovirus, parvovirus (e.g. adeno-associated viruses), coronavirus, negative strand RNA viruses such as orthomyxovirus (e.g., influenza virus), rhabdovirus (e.g., rabies and vesicular stomatitis virus), paramyxovirus (e.g. measles and Sendai), positive strand RNA viruses such as picornavirus and alphavirus, and double-stranded DNA viruses including adenovirus, herpesvirus (e.g., Herpes Simplex virus types 1 and 2, Epstein-Barr virus, cytomegalovirus), and poxvirus (e.g., vaccinia, fowlpox and canarypox). Other viruses include Norwalk virus, togavirus, flavivirus, reoviruses, papovavirus, hepadnavirus, and hepatitis virus, for example.


Vectors can comprise selectable markers, for example: neomycin phosphotransferase, histidinol dehydrogenase, dihydrofolate reductase, hygromycin phosphotransferase, herpes simplex virus thymidine kinase, adenosine deaminase, glutamine synthetase, and hypoxanthine-guanine phosphoribosyl transferase for eukaryotic cell culture; TRP1 for S. cerevisiae; tetracycline, rifampicin or ampicillin resistance in E. coli.


Preferably said vectors are expression vectors, wherein the sequences encoding the polypeptides of the invention are placed under control of appropriate transcriptional and translational control elements to permit production or synthesis of said polypeptides. Therefore, said polynucleotides are comprised in expression cassette(s). More particularly, the vector comprises a replication origin, a promoter operatively linked to said encoding polynucleotide, a ribosome binding site, an RNA-splicing site (when genomic DNA is used), a polyadenylation site and a transcription termination site. It also can comprise an enhancer. Selection of the promoter will depend upon the cell in which the polypeptide is expressed.


According to another advantageous embodiment of said vector, it includes a targeting construct comprising sequences sharing homologies with the region surrounding the chimeric DNA target sequence as defined above.


More preferably, said targeting DNA construct comprises:


a) sequences sharing homologies with the region surrounding the chimeric DNA target sequence as defined above, and


b) sequences to be introduced flanked by sequence as in a).


The invention also concerns a prokaryotic or eukaryotic host cell which is modified by two polynucleotides or a vector as defined above, preferably an expression vector.


The invention also concerns a non-human transgenic animal or a transgenic plant, characterized in that all or part of their cells are modified by two polynucleotides or a vector as defined above.


As used herein, a cell refers to a prokaryotic cell, such as a bacterial cell, or eukaryotic cell, such as an animal, plant or yeast cell.


The polynucleotide sequences encoding the polypeptides as defined in the present invention may be prepared by any method known by the man skilled in the art. For example, they are amplified from a cDNA template, by polymerase chain reaction with specific primers. Preferably the codons of said cDNA are chosen to favour the expression of said protein in the desired expression system.


The recombinant vector comprising said polynucleotides may be obtained and introduced in a host cell by the well-known recombinant DNA and genetic engineering techniques.


The heterodimeric meganuclease of the invention is produced by expressing the two polypeptides as defined above; preferably said polypeptides are co-expressed in a host cell modified by two expression vectors, each comprising a polynucleotide fragment encoding a different polypeptide as defined above or by a dual expression vector comprising both polynucleotide fragments as defined above, under conditions suitable for the co-expression of the polypeptides, and the heterodimeric meganuclease is recovered from the host cell culture.


The subject-matter of the present invention is further the use of a heterodimeric meganuclease, two polynucleotides, preferably both included in one expression vector (dual expression vector) or each included in a different expression vector, a dual expression vector, a cell, a transgenic plant, a non-human transgenic mammal, as defined above, for molecular biology, for in vivo or in vitro genetic engineering, and for in vivo or in vitro genome engineering, for non-therapeutic purposes.


Non-therapeutic purposes include, for example, (i) gene targeting of specific loci in cell packaging lines for protein production, (ii) gene targeting of specific loci in crop plants, for strain improvements and metabolic engineering, (iii) targeted recombination for the removal of markers in genetically modified crop plants, (iv) targeted recombination for the removal of markers in genetically modified microorganism strains (for antibiotic production for example).


According to an advantageous embodiment of said use, it is for inducing a double-strand break in a site of interest comprising a chimeric DNA target sequence, thereby inducing a DNA recombination event, a DNA loss or cell death.


According to the invention, said double-strand break is for: repairing a specific sequence, modifying a specific sequence, restoring a functional gene in place of a mutated one, attenuating or activating an endogenous gene of interest, introducing a mutation into a site of interest, introducing an exogenous gene or a part thereof, inactivating or deleting an endogenous gene or a part thereof, translocating a chromosomal arm, or leaving the DNA unrepaired and degraded.


According to another advantageous embodiment of said use, said heterodimeric meganuclease, polynucleotides, vector, cell, transgenic plant or non-human transgenic mammal are associated with a targeting DNA construct as defined above.


The subject-matter of the present invention is also a method of genetic engineering, characterized in that it comprises a step of double-strand nucleic acid breaking in a site of interest located on a vector comprising a chimeric DNA target as defined hereabove, by contacting said vector with a heterodimeric meganuclease as defined above, thereby inducing a homologous recombination with another vector presenting homology with the sequence surrounding the cleavage site of said heterodimeric meganuclease.


The subject-matter of the present invention is also a method of genome engineering, characterized in that it comprises the following steps: 1) double-strand breaking a genomic locus comprising at least one chimeric DNA target of a heterodimeric meganuclease as defined above, by contacting said target with said heterodimeric meganuclease; 2) maintaining said broken genomic locus under conditions appropriate for homologous recombination with a targeting DNA construct comprising the sequence to be introduced in said locus, flanked by sequences sharing homologies with the targeted locus.


The subject-matter of the present invention is also a method of genome engineering, characterized in that it comprises the following steps: 1) double-strand breaking a genomic locus comprising at least one chimeric DNA target of a heterodimeric meganuclease as defined above, by contacting said cleavage site with said heterodimeric meganuclease; 2) maintaining said broken genomic locus under conditions appropriate for homologous recombination with chromosomal DNA sharing homologies to regions surrounding the cleavage site.


The subject-matter of the present invention is also a composition characterized in that it comprises at least one heterodimeric meganuclease or two polynucleotides, preferably both included in one expression vector or each included in a different expression vector, as defined above.


In a preferred embodiment of said composition, it comprises a targeting DNA construct comprising the sequence which repairs the site of interest flanked by sequences sharing homologies with the targeted locus.


The subject-matter of the present invention is also the use of at least one heterodimeric meganuclease or two polynucleotides, preferably included in expression vector(s), as defined above, for the preparation of a medicament for preventing, improving or curing a genetic disease in an individual in need thereof, said medicament being administered by any means to said individual.


The subject-matter of the present invention is also the use of at least one heterodimeric meganuclease or two polynucleotides, preferably included in expression vector(s), as defined above, for the preparation of a medicament for preventing, improving or curing a disease caused by an infectious agent that presents a DNA intermediate, in an individual in need thereof, said medicament being administered by any means to said individual.


The subject-matter of the present invention is also the use of at least one heterodimeric meganuclease or two polynucleotides, preferably included in expression vector(s), as defined above, in vitro, for inhibiting the propagation, inactivating or deleting an infectious agent that presents a DNA intermediate, in biological derived products or products intended for biological uses or for disinfecting an object.


In a particular embodiment, said infectious agent is a virus.





In addition to the preceding features, the invention further comprises other features which will emerge from the description which follows, which refers to examples illustrating the I-CreI meganuclease variants and their uses according to the invention, as well as to the appended drawings in which:



FIG. 1 illustrates the rationale of the experiments. (a) Structure of I-CreI bound to its DNA target. (b) Zoom of the structure showing residues 44, 68, 70 chosen for randomization, D75 and interacting base pairs. (c) Design of the library and targets. The interactions of I-CreI residues Q44, R68 and R70 with DNA targets are indicated (top). The target described here (SEQ ID NO: 1) is a palindrome derived from the I-CreI natural target, and cleaved by I-CreI (Chevalier et al., 2003, precited). Cleavage positions are indicated by arrowheads. In the library, residues 44, 68 and 70 are replaced with ADEGHKNPQRST. Since I-CreI is a homodimer, the library was screened with palindromic targets. Sixty-four palindromic targets resulting from substitutions in positions ±3, ±4 and ±5 were generated. A few examples of such targets are shown (bottom; SEQ ID NO: 10 to 16).
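The design summarized for FIG. 1 (c) can be reproduced in outline with the following Python sketch, given for illustration only. It assumes that each palindromic target is obtained from SEQ ID NO: 1 by placing a triplet at positions −5 to −3 and mirroring it as its reverse complement at positions +3 to +5; the names used are hypothetical. The library size computed here is the number of residue combinations at positions 44, 68 and 70, not a measured library diversity.

```python
from itertools import product

COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}
LIBRARY_RESIDUES = "ADEGHKNPQRST"   # residues allowed at positions 44, 68 and 70

def reverse_complement(seq):
    """Return the 5' to 3' sequence of the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def palindromic_target(triplet):
    """24 bp palindrome carrying `triplet` at -5..-3 (assumed construction)."""
    left_half = "tcaaaac" + triplet + "gt"          # positions -12 to -1
    return left_half + reverse_complement(left_half)

# One palindromic target per triplet: 4 x 4 x 4 = 64 targets.
targets = {"".join(t): palindromic_target("".join(t))
           for t in product("acgt", repeat=3)}

# Number of possible residue combinations at the three randomized positions.
library_size = len(LIBRARY_RESIDUES) ** 3
```

Under this construction, the gtc entry is SEQ ID NO: 1 itself, and every target is its own reverse complement.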



FIG. 2 illustrates the screening of the variants. (a) Yeast screening assay principle. A strain harboring the expression vector encoding the variants is mated with a strain harboring a reporter plasmid. In the reporter plasmid, a LacZ reporter gene is interrupted with an insert containing the site of interest, flanked by two direct repeats. Upon mating, the endonuclease (gray oval) performs a double strand break on the site of interest, allowing restoration of a functional LacZ (white oval) gene by single strand annealing (SSA) between the two flanking direct repeats. (b) Scheme of the experiment. A library of I-CreI variants is built using PCR, cloned into a replicative yeast expression vector and transformed in S. cerevisiae strain FYC2-6A (MATα; trp1Δ63, leu2Δ1, his3Δ200). The 64 palindromic targets are cloned in the LacZ-based yeast reporter vector, and the resulting clones transformed into strain FYBL2-7B (MATa, ura3Δ851, trp1Δ63, leu2Δ1, lys2Δ202). Robot-assisted gridding on filter membrane is used to perform mating between individual clones expressing meganuclease variants and individual clones harboring a reporter plasmid. After primary high throughput screening, the ORFs of positive clones are amplified by PCR and sequenced. 410 different variants were identified among the 2100 positives, and tested at low density, to establish complete patterns, and 350 clones were validated. Also, 294 mutants were recloned in yeast vectors, and tested in a secondary screen, and results confirmed those obtained without recloning. Chosen clones are then assayed for cleavage activity in a similar CHO-based assay and eventually in vitro.



FIG. 3 illustrates the cleavage patterns of the variants. Mutants are identified by three letters, corresponding to the residues in positions 44, 68 and 70. Each mutant is tested versus the 64 targets derived from the I-CreI natural targets, and a series of control targets. Target map is indicated in the top right panel. (a) Cleavage patterns in yeast (left) and mammalian cells (right) for the I-CreI protein, and 8 derivatives. For yeast, the initial raw data (filter) is shown. For CHO cells, quantitative raw data (ONPG measurement) are shown; values greater than 0.25 are boxed, values greater than 0.5 are highlighted in medium grey, values greater than 1 in dark grey. LacZ: positive control. 0: no target. U1, U2 and U3: three different uncleaved controls. (b) Cleavage in vitro. I-CreI and four mutants are tested against a set of 2 or 4 targets, including the target resulting in the strongest signal in yeast and CHO. Digests are performed at 37° C. for 1 hour, with 2 nM linearized substrate, as described in Methods. Raw data are shown for I-CreI with two different targets. With both GGG and CCT, cleavage is not detected with I-CreI.



FIG. 4 represents the statistical analysis. (a) Cleaved targets: targets cleaved by I-CreI variants are colored in grey. The number of proteins cleaving each target is shown below, and the level of grey coloration is proportional to the average signal intensity obtained with these cutters in yeast. (b) Hierarchical clustering of mutant and target data in yeast. Both mutants and targets were clustered using hierarchical clustering with Euclidean distance and Ward's method (Ward, J. H., American statist. Assoc., 1963, 58, 236-244). Clustering was done with hclust from the R package. Mutant and target dendrograms were reordered to optimize positions of the clusters and the mutant dendrogram was cut at a height of 8 to deduce the clusters. QRR mutant and GTC target are indicated by an arrow. Gray levels reflect the intensity of the signal. (c) Analysis of 3 out of the 7 clusters. For each mutant cluster (clusters 1, 3 and 7), the cumulated intensities for each target were computed and a bar plot (left column) shows the normalized intensities in decreasing order. For each cluster, the number of amino acids of each type at each position (44, 68 and 70) is shown as a coded histogram in the right column. The legend of the amino-acid color code is at the bottom of the figure.



FIG. 5 illustrates an example of hybrid or chimeric site: gtt (SEQ ID NO: 17) and cct (SEQ ID NO: 9) are two palindromic sites derived from the I-CreI site. The gtt/cct hybrid site (SEQ ID NO: 18) displays the gtt sequence on the top strand in −5, −4, −3 and the cct sequence on the bottom strand in 5, 4, 3.



FIG. 6 illustrates the cleavage activity of the heterodimeric variants. Yeast were co-transformed with the KTG and QAN variants. Target organization is shown on the top panel: targets with a single gtt, cct or gcc half site are in bold; targets with two such half sites, which are expected to be cleaved by homo- and/or heterodimers, are in bold and highlighted in grey; 0: no target. Results are shown on the three panels below. Unexpected faint signals are observed only for gtc/cct and gtt/gtc, cleaved by KTG and QAN, respectively.



FIG. 7 represents the quantitative analysis of the cleavage activity of the heterodimeric variants. (a) Co-transformation of selected mutants in yeast. For clarity, only results on relevant hybrid targets are shown. The aac/acc target is always shown as an example of unrelated target. For the KTGxAGR couple, the palindromic tac and tct targets, although not shown, are cleaved by AGR and KTG, respectively. Cleavage of the cat target by the RRN mutant is very low, and could not be quantified in yeast. (b) Transient co-transfection in CHO cells. For (a) and (b), Black bars: signal for the first mutant alone; grey bars: signal for the second mutant alone; striped bars: signal obtained by co-expression or cotransfection.



FIG. 8 illustrates the activity of the assembled heterodimer ARS-KRE on the selected mouse chromosome 17 DNA target. CHO-K1 cells were co-transfected with equimolar amounts of target LagoZ plasmid, ARS and KRE expression plasmids, and the beta-galactosidase activity was measured. Cells co-transfected with the LagoZ plasmid and the I-SceI, I-CreI, ARS or KRE recombinant plasmid or an empty plasmid were used as controls.





EXAMPLE 1
Screening for New Functional Endonucleases

The method for producing meganuclease variants and the assays based on cleavage-induced recombination in mammal or yeast cells, which are used for screening variants with altered specificity, are described in the International PCT Application WO 2004/067736. These assays result in a functional LacZ reporter gene which can be monitored by standard methods (FIG. 2a).


A) Material and Methods
a) Construction of Mutant Libraries

I-CreI wt and I-CreI D75N open reading frames were synthesized, as described previously (Epinat et al., N.A.R., 2003, 31, 2952-2962). Mutation D75N was introduced by replacing codon 75 with AAC. The diversity of the meganuclease library was generated by PCR using degenerate primers from Sigma harboring codon VVK (18 codons, amino acids ADEGHKNPQRST) at positions 44, 68 and 70, which interact directly with the bases at positions 3 to 5, and the I-CreI gene as DNA template. The final PCR product was digested with specific restriction enzymes, and cloned back into the I-CreI ORF digested with the same restriction enzymes, in pCLS542. In this 2 micron-based replicative vector marked with the LEU2 gene, I-CreI variants are under the control of a galactose inducible promoter (Epinat et al., precited). After electroporation in E. coli, 7×10⁴ clones were obtained, representing 12 times the theoretical diversity at the DNA level (18³ = 5832). DNA was extracted and transformed into S. cerevisiae strain FYC2-6A (MATα, trp1Δ63, leu2Δ1, his3Δ200). 13824 colonies were picked using a colony picker (QpixII, GENETIX), and grown in 144 microtiter plates.
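The library arithmetic above can be checked with a short sketch. It enumerates the degenerate VVK codon (V = A, C or G; K = G or T, per the IUPAC nucleotide code), translates it with the standard genetic code, and derives the theoretical diversity for three randomized positions. The variable names are illustrative only and are not part of the described method.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC degenerate symbols used in the VVK codon.
IUPAC = {"V": "ACG", "K": "GT"}

# All concrete codons encoded by VVK and the amino acids they specify.
codons = ["".join(c) for c in product(*(IUPAC[s] for s in "VVK"))]
amino_acids = "".join(sorted({CODON_TABLE[c] for c in codons}))

# Three randomized positions (44, 68 and 70).
dna_diversity = len(codons) ** 3
protein_diversity = len(amino_acids) ** 3

# Coverage of the DNA-level diversity by the E. coli and yeast clone sets.
ecoli_coverage = 7e4 / dna_diversity
yeast_coverage = 13824 / dna_diversity

print(len(codons), amino_acids, dna_diversity, protein_diversity)
print(round(ecoli_coverage, 1), round(yeast_coverage, 2))
```

The VVK codon excludes stop codons and hydrophobic or aromatic residues, which is why only 12 amino acids appear at each randomized position.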


b) Construction of Target Clones

The C1221 twenty-four bp palindrome (tcaaaacgtcgtacgacgttttga, SEQ ID NO: 1) is a repeat of the half-site of the nearly palindromic natural I-CreI target (tcaaaacgtcgtgagacagtttgg, SEQ ID NO: 3). C1221 is cleaved as efficiently as the I-CreI natural target in vitro and ex vivo in both yeast and mammalian cells. The 64 palindromic targets were derived as follows: 64 pairs of oligonucleotides (ggcatacaagtttcaaaacnnngtacnnngttttgacaatcgtctgtca (SEQ ID NO: 4) and reverse complementary sequences) were ordered from Sigma, annealed and cloned into pGEM-T Easy (PROMEGA). Next, a 400 bp PvuII fragment was excised and cloned into the yeast vector pFL39-ADH-LACURAZ, described previously (Epinat et al., precited).
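As an illustration of how the 64 palindromic targets relate to C1221, the following sketch substitutes every possible triplet at positions −5 to −3 and places its reverse complement at positions +3 to +5. It assumes that the second nnn in the oligonucleotide template is the reverse complement of the first, which is what the palindromic design implies; the function and constant names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("acgt", "tgca")

def revcomp(seq):
    """Reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Invariant parts of the 24 bp C1221 palindrome (SEQ ID NO: 1).
LEFT_FLANK = "tcaaaac"
CENTER = "gtac"
RIGHT_FLANK = "gttttga"

def palindromic_target(triplet):
    """Palindromic target carrying `triplet` at positions -5, -4, -3."""
    return LEFT_FLANK + triplet + CENTER + revcomp(triplet) + RIGHT_FLANK

# One target per triplet, keyed by the triplet used to name it (e.g. "gtc").
targets = {
    "".join(t): palindromic_target("".join(t))
    for t in product("acgt", repeat=3)
}

print(len(targets))
print(targets["gtc"])  # the C1221 sequence itself
```

Each generated sequence equals its own reverse complement, so a homodimeric variant sees the same half-site on both sides of the target.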


c) Mating of Meganuclease Expressing Clones and Screening in Yeast

Mating was performed using a colony gridder (QpixII, GENETIX). Mutants were gridded on nylon filters covering YPD plates, using a high gridding density (about 20 spots/cm2). A second gridding process was performed on the same filters to spot a second layer consisting of 64 or 75 different reporter-harboring yeast strains for each variant. Membranes were placed on solid agar YPD rich medium, and incubated at 30° C. for one night, to allow mating. Next, filters were transferred to synthetic medium, lacking leucine and tryptophan, with galactose (1%) as a carbon source (and with G418 for coexpression experiments), and incubated for five days at 37° C., to select for diploids carrying the expression and target vectors. After 5 days, filters were placed on solid agarose medium with 0.02% X-Gal in 0.5 M sodium phosphate buffer, pH 7.0, 0.1% SDS, 6% dimethyl formamide (DMF), 7 mM β-mercaptoethanol, 1% agarose, and incubated at 37° C., to monitor β-galactosidase activity. Results were analyzed by scanning and quantification was performed using proprietary software.


d) Sequence and Re-Cloning of Primary Hits

The open reading frame (ORF) of positive clones identified during the primary screening in yeast was amplified by PCR and sequenced. Then, ORFs were recloned using the Gateway protocol (Invitrogen). ORFs were amplified by PCR on yeast colonies (Akada et al., Biotechniques, 28, 668-670, 672-674), using primers: ggggacaagtttgtacaaaaaagcaggcttcgaaggagatagaaccatggccaataccaaatataacaaagagttcc (SEQ ID NO: 5) and ggggaccactttgtacaagaaagctgggtttaagtcggccgccggggaggatttcttctttctcgc (SEQ ID NO: 6) from PROLIGO. PCR products were cloned in: (i) yeast gateway expression vector harboring a galactose inducible promoter, LEU2 or KanR as selectable marker and a 2 micron origin of replication, and (ii) a pET 24d(+) vector from NOVAGEN. Resulting clones were verified by sequencing (MILLEGEN).


B) Results

I-CreI is a dimeric homing endonuclease that cleaves a 22 bp pseudo-palindromic target. Analysis of I-CreI structure bound to its natural target has shown that in each monomer, eight residues establish direct interactions with seven bases (Jurica et al., 1998, precited). Residues Q44, R68, R70 contact three consecutive base pairs at position 3 to 5 (and −3 to −5, FIG. 1). An exhaustive protein library vs. target library approach was undertaken to engineer locally this part of the DNA binding interface. First, the I-CreI scaffold was mutated from D75 to N to decrease likely energetic strains caused by the replacement of the basic residues R68 and R70 in the library that satisfy the hydrogen-acceptor potential of the buried D75 in the I-CreI structure. The D75N mutation did not affect the protein structure, but decreased the toxicity of I-CreI in overexpression experiments. Next, positions 44, 68 and 70 were randomized and 64 palindromic targets resulting from substitutions in positions ±3, ±4 and ±5 of a palindromic target cleaved by I-CreI (Chevalier et al., 2003, precited) were generated, as described in FIG. 1.


A robot-assisted mating protocol was used to screen a large number of meganucleases from our library. The general screening strategy is described in FIG. 2b. 13,824 meganuclease expressing clones (about 2.3-fold the theoretical diversity) were spotted at high density (20 spots/cm2) on nylon filters and individually tested against each one of the 64 target strains (884,608 spots). 2100 clones showing an activity against at least one target were isolated (FIG. 2b) and the ORF encoding the meganuclease was amplified by PCR and sequenced. 410 different sequences were identified and a similar number of corresponding clones were chosen for further analysis. The spotting density was reduced to 4 spots/cm2 and each clone was tested against the 64 reporter strains in quadruplicate, thereby creating complete profiles (as in FIG. 3a). 350 positives could be confirmed. Next, to avoid the possibility of strains containing more than one clone, mutant ORFs were amplified by PCR, and recloned in the yeast vector. The resulting plasmids were individually transformed back into yeast. 294 such clones were obtained and tested at low density (4 spots/cm2). Differences with primary screening were observed mostly for weak signals, with 28 weak cleavers appearing now as negatives. Only one positive clone displayed a pattern different from what was observed in the primary profiling.


The 350 validated clones showed very diverse patterns. Some of these new profiles shared some similarity with the wild type scaffold whereas many others were totally different. Various examples are shown in FIG. 3a. Homing endonucleases can usually accommodate some degeneracy in their target sequences, and one of our first findings was that the original I-CreI protein itself cleaves seven different targets in yeast. Many of our mutants followed this rule as well, with the number of cleaved sequences ranging from 1 to 21 with an average of 5.0 sequences cleaved (standard deviation=3.6). Interestingly, in 50 mutants (14%), specificity was altered so that they cleaved exactly one target. 37 (11%) cleaved 2 targets, 61 (17%) cleaved 3 targets and 58 (17%) cleaved 4 targets. For 5 targets and above, percentages were lower than 10%. Altogether, 38 targets were cleaved by the mutants (FIG. 4a). It is noteworthy that cleavage was barely observed on targets with an A in position ±3, and never with targets with TGN and CGN at positions ±5, ±4, ±3.


EXAMPLE 2
Novel Meganucleases can Cleave Novel Targets while Keeping High Activity and Narrow Specificity
A) Material and Methods
a) Construction of Target Clones

The 64 palindromic targets were cloned into pGEM-T Easy (PROMEGA), as described in example 1. Next, a 400 bp PvuII fragment was excised and cloned into the mammalian vector pcDNA3.1-LACURAZ-ΔURA, described previously (Epinat et al., precited). The 75 hybrid target sequences were cloned as follows: oligonucleotides were designed that contained two different half sites of each mutant palindrome (PROLIGO).


b) Re-Cloning of Primary Hits

The open reading frame (ORF) of positive clones identified during the primary screening in yeast was recloned in: (i) a CHO gateway expression vector pCDNA6.2, following the instructions of the supplier (INVITROGEN), and (ii) a pET 24d(+) vector from NOVAGEN. Resulting clones were verified by sequencing (MILLEGEN).


c) Mammalian Cells Assay

CHO-K1 cell line from the American Type Culture Collection (ATCC) was cultured in Ham's F12K medium supplemented with 10% Fetal Bovine Serum. For transient Single Strand Annealing (SSA) assays, cells were seeded in 12 well-plates at 13×10³ cells per well one day prior to transfection. Cotransfection was carried out the following day with 400 ng of DNA using the EFFECTENE transfection kit (QIAGEN). Equimolar amounts of target LagoZ plasmid and expression plasmid were used. The next day, medium was replaced and cells were incubated for another 72 hours. CHO-K1 cell monolayers were washed once with PBS. The cells were then lysed with 150 μl of lysis/revelation buffer added for β-galactosidase liquid assay (100 ml of lysis buffer (Tris-HCl 10 mM pH7.5, NaCl 150 mM, Triton X100 0.1%, BSA 0.1 mg/ml, protease inhibitors) and 900 ml of revelation buffer (10 ml of Mg 100× buffer (MgCl2 100 mM, β-mercaptoethanol 35%), 110 ml ONPG (8 mg/ml) and 780 ml of sodium phosphate 0.1 M pH7.5)), for 30 minutes on ice. Beta-galactosidase activity was assayed by measuring optical density at 415 nm. The entire process was performed on an automated Velocity 11 BioCel platform. The beta-galactosidase activity is calculated as relative units normalized for protein concentration, incubation time and transfection efficiency.


d) Protein Expression and Purification

His-tagged proteins were over-expressed in E. coli BL21 (DE3)pLysS cells using pET-24d (+) vectors (NOVAGEN). Induction with IPTG (0.3 mM) was performed at 25° C. Cells were sonicated in a solution of 50 mM Sodium Phosphate (pH 8), 300 mM sodium chloride containing protease inhibitors (Complete EDTA-free tablets, Roche) and 5% (v/v) glycerol. Cell lysates were centrifuged at 100000 g for 60 min. His-tagged proteins were then affinity-purified, using 5 ml Hi-Trap chelating HP columns (Amersham Biosciences) loaded with cobalt. Several fractions were collected during elution with a linear gradient of imidazole (up to 0.25M imidazole, followed by plateau at 0.5 M imidazole, 0.3 M NaCl and 50 mM Sodium Phosphate pH 8). Protein-rich fractions (determined by SDS-PAGE) were applied to the second column. The crude purified samples were taken to pH 6 and applied to a 5 ml HiTrap Heparin HP column (Amersham Biosciences) equilibrated with 20 mM Sodium Phosphate pH 6.0. Bound proteins were eluted with a sodium chloride continuous gradient with 20 mM sodium phosphate and 1M sodium chloride. The purified fractions were submitted to SDS-PAGE and concentrated (10 kDa cut-off centriprep Amicon Ultra system), frozen in liquid nitrogen and stored at −80° C. Purified proteins were desalted using PD10 columns (Sephadex G-25M, Amersham Biosciences) in PBS or 10 mM Tris-HCl (pH 8) buffer.


e) In Vitro Cleavage Assays

pGEM plasmids with single meganuclease DNA target cut sites were first linearized with XmnI. Cleavage assays were performed at 37° C. in 10 mM Tris-HCl (pH 8), 50 mM NaCl, 10 mM MgCl2, 1 mM DTT and 50 μg/ml BSA. 2 nM was used as target substrate concentration. A dilution range between 0 and 85 nM was used for each protein, in 25 μl final volume reaction. Reactions were stopped after 1 hour by addition of 5 μl of 45% glycerol, 95 mM EDTA (pH 8), 1.5% (w/v) SDS, 1.5 mg/ml proteinase K and 0.048% (w/v) bromophenol blue (6× Buffer Stop) and incubated at 37° C. for 30 minutes. Digests were run on agarose electrophoresis gels, and fragments were quantified after ethidium bromide staining, to calculate the percentage of cleavage.


B) Results

Eight representative mutants (belonging to 6 different clusters, see below) were chosen for further characterization (FIG. 3). First, data in yeast were confirmed in mammalian cells, by using an assay based on the transient cotransfection of a meganuclease expressing vector and a target vector, as described in a previous report. The 8 mutant ORFs and the 64 targets were cloned into appropriate vectors, and a robot-assisted microtiter-based protocol was used to co-transfect in CHO cells each selected variant with each one of the 64 different reporter plasmids. Meganuclease-induced recombination was measured by a standard, quantitative ONPG assay that monitors the restoration of a functional β-galactosidase gene. Profiles were found to be qualitatively and quantitatively reproducible in five independent experiments. As shown in FIG. 3a, strong and medium signals were nearly always observed with both yeast and CHO cells (with the exception of ADK), thereby validating the relevance of the yeast HTS process. However, weak signals observed in yeast were often not detected in CHO cells, likely due to a difference in the detection level (see QRR and targets gtg, gct, and ttc). Four mutants were also produced in E. coli and purified by metal affinity chromatography. Their relative in vitro cleavage efficiencies against the wild-type site and their cognate sites were determined. The extent of cleavage under standardized conditions was assessed across a broad range of concentrations for the mutants (FIG. 3b). Similarly, the activity of I-CreI wt on these targets was analysed. In many cases, 100% cleavage of the substrate could not be achieved, likely reflecting the fact that these proteins may have little or no turnover (Perrin et al., EMBO J., 1993, 12, 2939-2947; Wang et al., Nucleic Acids Res., 1997, 25, 3767-3776). In general, the in vitro assay confirmed the data obtained in yeast and CHO cells, but surprisingly, the gtt target was efficiently cleaved by I-CreI.


Second, specificity shifts were obvious from the profiles obtained in yeast and CHO: the I-CreI favorite gtc target was not cleaved or barely cleaved, while signals were observed with new targets. This switch of specificity was confirmed for QAN, DRK, RAT and KTG by in vitro analysis, as shown in FIG. 3b. In addition, these four mutants, which display various levels of activity in yeast and CHO (FIG. 3a), were shown to cleave 17-60% of their favorite target in vitro (FIG. 3b), with similar kinetics to I-CreI (half of maximal cleavage by 13-25 nM). Thus, activity was largely preserved by engineering. Third, the number of cleaved targets varied among the mutants: strong cleavers such as QRR, QAN, ARL and KTG have a spectrum of cleavage in the range of what is observed with I-CreI (5-8 detectable signals in yeast, 3-6 in CHO). Specificity is more difficult to compare with mutants that cleave weakly. For example, a single weak signal is observed with DRK but might represent the only detectable signal resulting from the attenuation of a more complex pattern. Nevertheless, the behavior of variants that cleave strongly shows that engineering preserves a very narrow specificity.


EXAMPLE 3
Hierarchical Clustering Defines Seven I-CreI Variant Families
A) Material and Methods

Clustering was done using hclust from the R package. We used quantitative data from the primary, low density screening. Both variants and targets were clustered using standard hierarchical clustering with Euclidean distance and Ward's method (Ward, J. H., American Stat. Assoc., 1963, 58, 236-244). Mutant and target dendrograms were reordered to optimize positions of the clusters and the mutant dendrogram was cut at a height of 8 to define the clusters.
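The clustering itself was done with hclust in R. Purely as an illustration of the technique, the following pure-Python sketch performs agglomerative clustering with Ward's criterion on Euclidean distances (through the Lance-Williams update on squared distances) and cuts the resulting tree at a chosen height. The toy cleavage profiles are invented, and merge heights here are on the scale of the square root of Ward's criterion, which may differ from the scale used by the R implementation; the cut height of 8 is therefore only illustrative in this sketch.

```python
import math

def ward_linkage(points):
    """Ward agglomerative clustering; returns merges as (members_a, members_b, height)."""
    clusters = {i: [i] for i in range(len(points))}
    # Squared Euclidean distances between current clusters, keyed by (small_id, large_id).
    d2 = {
        (i, j): sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
        for i in clusters for j in clusters if i < j
    }
    merges, next_id = [], len(points)
    while len(clusters) > 1:
        (i, j), dij = min(d2.items(), key=lambda kv: kv[1])
        ni, nj = len(clusters[i]), len(clusters[j])
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            nk = len(clusters[k])
            dik = d2[(min(i, k), max(i, k))]
            djk = d2[(min(j, k), max(j, k))]
            # Lance-Williams update for Ward's method on squared distances.
            updated[k] = ((ni + nk) * dik + (nj + nk) * djk - nk * dij) / (ni + nj + nk)
        merges.append((clusters[i], clusters[j], math.sqrt(dij)))
        d2 = {p: v for p, v in d2.items() if i not in p and j not in p}
        merged = clusters.pop(i) + clusters.pop(j)
        for k, v in updated.items():
            d2[(k, next_id)] = v  # next_id is larger than every existing id
        clusters[next_id] = merged
        next_id += 1
    return merges

def cut_tree(n_items, merges, height):
    """Group item indices by applying only the merges at or below `height`."""
    label = list(range(n_items))
    for members_a, members_b, h in merges:
        if h > height:
            break
        for idx in members_b:
            label[idx] = label[members_a[0]]
    groups = {}
    for idx, lab in enumerate(label):
        groups.setdefault(lab, []).append(idx)
    return sorted(groups.values())

# Hypothetical cleavage profiles: one row per mutant, one column per target.
profiles = [[9, 1, 0], [8, 2, 0], [0, 1, 9], [0, 2, 8]]
merges = ward_linkage(profiles)
print(cut_tree(len(profiles), merges, height=8))
```

With real screening data, each row would hold the quantified signals of one variant against the 64 targets, and the same procedure applied to the transposed matrix would cluster the targets.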


B) Results

Next, hierarchical clustering was used to determine whether families could be identified among the numerous and diverse cleavage patterns of the variants. Since primary and secondary screening gave congruent results, quantitative data from the first round of yeast low density screening was used for analysis, to permit a larger sample size. Both variants and targets were clustered using standard hierarchical clustering with Euclidean distance and Ward's method (Ward, J. H., precited) and seven clusters were defined (FIG. 4b). Detailed analysis is shown for 3 of them (FIG. 4c) and the results are summarized in Table I.









TABLE I
Cluster Analysis

| Cluster | Examples (FIG. 3a) | Three preferred targets: sequence (% cleavage)¹ | Σ of the three (%) | Nucleotide in position 4 (%)¹ | Preferred amino acid² at position 44 | at position 68 | at position 70 |
|---|---|---|---|---|---|---|---|
| 1 (77 proteins) | QAN | gtt (46.2), gtc (18.3), gtg (13.6) | 78.1 | g 0.5, a 2.0, t 82.4, c 15.1 | Q, 80.5% (62/77) | | |
| 2 (8 proteins) | QRR | gtt (13.4), gtc (11.8), tct (11.4) | 36.6 | g 0, a 4.9, t 56.9, c 38.2 | Q, 100.0% (8/8) | R, 100.0% (8/8) | |
| 3 (65 proteins) | ARL | gat (27.9), tat (23.2), gag (15.7) | 66.8 | g 2.4, a 88.9, t 5.7, c 3.0 | A, 63.0% (41/65) | R, 33.8% (22/65) | |
| 4 (31 proteins) | AGR | gac (22.7), tac (14.5), gat (13.4) | 50.6 | g 0.3, a 91.9, t 6.6, c 1.2 | A & N, 51.6% & 35.4% (16 & 11/31) | R, 48.4% (15/31) | R, 67.7% (21/31) |
| 5 (81 proteins) | ADK, DRK | gat (29.2), tat (15.4), gac (11.4) | 56.0 | g 1.6, a 73.8, t 13.4, c 11.2 | | | |
| 6 (51 proteins) | KTG, RAT | cct (30.1), tct (19.6), tcc (13.9) | 63.6 | g 0, a 4.0, t 6.3, c 89.7 | K, 62.7% (32/51) | | |
| 7 (37 proteins) | | cct (20.8), tct (19.6), tcc (15.3) | 55.7 | g 0, a 0.2, t 14.4, c 85.4 | K, 91.9% (34/37) | | |

¹ Frequencies according to the cleavage index, as described in FIG. 4c.
² In each position, residues present in more than ⅓ of the cluster are indicated.







For each cluster, a set of preferred targets could be identified on the basis of the frequency and intensity of the signal (FIG. 4c). The three preferred targets for each cluster are indicated in Table I, with their cleavage frequencies. The sum of these frequencies is a measurement of the specificity of the cluster. For example, in cluster 1, the three preferred targets (gtt/c/g) account for 78.1% of the observed cleavage, with 46.2% for gtt alone, revealing a very narrow specificity. Indeed, this cluster includes several proteins which, like QAN, cleave mostly gtt (FIG. 3a). In contrast, the three preferred targets in cluster 2 represent only 36.6% of all observed signals. In accordance with the relatively broad and diverse patterns observed in this cluster, QRR cleaves 5 targets (FIG. 3a), while the activity of other cluster members is not restricted to these 5 targets.
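The specificity measure described above is a simple sum, sketched here for clusters 1 and 2 using the cleavage frequencies reported in Table I. The dictionary layout and the function name are illustrative only.

```python
# Cleavage frequencies (%) of the three preferred targets, from Table I.
top3_frequencies = {
    1: {"gtt": 46.2, "gtc": 18.3, "gtg": 13.6},
    2: {"gtt": 13.4, "gtc": 11.8, "tct": 11.4},
}

def cluster_specificity(frequencies):
    """Share of observed cleavage (%) accounted for by the three preferred targets."""
    return round(sum(frequencies.values()), 1)

for cluster, freqs in top3_frequencies.items():
    print(cluster, cluster_specificity(freqs))
```

A value close to 100 indicates that almost all cleavage in the cluster is concentrated on three targets, whereas a low value reflects broad and diverse patterns.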


Analysis of the residues found in each cluster showed strong biases for position 44: Q is overwhelmingly represented in clusters 1 and 2, whereas A and N are more frequent in clusters 3 and 4, and K in clusters 6 and 7. Meanwhile, these biases were correlated with strong base preferences for DNA positions ±4, with a large majority of t:a base pairs in clusters 1 and 2, a:t in clusters 3, 4 and 5, and c:g in clusters 6 and 7 (see Table I). The structure of I-CreI bound to its target shows that residue Q44 interacts with the bottom strand in position −4 (and the top strand of position +4, see FIGS. 1b and 1c). These results suggest that this interaction is largely conserved in our mutants, and reveal a “code”, wherein Q44 would establish contact with adenine, A44 (or less frequently N44) with thymine, and K44 with guanine. Such correlation was not observed for positions 68 and 70.
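The correlation described above can be written as a small lookup, sketched below. It maps the top-strand base at position −4 (the middle base of the triplet naming a target) to the residue(s) at position 44 that the clusters favor; since the contact is made with the bottom strand, top-strand t, a and c correspond to bottom-strand adenine, thymine and guanine. This is a statistical tendency observed in the clusters, not a deterministic rule, and the names used are hypothetical.

```python
# Top-strand base at position -4 -> residue(s) favored at position 44.
# The contacted base on the bottom strand is the complement of the key.
RESIDUE_44_CODE = {
    "t": ("Q",),      # bottom-strand adenine
    "a": ("A", "N"),  # bottom-strand thymine (N less frequent)
    "c": ("K",),      # bottom-strand guanine
}

def favored_residue_44(triplet):
    """Residues favored at position 44 for a target named by its -5,-4,-3 triplet."""
    return RESIDUE_44_CODE.get(triplet[1].lower(), ())

for target in ("gtt", "gat", "cct", "ggg"):
    print(target, favored_residue_44(target))
```

No entry is given for a top-strand g at position −4, because the section reports no corresponding preference.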


EXAMPLE 4
Variants can be Assembled in Functional Heterodimers to Cleave New DNA Target Sequences
A) Materials and Methods

The 75 hybrid target sequences were cloned as follows: oligonucleotides were designed that contained two different half sites of each mutant palindrome (PROLIGO). Double-stranded target DNA, generated by PCR amplification of the single stranded oligonucleotides, was cloned using the Gateway protocol (INVITROGEN) into yeast and mammalian reporter vectors. Yeast reporter vectors were transformed into S. cerevisiae strain FYBL2-7B (MATa, ura3Δ851, trp1Δ63, leu2Δ1, lys2Δ202).
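A hybrid target can be sketched as the combination of two half sites, following the description of FIG. 5: the first triplet is read on the top strand at positions −5 to −3 and the second on the bottom strand at positions 5 to 3. The sketch assumes the flanks and the gtac center of the C1221-derived palindromes; the helper names are hypothetical, and the output is not claimed to reproduce any specific SEQ ID NO.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")

def revcomp(seq):
    """Reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Invariant parts assumed from the C1221-derived palindromic targets.
LEFT_FLANK = "tcaaaac"
CENTER = "gtac"
RIGHT_FLANK = "gttttga"

def hybrid_target(top_triplet, bottom_triplet):
    """Top strand of a hybrid site named top_triplet/bottom_triplet."""
    return LEFT_FLANK + top_triplet + CENTER + revcomp(bottom_triplet) + RIGHT_FLANK

site = hybrid_target("gtt", "cct")
print(site)                 # top strand, gtt at positions -5 to -3
print(revcomp(site)[7:10])  # bottom strand read 5'->3' shows cct
```

When both triplets are identical the function returns the corresponding palindromic target, so palindromes are a special case of the same construction.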


B) Results

Variants are homodimers capable of cleaving palindromic sites. To test whether the list of cleavable targets could be extended by creating heterodimers that would cleave hybrid cleavage sites (as described in FIG. 5), a subset of I-CreI variants with distinct profiles was chosen and cloned in two different yeast vectors marked by LEU2 or KAN genes. Combinations of mutants having mutations at positions 44, 68 and/or 70 and N at position 75 were then co-expressed in yeast with a set of palindromic and non palindromic chimeric DNA targets. An example is shown in FIG. 6: co-expression of the K44, T68, G70, N75 (KTG) and Q44, A68, N70, N75 (QAN) mutants resulted in the cleavage of two chimeric targets, gtt/gcc and gtt/cct, that were not cleaved by either mutant alone. The palindromic gtt, cct and gcc targets (and other targets of KTG and QAN) were also cleaved, likely resulting from homodimeric species formation, but unrelated targets were not. In addition, a gtt, cct or gcc half-site was not sufficient to allow cleavage, since such targets were fully resistant (see ggg/gcc, gat/gcc, gcc/tac, and many others, in FIG. 6). Unexpected cleavage was observed only with gtc/cct and gtt/gtc, with KTG and QAN homodimers, respectively, but signal remained very weak. Thus, efficient cleavage requires the cooperative binding of two mutant monomers. These results demonstrate a good level of specificity for heterodimeric species.
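The reasoning above can be summarized as a simple predictive model, sketched here under stated assumptions: each monomer recognizes a set of half sites, a target x/y is cleaved only when both half sites are recognized by the two monomers of some dimer, and co-expression yields both homodimers and the heterodimer. The half-site sets are simplified (gtt for QAN; cct and gcc for KTG, the assignment of gcc to KTG being an inference from the chimeric targets reported), and the model ignores signal strength, so some predicted homodimer targets may not have been observed.

```python
from itertools import combinations_with_replacement

def predict_cleaved(sites_a, sites_b):
    """Split predicted targets into homodimer-cleavable and heterodimer-only sets.

    Targets are unordered pairs of half sites, since x/y and y/x are the same
    site read on opposite strands.
    """
    homo, hetero_only = set(), set()
    for x, y in combinations_with_replacement(sorted(sites_a | sites_b), 2):
        by_aa = x in sites_a and y in sites_a
        by_bb = x in sites_b and y in sites_b
        by_ab = (x in sites_a and y in sites_b) or (x in sites_b and y in sites_a)
        if by_aa or by_bb:
            homo.add((x, y))
        elif by_ab:
            hetero_only.add((x, y))
    return homo, hetero_only

# Simplified half-site sets (assumptions, see lead-in).
QAN_SITES = {"gtt"}
KTG_SITES = {"cct", "gcc"}

homo, hetero_only = predict_cleaved(QAN_SITES, KTG_SITES)
print(sorted(hetero_only))  # chimeric targets requiring the heterodimer
print(sorted(homo))         # targets expected from homodimeric species
```

A target carrying only one recognized half site, such as ggg/gcc, never appears in either set, which mirrors the observation that a single matching half site is not sufficient for cleavage.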


Altogether, a total of 112 combinations of 14 different proteins were tested in yeast, and 37.5% of the combinations (42/112) revealed a positive signal on their predicted chimeric target. Quantitative data are shown for six examples in FIG. 7a, and for the same six combinations, results were confirmed in CHO cells in transient co-transfection experiments, with a subset of relevant targets (FIG. 7b). As a general rule, functional heterodimers were always obtained when one of the two expressed proteins gave a strong signal as homodimer. For example, DRN and RRN, two low activity mutants, give functional heterodimers with strong cutters such as KTG or QRR (FIGS. 7a and 7b) whereas no cleavage of chimeric targets could be detected by co-expression of the same weak mutants.


EXAMPLE 5
Cleavage of a Natural DNA Target by Assembled Heterodimer
A) Materials and Methods
a) Genome Survey

A natural target potentially cleaved by an I-CreI variant was identified by scanning the public databases for genomic sequences matching the pattern caaaacnnnnnnnnnnnngttttg, wherein n is a, t, c, or g (SEQ ID NO: 2). The natural target DNA sequence caaaactatgtagagggttttg (SEQ ID NO: 7) was identified in mouse chromosome 17.


This DNA sequence is potentially cleaved by a combination of two I-CreI variants cleaving the sequences tcaaaactatgtgaatagttttga (SEQ ID NO: 8) and tcaaaaccctgtgaagggttttga (SEQ ID NO: 9), respectively.
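The survey and the choice of the two variants can be sketched as follows. The search pattern keeps the conserved caaaac and gttttg flanks and leaves the central positions free; in this sketch the number of free positions is derived from the example target (SEQ ID NO: 7) rather than hard-coded. Because caaaac and gttttg are reverse complements of each other, scanning one strand is sufficient. The toy genome string and the helper names are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("acgt", "tgca")

def revcomp(seq):
    """Reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

EXAMPLE_TARGET = "caaaactatgtagagggttttg"  # SEQ ID NO: 7
LEFT, RIGHT = "caaaac", "gttttg"
N_FREE = len(EXAMPLE_TARGET) - len(LEFT) - len(RIGHT)

# Lookahead so that overlapping candidate sites are all reported.
SITE_PATTERN = re.compile(rf"(?=({LEFT}[acgt]{{{N_FREE}}}{RIGHT}))")

def find_candidate_sites(sequence):
    """Return (position, site) for every match on the given strand."""
    return [(m.start(), m.group(1)) for m in SITE_PATTERN.finditer(sequence.lower())]

def half_site_triplets(site):
    """Triplets at -5..-3 (top strand) and at 5..3 (bottom strand) of a site."""
    left = site[len(LEFT):len(LEFT) + 3]
    right = revcomp(site[-len(RIGHT) - 3:-len(RIGHT)])
    return left, right

toy_genome = "ggatcc" + EXAMPLE_TARGET + "ttaacc"  # invented flanking sequence
hits = find_candidate_sites(toy_genome)
print(hits)
print(half_site_triplets(hits[0][1]))  # palindromic targets to screen variants against
```

The two triplets returned for the example target name the palindromic sites against which the individual variants are selected before being combined.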


b) Isolation of Meganuclease Variants

Variants were selected by the cleavage-induced recombination assay in yeast, as described in example 1, using the sequence tcaaaactatgtgaatagttttga (SEQ ID NO: 8) or the sequence tcaaaaccctgtgaagggttttga (SEQ ID NO: 9) as targets.


c) Construction of the Target Plasmid

Oligonucleotides were designed that contained two different half sites of each mutant palindrome (PROLIGO). Double-stranded target DNA, generated by PCR amplification of the single stranded oligonucleotides, was cloned using the Gateway protocol (INVITROGEN) into the mammalian reporter vector pcDNA3.1-LACURAZ-ΔURA, described previously (Epinat et al., precited), to generate the target LagoZ plasmid.


d) Construction of Meganuclease Expression Vector

The open reading frames (ORFs) of the clones identified during the screening in yeast were amplified by PCR on yeast colonies and cloned individually in the CHO expression vector pCDNA6.2 (INVITROGEN), as described in example 1. I-CreI variants were expressed under the control of the CMV promoter.


e) Mammalian Cells Assay

CHO-K1 cells were transiently co-transfected with equimolar amounts of target LagoZ plasmid and expression plasmids, and the beta-galactosidase activity was measured as described in examples 2 and 4.


B) Results

A natural DNA target potentially cleaved by I-CreI variants was identified by performing a genome survey of sequences matching the pattern caaaacnnnnnnnnnnnngttttg (SEQ ID NO: 2). A randomly chosen DNA sequence (SEQ ID NO: 7) identified in chromosome 17 of the mouse was cloned into a reporter plasmid. This DNA target was potentially cleaved by a combination of the I-CreI variants A44,R68,S70,N75 (ARS) and K44,R68,E70,N75 (KRE).


The co-expression of these two variants in CHO cells leads to the formation of a functional heterodimeric protein, as shown in FIG. 8. Indeed, when the I-CreI variants were expressed individually, virtually no cleavage activity could be detected on the mouse DNA target, although the KRE protein showed a residual activity. In contrast, when these two variants were co-expressed together with the plasmid carrying the potential target, a strong beta-galactosidase activity could be measured. Altogether, these data revealed that heterodimerization occurred in the CHO cells and that heterodimers were functional.


These data demonstrate that heterodimeric proteins created by assembling homodimeric variants extend the list of naturally occurring DNA target sequences to all the potential hybrid cleavable targets resulting from all possible combinations of the variants.


Moreover, these data demonstrate that it is possible to predict the DNA sequences that can be cleaved by a combination of variants, knowing the DNA targets of the individual homodimers. Furthermore, the nucleotides at positions 1 and 2 (and −1 and −2) of the target can be different from gtac, indicating that they play little role in DNA/protein interaction.

Claims
  • 1-38. (canceled)
  • 39. A recombinant heterodimeric meganuclease comprising two separate polypeptides, wherein each of said polypeptides comprises a LAGLIDADG (SEQ ID NO: 20) Homing Endonuclease core domain and each of said polypeptides is a different LAGLIDADG (SEQ ID NO: 20) Homing Endonuclease I-CreI, at least one of said polypeptides having a lysine in position 44 of I-CreI amino acid sequence according to the amino acid numbering of the I-CreI sequence of SWISSPROT accession number P05725 (SEQ ID NO: 21), and wherein said two separate polypeptides are able to assemble and to cleave a chimeric DNA target sequence selected from the group consisting of:
  • 40. A composition comprising at least one recombinant heterodimeric meganuclease according to claim 39.
  • 41. The composition according to claim 40, further comprising a targeting DNA construct comprising the sequence which repairs the site of interest flanked by sequence sharing homologies with the targeted locus.
  • 42. The recombinant heterodimeric meganuclease of claim 39, wherein one of the polypeptides comprising a LAGLIDADG (SEQ ID NO:20) homing endonuclease core domain, comprises the wild type I-CreI sequence of SEQ ID NO:21.
Priority Claims (2)
Number Date Country Kind
2005/000981 Mar 2005 IB international
2005/003083 Sep 2005 IB international
PCT Information
Filing Document Filing Date Country Kind 371c Date
PCT/IB06/01271 3/15/2006 WO 00 9/17/2007