This disclosure relates to materials and methods for editing a genome (e.g., an animal genome in vivo). The methods and materials can involve using a targeted endonuclease and a donor nucleic acid having a length within a particular range (e.g., from 20 to 95 nucleotides).
The zebrafish (Danio rerio) can be considered a premier teleostean model system. With strong biological and genomic similarities to other vertebrates, this organism is increasingly being used to study human biology and disease using a rich array of available in vivo genetic and molecular tools.
The ability to edit genomes (e.g., to edit genomes precisely) can be considered a bottleneck in life science, particularly for direct in vivo editing within model systems. This disclosure provides a transcription activator-like effector (TALE) nuclease toolbox that enables a new approach to in vivo genome editing using a non-mammalian vertebrate, the zebrafish (D. rerio). As described herein, TALE nucleases and donor nucleic acid can be used to perform homologous recombination successfully in species such as zebrafish. For example, this document demonstrates the ability to introduce genetic changes precisely at a TALE nuclease cut site in vivo using single-stranded DNA oligonucleotides as a donor sequence. Such methods can be used to introduce changes (e.g., small changes) at a TALE nuclease cut site in zebrafish. In some cases, the methods and materials provided herein can be used to introduce loxP-related sequences at two different TALE nuclease cut sites, thereby allowing for conditional allele generation in zebrafish and other model systems.
In addition, as described herein, particular scaffold backbones can be used in a manner that greatly increases the efficacy of artificial custom restriction endonucleases, TALE nucleases. For example, four of five (80%) +63 scaffold TALE nucleases exhibited DNA targeting rates that were higher than other tested TALE nucleases in zebrafish. Also, three of five (60%) +63 scaffold TALE nucleases exhibited bi-allelic conversion in somatic tissues, which can facilitate direct functional genomic analyses in injected animals (such as was previously accomplished using morpholino oligonucleotide knockdowns). This improved efficacy that yields bi-allelic conversion demonstrates that the TALE nucleases provided herein can be used in zebrafish and other systems (e.g., in vitro applications such as single-step modifying approaches for iPSCs or gene therapy approaches).
In general, one aspect of this document features a method for modifying the genetic material of an organism. The method comprises, or consists essentially of, introducing into a cell of the organism: (i) a first nucleic acid encoding a first transcription activator-like effector (TALE) nuclease monomer, (ii) a second nucleic acid encoding a second TALE nuclease monomer, and (iii) a donor nucleic acid, wherein each of the first and second TALE nuclease monomers comprises a plurality of TAL effector repeat sequences and a FokI endonuclease domain, wherein the first TAL effector endonuclease monomer comprises the ability to bind to a first half-site sequence of a target DNA within the cell and comprises the ability to cleave the target DNA when the second TAL effector endonuclease monomer is bound to a second half-site sequence of the target DNA, wherein the target DNA comprises the first half-site sequence and the second half-site sequence separated by a spacer sequence, wherein the first and second half-sites have the same nucleotide sequence or different nucleotide sequences, and wherein the donor nucleic acid is from 20 to 85 (e.g., from 35 to 55) nucleotides in length and comprises a sequence that is heterologous to the target DNA and that is flanked by sequences that are similar or identical to the endogenous target nucleotide sequence. The organism can be a zebrafish embryo. The donor nucleic acid can be a single stranded DNA. The donor nucleic acid can be 40 to 50 nucleotides in length. The first TALE nuclease monomer, the second TALE nuclease monomer, or both the first and second TALE nuclease monomers can have a +63 scaffold backbone as set forth in SEQ ID NO:104.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. Although methods and materials similar or equivalent to those described herein can be used to practice the invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
This document provides methods and materials for genome editing and functional genomic applications. As described herein, TALE nucleases and donor nucleic acid can be used to perform homologous recombination successfully in species such as zebrafish. For example, this document demonstrates the ability to introduce genetic changes precisely at a TALE nuclease cut site in vivo using single-stranded DNA oligonucleotides as a donor sequence. Such methods can be used to introduce changes (e.g., small changes) at a TALE nuclease cut site in zebrafish. In some cases, the methods and materials provided herein can be used to introduce loxP-related sequences at two different TALE nuclease cut sites, thereby allowing for conditional allele generation in zebrafish and other species (e.g., other model systems). In some cases, single-stranded DNA (ssDNA) oligonucleotides (oligos) can be used to add new sequences successfully and precisely at predefined locations in the genome of a species (e.g., a zebrafish). In some cases, such an introduced sequence can be a modified loxP (mloxP) sequence as described elsewhere (Thomson et al., Genesis, 36:162-167, 2003).
Zinc finger nucleases (ZFNs) and TALE nucleases can be effective at introducing locus-specific double-stranded breaks in the zebrafish (Doyon et al., Nature Biotechnol., 26:702-708, 2008; Meng et al., Nature Biotechnol., 26:695-701, 2008; Foley et al., PLoS One, 4:e4348, 2009; Huang et al., Nature Biotechnol., 29:699-700, 2011; and Sander et al., Nature Biotechnol., 29:697-698, 2011), generating an array of small genome insertions or deletions including loss of function alleles. However, the efficacy of previously described custom restriction enzymes can be relatively low and can yield many unperturbed loci.
As described herein, synthetic ssDNA oligonucleotides can be used with a TALE nuclease system for genome editing including the precise introduction of exogenous DNA sequence at a specific locus, such as the addition of loxP sequences for the generation of conditional alleles. Although deployed here in zebrafish, this approach has the potential to be effective for in vivo applications in a wide array of model organisms (e.g., insects, nematodes, frogs, mice, rats, and rabbits).
Transcription activator-like (TAL) effectors are polypeptides of plant pathogenic bacteria that are injected by the pathogen into the plant cell, where they travel to the nucleus and function as transcription factors to turn on specific plant genes. The primary amino acid sequence of a TAL effector dictates the nucleotide sequence to which it binds. Thus, target sites can be predicted for TAL effectors, and TAL effectors also can be engineered and generated for the purpose of binding to particular nucleotide sequences, as described herein.
For TALE nuclease polypeptides, a TAL effector can be fused to a nuclease or a portion of a nuclease, typically a nonspecific cleavage domain from a type II restriction endonuclease such as FokI (Kim et al., Proc. Natl. Acad. Sci. USA, 93:1156-1160, 1996). Other useful endonucleases may include, for example, HhaI, HindIII, NotI, BbvCI, EcoRI, BglI, and AlwI. The fact that some endonucleases (e.g., FokI) function as dimers can be capitalized upon to enhance the target specificity of the TAL effector. For example, in some cases, each FokI monomer can be fused to a TAL effector sequence that recognizes a different DNA target sequence, and only when the two recognition sites are in close proximity do the inactive monomers come together to create a functional enzyme. By requiring DNA binding to activate the nuclease, a highly site-specific restriction enzyme can be created.
Sequence-specific TALE nucleases can be designed to recognize preselected target nucleotide sequences present in a cell. In some cases, a target nucleotide sequence can be scanned for nuclease recognition sites, and a particular nuclease can be selected based on the target sequence. In some cases, a TALE nuclease can be engineered to target a particular cellular sequence. A nucleotide sequence encoding the desired TALE nuclease can be inserted into any suitable expression vector, and can be operably linked to one or more promoters or other expression control sequences.
Examples and further descriptions of TALE nucleases can be found, for example, in U.S. Patent Application Publication No. 2011/0145940, which is incorporated herein by reference in its entirety. In some cases, as described herein, a TALE nuclease can have truncations at the N- and/or C-terminal regions of the TAL portion of the polypeptide, such that it has a shortened scaffold as compared to a wild type TAL polypeptide. An exemplary TALE nuclease with a modified scaffold is the +63 TALE nuclease described herein. It is to be noted that the TAL portion also can include one or more additional variations (e.g., substitutions, deletions, or additions) in combination with such N- and C-terminal scaffold truncations. For example, a TALE nuclease can have N- and C-terminal truncations of the TAL portion in combination with one or more amino acid substitutions (e.g., within the scaffold and/or within the repeat region).
Vectors comprising nucleic acid encoding a TALE nuclease can be introduced into cells by a variety of methods (e.g., injection, direct uptake, projectile bombardment, liposomes, or electroporation). As described in the Examples below, for example, DNA encoding a TALE nuclease can be microinjected into a zebrafish embryo. TALE nucleases can be stably or transiently expressed in cells using expression vectors. Techniques for expression in eukaryotic cells are well known to those in the art.
A donor nucleic acid also can be introduced into a cell, either simultaneously with or separately from the TALE nuclease nucleic acid. A donor nucleotide sequence (e.g., a single-stranded DNA (ssDNA) sequence) can, for example, include a variant sequence having one or more modifications (i.e., substitutions, deletions, insertions, or combinations thereof) with respect to a preselected target nucleotide sequence found endogenously within the genome of a cell to be transformed (also referred to herein as a “modified target nucleotide sequence”). In some cases, the donor nucleic acid can have an overall length that is from about 20 to about 90 nucleotides (e.g., 20 to 40, 20 to 45, 20 to 50, 20 to 55, 20 to 60, 25 to 40, 25 to 45, 25 to 50, 25 to 55, 25 to 60, 30 to 40, 30 to 45, 30 to 50, 30 to 55, 30 to 60, 30 to 65, 35 to 40, 35 to 45, 35 to 50, 35 to 55, 35 to 60, 35 to 65, 40 to 45, 40 to 50, 40 to 55, 40 to 60, 40 to 65, 40 to 70, 45 to 50, 45 to 55, 45 to 60, 45 to 65, 45 to 70, 45 to 74, 50 to 55, 50 to 60, 50 to 65, 50 to 70, or 50 to 75 nucleotides). In some cases, the variant sequence within a donor nucleic acid can be flanked on both sides with sequences that are similar or identical to the endogenous target nucleotide sequence within the cell. 
For example, the flanking sequences can have a length between about 10 and about 45 nucleotides (e.g., 10 to 30, 10 to 35, 10 to 40, 10 to 45, 15 to 30, 15 to 35, 15 to 40, 15 to 45, 20 to 35, 20 to 40, 20 to 45, 25 to 40, or 25 to 45 nucleotides), such that the overall length of the donor sequence is from about 20 to about 90 nucleotides (e.g., 20 to 40, 20 to 45, 20 to 50, 20 to 55, 20 to 60, 25 to 40, 25 to 45, 25 to 50, 25 to 55, 25 to 60, 30 to 40, 30 to 45, 30 to 50, 30 to 55, 30 to 60, 30 to 65, 35 to 40, 35 to 45, 35 to 50, 35 to 55, 35 to 60, 35 to 65, 40 to 45, 40 to 50, 40 to 55, 40 to 60, 40 to 65, 40 to 70, 45 to 50, 45 to 55, 45 to 60, 45 to 65, 45 to 70, 45 to 74, 50 to 55, 50 to 60, 50 to 65, 50 to 70, or 50 to 75 nucleotides). In some cases, homologous recombination can occur between the donor nucleic acid and the endogenous target on both sides of the variant sequence, such that the resulting cell's genome contains the variant sequence within the context of endogenous sequences from, for example, the same gene. A donor nucleotide sequence can be generated to target any suitable sequence within a genome.
Methods for altering the genetic material of an organism can include introducing a TALE nuclease into a cell of the organism, either by introducing a TALE nuclease polypeptide or by introducing a nucleic acid encoding such a TALE nuclease polypeptide. In some cases, a method provided herein can include introducing both a TALE nuclease and a heterologous donor nucleic acid into a cell. The donor nucleotide sequence can include one or more modifications (i.e., substitutions, deletions, insertions, or combinations thereof) with respect to a corresponding, preselected target nucleotide sequence found in the cell. The donor nucleotide sequence can undergo homologous recombination with the endogenous target nucleotide sequence, such that the endogenous sequence or a portion thereof is replaced with the donor sequence or a portion thereof. The target nucleotide sequence typically includes or is adjacent to a recognition site for a sequence-specific TALE nuclease. In some cases, a target nucleotide sequence can include recognition sites for two or more distinct TALE nucleases (e.g., two opposed target sequences that are distinct, such that TALE nucleases having distinct DNA sequence binding specificity can be used). In such cases, the specificity of DNA cleavage can be increased as compared to cases in which only one target sequence (or multiple copies of the same target sequence) is used. In some cases, the donor nucleotide sequence and the nucleotide sequence encoding the TALE nuclease can be contained in the same nucleic acid construct. In some cases, the donor nucleotide sequence and the TALE nuclease coding sequence can be contained in separate constructs, or the TALE nuclease polypeptide can be produced and introduced directly into a cell.
Any appropriate TALE nuclease can be used as described herein. In some cases, a TALE nuclease having a scaffold based on or including SEQ ID NO:104 or SEQ ID NO:105 can be used (see, e.g.,
The invention will be further described in the following examples, which do not limit the scope of the invention described in the claims.
TALE Nuclease Design:
Software available at https://boglab.plp.iastate.edu/node/add/talen was initially used to find candidate binding sites. Three criteria were used for TALE nuclease design. First, repeat arrays that ranged from 15-25 bases in length were selected. Second, the spacer length was restricted to 14 to 18 bp, with 15-16 bases being the optimum length. Finally, if possible, one restriction enzyme site was present within that spacer.
To streamline the design process, a computer program was devised and implemented to aid in the design of TALE nucleases. This program works through multiple steps to find good TALE nuclease binding sites. First, the user supplies an NCBI Gene identifier and the program downloads the sequence. The program then uses NCBI EUtilities to extract the exons with some flanking sequence. The flanking sequence is needed if the exons are short or the TALE nuclease recognition sequence is near the beginning or end of the exon. TAL binding sites are located using the required criteria of a thymine on either end of the binding sequence (T [ACG] [TCG] . . . T) as described by Moscou and Bogdanove (Science, 326:1501, 2009) and Cermak et al. (Nucl. Acids Res., 39:e82, 2011). The final thymine is required because of the B plasmid's design in the Golden Gate TALEN and TAL Effector kit (Addgene, 1000000016, USA). Once the TALE nuclease binding sites are found, the program locates any commercially available restriction enzymes that cut within the spacer region. When a restriction site is located, the program scans for 300 base pairs surrounding the TALE nuclease binding sites and reports only those enzymes that cut once or twice in the amplicon. The results, including the TALE nuclease binding sequences and the restriction enzymes that cut within the spacer, are reported in an easily usable format. The program source is freely available at zfishbook.org/tal_tool and, for convenience, also can be accessed via a web-based interface.
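The binding-site search described above can be illustrated with a minimal Python sketch. It is not the program referenced in the text; it only shows one way the stated criteria (a thymine at either end of the binding sequence, repeat arrays of 15-25 bases, and a spacer between two opposed sites) might be applied to a sequence. The function names, the caller-supplied `spacer_lengths` argument, and the example sequences are assumptions made for illustration, and the sequence download and restriction enzyme steps are not covered.

```python
import re

# Pattern quoted in the text: T [ACG] [TCG] ... T (thymine at both ends).
SITE_PATTERN = re.compile(r"T[ACG][TCG][ACGT]*T")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_binding_site(site, min_len=15, max_len=25):
    """True if the site fits the quoted pattern and the 15-25 base window."""
    return (min_len <= len(site) <= max_len
            and SITE_PATTERN.fullmatch(site) is not None)


def find_site_pairs(seq, spacer_lengths, min_len=15, max_len=25):
    """List (left_site, spacer, right_site) candidates in seq.

    The left site is read from the given strand; the right site is read from
    the opposite strand, so it is the reverse complement of the given strand.
    spacer_lengths is a hypothetical parameter: the spacer sizes to accept.
    """
    seq = seq.upper()
    pairs = []
    for start in range(len(seq)):
        for left_len in range(min_len, max_len + 1):
            left = seq[start:start + left_len]
            if len(left) < left_len:
                break  # ran off the end of the sequence
            if not is_binding_site(left, min_len, max_len):
                continue
            for spacer_len in spacer_lengths:
                spacer_end = start + left_len + spacer_len
                for right_len in range(min_len, max_len + 1):
                    right_top = seq[spacer_end:spacer_end + right_len]
                    if len(right_top) < right_len:
                        break
                    right = reverse_complement(right_top)
                    if is_binding_site(right, min_len, max_len):
                        spacer = seq[start + left_len:spacer_end]
                        pairs.append((left, spacer, right))
    return pairs


# Made-up example sequences (not from this document).
example_left = "TCGACGTACGTACGAT"
example_spacer = "GGGGCCCCGGGGCCCC"
example_right = "TAGCATGCATGCAAGT"
example_target = example_left + example_spacer + reverse_complement(example_right)
example_pairs = find_site_pairs(example_target, spacer_lengths=(16,))
```

An exhaustive scan of this kind is adequate for exon-sized inputs; a candidate list produced this way would still need to be filtered for restriction sites within the spacer, as the text describes.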
TALE Nuclease Binding Sites and Spacer Regions:
The following TALE nuclease recognition sites and spacer sequences were used:
TALE Nuclease Constructs:
TALE nuclease assembly of the RVDs was performed using the Golden Gate approach as previously described by Cermak et al. (supra). Once assembled, rather than using the kit's destination vector, the RVDs were added to two different vectors—pT3 Ts-TAL+231 and pT3 Ts-TAL+63—which were used for in vitro transcription of TALE nuclease mRNA based on pT3TS vector previously described (Hyatt and Ekker, Meth. Cell Biol., 59:117-126, 1999). The TALE nuclease expression constructs were linearized with SacI, and mRNA was made (T3 mMessage Machine kit, Ambion) and purified (RNeasy MinElute Cleanup kit, Qiagen) for injection.
TALE Nuclease Germline Screening:
One-cell embryos were microinjected with 50-400 pg of TALE nuclease mRNA. Genomic DNA was collected at 4 dpf from 24-32 individual larvae as described in Meeker et al. (BioTechniques, 43:610, 612, and 614, 2007). Genomic DNA from 10 larval zebrafish was extracted using the DNeasy Blood and Tissue kit (Qiagen). Genotyping was performed using PCR followed by restriction enzyme digest. The primers were as follows:
The undigested bands were cloned using the TOPO® TA Cloning Kit (Invitrogen) and sequenced to confirm mutation.
Genome Editing:
For the ponzr1 locus, single-stranded DNA (ssDNA) oligos were designed to target the spacer sequence between the TALE nuclease cut sites. The oligo extended to half the length of the TALE nuclease recognition site. An EcoRV site (5′-GATATCC-3′) or a mutated LoxP (mLoxP) site (5′-TAACTTCGTATAGCATACATTATAGCAATTTAT-3′; SEQ ID NO:154) was introduced near the center of the oligo, resulting in a 20-base homology arm on the left side and an 18-base homology arm on the right side.
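The donor layout described above (a left homology arm, an inserted sequence, and a right homology arm) can be sketched in Python as follows. This is only an illustration of the arithmetic: the two arm sequences are placeholders invented for the example and are not the ponzr1 arms, and the function name is hypothetical. The seven-base insert is the EcoRV-containing sequence GATATCC, and the mLoxP sequence is the one given for SEQ ID NO:154.

```python
VALID_BASES = set("ACGT")

ECORV_INSERT = "GATATCC"  # seven-base insert containing the EcoRV site GATATC
MLOXP_INSERT = "TAACTTCGTATAGCATACATTATAGCAATTTAT"  # mLoxP as listed in the text


def build_donor(left_arm, insert, right_arm):
    """Concatenate arm + insert + arm into one ssDNA donor sequence."""
    for name, part in (("left_arm", left_arm), ("insert", insert),
                       ("right_arm", right_arm)):
        if not part or set(part.upper()) - VALID_BASES:
            raise ValueError(f"{name} must be a non-empty A/C/G/T string")
    return (left_arm + insert + right_arm).upper()


# Placeholder arms with the lengths stated in the text (20 and 18 bases).
# These are NOT the real ponzr1 flanking sequences.
placeholder_left = "ACGT" * 5          # 20 bases
placeholder_right = "TGCA" * 4 + "TG"  # 18 bases

ecorv_donor = build_donor(placeholder_left, ECORV_INSERT, placeholder_right)
mloxp_donor = build_donor(placeholder_left, MLOXP_INSERT, placeholder_right)
print(len(ecorv_donor), len(mloxp_donor))
```

With a 20-base left arm, the seven-base insert, and an 18-base right arm, the EcoRV donor in this sketch is 45 nucleotides long; the mLoxP donor length follows in the same way from the length of the listed mLoxP sequence.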
One-cell embryos were microinjected with 50-75 pg of ponzr1 or chrh2 TALE nuclease mRNA and 50-75 pg of one ssDNA donor. Genomic DNA was isolated as described above. If the embryos were injected with the EcoRV oligo, PCR was performed using the same primers as listed above and the product was digested using EcoRV. The positive larval DNA was cloned and colony PCR was used to find EcoRV-positive plasmids. Those plasmids were sent for sequencing to confirm EcoRV integration. If the embryos were injected with the mLoxP oligo, the genomic DNA was amplified using the same forward primer as listed above and a mLoxP reverse primer, 5′-ATAAATTGCTATAATGTATGCTATACGAAGT-3′ (SEQ ID NO:155), or the same reverse primer as listed above and a mLoxP forward primer, 5′-ACTTCGTATAGCATACATTATAGCAATTTAT-3′ (SEQ ID NO:156). The positive larval DNA was then amplified using the original pair of primers listed above and that product was cloned (TOPO® TA Cloning® Kit, Invitrogen) and colony PCR was used to find mLoxP-positive plasmids. The positive plasmids were sequenced for confirmation of mLoxP integration.
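As a consistency check on the mLoxP-specific primers, the short Python sketch below confirms that the reverse primer (SEQ ID NO:155) is the reverse complement of the forward primer (SEQ ID NO:156), and that the forward primer lies within the mLoxP sequence (SEQ ID NO:154). The helper function name is an assumption made for illustration; the sequences are those given in the text with line-wrap spaces removed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


MLOXP = "TAACTTCGTATAGCATACATTATAGCAATTTAT"            # SEQ ID NO:154
MLOXP_REVERSE_PRIMER = "ATAAATTGCTATAATGTATGCTATACGAAGT"  # SEQ ID NO:155
MLOXP_FORWARD_PRIMER = "ACTTCGTATAGCATACATTATAGCAATTTAT"  # SEQ ID NO:156

# The two primers anneal to opposite strands of the same mLoxP stretch.
primers_are_complementary = (
    reverse_complement(MLOXP_FORWARD_PRIMER) == MLOXP_REVERSE_PRIMER
)
forward_primer_in_mloxp = MLOXP_FORWARD_PRIMER in MLOXP

print(primers_are_complementary, forward_primer_in_mloxp)
```

Because each mLoxP primer is paired with a locus-specific primer on the other side, a PCR product is expected only when the mLoxP sequence is present at the targeted locus.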
Zebrafish Work:
The zebrafish work was conducted under full animal care and use guidelines with prior approval by the local institutional animal care committee. Danio rerio transgenic lines were described previously: Tg(gata1:dsRed) (Traver et al., Nat. Immunol., 4:1238-1246, 2003) and Tg(fli1:EGFP) (Lawson and Weinstein, Devel. Biol., 248:307-318, 2002).
Data Analysis and Statistics:
Quantification of TALE nuclease-mutated DNA was performed using ImageJ. For each gel, the background was subtracted and each lane was isolated to generate individual intensity plot profiles. A straight line was drawn across the bottom of each plot to eliminate inconsistencies caused by a skewed baseline. Each peak was then quantified. The intensity measurements for the bands in a lane were added together to get the total intensity. To calculate percent NHEJ, the intensity of the top band was divided by the total intensity. A Student's t-test was used to test significance.
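The percent NHEJ calculation described above reduces to a single ratio per lane, which can be sketched in Python as follows. The function name and the example intensity values are hypothetical; the sketch assumes that the band intensities have already been background-subtracted and that the top (undigested) band is listed first.

```python
def percent_nhej(band_intensities):
    """Percent NHEJ for one gel lane.

    band_intensities: background-subtracted band intensities for the lane,
    with the top (undigested) band first. The top band is divided by the
    sum of all bands and expressed as a percentage.
    """
    total = sum(band_intensities)
    if total <= 0:
        raise ValueError("total lane intensity must be positive")
    return 100.0 * band_intensities[0] / total


# Made-up intensities for one lane: undigested band, then two digest products.
example_lane = [30.0, 50.0, 20.0]
print(percent_nhej(example_lane))
```

Values computed this way for each lane could then be compared between groups with a Student's t-test, as noted in the text.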
Cell-Free TALE Nuclease Restriction Endonuclease Assay:
5 μg of the ponzr1 PCR product was digested in each assay. Plasmids pTal 278, pTAL 279, pDelTal 278 and pDelTal 279 were linearized with SacI and used to transcribe messenger RNA using the mMessage Machine kit (Ambion). In vitro translation of 2 μg of each messenger RNA was accomplished using an In Vitro Transcription and Translation kit (Promega). ponzr1 PCR product was included in the assay mix during in vitro translation of different TALE nuclease combinations, allowing the translation and in vitro nuclease digestion to occur simultaneously. Translation was conducted for 2 hours at 30° C. To further facilitate TALE nuclease in vitro nuclease activity, the assay mix was diluted five-fold in in vitro digestion buffer (20 mM Tris-HCl pH 7.5, 5 mM MgCl2, 50 mM KCl, 5% glycerol, and 0.5 mg/ml BSA). The assay mix was then incubated at 30° C. for an additional 4 hours. DNA from the mix was purified using the Qiagen PCR Purification kit, concentrated via ethanol precipitation, and run on a 2% agarose gel. The negative control did not include the translated TALE nucleases.
The efficacy of previously described custom restriction enzymes was relatively low and yielded many unperturbed loci (Doyon et al., supra; Meng et al., supra; Foley et al., supra; Huang et al., supra; and Sander et al., supra). For example, standard TALE nucleases using the +231 scaffold (
Different TALE nuclease scaffolds, with differential N- and C-terminal truncations, diverse linkers to the FokI nuclease, and distinct nuclear localization sequences, have been tested (Miller et al., Nature Biotechnol., 29:143-148, 2011; Cermak et al., supra; and Mussolino et al., Nucl. Acids Res., 39(21):9283-9293, 2011). In the experiments described herein, a +63 scaffold was tested in an RNA expression vector backbone (
To further test the efficacy of the +63 scaffold, TALE nucleases were generated against three additional loci (moesina, ppp1cabb and cdh5;
Studies were then conducted to determine whether TALE nucleases could be used for targeted gene inhibition for phenotype studies using injected animals, such as targeted gene knockdown using morpholinos (MOs) (Nasevicius and Ekker, Nat. Genet., 26:216-220, 2000). Indeed, cdh5 TALE nuclease-injected larvae displayed specific vascular changes that phenocopied those generated by MOs (Wang et al., supra). Embryos injected with either cdh5 +63 TALE nucleases or MOs displayed similar vascular phenotypes: pronounced cardiac edema with blood pooling (
Germline transmission of +63 TALE nuclease-induced lesions also was very high (
In vitro work has demonstrated that single-stranded (ss) DNA can be an effective donor for HR-mediated editing at a ZFN-induced double-stranded break (Chen et al., Nat. Meth., 8:753-755, 2011; and Porteus and Carroll, Nat. Biotechnol., 23:967-973, 2005). Given the highly efficient genome modification achieved with +63 TALE nucleases, it was hypothesized that synthetic oligonucleotides designed to span the predicted TALE nuclease cut site could serve as an HR template in vivo (
Studies were then conducted to investigate whether TALE nuclease/oligo co-injection could introduce larger sequences such as a loxP site, an essential step in making Cre-dependent conditional genetic alleles. TALE nucleases against the ponzr1 locus were used with long synthetic oligonucleotides to add a modified loxPJTZ17 (mloxP; Thomson et al., supra) site at this locus (
These results represent the first description of successful HR in zebrafish, and the first demonstration of HR using ssDNA as a donor template in vivo. This approach complements the established error-prone NHEJ toolkit for model organisms (
For the experiments described above, the RVDs used in the final GoldyTALEN and Miller +63 constructs are shown in
Biallelic gene targeting in F0 GoldyTALEN-injected embryos is depicted in
Biallelic gene targeting in F0 GoldyTALEN-injected embryos with low doses of TALE nuclease mRNA is shown in
Targeted genome editing using GoldyTALENs at the ponzr1 locus is depicted in
In further studies, single-stranded DNA oligonucleotides were designed to test for homology directed repair (HDR) in vivo. Different lengths of homology arms to the ponzr1 locus were tested. The first four sequences incorporated a novel seven-base sequence (GATATCC, underlined in
It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
This application claims benefit of priority from U.S. Provisional Application No. 61/701,540, filed on Sep. 14, 2012, and U.S. Provisional Application No. 61/663,451, filed on Jun. 22, 2012.
This invention was made with government support under GM063904, DK084567, and DK083219, awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2013/031754 | 3/14/2013 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
61663451 | Jun 2012 | US | |
61701540 | Sep 2012 | US |