This document relates to materials and methods for high efficiency gene targeting at multiple genomic sites in a single cell.
The ability to edit plant genomes through gene targeting (GT) has traditionally been hindered by low frequencies of recombination. Editing plant genomes through GT requires efficient methods to deliver both sequence-specific nucleases (SSNs) and repair templates to plant cells. This can be achieved using Agrobacterium T-DNA or biolistics, or by stably integrating nuclease-encoding cassettes and repair templates into the plant genome. In dicotyledonous plants such as tobacco and tomato, for example, greater than 10-fold enhancements in GT frequencies have been achieved using DNA virus-based replicons. These replicons transiently amplify to high copy numbers in plant cells, delivering abundant SSNs and repair templates to achieve targeted gene modification. While the use of SSNs has helped to increase recombination frequencies, successes in GT have largely been limited to single-gene, single-locus targets.
This document is based on the discovery of materials and methods that enable multiplex gene targeting in plants. For example, this document is based, at least in part, on the development of a replicon-based system for genome engineering of plants (e.g., cereal crops) using a deconstructed version of the wheat dwarf virus (WDV). As described herein, the replicons achieved more than a 100-fold (e.g., about 110-fold) increase in expression of a reporter gene in wheat cells, relative to non-replicating controls. Replicons carrying CRISPR/Cas9 nucleases and repair templates achieved GT at an endogenous ubiquitin locus at frequencies at least 10-fold (e.g., about 12-fold) greater than non-viral delivery methods. Moreover, in some cases, the methods provided herein with the deconstructed WDV replicons can be used for targeted integration by homologous recombination (HR) in all three homeoalleles (A, B, and D) of the hexaploid wheat genome, thus achieving multiplexed GT within the same wheat cell. Accordingly, these materials and methods can provide high frequencies of GT that make it possible to edit complex genomes without the need to integrate GT reagents into the genome.
In one aspect, this document features a method for modifying genomic material of a plant cell at two or more loci. The method can include (a) providing a plant cell that contains two or more endogenous nucleic acid sequences to be modified, (b) introducing into the plant cell (i) a first repair template targeted to a first genomic sequence within the plant cell, and (ii) a second repair template targeted to a second genomic sequence within the plant cell, and (c) maintaining the plant cell under conditions in which the first and second repair templates recombine by homologous recombination with their corresponding genomic loci, thereby producing a plant cell containing targeted genomic modifications at the first and second genomic sequences.
The first repair template can be within a first geminivirus replicon that includes, in order from 5′ to 3′, a first geminivirus long intergenic region (LIR), the first repair template, a geminivirus short intergenic region (SIR), a virus Rep/RepA coding sequence, and a second geminivirus LIR. The second repair template can be within a second geminivirus replicon that contains, in order from 5′ to 3′, a third geminivirus LIR, the second repair template, a second geminivirus SIR, a virus Rep/RepA coding sequence, and a fourth geminivirus LIR. The first, second, third, and fourth geminivirus LIRs can contain the same nucleotide sequence, or at least two of the first, second, third, and fourth geminivirus LIRs can contain different nucleotide sequences. The first and second geminivirus SIRs can contain the same nucleotide sequence, or can contain different nucleotide sequences. The first and second repair templates can be within a geminivirus replicon that contains, in order from 5′ to 3′, a first geminivirus LIR, the first repair template, the second repair template, a geminivirus SIR, a virus Rep/RepA coding sequence, and a second geminivirus LIR. The first and second geminivirus LIRs can contain the same nucleotide sequence, or can contain different nucleotide sequences.
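The replicon layouts recited above can be summarized as an ordered list of elements. The following is a minimal Python sketch for illustration only; the function name `replicon_layout` and the element labels are hypothetical placeholders that mirror the 5′ to 3′ order given in the text, and they do not represent actual nucleotide sequences.

```python
from typing import List


def replicon_layout(repair_templates: List[str]) -> List[str]:
    """Return the 5'-to-3' element order of a geminivirus replicon.

    Hypothetical helper: the labels are illustrative strings, not sequences.
    """
    if not repair_templates:
        raise ValueError("at least one repair template is required")
    # Order from the text: LIR, repair template(s), SIR, Rep/RepA, LIR
    return ["LIR", *repair_templates, "SIR", "Rep/RepA", "LIR"]


# Two separate replicons, each carrying one repair template
first_replicon = replicon_layout(["repair_template_1"])
second_replicon = replicon_layout(["repair_template_2"])

# One replicon carrying both repair templates
combined_replicon = replicon_layout(["repair_template_1", "repair_template_2"])
print(" - ".join(combined_replicon))
```

In this sketch, the two-replicon and single-replicon embodiments differ only in how the repair templates are distributed between the flanking LIR elements.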
In some embodiments, the method can further include introducing into the plant cell a first sequence specific endonuclease targeted to the first genomic sequence and a second sequence specific endonuclease targeted to the second genomic sequence, and maintaining the plant cell under conditions in which the first and second sequence specific endonucleases are expressed to introduce double stranded DNA breaks (DSBs) at the first and second genomic sequences. The first and second sequence specific endonucleases can be first and second Cas9 endonucleases, and the method can further include introducing a first guide RNA that targets the first Cas9 endonuclease to the first genomic sequence, and a second guide RNA that targets the second Cas9 endonuclease to the second genomic sequence.
The geminivirus can be, for example, wheat dwarf virus or bean yellow dwarf virus. The plant cell can be from a polyploid plant (e.g., wheat, oat, triticale, tritordeum, peanut, sugar cane, white potato, tobacco, apple, banana, watermelon, canola, leek, strawberry, or cotton). The plant cell can be a protoplast. All alleles and homeoalleles of the two or more endogenous nucleic acid sequences may be modified in the plant cell containing the targeted genomic modifications. The method can include introducing the first and second repair templates into the plant cell simultaneously, or introducing the first and second repair templates into the plant cell sequentially.
The method can further include introducing into the plant cell a third repair template targeted to a third genomic sequence within the plant cell, and maintaining the plant cell under conditions in which the third template recombines by homologous recombination with its corresponding genomic sequence. The third repair template can be within a third geminivirus replicon that contains, in order from 5′ to 3′, a fifth geminivirus LIR, the third repair template, a third geminivirus SIR, a virus Rep/RepA coding sequence, and a sixth geminivirus LIR. The first, second, and third repair templates can be within a geminivirus replicon that contains, in order from 5′ to 3′, a first geminivirus LIR, the first repair template, the second repair template, the third repair template, a first geminivirus SIR, a virus Rep/RepA coding sequence, and a second geminivirus LIR. The method can further include introducing into the plant cell a first sequence specific endonuclease targeted to the first genomic sequence, a second sequence specific endonuclease targeted to the second genomic sequence, and a third sequence specific endonuclease targeted to the third genomic sequence, and maintaining the plant cell under conditions under which the first, second, and third sequence specific endonucleases are expressed to introduce DSBs at the first, second, and third genomic sequences. The first, second, and third sequence specific endonucleases can be first, second, and third Cas9 endonucleases, and the method can further include introducing a first guide RNA that targets the first Cas9 endonuclease to the first genomic sequence, a second guide RNA that targets the second Cas9 endonuclease to the second genomic sequence, and a third guide RNA that targets the third Cas9 endonuclease to the third genomic sequence. 
The method can include introducing the first, second, and third repair templates into the plant cell simultaneously, or introducing the first, second, and third repair templates into the plant cell sequentially.
In another aspect, this document features a nucleic acid containing a first sequence that includes, in order from 5′ to 3′, a first geminivirus LIR, a first repair template targeted to a first genomic sequence within a first of two or more endogenous plant nucleic acid sequences, a geminivirus SIR, a virus Rep/RepA coding sequence, and a second geminivirus LIR. The geminivirus can be wheat dwarf virus or bean yellow dwarf virus. The nucleic acid can further include a sequence encoding a first sequence specific endonuclease that targets the first endogenous plant nucleic acid sequence. The sequence specific endonuclease can be a Cas9 endonuclease, and the nucleic acid can further include a sequence encoding a first guide RNA that targets the first Cas9 endonuclease to the first genomic sequence. The nucleic acid can further contain a second sequence that contains, in order from 5′ to 3′, a third geminivirus LIR, a second repair template targeted to a second endogenous nucleic acid sequence within the plant, a second geminivirus SIR, a virus Rep/RepA coding sequence, and a fourth geminivirus LIR. The first, second, third, and fourth geminivirus LIRs can contain the same nucleotide sequence, or at least two of the first, second, third, and fourth geminivirus LIRs can contain different nucleotide sequences. The first and second geminivirus SIRs can contain the same nucleotide sequence, or can contain different nucleotide sequences.
In some embodiments, the nucleic acid can contain, in order from 5′ to 3′, the first geminivirus LIR, the first repair template, a second repair template targeted to a second endogenous nucleic acid sequence within the plant, the geminivirus SIR, the virus Rep/RepA coding sequence, and the second geminivirus LIR. The nucleic acid can further contain a sequence encoding a first sequence specific endonuclease that targets the first endogenous plant nucleic acid sequence, and a sequence encoding a second sequence specific endonuclease that targets the second endogenous plant nucleic acid sequence. The first and second sequence-specific endonucleases can be first and second Cas9 endonucleases, and the nucleic acid can further contain a sequence encoding a first guide RNA that targets the first Cas9 endonuclease to the first genomic sequence, and a sequence encoding a second guide RNA that targets the second Cas9 endonuclease to the second genomic sequence. The first and second geminivirus LIRs can contain the same nucleotide sequence, or can contain different nucleotide sequences. The plant can be a polyploid plant (e.g., wheat, oat, triticale, tritordeum, peanut, sugar cane, white potato, tobacco, apple, banana, watermelon, canola, leek, strawberry, or cotton).
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. Although methods and materials similar or equivalent to those described herein can be used to practice the invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
Methods to precisely edit cellular genomes can utilize repair templates, optionally in combination with highly efficient and programmable SSNs, including meganucleases (Puchta et al., Nucleic Acids Res 1993, 21:5034-5040; Salomon and Puchta, EMBO J 1998, 17:6086-6095; and Jacoby et al., Nucl. Acids Res., 10.1093/nar/gkr1303, 2012), zinc-finger nucleases (ZFNs) (Kim et al., Proc Natl Acad Sci USA 1996, 93:1156-1160; Townsend et al., Nature 2009, 459:442-445; and Sander et al., Nature Methods, 8:67-69, 2011), transcription activator-like effector (TALE) nucleases (Christian et al., Genetics 2010, 186:757-761; Bogdanove and Voytas, Science 2011, 333:1843-1846; and U.S. Publication No. 2011/0145940), and the clustered regularly interspaced short palindromic repeat (CRISPR)-Cas9 system (Hwang et al., Nat Biotechnol 2013, 31:227-229; Shan et al., Nat Biotechnol 2013, 31:686-688; Cong et al., Science 339:819-823, 2013; and Mali et al., Science 339:823-826, 2013). SSNs can be used to introduce a double-strand break (DSB) in the target locus to be modified, and the DSB can be repaired by one of the two primary pathways: non-homologous end joining (NHEJ) or homologous recombination (HR). In NHEJ, the ends of the broken chromosome are rejoined, sometimes imprecisely, which can introduce small insertions or deletions (indels) at the break site (Gorbunova and Levy, Nucleic Acids Res 1997, 25:4650-4657). When indels occur in coding sequences, they may create frame shift mutations that disrupt gene function. In HR, or gene targeting (GT), the DSB is repaired using a template with homology to the break site. The repair template can be the sister chromatid, a homologous or homeologous chromosome (in the case of polyploid species), or an exogenous template containing one or more specific sequence modifications to be incorporated into the break site. 
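Whether an indel introduced by NHEJ shifts the reading frame depends on whether the net change in length is a multiple of three. The following minimal Python sketch illustrates that arithmetic; the function name `causes_frameshift` is hypothetical, and the sketch does not attempt to predict whether an in-frame indel would nonetheless disrupt gene function.

```python
def causes_frameshift(inserted_nt: int, deleted_nt: int) -> bool:
    """Return True if the net length change is not a multiple of three.

    Illustrative only: applies to indels located within a coding sequence.
    """
    if inserted_nt < 0 or deleted_nt < 0:
        raise ValueError("lengths must be non-negative")
    net_change = inserted_nt - deleted_nt
    return net_change % 3 != 0


print(causes_frameshift(inserted_nt=1, deleted_nt=0))  # single-nucleotide insertion
print(causes_frameshift(inserted_nt=0, deleted_nt=3))  # three-nucleotide deletion
```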
Efficient delivery of genome engineering reagents to plant cells is necessary to achieve targeted genome modification, particularly GT, since GT involves delivery of both a SSN expression cassette and a repair template.
This document provides materials and methods for efficient delivery of genome engineering reagents that can be used to achieve GT in plants, including crop species such as cereals that are difficult to transform. In some embodiments, the methods provided herein can utilize DNA viruses, such as geminiviruses, that have been engineered as vectors for the expression of heterologous proteins in plants. Although the cargo capacity of these viruses is not unlimited, they can be converted into non-infectious replicons by replacing genes important for infection and cell-to-cell movement with heterologous sequences (Lazarowitz et al., supra; Shen and Hohn The Plant Journal 1994, 5:227-236; and Shen and Hohn J Gen Virol 1995, 76 (Pt 4):965-969), including SSN expression cassettes and repair templates (Lazarowitz et al., EMBO J 1989, 8:1023-1032; and Ugaki et al., Nucleic Acids Res 1991, 19:371-377).
Geminiviruses replicate through a rolling circle replication (RCR) cycle (
The replicon-based systems described herein that are useful as GT vectors have characteristics that may include one or more of the following: (a) the viral DNA genome can be used as a repair template, (b) the replicon (including the repair template) can replicate to high copy number, and (c) the expression of Rep and RepA viral proteins can enhance HR, perhaps due to the interaction with proteins in the plant cell that promote progression into S phase.
As described herein, replicons based on WDV were developed for plant genome engineering. The WDV-derived replicons were successfully used to amplify and express heterologous proteins in plants such as wheat, corn, and rice. The Example section herein describes the replication and protein expression of the WDV system in wheat cells, and provides different replicon architectures that were used to optimize WDV as a vector for delivering CRISPR/Cas reagents and donor templates. Use of the WDV replicons increased GT efficiency greater than 10-fold in wheat cells. In addition, the replicons were able to promote multiplexed GT, achieving targeted integration, within the same cell, of different reporter genes in different loci of the polyploid wheat genome.
This document provides highly efficient, virus-based systems and methods for targeted modification of plant genomes. The in planta systems and methods for GT include the use of customizable endonucleases in combination with plant DNA virus-based replicons. Plant DNA viruses, including geminiviruses, have many attributes that may be advantageous for in planta GT, including their ability to replicate to high copy numbers in plant cell nuclei. These viruses can be modified to encode a desired nucleotide sequence, such as a repair template sequence targeted to a particular sequence in a plant genome. First generation geminiviruses, or “full viruses” (viruses that retain only the useful “blocks” of sequence), can carry up to about 800 nucleotides (nt), while deconstructed geminiviruses (viruses that encode only the proteins needed for viral replication) have a much larger cargo capacity. This document describes how customizable nucleases and plant DNA viruses can be used for in planta GT, and provides materials and methods for achieving such GT. 
The methods can be used with both monocotyledonous plants [e.g., banana, grasses (such as Brachypodium distachyon), wheat, oats, barley, maize, Haynaldia villosa, palms, orchids, onions, pineapple, rice, and sorghum] and dicotyledonous plants (e.g., Arabidopsis, beans, Brassica, carnations, chrysanthemums, citrus plants, coffee, cotton, eucalyptus, impatiens, melons, peas, peppers, Petunia, poplars, potatoes, roses, soybeans, squash, strawberry, sugar beets, tobacco, tomatoes, and woody tree species), and can be particularly useful with plants having complex, polyploid genomes [e.g., durum wheat (Triticum turgidum durum), which is tetraploid, white bread wheat (Triticum aestivum), which is hexaploid, as well as other triploid species (e.g., apple, banana, and watermelon), tetraploid species (e.g., cotton, potato, canola, leek, tobacco, and peanut), hexaploid species (e.g., oat, triticale, and tritordeum), and octoploid species (e.g., strawberry and sugar cane)].
In general, the systems and methods described herein include two components: a plant DNA virus-based replicon containing a repair template targeted to an endogenous plant sequence, and an endonuclease that also is targeted to a site near or within the target sequence. The endonuclease can generate a targeted DNA DSB at the desired locus, and the plant cell can repair the DSB using the repair template present in the replicon, thereby incorporating the modification stably into the plant genome. In some embodiments, the systems and methods provided herein include two or more plant DNA virus-based replicons that each contain a different repair template with or without a sequence encoding a SSN, or include a plant DNA virus-based replicon that contains two or more repair templates targeted to endogenous plant sequences, with or without sequences encoding two or more SSNs that also are targeted to sites near or within the target sequences. The endonucleases can generate targeted DNA DSBs at the desired loci, and the plant cell can repair the DSBs using the repair templates present in the replicon(s), thereby incorporating the modifications stably into the plant genome.
Geminivirus-based replicons can be particularly useful. Geminiviruses are a large family of plant viruses that contain circular, single-stranded DNA genomes. Examples of geminiviruses include the cabbage leaf curl virus, tomato golden mosaic virus, bean yellow dwarf virus (BeYDV; also referred to as chickpea chlorotic dwarf virus), African cassava mosaic virus, wheat dwarf virus (WDV), miscanthus streak mastrevirus, tobacco yellow dwarf virus, tomato yellow leaf curl virus, bean golden mosaic virus, beet curly top virus, maize streak virus, and tomato pseudo-curly top virus.
In some embodiments, a first component of the systems and methods described herein is a geminivirus-based replicon engineered to contain a repair template that includes a desired modification (a “donor sequence”) that is heterologous to the plant to be modified, flanked by sequences of homology (“homology arms”) to a target locus within the plant genome. The engineered replicon can be generated by, for example, replacing non-essential geminivirus nucleotide sequence (e.g., CP sequence) with a desired repair template. Other methods for adding sequence to viral vectors include, without limitation, those discussed in Peretz et al. (Plant Physiol., 145:1251-1263, 2007).
A repair template as used herein can include a donor nucleic acid sequence having the ability to replace an endogenous target sequence within the plant, flanked by homology arms containing sequences homologous to endogenous sequences on either side of the target. A repair template can have a length ranging from about 25-50 nucleotides up to about 10,000 nt (e.g., about 25 nt to about 50 nt, about 50 nt to about 100 nt, about 100 nt to about 300 nt, about 300 nt to about 500 nt, about 500 nt to about 700 nt, about 700 nt to about 1000 nt, about 1000 nt to about 1500 nt, about 1500 nt to about 2000 nt, about 2500 nt to about 3000 nt, about 3000 nt to about 5000 nt, about 5000 nt to about 7500 nt, or about 7500 nt to about 10,000 nt). Within a repair template, each homology arm can have a length of about 50 nt to about 5000 nt (e.g., about 50 nt to about 100 nt, about 100 nt to about 300 nt, about 300 nt to about 500 nt, about 500 nt to about 700 nt, about 700 nt to about 1000 nt, about 1000 nt to about 3000 nt, or about 3000 nt to about 5000 nt). The donor sequence between the homology arms can have a length from about 1 nt to about 5000 nt (e.g., about 1 nt to about 50 nt, about 50 nt to about 100 nt, about 100 nt to about 200 nt, about 200 nt to about 300 nt, about 300 nt to about 400 nt, about 400 nt to about 500 nt, about 500 nt to about 1000 nt, about 1000 nt to about 2000 nt, about 2000 nt to about 3000 nt, or about 3000 nt to about 5000 nt). Repair templates and DNA virus plasmids can be prepared using molecular biology techniques such as those that are described in the Example section.
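The length relationships described above (two homology arms flanking a donor sequence) can be expressed as a short consistency check. The following Python sketch is illustrative only; the class name `RepairTemplate` and its fields are hypothetical, and the approximate ("about") limits recited in the text are treated here as hard bounds purely for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepairTemplate:
    """Hypothetical container for the lengths (in nt) of a repair template."""
    left_arm_nt: int
    donor_nt: int
    right_arm_nt: int

    @property
    def total_nt(self) -> int:
        # Total length is the donor sequence plus both homology arms
        return self.left_arm_nt + self.donor_nt + self.right_arm_nt

    def within_stated_ranges(self) -> bool:
        # ASSUMPTION: the "about" ranges in the text are used as hard limits
        arms_ok = all(50 <= arm <= 5000
                      for arm in (self.left_arm_nt, self.right_arm_nt))
        donor_ok = 1 <= self.donor_nt <= 5000
        total_ok = self.total_nt <= 10_000
        return arms_ok and donor_ok and total_ok


# Hypothetical example: 700-nt arms flanking a 1000-nt donor sequence
template = RepairTemplate(left_arm_nt=700, donor_nt=1000, right_arm_nt=700)
print(template.total_nt, template.within_stated_ranges())
```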
The homology arms within a repair template can have at least about 90% sequence identity (e.g., at least about 90%, 92%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% sequence identity) to the endogenous plant sequences to which they are targeted.
The percent sequence identity between a particular nucleic acid or amino acid sequence and a sequence referenced by a particular sequence identification number is determined as follows. First, a nucleic acid or amino acid sequence is compared to the sequence set forth in a particular sequence identification number using the BLAST 2 Sequences (Bl2seq) program from the stand-alone version of BLASTZ containing BLASTN version 2.0.14 and BLASTP version 2.0.14. This stand-alone version of BLASTZ can be obtained online at fr.com/blast or at ncbi.nlm.nih.gov. Instructions explaining how to use the Bl2seq program can be found in the readme file accompanying BLASTZ. Bl2seq performs a comparison between two sequences using either the BLASTN or BLASTP algorithm. BLASTN is used to compare nucleic acid sequences, while BLASTP is used to compare amino acid sequences. To compare two nucleic acid sequences, the options are set as follows: -i is set to a file containing the first nucleic acid sequence to be compared (e.g., C:\seq1.txt); -j is set to a file containing the second nucleic acid sequence to be compared (e.g., C:\seq2.txt); -p is set to blastn; -o is set to any desired file name (e.g., C:\output.txt); -q is set to -1; -r is set to 2; and all other options are left at their default setting. For example, the following command can be used to generate an output file containing a comparison between two sequences: C:\Bl2seq -i c:\seq1.txt -j c:\seq2.txt -p blastn -o c:\output.txt -q -1 -r 2. To compare two amino acid sequences, the options of Bl2seq are set as follows: -i is set to a file containing the first amino acid sequence to be compared (e.g., C:\seq1.txt); -j is set to a file containing the second amino acid sequence to be compared (e.g., C:\seq2.txt); -p is set to blastp; -o is set to any desired file name (e.g., C:\output.txt); and all other options are left at their default setting.
For example, the following command can be used to generate an output file containing a comparison between two amino acid sequences: C:\Bl2seq -i c:\seq1.txt -j c:\seq2.txt -p blastp -o c:\output.txt. If the two compared sequences share homology, then the designated output file will present those regions of homology as aligned sequences. If the two compared sequences do not share homology, then the designated output file will not present aligned sequences.
Once aligned, the number of matches is determined by counting the number of positions where an identical nucleotide or amino acid residue is presented in both sequences. The percent sequence identity is determined by dividing the number of matches either by the length of the sequence set forth in the identified sequence (e.g., SEQ ID NO:2), or by an articulated length (e.g., 100 consecutive nucleotides or amino acid residues from a sequence set forth in an identified sequence), followed by multiplying the resulting value by 100. For example, a nucleic acid sequence that has 450 matches when aligned with the sequence set forth in SEQ ID NO:2 is 96.4 percent identical to the sequence set forth in SEQ ID NO:2 (i.e., 450÷467×100=96.4). It is noted that the percent sequence identity value is rounded to the nearest tenth. For example, 75.11, 75.12, 75.13, and 75.14 are rounded down to 75.1, while 75.15, 75.16, 75.17, 75.18, and 75.19 are rounded up to 75.2. It also is noted that the length value will always be an integer.
A second component of the systems and methods described herein can be an endonuclease that can be customized to target a particular nucleotide sequence and generate a DSB at or near that sequence. As noted above, examples of customizable endonucleases include ZFNs, meganucleases, and TALE nucleases, as well as CRISPR/Cas systems. In particular, CRISPR/Cas molecules are components of a prokaryotic adaptive immune system that is functionally analogous to eukaryotic RNA interference, using RNA base pairing to direct DNA or RNA cleavage. Directing DNA DSBs requires two components: the Cas9 protein, which functions as an endonuclease, and CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA) sequences that aid in directing the Cas9/RNA complex to target DNA sequence (Makarova et al., Nat Rev Microbiol, 9(6):467-477, 2011). The modification of a single targeting RNA can be sufficient to alter the nucleotide target of a Cas protein. In some cases, crRNA and tracrRNA can be engineered as a single cr/tracrRNA hybrid to direct Cas9 cleavage activity (Jinek et al., Science, 337(6096):816-821, 2012). Like TALE nucleases, for example, the components of a CRISPR/Cas system (the Cas9 endonuclease and the crRNA and tracrRNA, or the cr/tracrRNA hybrid) can be delivered to a cell in a geminivirus-based replicon.
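The option settings described above for nucleic acid and amino acid comparisons can be collected in a small helper that assembles the argument list. The following Python sketch is illustrative only: the function name `bl2seq_args` is hypothetical, the sketch only builds the arguments recited in the text (-i, -j, -p, -o, and, for blastn, -q and -r), and it does not run the Bl2seq program.

```python
from typing import List


def bl2seq_args(seq1_path: str, seq2_path: str, program: str,
                output_path: str) -> List[str]:
    """Assemble the Bl2seq argument list described in the text.

    Hypothetical helper; it only builds the list and does not execute it.
    """
    if program not in ("blastn", "blastp"):
        raise ValueError("program must be 'blastn' or 'blastp'")
    args = ["Bl2seq", "-i", seq1_path, "-j", seq2_path,
            "-p", program, "-o", output_path]
    if program == "blastn":
        # Nucleic acid comparisons: -q is set to -1 and -r is set to 2
        args += ["-q", "-1", "-r", "2"]
    return args


nucleotide_cmd = bl2seq_args(r"c:\seq1.txt", r"c:\seq2.txt", "blastn", r"c:\output.txt")
protein_cmd = bl2seq_args(r"c:\seq1.txt", r"c:\seq2.txt", "blastp", r"c:\output.txt")
print(" ".join(nucleotide_cmd))
print(" ".join(protein_cmd))
```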
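The percent identity calculation and rounding convention described above can be sketched as follows. This is a minimal Python illustration; the function name `percent_identity` is hypothetical, and decimal arithmetic with half-up rounding is used so that values such as 75.15 round up to 75.2, as stated in the text.

```python
from decimal import Decimal, ROUND_HALF_UP


def percent_identity(matches: int, length: int) -> Decimal:
    """Percent identity = matches / length x 100, rounded to the nearest tenth."""
    if length <= 0:
        raise ValueError("length must be a positive integer")
    if not 0 <= matches <= length:
        raise ValueError("matches must be between 0 and length")
    value = Decimal(matches) * 100 / Decimal(length)
    # Half-up rounding to one decimal place, per the convention in the text
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Worked example from the text: 450 matches over a length of 467
print(percent_identity(450, 467))  # 96.4
```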
The coding sequence for an endonuclease can be operably linked to a promoter that is inducible, constitutive, cell specific, or activated by alternative splicing of a suicide exon. Strong, constitutive promoters may be particularly useful. Such promoters include, without limitation, the maize ubiquitin promoter (ZmUbi) used in the experiments described below, the ubiquitin promoter from Panicum virgatum (PvUbi), the ubiquitin 10 (Ubi10) promoter from Arabidopsis thaliana, the Actin 1 (Act-1) promoter from rice, the nopaline synthase (nos) promoter, octopine synthase (ocs) promoter, and mannopine synthase (mas) promoters from Agrobacterium tumefaciens, and the 35S promoter from cauliflower mosaic virus. The plant can be infected with a viral replicon containing a repair template, and the endonuclease can be expressed to cleave the DNA at the target sequence, facilitating HR on either side of the repair template to be integrated.
One or more endonuclease coding sequences can be contained in the same geminivirus construct as one or more repair templates, or can be present in one or more vectors that are separately delivered to the plant, either sequentially or simultaneously with the geminivirus construct containing the repair template(s). In some embodiments, for example, plants can be transfected or infected with a second viral vector, such as an RNA virus vector (e.g., a tobacco rattle virus (TRV) vector, a tobacco mosaic virus (TMV) vector, a potato virus X (PVX) vector, a pea early-browning virus (PEBV) vector, a wheat streak mosaic virus (WSMV) vector, or a barley stripe mosaic virus (BSMV) vector) that encodes the endonuclease. As an example, TRV is a bipartite RNA plant virus that can be used to transiently deliver protein coding sequences to plant cells. For example, the TRV genome can be modified to encode a ZFN or TALE nuclease by replacing TRV nucleotide sequence with a subgenomic promoter and the ORF for the endonuclease. The inclusion of a TRV vector can be useful because TRV infects dividing cells and therefore can modify germ line cells specifically. In such cases, expression of the endonuclease encoded by the TRV can occur in germ line cells, such that HR at the target site is heritable.
In some embodiments in which a geminivirus vector contains both a repair template and an endonuclease encoding sequence, the geminivirus can be deconstructed such that it encodes only the proteins needed for viral replication. Since a deconstructed geminivirus vector has a much larger capacity for carrying sequences that are heterologous to the virus, the repair template may be longer than 800 nt. An exemplary system using a deconstructed vector is described in the Example section below.
The construct(s) containing one or more repair templates and one or more endonuclease encoding sequences can be delivered to a plant cell using, for example, biolistic bombardment. In some cases, the one or more repair templates and endonuclease coding sequences can be delivered using Agrobacterium-mediated transformation, insect vectors, grafting, or DNA abrasion, according to standard methods.
After a plant is infected or transfected with a repair template (and, in some cases, an endonuclease encoding sequence), any suitable method can be used to determine whether GT has occurred at the target site. In some embodiments, a phenotypic change can indicate that a repair template sequence was integrated into the target site. Such is the case for the plants that were modified with geminivirus replicons containing sequences encoding fluorescent reporters, as described below, or sequences encoding herbicide or antibiotic resistance for selection and regeneration of the modified cells. In some cases, the first GT event (e.g., the insertion of a fluorescent marker or a nucleic acid conferring herbicide resistance or antibiotic resistance downstream from an endogenous promoter) can be used to select for a second or second and third gene editing (NHEJ or GT) event. PCR-based methods and sequencing methods also can be used to determine whether a genomic target site contains a repair template sequence, and/or whether precise recombination has occurred at the 5′ and 3′ ends of the repair template.
The invention will be further described in the following examples, which do not limit the scope of the invention described in the claims.
Vector Construction.
The replication elements of WDV (LIR, SIR, and Rep/RepA) were PCR amplified from pWI11 (Ugaki et al., supra) and cloned by Gibson assembly (New England Biolabs; Ipswich, Mass.) into the multi-cloning site of the pCLEAN-G185 binary vector in a LIR-SIR-Rep-LIR configuration (
GFP was cloned into the Gateway site of the different WDV replicons, resulting in pWDV1-GFP, pWDV2-GFP, and pWDV3-GFP. For the expression experiments, the vector Ubi-GFP, having the ZmUbi promoter driving expression of GFP (
Donor templates for GT experiments were generated to knock in the different fluorescent reporters at the ubiquitin, MLO, and EPSPS loci. GT into the ubiquitin gene was performed using a promoter-less ‘P2A-gfp-nos terminator’ cassette (referred to as T2A:gfp) (
Plant Material.
Wheat (T. aestivum cv Bobwhite) plants were grown at 20° C. (day) and 14° C. (night) temperatures with a relative air humidity of 60% under a 16 hour photoperiod. Plants of the maize (Zea mays) Hi II hybrid genotype were grown in the greenhouse at 28° C. with light supplementation for a 12 hour photoperiod. Rice (Oryza sativa cv Nipponbare) grains were dehulled and sterilized with 75% ethanol and 2.5% sodium hypochlorite and then plated on ½ MS solid medium in a round glass cup, and covered with sterilized plastic film. Plants were grown at 28° C. with a photoperiod of 16 hours light and 8 hours dark for about 14-20 days in a growth chamber. Protoplasts were isolated from wheat and rice leaves as described elsewhere (Shan et al., supra).
Biolistic Transformation.
Immature wheat scutella (0.5-1.5 mm) were isolated from primary tillers harvested 16 days after anthesis and used for biolistic transformation about 1 hour after isolation, or cultured for 2-3 weeks to induce callus. Scutella isolation and culture conditions were as described elsewhere (Gil-Humanes et al., “Genetic transformation of wheat: Advances in the transformation method and applications for obtaining lines with improved bread-making quality and low toxicity in relation to celiac disease.” In Genetic Transformation. Ed. Alvarez, InTech; 2011:135-150) with an osmotic treatment applied between 1 hour before and 2 hours after bombardment. F2 immature zygotic embryos (1.5 to 2.0 mm) of corn were aseptically dissected from ears harvested 10 to 13 days post pollination. Corn immature embryos were isolated as described elsewhere (Ishida et al., Nat Protoc 2007, 2:1614-1621) and placed with the embryo axis facing down in culture medium to induce cell division and callus formation for 2-3 weeks. Biolistic bombardment of immature embryos or calli of the different species was carried out using a PDS-1000 gene gun. Equimolar amounts (1 pmol DNA mg⁻¹ of gold) of each plasmid were used for each experiment with 60 μg of gold particles (0.6 μm diameter) per shot. GFP images of transformed tissue were taken using a camera mounted on a Nikon microscope.
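The equimolar loading described above can be converted to a DNA mass per shot for a plasmid of a given size. The following is a minimal sketch only: it assumes an average mass of about 650 g/mol per double-stranded base pair, and the plasmid names and sizes are hypothetical placeholders, since vector sizes are not stated in this section.

```python
# Illustrative sketch: DNA mass of one plasmid loaded per shot at equimolar loading.
# Assumption (not from the text): one double-stranded base pair weighs ~650 g/mol.
AVG_BP_MASS_G_PER_MOL = 650.0


def dna_ng_per_shot(plasmid_bp, pmol_per_mg_gold=1.0, gold_ug_per_shot=60.0):
    """Return ng of a single plasmid carried on the gold used for one shot."""
    gold_mg = gold_ug_per_shot / 1000.0      # 60 ug of gold -> 0.06 mg
    pmol = pmol_per_mg_gold * gold_mg        # pmol of plasmid per shot
    # 1 pmol of an N-bp plasmid weighs N * 650 pg, i.e. N * 0.65 ng
    return pmol * plasmid_bp * AVG_BP_MASS_G_PER_MOL / 1000.0


if __name__ == "__main__":
    # Hypothetical plasmid sizes in base pairs, for illustration only.
    for name, size_bp in {"plasmid_A": 10_000, "plasmid_B": 15_000}.items():
        print(f"{name}: {dna_ng_per_shot(size_bp):.0f} ng per shot")
```

Because the loading is defined in moles rather than mass, larger plasmids contribute proportionally more DNA mass per shot while the number of molecules per shot stays the same.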
Genomic DNA and Total RNA Isolation from Wheat Callus and Protoplasts.
Total genomic DNA was isolated from ˜200 mg of wheat callus with a 20 mM Tris-HCl (pH 7.5), 250 mM NaCl, 25 mM EDTA, 0.5% SDS extraction solution that included RNase. A 2% cetyl trimethylammonium bromide (CTAB) solution was used for total genomic DNA isolation from leaves (˜50 mg tissue) and protoplasts (200,000 cells). RNA from wheat callus (˜200 mg) was isolated using TRIzol reagent (Invitrogen) according to the manufacturer's instructions, and treated with TURBO DNase (Ambion; Waltham, Mass.) to eliminate DNA contamination. RNA (500 ng) was converted to cDNA using the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems; Foster City, Calif.).
Detecting WDV Replicon Circularization.
Circularization of the replicon was detected by PCR using the Expand Long Template system (Roche; Basel, Switzerland). Specific primers were designed to detect circularization of the different pWDV constructs expressing GFP or TaCas9 (TABLE 1). PCR conditions in all cases were: 50-75 ng of DNA template, 0.15 μM of each primer, 1× Expand Long Template Buffer 1, and 1.87 U of the enzyme mix in a 25 μl reaction. Cycling conditions consisted of an initial denaturation step of 94° C. for 5 minutes followed by 30 cycles of 94° C. for 30 seconds, 55° C. for 30 seconds, and 68° C. for 1 minute, followed by a final extension step of 68° C. for 5 minutes.
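For bookkeeping, the cycling conditions above can be written as plain data. The sketch below is illustrative only: the dictionary layout and function names are assumptions, and the computed time counts programmed hold steps without instrument ramping.

```python
# Illustrative sketch: the circularization PCR program as (temperature C, seconds) steps.
PROGRAM = {
    "initial_denaturation": (94, 300),
    "cycles": 30,
    "cycle_steps": [(94, 30), (55, 30), (68, 60)],
    "final_extension": (68, 300),
}


def total_hold_seconds(program):
    """Sum of all programmed hold times; ramp times between steps are ignored."""
    per_cycle = sum(seconds for _, seconds in program["cycle_steps"])
    return (program["initial_denaturation"][1]
            + program["cycles"] * per_cycle
            + program["final_extension"][1])


def primer_pmol(conc_um, reaction_ul):
    """Amount of one primer in the reaction (uM x ul = pmol)."""
    return conc_um * reaction_ul


if __name__ == "__main__":
    print(f"Programmed hold time: {total_hold_seconds(PROGRAM) / 60:.0f} min")
    print(f"Each primer: {primer_pmol(0.15, 25):.2f} pmol per 25 ul reaction")
```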
Quantitative Real Time PCR (qRT-PCR).
qRT-PCR was used to assess both DNA copy number and gene expression from the replicons. qRT-PCR was performed using the FastStart Universal SYBR Green Mix kit (Roche) on the LIGHTCYCLER® 480 Instrument (Roche). To carry out copy number and relative gene expression experiments, primers were designed for the GFP coding sequence within the WDV constructs (GFP1_F and GFP1_R), for the Ubi-GFP control plasmid (GFP2_F and GFP2_R), and for the Rep and RepA coding sequence (Rep_F and Rep_R) (TABLE 1). The actin gene (actin_F and actin_R) and the RLI (similar to A. thaliana RNase L inhibitor protein) gene (RLI_F and RLI_R) were used as references to normalize replicon copy number and gene expression. qRT-PCR conditions were: 0.3 μM of each primer and 1× FastStart Universal SYBR Green Mix in a final volume of 10 μl, with either cDNA obtained from 20 ng of total RNA or 50 ng of total genomic DNA for quantification of gene expression and copy number, respectively. Primer efficiencies and Cq values were determined using the LinRegPCR v2013.0 software (Ruijter et al., Nucleic Acids Res 2009, 37:e45). Normalized copy number and gene expression were calculated using an equation described elsewhere (Hellemans et al., Genome Biol 2007, 8) for multiple reference genes, and the results were standardized using an adapted version of the Microsoft Excel Qgene template (Muller et al., BioTechniques 2002, 32:1372-1379). Three technical replicates were performed for each sample. Error bars in the figures represent standard errors of three different biological replicates (DNA or RNA from calli or protoplasts transformed independently).
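Normalization against multiple reference genes, as referred to above, is commonly expressed as the efficiency-corrected relative quantity of the target divided by the geometric mean of the relative quantities of the reference genes. The sketch below illustrates that general form only; the function names are hypothetical, the Cq values are invented for illustration, and it is not a reproduction of the exact calculation or software used.

```python
from math import prod

# Illustrative sketch of efficiency-corrected normalization with several
# reference genes. All numeric inputs below are hypothetical.


def relative_quantity(efficiency, cq_calibrator, cq_sample):
    """Efficiency-corrected relative quantity: E ** (Cq_calibrator - Cq_sample)."""
    return efficiency ** (cq_calibrator - cq_sample)


def normalized_relative_quantity(target, references):
    """target and each reference are (efficiency, cq_calibrator, cq_sample) tuples.

    The target quantity is divided by the geometric mean of the reference
    gene quantities (here, two references such as actin and RLI).
    """
    if not references:
        raise ValueError("at least one reference gene is required")
    rq_target = relative_quantity(*target)
    rq_refs = [relative_quantity(*ref) for ref in references]
    normalization_factor = prod(rq_refs) ** (1.0 / len(rq_refs))
    return rq_target / normalization_factor


if __name__ == "__main__":
    # Hypothetical values: amplification efficiency 2.0 means perfect doubling.
    gfp = (2.0, 25.0, 20.0)
    actin = (2.0, 20.0, 19.0)
    rli = (2.0, 22.0, 21.0)
    print(normalized_relative_quantity(gfp, [actin, rli]))
```

Using the geometric rather than the arithmetic mean keeps a single highly abundant reference gene from dominating the normalization factor.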
Molecular Characterization of Targeted Mutagenesis.
A PCR/restriction enzyme assay was performed to detect mutations induced by NHEJ in the ubiquitin gene in transfected protoplasts. Insertions or deletions at the DSB induced by sgUbi1 would result in loss of a HaeIII restriction site located just upstream of the PAM sequence. A 533-bp fragment containing the sgUbi1 target site was PCR amplified simultaneously from the 1AL, 1BL, and 1DL alleles using the Ubi_F and Ubi_R primer pair (TABLE 1). PCR conditions were 50 ng of DNA template, 0.5 μM of each primer, 1× Q5 Reaction Buffer (New England Biolabs; Ipswich, Mass.), and 0.5 U of the Q5 polymerase (New England Biolabs) in a 25 μl reaction. Cycling conditions consisted of an initial denaturation step of 98° C. for 30 seconds, followed by 40 cycles of 98° C. for 10 seconds, 64° C. for 20 seconds, and 72° C. for 20 seconds, with a final extension step of 72° C. for 5 minutes. The PCR product of each reaction was digested with HaeIII for 3 hours and resolved on a 2% agarose gel. The frequency of mutations was estimated by quantifying the intensity of the undigested and digested bands with the software ImageJ (Schneider et al., Nat Meth 2012, 9:671-675) as described elsewhere (Shan et al., Nat Protocols 2014, 9:2395-2410). Cleavage-resistant amplicons were gel purified, cloned into pJET1.2 (ThermoScientific; Waltham, Mass.) and sequenced.
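The band-intensity estimate described above can be sketched as the share of total lane signal that remains in the undigested (cleavage-resistant) band. This is a simplified illustration under that assumption; the function name and intensity values are hypothetical, and the cited protocol should be consulted for the exact calculation.

```python
# Illustrative sketch: mutation frequency from gel band intensities, taken as
# the fraction of signal in the undigested band. Intensities are hypothetical.


def indel_frequency_percent(undigested, digested_bands):
    """Percent of total band signal that resisted restriction digestion.

    undigested: intensity of the uncut amplicon band.
    digested_bands: intensities of the cleavage product bands.
    """
    total = undigested + sum(digested_bands)
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * undigested / total


if __name__ == "__main__":
    # Hypothetical background-subtracted intensities from one lane.
    print(f"{indel_frequency_percent(150.0, [500.0, 350.0]):.1f}% mutated amplicons")
```

A digested wild-type control lane is useful alongside this calculation, because incomplete digestion would otherwise be counted as mutation.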
Molecular Characterization of GT.
Gene targeting of each of the GFP, BFP, and dsRed reporters was detected by PCR. One primer in the genomic flanking region and one primer in the donor template were combined in each case to detect the 5′ and 3′ junctions of the insertion (TABLE 1). PCR conditions were: 150 ng of DNA template, 0.5 μM of each primer, 1× Q5 Reaction Buffer (New England Biolabs), and 0.5 U of the Q5 polymerase (New England Biolabs) in a 25 μl reaction. Cycling conditions consisted of an initial denaturation step of 98° C. for 30 seconds, followed by 40 cycles of 98° C. for 10 seconds, a variable annealing temperature for 20 seconds, and 72° C. for 45 seconds, and a final extension step of 72° C. for 5 minutes. Amplicons were separated on a 1% agarose gel, purified, and cloned into pJET1.2 (ThermoScientific) for sequencing. About 10 colonies of each transformation event were sequenced by Sanger sequencing and analyzed.
Quantifying Multiplexed GT in Wheat Protoplasts.
Multiplexed GT was calculated by dividing the number of protoplasts expressing both GFP and BFP by the total number of cells, and normalizing to the transformation efficiency of each experiment. A Nikon A1 Spectral Confocal Microscope was used to collect random photos with the different filters. ImageJ software was used to count the number of cells (GFP- and/or BFP-expressing cells and total number of cells) in 10 random images for each treatment and experiment. Transformation efficiency was estimated with a replicon-based control plasmid expressing GFP (pWDV2-GFP).
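The calculation described above can be sketched as follows. This is an illustrative sketch only: it assumes counts are pooled across the images before dividing and that transformation efficiency is expressed as a fraction of cells; the function name and all counts are hypothetical.

```python
# Illustrative sketch: multiplexed GT frequency from per-image cell counts,
# normalized to transformation efficiency. All counts are hypothetical.


def multiplexed_gt_percent(image_counts, transformation_efficiency):
    """Return the normalized percentage of cells expressing both reporters.

    image_counts: iterable of (double_positive_cells, total_cells), one per image.
    transformation_efficiency: fraction of cells transformed (0 < value <= 1),
        estimated from the GFP-expressing control plasmid.
    """
    counts = list(image_counts)
    double_positive = sum(d for d, _ in counts)
    total = sum(t for _, t in counts)
    if total == 0:
        raise ValueError("no cells counted")
    if not 0 < transformation_efficiency <= 1:
        raise ValueError("transformation efficiency must be in (0, 1]")
    return 100.0 * (double_positive / total) / transformation_efficiency


if __name__ == "__main__":
    # Ten hypothetical images, each with 3 double-positive cells out of 250.
    images = [(3, 250)] * 10
    print(f"{multiplexed_gt_percent(images, 0.4):.2f}% multiplexed GT")
```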
Two different geminiviruses (WDV and ToLCV) were deconstructed to create autonomous replicons that function in plant cells. Specifically, the movement protein (MP) and coat protein (CP) coding sequences were removed from WDV and ToLCV, thereby eliminating the possibility of cell-to-cell movement as well as plant-to-plant insect-mediated transmission. The lack of the CP also increases the copy number of dsDNA replicon intermediates (Padidam et al., J Virol 1999, 73:1609-1616), likely because CP is not available to sequester and package ssDNA into virions, and because the CP/Rep interactions that repress viral replication are lost (Malik et al., Virology 2005, 337:273-283). A GFP coding sequence was inserted into both vectors such that expression would be driven from the endogenous viral promoters (giving rise to ToLCV-GFP and pWDV2-GFP,
pWDV2-GFP and ToLCV-GFP were used to transform wheat calli by biolistics. As a control, calli also were transformed with a BeYDV replicon carrying a GFP cassette (pBeYDV-GFP) (Baltes et al., supra). Only cells transformed with pWDV2-GFP showed evidence of GFP expression (
Time Course of Replicon Amplification and Gene Expression.
Wheat calli were transformed with either pWDV2-GFP or the Ubi-GFP control to study protein expression and replication over a 2-week time course (
Different Architectures of WDV Show Differences in Replication and the Expression of the Heterologous Proteins.
To evaluate promoter activity for heterologous protein expression, GFP expression and copy number of the replicon were monitored by qRT-PCR in wheat calli transformed with the different replicon architectures (
WDV-Induced Targeted Mutagenesis.
To test whether the WDV-derived replicons enable targeted mutagenesis, a 20 nt chimeric single-guide RNA (sgRNA) that recognizes the third exon of the ubiquitin gene (sgUbi1) was designed. Wheat protoplasts were transformed with different WDV constructs expressing CRISPR/Cas reagents, namely sgUbi1 and a wheat codon-optimized Cas9 (TaCas9) (vectors pWDV1-CR, pWDV2-CR, and pWDV3-CR) (
The efficiency of GT in wheat scutella with the different replicon architectures was quantified and compared with a non-viral control (pCR.GFP), in which TaCas9 is driven by the ZmUbi promoter. Wheat scutella were transformed by particle bombardment, an approach frequently used to generate transgenic wheat plants. GT events were calculated by counting the number of cells expressing GFP 7 dpb (
In addition, a vector containing a sgRNA complementary to the MildewLocusO (MLO) gene (sgMLO1) was synthesized and designated pWDV1.CR.BFP. A donor template designed to knock-in a promoter-less bfp coding sequence by HR also was synthesized and designated P2A-bfp (
Multiplexed Gene Targeting in Wheat Cells.
Next, the ability of the WDV replicons to achieve multiplexed GT within the same cell was examined by simultaneously targeting integration of T2A-gfp and P2A-bfp into the ubiquitin and MLO loci, respectively. Wheat protoplasts were transfected with both pWDV1.CR.GFP and pWDV1.CR.BFP, and GT frequencies of 5.85% and 3.25% were observed for the GFP and BFP reporters, respectively. The GT frequency was 1.1% for simultaneous integration of both reporters (
Multiplexed GT also was accomplished with a single vector (pWDV1.CR.GFP+dsRed) designed to simultaneously modify the ubiquitin and the EPSPS (5-enolpyruvylshikimate-3-phosphate synthase) loci. pWDV1.CR.GFP+dsRed has the sgUbi1 sgRNA and T2A:gfp donor template described above for GT into the ubiquitin gene, but it also carries the sgEPS1 sgRNA and a donor template designed for in-frame integration of dsRed into the EPSPS coding sequence (P2A:dsRed) (
Taken together, these data demonstrate that WDV-derived replicons can increase GT frequencies 12-fold over standard methods of DNA delivery, and indicate that the promoter driving Cas9 expression may be critical for achieving high efficiency gene targeting. In addition, multiplexed, targeted integration by HR was achieved in all three homeoalleles (A, B, and D) of hexaploid wheat cells using CRISPR/Cas9. The reagents described herein therefore offer considerable potential for genome editing of staple cereal crop genomes, including complex polyploid genomes such as wheat.
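As a rough arithmetic check on the two-vector protoplast experiment reported above (5.85% GFP, 3.25% BFP, and 1.1% for both reporters), the observed double-integration frequency can be compared with the value expected if the two GT events occurred independently. Independence is an assumption made here purely for illustration, not a claim of the examples.

```python
# Illustrative arithmetic: expected double-GT frequency if the two events were
# statistically independent (an assumption), versus the reported frequency.


def expected_double_percent(freq_a_percent, freq_b_percent):
    """Product of two single-event frequencies, returned as a percentage."""
    return (freq_a_percent / 100.0) * (freq_b_percent / 100.0) * 100.0


if __name__ == "__main__":
    gfp_percent, bfp_percent, observed_both = 5.85, 3.25, 1.1  # reported values
    expected = expected_double_percent(gfp_percent, bfp_percent)
    print(f"Expected if independent: {expected:.2f}%")
    print(f"Observed / expected: {observed_both / expected:.1f}")
```

Under this assumption the expected frequency is about 0.19%, so the reported 1.1% is several-fold higher, which would be consistent with cells that receive and amplify one replicon being more likely to receive the other as well.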
It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
This application claims benefit of priority from U.S. Provisional Application Ser. No. 62/411,952, filed on Oct. 24, 2016, which is incorporated herein by reference in its entirety.
This invention was made with government support under IOS-1339209 awarded by the National Science Foundation. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2017/058031 | 10/24/2017 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62411952 | Oct 2016 | US |