The present invention relates to the field of biotechnology, in particular, to an RNA-guided targeted genome modification method for plants.
Over the past decade, the discovery and improvement of sequence-specific nucleases have strongly influenced the establishment of targeted mutagenesis. Zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) are the main representatives (Carroll et al, 2006; Christian et al, 2010). They are fusion proteins consisting of an engineered binding domain array for recognizing specific nucleic acid sequences and the non-specific nuclease FokI for DNA cleavage. The resulting double-strand breaks can be repaired via either the non-homologous end joining or the homologous recombination pathway in eukaryotic cells, thereby introducing site-specific nucleotide alterations or modifications. The above-mentioned techniques have been successfully applied in a number of species, including nematodes, human cells, mice, zebrafish, corn, rice, short grass, etc. (Beumer et al, 2006; Meng et al, 2008; Shukla et al., 2009; Meyer et al, 2010; Cui et al, 2011; Mahfouz et al, 2011; Li et al, 2012; Meyer et al, 2012; Shan et al, 2013; Weinthal et al, 2013). However, the main drawbacks of these techniques include low DNA recognition efficiency by protein elements, difficulty in engineering and vector construction, and limited DNA recognition specificity.
In 2012, a breakthrough new technology, CRISPR/Cas, was discovered and improved. CRISPR (clustered regularly interspaced short palindromic repeats) is composed of short direct repeats separated by unique sequences of similar length. Functional CRISPR RNAs (crRNAs) are processed from transcripts of the CRISPR array through base-pairing with another trans-activating crRNA (tracrRNA) at the direct repeats to form an RNA duplex that can be incorporated into Cas protein. The binary complex then surveys the genome for complementary DNA sequences and triggers double-strand breaks at the target sites.
Moreover, crRNA can be fused with tracrRNA to form a single-stranded chimeric RNA (chiRNA) molecule, which can also mediate the cleavage of targeted DNA sequences by Cas9 (Jinek et al., 2012). This editable type of CRISPR/Cas system was quickly and successfully applied in a number of species, including human cell lines, zebrafish, E. coli, mice and the like (Jinek et al, 2012; Hwang et al, 2013; Jiang et al, 2013; Jinek et al, 2013; Mali et al, 2013; Shen et al, 2013; Wang et al, 2013). The main advantages of this technique include simplicity of vector construction and the ability to modify genes at multiple target sites simultaneously. For animals, in vitro transcripts of chiRNA and Cas9 can be directly introduced (e.g. by injection) into embryonic cells, thereby causing heritable gene mutations. In mice, it was reported that genetic mutations were successfully introduced at up to five target sites simultaneously. However, due to the presence of the cell wall, such a technique is not easy to apply in plants.
Summing up, to meet requirements on plant genetic engineering, there is an urgent need to develop a simple and efficient targeted gene modification method for plants.
The object of the present invention is to provide a simple and efficient targeted gene modification method for plants.
Another object of the present invention is to provide a CRISPR/Cas toolkit suitable for plants to achieve successful and stable modification of targeted DNA sequences in progeny.
In the first aspect of the present invention, a targeted gene modification method for plant genome is provided, comprising the steps of:
(a) introducing a nucleic acid construct expressing chimeric RNA and Cas protein into a plant cell to obtain a transformed plant cell, wherein the chimeric RNA is a chimera consisting of CRISPR RNA (crRNA) specifically recognizing targeted sites to be modified (or to be cut) and trans-activating crRNA (tracrRNA); and
(b) under suitable conditions, forming chimeric RNA (chiRNA) through transcription of said nucleic acid construct in the transformed plant cell and expressing said Cas protein in said transformed plant cell, so that, in said transformed plant cell, site specific cleavage on genomic DNA is conducted by Cas protein under the guidance of said chimeric RNA, thereby performing targeted modification in genome.
In another preferred embodiment, said targeted modification includes random targeted modification and non-random targeted modification (precise targeted modification).
In another preferred embodiment, before genome DNA is cleaved by the chimeric RNA and Cas protein, a donor DNA is introduced into the plant cell, thereby performing precise targeted modification on genome. Said donor DNA is a single-stranded or double-stranded DNA and comprises DNA sequence to be inserted or replaced, and the DNA sequence may be a single nucleotide, or plurality of nucleotides (including DNA fragments or encoding genes).
In another preferred embodiment, said nucleic acid construct comprises a first nucleic acid sub-construct and a second nucleic acid sub-construct, wherein the first nucleic acid sub-construct and the second nucleic acid sub-construct are independent from each other, or integrated;
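wherein the first nucleic acid sub-construct comprises from 5′ to 3′ the following elements:
a first plant promoter;
encoding sequence of the chimeric RNA operably linked to the first plant promoter, and the encoding sequence of the chimeric RNA is shown in formula I:
A-B (I)
wherein A is DNA sequence encoding CRISPR RNA (crRNA); B is DNA sequence encoding trans-activating crRNA (tracrRNA); and “-” represents a linkage bond or a linker sequence between A and B; and
an RNA transcription terminator;
the second nucleic acid sub-construct comprises from 5′ to 3′ the following elements:
a second plant promoter;
encoding sequence of Cas protein operably linked to the second plant promoter, wherein the Cas protein is a fusion protein with nuclear localization sequence (NLS sequence) at N-end, C-end or both ends; and
a plant transcription terminator.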
In another preferred embodiment, there are one or more first nucleic acid sub-constructs (for multiple sites to be cut), which are independent of the second nucleic acid sub-construct, or the first nucleic acid sub-construct(s) and the second nucleic acid sub-construct are integrated.
In another preferred embodiment, relative position between each of the first nucleic acid sub-construct and the second nucleic acid sub-construct is arbitrary.
In another preferred embodiment, the following are operably linked from 5′ to 3′ between the second plant promoter and the encoding sequence of Cas protein:
the third nucleic acid sub-construct, and preferably, said third nucleic acid sub-construct is encoding sequence of p19 protein derived from Tomato bushy stunt virus (TBSV); and
self-splicing sequence, and preferably, said self-splicing sequence is encoding sequence of 2A polypeptide (SEQ ID NO.: 98).
In another preferred embodiment, the encoding sequence of p19 protein comprises the full-length sequence or cDNA sequence of p19 gene.
In another preferred embodiment, the sequence of 2A polypeptide is shown in SEQ ID NO.: 99.
In another preferred embodiment, the encoding sequence of p19 protein is shown in SEQ ID NO.: 100.
In another preferred embodiment, the amino acid sequence of p19 protein is shown in SEQ ID NO.: 101.
In another preferred embodiment, the targeted modifications include:
(i) in the absence of donor DNA, performing random insertions and deletions in specific sites of the plant genome; and
(ii) in the presence of donor DNA, performing precise insertion, deletion or replacement of DNA sequence in specific sites of the plant genome using the donor DNA as a template;
preferably, the targeted modification includes gene knock-out, gene knock-in (transgene) of the plant genome and regulation (up-regulation or down-regulation) of the expression level of endogenous genes.
In another preferred embodiment, said RNA transcription terminator is U6 transcription terminator, which is at least 7 consecutive Ts (TTTTTTT).
In another preferred embodiment, the first plant promoter is an endogenous promoter from a plant to be modified.
In another preferred embodiment, the first plant promoter is RNA polymerase III-dependent promoter from a plant to be modified.
In another preferred embodiment, the RNA polymerase III-dependent promoter includes AtU6-26, OsU6-2, AtU6-1, AtU3-B, At7SL or combinations thereof.
In another preferred embodiment, the plant transcriptional terminator is Nos.
In another preferred embodiment, the second plant promoter is an RNA polymerase II-dependent promoter, and preferably, comprises a constitutively expressed promoter or the sporocyteless (SPL) promoter specifically expressed in Arabidopsis germline cells.
In another preferred embodiment, in the second nucleic acid sub-construct, expression cassette of SPL gene is, from 5′ to 3′, operably linked behind the encoding sequence of Cas protein.
In another preferred embodiment, the expression cassette of SPL gene comprises intron, exon, untranslated region and terminator of SPL gene.
In another preferred embodiment, from 5′ to 3′, one or more sequences selected from the following group are operably linked to the expression cassette of SPL gene: sequence of SEQ ID NO.: 103 (intron 1), 104 (exon 2), 105 (intron 2), 106 (exon 3), 107 (3′ untranslated region), 108 (terminator).
In another preferred embodiment, the sequence of the plant transcription terminator in the second nucleic acid sub-constructs is shown in SEQ ID NO.: 108.
In another preferred embodiment, the nucleic acid construct is a plasmid simultaneously expressing the chimeric RNA and Cas protein.
In another preferred embodiment, the plant includes monocots, dicots and gymnosperms;
Preferably, said plant includes forestry plants, agricultural plants, crops, ornamental plants.
In another preferred embodiment, the plants include plants of the following families: Brassicaceae, Gramineae.
In another preferred embodiment, the plant includes but is not limited to Arabidopsis, rice, wheat, barley, corn, sorghum, oats, rye, sugarcane, rapeseed, cabbage, cotton, soybean, alfalfa, tobacco, tomato, peppers, squash, watermelon, cucumber, apple, peach, plum, crabapple, sugar beet, sunflower, lettuce, Artemisia annua, artichoke, stevia, poplar, willow, eucalyptus, clove, rubber trees, cassava, castor, peanut, peas and astragalus.
In another preferred embodiment, said Cas protein includes Cas9 protein.
In another preferred embodiment, the second plant promoter is RNA polymerase II-dependent promoter.
In another preferred embodiment, the RNA polymerase II-dependent promoter includes a constitutive promoter and the sporocyteless (SPL) promoter specifically expressed in Arabidopsis germline cells.
In another preferred embodiment, the first plant promoter includes AtU6-26, OsU6-2, AtU6-1, AtU3-B, At7SL or combinations thereof.
In another preferred embodiment, the second plant promoter includes 35S, UBQ, SPL promoter, or combinations thereof.
In another preferred embodiment, the method further comprises: before or after step (b), said transformed plant cell is regenerated into a plant.
In another preferred embodiment, the method further comprises: said transformed plant cell is detected for mutation or modification in genome.
In another preferred embodiment, the plant cell includes a plant cell derived from cultures, callus or plants.
In the second aspect, a nucleic acid construct used in targeted modification on plant genome is provided in the present invention, the nucleic acid construct comprising a first nucleic acid sub-construct and a second nucleic acid sub-construct, wherein the first nucleic acid sub-construct and the second nucleic acid sub-constructs are independent from each other, or integrated;
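wherein the first nucleic acid sub-construct comprises from 5′ to 3′ the following elements:
a first plant promoter;
encoding sequence of the chimeric RNA operably linked to the first plant promoter, and the encoding sequence of the chimeric RNA is shown in formula I:
A-B (I)
wherein A is DNA sequence encoding CRISPR RNA (crRNA); B is DNA sequence encoding trans-activating crRNA (tracrRNA); and “-” represents a linkage bond or a linker sequence between A and B; and
an RNA transcription terminator;
the second nucleic acid sub-construct comprises from 5′ to 3′ the following elements:
a second plant promoter;
encoding sequence of Cas protein operably linked to the second plant promoter, wherein the Cas protein is a fusion protein with nuclear localization sequence (NLS sequence) at N-end, C-end or both ends; and
a plant transcription terminator.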
In another preferred embodiment, the following are operably linked from 5′ to 3′ between the second plant promoter and the encoding sequence of Cas protein:
the third nucleic acid sub-construct, and preferably, said third nucleic acid sub-construct is encoding sequence of p19 protein derived from Tomato bushy stunt virus (TBSV); and
2A sequence.
In another preferred embodiment, the encoding sequence of p19 protein comprises the full-length sequence or cDNA sequence of p19 gene.
In another preferred embodiment, the encoding sequence of p19 protein is shown in SEQ ID NO.: 98.
In another preferred embodiment, said RNA transcription terminator is U6 transcription terminator, which is at least 7 consecutive Ts (TTTTTTT).
In another preferred embodiment, the plant transcriptional terminator is Nos.
In another preferred embodiment, the nucleic acid construct is DNA construct.
In another preferred embodiment, the first nucleic acid sub-construct and the second nucleic acid sub-construct are integrated.
In another preferred embodiment, there are one or more first nucleic acid sub-constructs (for multiple sites to be cut).
In another preferred embodiment, the first nucleic acid sub-construct and the second nucleic acid sub-construct are in the same plasmid.
In another preferred embodiment, the first nucleic acid sub-construct is located upstream or downstream to the second nucleic acid sub-construct.
In another preferred embodiment, the first plant promoter and/or second plant promoter is a constitutive or inducible promoter.
In another preferred embodiment, the encoding sequence of Cas protein further comprises NLS sequence located at both sides of ORF.
In another preferred embodiment, the second nucleic acid sub-construct further comprises Nos terminator located downstream to the encoding sequence of Cas protein.
In another preferred embodiment, the Cas protein further comprises a tag sequence.
In another preferred embodiment, the second nucleic acid sub-construct further comprises: a tag sequence (e.g. 3×Flag sequence) located between the second plant promoter and the encoding sequence of Cas protein.
In another preferred embodiment, the NLS sequence at N-end is located downstream to the tag sequence.
In the third aspect, a vector is provided in the present invention, said vector containing the nucleic acid construct according to the second aspect of the present invention.
The present invention also provides a vector combination, wherein the vector combination comprises a first vector and a second vector, wherein the first vector contains the first nucleic acid sub-construct of the nucleic acid construct according to the second aspect of the present invention, and the second vector contains the second nucleic acid sub-construct of the nucleic acid construct according to the second aspect of the present invention.
In another preferred embodiment, there are one or more first nucleic acid sub-constructs.
In another preferred embodiment, there can be one or more first vectors, each containing one or more first nucleic acid sub-constructs of the nucleic acid construct according to the second aspect of the present invention.
In the fourth aspect, a genetically engineered cell is provided in the present invention, the cell containing the vector or vector combination according to the third aspect of the present invention.
In the fifth aspect, a plant cell is provided in the present invention, wherein the nucleic acid construct according to the second aspect of the present invention is integrated into the genome of said plant cell.
In the sixth aspect, a method for producing a plant is provided in the present invention, comprising the step of regenerating the plant cell according to the fifth aspect of the present invention into a plant.
In the seventh aspect, a plant is provided in the present invention, wherein the nucleic acid construct according to the second aspect of the present invention is integrated into the genome of plant cells in said plant.
In the eighth aspect, a plant is provided in the present invention, wherein the plant is prepared according to the method of the sixth aspect.
It should be understood that in the present invention, the technical features specifically mentioned above and below (such as in the Examples) can be combined with each other, thereby constituting a new or preferred technical solution which need not be individually described.
Through comprehensive and intensive research, the inventors have successfully achieved RNA-guided targeted genome modification in plants by using nucleic acid constructs of specific structure. Using the method of the present invention, targeted cleavage and modification can be performed and a variety of different types of mutations can be efficiently introduced into specific sites, thereby facilitating the screening of modified new plants. In addition, with the germline-specific gene targeting system, the proportion of genetically modified plants among the transgenic offspring can be increased. Moreover, the inventors have also discovered that when a specific sequence is introduced into the nucleic acid construct of the present invention, the targeting efficiency in plants can be effectively improved and the developmental phenotype of a plant can be influenced. Based on the above findings, the present invention is completed.
Based on the experimental results, the present invention is particularly applicable to plants, and targeted cleavage on DNA sequence and gene modification in genome can be achieved in a stably inherited plant.
As used herein, the term “crRNA” refers to CRISPR RNA which is responsible for recognizing target sites.
As used herein, the term “tracrRNA” refers to trans-activating crRNA pairing with crRNA.
As used herein, the term “plant promoter” refers to a nucleic acid sequence initiating transcription of nucleic acid in a plant cell. The plant promoter may be derived from plants, microorganisms (such as bacteria, viruses) or animals, or an artificially synthesized or engineered promoter.
As used herein, the term “plant transcription terminator” refers to a terminator which can terminate transcription in plant cells. The plant transcription terminator may be derived from plants, microorganisms (such as bacteria, viruses) or animals, or an artificially synthesized or engineered terminator. Representative examples include (but are not limited to): Nos terminator.
As used herein, the term “Cas protein” refers to a nuclease. A preferred Cas protein is Cas9 protein. Typical Cas9 proteins include (but are not limited to): Cas9 derived from Streptococcus pyogenes SF370.
As used herein, the term “encoding sequence of Cas protein” means a nucleotide sequence encoding Cas protein with cleavage activity. In the case where the inserted polynucleotide sequence is transcribed and translated to produce functional Cas protein, a skilled person will appreciate that a large number of polynucleotide sequences can encode the same polypeptide due to codon degeneracy. In addition, a skilled person will also appreciate that different species have certain codon preferences, and codons for Cas protein will be optimized according to requirements on expression in different species. These variants should be included in the term “encoding sequence of Cas protein”. Furthermore, the term specifically includes full-length sequence of Cas gene sequence, a sequence which is substantially identical with Cas gene sequence, and a sequence encoding a protein which maintains the function of Cas protein.
As used herein, the term “plant” includes complete plants, plant organs (e.g., leaves, stems, roots, etc.), seeds and plant cells as well as progeny thereof. It is not necessary to particularly limit the type of plant which can be used in the method of the present invention, generally including any type of higher plants suitable for transformation, including monocots, dicots and gymnosperms.
As used herein, the term “heterologous sequence” is a sequence from different species, or, if from the same species, a sequence highly modified from its original form. For example, a heterologous structural gene operably linked to a promoter may be derived from a different species from which the structural gene is originally obtained, or, if from the same species, one or both of them are highly modified from their original forms.
As used herein, “operably linked to” or “operably linked” refers to a situation in which some parts of a linear DNA sequence can affect the activity of other parts in the same linear DNA sequence. For example, if a signal peptide DNA is expressed as a precursor and is involved in the secretion of polypeptide, then the signal peptide (secretory leader sequence) DNA is operably linked to polypeptide DNA; if a promoter controls transcription of a sequence, then it is operably linked to encoding sequence; and if a ribosome binding site is positioned in a position where it can be translated, then it is operably linked to encoding sequence. Generally, “operably linked to” means “adjacent”, and, for secretion leader sequence, it means “adjacent” in reading frame.
As used herein, the terms “encoding sequence of 2A polypeptide”, “self-splicing sequence”, and “2A sequence” refer to a protease-independent self-splicing amino acid sequence found in viruses, similar to IRES. Using 2A, simultaneous expression of two genes from a single promoter can be achieved. It is also widely found in various types of eukaryotic cells. Unlike IRES, the expression level of downstream proteins will not be reduced. However, after splicing, residues of 2A polypeptide remain linked to the upstream protein as a single entity, and a Furin proteolytic cleavage site (4 basic amino acid residues, such as Arg-Lys-Arg-Arg) can be added between the upstream protein and 2A polypeptide to completely remove the residues of 2A polypeptide from the end of the upstream protein.
As used herein, the terms “chimeric RNA (chiRNA)” and “single-stranded guide RNA (sgRNA)” are used interchangeably to refer to an RNA sequence, which contains encoding sequence of the structure of formula I and is capable of forming a complete RNA molecule through transcription.
Nucleic Acid Construct
A nucleic acid construct is provided in the present invention, said nucleic acid construct comprising a first nucleic acid sub-construct and a second nucleic acid sub-construct, wherein the first nucleic acid sub-construct and the second nucleic acid sub-construct are independent from each other, or integrated;
wherein the first nucleic acid sub-construct comprises from 5′ to 3′ the following elements:
a first plant promoter;
encoding sequence of the chimeric RNA operably linked to the first plant promoter, and the encoding sequence of the chimeric RNA is shown in formula I:
A-B (I)
wherein,
A is DNA sequence encoding CRISPR RNA (crRNAs);
B is DNA sequence encoding trans-activating crRNA (tracrRNA);
“-” represents a linkage bond or a linker sequence between A and B; wherein a complete RNA molecule is formed through transcription of the encoding sequence of the chimeric RNA, i.e., the chimeric RNA (chiRNA); and
an RNA transcription terminator (including but not limited to: U6 transcription terminator, at least 7 consecutive Ts);
the second nucleic acid sub-construct comprises from 5′ to 3′ the following elements:
a second plant promoter;
encoding sequence of Cas protein operably linked to the second plant promoter, and the Cas protein is a fusion protein with nuclear localization sequence (NLS sequence) at N-end, C-end or both ends;
a plant transcription terminator (including but not limited to Nos terminator, etc.).
In the present invention, the first plant promoter and the second plant promoter are strong enough to initiate production of effective amounts of chiRNA and Cas protein, thereby achieving site-directed modification of the plant genome.
In the present invention, it should be understood that the first nucleic acid sub-construct and the second nucleic acid sub-construct may be located on the same polynucleotide or different polynucleotides, or can also be located on the same vector or different vectors.
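For illustration only, the 5′ to 3′ element order of the two sub-constructs described above can be sketched as a small data model. This is a minimal sketch and not part of the described method: the class names, the field names and the helper `describe` are hypothetical, and the element labels are plain strings rather than real sequences.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChiRNACassette:
    """First nucleic acid sub-construct, elements in 5' to 3' order."""
    promoter: str                       # first plant promoter, e.g. "AtU6-26"
    target_name: str                    # hypothetical label for element A of formula I
    terminator: str = "U6 terminator"   # described as at least 7 consecutive Ts

    def elements(self) -> List[str]:
        # promoter, then A-B (crRNA fused to tracrRNA), then the RNA terminator
        return [self.promoter, f"crRNA({self.target_name})", "tracrRNA", self.terminator]


@dataclass
class CasCassette:
    """Second nucleic acid sub-construct, elements in 5' to 3' order."""
    promoter: str                       # second plant promoter, e.g. "2x35S" or "SPL"
    nls: str = "both"                   # NLS at "N" end, "C" end, or "both"
    upstream_orf: Optional[str] = None  # optional ORF such as "p19", followed by 2A
    terminator: str = "Nos terminator"

    def elements(self) -> List[str]:
        if self.nls not in ("N", "C", "both"):
            raise ValueError("nls must be 'N', 'C' or 'both'")
        parts = [self.promoter]
        if self.upstream_orf is not None:
            parts += [self.upstream_orf, "2A"]
        if self.nls in ("N", "both"):
            parts.append("NLS")
        parts.append("Cas9")
        if self.nls in ("C", "both"):
            parts.append("NLS")
        parts.append(self.terminator)
        return parts


def describe(chirna_cassettes: List[ChiRNACassette], cas_cassette: CasCassette) -> List[str]:
    """One text line per cassette; several chiRNA cassettes model multiple cut sites."""
    lines = [" - ".join(c.elements()) for c in chirna_cassettes]
    lines.append(" - ".join(cas_cassette.elements()))
    return lines


if __name__ == "__main__":
    guides = [ChiRNACassette("AtU6-26", "site1"), ChiRNACassette("AtU6-26", "site2")]
    for line in describe(guides, CasCassette("2x35S", upstream_orf="p19")):
        print(line)
```

Whether the cassettes sit on one vector or on separate vectors is not modelled here; the sketch only records the order of elements within each cassette.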
The above mentioned nucleic acid construct constructed in the present invention can be introduced into plant cells by conventional recombinant techniques for plant (e.g. Agrobacterium transfection technique), thereby obtaining plant cells containing the nucleic acid construct (or a vector containing the nucleic acid construct), or obtaining plant cells with said nucleic acid construct integrated into the genome.
In the plant cell, chiRNA formed through transcription of the nucleic acid construct of the present invention pairs with the expressed Cas protein, to site-specifically cleave genome, thereby introducing a variety of different mutations.
Furthermore, in order to obtain more seeds containing mutated genes, further improve the activity of CRISPR/Cas9 system in germline cells and reduce possible adverse effects on plant development from gene targeting technique, expression cassette of Arabidopsis SPOROCYTELESS (SPL) gene is used in the present invention to drive expression of Cas9 genes.
SPL gene is specifically expressed in germline cells of Arabidopsis, including megasporocytes and microsporocytes. In situ hybridization experiments demonstrate that transcription of Cas9 can be effectively initiated in germline cells by using the expression cassette of SPL gene. The results of mutant detection also demonstrate that the Cas9 expression system driven by SPL promoter does not affect the gene function, growth and development of T1 transgenic plants. However, a large number of heterozygotes, in which targeted genes are mutated, can be obtained in the transgenic population of T2 generation, indicating that the mutation of target gene occurs in germline cells.
For further improving the stability of sgRNA in plants and the gene-targeting efficiency of the CRISPR/Cas9 system, a gene-targeting vector psgR-Cas9-p19, co-expressing TBSV-p19 protein and Cas9 protein, is constructed. The protein activity of the correctly recombined YFFP gene is detected in an Arabidopsis transient expression system, and the results show that p19 protein can significantly improve the gene-targeting efficiency of the CRISPR/Cas9 system.
Furthermore, a p19 co-expression vector targeting Arabidopsis endogenous genes is constructed, and clear leaf developmental phenotypes can be found in about one-third of the obtained plants of T1 generation, suggesting that p19 inhibits the miRNA-regulated development process in plants. Results from Northern detection and quantitative analysis of gene expression show that the expression level of p19 protein is positively correlated with the cumulative amount of miR168 and sgRNA. Meanwhile, analysis of the phenotype and genotype of target sites also shows that the higher the expression of p19 in transgenic plants, the higher the probability of mutation in a target gene, which provides an important basis and means for further improving plant gene-targeting systems based on CRISPR/Cas9.
Method for Targeted Gene Cleavage
A method for targeted gene cleavage or modification on the genome of plants is also provided in the present invention, comprising the following steps:
(a) a nucleic acid construct expressing chimeric RNA and expressing Cas protein is introduced into a plant cell to obtain a transformed plant cell; and
(b) under suitable conditions, the nucleic acid construct in the transformed plant cell is transcribed to form chimeric RNA (chiRNA), and the transformed plant cell expresses said Cas protein, so that targeted cleavage on genome is performed by said Cas protein in said transformed plant cell, under the guidance of the chimeric RNA, thereby performing targeted modification on genome.
In the method of the present invention, in step (a), the nucleic acid constructs expressing chimeric RNA and expressing Cas protein can be in the same nucleic acid construct, or may be in different nucleic acid constructs.
In addition, if the plant or plant cell to be treated already contains a Cas protein expression cassette, it is sufficient to introduce only a nucleic acid construct expressing chimeric RNA.
Further, if it is necessary to perform targeted cleavage or targeted modification at multiple specific sites, a nucleic acid construct expressing a plurality of different chiRNAs (may be in the same or in different nucleic acid constructs) may be introduced into a plant cell.
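As a hedged illustration of targeting multiple sites, the sketch below assembles one chiRNA encoding sequence per target, in the order target-specific part, tracrRNA-derived scaffold, then a run of 7 Ts as the terminator. The function names are hypothetical, the scaffold is passed in by the caller (no real scaffold sequence is assumed here), and the rejection of internal runs of 7 Ts is an assumption of this sketch rather than a rule stated in this description.

```python
from typing import Dict

# 7 consecutive Ts, matching the U6 transcription terminator described above
TERMINATOR = "TTTTTTT"


def chirna_coding_sequence(protospacer: str, scaffold: str,
                           terminator: str = TERMINATOR) -> str:
    """Return protospacer + scaffold + terminator as one DNA string.

    The scaffold is whatever tracrRNA-derived sequence the caller supplies;
    this sketch does not assume any particular real sequence.
    """
    protospacer = protospacer.upper()
    scaffold = scaffold.upper()
    for name, seq in (("protospacer", protospacer), ("scaffold", scaffold)):
        if not seq or set(seq) - set("ACGT"):
            raise ValueError(f"{name} must be a non-empty A/C/G/T string")
    body = protospacer + scaffold
    # Assumption of this sketch: a terminator-like T run inside the body
    # could end transcription early, so such designs are rejected.
    if terminator in body:
        raise ValueError("internal terminator-like T run in chiRNA body")
    return body + terminator


def multiplex(targets: Dict[str, str], scaffold: str) -> Dict[str, str]:
    """Build one chiRNA encoding sequence per named target site."""
    return {name: chirna_coding_sequence(seq, scaffold)
            for name, seq in targets.items()}


if __name__ == "__main__":
    placeholder_scaffold = "ACGTACGT"  # dummy placeholder, not a real scaffold
    designs = multiplex({"siteA": "GACGT", "siteB": "GCCGA"}, placeholder_scaffold)
    for name, seq in designs.items():
        print(name, seq)
```

Each resulting sequence would sit behind its own first plant promoter, either on the same construct or on different constructs.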
Upon targeted cleavage, plant cells will be repaired through a variety of mechanisms, and a variety of mutations may often be introduced during the repair process. Based on this, plants or plant cells with desired mutation and desired performance can be screened for use in subsequent research or production.
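When screening for desired mutations, a first pass is often a simple comparison of the wild-type and edited allele sequences. The following is a minimal sketch of such a comparison, with a hypothetical function name and output format; it classifies by net length change only, and the frameshift flag is meaningful only if the compared region lies within a coding sequence.

```python
from typing import Dict, Union


def classify_allele(wild_type: str, edited: str) -> Dict[str, Union[str, int, bool]]:
    """Classify an edited allele relative to the wild-type sequence.

    Classification uses the net length change only, so a complex event with
    equal inserted and deleted lengths is reported as a substitution.
    """
    wild_type = wild_type.upper()
    edited = edited.upper()
    diff = len(edited) - len(wild_type)
    if edited == wild_type:
        kind = "unmodified"
    elif diff > 0:
        kind = "insertion"
    elif diff < 0:
        kind = "deletion"
    else:
        kind = "substitution"
    return {
        "type": kind,
        "net_length_change": diff,
        # A net change that is not a multiple of 3 shifts the reading frame
        # (only relevant when the region is protein-coding).
        "frameshift": diff % 3 != 0,
    }


if __name__ == "__main__":
    print(classify_allele("ACGTACGT", "ACGACGT"))    # one base deleted
    print(classify_allele("ACGTACGT", "ACGTTTACGT")) # two bases inserted
```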
Method for Precise Targeted Genome Modification
If it is necessary to perform precise targeted insertion, deletion or replacement of DNA sequence in plant genome, a donor DNA can be introduced before the initiation of targeted gene cleavage on genome by chimeric RNA and Cas protein. The donor DNA can be a single-stranded or double-stranded DNA, and contains the DNA sequence to be inserted or used as a replacement. The DNA sequence may be a single nucleotide, or a plurality of nucleotides (including DNA fragment or encoding gene). Upon targeted cleavage, precise targeted insertion, deletion or replacement for plant genome can be performed in a plant cell through the homologous recombination-mediated DNA repair system, using the donor DNA as a template. The donor DNA can be inserted into a specific location in plant genome or used to replace specific DNA sequences; it can also be used to replace a promoter, or to insert an enhancer or other DNA cis-regulatory elements to regulate the expression level of endogenous genes in a plant; and it can also be used to insert a polynucleotide sequence encoding a complete protein. The methods for introducing donor DNA include, but are not limited to: microinjection, Agrobacterium-mediated transfection, gene-gun, electroporation, ultrasonic method, liposome-mediated method, polyethylene glycol (PEG) mediated method, laser microbeam puncture, direct introduction of donor DNA after chemical modification (adding lipophilic groups) and the like.
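The role of donor DNA as a repair template can be illustrated with a purely string-level sketch: the donor carries the payload flanked by two arms copied from either side of the cut, and the expected result replaces the region between the arms with the payload. The function names, the `delete_len` parameter and the short arm length used in the example are all assumptions made for illustration; they do not reflect the donor designs actually used, and real homology arms are typically much longer.

```python
def build_donor(genome: str, cut_index: int, payload: str,
                arm_len: int, delete_len: int = 0) -> str:
    """Return left_arm + payload + right_arm for a cut before genome[cut_index].

    delete_len bases starting at cut_index are dropped from the result:
    delete_len=0 models insertion, payload="" models deletion, and both
    together model replacement.
    """
    if arm_len < 1 or delete_len < 0:
        raise ValueError("arm_len must be >= 1 and delete_len >= 0")
    right_start = cut_index + delete_len
    if cut_index < arm_len or right_start + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the homology arms")
    left_arm = genome[cut_index - arm_len:cut_index]
    right_arm = genome[right_start:right_start + arm_len]
    return left_arm + payload + right_arm


def expected_repair(genome: str, cut_index: int, donor: str,
                    arm_len: int, delete_len: int = 0) -> str:
    """Sequence expected if the donor is copied in precisely at the cut."""
    left_arm, right_arm = donor[:arm_len], donor[len(donor) - arm_len:]
    right_start = cut_index + delete_len
    if genome[cut_index - arm_len:cut_index] != left_arm:
        raise ValueError("left arm does not match the genome")
    if genome[right_start:right_start + arm_len] != right_arm:
        raise ValueError("right arm does not match the genome")
    payload = donor[arm_len:len(donor) - arm_len]
    return genome[:cut_index] + payload + genome[right_start:]


if __name__ == "__main__":
    genome = "AAAACCCCGGGGTTTT"
    donor = build_donor(genome, cut_index=8, payload="ATG", arm_len=4)
    print(donor)
    print(expected_repair(genome, 8, donor, arm_len=4))
```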
Use
The present invention can be used in plant genetic engineering for modifying various plants, especially crops and forestry plants with economic value.
The main advantages of the present invention include:
(a) targeted cleavage and modification can be specifically performed at specific positions in a plant genome;
(b) various forms of modifications can be efficiently introduced into specific positions;
(c) new genes can be efficiently introduced into specific positions;
(d) specific genes in the plant genome can be efficiently knocked out; and
(e) expression level of endogenous genes in a plant can be effectively regulated.
The invention will be further illustrated with reference to the following specific examples. It is to be understood that these examples are only intended to illustrate the invention, but not to limit the scope of the invention. For the experimental methods in the following examples without particular conditions, they are performed under routine conditions, such as conditions described in Sambrook et al., Molecular Cloning: A Laboratory Manual, New York: Cold Spring Harbor Laboratory Press, 1989, or as instructed by the manufacturer. All the percentages or fractions refer to weight percentage and weight fraction, unless stated otherwise.
General Materials and Methods
Growth of Arabidopsis and Rice
Wild-type Arabidopsis Col-0 (available from the American ABRC center) is used in experiments. Seeds are inoculated on MS medium and vernalized at 4° C. for 3 days, and then placed into long photoperiod growth chamber (16 h light/8 h darkness) at 22° C., and after 5-10 days, the seedlings are transplanted to nutrient soil.
Rice used in the experiment is Kasalath cultivar (purchased from China Rice Research Institute). After being transplanted to soil, the plants are grown in a greenhouse (16 h light, 30° C./8 h darkness, 22° C.).
Design of Target Sites
Suitable target sites for chiRNA are in the form of N1-20NGG, wherein N1-20 is the recognition sequence provided by the chiRNA vector construct, and NGG is a recognition sequence necessary for the CRISPR/Cas9 complex to bind to DNA target sites, called the PAM sequence.
G is used as the starting signal for transcription of U6 type small RNA; therefore, sequences in the form of GN19NGG are selected as target sites. In addition, a previous study showed that the CRISPR/Cas system can tolerate mismatch of target site from the side of PAM sequence up to five bases; therefore, if the first nucleotide in N1-20 is G, the synthesized oligo primer for the target site is linker+N1-20; and if the first nucleotide in N1-20 is not G, it is deemed to be G in the present Examples, and the synthesized oligo primer for the target site is linker+GN2-20.
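The target-site rule above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: the function names are hypothetical, the linker is passed in by the caller because its sequence is not given in this passage, and positions on the reverse strand are reported relative to the reverse-complemented input.

```python
import re
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_target_sites(seq: str) -> List[Tuple[str, int, str, str]]:
    """List (strand, start, N1-20, PAM) for every N20-NGG site on both strands.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits


def target_oligo(n20: str, linker: str) -> str:
    """Return linker+N1-20 if N1 is G, otherwise linker+G+N2-20."""
    n20 = n20.upper()
    if len(n20) != 20 or set(n20) - set("ACGT"):
        raise ValueError("target must be exactly 20 bases of A/C/G/T")
    core = n20 if n20[0] == "G" else "G" + n20[1:]
    return linker + core


if __name__ == "__main__":
    example = "GATTACAGATTACAGATTACTGG"  # made-up sequence: 20 nt followed by TGG
    for strand, start, n20, pam in find_target_sites(example):
        # "ACGT" is a dummy linker used only for this demonstration
        print(strand, start, n20, pam, target_oligo(n20, linker="ACGT"))
```

Only the design of the target-specific part is covered; the complementary oligo and the cloning overhangs depend on the vector and are left out.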
Construction of Vector
The coding sequence of SpCas9 was PCR-amplified from vector pX260 using primers Cas9-F and Cas9-R, and subcloned between the XhoI and BamHI sites of the pA7-GFP vector to replace its original GFP gene, thereby placing the 2×35S promoter upstream and the Nos terminator downstream of the coding sequence. Detailed construction of the pX260 and pA7-GFP vectors can be found in the literature (Voelker et al., 2006; Cong et al., 2013). Afterwards, the complete expression cassette from the 2×35S promoter to the Nos terminator is subcloned into the pBluescript SK+ vector (commercially available from Stratagene Inc., San Diego, Calif.) via the HindIII/EcoRI restriction sites, and the resulting vector is named 35S-Cas9-SK.
The AtU6-26 promoter is obtained through PCR amplification using AtU6-26F and AtU6-26R as primers and wild-type Arabidopsis thaliana Col-0 genomic DNA as a template, and subcloned into the pEasy-Blunt vector (available from TransGen Biotech, Beijing); a clone with KpnI preceding the promoter is selected. Afterwards, it is subcloned into the pBluescript SK+ vector (purchased from Stratagene Inc., San Diego, Calif.) using the KpnI/XhoI restriction sites. The 85 bp chiRNA inducing sequence is obtained through PCR amplification from the pX330 vector using Atu6-26-85F and AtU6-26-85R primers and fused with the AtU6-26 promoter to obtain a complete chiRNA expression vector (see
chiRNA expression cassette is subcloned into 35S-Cas9-SK through KpnI/EcoRI digestion for transient expression analysis; or digested using KpnI/SalI, and then subcloned into KpnI/EcoRI region of pCambia1300 vector (Cambia, Canberra, Australia) along with SalI/EcoRI fragment containing complete Cas9 expression cassette for transgene of Arabidopsis.
The OsU6-2 promoter is obtained through PCR amplification using OsU6-2F and OsU6-2R as primers and wild-type rice Nipponbare genomic DNA as a template, and then subcloned into the pEasy-Blunt vector (TransGen Biotech, Beijing).
OsU6-2 is transferred into the AtU6-26SK vector to replace the AtU6-26 promoter by the Transfer PCR method using TPCR-OSu6F and TPCR-OsU6R primers, thereby obtaining the OsU6-2SK vector (see
The pAtU6-26 fragment of the AtU6-26 promoter is obtained through PCR amplification using pAtU6-F-HindIII and pAtU6-R as primers and wild-type Arabidopsis thaliana Col-0 genomic DNA as a template. The chiRNA (i.e., sgRNA) fragment is obtained through PCR amplification using sgR-F-U6 and sgR-R-SmaI primers and the pX330 vector as a template. The pAtU6-chiRNA fragment (SEQ ID NO.: 40) is obtained through overlapping PCR using pAtU6-F-HindIII and sgR-R-SmaI primers and a mixture of the chiRNA and pAtU6 PCR products as a template, digested with HindIII and XmaI, and inserted into the corresponding sites of the pMD18T vector to give the psgR-At vector.
pAtUBQ1 promoter and terminator of AtUBQ1 are obtained through PCR amplification using pAtUBQ1-F-SmaI and pAtUBQ1-R-Cas as well as tUBQ1-F-BamHI and tUBQ-R-KpnI primers and wild-type Arabidopsis thaliana Col-0 genome as a template. Cas9 gene fragment is obtained through PCR amplification by using Cas9-F-pUBQ and Cas9-R-BamHI as primers and pX330 vector as a template. The above pAtUBQ1, Cas9 gene and terminator fragment of AtUBQ1 are digested with XmaI and NcoI, NcoI and BamHI, as well as BamHI and KpnI, and ligated into psgR-At vector digested with XmaI and KpnI, thereby finally obtaining psgR-Cas9-At backbone vector with pAtUBQ-Cas9-tUBQ (SEQ ID NO.: 41) as insert fragment.
A sequence complying with 5′-NNNNNNNNNNNNNNNNNNNNGG-3′ is selected as a target. For the psgR-Cas9-At vector, the sense strand 5′-GATTGNNNNNNNNNNNNNNNNNNN-3′ and the antisense strand 5′-AAACNNNNNNNNNNNNNNNNNNNC-3′ are synthesized. The two synthetic oligonucleotides are denatured and annealed to form a small double-stranded DNA fragment with linkers, which is inserted between the two BbsI sites of psgR-Cas9-At, thereby obtaining a psgR-Cas9-At vector for the specific target site. The complete pAtU6-chiRNA element is amplified from a psgR-At vector carrying an inserted target gene fragment using pAtU6-F-KpnI and sgR-EcoRI as primers, digested with KpnI and EcoRI, and inserted into a psgR-Cas9-At vector carrying a pAtU6-chiRNA element for another target gene, thereby obtaining the p2×sgR-Cas9-At vector. Afterwards, the vector is digested with HindIII and EcoRI, and the complete 2×sgr-Cas9-At is subcloned into the pCambia1300 vector (Cambia, Canberra, Australia) to obtain the binary vector p2×1300-sgr-Cas9 for transformation of Arabidopsis.
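The oligonucleotide pattern given above (sense strand GATT + G + 19 target bases, antisense strand AAAC + reverse complement of those 19 bases + C) can be sketched in code. This is only an illustrative sketch under stated assumptions: the function names `revcomp` and `design_oligos` are hypothetical, the input is taken to be the 20-nt target without the PAM, and the first base is replaced by G as in the sense-strand pattern.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def design_oligos(protospacer):
    """Build the sense/antisense oligo pair for a 20-nt target (5'->3', no PAM).

    Assumption: position 1 is written as G regardless of the genomic base,
    matching the 5'-GATTG...-3' pattern in the text.
    """
    p = protospacer.upper()
    if len(p) != 20 or set(p) - set("ACGT"):
        raise ValueError("expected a 20-nt sequence of A, C, G, T")
    n19 = p[1:]                                # target positions 2-20
    sense = "GATT" + "G" + n19                 # linker + G + N19
    antisense = "AAAC" + revcomp(n19) + "C"    # linker + revcomp(N19) + C
    return sense, antisense

if __name__ == "__main__":
    # Hypothetical target used only for illustration.
    s, a = design_oligos("ATCGATCGATCGATCGATCG")
    print(s)
    print(a)
```

After annealing, the 20 bases following each 4-nt linker pair with each other, leaving GATT and AAAC as single-stranded 5′ ends for insertion between the BbsI sites.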
Construction of pUBQ-Cas9-sgR Series Vectors
Primers sgR-Bsa I-F/R are synthesized, phosphorylated with polynucleotide kinase (PNK), slowly annealed, and ligated into the Bbs I site of psgR-Cas9-At. The resulting psgR-Cas9-Bsa vector is digested with EcoR I and HindIII and ligated into the pBin19 vector, thereby obtaining the pUBQ-Cas9-sgR vector. Synthesized primers sgR-AP1-S27/A27 and sgR-AP1-S194/A194 are likewise ligated into the BsaI site of the pUBQ-Cas9-sgR vector according to the above method, thereby obtaining pUBQ-Cas9-sgR-AP1-27 and pUBQ-Cas9-sgR-AP1-194.
Construction of pSPL-Cas9-sgR Series Vector
Primers SPL5′-F-XmaI and SPL5′-R-BsaI are synthesized, and the promoter sequence at the 5′ end of the SPL gene is amplified from the Arabidopsis genome. This fragment is digested with Xma I and Bsa I, and ligated into the Xma I and Nco I sites of psgR-Cas9-Bsa, thereby obtaining pSPL-Cas9-5′. Primers SPL3′-F-BamHI and SPL3′-R-KpnI are synthesized, and the sequence at the 3′ end of the SPL gene, which comprises exons (SEQ ID NO.: 104, 106), two introns (SEQ ID NO.: 103, 105) and the terminator (SEQ ID NO.: 108) following the SPL gene, is amplified from the Arabidopsis genome, digested with BamH I and Kpn I, and ligated into pSPL-Cas9-5′ to give pSPL-Cas9-53′. The resulting plasmid is digested with Xma I and Kpn I, and ligated into pUBQ-Cas9-sgR, thereby obtaining the pSPL-Cas9-sgR vector. The synthesized primers sgR-AP1-S27/A27 and sgR-AP1-S194/A194 are likewise ligated into the Bsa I site of the pSPL-Cas9-sgR vector according to the above method, thereby obtaining pSPL-Cas9-sgR-AP1-27 and pSPL-Cas9-sgR-AP1-194.
Construction of psgR-Cas9-p19 Vector
TBSV-p19-2A gene containing Nco I site is synthesized by GENEWIZ, Inc. The gene fragment is digested with NcoI, and then inserted into NcoI site of psgR-Cas9 vector. The insertion direction of the fragment is identified by using p19-F and Cas9-378R primers, thereby obtaining psgR-Cas9-p19 vector.
Construction of psgR-Cas9-MRS1/2 Vectors
Primers sgR-MRS1-S/A and sgR-MRS2-S/A are synthesized respectively, and ligated into the Bbs I site of psgR-Cas9-At, thereby obtaining the psgR-Cas9-MRS1 and psgR-Cas9-MRS2 vectors.
Construction of psgR-Cas9-MRS1/2-p19 Vectors
Primers sgR-MRS1-S/A and sgR-MRS2-S/A are synthesized respectively, and ligated into the Bbs I site of psgR-Cas9-p19, thereby obtaining the psgR-Cas9-MRS1-p19 and psgR-Cas9-MRS2-p19 vectors.
Construction of 1300-psgR-Cas9-p19-AP1/TT4 Vector
Primers sgR-AP1-S27/A27, sgR-AP1-S194/A194, sgR-TT4-S65/A65 and sgR-TT4-S296/A296 are synthesized, phosphorylated with polynucleotide kinase (PNK), annealed, and ligated into the Bbs I site of psgR-Cas9-p19, thereby obtaining psgR-Cas9-p19-AP1-27, psgR-Cas9-p19-AP1-194, psgR-Cas9-p19-TT4-65 and psgR-Cas9-p19-TT4-296. psgR-Cas9-p19-AP1-194 and psgR-Cas9-p19-TT4-296 are amplified using AtU6-F-KpnI and sgR-R-EcoRI primers, and the resulting fragments are digested with Kpn I and EcoR I and ligated into psgR-Cas9-p19-AP1-27 and psgR-Cas9-p19-TT4-65, respectively, thereby obtaining psgR-Cas9-p19-AP1 and psgR-Cas9-p19-TT4. Both plasmids are digested with HindIII and EcoR I, and the fragments are recovered and ligated into the pCAMBIA1300 vector, thereby obtaining the 1300-psgR-Cas9-p19-AP1 and 1300-psgR-Cas9-p19-TT4 vectors.
Analysis of Homologous Recombination-Based Transient YF-FP Report System
Homologous recombination-based transient YF-FP report system is constructed based on pA7-YFP. pA7-YFP vector can be found in
Creation of Stable Transgenic Arabidopsis and Rice Plants
Agrobacterium GV3101 is transformed with the pCambia1300 vector containing the complete expression cassettes of SpCas9 and chiRNA. Robust wild-type Col-0 plants at the full-bloom stage are selected and transformed by the floral dip method (Clough and Bent, 1998). The transformed plants are grown under normal management until seeds are harvested. The harvested T1 seeds are sterilized with 5% sodium hypochlorite for 10 minutes, rinsed four times with sterile water, and sown on MS0 medium containing 20 μg/L of hygromycin or 50 μM kanamycin for screening. The seeds are kept at 4° C. for 2 days, transferred to a 12-hour light incubator for 10 days, and the seedlings are then transplanted to a 16-hour light greenhouse for further cultivation. Transgenic rice plants are obtained by Agrobacterium-mediated transformation of rice calli (Hiei et al., 1994).
Digestion and Sequencing Analysis of Genome Modification
Genomic DNA of positive transformants obtained through hygromycin screening is extracted, and the target region is PCR-amplified using primers corresponding to the target site and recovered. About 400 ng of recovered PCR product for each sample is digested overnight with the corresponding restriction enzyme. The digestion reaction is analyzed by agarose gel electrophoresis (1.2-2%). Residual uncleaved bands after digestion are recovered and ligated into the pZeroBack/blunt vector (TianGen Biotech, Beijing). Plasmids are prepared from shaking cultures of single clones and subjected to Sanger sequencing using the M13F primer.
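The logic of the digestion assay above is that an amplicon whose restriction site has been destroyed at the target site stays uncut. The following minimal sketch illustrates this under stated assumptions: the function names are hypothetical, only non-degenerate recognition sites are handled, and EcoRI (G^AATTC) and the short sequences are illustrative placeholders that are not taken from the Examples.

```python
def digest_fragments(amplicon, site, cut_offset):
    """Fragment lengths after complete digestion at every occurrence of `site`.

    `cut_offset` is the number of bases from the start of the recognition
    site to the cut on the top strand (1 for EcoRI, G^AATTC).
    """
    amplicon, site = amplicon.upper(), site.upper()
    cuts = []
    i = amplicon.find(site)
    while i != -1:
        cuts.append(i + cut_offset)
        i = amplicon.find(site, i + 1)
    bounds = [0] + cuts + [len(amplicon)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

def is_digestion_resistant(amplicon, site):
    """True if the recognition site is absent, i.e. the band stays uncut."""
    return site.upper() not in amplicon.upper()

if __name__ == "__main__":
    # Hypothetical wild-type amplicon and a 1-bp deletion allele.
    wild_type = "A" * 10 + "GAATTC" + "T" * 10
    mutant = "A" * 10 + "GATTC" + "T" * 10
    print(digest_fragments(wild_type, "GAATTC", 1))
    print(digest_fragments(mutant, "GAATTC", 1))
    print(is_digestion_resistant(mutant, "GAATTC"))
```

In this picture, the uncleaved bands recovered for cloning correspond to amplicons for which `is_digestion_resistant` would return true.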
Identification of Mutant for Germline Cell Targeting
For each of the 4 different T1 transgenic populations, 32 strains are randomly selected; one leaf and one inflorescence are sampled after two weeks of growth and after flowering, respectively, and genomic DNA is extracted using the CTAB method. Target gene fragments are PCR-amplified using primers AP1-F133/271R and sequenced; for a mutant, multiple signal peaks appear starting from the cleavage site. For the T2 transgenic populations, 8 mutated strains are randomly selected, and 12 individual plants of each are tested. PCR products whose sequencing results show multiple signal peaks are subjected to TA cloning, and 10 single clones are picked and sequenced to determine the type of gene mutation.
Identification of Mutants Containing p19 Protein
For each of the 1300-psgR-Cas9-p19-AP1/TT4 T1 transgenic plant populations, 60 strains are randomly selected and grown for 2 weeks; one leaf is then taken from each, and genomic DNA is extracted using the CTAB method. Gene fragments are PCR-amplified using AP1-F133/271R and TT4-F159/407R primers. PCR bands are detected by electrophoresis, and the fragments produced are counted and recorded for each plant line together with the relevant developmental phenotypes.
In Situ Hybridization
1. Material embedding: inflorescences of transgenic plants after bolting are selected as materials, fixed with 4% paraformaldehyde for 12 hours, dehydrated with graded alcohol, transparentized with xylene and embedded in paraffin.
2. Preparation of probe: the Cas9 gene is amplified with primers dCas9-F3-F/R, and the resulting fragments are digested with PstI and BamHI and ligated into the pTA2 vector. The resulting vector is linearized with Sal I and used as the DNA template, and antisense and sense Biotin labeled RNA probes (Roche, 11175025910) are transcribed in vitro using T7 and SP6 RNA polymerase, respectively. The products are digested with DNase I, subjected to alkaline hydrolysis, purified, and dissolved in formamide for storage.
3. In situ hybridization is performed following the method reported in the literature (Brewer P B, Heisler M G, Hejatko J, Friml J, Benkova E (2006) In situ hybridization for mRNA detection in Arabidopsis tissue sections. Nat Protoc 1: 1462-1467).
Northern Hybridization
Inflorescences of a plant at the flowering stage are taken, and total RNA is extracted using the Trizol method (Invitrogen). 50 μg of each sample is loaded, and target RNA bands are separated on a 15% PAGE gel and transferred to a nitrocellulose membrane (Hybond, Amersham) by the wet transfer method. UV cross-linking is performed for two minutes, pre-hybridization is performed in hybridization solution (DIG EASY Hyb, Roche) for 1 hour, 20 μM digoxigenin-labeled artificial sequence probe (Invitrogen) is added, and hybridization is conducted at 42° C. overnight. The membrane is washed twice in 2×SSC, 0.1% SDS (10 min each) and twice in 0.1×SSC, 0.1% SDS (10 min each). Target bands are detected with a digoxigenin detection kit (Thermo Fisher), exposed to X-ray film for 15 minutes, and developed.
Realtime PCR
Total RNA extracted from a plant is treated with DNase I (Takara) for 30 minutes. After phenol-chloroform purification, 5 μg is taken and subjected to reverse transcription (Takara). The product is diluted 1-fold, 1 μl is taken as template, and the Realtime-PCR reaction system (Biorad) is prepared. Each sample is run in triplicate, the ACTIN gene is used as the internal control, wild-type Col is used as the control, and the relative change in gene expression is calculated with the 2^(-ΔΔCt) method.
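The relative-expression calculation named above can be written out as follows. This is a minimal sketch of the standard 2^(-ΔΔCt) calculation, not the authors' analysis script: the function name `relative_expression` and the Ct values in the usage example are hypothetical, and triplicate Ct values are simply averaged before the calculation.

```python
from statistics import mean

def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values. The reference gene
    corresponds to ACTIN and the control to wild-type Col in the text.
    """
    # dCt: target normalized to the internal reference gene.
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCt: sample normalized to the control plant.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)

if __name__ == "__main__":
    # Hypothetical triplicate Ct values, for illustration only.
    fold = relative_expression([22.0, 22.0, 22.0], [18.0, 18.0, 18.0],
                               [24.0, 24.0, 24.0], [18.0, 18.0, 18.0])
    print(fold)
```

A result above 1 indicates higher expression in the sample than in the control, and a result below 1 indicates lower expression.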
Sequence Information
CRISPR/Cas9 of Streptococcus pyogenes SF370 was used to cause targeted double-strand breaks of DNA in Arabidopsis protoplasts.
Results are shown in
Single binary vector for Agrobacterium-mediated transformation of Arabidopsis and rice was constructed to express chiRNA and hSpCas9, and two Arabidopsis genes BRI1 and GAI as well as one rice gene ROC5 were selected to design target site.
Results are shown in
Stable transgenic plants of Arabidopsis and rice were generated with targeted gene sites modified.
Results are shown in
The results show that a large percentage of T1 transgenic Arabidopsis plants exhibit, during the early growth stage, a phenotype similar to that of homozygous mutants of the target gene locus. RFLP digestion analysis showed that, for target sites in certain transgenic plants, a significant fraction of the PCR products could not be digested, indicating that the native restriction sites at the target sites have been lost in some cells of these plants. Further sequencing results show that T1 transgenic plants for the selected Arabidopsis and rice target genes carry multiple types of DNA mutations at the target gene locus, including short deletions, insertions and replacements. This means that CRISPR/Cas systems can efficiently perform targeted gene cleavage at multiple genomic sites in transgenic Arabidopsis and rice plants, thereby producing modifications of specific genes.
Targeted gene insertions and deletions were induced in BRI1 1 gene locus 1 of several Arabidopsis plants by using engineered chiRNA: Cas9 (
Results are shown in
Targeted gene insertions and deletions were induced in BRI1 2 gene locus 1 of several Arabidopsis plants by using engineered chiRNA: Cas9.
Results are shown in
Targeted gene insertions and deletions were induced in BRI1 2 gene locus 3 of several Arabidopsis plants by using engineered chiRNA: Cas9.
Results are shown in
Targeted gene insertions and deletions were induced in GAI gene locus 1 of Arabidopsis by using engineered chiRNA: Cas9.
Results are shown in
Targeted gene insertions and deletions were induced in ROC5 gene locus 1 of rice by using engineered chiRNA: Cas9.
Results are shown in
Summary of part of experiments of the above Examples is shown in Table 2:
Example 4 was repeated, except that AtU6-26 was replaced by the promoter AtU6-1. Targeted gene insertions and deletions were induced in BRI1 1 gene locus 1 of several Arabidopsis plants by using engineered chiRNA: Cas9.
10 independent T1 transgenic rice plants were sequenced. The results showed that mutations can also be introduced into the genome by using AtU6-1, but at a lower frequency, less than 10% of that obtained with AtU6-26. This suggests that AtU6-26 is a particularly preferred first plant promoter.
Two different genes in Arabidopsis were simultaneously mutated at target sites.
P2×1300-sgr-Cas9 vector was used in several Arabidopsis plants to induce targeted gene insertions and deletions at CHLI1 and CHLI2 loci. Results are shown in
chiRNA oligos used in the construction of vectors are sgCHLI1-S101 and sgCHLI1-A101, as well as sgCHLI2-S280 and sgCHLI2-A280 in Table 3. PCR primers used in SURVEYOR analysis for detecting transgenic plants are CHLI1-3-F and CHLI1-262-R, as well as CHLI2-3-F and CHLI2-463-R in Table 3.
Simultaneous mutation at two target sites within the same Arabidopsis gene, and deletion of the large fragment between them, were achieved.
P2×1300-sgr-Cas9 vector was used in several Arabidopsis plants to induce targeted gene insertions and deletions at two sites of TT4 gene and cause deletion of large fragment between the two sites. Results are shown in
chiRNA oligos used in the construction of vectors are sgTT4-S65 and sgTT4-A65, as well as sgTT4-S296 and sgTT4-A296 in Table 3. PCR primers used in SURVEYOR analysis for detecting transgenic plants are TT4-1-F and TT4-362-R, as well as TT4-F-159 and TT4-407-R in Table 3.
Arabidopsis plants of T1 generation
Arabidopsis plants of T1 generation
To achieve specific expression of the Cas9 gene in germline cells of Arabidopsis, a 3.7 Kb sequence upstream of the SPL gene was cloned as the promoter and a 1.5 Kb downstream fragment was cloned as the terminator. The humanized Cas9 gene of Streptococcus was used to replace the first exon of the SPL gene, and all of the introns as well as the second and third exons of the SPL gene were retained (
In situ hybridization results showed that, promoter of SPL gene can drive Cas9 gene to be specifically expressed in tapetum cell (
To compare the gene targeting efficiency of the pSPL-Cas9-sgR vector and the pUBQ-Cas9-sgR vector, gene targeting vectors for nucleotide site No. 27 and nucleotide site No. 194 of the coding sequence of Arabidopsis APETALA1 (AP1) were constructed and used to transform Arabidopsis thaliana. Through PCR amplification of the target gene sequence and alignment of the sequencing results, it was found that gene mutations can be detected in plants of the T1 and T2 generations for the pUBQ-Cas9-sgR series of vectors, while mutations can only be detected in the T2 transgenic population for the pSPL-Cas9-sgR series of vectors (
Statistics of the gene targeting activities of the targeting vectors at different developmental stages and in different generations revealed, firstly, that cleavage efficiency differs between targeting sites. For the pUBQ-Cas9-sgR vector, the efficiency at AP1-27 was higher than at AP1-194, both in leaves and in inflorescences. Secondly, for some strains, mutations can be detected in leaves but not in inflorescences. Furthermore, in the T2 transformants, the mutation efficiency at the AP1-194 site for pSPL-Cas9-sgR was higher than at AP1-27, and nearly double that of pUBQ-Cas9-sgR transformants during the same period (
To compare the types of gene mutation produced by the different gene targeting systems, 8 T2 transgenic strains containing targeted gene mutations were randomly selected from the 4 transgenic populations, and 12 individual plants were tested for each strain. The results showed that the constitutively expressed gene targeting system can produce a certain percentage of homozygotes (2-4%) and heterozygotes (11-12%); however, chimeras, whose genotype is unclear, or wild-type plants account for the vast majority (73%-84%). For the germline cell-specific targeting vector, about 30% heterozygotes can be stably produced, and no homozygous plant was obtained (
Based on the existing gene targeting vector of Arabidopsis (
In transient expression system of Arabidopsis, protoplasts are co-transformed by CRISPR/Cas9 vector with or without p19 and YFFP reporter gene. YFFP reporter gene is the encoding gene of yellow fluorescent protein (YFP) with part of repeats, and under normal circumstances, can not be correctly expressed and translated. However, under recognition and cleavage of CRISPR/Cas9 system, double-stranded DNA breaks (DSB) will occur and endogenous DNA repair mechanism in plants will be activated to remove the repeated gene fragment, thereby producing normal and functional protein YFP (
To verify the function of p19 protein in improving the efficiency of plant gene targeting in a stably transformed system, two endogenous Arabidopsis genes, AP1 and TT4, were selected as targets in this Example, and two groups of CRISPR/Cas9 gene knockout vectors, with and without p19 protein, were constructed and used to transform Arabidopsis thaliana. In the four T1 transgenic populations obtained, leaf developmental phenotypes of varying severity were observed. According to the severity of the phenotype, the plants can be divided into three types: flat type (1/−), curl type (2/+) and serration type (3/++); it is thus presumed that p19 protein may also interfere with the miRNA-regulated leaf development process in plants (
For verification, expression levels of sgRNA and miR168 in plants with different phenotypes were detected respectively, and it was found that the cumulative levels of sgRNA and miRNA were the highest in the plants with severe leaf phenotype (
To understand whether p19 protein can improve targeting activity of CRISPR/Cas9 system while stabilizing sgRNA, developmental phenotype of leaves and gene mutations were recorded in two different 1300-psgR-Cas9-p19 transgenic populations, respectively.
The results showed that in both populations, about one-third of the plants exhibited a severe developmental phenotype and about one-fifth exhibited a slight leaf developmental phenotype. In each population, the probability of targeted gene mutations is significantly higher in plants with a leaf developmental phenotype than in plants without one (
All references mentioned in the present application are incorporated herein by reference, as though each were individually incorporated by reference. Additionally, it should be understood that, after reading the above teaching, those skilled in the art may make many variations and modifications, and these equivalents also fall within the scope defined by the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
201310299527.4 | Jul 2013 | CN | national |
201310398734.5 | Sep 2013 | CN | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2014/082144 | 7/14/2014 | WO | 00 |