This application contains a Sequence Listing electronically submitted via EFS-Web to the United States Patent and Trademark Office as an ASCII text file entitled “15500_0005-00155_SL.txt” having a size of 178,681 bytes and created on Nov. 17, 2021. The information contained in the Sequence Listing is incorporated by reference herein.
The present invention pertains to the technical field of genetic engineering, and specifically relates to a method for generating a site-specific mutation in an organism in the absence of an artificial DNA template and a use thereof.
The genetic engineering technology for modifying the genomes of organisms has been widely used in industrial and agricultural production, for example in genetically modified microorganisms commonly used in the pharmaceutical and chemical fields, and in insect-resistant and herbicide-resistant genetically modified crops in the agricultural field. With the advent of site-specific nucleases, it has become possible to introduce a targeted break into the genome of a recipient organism and rely on spontaneous repair to achieve site-specific editing and more precise modification of the genome.
Gene editing tools mainly include three types of sequence-specific nuclease (SSN): zinc finger nuclease (ZFN), transcription activator-like effector nuclease (TALEN) and the clustered regularly interspaced short palindromic repeats (CRISPR)-associated Cas system (CRISPR/Cas system). Sequence-specific nucleases are programmable nucleases that can generate DNA double-strand breaks (DSBs) at specific sites in a genome. DNA double-strand breaks activate the endogenous DNA repair pathways of the cell, but the repair process readily changes the DNA sequence at the target site, thereby introducing mutations at sites of interest. This technology enables biologists to accurately target a gene and edit it. Both ZFN and TALEN require specific recognition protein modules to be designed for each target sequence, and therefore have low throughput and are complex to operate. In the CRISPR/Cas system, by contrast, the Cas protein is universal: a guide RNA (gRNA) can be formed by a specific CRISPR RNA (crRNA) designed for a target site, either alone or in conjunction with a trans-activating crRNA (tracrRNA), or a single guide RNA (sgRNA) alone is sufficient. The crRNA and tracrRNA together, or the sgRNA alone, can be assembled with the Cas protein to form a ribonucleoprotein complex (RNP), which identifies the target sequence on the basis of a protospacer adjacent motif (PAM) in the genome, thereby realizing site-specific editing. The CRISPR/Cas system has thus become the main gene editing tool because of its simple operation, wide application range and high throughput.
Sequence-specific nuclease can produce DNA double-strand breaks at specific sites in the genome. These DNA double-strand breaks can be repaired into a variety of different repair types, which are mainly base insertions or deletions. For example, the two most common types of CRISPR/Cas9 editing events are inserting a base at the break or deleting a base at the break (Shen et al. 2018. Predictable and precise template-free CRISPR editing of pathogenic variants. Nature. DOI: 10.1038/s41586-018-0686-x). The insertion or deletion of bases in coding region will cause frameshift mutations, leading to loss of gene function. Therefore, the main purpose of the above gene editing tools is still to perform gene knockout.
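The frameshift consequence described above can be illustrated with a minimal Python sketch. The helper names (`apply_indel`, `codons`, `shifts_frame`) and the toy coding sequence are hypothetical and are not taken from the Sequence Listing; the sketch assumes only that an indel whose net length is not a multiple of three shifts the reading frame.

```python
def apply_indel(seq, pos, inserted="", deleted=0):
    """Delete `deleted` bases starting at 0-based index `pos`, then insert `inserted` there."""
    return seq[:pos] + inserted + seq[pos + deleted:]


def codons(seq):
    """Split a coding sequence into complete codons, ignoring trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def shifts_frame(original, edited):
    """True if the net length change is not a multiple of three."""
    return (len(edited) - len(original)) % 3 != 0


# Hypothetical toy coding sequence (Met-Ala-Ala-Lys-stop), for illustration only.
toy_cds = "ATGGCTGCAAAGTGA"
plus_one = apply_indel(toy_cds, 6, inserted="T")   # single-base insertion at the break
minus_three = apply_indel(toy_cds, 6, deleted=3)   # in-frame deletion of one codon

print(codons(toy_cds), shifts_frame(toy_cds, toy_cds))
print(codons(plus_one), shifts_frame(toy_cds, plus_one))
print(codons(minus_three), shifts_frame(toy_cds, minus_three))
```

In this toy case the single-base insertion changes every codon downstream of the break, whereas the three-base deletion removes one codon and leaves the downstream frame intact.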
It has long been believed that a sequence-specific nuclease alone cannot produce base-substitution mutations. To this end, the prior art proposes three solutions: 1) adding an exogenous DNA fragment as a repair template to initiate a homologous recombination repair pathway; 2) fusing a deaminase with Cas9, which led successively to single-base editing tools for C to T and A to G; 3) fusing a reverse transcriptase with Cas9 and using a pegRNA to guide the synthesis and substitution of a short DNA strand. However, the editing efficiency of these three solutions is significantly lower than the efficiency of gene knockout, and the simultaneously introduced exogenous DNA fragment and reverse transcriptase may readily cause concerns about biological safety. The off-target effect of single-base editing also restricts its potential application in cell therapy. Especially for long-term plant breeding projects, how to improve base substitution efficiency at the target site while reducing the regulatory authorities' concerns about biosafety is a problem that needs to be resolved in the application of gene editing technology.
In summary, there are urgent technical needs in the fields of cell therapy and biological breeding for site-specific base substitution by using only targeted knockout of sequence-specific nuclease without introducing a foreign DNA fragment, especially through the non-transgenic transient editing system to efficiently complete site-specific base substitution editing.
The invention provides a method for generating a site-specific mutation in an organism only by generating double-strand breaks on a genome and without providing an artificial DNA template, and use of the method.
The technical solutions adopted by the present invention are as follows:
A method for generating a new mutation in an organism, which comprises the following steps: sequentially generating two or more DNA breaks at a specific site in a genome of the organism and spontaneously repairing them respectively, wherein a later DNA break is generated based on a new sequence generated from a previous DNA break repair.
In a specific embodiment, the “DNA break” is achieved by delivering a nuclease with targeting property into a cell of an organism to contact with a specific site of genomic DNA.
In a specific embodiment, the “nuclease with targeting property” is a ZFN, TALEN or CRISPR/Cas system.
In a specific embodiment, “sequentially generating two or more DNA breaks at a specific site” means that a new ZFN or TALEN protein is designed, based on the new sequence generated by the repair of a previous DNA break caused by ZFN or TALEN editing, to cut the site again.
In another specific embodiment, “sequentially generating two or more DNA breaks at a specific site” means that a new target RNA is designed, based on the new sequence generated by the repair of a previous DNA break caused by a CRISPR/Cas system, to cut the site again. For example, a second cut is made at the site by designing a new target RNA on the basis of the new sequence generated by the first break repair event of Cas9 editing. Similarly, a third cut is made at the site by designing a new target RNA on the basis of the new sequence generated by the second break repair event, and so on, as shown in
In a specific embodiment, the “two or more DNA breaks” are generated by sequentially delivering different targeted nucleases into recipient cells of different generations, wherein a mutant cell that has completed the previous editing is used as a recipient to receive the delivery of the targeted nuclease for the later editing, thereby performing second editing to generate site-specific mutation. This method is preferably used for ZFN and TALEN editing systems.
In another specific embodiment, the “two or more DNA breaks” are generated by delivering different targeted nucleases for different targets into a same recipient cell. This method is preferably used for CRISPR/Cas editing system.
In a specific embodiment, the “two or more DNA breaks” are generated when RNP complexes formed by a same CRISPR/Cas nuclease respectively with different gRNAs or sgRNAs sequentially cut corresponding target sequences.
In another specific embodiment, the “two or more DNA breaks” are generated when RNP complexes formed by each of two or more CRISPR/Cas nucleases that recognize different PAM sequences with respective gRNA or sgRNA, sequentially cut corresponding target sequences. For example, the PAM sequence recognized by Cas9 from Streptococcus pyogenes is “NGG” or “NAG” (Jinek et al., “A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity”, Science 2012, 337:816-821), the PAM sequence recognized by Cas9 of Staphylococcus aureus is “NNGRRT” or “NNGRR(N)”, the PAM sequence recognized by Neisseria meningitidis Cas9 is NNNNGATT, and the PAM sequence recognized by Streptococcus thermophilus Cas9 is NNAGAAW. In this way, the editable window of a DNA molecule is larger.
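As a minimal sketch of why nucleases with different PAM requirements enlarge the editable window, the following Python code scans one strand of a sequence for the PAM motifs listed above, expanding IUPAC degenerate codes into regular-expression character classes. The dictionary keys, function names and the example sequence are hypothetical labels chosen for illustration; only the PAM strings come from the text, and the reverse strand is not scanned.

```python
import re

# IUPAC degenerate nucleotide codes expanded to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

# PAM motifs as given in the text; the keys are illustrative labels only.
PAMS = {
    "S. pyogenes Cas9": "NGG",
    "S. aureus Cas9": "NNGRRT",
    "N. meningitidis Cas9": "NNNNGATT",
    "S. thermophilus Cas9": "NNAGAAW",
}


def pam_to_regex(pam):
    """Translate a degenerate PAM string into a regular-expression fragment."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_pam_sites(seq, pam):
    """Return 0-based start indices of all (possibly overlapping) PAM matches
    on the given strand only."""
    # A zero-width lookahead lets overlapping matches be reported.
    pattern = re.compile("(?=" + pam_to_regex(pam) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


# Hypothetical example sequence, for illustration only.
example = "ACGTTGGAGAATTCCGG"
for name, pam in PAMS.items():
    print(name, pam, find_pam_sites(example, pam))
```

Combining the site lists of several nucleases gives more candidate cut positions around a target codon than any single PAM provides.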
In a specific embodiment, the targeted nuclease is any CRISPR/Cas nuclease capable of achieving genome editing.
In a specific embodiment, the targeted nuclease is in a form of DNA.
In another specific embodiment, the targeted nuclease is in a form of mRNA or protein instead of DNA. The protein form is preferred.
In a specific embodiment, the method for delivering targeted nucleases into a cell is selected from, but not limited to: 1) a PEG-mediated cell transfection method; 2) a liposome-mediated cell transfection method; 3) an electroporation transformation method; 4) a microinjection; 5) a gene gun bombardment; 6) an Agrobacterium-mediated transformation method.
In the method, a new target is designed based on the new sequence generated by the previous DNA break repair, so that mutations can be formed successively, multiple times, at a specific site in the genome. This exponentially enriches the types of repair events after DNA breaks and generates new types of base substitution, deletion and insertion mutations that cannot be obtained by a single round of gene editing, making the method suitable as a tool for creating new mutations. The method can be briefly described as programmed sequential cutting/editing or successive cutting/editing.
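The enrichment of repair outcomes by successive cutting can be illustrated with a deliberately simplified Python sketch. It assumes, purely for illustration, that every break is repaired either by inserting one base at the break or by deleting the base immediately 5' of the break, that every product is cut again, and that the cut stays at a fixed distance from the 3' end of the sequence (as it would relative to a 3' PAM). The function names and the example sequence are hypothetical; real repair spectra are broader and sequence dependent.

```python
BASES = "ACGT"


def repair_products(seq, cut):
    """Toy repair model for a break between seq[cut-1] and seq[cut]:
    insert any one base at the break, or delete the base 5' of the break."""
    products = {seq[:cut] + base + seq[cut:] for base in BASES}
    products.add(seq[:cut - 1] + seq[cut:])
    return products


def sequential_products(seq, cut, rounds):
    """Distinct sequences after `rounds` successive cut-and-repair cycles,
    re-cutting every product at the same distance from the 3' end."""
    distance_from_3prime = len(seq) - cut
    current = {seq}
    for _ in range(rounds):
        next_round = set()
        for product in current:
            new_cut = len(product) - distance_from_3prime
            if new_cut >= 1:
                next_round |= repair_products(product, new_cut)
        current = next_round
    return current


# Hypothetical example sequence, for illustration only.
locus = "GATTACAGGCTT"
one_round = sequential_products(locus, 6, 1)
two_rounds = sequential_products(locus, 6, 2)
substitutions = sorted(
    s for s in two_rounds if len(s) == len(locus) and s != locus
)
print(len(one_round), len(two_rounds), substitutions)
```

Under this toy model one round yields 5 distinct products, none of which is a base substitution, while two rounds yield 21, including three single-base substitutions that arise when a one-base deletion is followed by a one-base insertion.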
In a specific embodiment, a new target is designed based on a new specific sequence predicted to be generated from previous break repair at a specific site of organism genome, sequential editing is then performed, and thus a final possible mutation at the site can be designed in advance to achieve an expected editing.
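Designing the next target from a predicted repair product can be sketched as follows. This is a hedged illustration, not the procedure of the invention: it assumes a Cas9-like nuclease with a 3' PAM, a 20-nt protospacer, a blunt cut 3 bp 5' of the PAM, and two predicted repair outcomes (a one-base insertion that duplicates the base 5' of the cut, or a one-base deletion of that base). All function names, constants and the example locus are hypothetical.

```python
CUT_OFFSET = 3    # assumed: blunt cut 3 bp 5' of the PAM
SPACER_LEN = 20   # assumed protospacer length


def spacer_for_pam(seq, pam_start, length=SPACER_LEN):
    """Return the protospacer immediately 5' of the PAM starting at pam_start."""
    if pam_start < length:
        raise ValueError("not enough sequence 5' of the PAM")
    return seq[pam_start - length:pam_start]


def predicted_repair(seq, pam_start, event):
    """Apply an assumed repair outcome at the cut site.
    '+1' duplicates the base 5' of the cut; '-1' deletes that base."""
    cut = pam_start - CUT_OFFSET
    if event == "+1":
        return seq[:cut] + seq[cut - 1] + seq[cut:]
    if event == "-1":
        return seq[:cut - 1] + seq[cut:]
    raise ValueError("event must be '+1' or '-1'")


def design_next_spacer(seq, pam_start, event):
    """Return (edited sequence, shifted PAM index, spacer for the next cut)."""
    edited = predicted_repair(seq, pam_start, event)
    new_pam_start = pam_start + len(edited) - len(seq)
    return edited, new_pam_start, spacer_for_pam(edited, new_pam_start)


# Hypothetical locus: 2 flanking bases, a 20-nt protospacer, an AGG PAM, 3 flanking bases.
locus = "TTGACCTGAATGCTACGTTCAGAGGCAT"
first_spacer = spacer_for_pam(locus, 22)
edited, new_pam, second_spacer = design_next_spacer(locus, 22, "+1")
print(first_spacer)
print(second_spacer)
```

Because the second spacer spans the edited junction, it matches the predicted repair product rather than the original sequence, which is what allows the second cut to be programmed in advance.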
In another specific embodiment, a new target is designed based on a new sequence predicted to be generated from previous break repair at a specific site of organism genome, sequential editing is then performed, and in addition to an expected editing event, a variety of different mutations can be generated eventually at the site, so that the method can be used as a tool to create various different mutations.
In another aspect, the present invention further provides a method for generating a new mutation in an organism, which comprises the following step: sequentially generating two or more DNA breaks at a specific site in a gene at the level of genome or chromosome of the organism, thereby achieving a precise base substitution, deletion or insertion.
In a specific embodiment, “sequentially generating two or more DNA breaks at a specific site” means that a new target RNA is designed based on the new sequence generated by a previous break repair event, and the same site is cut again.
In a specific embodiment, the “DNA break” is achieved by a nuclease with targeting property.
The present invention further provides a new mutation obtained by the aforementioned method.
The present invention further provides a protein or biologically active fragment thereof that has the aforementioned new mutation.
The present invention further provides a nucleic acid, which comprises a nucleic acid sequence or complementary sequence thereof that encodes the protein or biologically active fragment thereof.
The present invention further provides a nucleic acid, which comprises:
In a specific embodiment, the nucleic acid further comprises (b) a nucleotide sequence encoding a Cas polypeptide.
In a specific embodiment, the target RNA is a sgRNA or gRNA.
In a specific embodiment, the Cas polypeptide and the target RNA are present in an in vitro cell or an ex vivo cell.
The present invention further provides a recombinant expression vector, which comprises the aforementioned nucleic acid and a promoter operably linked thereto.
The present invention further provides an expression cassette, which comprises the aforementioned nucleic acid.
The present invention further provides a host cell, which comprises the aforementioned expression cassette.
The present invention further provides an organism that is regenerated by using the aforementioned host cell.
The present invention further provides a method for cleaving a target DNA, which comprises contacting the target DNA with a complex, wherein the complex comprises:
In a specific embodiment, the target RNA is a sgRNA or gRNA.
In a specific embodiment, the target DNA is present in a bacterial cell, eukaryotic cell, plant cell or animal cell.
In a specific embodiment, the target DNA is a chromosomal DNA.
In a specific embodiment, the Cas polypeptide and the target RNA are present in an in vitro cell or an ex vivo cell.
In a specific embodiment, the contacting comprises introducing the following into a cell: (a) the Cas polypeptide or a polynucleotide encoding the Cas polypeptide, and (b) the target RNA or a DNA polynucleotide encoding the target RNA.
The present invention further provides a composition, which comprises:
In a specific embodiment, the target RNA is a sgRNA or gRNA.
In a specific embodiment, the Cas polypeptide and the target RNA are present in an in vitro cell or an ex vivo cell.
The invention further provides a use of the composition in manufacture of a medicament for treatment of a disease.
The diseases that can be treated with the composition of the present invention include, but are not limited to, diseases caused by a single gene mutation, such as hereditary tyrosinemia type 1, phenylketonuria, progeria and sickle cell disease. The Cas protein and a crRNA or sgRNA composition expected to repair the pathogenic mutation site are delivered into a cell to induce spontaneous cellular repair, so that a normal functional protein is produced and a therapeutic effect is obtained.
The present invention further provides a kit, which comprises:
In a specific embodiment, the target RNA is a sgRNA or gRNA.
In a specific embodiment, the target RNAs in (b) are in a same or separate containers.
The present invention further provides a method for screening editing events independent of exogenous transgenic markers, comprising the following steps:
In a specific embodiment, the “first target gene” is a gene locus encoding at least one phenotypic selectable trait, wherein the at least one phenotypic selectable trait is a resistance/tolerance trait or a growth advantage trait.
In a specific embodiment, the “specific site of a first target gene” refers to a site at which a certain type of mutation is generated after sequential cuttings and repairs, which is capable of conferring the recipient cell with a resistance to a certain selection pressure to produce at least one phenotypic selectable resistance/tolerance trait or growth advantage trait.
In a specific embodiment, the “certain type of mutation” comprises substitution of single base, substitution of a plurality of bases, or insertion or deletion of an unspecified number of bases.
In a specific embodiment, the “certain selection pressure” may be an environmental pressure or a pressure resulting from an added compound; for example, the environmental pressure is high temperature, low temperature, hypoxia or the like; the pressure resulting from an added compound may be a pressure resulting from a salt ion concentration, antibiotic, cytotoxin, herbicide, etc.
In a specific embodiment, the “DNA break” is achieved by delivering a nuclease with a targeting property into a cell of an organism to contact with a specific site of genomic DNA.
In a specific embodiment, the “nuclease with a targeting property” is any CRISPR/Cas nuclease capable of performing genome editing.
In a specific embodiment, the feature “two or more DNA breaks are sequentially generated at a specific site” means that a new target RNA is designed, based on the new sequence formed by the repair of a previous DNA break generated by a CRISPR/Cas system, to cut the site again.
In a specific embodiment, the “two or more DNA breaks” are generated when RNP complexes formed by a same CRISPR/Cas nuclease respectively with different gRNAs or sgRNAs sequentially cut corresponding target sequences.
In another specific embodiment, the “two or more DNA breaks” are generated when RNP complexes respectively formed by each of two or more CRISPR/Cas nucleases that recognize different PAM sequences with respective gRNA or sgRNA, sequentially cut corresponding target sequences. In this way, the editable window of a DNA molecule is larger.
In a specific embodiment, the “second target gene” refers to another gene that is different in coding from the first target gene.
In a specific embodiment, the “targeted nuclease for at least one second target gene” and the CRISPR/Cas nuclease used for generating DNA break at a specific site of the first target gene are the same.
In another specific embodiment, the “targeted nuclease for at least one second target gene” and the CRISPR/Cas nuclease used for generating DNA break at a specific site of the first target gene are different. In this way, there are more selectable editing sites on the second target gene.
In a specific embodiment, the targeted nuclease is in a form of DNA.
In another specific embodiment, the targeted nuclease is in a form of mRNA or protein instead of DNA. The protein form is preferred.
In a specific embodiment, the method for delivering targeted nuclease into cell is selected from, but not limited to: 1) a PEG-mediated cell transfection method; 2) a liposome-mediated cell transfection method; 3) an electroporation transformation method; 4) a microinjection; 5) a gene gun bombardment; or 6) an Agrobacterium-mediated transformation method.
The present invention further provides a method for non-transgenic transient editing of an organism genome, comprising the following steps:
In a specific embodiment, the “first target gene” is a gene locus encoding at least one phenotypic selectable trait, wherein the at least one phenotypic selectable trait is a resistance/tolerance trait or a growth advantage trait.
In a specific embodiment, the “specific site of the first target gene” refers to a site at which a certain type of mutation is generated after sequential cuttings and repairs, which is capable of conferring the recipient cell with a resistance to a certain selection pressure to produce at least one phenotypic selectable resistance/tolerance trait or growth advantage trait.
In a specific embodiment, the “certain type of mutation” comprises substitution of single base, substitution of a plurality of bases, or insertion or deletion of an unspecified number of bases.
In a specific embodiment, the “certain selection pressure” may be an environmental pressure or a pressure resulting from an added compound; for example, the environmental pressure is high temperature, low temperature, hypoxia or the like; the pressure resulting from an added compound may be a pressure resulting from a salt ion concentration, antibiotic, cytotoxin, herbicide or the like.
In a specific embodiment, the CRISPR/Cas protein is any CRISPR/Cas nuclease capable of performing genome editing.
In a specific embodiment, the feature “to sequentially generate two or more DNA breaks at the specific site” means that a new target RNA is designed, based on the new sequence formed by the repair of a previous DNA break generated by a CRISPR/Cas system, to cut the site again.
In a specific embodiment, the “two or more DNA breaks” are generated when RNP complexes formed by a same CRISPR/Cas nuclease respectively with different gRNAs or sgRNAs sequentially cut corresponding target sequences.
In another specific embodiment, the “two or more DNA breaks” are generated when RNP complexes respectively formed by each of two or more CRISPR/Cas nucleases that recognize different PAM sequences with respective gRNA or sgRNA, sequentially cut corresponding target sequences. In this way, the editable window of a DNA molecule is larger.
In a specific embodiment, the “second, third or more target genes” refer to other genes that are different in coding from the first target gene.
In a specific embodiment, the “at least one of artificially synthesized crRNA and tracrRNA fragments or artificially synthesized sgRNA fragments targeting a second, third or more target genes” shares the same Cas protein with the crRNA or sgRNA targeting the first target gene.
In another specific embodiment, the “at least one of artificially synthesized crRNA and tracrRNA fragments or artificially synthesized sgRNA fragments targeting a second, third or more target genes” and the crRNA or sgRNA targeting the first target gene use Cas proteins that recognize different PAM sequences. In this way, there are more selectable editing sites on the second target gene.
In a specific embodiment, the method for delivering the RNP complex into cells is selected from, but not limited to: 1) a PEG-mediated cell transfection method; 2) a liposome-mediated cell transfection method; 3) an electroporation transformation method; 4) a microinjection; 5) a gene gun bombardment; and so on.
The present invention further provides a method for non-transgenic transient editing of a plant genome, comprising the following steps:
In a specific embodiment, the “first target gene” is a gene locus encoding at least one phenotypic selectable trait, wherein the at least one phenotypic selectable trait is a resistance/tolerance trait or a growth advantage trait.
In a specific embodiment, the “specific site of the first target gene” refers to a site at which a certain type of mutation is generated after sequential cuttings and repairs at the site, which can confer the recipient cell with a resistance to a certain selection pressure to produce at least one phenotypic selectable resistance/tolerance trait or growth advantage trait.
In a specific embodiment, the “certain type of mutation” comprises substitution of single base, substitution of a plurality of bases, or insertion or deletion of an unspecified number of bases.
In a specific embodiment, the “certain selection pressure” may be an environmental pressure or a pressure resulting from an added compound; for example, the environmental pressure is preferably high temperature, low temperature, hypoxia or the like; the pressure resulting from an added compound may be a pressure resulting from a salt ion concentration, antibiotic, cytotoxin, herbicide, etc.
In a specific embodiment, the “recipient plant cell or tissue” is any cell or tissue that can serve as a recipient for transient expression and can be regenerated into a complete plant through tissue culture. Specifically, the cell is a protoplast cell or a suspension cell; the tissue is preferably a callus, immature embryo, mature embryo, leaf, shoot tip, young spike, hypocotyl, etc.
In a specific embodiment, the CRISPR/Cas protein is any CRISPR/Cas nuclease capable of performing genome editing.
In a specific embodiment, the feature “to sequentially generate two or more DNA breaks at the specific site” means that a new target RNA is designed, based on the new sequence formed by the repair of a previous DNA break generated by a CRISPR/Cas system, to cut the site again.
In a specific embodiment, the “two or more DNA breaks” are generated when RNP complexes formed by a same CRISPR/Cas nuclease respectively with different gRNAs or sgRNAs sequentially cut corresponding target sequences.
In another specific embodiment, the “two or more DNA breaks” are generated when RNP complexes respectively formed by each of two or more CRISPR/Cas nucleases that recognize different PAM sequences with respective gRNA or sgRNA, sequentially cut corresponding target sequences. In this way, the editable window of a DNA molecule is larger.
In a specific embodiment, the “second, third or more target genes” refer to other genes that are different in coding from the first target gene.
In a specific embodiment, the “at least one of artificially synthesized crRNA and tracrRNA fragments or artificially synthesized sgRNA fragments targeting a second, third or more target genes” shares the same Cas protein with the crRNA or sgRNA targeting the first target gene.
In another specific embodiment, the “at least one of artificially synthesized crRNA and tracrRNA fragments or artificially synthesized sgRNA fragments targeting a second, third or more target genes” and the crRNA or sgRNA targeting the first target gene use Cas proteins that recognize different PAM sequences. In this way, there are more selectable editing sites on the second target gene.
In a specific embodiment, the method for delivering the RNP complex into plant cells is selected from, but not limited to: 1) a PEG-mediated protoplast transformation method; 2) a microinjection; 3) a gene gun bombardment; 4) a silicon carbide fiber-mediated method; 5) a vacuum infiltration method, or any other transient introduction method. The gene gun bombardment is preferred.
In a specific embodiment, the “first target gene” is at least one endogenous gene that encodes at least one phenotypic selectable trait selected from herbicide resistance/tolerance, wherein the herbicide resistance/tolerance is selected from the group consisting of resistance/tolerance to EPSPS inhibitor (including glyphosate); resistance/tolerance to glutamine synthetase inhibitor (including glufosinate); resistance/tolerance to ALS or AHAS inhibitor (including imidazolinone or sulfonylurea); resistance/tolerance to ACCase inhibitor (including aryloxyphenoxypropionic acid (FOP)); resistance/tolerance to carotenoid biosynthesis inhibitor, including carotenoid biosynthesis inhibitors of phytoene desaturase (PDS) step, 4-hydroxyphenylpyruvate dioxygenase (HPPD) inhibitors or other carotenoid biosynthesis target inhibitors; resistance/tolerance to cellulose inhibitor; resistance/tolerance to lipid synthesis inhibitor; resistance/tolerance to long-chain fatty acid inhibitor; resistance/tolerance to microtubule assembly inhibitor; resistance/tolerance to photosystem I electron shunting agent; resistance/tolerance to photosystem II inhibitor (including carbamates, triazines and triazinones); resistance/tolerance to PPO inhibitor; and resistance/tolerance to synthetic auxin (including dicamba and 2,4-D (i.e., 2,4-dichlorophenoxyacetic acid)). The first target gene is selected from PsbA, ALS, EPSPS, ACCase, PPO, HPPD, PDS, GS, DOXPS, TIR1 and AFB5; certain types of mutations generated after sequential cuttings and repairs at specific sites of these herbicide target genes may confer on the recipient plant cells resistance/tolerance to the corresponding herbicides.
In a specific embodiment, the “first target gene” is ALS, and the “specific site of gene” refers to site A122, P197, R198, D204, A205, D376, R377, W574, S653 or G654 in an Arabidopsis AtALS protein amino acid sequence (e.g., as shown in SEQ ID NO:1), and amino acid sites in an ALS protein of another plant which correspond to the above-mentioned amino acid sites by using the AtALS amino acid sequence as reference standard. The crRNA or sgRNA targets a target sequence comprising a sequence encoding an AtALS protein amino acid sequence site selected from the group consisting of A122, P197, R198, D204, A205, D376, R377, W574, S653, G654 or any combination thereof, and a target sequence comprising a sequence encoding an amino acid site in an ALS protein of another plant which corresponds to the above-mentioned amino acid sites, and any combination thereof, by using the AtALS amino acid sequence as reference standard. The ALS W574 site is preferred. The selection pressure is preferably a treatment with pyroxsulam or nicosulfuron.
In a specific embodiment, the “first target gene” is ACCase, and the “specific site of gene” refers to site I1781, E1874, N1878, W1999, W2027, I2041, D2078, C2088 or G2096 in an Alopecurus myosuroides AmACCase protein amino acid sequence (e.g., as shown in SEQ ID NO: 3, and the gene sequence is as shown in SEQ ID NO: 4), and amino acid sites in an ACCase protein of another monocotyledonous plant which correspond to the above-mentioned amino acid sites by using the AmACCase amino acid sequence as reference standard. The crRNA or sgRNA targets a target sequence comprising a sequence encoding an AmACCase amino acid sequence site selected from the group consisting of I1781, E1874, N1878, W1999, W2027, I2041, D2078, C2088, G2096 or any combination thereof, and a target sequence comprising a sequence encoding an amino acid site in an ACCase protein of another monocotyledonous plant which corresponds to the above-mentioned amino acid site, and any combination thereof, by using the AmACCase amino acid sequence as reference standard. ACCase W2027 site is preferred. The selection pressure is preferably a treatment with quizalofop-p-ethyl.
In a specific embodiment, the “first target gene” is HPPD, and the “specific site of gene” refers to site H141, L276, P277, N338, G342, R346, D370, P386, K418 or G419 in an Oryza sativa OsHPPD protein amino acid sequence (as shown in SEQ ID NO: 5, and the genome sequence is as shown in SEQ ID NO: 6), and amino acid sites in an HPPD protein of another plant which correspond to the above-mentioned amino acid sites by using the OsHPPD amino acid sequence as reference standard. The crRNA or sgRNA targets a target sequence comprising a sequence encoding an OsHPPD amino acid sequence site selected from the group consisting of H141, L276, P277, N338, G342, R346, D370, P386, K418, G419 or any combination thereof, and a target sequence comprising a sequence encoding an amino acid site in an HPPD protein of another plant which corresponds to the above-mentioned amino acid site, and any combination thereof, by using the OsHPPD amino acid sequence as reference standard. The selection pressure is preferably a treatment with biscarfentrazone.
In a specific embodiment, the “first target gene” is PPO, and the “specific site of gene” refers to site S128, V217, S223, V364, K373, L423, Y425 or W470 in an Oryza sativa OsPPO1 protein amino acid sequence (as shown in SEQ ID NO: 7, and the genome sequence is as shown in SEQ ID NO: 8), and amino acid sites in a PPO protein of another plant which correspond to the above-mentioned amino acid sites by using the amino acid sequence of OsPPO1 as reference standard. The crRNA or sgRNA targets a target sequence comprising a sequence encoding an OsPPO1 amino acid sequence site selected from the group consisting of S128, V217, S223, V364, K373, L423, Y425, W470 or any combination thereof, and a target sequence comprising a sequence encoding an amino acid site in a PPO protein of another plant which corresponds to the above-mentioned amino acid site, and any combination thereof, by using the OsPPO1 amino acid sequence as reference standard. The selection pressure is preferably a treatment with saflufenacil.
In a specific embodiment, the “first target gene” is TIR1, and the “specific site of gene” refers to site F93, F357, C413 or S448 in an Oryza sativa OsTIR1 protein amino acid sequence (as shown in SEQ ID NO: 9, and the genome sequence is as shown in SEQ ID NO: 10), and amino acid sites in a TIR1 protein of another plant which correspond to the above-mentioned amino acid sites by using the OsTIR1 amino acid sequence as reference standard. The crRNA or sgRNA targets a target sequence comprising a sequence encoding an OsTIR1 amino acid sequence site selected from the group consisting of F93, F357, C413, S448 or any combination thereof, and a target sequence comprising a sequence encoding an amino acid site in a TIR1 protein of another plant which corresponds to the above-mentioned amino acid site, and any combination thereof, by using the OsTIR1 amino acid sequence as reference standard. The selection pressure is preferably 2,4-D treatment.
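When a crRNA or sgRNA is designed against one of the amino acid sites named above, the site label first has to be converted into nucleotide coordinates. The following Python sketch shows that arithmetic under a stated assumption: the coordinates refer to an intron-free coding sequence that starts at the first base of the start codon and uses the same residue numbering as the reference protein. Genomic sequences containing introns would need an additional exon map. The function name `codon_span` is hypothetical.

```python
import re


def codon_span(site):
    """Convert a site label such as 'W574' into
    (residue letter, first CDS base, last CDS base), 1-based and inclusive.

    Assumes an intron-free coding sequence numbered from the A of the
    start codon."""
    match = re.fullmatch(r"([A-Z])(\d+)", site)
    if match is None:
        raise ValueError("site must look like 'W574'")
    residue, position = match.group(1), int(match.group(2))
    if position < 1:
        raise ValueError("residue positions are 1-based")
    return residue, 3 * (position - 1) + 1, 3 * position


# Site labels taken from the text; the coordinates are plain arithmetic.
for label in ("W574", "S653", "W2027"):
    print(label, codon_span(label))
```

For a plant other than the reference species, the corresponding residue would first be located by aligning its protein to the reference sequence, and the same arithmetic would then be applied to that plant's own residue number.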
The present invention further provides a non-transgenic transient editing system using the aforementioned method.
The present invention additionally provides a use of the aforementioned non-transgenic transient editing system as a selection marker.
The present invention additionally provides a use of the aforementioned non-transgenic transient editing system in treatment of a disease.
The present invention additionally provides a use of the aforementioned non-transgenic transient editing system in biological breeding.
The present invention additionally provides a genetically modified plant obtained by the aforementioned method, the genome of which contains an editing event of a first target gene, and the genetically modified plant is obtained in a non-transgenic manner.
The present invention additionally provides a genetically modified plant obtained by the aforementioned method, the genome of which contains an editing event of a first target gene, and further contains at least one second target gene editing event, and the genetically modified plant is obtained in a non-transgenic manner.
The present invention additionally provides a genetically modified plant obtained by the aforementioned method, the genome of which contains at least one second target gene editing event, and the genetically modified plant is obtained in a non-transgenic manner, wherein the first target gene editing event has been removed by genetic separation.
The present invention further provides a genome of the genetically modified plant obtained by the aforementioned method, the genome comprising: 1) an editing event of a first target gene; 2) an editing event of the first target gene and an editing event of at least one second target gene; or 3) at least one second target gene editing event, wherein the editing event of the first target gene has been removed by genetic separation; wherein the genetically modified plant is obtained in a non-transgenic manner.
Another aspect of the present invention provides a new plant gene mutation obtained by the aforementioned method.
The present invention also provides a new mutation generated in a plant, which comprises one or a combination of two or more of the following types:
In a specific embodiment, the aspartic acid at a site corresponding to Arabidopsis ALS376 is substituted by glutamic acid (D376E), the tryptophan at a site corresponding to Arabidopsis ALS574 is substituted by leucine or methionine (W574L or W574M), the serine at a site corresponding to Arabidopsis ALS653 is substituted by asparagine or arginine (S653N or S653R), or the glycine at a site corresponding to Arabidopsis ALS654 is substituted by aspartic acid (G654D), wherein the amino acid sites are numbered by reference to the corresponding amino acid sites in Arabidopsis thaliana; or, the tryptophan at a site corresponding to Alopecurus myosuroides ACCase2027 is substituted by leucine or cysteine (W2027L or W2027C), wherein the amino acid site is numbered by reference to the corresponding amino acid site in Alopecurus myosuroides.
In another specific embodiment, the mutation type is S653R/G654D, wherein the amino acid sites are numbered by reference to the corresponding amino acid sites in Arabidopsis thaliana.
In a specific embodiment, the aspartic acid at site 350 of Oryza sativa ALS is substituted by any other amino acid, the tryptophan at site 548 of Oryza sativa ALS is substituted by any other amino acid, or the tryptophan at site 561 of Solanum tuberosum L. ALS2 is substituted by any other amino acid; or, the tryptophan at site 2038 of Oryza sativa ACCase2 is substituted by any other amino acid.
In another specific embodiment, the aspartic acid at site 350 of Oryza sativa ALS is substituted by glutamic acid (D350E), the tryptophan at site 548 of Oryza sativa ALS is substituted by leucine or methionine (W548L or W548M), or the tryptophan at site 561 of Solanum tuberosum L. ALS2 is substituted by leucine or methionine (W561L or W561M); or, the tryptophan at site 2038 of Oryza sativa ACCase2 is substituted by leucine or cysteine (W2038L or W2038C), wherein the amino acid sequence of the Oryza sativa ALS protein is shown in SEQ ID NO: 11, the amino acid sequence of the Solanum tuberosum L. StALS2 protein is shown in SEQ ID NO: 19, and the amino acid sequence of the Oryza sativa ACCase2 protein is shown in SEQ ID NO: 13.
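The correspondence between a reference-numbered site (for example, a site numbered against Arabidopsis thaliana) and the site in another plant's protein can be illustrated with a minimal sketch. It assumes that a gapped pairwise alignment of the two proteins has already been produced by some alignment tool; the function name `map_position` and the toy sequences below are hypothetical and are not taken from the Sequence Listing.

```python
from typing import Optional


def map_position(ref_aligned: str, target_aligned: str, ref_pos: int) -> Optional[int]:
    """Map a 1-based residue position in the reference protein to the
    1-based position of the aligned residue in the target protein.

    Both inputs are rows of the same pairwise alignment, with '-' as the
    gap character. Returns None if the reference residue is aligned to a gap.
    """
    if len(ref_aligned) != len(target_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_index = 0
    target_index = 0
    for ref_char, target_char in zip(ref_aligned, target_aligned):
        if ref_char != "-":
            ref_index += 1
        if target_char != "-":
            target_index += 1
        if ref_char != "-" and ref_index == ref_pos:
            return target_index if target_char != "-" else None
    raise ValueError("ref_pos lies beyond the end of the reference sequence")


# Toy alignment (hypothetical sequences): the target lacks two residues
# near the N-terminus, so reference site D5 corresponds to target site D3.
reference_row = "MAAKDW"
target_row = "M--KDW"
corresponding_site = map_position(reference_row, target_row, 5)
```

In this toy case the numbering shifts by the two residues missing from the target, which is the same kind of offset seen between a reference-numbered site and the numbering in another species' sequence.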
The present invention additionally provides a protein or biologically active fragment thereof that has the aforementioned new mutation.
The present invention also provides a nucleic acid, which comprises a nucleic acid sequence or complementary sequence thereof that encodes the protein or biologically active fragment thereof.
The present invention additionally provides a recombinant expression vector, which comprises the nucleic acid and a promoter operably linked thereto.
The present invention further provides an expression cassette, which comprises the nucleic acid.
The present invention further provides a plant cell, which comprises the expression cassette.
The present invention further provides a plant regenerated by using the plant cell.
Another aspect of the present invention provides a method for producing a plant with improved resistance or tolerance to herbicides, which comprises regenerating the plant cell into a plant.
Another aspect of the present invention provides a method for controlling weeds in a plant cultivation site, wherein the plant includes the aforementioned plant or a plant produced by the aforementioned method, wherein the method comprises applying to the cultivation site one or more herbicides in an effective amount to control the weeds.
Another aspect of the present invention also provides a use of the new mutation, the protein or biologically active fragment thereof, the nucleic acid, the recombinant expression vector or the expression cassette in improving resistance or tolerance of a plant cell, a plant tissue, a plant part or a plant to herbicides.
The present invention has the following excellent technical effects:
Based on the sequences generated by new repair events arising from sequential editing, new targets can be designed, so that mutations can be formed sequentially, multiple times, at a specific site in the genome. This exponentially enriches the types of repair events after DNA breaks and realizes new types of base substitution, deletion and insertion mutations that cannot be obtained by a single round of gene editing. That is, the programmed sequential cutting/editing scheme adopted by the present invention, which uses the sequence generated by a previous gene editing repair as the target of the subsequent gene editing, can endow CRISPR/Cas with new functions of single-base editing and site-precise deletion and insertion through simple knockout.
The invention enables gene editing events to be screened in the absence of exogenous markers, and thus enables non-transgenic gene editing with effective screening of editing events, which can greatly reduce the biological safety concerns of the method in cell therapy and biological breeding.
In particular, the plant non-transgenic transient editing method provided by the present invention involves only Cas protein and small, artificially synthesized gRNA or sgRNA fragments, with no exogenous DNA participating at any stage. It produces an endogenous resistance selection marker by editing the first target gene through continuous targeting, so that editing events can be screened effectively. Because no genetic modification operation is actually involved, the method is equivalent to chemical mutagenesis or radiation-induced breeding and does not require the separation and detection of exogenous transgenic components over multiple successive generations, thereby shortening the breeding cycle, ensuring biological safety, saving supervision and approval costs, and offering great application prospects in precise plant breeding.
In the present invention, unless otherwise specified, the scientific and technical terms used herein have the meanings commonly understood by those skilled in the art. In addition, protein and nucleic acid chemistry, molecular biology, cell and tissue culture, microbiology, immunology related terms and laboratory procedures used herein are all terms and routine procedures widely used in the corresponding fields. At the same time, in order to better understand the present invention, definitions and explanations of related terms are provided below.
The term “genome” as used herein refers to all complements of genetic material (genes and non-coding sequences) present in each cell or virus or organelle of an organism, and/or complete genome inherited from a parent as a unit (haploid).
The term “gene editing” refers to strategies and techniques for targeted specific modification of any genetic information or genome of living organisms. Therefore, the term includes editing of gene coding regions, but also includes editing of regions other than gene coding regions of the genome. It also includes editing or modifying other genetic information of nuclei (if present) and cells.
The term “CRISPR/Cas nuclease” may be a CRISPR-based nuclease or a nucleic acid sequence encoding the same, including but not limited to: 1) Cas9, including SpCas9, ScCas9, SaCas9, xCas9, VRER-Cas9, EQR-Cas9, SpG-Cas9, SpRY-Cas9, SpCas9-NG, NG-Cas9, NGA-Cas9 (VQR), etc.; 2) Cas12, including LbCpf1, FnCpf1, AsCpf1, MAD7, etc., or any variant or derivative of the aforementioned CRISPR-based nuclease; preferably, wherein the at least one CRISPR-based nuclease comprises a mutation compared to the corresponding wild-type sequence, so that the obtained CRISPR-based nuclease recognizes a different PAM sequence. As used herein, “CRISPR-based nuclease” is any nuclease that has been identified in a naturally occurring CRISPR system, which is subsequently isolated from its natural background, and has preferably been modified or combined into a recombinant construct of interest, suitable as a tool for targeted genome engineering. As long as the original wild-type CRISPR-based nuclease provides DNA recognition, i.e., binding properties, any CRISPR-based nuclease can be used and optionally reprogrammed or otherwise mutated so as to be suitable for various embodiments of the invention.
The term “CRISPR” refers to a sequence-specific genetic manipulation technique that relies on clustered regularly interspaced short palindromic repeats, which is different from RNA interference that regulates gene expression at the transcriptional level.
“Cas9 nuclease” and “Cas9” are used interchangeably herein, and refer to an RNA-guided nuclease comprising Cas9 protein or a fragment thereof (for example, a protein containing the active DNA cleavage domain of Cas9 and/or the gRNA binding domain of Cas9). Cas9 is a component of the CRISPR/Cas (clustered regularly interspaced short palindromic repeats and associated systems) genome editing system. It can target and cut DNA target sequences under the guidance of guide RNA to form DNA double-strand breaks (DSBs).
“Cas protein” or “Cas polypeptide” refers to a polypeptide encoded by Cas (CRISPR-associated) gene. Cas protein includes Cas endonuclease. Cas protein can be a bacterial or archaeal protein. For example, the types I to III CRISPR Cas proteins herein generally originate from prokaryotes; the type I and type III Cas proteins can be derived from bacteria or archaea species, and the type II Cas protein (i.e., Cas9) can be derived from bacterial species. “Cas proteins” include Cas9 protein, Cpf1 protein, C2c1 protein, C2c2 protein, C2c3 protein, Cas3, Cas3-HD, Cas5, Cas7, Cas8, Cas10, Cas12a, Cas12b, or a combination or complex thereof.
“Cas9 variant” or “Cas9 endonuclease variant” refers to a variant of the parent Cas9 endonuclease, wherein when associated with crRNA and tracrRNA or with sgRNA, the Cas9 endonuclease variant retains the abilities of recognizing, binding to all or part of a DNA target sequence and optionally unwinding all or part of a DNA target sequence, nicking all or part of a DNA target sequence, or cutting all or part of a DNA target sequence. The Cas9 endonuclease variants include the Cas9 endonuclease variants described herein, wherein the Cas9 endonuclease variants differ from the parent Cas9 endonuclease in the following manner: the Cas9 endonuclease variants (when complexed with gRNA to form a polynucleotide-guided endonuclease complex capable of modifying a target site) have at least one improved property, such as, but not limited to, increased transformation efficiency, increased DNA editing efficiency, decreased off-target cutting, or any combination thereof, as compared to the parent Cas9 endonuclease (complexed with the same gRNA to form a polynucleotide-guided endonuclease complex capable of modifying the same target site).
The Cas9 endonuclease variants described herein include variants that can bind to and nick double-stranded DNA target sites when associated with crRNA and tracrRNA or with sgRNA, while the parent Cas endonuclease can bind to the target site and result in double strand break (cleavage) when associated with crRNA and tracrRNA or with sgRNA.
“Guide RNA” and “gRNA” are used interchangeably herein, and refer to a guide RNA sequence used to target a specific gene for correction using CRISPR technology, which usually consists of crRNA and tracrRNA molecules that are partially complementary to form a complex, wherein crRNA contains a sequence that has sufficient complementarity with the target sequence so as to hybridize with the target sequence and direct the CRISPR complex (Cas9+crRNA+tracrRNA) to specifically bind to the target sequence. However, it is known in the art that a single guide RNA (sgRNA) can be designed, which contains both the properties of crRNA and tracrRNA.
The terms “single guide RNA” and “sgRNA” are used interchangeably herein, and refer to the synthetic fusion of two RNA molecules, which comprises a fusion of a crRNA (CRISPR RNA) of a variable targeting domain (linked to a tracr pairing sequence hybridized to tracrRNA) and a tracrRNA (trans-activating CRISPR RNA). The sgRNA may comprise crRNA or crRNA fragments and tracrRNA or tracrRNA fragments of the type II CRISPR/Cas system that can form a complex with the type II Cas endonuclease, wherein the guide RNA/Cas endonuclease complex can guide the Cas endonuclease to a DNA target site so that the Cas endonuclease can recognize, optionally bind to the DNA target site, and optionally nick the DNA target site or cut (introduce a single-strand or double-strand break) the DNA target site.
In certain embodiments, the guide RNA(s) and Cas9 can be delivered to a cell as a ribonucleoprotein (RNP) complex. RNP is composed of purified Cas9 protein complexed with gRNA, and it is well known in the art that RNP can be effectively delivered to many types of cells, including but not limited to stem cells and immune cells (Addgene, Cambridge, Mass., Mirus Bio LLC, Madison, Wis.).
The protospacer adjacent motif (PAM) herein refers to a short nucleotide sequence adjacent to a (targeted) target sequence (protospacer) recognized by the gRNA/Cas endonuclease system. If the target DNA sequence is not adjacent to an appropriate PAM sequence, the Cas endonuclease may not be able to successfully recognize the target DNA sequence. The sequence and length of the PAM herein can differ depending on the Cas protein or Cas protein complex in use. The PAM sequence can be of any length, but is typically 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides in length.
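As an illustration of how a PAM constrains the choice of target sequence, the following minimal sketch scans the forward strand of a DNA string for candidate protospacers that are immediately followed by a PAM. It assumes a 3′ PAM of the form NGG and a 20-nucleotide protospacer, as commonly used with SpCas9; other Cas proteins use different PAM sequences, positions and lengths. The function names and the example sequence are hypothetical.

```python
import re
from typing import List, Tuple

# Regex fragments for a small subset of IUPAC nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]",
}


def pam_to_regex(pam: str) -> str:
    """Translate a PAM written with IUPAC codes (e.g. 'NGG') into a regex."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_protospacers(seq: str, pam: str = "NGG", length: int = 20) -> List[Tuple[int, str, str]]:
    """Return (0-based start, protospacer, PAM) for every forward-strand site
    where `length` nucleotides are directly followed by the PAM.

    A lookahead is used so that overlapping candidate sites are all reported.
    The reverse strand is not scanned in this sketch.
    """
    pattern = re.compile(f"(?=([ACGT]{{{length}}})({pam_to_regex(pam)}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq.upper())]


# Hypothetical 25-nt sequence containing exactly one NGG-adjacent site.
example = "A" * 20 + "TGG" + "CC"
sites = find_protospacers(example)
```

A target lacking an adjacent PAM simply yields no candidates, which mirrors the statement above that the Cas endonuclease may fail to recognize such a sequence.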
As used herein, the term “organism” or “living body” includes animals, plants, fungi, bacteria, and the like.
As used herein, the term “host cell” includes plant cells, animal cells, fungal cells, bacterial cells, and the like.
In the present invention, “animal” includes but is not limited to vertebrates, such as humans, non-human mammals, birds, fish, reptiles, amphibians, etc., as well as invertebrates, such as insects.
In the present invention, the “plant” should be understood to mean any differentiated multicellular organism capable of performing photosynthesis, in particular monocotyledonous or dicotyledonous plants, for example, (1) food crops: Oryza spp., like Oryza sativa, Oryza latifolia, Oryza sativa, Oryza glaberrima; Triticum spp., like Triticum aestivum, T. turgidum ssp. durum; Hordeum spp., like Hordeum vulgare, Hordeum arizonicum; Secale cereale; Avena spp., like Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida; Echinochloa spp., like Pennisetum glaucum, Sorghum, Sorghum bicolor, Sorghum vulgare, Triticale, Zea mays or Maize, Millet, Rice, Foxtail millet, Proso millet, Sorghum bicolor, Panicum, Fagopyrum spp., Panicum miliaceum, Setaria italica, Zizania palustris, Eragrostis tef, Panicum miliaceum, Eleusine coracana; (2) legume crops: Glycine spp. like Glycine max, Soja hispida, Soja max, Vicia spp., Vigna spp., Pisum spp., field bean, Lupinus spp., Vicia, Tamarindus indica, Lens culinaris, Lathyrus spp., Lablab, broad bean, mung bean, red bean, chickpea; (3) oil crops: Arachis hypogaea, Arachis spp, Sesamum spp., Helianthus spp. like Helianthus annuus, Elaeis like Elaeis guineensis, Elaeis oleifera, soybean, Brassica napus, Brassica oleracea, Sesamum orientale, Brassica juncea, Oilseed rape, Camellia oleifera, oil palm, olive, castor-oil plant, Brassica napus L., canola; (4) fiber crops: Agave sisalana, Gossypium spp. like Gossypium, Gossypium barbadense, Gossypium hirsutum, Hibiscus cannabinus, Agave sisalana, Musa textilis Nee, Linum usitatissimum, Corchorus capsularis L, Boehmeria nivea (L.), Cannabis sativa, Cannabis sativa; (5) fruit crops: Ziziphus spp., Cucumis spp., Passiflora edulis, Vitis spp., Vaccinium spp., Pyrus communis, Prunus spp., Psidium spp., Punica granatum, Malus spp., Citrullus lanatus, Citrus spp., Ficus carica, Fortunella spp., Fragaria spp., Crataegus spp., Diospyros spp., Eugenia uniflora, Eriobotrya japonica, Dimocarpus longan, Carica papaya, Cocos spp., Averrhoa carambola, Actinidia spp., Prunus amygdalus, Musa spp. (Musa acuminata), Persea spp. (Persea americana), Psidium guajava, Mammea americana, Mangifera indica, Canarium album (Olea europaea), Carica papaya, Cocos nucifera, Malpighia emarginata, Manilkara zapota, Ananas comosus, Annona spp., Citrus reticulata (Citrus spp.), Artocarpus spp., Litchi chinensis, Ribes spp., Rubus spp., pear, peach, apricot, plum, red bayberry, lemon, kumquat, durian, orange, strawberry, blueberry, hami melon, muskmelon, date palm, walnut tree, cherry tree; (6) rhizome crops: Manihot spp., Ipomoea batatas, Colocasia esculenta, tuber mustard, Allium cepa (onion), Eleocharis tuberosa (water chestnut), Cyperus rotundus, Rhizoma dioscoreae; (7) vegetable crops: Spinacia spp., Phaseolus spp., Lactuca sativa, Momordica spp, Petroselinum crispum, Capsicum spp., Solanum spp. (such as Solanum tuberosum, Solanum integrifolium, Solanum lycopersicum), Lycopersicon spp. (such as Lycopersicon esculentum, Lycopersicon lycopersicum, Lycopersicon pyriforme), Macrotyloma spp., Kale, Luffa acutangula, lentil, okra, onion, potato, artichoke, asparagus, broccoli, Brussels sprouts, cabbage, carrot, cauliflower, celery, collard greens, squash, Benincasa hispida, Asparagus officinalis, Apium graveolens, Amaranthus spp., Allium spp., Abelmoschus spp., Cichorium endivia, Cucurbita spp., Coriandrum sativum, B. carinata, Raphanus sativus, Brassica spp. (such as Brassica napus, Brassica rapa ssp., canola, oilseed rape, turnip rape, turnip rape, leaf mustard, cabbage, black mustard, canola (rapeseed), Brussels sprout), Solanaceae (eggplant), Capsicum annuum (sweet pepper), cucumber, luffa, Chinese cabbage, rape, cabbage, calabash, Chinese chives, lotus, lotus root, lettuce; (8) flower crops: Tropaeolum minus, Tropaeolum majus, Canna indica, Opuntia spp., Tagetes spp., Cymbidium (orchid), Crinum asiaticum L., Clivia, Hippeastrum rutilum, Rosa rugosa, Rosa chinensis, Jasminum sambac, Tulipa gesneriana L., Cerasus sp., Pharbitis nil (L.) Choisy, Calendula officinalis L., Nelumbo sp., Bellis perennis L., Dianthus caryophyllus, Petunia hybrida, Tulipa gesneriana L., Lilium brownii, Prunus mume, Narcissus tazetta L., Jasminum nudiflorum Lindl., Primula malacoides, Daphne odora, Camellia japonica, Michelia alba, Magnolia liliiflora, Viburnum macrocephalum, Clivia miniata, Malus spectabilis, Paeonia suffruticosa, Paeonia lactiflora, Syzygium aromaticum, Rhododendron simsii, Rhododendron hybridum, Michelia figo (Lour.) Spreng., Cercis chinensis, Kerria japonica, Weigela florida, Fructus forsythiae, Jasminum mesnyi, Parochetus communis, Cyclamen persicum Mill., Phalaenopsis hybrid, Dendrobium nobile, Hyacinthus orientalis, Iris tectorum Maxim, Zantedeschia aethiopica, Calendula officinalis, Hippeastrum rutilum, Begonia semperflorens hybr, Fuchsia hybrida, Begonia maculata Raddi, Geranium, Epipremnum aureum; (9) medicinal crops: Carthamus tinctorius, Mentha spp., Rheum rhabarbarum, Crocus sativus, Lycium chinense, Polygonatum odoratum, Polygonatum kingianum, Anemarrhena asphodeloides Bunge, Radix ophiopogonis, Fritillaria cirrhosa, Curcuma aromatica, Amomum villosum Lour., Polygonum multiflorum, Rheum officinale, Glycyrrhiza uralensis Fisch, Astragalus membranaceus, Panax ginseng, Panax notoginseng, Acanthopanax gracilistylus, Angelica sinensis, Ligusticum wallichii, Bupleurum sinenses DC., Datura stramonium Linn., Datura metel L., Mentha haplocalyx, Leonurus sibiricus L., Agastache rugosus, Scutellaria baicalensis, Prunella vulgaris L., Pyrethrum carneum, Ginkgo biloba L., Cinchona ledgeriana, Hevea brasiliensis (wild), Medicago sativa Linn, Piper nigrum L., Radix Isatidis, Atractylodes macrocephala Koidz; (10) raw material crops: Hevea brasiliensis, Ricinus communis, Vernicia fordii, Morus alba L., Hops Humulus lupulus, Betula, Alnus cremastogyne Burk., Rhus verniciflua Stokes; (11) pasture crops: Agropyron spp., Trifolium spp., Miscanthus sinensis, Pennisetum sp., Phalaris arundinacea, Panicum virgatum, prairie grasses, Indiangrass, Big bluestem grass, Phleum pratense, turf, cyperaceae (Kobresia pygmaea, Carex pediformis, Carex humilis), Medicago sativa Linn, Phleum pratense L., Medicago sativa, Melilotus suaveolens, Astragalus sinicus, Crotalaria juncea, Sesbania cannabina, Azolla imbricata, Eichhornia crassipes, Amorpha fruticosa, Lupinus micranthus, Trifolium, Astragalus adsurgens pall, Pistia stratiotes linn, Alternanthera philoxeroides, Lolium; (12) sugar crops: Saccharum spp., Beta vulgaris; (13) beverage crops: Camellia sinensis, Camellia Sinensis, tea, Coffee (Coffea spp.), Theobroma cacao, Humulus lupulus Linn.; (14) lawn plants: Ammophila arenaria, Poa spp. (Poa pratensis (bluegrass)), Agrostis spp. (Agrostis matsumurae, Agrostis palustris), Lolium spp. (Lolium), Festuca spp. (Festuca ovina L.), Zoysia spp. (Zoysia japonica), Cynodon spp. (Cynodon dactylon/bermudagrass), Stenotaphrum secundatum (Stenotaphrum secundatum), Paspalum spp., Eremochloa ophiuroides (centipedegrass), Axonopus spp. (carpetweed), Bouteloua dactyloides (buffalograss), Bouteloua var. spp. (Bouteloua gracilis), Digitaria sanguinalis, Cyperus rotundus, Kyllinga brevifolia, Cyperus amuricus, Erigeron canadensis, Hydrocotyle sibthorpioides, Kummerowia striata, Euphorbia humifusa, Viola arvensis, Carex rigescens, Carex heterostachya, turf; (15) tree crops: Pinus spp., Salix spp., Acer spp., Hibiscus spp., Eucalyptus spp., Ginkgo biloba, Bambusa sp., Populus spp., Prosopis spp., Quercus spp., Phoenix spp., Fagus spp., Ceiba pentandra, Cinnamomum spp., Corchorus spp., Phragmites australis, Physalis spp., Desmodium spp., Populus, Hedera helix, Populus tomentosa Carr, Viburnum odoratissimum, Ginkgo biloba L., Quercus, Ailanthus altissima, Schima superba, Ilex purpurea, Platanus acerifolia, Ligustrum lucidum, Buxus megistophylla Levl., Dahurian larch, Acacia mearnsii, Pinus massoniana, Pinus khasys, Pinus yunnanensis, Pinus finlaysoniana, Pinus tabuliformis, Pinus koraiensis, Juglans nigra, Citrus limon, Platanus acerifolia, Syzygium jambos, Davidia involucrata, Bombax malabarica L., Ceiba pentandra (L.), Bauhinia blakeana, Albizia saman, Albizzia julibrissin, Erythrina corallodendron, Erythrina indica, Magnolia grandiflora, Cycas revoluta, Lagerstroemia indica, coniferous, macrophanerophytes, Frutex; (16) nut crops: Bertholletia excelsa, Castanea spp., Corylus spp., Carya spp., Juglans spp., Pistacia vera, Anacardium occidentale, Macadamia (Macadamia integrifolia), Carya illinoensis Koch, Macadamia, Pistachio, Badam, other plants that produce nuts; (17) others: Arabidopsis thaliana, Brachiaria eruciformis, Cenchrus echinatus, Setaria faberi, Eleusine indica, Cadaba farinosa, algae, Carex elata, ornamental plants, Carissa macrocarpa, Cynara spp., Daucus carota, Dioscorea spp., Erianthus sp., Festuca arundinacea, Hemerocallis fulva, Lotus spp., Luzula sylvatica, Medicago sativa, Melilotus spp., Morus nigra, Nicotiana spp., Olea spp., Ornithopus spp., Pastinaca sativa, Sambucus spp., Sinapis sp., Syzygium spp., Tripsacum dactyloides, Triticosecale rimpaui, Viola odorata, and the like.
In a specific embodiment, the plant is selected from rice, corn, wheat, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugarcane, tomato, tobacco, cassava, potato, sweet potato, Chinese cabbage, cabbage, cucumber, Chinese rose, Scindapsus aureus, watermelon, melon, strawberry, blueberry, grape, apple, citrus, peach, pear, banana, etc.
As used herein, the term “plant” includes a whole plant and any progeny, cell, tissue or part of plant. The term “plant part” includes any part of a plant, including, for example, but not limited to: seed (including mature seed, immature embryo without seed coat, and immature seed); plant cutting; plant cell; plant cell culture; plant organ (e.g., pollen, embryo, flower, fruit, bud, leaf, root, stem, and related explant). Plant tissue or plant organ can be a seed, callus tissue, or any other plant cell population organized into a structural or functional unit. The plant cell or tissue culture can regenerate a plant that has the physiological and morphological characteristics of the plant from which the cell or tissue is derived, and can regenerate a plant that has substantially the same genotype as the plant. In contrast, some plant cells cannot regenerate plants. The regenerable cells in plant cells or tissue cultures can be embryos, protoplasts, meristematic cells, callus, pollen, leaves, anthers, roots, root tips, silk, flowers, kernels, spikes, cobs, husks, or stems.
The plant parts comprise harvestable parts and parts that can be used to propagate offspring plants. The plant parts that can be used for propagation include, for example, but not limited to: seeds; fruits; cuttings; seedlings; tubers; and rootstocks. The harvestable parts of plants can be any of useful parts of plants, including, for example, but not limited to: flowers; pollen; seedlings; tubers; leaves; stems; fruits; seeds; and roots.
The plant cells are the structural and physiological units of plants. As used herein, the plant cells include protoplasts and protoplasts with partial cell walls. The plant cells may be in a form of isolated single cells or cell aggregates (e.g., loose callus and cultured cells), and may be part of higher order tissue units (e.g., plant tissues, plant organs, and plants). Therefore, the plant cells can be protoplasts, gamete-producing cells, or cells or collection of cells capable of regenerating a whole plant. Therefore, in the embodiments herein, a seed containing a plurality of plant cells and capable of regenerating into a whole plant is considered as a “plant part”.
As used herein, the term “protoplast” refers to a plant cell whose cell wall is completely or partially removed and whose lipid bilayer membrane is exposed. Typically, the protoplast is an isolated plant cell without cell wall, which has the potential to regenerate a cell culture or a whole plant.
The plant “offspring” includes any subsequent generations of the plant.
The term “bacteria” means all prokaryotes, including all organisms in the Kingdom Procaryotae. The term “bacteria” includes all microorganisms considered to be bacteria, including Mycoplasma, Chlamydia, Actinomyces, Streptomyces, and Rickettsia. All forms of bacteria are included in this definition, including cocci, bacilli, spirilla, spheroplasts, protoplasts, etc. The term also includes prokaryotes that are Gram-negative or Gram-positive. “Gram-negative” and “Gram-positive” refer to staining patterns obtained using Gram staining methods well known in the art (see, for example, Finegold and Martin, Diagnostic Microbiology, 6th Ed., CV Mosby St. Louis, pp. 13-15 [1982]). “Gram-positive bacteria” are bacteria that can retain the original dye used for Gram staining, causing the stained cells to appear dark blue to purple under a microscope. “Gram-negative bacteria” do not retain the original dye used for Gram staining, but can be stained with a counterstain. Therefore, Gram-negative bacteria appear red after the Gram staining reaction.
As used herein, the term “fungi” refers to eukaryotic organisms such as molds and yeasts, including dimorphic fungi.
The terms “herbicide tolerance” and “herbicide resistance” can be used interchangeably, and both refer to herbicide tolerance and herbicide resistance. “Improvement in herbicide tolerance” and “improvement in herbicide resistance” mean that the tolerance or resistance to the herbicide is improved compared to a plant containing wild-type gene.
The term “wild-type” refers to a nucleic acid molecule or protein that can be found in nature.
In the present invention, the term “cultivation site” comprises a site where the plant of the present invention is cultivated, such as soil, and also comprises, for example, plant seeds, plant seedlings and grown plants. The term “weed-controlling effective amount” refers to an amount of herbicide that is sufficient to affect the growth or development of the target weed, for example, to prevent or inhibit the growth or development of the target weed, or to kill the weed. Advantageously, the weed-controlling effective amount does not significantly affect the growth and/or development of the plant seeds, plant seedlings or plants of the present invention. Those skilled in the art can determine such weed-controlling effective amount through routine experiments.
The term “Target DNA” as used herein refers to a DNA polynucleotide comprising a “target site” or “target sequence”.
The term “lysis” means the cleavage of the covalent backbone of a DNA molecule. The lysis can be initiated by a variety of methods, including but not limited to enzymatic or chemical hydrolysis of phosphodiester bonds. Both single-strand and double-strand lysis is possible, and double-strand lysis may occur due to two distinct single-strand lysis events. The DNA lysis may result in blunt or staggered ends. In certain embodiments, a complex comprising DNA-targeting RNA and a site-specific modification polypeptide is used for a targeted double-strand DNA lysis.
The term “gene” comprises a nucleic acid fragment expressing a functional molecule (such as, but not limited to, specific protein), including regulatory sequences before (5′ non-coding sequences) and after (3′ non-coding sequences) a coding sequence.
The DNA sequence that “encodes” a specific RNA is a DNA nucleic acid sequence that can be transcribed into RNA. The DNA polynucleotides can encode an RNA (mRNA) that can be translated into a protein, or the DNA polynucleotides can encode an RNA that cannot be translated into a protein (for example, tRNA, rRNA, or DNA-targeting RNA; which are also known as “non-coding” RNA or “ncRNA”).
The terms “polypeptide”, “peptide” and “protein” are used interchangeably in the present invention, and refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers. The terms “polypeptide”, “peptide”, “amino acid sequence” and “protein” may also include their modified forms, including but not limited to glycosylation, lipid linkage, sulfation, γ-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
The term “biologically active fragment” refers to a fragment that has one or more amino acid residues deleted from the N and/or C-terminus of a protein while still retaining its functional activity.
For the terms related to amino acid substitution used in the description, the first letter represents a naturally occurring amino acid at a certain position in a specific sequence, the following number represents the position in the corresponding sequence, and the second letter represents a different amino acid for substituting the naturally occurring amino acid. For example, W574L means that tryptophan at position 574 is substituted by leucine. For double or multiple mutations, each mutation is separated by “/”.
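The substitution notation described above can be read mechanically. The following is a minimal sketch, not part of the described method: the names `Substitution` and `parse_substitutions` are hypothetical, and the combined string “W574L/S653N” is used only to illustrate the “/” separator.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    original: str     # amino acid at this position in the reference sequence
    position: int     # 1-based position in the reference sequence
    replacement: str  # amino acid that substitutes the original one


# One letter, one or more digits, one letter (for example W574L).
_PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitutions(notation: str) -> List[Substitution]:
    """Parse a single substitution or a '/'-separated list of substitutions."""
    parsed = []
    for token in notation.split("/"):
        match = _PATTERN.fullmatch(token.strip())
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        parsed.append(
            Substitution(match.group(1), int(match.group(2)), match.group(3))
        )
    return parsed


print(parse_substitutions("W574L"))
print(parse_substitutions("W574L/S653N"))  # illustrative double mutation
```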
The terms “polynucleotide” and “nucleic acid” are used interchangeably and comprise DNA, RNA or hybrids thereof, which may be double-stranded or single-stranded.
The terms “nucleotide sequence” and “nucleic acid sequence” both refer to the sequence of bases in DNA or RNA.
As used in the present invention, “expression cassette”, “expression vector” and “expression construct” refer to a vector such as a recombinant vector suitable for expression of a nucleotide sequence of interest in a plant. The term “expression” refers to the production of a functional product. For example, the expression of a nucleotide sequence may refer to the transcription of the nucleotide sequence (such as transcription to generate mRNA or functional RNA) and/or the translation of RNA into a precursor or mature protein.
The “expression construct” of the present invention can be a linear nucleic acid fragment, a circular plasmid, a viral vector, or, in some embodiments, can be an RNA (such as mRNA) that can be translated.
The “expression construct” of the present invention may comprise regulatory sequences and nucleotide sequences of interest from different sources, or regulatory sequences and nucleotide sequences of interest from the same source but arranged in a way different from those normally occurring in nature.
The terms “recombinant expression vector” or “DNA construct” are used interchangeably herein and refer to a DNA molecule comprising a vector and at least one insert. Recombinant expression vectors are usually produced for the purpose of expression and/or propagation of the insert or for the construction of other recombinant nucleotide sequences. The insert may be operably or may be inoperably linked to a promoter sequence and may be operably or may be inoperably linked to a DNA regulatory sequence.
The terms “regulatory sequence” and “regulatory element” can be used interchangeably and refer to a nucleotide sequence that is located at the upstream (5′ non-coding sequence), middle or downstream (3′ non-coding sequence) of a coding sequence, and affects the transcription, RNA processing, stability or translation of a related coding sequence. Plant expression regulatory elements refer to nucleotide sequences that can control the transcription, RNA processing or stability or translation of a nucleotide sequence of interest in plants.
The regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyA recognition sequences.
The term “promoter” refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment. In some embodiments of the present invention, the promoter is a promoter capable of controlling gene transcription in plant cells, regardless of whether it is derived from plant cells. The promoter can be a constitutive promoter or a tissue-specific promoter or a developmentally regulated promoter or an inducible promoter.
The term “constitutive promoter” refers to a promoter that will generally cause gene expression in most cell types in most cases. “Tissue-specific promoter” and “tissue-preferred promoter” are used interchangeably, and refer to a promoter that is mainly but not necessarily exclusively expressed in a tissue or organ, and also expressed in a specific cell or cell type. “Developmentally regulated promoter” refers to a promoter whose activity is determined by a developmental event. “Inducible promoter” responds to an endogenous or exogenous stimulus (environment, hormone, chemical signal, etc.) to selectively express an operably linked DNA sequence.
As used herein, the term “operably linked” refers to a connection of a regulatory element (for example, but not limited to, promoter sequence, transcription termination sequence, etc.) to a nucleic acid sequence (for example, a coding sequence or open reading frame) such that the transcription of the nucleotide sequence is controlled and regulated by the transcription regulatory element. The techniques for operably linking regulatory element region to nucleic acid molecule are known in the art.
The “introducing” a nucleic acid molecule (such as a plasmid, linear nucleic acid fragment, RNA, etc.) or protein into a plant refers to transforming a cell of the plant with the nucleic acid or protein so that the nucleic acid or protein can function in the plant cell. The term “transformation” used in the present invention comprises stable transformation and transient transformation.
The term “stable transformation” refers to that the introduction of an exogenous nucleotide sequence into a plant genome results in a stable inheritance of the exogenous gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the plant and any successive generations thereof.
The term “transient transformation” refers to that the introduction of a nucleic acid molecule or protein into a plant cell to perform function does not result in a stable inheritance of the foreign gene. In transient transformation, the exogenous nucleic acid sequence is not integrated into the genome of the plant.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those of ordinary skill in the art to which the present invention pertains. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention, the preferred methods and materials are now described.
All publications and patents cited in this description are incorporated herein by reference as if each individual publication or patent were specifically and individually indicated to be incorporated by reference, and are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited. The citation of any publication that was published before the filing date should not be interpreted as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. In addition, the publication date provided may be different from the actual publication date, which may require independent verification.
Unless specifically stated or implied, as used herein, the terms “a”, “an” and “the” mean “at least one.” All patents, patent applications, and publications mentioned or cited herein are incorporated herein by reference in their entirety, to the same extent as if each were individually incorporated by reference.
The main sequences involved in the present invention are summarized as follows, and related sequences are provided in the sequence listing.
The present invention will be further explained in conjunction with the following examples. The following examples are illustrated by way of examples, but the protection scope of the present invention should not be limited to these examples. The experimental methods in the following examples, unless otherwise specified, were the methods described in commonly used molecular biology, tissue culture technology and agronomy manuals. For example, specific steps could be found in: “Molecular Cloning: A Laboratory Manual (3rd edition)” (Sambrook, J., Russell, David W., 2001, Cold Spring Harbor), “Plant Propagation by Tissue Culture” (Edwin F. George, Michael A. Hall, Geert-Jan De Klerk, 2008, Springer). The materials, reagents, instruments, etc. used in the following examples could be obtained from commercial sources unless otherwise specified.
A. Experimental Materials
1. Arabidopsis thaliana Material
Wild-type Arabidopsis thaliana Col-0, a model dicotyledonous plant, was used. The original seeds were provided by the Department of Weeds, College of Plant Protection, China Agricultural University, and were propagated and preserved by our laboratory according to standard methods in this field.
2. Vectors
The vector plasmids pCBC-dT1T2 (Xing H L, Dong L, Wang Z P, Zhang H Y, Han C Y, Liu B, Wang X C, Chen Q J 2014. A CRISPR/Cas9 toolkit for multiplex genome editing in plants. BMC Plant Biol. November 29; 14(1):327, see https://www.addgene.org/50590/ for details), pHEE401E (see Wang Z P, Xing H L, Dong L, Zhang H Y, Han C Y, Wang X C, Chen Q J Genome Biol. 2015 Jul. 21; 16:144. doi: 10.1186/s13059-015-0715-0; see https://www.addgene.org/71287/ for specific information), and pHEE401E-NG (the mutation reported in Nishimasu et al. 2018 Engineered CRISPR-Cas9 nuclease with expanded targeting space. Science 361(6408):1259-1262. doi: 10.1126/science.aas9129 was introduced into pHEE401E to construct vector pHEE401E-NG capable of recognizing NG PAM) were purchased from the Addgene website or constructed by our laboratory in accordance with conventional molecular biology methods, and kept by our laboratory.
3. Main Equipment
Pipette gun, water bath, PCR instrument (Bio-rad T100), electrophoresis instrument (WIX-EP600), gel imager, electric blast dryer, centrifuge (Eppendorf 5424R), high-throughput tissue lyser, shaker, electronic balance, pH meter, etc.
4. Main Reagents
High-fidelity DNA polymerase (purchased from Tsingke Bio), agarose gel recovery kit and plasmid extraction kit (purchased from Sparkjade), BsaI and T4 DNA ligases (purchased from NEB), Trans5α competent cells and EHA105 competent cells (purchased from TransGen Biotech, Beijing, China), GV3101 Agrobacterium competent cells (purchased from Shanghai AngyuBio), Tris, EDTA, kanamycin, cephalosporin, hygromycin, agarose, yeast powder, tryptone, NaCl (purchased from Sangon Biotech), MS powder, sucrose, Silwet-77, hygromycin (purchased from Solarbio), nucleic acid dye (Dured), absolute ethanol (purchased from Sinopharm), etc.
5. Preparation of Main Solutions
B. Experimental Methods
1. Design and Construction of CRISPR/Cas9 Dual Target Vector
1.1 Target Design
The Arabidopsis thaliana ALS gene sequence was shown in SEQ ID NO: 2. A 19-base target sequence, gRNA1 (5′-GCATGGTTATGCAATGGGA-3′), was designed using the AGA near the Arabidopsis thaliana ALS574 site as PAM. It was predicted that one G base would be deleted at the cut site between positions 3 and 4 upstream of the PAM after editing. A second target sequence, gRNA2 (5′-GGCATGGTTATGCAATGGA-3′), was then designed based on the sequence generated by the deletion, and it was predicted that one T base would be inserted by the second editing, thereby realizing the conversion of TGG to TTG, as shown in
Similarly, a 19-base target sequence, gRNA3 (5′-TGCCGATGATCCCGAGTGG-3′), was designed using the TGG near the Arabidopsis thaliana ALS653 site as PAM. It was predicted that one G base would be deleted at the cut site between positions 3 and 4 upstream of the PAM after editing. A second target sequence, gRNA4 (5′-TTGCCGATGATCCCGATGG-3′), was then designed based on the sequence generated by the deletion, and it was predicted that one A base would be inserted by the second editing, thereby realizing the conversion of AGT to AAT, as shown in
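The two-step design for both sites can be checked with a short simulation. The sketch below rests on assumptions made for illustration only: that the cut lies 3 bases upstream of the PAM, that the first repair deletes the base immediately 5′ of the cut, and that the second repair inserts one base at the cut. The function names are hypothetical, and the extra 5′ base of each second guide is taken directly from the gRNA2 and gRNA4 sequences given above.

```python
CUT_OFFSET = 3  # assumed: cut site lies 3 bases upstream of the PAM


def delete_5prime_of_cut(protospacer: str) -> str:
    """Remove the single base immediately 5' of the cut site."""
    cut = len(protospacer) - CUT_OFFSET
    return protospacer[:cut - 1] + protospacer[cut:]


def insert_at_cut(protospacer: str, base: str) -> str:
    """Insert one base at the cut site."""
    cut = len(protospacer) - CUT_OFFSET
    return protospacer[:cut] + base + protospacer[cut:]


def run_sequential_edit(first_guide: str, second_guide: str, inserted: str) -> str:
    """Apply the predicted deletion, confirm that the second guide matches
    the deletion product, then apply the predicted insertion."""
    after_deletion = delete_5prime_of_cut(first_guide)
    # The second guide is the deletion product extended by one upstream base.
    if second_guide[1:] != after_deletion:
        raise ValueError("second guide does not match the deletion product")
    return insert_at_cut(second_guide, inserted)


als574 = run_sequential_edit("GCATGGTTATGCAATGGGA", "GGCATGGTTATGCAATGGA", "T")
als653 = run_sequential_edit("TGCCGATGATCCCGAGTGG", "TTGCCGATGATCCCGATGG", "A")
print(als574)  # ALS574 site after deletion of G and insertion of T
print(als653)  # ALS653 site after deletion of G and insertion of A
```

Under these assumptions the ALS574 product carries TTG where gRNA1 had TGG, and the ALS653 product carries AAT where gRNA3 had AGT, which is the net single-base substitution the design aims for.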
1.2 Vector Construction
The method described by Xing H L, Dong L, Wang Z P, Zhang H Y, Han C Y, Liu B, Wang X C, Chen Q J 2014. A CRISPR/Cas9 toolkit for multiplex genome editing in plants. BMC Plant Biol. November 29; 14(1): 327 was followed. Specifically, the dT1T2 plasmid was used as a template to amplify the dual-target fragments for the ALS574 and ALS653 sites, respectively, to construct the sgRNA expression cassettes. The vector backbones of pHEE401E and pHEE401E-NG were digested with BsaI, and the bands were cut from the gel and recovered; the target fragments were used directly for the ligation reaction after digestion. T4 DNA ligase was used to ligate the vector backbones and the target fragments, the ligation products were transformed into Trans5α competent cells, and individual clones were picked for sequencing. After confirmation by sequencing, the Sparkjade High Purity Plasmid Mini Extraction Kit was used to extract the recombinant plasmids, which were named pQY743 and pQY745, respectively.
2. Design of Primers for Target Detection
Primers for target detection took ALS574 and ALS653 target sites as centers, wherein the primer for upstream detection was about 100 bp from the ALS574 target site, and the primer for downstream detection was about 280 bp from the ALS653 target site. The primer sequences were as follows: 574/653checking-F: 5′ATTGACGGAGATGGAAGCTT3′, and 574/653checking-R: 5′CCAAACTGAGCCAGTCACAA3′.
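As a quick sanity check on the two detection primers, their length, GC content and an approximate melting temperature can be computed. This is a sketch only: the description does not state how the primers were evaluated, the Wallace rule (2 °C per A/T, 4 °C per G/C) is a rough approximation for short oligonucleotides, and the function names are hypothetical.

```python
def gc_count(primer: str) -> int:
    """Number of G and C bases in the primer."""
    primer = primer.upper()
    return primer.count("G") + primer.count("C")


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature (deg C) by the Wallace rule."""
    gc = gc_count(primer)
    at = len(primer) - gc
    return 2 * at + 4 * gc


primers = {
    "574/653checking-F": "ATTGACGGAGATGGAAGCTT",
    "574/653checking-R": "CCAAACTGAGCCAGTCACAA",
}

for name, seq in primers.items():
    gc_percent = 100 * gc_count(seq) / len(seq)
    print(f"{name}: {len(seq)} nt, GC {gc_percent:.0f}%, Tm ~{wallace_tm(seq)} C")
```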
3. Establishment of Arabidopsis thaliana Genetic Transformation System
3.1 Agrobacterium Transformation
The constructed recombinant plasmids were transformed into Agrobacterium GV3101 competent cells to obtain recombinant Agrobacterium cells.
3.2 Preparation of Agrobacterium Infection Solution
3.3 Transformation of Arabidopsis thaliana
3.4 Seed Harvest
After the seeds were mature, they were harvested. After harvesting, the seeds were dried in an oven at 37° C. for about one week.
4. Selection of Transgenic Plant
The seeds were treated with disinfectant for 5 minutes, washed with deionized water for 5 times, and evenly spread on the MS selection medium (containing 30 μg/mL hygromycin, 100 μg/mL cephalosporin), the medium was placed in a light incubator (temperature 22° C., 16 hours light, 8 hours dark, light intensity 100-150 μmol/m2/s, humidity 75%), and one week later, the positive seedlings were selected and transplanted to soil.
5. Detection of T1 Mutant Plants
5.1 Extraction of Genomic DNA
5.2 PCR Amplification
The extracted T1 plant genome was used as a template, the detection primers were used to amplify target fragment, 5 μL of amplified product was pipetted and detected by 1% agarose gel electrophoresis, and imaged with the gel imager. The remaining product was delivered to the sequencing company to directly perform sequencing.
6. Detection of T2 Mutant Plant
After the seeds of the T1 strain were harvested from single plant, the seeds of different strains of two vectors were selected and spread on the imazapic selection medium (MS medium+0.24 μg/mL imazapic) to perform selection, and the positive seedlings were transplanted to the soil one week later and subjected to molecular detection, in which the method was the same as step 5.
The used primers and sequences thereof:
C. Experimental Results
1. Genotype Detection of T1 Plant
T1 seeds were selected on MS hygromycin resistance medium. A total of 32 positive seedlings were obtained for the pQY743 vector, and a total of 18 positive seedlings were obtained for the pQY745 vector. For each vector, 10 seedlings were selected and their leaf genomic DNA was extracted to detect the target site. It was found that no editing had occurred in the T1 generation at the ALS574 site, whereas editing events that met the design expectations were found at the ALS653 site. The detection results are shown in Table 1:
2. Selection Results of T2 Generation Seeds
After the T2 generation seeds of single plants were harvested, they were spread on the imazapic resistant medium for selection. It could be seen that the wild-type Col-0 could not grow on the resistant medium, while the positive plants of the mutant strains could grow normally on the resistant medium, as shown in
For each vector, 10 seedlings were selected and genomic DNA thereof was extracted from leaves for molecular detection. It was found that there were 6 strains for the pQY743 vector that had homozygous mutation in line with the expectations, i.e., mutation from TGG to TTG, as shown in
The above results showed that by using the technical solution of programmed sequential cutting/editing of the present invention, the design and realization of expected mutations at the target site could be achieved, and the base substitution mutations could be realized by designing sequential sgRNA combinations only with Cas9 protein.
The operation steps for vector design, construction, and Arabidopsis thaliana transformation and selection were performed by referring to Example 1. For AtALS W574 site, the vector design was the same as that of Example 1. The schematic diagram of the vector was shown in
For the AtALS S653 site, the vector design was the same as that of Example 1. The vector diagram was shown in
The above results showed that by using the technical solution of programmed sequential cutting/editing of the present invention, in addition to the expected mutations designed for the target site, multiple functional mutation types could also be generated by designing sequential sgRNA combinations with only corresponding Cas9 protein, so that it is a suitable tool for creating new functional mutations.
The operation steps for vector design, construction, and Arabidopsis thaliana transformation and selection were performed by referring to Example 1, except that, for the sequence 5′CTTGGCATGGTTATGCAATGgg3′ near the AtALS W574 site, the GG closer to the W574 site was used first as PAM to design sgRNA1: 5′CTTGGCATGGTTATGCAATG3′, in which the W574 site was underlined and the NG PAM was shown in italics. It was predicted that a new sequence 5′CTTGGCATGGTTATGCAAATGGGAag3′ was to be formed by +A after cutting and repair. The AG was used as a new PAM site to design sgRNA2: 5′CTTGGCATGGTTATGCAAATGGGA3′, and −G genotype was formed after spontaneous repair of cells, resulting in W574M. That was, different PAM sites were used for the two cuttings in this scheme. The vector diagram was shown in
The results showed that by using the technical solution of the present invention and using different PAMs to perform programmed sequential cutting/editing, the editing could be carried out in a wider sequence range to obtain amino acid substitution.
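The amino acid outcome of each edit can be verified by translating a short window around the W574 codon. The sketch below uses the standard genetic code; the 9- and 10-base windows are taken from the guide sequences quoted in the examples, and the reading frame (CAA immediately before the TGG codon of W574) is an assumption inferred from those sequences.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate complete codons from the first base; trailing bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


windows = {
    "wild type": "CAATGGGAA",
    "after -G then +T (Example 1 design)": "CAATTGGAA",
    "after +A only (intermediate)": "CAAATGGGAA",
    "after +A then -G (different PAMs)": "CAAATGGAA",
}
for label, seq in windows.items():
    print(f"{label}: {seq} -> {translate(seq)}")
```

The intermediate after the first edit is out of frame; only the second edit restores the frame and leaves a single amino acid change.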
The mutation as reported in Nishimasu et al. 2018 Engineered CRISPR-Cas9 nuclease with expanded targeting space. Science 361(6408):1259-1262. doi: 10.1126/science.aas9129 was introduced into pHUE411 (A CRISPR/Cas9 toolkit for multiplex genome editing in plants. Xing H L, Dong L, Wang Z P, Zhang H Y, Han C Y, Liu B, Wang X C, Chen Q J. BMC Plant Biol. 2014 Nov. 29; 14(1):327. 10.1186/s12870-014-0327-y, see details in https://www.addgene.org/71287/) to construct the vector pHUE411-NG capable of recognizing NG PAM.
The Oryza sativa ALS gene sequence was shown in SEQ ID NO: 12. By using 5′GGCGTGCGGTTTGATGATCG3′ (the underlined part was the OsALS-D350 site corresponding to Arabidopsis thaliana ALS-D376) and a new sequence 5′GGCGTGCGGTTTGATGACG3′ that was predicted to be generated from editing as targets, a dual-target vector was constructed according to the method described in Xing H L, Dong L, Wang Z P, Zhang H Y, Han C Y, Liu B, Wang X C, Chen Q J. BMC Plant Biol. 2014, and it was expected to obtain the conversion of GAT-GAA, resulting in OsALS D350E mutation.
By using 5′GGTATGGTTGTGCAATGGGA3′ (the underlined part was the OsALS-W548 site corresponding to Arabidopsis thaliana ALS-W574) and a new sequence 5′GGTATGGTTGTGCAATGGA 3′ that was predicted to be generated from editing as targets, a dual-target vector was constructed, and it was expected to obtain the conversion of TGG-TTG, resulting in OsALS W548L mutation.
Then the two vectors were transferred into Oryza sativa to obtain transgenic plants, and the identification indicated that the plants with expected substitutions D350E and W548L were obtained. The results of herbicide resistance biotest in field showed that the D350E and W548L mutants acquired the resistance to ALS inhibitor herbicides.
The Oryza sativa ACCase2 gene sequence was shown in SEQ ID NO: 14, in which OsACCase2 W2038 site corresponded to the ACCase W2027 site of Alopecurus myosuroides. The AGG close to this site was used as PAM to design sgRNA1: 5′TTCATCCTCGCTAAC-TGAG3′, and it was predicted that a new sequence was formed by −G after cutting and repair. The AGG was continuously used as PAM to design sgRNA2: 5′CTTC-ATCCTCGCTAACTGAG3′, and +T genotype was formed after cutting and repair again, resulting in the W2038L mutation. The sgRNA1 and sgRNA2 were constructed on the pHUE411 vector to form an editing vector, and the vector diagram was shown in
The editing vector was used to transform the callus of Huaidao No 5 (a rice variety), and after 3 weeks of co-selection with 50 μg/L hygromycin and 50 μg/L quizalofop-p, a large number of resistant calli were obtained, as shown on the left panel of
The above results of programmed sequential cutting/editing of Oryza sativa gene showed that the technical solution of the present invention was applicable to both monocotyledonous and dicotyledonous plants.
1. Experimental Instrument and Reagents
2. Experimental Method
2.1 Construction of pET15b-Cas9 Expression Vector
The DNA sequences of SpCas9 and NGA-Cas9 proteins obtained after plant codon optimization were shown in SEQ ID NO: 15 and SEQ ID NO: 16, respectively, and the sequences were synthesized by GenScript as template DNAs.
After the SpCas9 and NGA-Cas9 sequences were amplified separately, the two fragments were ligated into a pET15b expression vector by the infusion method, transformed into DH5α, and verified by sequencing.
2.2 Expression and Purification of Proteins
The constructed expression vectors were transformed into Escherichia coli Rosetta (DE3), expression was induced by IPTG, the bacteria were harvested and lysed, and the proteins were purified on a Ni-NTA column. The specific method was as follows:
2.3 Expression and Purification Results of Cas9 Fusion Proteins
The purification results of SpCas9 and NGA-Cas9 proteins were shown in
The specific detection primers, OsACC1750AA-F: 5′gcgaagaagactatgctcgtattgg3′ and OsACC2196AA-R: 5′cttaatcacacctttcgcagcc3′, were used to amplify the fragment containing the OsACCase W2038 target site, and the PCR product was 1500 bp in length.
The PCR system was shown in the following table:
The results were shown in
2. Preparation of RNP Complex:
According to the OsALS548 target site sequence GGGTATGGTTGTGCAATGGGAgga (the OsALS W548 site was underlined, corresponding to Arabidopsis thaliana ALS W574, and the PAM site recognized by Cas9 protein was shown in italic lowercase), the purified NGA-Cas9 protein was selected to prepare the RNP complex. The GGA was used as PAM to design >CrRNA1-548-G: 5′-GGGUAUGGUUGUGCAAUGGGAguuuuagagcuaugcu-3′, and it was predicted that one G base would be deleted at the cut site between positions 3 and 4 upstream of the PAM after editing. The sequence resulting from this deletion was then used to design a second >CrRNA2-548+T: 5′-UGGGUAUGGUUGUGCAAUGGAguuuuagagcuaugcu-3′, and it was predicted that one T base would be inserted after the second editing to obtain the conversion of TGG to TTG.
>CrRNA1-548-G and >CrRNA2-548+T were synthesized by GenScript Biotechnology Company, and sgRNA was also synthesized:
The synthesized crRNA and GenCRISPR tracrRNA (GenScript SC1933) were mixed equimolarly, added with crRNA&tracrRNA annealing buffer (GenScript SC1957-B) and annealed to prepare gRNA according to the instructions. The tracrRNA sequence was 5′-agcauagcaaguuaaaauaaggcuaguccguuaucaacuugaaaaaguggcaccgagucggugcuuu-3′.
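The crRNA sequences in this example follow a simple pattern: the DNA target is written as RNA (T replaced by U) and a fixed lowercase repeat is appended at the 3′ end. A minimal sketch of that pattern is given below; the function name `make_crrna` is hypothetical, and the repeat string is copied from the crRNA sequences as written in this example.

```python
# 3' repeat portion as it appears (lowercase) in the crRNA sequences above.
CRRNA_REPEAT = "guuuuagagcuaugcu"


def make_crrna(target_dna: str, repeat: str = CRRNA_REPEAT) -> str:
    """Return the crRNA for a DNA target: spacer as RNA followed by the repeat."""
    spacer = target_dna.upper().replace("T", "U")
    return spacer + repeat


# First target: the OsALS548 site as given, without the PAM.
print(make_crrna("GGGTATGGTTGTGCAATGGGA"))
# Second target: the sequence predicted after deletion of one G, extended by
# one base at the 5' end as in the second crRNA above.
print(make_crrna("TGGGTATGGTTGTGCAATGGA"))
```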
The RNP reaction system was prepared according to the following table, and after the reaction system was prepared, it was incubated at 25° C. for 10 minutes.
3. Protoplast Transformation
4. Detection of Genome Target Editing Event
The PCR reaction system was shown in the following table:
5. Experimental Results:
By either using the gRNA prepared with the synthetic crRNA and tracrRNA by annealing or directly using the synthetic sgRNA, an active RNP complex could be formed with the purified NGA-Cas9 protein. By sequencing the OsALS548 targeted site, the mutation from TGG to TTG could be detected, which demonstrated that the site-specific mutation of the target site in the cell could be achieved by the programmed sequential cutting/editing generated from the RNP complex in combination with crRNA or sgRNA in sequential order. As shown in
1. Preparation of RNP Complexes:
According to the OsACCase2 W2038 target site sequence GTTCATCCTCGCTAACTGGAGagg, in which the OsACCase2 W2038 site, corresponding to the Alopecurus myosuroides ACCase W2027, was underlined, and the PAM site recognized by the Cas9 protein was shown in italic lowercase, the purified SpCas9 protein was selected to prepare the RNP complexes. The AGG was used as PAM to design >CrRNA1-2038-G: 5′-GUUCAUCCUCGCUAACUGGAGguuuuagagcuaugcu-3′, and it was predicted that one G base would be deleted at the cut site between positions 3 and 4 upstream of the PAM after editing. The sequence generated by the deletion was then used to design a second >CrRNA2-2038+T: 5′-UGUUCAUCCUCGCUAACUGAGguuuuagagcuaugcu-3′, and it was predicted that one T base would be inserted after the second editing, thus obtaining the conversion of TGG to TTG.
>CrRNA1-2038-G and >CrRNA2-2038+T were synthesized by GenScript Biotechnology Company, and sgRNA was also synthesized:
The synthesized crRNA and GenCRISPR tracrRNA (GenScript SC1933) were mixed equimolarly, added with crRNA&tracrRNA annealing buffer (GenScript SC1957-B) and annealed to prepare gRNA according to the instructions.
The RNP complexes were prepared by incubating in the same reaction system as in Example 8. Taking the amount of 10 gene gun bombardments for transformation as example: 20 μg of Cas9 protein, 20 μg of gRNA or sgRNA, 10 μl of 10×Cas9 reaction buffer, made up to 100 μl in total with RNase-free ultrapure water, incubated at 25° C. for 10 minutes, and mixed gently.
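The amounts above are stated for 10 bombardments. The sketch below scales them to another number of bombardments, assuming simple linear scaling (an assumption, not stated in the description); the function and key names are hypothetical. The volumes of the protein and RNA stocks are not computed because their concentrations are not given.

```python
BASE_SHOTS = 10
# Amounts stated above for 10 gene gun bombardments.
BASE_RECIPE = {
    "cas9_protein_ug": 20.0,
    "grna_or_sgrna_ug": 20.0,
    "cas9_reaction_buffer_10x_ul": 10.0,
    "total_volume_ul": 100.0,  # made up with RNase-free ultrapure water
}


def scale_rnp_recipe(shots: int) -> dict:
    """Scale the 10-bombardment recipe linearly to the requested number."""
    if shots <= 0:
        raise ValueError("shots must be a positive integer")
    factor = shots / BASE_SHOTS
    return {name: amount * factor for name, amount in BASE_RECIPE.items()}


for name, amount in scale_rnp_recipe(25).items():
    print(f"{name}: {amount:g}")
```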
2. Induction of Oryza sativa Callus
The mature and plump Oryza sativa seeds were selected, hulled, and disinfected according to the following steps:
3. Gene Gun RNP Bombardment:
The callus with good embryogenicity was transferred to hypertonic medium (formulation: MS powder (4.42 g/L)+2,4-D (2 mg/L)+sucrose (30 g/L)+D-mannitol (0.4 M)+phytagel (4 g/L)), sterile operation was carried out on an ultra-clean bench, and cultivation was carried out in the dark at 25° C. for 4-6 hours.
Bombardment parameters: the vacuum degree was 26-28, the distance was 6 cm, and the air pressure was 1100 psi or 1350 psi.
4. Selection, Differentiation and Rooting:
5. Detection of Target Editing Events in Resistant Callus and T0 Tissue Culture Seedlings:
A total of 11 resistant calli were obtained in the selection, and their DNA was extracted by the CTAB method. The detection primers for the target site were designed as follows: OsACC2038test-F: 5′CTGTAGGCATTTGAAACTGCAGTG3′, OsACC2038test-R: 5′GCAATCCTGGAGTTCCTCTGACC3′, and the PCR fragment containing the OsACCase2 W2038 site was amplified, recovered and sequenced. Sequencing indicated that 10 of them had the mutation from TGG to TTG at the OsACCase2 W2038 site, of which 3 samples were homozygous mutants.
For the T0 generation tissue culture seedlings obtained by resistant callus differentiation, the DNA thereof was extracted for detecting the editing target site sequence, and there was also a homozygous mutation from TGG to TTG at the OsACCase2 W2038 site, as shown in
6. Resistance Test of T1 Generation Seedlings to ACCase Inhibitor Herbicides
After the propagation of the T0 strain containing the W2038L mutation, the T1 generation mutant seedlings were tested for herbicide resistance with quizalofop-p and haloxyfop-p in field concentrations. It could be seen that the OsACCase2 W2038L mutant strain was significantly resistant to these two ACCase inhibitor herbicides, as shown in
In summary, by either using the gRNA prepared with the synthetic crRNA and tracrRNA by annealing, or directly using the synthetic sgRNA, an active RNP complex could be formed with the purified SpCas9 protein; the gene gun bombarded calli could be selected by using quizalofop-p in the tissue culture stage; the homozygous mutation from TGG to TTG could be detected by sequencing the OsACCase2 W2038 target site of the T0 generation tissue culture seedlings; the mutation could be inherited by the T1 generation and conferred resistance to ACCase inhibitor herbicides, which further demonstrated that the programmed sequential cutting/editing generated from the RNP complex in combination with the sequential targeting crRNA or sgRNA could achieve site-specific mutation of a target site in the cell, and could guide the production of cell-endogenous selection markers for tissue culture selection, thereby creating herbicide-resistant crops.
The RNP complex was prepared by the method according to Example 8, and the gene gun bombardment and tissue culture procedures were the same as those of Example 9. Besides the crRNA or sgRNA targeting the OsACCase2 W2038 site, the crRNA or sgRNA targeting the OsBADH2 gene was added at the same time, and they were incubated with SpCas9 protein to form a targeting RNP complex for a second target gene OsBADH2. The gene gun bombardment was performed, the T0 generation tissue culture seedlings were obtained after recovery culture, screening, differentiation and rooting, and the T1 generation was obtained by propagation.
The Oryza sativa OsBADH2 genome sequence was shown in SEQ ID NO: 17. According to the CRISPOR online tool (http://crispor.tefor.net/), the target site sequence CCAAGTACCTCCGCGCAATCGcgg was selected, in which the PAM site recognized by the Cas9 protein was shown in italic lowercase, and the purified SpCas9 protein was selected to prepare an RNP complex. The CGG was used as PAM to design >CrRNA1-OsBADH2: 5′-CCAAGUACCUCCGCGCAAUCGguuuuagagcuaugcu-3′, it was predicted that the resistant mutation of OsACCase2 W2038L and the knockout mutation event of OsBADH2 could be simultaneously detected in the resistant callus obtained by the selection of quizalofop-p.
>CrRNA1-OsBADH2 was synthesized by GenScript Biotechnology Company, and sgRNA was also synthesized:
The resistant callus was selected according to the transformation steps in Example 9, as shown in
As a result, a total of 13 resistant calli were obtained by selection, the OsACCase2 W2038L mutation event was detected in 11 of them, and the detection of the OsBADH2 target sequence in these 11 callus samples showed that 8 of them simultaneously contained the editing event at the OsBADH2 target site. In the T0 generation tissue culture seedlings obtained by differentiation, the existence of OsACCase2 W2038L mutation and OsBADH2+A homozygous mutation was detected, as shown in
In sum, the rice callus was subjected to gene gun bombardment with the RNP complexes carrying the sequentially targeting crRNAs or sgRNAs for site-specific editing of OsACCase2 W2038, together with the RNP complex targeting the second target gene OsBADH2, and the callus could then be selected with quizalofop-p in the tissue culture stage. The OsACCase2 W2038 mutation and the targeted knockout of OsBADH2 occurred simultaneously in 61% of the resistant calli, and strains containing the homozygous mutation of OsBADH2 were detected in the T0 generation tissue culture seedlings. These results indicated that sequential targeting in combination with RNP transformation for site-specific editing of resistance genes could generate endogenous selection markers, and that applying the corresponding selection pressure could simultaneously screen for editing events of the second target gene, thereby achieving site-specific editing of the genome by non-transgenic means.
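The proportions behind these figures follow directly from the callus counts reported in this example (13 resistant calli, 11 with the W2038L mutation, 8 of those also edited at OsBADH2). A short arithmetic sketch, with a hypothetical helper name:

```python
def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


resistant_calli = 13   # resistant calli obtained by selection
with_w2038l = 11       # calli carrying the OsACCase2 W2038L mutation
with_both_edits = 8    # of those, calli also edited at the OsBADH2 target

print("W2038L among resistant calli (%):", percent(with_w2038l, resistant_calli))
print("both edits among resistant calli (%):", percent(with_both_edits, resistant_calli))
print("OsBADH2 edit among W2038L calli (%):", percent(with_both_edits, with_w2038l))
```

The second value, 8 of 13, is about 61.5%, which corresponds to the 61% stated above.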
The targeting RNP complexes for the OsALS548 site were prepared by the method according to Example 8, in which the crRNA and sgRNA sequences were the same as in Example 8, respectively:
Referring to Example 8, the gRNA or sgRNA was incubated with NGA-Cas9 to prepare RNP complexes targeting OsALS548 site.
In addition to the targeting RNP complex for the OsALS548 site, the crRNA or sgRNA for the OsSWEET14 gene was added at the same time, and the targeting RNP complex for the second target gene OsSWEET14 was formed by incubation with the SpCas9 protein. The T0 generation tissue culture seedlings were obtained by performing gene gun bombardment, recovery culture, selection, differentiation and rooting, and the T1 generation was obtained by propagation. The selection pressure was 5 mg/L pyroxsulam.
The Oryza sativa OsSWEET14 genome sequence was shown in SEQ ID NO: 18. Using the CRISPOR online tool (http://crispor.tefor.net/), the target site sequence GAGCTTAGCACCTGGTTGGAGggg was selected, in which the PAM site recognized by the SpCas9 protein was shown in italic lowercase, and the purified SpCas9 protein was selected to prepare the RNP complex. The GGG was used as PAM to design:
>CrRNA1-OsSWEET14:
5′-GAGCUUAGCACCUGGUUGGAGguuuuagagcuaugcu-3′, and it was predicted that the resistant mutation of OsALS W548L and the knockout mutation of OsSWEET14 could be simultaneously detected in the resistant callus obtained by the selection of pyroxsulam.
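The way the two crRNAs above relate to their target sites can be illustrated with a short sketch. It assumes only what the listed sequences show: the lowercase 3′ end of the target site is the PAM, the protospacer is transcribed to RNA (T replaced by U), and the tail guuuuagagcuaugcu is appended. The function names are hypothetical and are not part of the described method.

```python
from typing import Tuple

# Sketch only: helper names are hypothetical illustrations.
# 3' tail copied from the crRNA sequences listed in the text.
REPEAT = "guuuuagagcuaugcu"


def split_target_site(site: str) -> Tuple[str, str]:
    """Split 'PROTOSPACERpam' into (protospacer, PAM).

    The PAM is taken to be the run of lowercase letters at the 3' end,
    following the notation used for the target sites in the text.
    """
    cut = len(site)
    while cut > 0 and site[cut - 1].islower():
        cut -= 1
    return site[:cut], site[cut:].upper()


def is_ngg(pam: str) -> bool:
    """Return True if the PAM has the NGG form recognized by SpCas9."""
    return len(pam) == 3 and pam[1:] == "GG"


def design_crrna(site: str) -> str:
    """Build the crRNA: protospacer as RNA followed by the repeat tail."""
    protospacer, pam = split_target_site(site)
    if not is_ngg(pam):
        raise ValueError(f"PAM {pam!r} is not of the NGG form")
    return protospacer.upper().replace("T", "U") + REPEAT


if __name__ == "__main__":
    sites = {
        "CrRNA1-OsBADH2": "CCAAGTACCTCCGCGCAATCGcgg",
        "CrRNA1-OsSWEET14": "GAGCTTAGCACCTGGTTGGAGggg",
    }
    for name, site in sites.items():
        print(f">{name}: 5'-{design_crrna(site)}-3'")
```

Run on the OsBADH2 and OsSWEET14 target sites, the sketch reproduces the two crRNA sequences given in the text.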
>CrRNA1-OsSWEET14 was synthesized by GenScript Biotechnology Company, and the sgRNA was also synthesized:
The resistant callus was selected according to the gene gun bombardment and tissue culture procedures as described in Example 9, as shown in
The sequences at the OsALS548 site and OsSWEET14 target site of the callus and T0 generation tissue culture seedlings were sequenced. The OsSWEET14 target site detection primers were:
As a result, a total of 9 resistant calli were obtained by the selection. The OsALS W548L mutation event was detected in 8 of them, and detection of the OsSWEET14 target sequence in these 8 callus samples showed that 5 of them also contained an editing event at the OsSWEET14 target site. In the T0 generation tissue culture seedlings obtained by differentiation, both the OsALS W548L mutation and the OsSWEET14-C homozygous mutation could be detected, as shown in
After the propagation of T0 strains in which the OsALS W548L mutation was detected, the T1 generation mutant seedling strains were tested for herbicide resistance with pyroxsulam, imazapic, nicosulfuron and flucarbazone-Na at field concentrations. It could be seen that the OsALS W548L mutant strain showed significant resistance to all 4 of these ALS inhibitor herbicides, as shown in
In summary, an active RNP complex could be formed with the purified NGA Cas9 protein using either the gRNA prepared by annealing the synthetic crRNA and tracrRNA or the synthetic sgRNA directly. The site-specific mutation of a target site in the cells could be achieved by the programmed sequential cutting/editing generated from the RNP complexes in combination with the sequential targeting crRNAs or sgRNAs, and the callus bombarded by gene gun could be selected with pyroxsulam at the tissue culture stage. The mutation from TGG to TTG was detected by sequencing the OsALS548 target site of the T0 generation tissue culture seedlings, and this mutation could be inherited by the T1 generation and conferred resistance to ALS inhibitor herbicides.
When the rice callus was bombarded by gene gun with the simultaneous addition of the targeting RNP complex for the second target gene OsSWEET14, prepared with the SpCas9 protein that recognized the NGG PAM, the callus was selected with pyroxsulam at the tissue culture stage. The OsALS W548L mutation and the targeted knockout of OsSWEET14 occurred simultaneously in 55% of the resistant calli (5 of 9), and the homozygous mutation of OsSWEET14 could be detected in the T0 generation tissue culture seedlings. This further indicated that endogenous selection markers could be generated by the programmed sequential cutting/editing in combination with the site-specific editing of resistance genes by RNP transformation, and that the editing events of the second target gene could be selected at the same time by simultaneously using Cas9 proteins that recognized different PAM sites and applying the corresponding selection pressure, thereby achieving site-specific editing of the genome by non-transgenic means.
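The co-editing frequencies reported for the two marker/second-target combinations follow directly from the callus counts given above. The following is a minimal sketch of that arithmetic; the dictionary keys and function names are hypothetical, and the counts are those stated in the text.

```python
# Sketch only: names are hypothetical; counts are those reported in the text.
EXPERIMENTS = {
    "OsACCase2 W2038L + OsBADH2": {
        "resistant": 13, "marker_edited": 11, "co_edited": 8,
    },
    "OsALS W548L + OsSWEET14": {
        "resistant": 9, "marker_edited": 8, "co_edited": 5,
    },
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def summarize(counts: dict) -> dict:
    """Co-editing frequency relative to two different denominators."""
    return {
        # fraction of all resistant calli carrying both edits
        "co_edited_of_resistant": round(
            percent(counts["co_edited"], counts["resistant"]), 1),
        # fraction of marker-edited calli that also carry the second edit
        "co_edited_of_marker_edited": round(
            percent(counts["co_edited"], counts["marker_edited"]), 1),
    }


if __name__ == "__main__":
    for name, counts in EXPERIMENTS.items():
        print(name, summarize(counts))
```

Relative to all resistant calli, the sketch gives 61.5% (8 of 13) and 55.6% (5 of 9), consistent with the 61% and 55% stated in the text when truncated to whole percentages.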
The amino acid sequence of Solanum tuberosum L. StALS2 protein was shown in SEQ ID NO: 19, and the sequence of Solanum tuberosum L. StALS2 gene was shown in SEQ ID NO: 20. The RNP complexes were prepared and the gene gun bombardment was performed as described in Examples 8-9. The sgRNAs designed for the original sequence and the edited sequence of the StALS2 W561 site of Solanum tuberosum L. StALS2, which corresponds to the Arabidopsis thaliana ALS574 site, were as follows:
And the RNP complexes were prepared according to the method described in Example 8.
The recipient potato variety was Atlantic or Favorita, and the leaves, stems and axillary buds thereof were used as explants, respectively. The methods for gene gun bombardment and selection and differentiation were as follows:
The detection primers for StALS2 W561 site were:
Upon detection, the W561L editing event was found at the StALS2 W561 site of the resistance-screened T0 generation potato tissue culture seedlings, which demonstrated that the non-transgenic transient gene editing method provided by the present invention was suitable for crops such as potatoes from which an exogenous transgenic element can hardly be separated and removed by selfing or hybridization.
The HBB (hemoglobin subunit beta) gene in human embryonic kidney 293T cells (the DNA sequence thereof was shown in SEQ ID NO: 21, the CDS sequence thereof was shown in SEQ ID NO: 22, and the amino acid sequence thereof was shown in SEQ ID NO: 23) was selected, and the target sites for sequential targeting of sgRNA were designed in the region of the first exon. The first target was catggtgcaCctgactcctgAGG. The sgRNA that recognized this target was named sgHBB, and it was predicted that the deletion of one C base could be generated at the sgRNA cut site of this target. The second target was ccatggtgcatctgactctgAGG, which recognized the sequence with the deletion of one C base generated from the cutting/editing of the first target, and the sgRNA of this target was named sgHBB-c. An sgRNA with no target site in 293T cells was designed and named sgNOTAR, and it was used as a complementing plasmid for transfection in the experiment.
According to the above design, complementary single-stranded DNA fragments were synthesized respectively. After annealing, they were ligated into pX458 (Addgene: 48138) plasmids digested with BbsI enzyme, and transformed into E. coli DH5α competent cells. After the resultant E. coli single colonies were verified by sequencing, the plasmids were extracted and purified with an endotoxin-free plasmid extraction kit (Tiangen Bio).
The vigorously growing 293T cells were digested and isolated with 0.05% trypsin (Gibco), diluted with DMEM medium (10% fetal bovine serum; penicillin+streptomycin double antibiotics), inoculated into 24-well culture plates, and placed in a carbon dioxide incubator overnight. On the next day, the cells were transfected with one of the following plasmid mixtures: sequential cutting/editing: sgHBB and sgHBB-c plasmids, 0.5 μg each; single target cutting/editing: sgHBB and sgNOTAR plasmids, 0.5 μg each; no target control: pEGFP-c1 plasmid, 1 μg. The transfection was performed with Lipofectamine 3000 (Invitrogen). There were 3 replicates for each group.
48 hours after transfection, pictures were taken with a fluorescence microscope to record the transfection efficiency, and the total DNA of each well was extracted with a nucleic acid extraction kit (Omega).
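The layout of the three transfection groups can be written down as a small data structure, which makes explicit that every well receives the same total amount of plasmid DNA and that sgNOTAR fills the place of the second sgRNA plasmid in the single target group. This is a sketch under that reading of the amounts given above; the names are hypothetical.

```python
# Sketch only: names are hypothetical; amounts (in micrograms per well)
# and replicate number are those given in the text.
GROUPS_UG = {
    "sequential cutting/editing": {"sgHBB": 0.5, "sgHBB-c": 0.5},
    "single target cutting/editing": {"sgHBB": 0.5, "sgNOTAR": 0.5},
    "no target control": {"pEGFP-c1": 1.0},
}
REPLICATES = 3


def total_dna_ug(group: dict) -> float:
    """Total plasmid DNA per well for one group."""
    return sum(group.values())


def wells_needed(groups: dict, replicates: int) -> int:
    """Number of wells needed across all groups."""
    return len(groups) * replicates


def plasmid_totals_ug(groups: dict, replicates: int) -> dict:
    """Total amount of each plasmid needed across all wells."""
    totals: dict = {}
    for plasmids in groups.values():
        for name, amount in plasmids.items():
            totals[name] = totals.get(name, 0.0) + amount * replicates
    return totals


if __name__ == "__main__":
    for name, group in GROUPS_UG.items():
        print(f"{name}: {total_dna_ug(group)} ug per well")
    print("wells:", wells_needed(GROUPS_UG, REPLICATES))
    print("plasmid totals (ug):", plasmid_totals_ug(GROUPS_UG, REPLICATES))
```

With these values each group totals 1 μg of plasmid per well, and the three groups with 3 replicates each occupy 9 wells of the 24-well plate.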
The designed Hi-tom sequencing primers were as follows:
These primers were used to perform PCR for each DNA sample. The high-throughput sequencing of PCR products was performed by using Hi-tom method (Sci China Life Sci. 2019 January; 62(1):1-7. doi: 10.1007/s11427-018-9402-9).
The statistical data of the sequencing results were shown in Table 6 and Table 7. These data indicated that the sequential cutting/editing method at HBB site produced an editing outcome from C to T (resulting in P6S mutation) at a ratio of about 1.67%, while the single target cutting/editing method could not produce such base substitution.
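The relationship between the C to T outcome and the P6S mutation can be checked with the standard genetic code. The sketch below assumes that the atg within the first target sequence is the HBB start codon and that residue numbering counts the initiator methionine as residue 1, which is the numbering under which the CCT codon is position 6. The function names are hypothetical.

```python
# Sketch only: names are hypothetical. Standard genetic code, bases in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate complete codons from the start of the sequence."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitute(dna: str, codon_number: int, pos_in_codon: int, base: str) -> str:
    """Replace one base; codon_number is 1-based, pos_in_codon is 0-based."""
    idx = 3 * (codon_number - 1) + pos_in_codon
    return dna[:idx] + base + dna[idx + 1:]


# First target (protospacer plus PAM) as given in the text.
FIRST_TARGET = "catggtgcaCctgactcctgAGG"
# Assumption: the atg inside the target is the HBB start codon.
CODING = FIRST_TARGET[FIRST_TARGET.lower().find("atg"):].upper()

if __name__ == "__main__":
    print("original:", translate(CODING))
    # C to T at the first base of codon 6 (CCT to TCT)
    print("edited:  ", translate(substitute(CODING, 6, 0, "T")))
```

Under these assumptions, codon 6 is CCT (proline), and only a C to T change at its first base (TCT) gives serine; the same change at the second base (CTT) would give leucine. The same table also reads the TGG to TTG change at the OsALS548 site as W to L.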
In addition, sickle cell anemia and β-thalassemia are both inherited anemias caused by mutations in the HBB gene, which encodes the β subunit of adult hemoglobin. Patients with these diseases may need blood transfusion or other therapies throughout their lives. The results of the above experiment demonstrated that, through the technical solution of programmed sequential cutting/editing as provided by the present invention, a combination of crRNAs or sgRNAs could be designed for the mutation site of the HBB gene to induce the expected repair at the mutation site, so that the cell produces active hemoglobin and restores function, thereby achieving a therapeutic effect. That is, the composition provided by the present invention has a use in the treatment of diseases.
As shown by the variety of tests above, and because this novel method is based entirely on the existing functions of Cas9, the method of the present invention will be fully applicable to achieve the new functions of base substitution, deletion and insertion of specific fragments in other organisms (plants, animals, fungi or bacteria, etc.) where Cas9 can work well.
All publications and patent applications mentioned in the description are incorporated herein by reference, as if each publication or patent application is individually and specifically incorporated herein by reference.
Although the aforementioned invention has been described in more detail by way of examples and embodiments for clear understanding, it is obvious that certain changes and modifications can be implemented within the scope of the appended claims, and such changes and modifications are all within the scope of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
201911081617.X | Nov 2019 | CN | national |
202010821877.2 | Aug 2020 | CN | national |
202010974151.2 | Sep 2020 | CN | national |
This application is a national stage filing under 35 U.S.C. § 371 of International Application No. PCT/CN2020/120633, filed Oct. 13, 2020, and claims the priority to and benefits of Chinese Patent Application No. 201911081617.X, filed Nov. 7, 2019, Chinese Patent Application No. 202010821877.2, filed Aug. 15, 2020, and Chinese Patent Application No. 202010974151.2, filed Sep. 16, 2020, which are incorporated herein by reference in their entireties.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2020/120633 | 10/13/2020 | WO |