The present invention relates to a method for editing or modifying plant genomes, specifically, a nuclear genome, a mitochondrial genome, and a plastid genome.
In the selective breeding of higher plants, editing or modification of the nuclear genome is considered to be an effective method. In addition, the genomes present in plastids (including chloroplasts) and in mitochondria contain genes that play important roles, and editing the genomes contained in these organelles is also considered to be effective for the selective breeding of plants.
The plastid genome of higher plants has a size of about 150 kb and contains about 120 genes. These genes are associated with photosynthesis, antibiotic tolerance, herbicide tolerance, and the like. Among the plastid genes, for example, psbA, a key gene for the photosystem, and rbcL, which encodes a key enzyme for dark-reaction CO2 fixation, are important genes that carry out plant functions. It is expected that the improvement of these genes will contribute to the optimization of light energy utilization in plants, the enhancement of food production, bioethanol production and biomass production, the improvement of CO2 absorption and its utilization as a resource, and the like.
Gene transfer into the plastid genome has been performed for about 30 years. The advantages of gene transfer into the plastid genome are different from those of gene transfer into the nuclear genome. For example, since the plastid genome is maternally inherited, the spread of recombinant genes through pollen can be prevented. In addition, the expression of a desired gene product is relatively easy because gene silencing, which occurs during the genetic recombination of the nucleus, does not occur.
However, the transfer of foreign genes into the plastid genome is not so easy. Special equipment (e.g., particle gun) and culture techniques are required for the gene transfer into the plastid genome. Moreover, the number of plant species, into which gene transfer can be carried out, is limited, and even in the case of model plants such as Arabidopsis thaliana and rice, it is difficult to transfer foreign genes into the chloroplast genome thereof (Non Patent Literature 1 and Non Patent Literature 2). Although there are some successful examples (for example, Patent Literature 1, etc.), gene transfer into the plastid genome is still a difficult technique.
Furthermore, to date, there are no practical techniques for genome editing that modifies only a specific single nucleotide in the plastid genome. The use of transgenic plants produced by the aforementioned gene transfer is internationally regulated by the Cartagena Act. In contrast, in some cases, the Cartagena Act may not apply to the modification of only a single nucleotide in the plastid genome that is originally present in plants, although the treatment is different from country to country. Therefore, it has been desired to develop a technique of modifying only a specific single nucleotide in the plastid genome, instead of gene transfer into the plastid genome.
The plant mitochondrial genome encodes not only genes involved in electron transport system, ATP synthesis, mitochondrial gene translation, etc., but also encodes many open reading frames (ORFs) whose functions are unknown. Insufficient utilization and characterization of the plant mitochondrial genome is partially caused by the limited tools for modification of the plant mitochondrial genome and the difficulty in identifying a single nucleotide polymorphism (SNP) in the genome that affects agronomic traits as a result of the modification. To date, stable gene transfer into the mitochondrial genome by a particle gun method has been performed on two unicellular organisms, namely, green alga Chlamydomonas (Non Patent Literature 3) and yeasts (Non Patent Literatures 4 and 5). However, stable gene transfer into the mitochondrial genome of higher plants has not been successfully achieved so far.
Recently, Mok et al. bisected the cytidine deaminase (CD) gene of a Burkholderia cenocepacia DddA protein, fused a uracil glycosylase inhibitor (UGI) and the DNA-binding domain of TALE (transcription activator-like effector) with each of the obtained gene portions to create a protein, and then transiently expressed the protein in mammalian cells (Non Patent Literature 6). As a result, they succeeded in substituting the target C:G pair in the mitochondrial genome with a T:A pair. The conversion of the C:G pair to the T:A pair occurred in, at maximum, 50% of the mitochondrial genome in the cells.
Moreover, in order to replace the target base pair (conversion of C:G to T:A) in the mitochondrial genome of lettuce and rapeseed calli, Kang et al. applied the technique of Mok et al. and transiently expressed a fusion protein consisting of UGI and TALE in the lettuce and rapeseed calli. As a result, Kang et al. reported that the frequency of editing the mitochondrial genome is, at maximum, about 25% (Non Patent Literature 7).
As mentioned above, although the single nucleotide editing technique for plant genomes has been progressing year by year, its editing efficiency is still low at the present stage, and thus, further improvement of the technique is needed.
Under the aforementioned circumstances, it is an object of the present invention to provide a method for editing or modifying plant genomes, namely, a nuclear genome, a plastid (e.g., chloroplast) genome, and a mitochondrial genome in plants, and in particular, a method for editing or modifying a target single nucleotide with good accuracy and high efficiency.
The present inventors have conducted intensive studies on whether the technique reported by Mok et al. (Non Patent Literature 6) could be utilized for the editing of the nuclear genome, plastid genome, and mitochondrial genome of plants.
First, the present inventors have designed DNA-binding sequence TALE repeats used in the genome-editing enzyme TALEN (transcription activator-like effector nuclease), which recognizes 7 bp to 21 bp each before and after 10-20 bp containing a single nucleotide as a target of editing, and have then designed protein sequences (TALECD) by fusing the DNA-binding sequence TALE repeats with a half-split DddA Cytidine deaminase in each of the left and right pairs.
Subsequently, a nuclear transition (localization) signal (NLS) was added to these two proteins (nTALECD), a chloroplast transition (localization) signal was added to these two proteins (ptpTALECD), or a mitochondrial localization signal was added to these two proteins (mtpTALECD). Expression vectors for each protein (vectors that stably introduce DNA encoding each of the three types of peptide-added proteins into the nuclear genome) were constructed. These vectors were transformed into the nuclei of plant stem cells (DNA encoding each TALECD was incorporated into the plant nuclear genomic DNA, so that each of the above TALECDs can be expressed stably, not transiently). It was confirmed that the nTALECD, ptpTALECD, or mtpTALECD expressed from these three types of expression vectors migrates into the nucleus, chloroplast, or mitochondria, respectively, and edits the target single nucleotide (conversion of a C:G pair to a T:A pair).
The present inventors have found that, by using the above-described method for editing a plant genome according to the present invention, the target C:G pairs contained in the plant genome (nuclear genome, plastid genome, and mitochondrial genome) can be homoplasmically modified, namely, if taking the plastid genome as an example, almost all of the target C:G pairs in about 1000 copies or more of plastid genomes contained in a cell in the plant can be converted to T:A pairs.
Both plastids and mitochondria are organelles that arose through the intracellular symbiosis of free-living bacteria, and both retain their own genomic DNA. However, compared with the genome of mitochondria, which have been intracellular symbionts for a longer period of time, the plastid genome has a sequence and a structure that are more similar to those of bacteria. In addition, unlike the mitochondrial genome, the plastid genome has transcription, translation, and DNA replication/repair systems that are clearly of the bacterial type. Moreover, plant mitochondria duplicate and partially divert some of the enzymes of the DNA replication and repair system used in the plastid, and have their own hybrid-type system that differs from those of the plastid genome and the mammalian mitochondrial genome, which means that the three types of organellar genomes have three different styles. In fact, among the molecules identified as repair factors for plastid genomic DNA and mammalian mitochondrial genomic DNA, there are many completely different repair molecules. Therefore, the genomic DNA repairs and changes that appear after modification of mitochondrial and plastid genomic DNAs also differ (see Non Patent Literature 8, Non Patent Literature 9, etc.).
As described above, since the mitochondria in mammals and the plastids and mitochondria in plants are completely different intracellular organelles, editing techniques applicable to the mitochondrial genome in mammals are not necessarily applicable to the editing of the mitochondrial genome and the plastid genome in plants.
Accordingly, the aforementioned results “the target C:G pairs can be homoplasmically modified” can be said to be significant effects that could never be predicted from the results disclosed in Non Patent Literature 6 that “at most only about 50% of the target C:G pairs in mammalian cells were modified.” In addition, regarding the technique of editing a mitochondrial genome and a plastid genome in plants disclosed in Non Patent Literature 7, the single nucleotide modification percentages were about 25% and about 38%, respectively. Taking these results into consideration, it can be said that the method for editing a plant genome according to the present invention is extremely efficient compared with the method disclosed in Non Patent Literature 7.
Specifically, the present invention includes the following (1) to (6).
It is to be noted that the preposition “to” sandwiched between numerical values is used in the present description to mean a numerical value range including the numerical values located left and right of the preposition.
According to the method of the present invention, it is possible to modify a single nucleotide in a plant genome, specifically, in a nuclear genome, a plastid genome or a mitochondrial genome in a plant. Moreover, according to the method of the present invention, the target nucleotides in almost all copies of a nuclear genome, a plastid genome or a mitochondrial genome in a plant body can be modified.
Hereafter, the embodiments for carrying out the present invention will be described.
A first embodiment relates to a method for editing a plant genomic DNA, comprising converting a target nucleotide on the genomic DNA to another nucleotide.
In the present embodiment, the “plant genome” means a genome contained in the nucleus of a plant (nuclear genome), a genome contained in the plastid of a plant (plastid genome), or a genome contained in the mitochondria of a plant (mitochondrial genome). In addition, in the present embodiment, the “plastid” means an organelle present in the cells of plants, algae and the like, and the plastid performs anabolism such as photosynthesis, the storage of sugars, fats, etc., and the synthesis of various compounds. Examples of the “plastid” may include chloroplasts, leucoplasts, and chromoplasts.
Modification of a target nucleotide is not particularly limited, but it may be carried out using a nucleotide-modifying enzyme such as deaminase that is introduced into the nucleus, plastid, or mitochondria. Such an enzyme may be, for example, cytidine deaminase that converts the cytosine (C) in DNA to uracil (U). The enzyme is particularly preferably an enzyme that converts the C in double-stranded DNA to U, and it is, for example, a cytidine deaminase domain of DddA of Burkholderia cenocepacia (hereinafter referred to as “DddAtox”: SEQ ID NO: 35), or a protein substantially identical to DddAtox. In this context, the protein substantially identical to DddAtox is not particularly limited, and it is, for example, a protein comprising an amino acid sequence having an amino acid identity of 70% or more, preferably 80% or more, more preferably 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, and most preferably 99% or more, to the amino acid sequence as set forth in SEQ ID NO: 35, and having cytidine deaminase activity (the activity of converting the C in double-stranded DNA to U).
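The identity thresholds above presuppose some way of scoring a candidate protein against SEQ ID NO: 35. The description does not state which alignment method is used, so the following is only a minimal sketch under stated assumptions: the two sequences have already been aligned to equal length, gaps are written as "-", and identity is counted over the full alignment length. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumptions: gaps are written as '-' and count as mismatches;
    the denominator is the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(
    aligned_reference: str, aligned_candidate: str, minimum: float = 70.0
) -> bool:
    """True if the candidate reaches the chosen identity threshold (in %)."""
    return percent_identity(aligned_reference, aligned_candidate) >= minimum


# Toy 10-residue sequences (hypothetical; NOT SEQ ID NO: 35).
toy_reference = "MKTAYIAKQR"
toy_candidate = "MKTAYLAKQR"  # one substitution relative to the reference
toy_identity = percent_identity(toy_reference, toy_candidate)
```

Identity alone does not establish substantial identity in the sense used above; the candidate must also retain cytidine deaminase activity, which has to be confirmed experimentally.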
In order to specifically modify the target nucleotide of a nuclear genomic DNA, plastid genomic DNA, or mitochondrial genomic DNA in plants, it is necessary to allow a modifying enzyme such as deaminase (for example, cytidine deaminase) to recognize the target nucleotide. As a means therefor, there may be applied a method comprising: ligating a modifying enzyme to TALE (transcription activator-like effector) that binds to a genomic DNA around the target nucleotide (for example, within a range of 0 to 1000 nucleotides, preferably 5 to 100 nucleotides, and more preferably 5 to 50 nucleotides, from the target nucleotide); and then introducing the modifying enzyme-TALE fusion protein into the nucleus, plastid or mitochondria in plants. More specifically, for example, a DNA encoding such a modifying enzyme-TALE fusion protein may be introduced into a nuclear genomic DNA (may be incorporated into the nuclear genomic DNA), and thereafter, the modifying enzyme-TALE fusion protein expressed in the cytoplasm may be transported (introduced) into the nucleus, plastid, or mitochondria. In this case, it is desirable to introduce, into the nuclear genomic DNA, a DNA encoding a fusion protein formed by adding (binding) the appropriate type of signal peptide (a nuclear localization signal peptide, a plastid localization signal peptide, or a mitochondrial localization signal peptide) as described below to the modifying enzyme-TALE fusion protein.
As a method of transporting the modifying enzyme-TALE fusion protein into the nucleus, there can be applied a method which comprises fusing the modifying enzyme-TALE fusion protein with a nuclear localization signal/sequence (NLS) peptide, and then expressing the fused body. Examples of the nuclear localization signal peptide usable in the embodiment of the present invention may include, but are not limited to, an SV40 large T antigen NLS peptide (PKKKRKV, SEQ ID NO: 111), a nucleoplasmin NLS peptide (AVKRPAATKKAGQAKKKKLD, SEQ ID NO: 112), an EGL-13 NLS peptide (MSRRRKANPTKLSENAKKLAKEVEN, SEQ ID NO: 113), a c-Myc NLS peptide (PAAKRVKLD, SEQ ID NO: 114), and a TUS protein NLS peptide (KLKIKRPVK, SEQ ID NO: 115). Other usable nuclear localization signal peptides also exist; see, for example, NLSdb (https://rostlab.org/services/nlsdb/browse/signals), which is a database of nuclear localization signals.
As a method of transporting the modifying enzyme-TALE fusion protein into the plastid, there can be applied a method which comprises fusing the modifying enzyme-TALE fusion protein with a plastid localization signal peptide (a peptide that has neither a clear higher-order structure nor sequence homology, but is rich in basic amino acids and multiple hydrophobic amino acids, contains a few acidic amino acids, and exhibits the function of specifically sorting and transporting to chloroplasts or plastids by adding it to the N-terminus of the amino acid sequence of the protein), and then expressing the fused body. The plastid localization signal peptide usable in the embodiment of the present invention is preferably a signal peptide possessed by a protein localized in a plant plastid. Examples of a preferred signal peptide may include, but are not limited to, protein-derived signal peptides such as RECA1, RBCS, CAB, NEP, SIG1 to 5, and GUN2 to 5, nuclear-encoded chloroplast ribosomal protein-derived signal peptides such as RPL12 and RPS9, nuclear-encoded chloroplast tRNA aminoacyl transferase-derived signal peptides, nuclear-encoded chloroplast heat shock protein-derived signal peptides, protein-derived signal peptides such as FtsZ, FtsH, MinC, MinD, and MinE, nuclear-encoded chloroplast photosynthesis-related enzyme complex group-derived signal peptides, nuclear-encoded plastid lipid metabolism enzyme group-derived signal peptides, and nuclear-encoded thylakoid protein group-derived signal peptides. For the plastid localization signal peptides, see, for example, von HEIJNE et al., Eur. J. Biochem. 180, 535-545, 1989.
As a method of transporting the modifying enzyme-TALE fusion protein into the mitochondria, there can be applied a method which comprises fusing the modifying enzyme-TALE fusion protein with a mitochondrial localization signal peptide (a peptide that does not have a clear higher-order structure or sequence homology, but is characterized in that, for example, basic amino acids and multiple hydrophobic amino acids appear alternately), and then expressing the fused body. The mitochondrial localization signal peptide usable in the embodiment of the present invention may preferably be, for example, a signal peptide possessed by a protein localized in plant mitochondria. Examples of the preferred signal peptide may include, but are not limited to, an Arabidopsis thaliana ATPase δ′ subunit-derived signal peptide (MFKQASRLLS RSVAAASSKS VTTRAFSTEL PSTLDS, SEQ ID NO: 116), a rice ALDH2a gene product-derived signal peptide (MAARRAASSL LSRGLIARPS AASSTGDSAI LGAGSARGFL PGSLHRFSAA PAAAATAAAT EEPIQPPVDV KYTKLLINGN FVDAASGKTF ATVDP, SEQ ID NO: 117), a pea cytochrome c oxidase Vb-3-derived signal peptide (MWRRLFTSPH LKTLSSSSLS RPRSAVAGIR CVDLSRHVAT QSAASVKKRV EDVV, SEQ ID NO: 118), an Arabidopsis thaliana ATPase β subunit-derived signal peptide, a chaperonin CPN-60-derived signal peptide (Logan et al., Journal of Experimental Botany 50, 865-871, 2000), a rice ALDH signal peptide (Nakazono et al., Plant Physiology 124, 587-598, 2000), and a rice F1F0-ATPase inhibitor protein signal peptide (Nakazono et al., Planta 210, 188-194, 2000).
Alternatively, it is also possible to use a method which comprises directly introducing into a cell a plasmid DNA or mRNA encoding the modifying enzyme-TALE fusion protein, the modifying enzyme-TALE fusion protein itself, or the like (wherein examples of the introduction method may include a virus method, a particle gun method, a PEG method, and a cell membrane-penetrating peptide method).
In order to modify a target nucleotide in a plant genomic DNA with high probability, two modifying enzyme-TALE fusion proteins (for example, the TALE left and TALE right shown in
Moreover, when a full-length protein such as DddAtox is used as an enzyme for modifying the target sequence, if the direct use thereof affects the cells due to its toxicity, partial proteins prepared by dividing such a full-length protein at an appropriate position may be each fused with the aforementioned TALE left and TALE right, and each fusion protein may be then transferred into the plastid. The two partial proteins, which are obtained by dividing the full-length protein at the appropriate position, can be reassociated with each other at a stage in which they bind to the vicinity of the target nucleotide, and can exhibit desired activity (see the Examples). When DddAtox is used as a modifying enzyme, for example, the amino acid sequence of DddAtox as set forth in SEQ ID NO: 35 may be divided between any amino acids at positions 40 to 100, for example, between the amino acids at positions 44 and 45, or between the amino acids at positions 94 and 95.
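The division described above is a simple positional split, which can be sketched as follows. This is only an illustration under stated assumptions: positions are 1-based, the function name is hypothetical, and the 138-residue string is a placeholder whose length follows the position ranges given in this description (for example, positions 45 to 138). It is not the actual sequence of SEQ ID NO: 35.

```python
def split_protein(sequence: str, last_n_terminal_position: int) -> tuple[str, str]:
    """Split a protein between 1-based positions p and p + 1.

    Returns (N-terminal half, C-terminal half), where the N-terminal half
    covers positions 1..p and the C-terminal half covers p + 1..end.
    """
    p = last_n_terminal_position
    if not 1 <= p < len(sequence):
        raise ValueError("split position must leave two non-empty halves")
    return sequence[:p], sequence[p:]


# Placeholder of 138 residues (hypothetical; NOT SEQ ID NO: 35).
placeholder = "X" * 138

# Split between positions 44 and 45.
n_half_44, c_half_44 = split_protein(placeholder, 44)

# Split between positions 94 and 95.
n_half_94, c_half_94 = split_protein(placeholder, 94)
```

With a 138-residue input, the first split gives halves of 44 and 94 residues and the second gives halves of 94 and 44 residues; each half would then be fused to one of the two TALE proteins.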
Furthermore, the modifying enzyme-TALE fusion protein may be fused with other proteins that have functions to enhance the action of the fusion protein. An example of such other proteins may be a uracil glycosylase inhibitor (UGI). UGI inhibits the activity of uracil glycosylase, which removes U. Accordingly, when cytidine deaminase is used as a modifying enzyme, UGI plays a role of preventing the removal of the U that is converted from C, and of maintaining the modification by the cytidine deaminase-TALE fusion protein.
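The net outcome of deamination followed by retention of U can be illustrated at the sequence level. The sketch below is a simplified model and rests on assumptions: the genome is represented only by its top strand written 5′ to 3′, the U is read as T after replication, and the function name is hypothetical. If the deaminated C lies on the top strand, the top strand changes from C to T; if it lies on the complementary strand, the top strand changes from G to A.

```python
def edited_top_strand(top_strand: str, index: int) -> str:
    """Top-strand sequence after the C:G pair at `index` (0-based) becomes T:A.

    A 'C' on the top strand means the deaminated C is on the top strand (C -> T).
    A 'G' on the top strand means the deaminated C is on the complementary
    strand, so the top strand reads G -> A.
    """
    base = top_strand[index]
    if base == "C":
        new_base = "T"
    elif base == "G":
        new_base = "A"
    else:
        raise ValueError("the selected position is not part of a C:G pair")
    return top_strand[:index] + new_base + top_strand[index + 1:]


# Toy sequence (hypothetical), edited at two different positions.
toy = "ATCGA"
after_top_c = edited_top_strand(toy, 2)     # C on the top strand
after_bottom_c = edited_top_strand(toy, 3)  # C on the complementary strand
```

The model ignores editing windows, sequence-context preferences, and repair outcomes other than C:G to T:A.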
In the first embodiment, for example, if the aforementioned cytidine deaminase (CD), DddAtox, is used as a modifying enzyme, the target nucleotide C in a nuclear genomic DNA, a plastid genomic DNA and a mitochondrial genomic DNA can be converted to T, homoplasmically (a state in which the same mutations are kept in all of cells and tissues, or in plants). Therefore, the present invention provides an extremely useful means for improving plants.
A second embodiment relates to: a nuclear genome in which a target nucleotide in the nuclear genomic DNA of a plant is modified, a plastid genome in which a target nucleotide in the plastid genomic DNA of a plant is modified, or a mitochondrial genome in which a target nucleotide in the mitochondrial genomic DNA of a plant is modified, wherein the modification is carried out by the method for editing a plant genomic DNA according to the first embodiment; a nucleus having the nuclear genome, a plastid having the plastid genome, or mitochondria having the mitochondrial genome; a plant cell having the nuclear genome, the plastid genome or the mitochondrial genome; a cytoplasm of the plant cell; or a seed or a plant (an adult plant), comprising the plant cell.
The plant (adult plant) in the present embodiment includes not only generations (T0, or also, T1 depending on the plant type) that are differentiated from transformed cells, in which a target nucleotide in a nuclear genomic DNA, a target nucleotide in a plastid genomic DNA, or a target nucleotide in a mitochondrial genomic DNA is modified, but also includes generations of progenies obtained from T0/T1. In addition, the seeds in the second embodiment include not only seeds obtained from the above-described T0/T1 generations, but also include seeds obtained from the generations of progenies.
A third embodiment relates to a method for producing a plant having an edited plant genome, wherein the method comprises editing a plant genome by the method for editing a plant genomic DNA according to the first embodiment.
That is to say, the third embodiment relates to:
a method for producing a plant having an edited nuclear genome, wherein the method comprises editing a nuclear genome by the method for editing a plant genomic DNA according to the first embodiment:
The plants according to the first, second, and third embodiments are not particularly limited, and any plants may be applied as long as they are seed plants. Examples of the plants that can be used herein may include: gramineous plants, such as rice, wheat, corn, barley, rye, and sorghum; and cruciferous plants, for example, plants belonging to genus Alyssum, genus Arabidopsis (Arabidopsis thaliana, etc.), genus Armoracia (horseradish, etc.), genus Aurinia, genus Brassica (Chinese flat cabbage, mustard green, Brassica juncea, rapeseed, Brassica rapa ssp., hagoromokanran (kale), flowering kale, cauliflower, cabbage, brussels sprouts (komochikanran), broccoli, bok choy, turnip greens, mustard leaves, oilseed rape, Chinese cabbage, Japanese mustard spinach, turnip, etc.), genus Camelina, genus Capsella, genus Cardamine, genus Coronopus, genus Diplotaxis, genus Draba, genus Eruca (Rucola, etc.), genus Hesperis, genus Hirschfeldia, genus Iberis, genus Ionopsidium, genus Lepidium, genus Lobularia, genus Lunaria, genus Malcolmia, genus Matthiola, genus Nasturtium, genus Orychophragmus, genus Raphanus (Japanese radish, Raphanus sativus var. sativus, etc.), genus Rapistrum, genus Rorippa, genus Sisymbrium, genus Thlaspi, and genus Eutrema (Japanese wasabi mustard, etc.). Furthermore, other examples of the plants that can be used herein may include: solanaceous plants, such as tomato, potato, pepper, shishito pepper, and petunias; Asteraceae plants, such as sunflower and dandelion; Convolvulaceae plants, such as bindweed and sweet potato; araceous plants, such as konjak and taro (Colocasia esculenta); leguminous plants, such as soybeans, adzuki beans, and green beans; cucurbitaceous plants, such as pumpkin, cucumber, and melon; and amaryllidaceous plants, such as onion, green onion, and garlic.
The disclosures of all publications cited in the present description are incorporated herein by reference in their entirety. In addition, throughout the present description, when the description includes singular terms with the articles “a,” “an,” and “the,” these terms include not only single items but also multiple items, unless otherwise clearly specified from the context.
Hereinafter, the present invention will be further described in the following examples. However, these examples are only illustrative examples of the embodiments of the present invention, and thus, are not intended to limit the scope of the present invention.
A wild-type strain, Arabidopsis thaliana Columbia-0 strain (Col-0), and a genetically recombinant strain were cultivated at 22° C. under long-day conditions (light period: 16 hours; dark period: 8 hours). Col-0 seeds were seeded on a ½ MS medium (pH=5.7) containing Murashige-Skoog medium salt mixture (Wako, Japan) (2.3 g/L), MES (500 mg/L) and sucrose (10 g/L), and on a ½ MS medium containing Plant Preservative Mixture (Plant Cell Technology, USA) (1 mL/L), Gamborg's Vitamin Solution (Sigma-Aldrich, USA) (1 mL/L) and agar (8 g/L). One to two weeks after the seeding, the seedlings were transplanted to Jiffy-7 (Jiffy Products International B. V., Netherlands), and were then used in Agrobacterium-mediated transformation. In addition, several slow-growing T1 plants were subjected to a stratification treatment, and were then transplanted into plant boxes each containing a ½ MS medium at 23 days after stratification (DAS) (at 23 DAS).
TALE target sequences were designed using Old TALEN Targeter (https://tale-nt.cac.cornell.edu/node/add/talen-old), such that the sequences bind to both sides of a cytidine deaminase target region. A first nucleotide to be recognized needs to be on the 3′ side adjacent to T, as far as possible. The minimum length of the TALE target sequence was set to be 15 bp in order for TALE to bind in a sequence-specific manner. The TALE-binding sequences are shown below.
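The two design constraints stated above (a T immediately 5′ of the first recognized nucleotide, and a minimum site length of 15 bp) can be illustrated with a short sketch. This is not the algorithm of the TALEN Targeter tool; it is a simplified scan under stated assumptions, and the function names and the toy sequence are hypothetical. Sites for the opposite-strand TALE can be found by running the same scan on the reverse complement.

```python
MIN_SITE_LENGTH = 15  # minimum TALE target length used in this description

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Reverse complement of an upper-case DNA sequence (A, C, G, T only)."""
    return sequence.translate(_COMPLEMENT)[::-1]


def sites_preceded_by_t(
    sequence: str, length: int = MIN_SITE_LENGTH
) -> list[tuple[int, str]]:
    """List (start, site) pairs whose 5'-adjacent base is T.

    `start` is the 0-based index of the first recognized nucleotide.
    """
    if length < MIN_SITE_LENGTH:
        raise ValueError("site length is below the 15 bp minimum")
    return [
        (start, sequence[start:start + length])
        for start in range(1, len(sequence) - length + 1)
        if sequence[start - 1] == "T"
    ]


# Toy 18-nt sequence (hypothetical): only index 2 is preceded by T.
toy_region = "GT" + "ACGTACGTACGTACG" + "C"
toy_sites = sites_preceded_by_t(toy_region)
```

In practice the candidate sites would also be filtered by their distance from the cytidine deaminase target region and by specificity in the genome, which this sketch does not attempt.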
One pair of left and right ptpTALECDs (
The DNA binding domains of ptpTALECDs were assembled using Platinum Gate TALEN system (Sakuma et al., Scientific reports 3, 1-8, 2013.) (
Hereafter, CD half-UGI sequences and a RecA1 PTP sequence are shown.
“G1333C” is a protein consisting of the amino acids at positions 45 to 138 on the C-terminal side of the amino acid sequence of DddAtox as set forth in SEQ ID NO: 35. In addition, UGI (Uracil Glycosylase Inhibitor) consists of the amino acid sequence as set forth in SEQ ID NO: 36, and is ligated to the “G1333C” via a linker peptide (SEQ ID NO: 37) (hereinafter, the amino acid sequence of UGI and the linker peptide are the same as those described above).
“G1333N” is a protein consisting of the amino acids at positions 1 to 44 on the N-terminal side of the amino acid sequence of DddAtox as set forth in SEQ ID NO: 35.
“G1397C” is a protein consisting of the amino acids at positions 95 to 138 on the C-terminal side of the amino acid sequence of DddAtox as set forth in SEQ ID NO: 35.
“G1397N” is a protein consisting of the amino acids at positions 1 to 94 on the N-terminal side of the amino acid sequence of DddAtox as set forth in SEQ ID NO: 35.
“PTP” is a plastid transit peptide of Arabidopsis thaliana RECA1 (the amino acid sequence of PTP is as set forth in SEQ ID NO: 38).
Primer sequences used in vector construction are shown in the following Table 1.
I-1-4. Transformation of Plants and Screening of Transformants
Col-0 was transformed by a floral dip method (Clough et al., The Plant Journal 16, 735-743, 1998.) with the Agrobacterium tumefaciens strain C58C1 retaining one of the aforementioned transformation vectors. First, transgenic T1 seeds were selected using fluorescence from GFP as an indicator. GFP-positive seeds were seeded on a ½ MS medium containing 125 mg/L Claforan. On the other hand, GFP-negative seeds were seeded on a ½ MS medium containing 50 mg/L kanamycin and 125 mg/L Claforan.
Total DNA was extracted from the second true leaf of the selected seedlings, using the Maxwell (registered trademark) RSC Plant DNA Kit (Promega, USA). For genotyping of transgenic strains, the plastid DNA sequence regions around the cytidine deaminase target sequences were amplified using the following primer sets corresponding to the target genes. In order to detect substitution of the target nucleotide, the nucleotide sequences of the purified PCR products were determined by the Sanger method.
Using total DNA sequence data, single nucleotide polymorphisms (SNPs) in the plastid and mitochondrial genomes were determined. First, preparation of a paired-end (PE) library using Nextera XT DNA Library Prep Kit (Illumina) was entrusted to Macrogen Japan, and sequencing was then carried out using the Illumina NovaSeq 6000 platform. The 150 bp paired-end sequence reads were analyzed using Geneious Prime (Biomatters Ltd). The sequence reads were mapped to an Arabidopsis thaliana chloroplast genome sequence, and positions at which 50% or more of the reads differed from the reference chloroplast genome sequence were detected as SNPs; these are shown in the following Table 2.
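The 50% criterion above can be expressed as a simple per-position rule. The following sketch is not the Geneious Prime procedure; it assumes that read mapping has already been done and that the bases observed at each reference position are available as counts. The function name, the data layout, and the toy numbers are hypothetical.

```python
from collections import Counter


def call_snps(
    reference: str, pileup: list[Counter], min_fraction: float = 0.5
) -> list[tuple[int, str, str, float]]:
    """Report (1-based position, reference base, variant base, fraction).

    pileup[i] holds the counts of bases observed in reads covering
    reference position i. A variant is reported when its share of the
    reads at that position is at least `min_fraction`.
    """
    calls = []
    for index, (ref_base, counts) in enumerate(zip(reference, pileup)):
        depth = sum(counts.values())
        if depth == 0:
            continue  # no coverage, nothing to call
        for base, count in counts.items():
            fraction = count / depth
            if base != ref_base and fraction >= min_fraction:
                calls.append((index + 1, ref_base, base, fraction))
    return calls


# Toy data (hypothetical): three reference positions, ten reads each.
toy_reference = "ACG"
toy_pileup = [
    Counter(A=10),       # all reads match the reference
    Counter(C=4, T=6),   # 60% of reads carry T instead of C
    Counter(G=9, A=1),   # 10% variant reads, below the threshold
]
toy_calls = call_snps(toy_reference, toy_pileup)
```

Here only the second position is reported, as a C to T change supported by 60% of the reads.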
T2 seeds obtained from T1 plants corresponding to individual target genes were seeded on a ½ MS medium. Genotyping of 16S rRNA in the cotyledons of 7 DAS or 13 DAS seedlings was performed as in the case of the T1 plants. PCR for GFP was performed using the following primers.
T2 seeds derived from the T1 plants in which C5 of 16S rRNA was homoplasmically substituted at both 11 DAS and 23 DAS were seeded on a ½ MS medium containing 0, 10 or 50 mg/L spectinomycin. The phenotypes of germinated cotyledons were observed at 8 DAS.
Plant images were taken with an iPhone (registered trademark) Xs (Apple Inc., US) and a LEICA MC 170 HD (Leica, Germany). Gel images were taken with a ChemiDoc™ MP Imaging System (Bio-Rad, USA). The images were then processed with Adobe Photoshop 2021 (Adobe, USA).
The amino acid sequence of DddAtox as set forth in SEQ ID NO: 35 was divided between the 44th and 45th amino acids, or between the 94th and 95th amino acids, and the N-terminal or C-terminal side was linked to the C-terminus of a platinum TALE DNA-binding domain (Sakuma et al., Scientific reports 3, 1-8, 2013.) (pTALECD,
As described above, 12 types of ptpTALECD expression vectors (expression vectors targeting the three regions by four CD half combinations (see
Each expression vector was introduced into Arabidopsis thaliana, and at 23 DAS, the target region of T1 was sequenced by the Sanger method. Only the constructs, in which T1 was obtained, are shown in
In order to examine the stability of mutations in the growth process of individual plants, the nucleotide sequences of total DNAs extracted from the newborn leaves of T1 plants at 11 DAS and 23 DAS (or from the cotyledons of slow-growing plants at 11 DAS) were examined. At 11 DAS and 23 DAS, among plants having a nucleotide mutation in the target region, several plants retained the mutant nucleotide in a heteroplasmic or chimeric (h/c) form at both time points (30.0% of all plants, 15/50,
Subsequently, the off-target effect of ptpTALECD (substitution of non-target nucleotides) in the maternally inherited plastid and mitochondrial genomes was examined (the above Table 2). The total genome sequences of 14 T1 plants were determined (Novaseq, Illumina). In the 13 plants, most of the target nucleotides C were homoplasmically substituted with T (16S rRNA 1397C-1397N (1397CN) line 2, line 7, line 8, line 12, line 16, 1397N-1397C (1397NC) line 1, line 2, line 3: psbA 1397C-1397N (1397CN) line 6, 1397N-1397C (1397NC) line 1, line 5: and rpoC1 1397C-1397N(1397CN) line 16), while one remaining target (rpoC1 1397C-1397N (1397CN) line 3: see
T1 plants, which were transformed with the 16S rRNA-targeted ptpTALECD vector and in which the first Cp*(G5) and/or C10 were homoplasmically substituted, were all fertile, except for one plant (16S rRNA 1397C-1397N line 1). In order to examine whether or not the C to T substitution mutation is inherited by progenies, the genotyping of T2 plants of these three strains (16S rRNA 1397C-1397N line 2, line 8 and 1397N-1397C line 3) was performed (
G5 of the 16S rRNA gene corresponds to G, which is predicted to cause biological effects on E. coli 16S rRNA, and the substitution mutation of G in this E. coli 16S rRNA is known to confer spectinomycin resistance (Spmr). T2 seeds collected from T1 plants (16S rRNA 1397C-1397N line 2) in which G5 was homoplasmically substituted with A were seeded on a spectinomycin-containing medium. Regardless of the presence or absence of GFP fluorescence from the seeds, many of the seedlings germinated from these seeds showed spectinomycin resistance (
The above-described results demonstrated that ptpTALECD can introduce a target region-specific and homoplasmic C to T mutation into the plastid genome of Arabidopsis thaliana, and that this mutation is stably inherited by the offspring seeds (probably, following a maternal mode of inheritance).
Arabidopsis thaliana Col-0, otp87 (a homozygous T-DNA insertion line, GK-073C06-011724), and transformants were cultivated at 22° C. under long day conditions (a light period of 16 hours, and a dark period of 8 hours). The Col-0 seeds were seeded on a ½ MS-Agar plate (Non Patent Literature 7). Seedlings that were 2 to 3 weeks old were transferred to Jiffy-7 (Jiffy Products International), and were then infected with Agrobacterium. Mature plants of Col-0 and otp87 were transformed by the floral dip method (Clough et al., The Plant Journal 16, 735-743, 1998). The obtained T1 seeds were selected based on the seed-specific GFP fluorescence (Non Patent Literature 7: Shimada et al., Plant J. 61, 519-528, 2010). These T1 seeds were seeded on the above-described medium containing 125 mg/L Claforan. T1 plants were transplanted to Jiffy-7 at 23 DAS. OTP87 seeds (GABI_073C06) were obtained from ABRC Stock Center. The homozygosity of the OTP87 T-DNA insertion in the plants was confirmed by PCR (Hammani et al., J. Biol. Chem. 286, 21361-21371, 2011).
TALE-binding sequences are shown in
II-1-3. Genotyping of T1 and T2 Plants
PCR for Sanger sequencing (
Total DNA for NGS was extracted from mature leaves using the DNeasy Plant Pro Kit (QIAGEN). Preparation of paired-end libraries from 11 samples using the VAHTS Universal Pro DNA Library Prep Kit for Illumina (Vazyme, China) and sequencing of 5 Gbases per sample on the Illumina NovaSeq 6000 platform were performed at GENEWIZ Japan. Whole genome sequence data for performing SNP calling were obtained for 3 samples of wild-type plants and 8 samples of T2 plants (2 samples from each of 4 strains). As a pre-treatment of the analysis, low-quality sequences and adapter sequences contained in the reads were trimmed using PEAT [v1.2.4 (Li et al., BMC Bioinformatics, (BioMed Central, 2015), pp. 1-11)]. The paired-end reads of each strain were mapped to reference sequences (mitochondrial genome BK010421.1 and chloroplast genome AP000423.1) in a single-end mode, using BWA (v 0.7.12) (Li and Durbin, Bioinformatics 25, 1754-1760, 2009). Inappropriately mapped reads having a sequence identity of 97% or less or an alignment coverage percentage of 80% or less were eliminated using a filter. SNPs were called with the samtools mpileup command (-uf -d 50000 -L 2000) and the bcftools call command (-m -A -P 0.1 (Li et al., Bioinformatics 25, 2078-2079, 2009)). Finally, based on the allele frequency (AF) calculated by bcftools, SNPs with (AF of T1 sample)−(average AF of 3 wild-type plants)≥0.05 were detected as off-target SNP candidates, and many artifact SNPs derived from chloroplast genome sequences similar to those in NUMT and mitochondrial genomes were eliminated (
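The two numerical criteria described above (the read filter on sequence identity and alignment coverage, and the allele-frequency difference of 0.05 or more against the wild-type average) can be illustrated with a minimal Python sketch. The function names, the dictionary layout keyed by (position, alternative base), and the example values are hypothetical and chosen only for illustration; this is not the actual analysis script, which relied on BWA, samtools and bcftools as stated.

```python
from statistics import mean

# Thresholds taken from the text; everything else here is an illustrative assumption.
MIN_IDENTITY = 0.97   # reads with identity of 97% or less are eliminated
MIN_COVERAGE = 0.80   # reads with alignment coverage of 80% or less are eliminated
AF_DIFF_THRESHOLD = 0.05


def keep_read(identity, coverage):
    """Return True if a mapped read passes both filters described in the text."""
    return identity > MIN_IDENTITY and coverage > MIN_COVERAGE


def off_target_candidates(sample_af, wild_type_afs, threshold=AF_DIFF_THRESHOLD):
    """Return sites whose AF exceeds the wild-type average AF by >= threshold.

    sample_af     : dict mapping (position, alt_base) -> AF in the edited sample
    wild_type_afs : list of dicts with the same layout, one per wild-type plant;
                    a site missing from a wild-type dict is treated as AF 0.0
    """
    candidates = []
    for site, af in sorted(sample_af.items()):
        wt_mean = mean(wt.get(site, 0.0) for wt in wild_type_afs)
        if af - wt_mean >= threshold:
            candidates.append((site, round(af - wt_mean, 4)))
    return candidates


if __name__ == "__main__":
    # Hypothetical allele frequencies for one edited sample and three wild-type plants.
    sample = {(100, "T"): 0.98, (200, "A"): 0.03, (300, "T"): 0.30}
    wild_types = [
        {(300, "T"): 0.28},
        {(300, "T"): 0.29, (200, "A"): 0.01},
        {(300, "T"): 0.30},
    ]
    print(off_target_candidates(sample, wild_types))
```

In this hypothetical input, the site at position 300 has a high allele frequency in both the edited sample and the wild-type plants, so it is not reported; this mirrors how the subtraction of the wild-type average removes SNPs that are not caused by the editing.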
In order to predict the binding site of OTP87 in atp1, a PPR code was used (Takenaka et al., PLoS One 8, e65343, 2013: Yan et al., Nucleic Acids Research 47, 3728-3738, 2019). In this code, the combination of two important amino acid residues at positions 5 and 35 of each PPR repeat was used to calculate which nucleotides each PPR repeat was likely to recognize. The binding probability of each motif was depicted in the WebLogo (http://weblogo.berkeley.edu/) shown in
The photographs of plants were taken with a digital camera (OLYMPUS OM-D E-M5) and were then processed with Adobe Photoshop 2021.
The base pair, atp1-1178C, which corresponded to the RNA editing site of mitochondrial ATPase subunit 1 (atp1), was selected as a target for nucleotide editing. In wild-type plants, this C is post-transcriptionally converted to U on the RNA and is then translated. Accordingly, when evaluating the efficiency of single nucleotide substitution and its heritability, the substitution of C:G to T:A is not considered to have adverse effects on the plants. For the substitution of this target nucleotide, 4 types of vectors containing a cytidine deaminase (CD) domain that is located at the C-terminus of a Burkholderia cenocepacia DddA protein (1,427 amino acids: Non Patent Literature 6) were produced. As in the previous reports (Non Patent Literature 6: Non Patent Literature 7: Nakazato et al., Nat. Plants 7, 906-913, 2021: and Lee et al., Nat. Commun. 12, 1-6, 2021), the coding sequence of the CD domain was divided at the nucleotide immediately after the codon of Gly 1333 or Gly 1397. The sequences (N- and C-terminal sides) of the divided CD halves were each fused with the 3′ side of the DNA-binding domain sequence (hereafter referred to as pTALE) of platinum TALEN (Sakuma et al., Sci. Rep. 3, 1-8, 2013) that recognizes at maximum 21 nucleotides. In order to prevent the removal of uracil generated from cytosine, the sequence of pTALE-CD was fused with the 5′ side of the sequence of UGI (Non Patent Literature 6: and Mol et al., Cell 82, 701-708, 1995, pTALE-CD-UGI). The nucleotide sequences of CD and UGI are the same as those in the previous report (Nakazato et al., Nat. Plants 7, 906-913, 2021), and were optimized for the codon usage in Arabidopsis thaliana. The mitochondrial target signal sequence of the Arabidopsis thaliana ATPase delta prime subunit (Arimura et al., Plant J. 104, 1459-1471, 2020) was linked to the 5′ side of pTALE-CD-UGI (mtpTALECD:
In order to substitute the target C:G pair of the mitochondrial genome with a T:A pair, the nuclear genome of Arabidopsis thaliana was transformed with each vector by the floral dip method (Clough et al., Plant J. 16, 735-743, 1998). Total DNA from the leaves of T1 transformants was amplified by PCR, and the nucleotide sequences of the PCR products were determined by the Sanger method. Among the 78 T1-transformed plants examined (the number of transformants obtained with all of the four vectors), 36 plants had a substitution of C:G with T:A in the target window (
The T1 plants, in which a mutation had been detected by the first genotyping, were subjected to genotyping again using new primers.
In many transformants, the nucleotides in the target window appeared to be homoplasmically substituted (
In order to examine whether the type of the introduced mutation is changed during the developmental process of a plant, regarding each transformant, the sequences of PCR fragments obtained using total DNAs of different leaves at 11 DAS and 23 DAS as templates were determined by the Sanger method, and the types of mutations were then examined. A total of 76 mutant nucleotides were detected on at least one of these days (
In order to confirm whether or not the introduced mutations are inherited in the seed progenies, regarding each of the 4 T1 plants in which the C:G pair in the target window was homoplasmically substituted, T2 progenies of 13 plants were subjected to genotyping. All of the examined T2 plants inherited the parental homoplasmic mutation, regardless of whether they carried a mtpTALECD gene in the nucleus thereof (
In order to examine the off-target effects of mtpTALECD on the mitochondrial genome, T2 plants (
In these 8 plants, the coverage pattern of the entire mitochondrial genome was very similar to the coverage pattern of wild-type plants (
About 20% of the reads at the position of SNPs in the target window did not have any mutant nucleotides (
II-2-4. Complementation of Phenotypes of Ppr Mutants Using mtpTALECD
RNA editing is a feature of the mitochondrial and chloroplast genomes of land plants, in which the specific Cs of RNA molecules after transcription are converted to U. This is mediated by mitochondria-targeted PPR proteins encoded in the nucleus (Small et al., Plant J. 101, 1040-1056, 2020). In order to verify the usefulness of mtpTALECD in the molecular analysis of the mitochondrial genome, two experiments related to RNA editing were carried out. First, the otp87 mutant exhibiting growth retardation was examined. In wild-type plants, the PPR protein OTP87 converts 1178C of the atp1 transcript (C10 in the target window,
II-2-5. Recognition of atp1 by OTP87
In the second experiment, the atp1 sequence, to which OTP87 is predicted to bind, was examined (Takenaka et al., PloS One 8 e65343, 2013:
Arabidopsis thaliana Col-0 and transformants were cultivated at 22° C. under long day conditions (a light period of 16 hours, and a dark period of 8 hours). The Col-0 seeds were seeded on a ½ MS-Agar plate (Non Patent Literature 7). Seedlings that were 2 to 3 weeks old were transferred to Jiffy-7 (Jiffy Products International), and were then infected with Agrobacterium. Mature plants of Col-0 were transformed by the floral dip method (Clough et al., The Plant Journal 16, 735-743, 1998). The obtained T1 generation was analyzed.
Based on the construct of ptpTALECD (Nakazato et al., Nature Plants 7, 906-913, 2021), the chloroplast transition signal (PTP) was substituted with the SV40 nuclear localization signal (SV40NLS) to produce nTALECD. Target sequences were designed for the purpose of introducing stop codons or amino acid substitutions predicted to have a great influence on gene functions into two sites of each of three target loci, AtCYO1, AtPKT3, and AtMSH1, and a total of 6 constructs of nTALECD expression vectors corresponding to individual target sequences were produced, and were then transformed into Col-0 through infection with Agrobacterium by the floral dip method.
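The design goal described above, namely introducing stop codons by C to T substitution, can be illustrated with a minimal Python sketch that scans a coding sequence for codons in which a single C to T change on the sense strand, or a G to A change (that is, C to T on the antisense strand), creates a stop codon. The function name and the example sequence are hypothetical, and the sketch ignores the TALE binding sites, the position of the target window, and any sequence-context preference of the deaminase; it is an illustration under these assumptions, not the design procedure actually used.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_creating_edits(cds):
    """List single-base edits that turn a sense codon into a stop codon.

    Returns tuples (codon_index, original_codon, edited_codon, strand), where
    strand is "sense" for a C->T change read on the coding strand and
    "antisense" for a G->A change (C->T on the complementary strand).
    Codon indices are 0-based; a trailing incomplete codon is ignored.
    """
    cds = cds.upper()
    hits = []
    for start in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[start:start + 3]
        if codon in STOP_CODONS:
            continue  # already a stop codon, nothing to design
        for pos, base in enumerate(codon):
            if base == "C":
                new_base, strand = "T", "sense"
            elif base == "G":
                new_base, strand = "A", "antisense"
            else:
                continue
            edited = codon[:pos] + new_base + codon[pos + 1:]
            if edited in STOP_CODONS:
                hits.append((start // 3, codon, edited, strand))
    return hits


if __name__ == "__main__":
    # Hypothetical coding sequence: ATG CAA TGG CGA TAA
    for hit in stop_creating_edits("ATGCAATGGCGATAA"):
        print(hit)
```

Only four sense codons can be converted to a stop codon by one such substitution (CAA, CAG, CGA and TGG), which is why candidate sites for loss-of-function designs of this kind are limited in any given gene.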
PCR for Sanger sequencing was performed employing KOD One PCR Master Mix (Toyobo Co., Ltd.), using DNA roughly extracted from true leaves or cotyledons, according to standard protocols. Nucleic acid templates used in the PCR for Sanger sequencing were extracted using the Maxwell RSC Plant RNA Kit (Promega), without using DNase I included therewith. DNA in the extracted nucleic acids was decomposed with Deoxyribonuclease (RT Grade) for Heat Stop (Nippon Gene) to prepare RNA templates for RT-PCR. The RT-PCR was performed using PrimeScript™ II High Fidelity One Step RT-PCR Kit (TaKaRa). A portion of the mtpTALECD reading frame was amplified with primers, and a transformant was identified. Sequences around the target window of mitochondrial DNA and cDNA and their homologous sequences in the nuclear DNA were amplified. The purified PCR products were read by Sanger sequencing, and the data were analyzed by Geneious Prime (v. 2021.2.2).
The photographs of plants were taken with a digital camera (OLYMPUS OM-D E-M5) and were then processed with Adobe Photoshop 2021.
Representative examples of 11 DAS cyo1 mutant and wild type (
Since the cyo1 loss-of-function mutation is recessively inherited, it is suggested that the loss-of-function mutation has been introduced into many of the T1 plants, entirely (
The nucleotide sequence in the target sequence of CYO1 was sequenced by the Sanger method. As a result, it was confirmed that the nucleotide substitution of specific C in the nucleotide sequence occurred at a high efficiency (>40%), and that biallelic/homozygous mutants can be easily obtained in the T1 generation (
Subsequently, PKT3 and MSH1 were selected as target sequences different from CYO1, and the nucleotide sequences in the target windows of both alleles were sequenced by the Sanger method.
As a result, it was confirmed that the nucleotides C10 and C11 or G4 to G6 were edited (
The degree to which nucleotides other than the target nucleotide are edited, namely, the degree of off-target editing, was examined for single nucleotide substitutions carried out using the method of the present invention.
As a result, although off-target nucleotide substitutions occurred (TC→TT in all cases), the frequency thereof was low, and indels (insertion and/or deletion of the nucleotide sequence) were not observed around the target sequence (
By using the method of the present invention, single nucleotide editing of plant genomes (a nuclear genome, a plastid genome, and a mitochondrial genome) becomes possible. Therefore, plants modified by using the method of the present invention are expected to contribute to the enhancement of food production, the improvement of biofuel production, etc.
Number | Date | Country | Kind |
---|---|---|---|
2021-009001 | Jan 2021 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2022/002162 | 1/21/2022 | WO |
Number | Date | Country | |
---|---|---|---|
63285223 | Dec 2021 | US |