The invention relates to gene editing, preferably in a plant, to link a distal promoter to a gene of interest in order to modify the expression of said gene. The desired gene editing is preferably a deletion or inversion. More specifically, said invention relates to a method to enhance the expression of a fertility-restorer gene of interest in a plant.
The expression pattern and the strength of a promoter are key elements in the expression of genes. Genetic modification approaches highlighted the fact that the use of a strong constitutive promoter to drive the expression of a plant gene was necessary to boost the expression of a gene in order to improve a specific trait of interest. Other possibilities offered by genetic engineering involved the use of a promoter with a different pattern of expression in the plant or a promoter with a different pattern of expression throughout the plant development cycle.
Recent gene editing technologies offer researchers new possibilities to modify the expression of genes. It is now possible to activate or repress target genes by using inactivated endonucleases coupled with activator or repressor domains. Some technologies also target promoters to add specific domains within the regulatory regions of genes by homologous recombination. However, some of these gene editing technologies are relatively similar to technologies previously used to obtain GMOs, as they require the insertion of heterologous DNA within the plant genomes.
The use of gene editing to create deletions or inversions in the genome of plants or mammals has already been disclosed. For example, Cai et al. disclose a CRISPR/Cas9-mediated deletion of large genomic fragments in soybean, Durr et al. disclose deletions of gene clusters and non-coding regulatory regions in Arabidopsis using CRISPR/Cas9 for functional genomic studies, and Ordon et al. describe chromosomal deletions in dicotyledonous plants by gene editing. Relative to mammals, Korablev et al. mention deletions, inversions and duplications involving the Contactin-6 gene, using a CRISPR-Cas9 technology, to develop new mouse lines. Li et al. also disclose inversions and duplications of mammalian regulatory DNA elements and gene clusters by CRISPR/Cas9. The use of gene editing to create chromosome inversions has also been described, as in WO2020/117553, but only to reduce the expression of a gene of interest.
Therefore, deletions or inversions in a genome by gene editing have already been disclosed in the prior art. However, none of these deletions and/or inversions aims to operably link a distal endogenous promoter to a gene of interest.
Fertility-restorer (Rf) genes are essential today for the development of hybrids, especially in autogamous species like wheat. To avoid self-pollination, the plant that is used as the female parent has to be male sterile. The male parent is generally a fertility restorer line. The hybrid progeny of the cross will have normal restored pollen fertility.
It is a goal of breeders to move towards hybrid wheat since hybrid varieties usually outperform inbreds. Since wheat is hermaphroditic and largely autogamous, the production of hybrid seed requires systems to facilitate crossing and reduce the cost of hybrid seed production. One such system is the use of a male-sterile ‘female’ plant line crossed to a male-fertile line such that all the seed harvested from the female, male-sterile plants will be F1 hybrid seed. Male-sterile plants can be produced using cytoplasmic male sterility (CMS), where the female plant carries ‘defective’ mitochondria that often express novel ORFs leading to the production of no or defective pollen. Use of CMS systems for hybrid seed production requires that the male line used in the hybrid seed production cross carries a nuclear gene or genes that repair the defective mitochondria in the F1. This leads to full male fertility of the F1 plants that are grown by the farmer. These nuclear genes in the male line are referred to as CMS restorer genes. One potential CMS system for hybrid wheat production is that using T. timopheevii CMS (Bohra A, Jha U C, Adhimoolam P, et al (2016) Cytoplasmic male sterility (CMS) in hybrid breeding in field crops. Plant Cell Rep 35:967-993. doi: 10.1007/s00299-016-1949-3). A drawback of this system is that a combination of several restorer genes (Rf1, Rf3, Rf4 and Rf7) is required to give full male fertility to the F1. For the breeder this makes the system more complex to use since each male line has to be converted to contain 3 or 4 independently segregating restorer genes.
It is thus desirable to identify or create a single effective restorer locus.
In this context, the development of modified organisms which do not require the insertion of heterologous DNA within their genomes is of high interest, notably in plants. A method to obtain hybrid plant, more specifically hybrid wheat, which requires the use of only one fertility restorer gene is also of major importance.
It is therefore the object of this invention to provide a method to meet these needs. Preferably, it is an object of this invention to modify the expression of a gene of interest, more preferably a fertility restorer gene, to obtain hybrid plants useful for the seed industry.
The inventors have shown that the expression of a gene of interest can be modified, preferably enhanced, using gene editing to operably link a distal endogenous promoter, or part of an endogenous promoter, to said gene of interest.
Therefore, in a first aspect, it is disclosed herein a method to modify the expression of a gene of interest in an organism, said method comprising the steps of: a) introducing into said organism at least one gene editing system; b) allowing the said gene editing system to perform the desired editing at a target genomic site; and wherein the said gene editing system is designed so that, after the desired editing, the gene of interest is operably linked to an endogenous promoter, or part of an endogenous promoter, said endogenous promoter being distal from a naturally occurring promoter of the gene of interest in the genome of a non-modified organism.
In a specific embodiment, said organism is a eukaryotic organism, provided that the organism is not a human or an animal. In specific embodiments that may be combined with the previous embodiments, the organism is a plant, preferably a polyploid plant, such as wheat, oat, rapeseed, potato, sugar cane, . . . .
In a specific embodiment that may be combined with the previous embodiments, the performance of the desired editing as recited in step b) comprises two or more DNA breaks within said target genomic site.
In a specific embodiment that may be combined with the previous embodiments, the desired editing is a deletion or an inversion, the deletion preferably having a size from 10 kb to 10 Mb and the size of said inversion being from 10 kb to 100 Mb.
In a specific embodiment that may be combined with the previous embodiments, said deletion is a deletion of a genomic region comprising the full naturally occurring promoter or of a genomic region comprising only a part of the naturally occurring promoter. Alternatively, in a specific embodiment that may be combined with the previous embodiments, said inversion is an inversion of a genomic region resulting in the replacement of the naturally occurring promoter by said endogenous promoter or is an inversion of a genomic region resulting in the replacement of a part of the naturally occurring promoter by a part of said endogenous promoter.
In a specific embodiment that may be combined with the previous embodiments, said gene of interest is a fertility-restorer gene, preferably an RFL29 gene, more preferably represented by the coding sequence of SEQ ID NO:14. In another specific embodiment that may be combined with the previous embodiments, said fertility-restorer gene is an RFL79 gene, more preferably represented by SEQ ID NO:58.
In a specific embodiment that may be combined with the previous embodiments, said endogenous promoter drives a higher expression of the gene of interest, changes the pattern of expression of the gene of interest during the development cycle of a plant, changes the spatial pattern of expression of the gene of interest or is a promoter which is activated by biotic or abiotic stress.
In a specific embodiment that may be combined with the previous embodiments, the modification of the expression is an enhancement, said endogenous promoter being notably a strong promoter.
In a specific embodiment that may be combined with the previous embodiments, the gene editing system is designed to maintain the Kozak sequence of the gene of interest.
In a specific embodiment that may be combined with the previous embodiments, the break occurs in a promoter, in an untranslated region, in a gene-gene junction region, in an exon or in an intron, preferably in a promoter or in an untranslated region.
In a specific embodiment that may be combined with the previous embodiments, said gene editing system is chosen among a zinc finger nuclease (ZFN) gene editing system, a transcription activator-like effector nucleases (TALEN) gene editing system, a clustered regularly interspaced short palindromic repeats (CRISPR) gene editing system, or a meganuclease gene editing system, preferably a CRISPR gene editing system. In a more specific embodiment, said gene editing system comprises at least one enzyme chosen among: meganuclease, zinc-finger nuclease, transcription-activator like effector nuclease, CRISPR-nickase or CRISPR-nuclease.
In a specific embodiment that may be combined with the previous embodiments, the CRISPR gene editing system comprises multiple guide sequences capable of hybridizing to multiple target sequences within the target genomic site.
In a specific embodiment that may be combined with the previous embodiments, the method comprises the steps of:
It is also disclosed herein, as a second aspect, a method to enhance the expression of a fertility-restorer gene of interest, in a plant, said method comprising the steps of: a) introducing into said plant, at least one gene-editing system, preferably a CRISPR gene-editing system; b) allowing the said gene-editing system to perform the desired editing at a target genomic site; and wherein the said gene editing system is designed so that, after the desired editing, the fertility-restorer gene of interest is operably linked to an endogenous promoter, or part of an endogenous promoter, said endogenous promoter being distal from a naturally occurring promoter of the fertility-restorer gene of interest in the genome of a non-modified plant.
This second aspect is a particular embodiment of the first aspect, wherein the organism is a plant, more specifically a wheat plant. Consequently, all the specific embodiments disclosed above in relation to the first aspect also apply and may also be combined with this second aspect.
In a specific embodiment that may be combined with the previous embodiments, after the desired editing, the fertility-restorer gene of interest is operably linked to an endogenous promoter, said endogenous promoter being a strong promoter.
In a specific embodiment that may be combined with the previous embodiments, the fertility-restorer gene is RFL29, and preferably said endogenous promoter identified to replace the naturally-occurring promoter of RFL29 is chosen among PK-like promoter (notably PK-like promoter in T. aestivum represented by SEQ ID NO:10), RAE1-like promoter (notably RAE1-like promoter in T. aestivum represented by SEQ ID NO:11) or At5g02240-like promoter (notably At5g02240-like promoter in T. aestivum represented by SEQ ID NO:59).
In another specific embodiment, that may be combined with the previous embodiments, the fertility-restorer gene is RFL79.
In a specific embodiment that may be combined with the previous embodiments, the gene editing system is a CRISPR gene-editing system which is a delivery system comprising and operably configured to deliver into a plant cell, either CRISPR-Cas complex components, or one or more polynucleotide sequences comprising or encoding said components, wherein said CRISPR-Cas complex components, comprises:
The present disclosure also relates to a plant useful in the above-mentioned methods. It is thus disclosed, in a third aspect, a plant, or a plant part, comprising a gene editing system designed to perform a desired editing at a target genomic site, so that after the desired editing occurred, a gene of interest in the plant is operably linked to an endogenous promoter, or part of an endogenous promoter, said endogenous promoter being distal from a naturally occurring promoter of the gene of interest in the genome of a non-modified plant.
The present disclosure, in a fourth aspect, further relates to the plant as obtained with the above-mentioned methods, as well as a method to identify such a plant. Such identification method comprises a step of PCR after the desired editing was performed, in order to detect bands corresponding to the desired editing, notably inversion or deletion. To evaluate the result of the gene editing, such an identification method can also be used to detect the expression level of the gene of interest, as well as the level of the protein encoded by said gene. A phenotypic analysis can also be performed in order to determine if the modified plant has a fertility restoration phenotype.
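By way of illustration only, the PCR-based identification of a deletion can be reasoned about by comparing the expected amplicon length with and without the edit. The sketch below assumes a pair of primers flanking the two cleavage sites; the function name and all coordinates are hypothetical and are not taken from the present disclosure.

```python
def expected_amplicon_bp(fwd_start, rev_end, deletion=None):
    """Expected amplicon length in bp for primers flanking a target site.

    fwd_start: coordinate of the 5' end of the forward primer (hypothetical).
    rev_end:   coordinate just past the 5' end of the reverse primer.
    deletion:  optional (start, end) half-open interval removed by the edit.
    """
    length = rev_end - fwd_start
    if deletion is not None:
        del_start, del_end = deletion
        if not (fwd_start <= del_start < del_end <= rev_end):
            raise ValueError("deletion must lie between the two primers")
        length -= del_end - del_start
    return length


# Hypothetical coordinates: a 50 kb deletion with primers 200 bp outside each cut.
unedited = expected_amplicon_bp(99_800, 150_200)
edited = expected_amplicon_bp(99_800, 150_200, deletion=(100_000, 150_000))
print(unedited, edited)
```

Under these assumed coordinates, the unedited allele would give a product of about 50 kb, which is generally too long to amplify under standard PCR conditions, whereas the deleted allele would give a short band of 400 bp.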
The present disclosure, in a fifth aspect, also relates to a method to restore fertility in wheat, comprising a step of crossing a sterile wheat plant with a plant wherein the expression of a fertility-restorer gene has been enhanced according to the above-mentioned methods.
In the figures, the arrows represent the cleavage sites.
In specific embodiments of the disclosure, the organism is a eukaryotic organism, provided that the organism is not a human or other animal.
Preferably, the organism is a plant. As used herein, the term “plant” or “plants” refers to the entire plant but also to plant parts (cells, tissues or organs, seed pods, seeds, severed parts such as roots, leaves, flowers, pollen, etc.) and to progeny of the plants which retain the distinguishing characteristics of the parents (i.e. the modification of the expression of the gene of interest), such as seed obtained by selfing or crossing, e.g. hybrid seeds (obtained by crossing two inbred parent plants); hybrid plants and plant parts derived therefrom are encompassed herein, unless otherwise indicated. The plant can be a monocotyledonous or a dicotyledonous plant.
In specific embodiments, said plant is a polyploid plant such as wheat, oat, rapeseed, potato, sugar cane, . . . .
As used herein, the term “wheat” refers to a plant of the genus Triticum, as for example T. aestivum, T. aethiopicum, T. araraticum, T. boeoticum, T. carthlicum, T. compactum, T. dicoccoides, T. dicoccon, T. durum, T. ispahanicum, T. karamyschevii, T. macha, T. militinae, T. monococcum, T. polonicum, T. spelta, T. sphaerococcum, T. timopheevii, T. turanicum, T. turgidum, T. urartu, T. vavilovii, T. zhukovskyi Faegi. Preferably, the wheat plant is T. aestivum. The term “wheat” also encompasses plants of the genus Aegilops and Triticale.
As used herein, the term “oat” refers to a plant of the genus Avena, as for example A. sativa.
As used herein, the term “rapeseed” refers to a plant of the genus Brassica, as for example B. napus, B. juncea and B. rapa; preferably B. napus.
As used herein, the term “potato” refers to a plant of the genus Solanum, as for example Solanum tuberosum.
As used herein, the term “sugar cane” refers to a plant of the genus Saccharum, as for example S. officinarum, S. spontaneum, S. robustum, S. sinense and S. barberi.
A first aspect of the present disclosure relates to a method to modify the expression of a gene of interest in an organism, said method comprising the steps of:
As used herein, the term “modify the expression of a gene” refers to an enhancement of the expression, or a modification of an expression pattern. In a preferred embodiment that may be combined with the previous embodiments, said modification is an enhancement. In such a case, the endogenous promoter can be considered as a strong promoter. In a specific embodiment, the gene expression modification can include modification at a precursor mRNA level, at a mature mRNA level or at translation level.
As used herein, the term “enhancement of the expression” means that the gene of interest is more expressed compared to a non-modified organism. Preferably the gene expression is increased by at least 2 fold, preferably between 2 and 100 fold, such as, at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100. The term “enhancement of the expression” also means that the gene of interest is expressed when operably linked to the endogenous promoter. Indeed, in this specific embodiment, the gene of interest is not expressed in a non-modified organism, but gene expression of said gene of interest can be detected in a modified organism, after the desired editing. Relative gene expression can be measured by a q-RT-PCR method. Gene expression can also be measured by RNA-Seq.
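As an illustration of how a fold enhancement could be derived from q-RT-PCR data, the following minimal sketch uses the widely used 2^-ΔΔCt calculation. The choice of this calculation, the function names and the Ct values are assumptions made for illustration and are not taken from the present disclosure; the calculation also assumes an amplification efficiency close to 100% and a stably expressed reference gene.

```python
def fold_change_ddct(ct_gene_edited, ct_ref_edited, ct_gene_control, ct_ref_control):
    """Relative expression of the gene of interest (edited vs non-modified).

    Each argument is a q-RT-PCR threshold cycle (Ct); 'ref' denotes a
    reference (housekeeping) gene used for normalisation.
    """
    delta_edited = ct_gene_edited - ct_ref_edited      # normalised Ct, edited plant
    delta_control = ct_gene_control - ct_ref_control   # normalised Ct, non-modified plant
    ddct = delta_edited - delta_control
    return 2.0 ** (-ddct)


def is_enhanced(fold, threshold=2.0):
    """True if the fold change reaches the chosen threshold (here at least 2 fold)."""
    return fold >= threshold


# Hypothetical Ct values: the gene of interest is detected 4 cycles earlier after editing.
fold = fold_change_ddct(22.0, 18.0, 26.0, 18.0)
print(fold)               # 16.0
print(is_enhanced(fold))  # True
```

This calculation is not applicable when the gene of interest is not expressed at all in the non-modified organism, since no control Ct value is then available; in that case, detection of expression after editing is itself the readout.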
As used herein, the term “modification of an expression pattern” means that the gene is expressed under different conditions by comparison with a non-modified organism, for example at a different time during the development cycle of a plant, in a different tissue, in presence of biotic or abiotic stress, . . . .
As used herein, the term “gene of interest” refers to an endogenous gene of the organism whose modification of the expression confers a desired characteristic to the organism, such as improved performance in the fields, improved performance in an industrial process, improved nutritional value, or improved reproductive capability. The gene of interest can encode a protein of interest, but can also encode a functional RNA of interest, such as antisense RNA, rRNA, tRNA, . . . . The gene of interest can be a gene which is not expressed in a non-modified organism and which becomes expressed when operably linked to an endogenous promoter, or part of an endogenous promoter. The gene of interest can be dominant, recessive or semi-dominant. As used herein, “expression of a gene” means that the coding sequence of the gene is transcribed, and optionally translated.
As used herein, the term “operably linked” means that the gene of interest is linked to said endogenous promoter in a manner that allows for expression of the gene of interest.
As used herein, the term “introducing” means that the gene editing system penetrates into the cell of the organism, so that the system can enter the nucleus for targeted gene editing.
As used herein, the term “allowing the said gene-editing system to perform the desired editing at a target genomic site” means that two or more DNA breaks are made within the target genomic site, and that the strand cuts are then repaired. In other words, it also means “making two or more DNA breaks and then selecting for an organism (more specifically a cell) wherein the target genomic site has been edited” (in such edited cell, the gene of interest is operably linked to an endogenous distal promoter, or part of an endogenous distal promoter, whereas in non-edited cell, the gene of interest is operably linked to its naturally occurring promoter).
As used herein, the term “target genomic site” refers to the genomic region wherein the gene editing occurs, i.e. the genomic region between the two cleavage sites, or the genomic region between the most remote cleavage sites if there are more than two cleavage sites. As used herein, the term “cleavage site” corresponds to genomic DNA region which comprises a sequence recognized by a specific enzyme used in gene editing system, such as nickase or nuclease, which cleave the genomic DNA region in one or both strands.
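The definition of the target genomic site as the region between the most remote cleavage sites can be sketched as follows. This is a minimal illustration; the function names and the coordinates are hypothetical.

```python
def target_genomic_site(cleavage_positions):
    """Return (start, end) of the target genomic site.

    The site spans the region between the two cleavage sites, or between
    the most remote cleavage sites when more than two are provided.
    """
    if len(cleavage_positions) < 2:
        raise ValueError("at least two cleavage sites are required")
    return min(cleavage_positions), max(cleavage_positions)


def site_size_bp(site):
    """Size of the target genomic site in base pairs."""
    start, end = site
    return end - start


# Hypothetical cleavage coordinates on one chromosome (three guides).
site = target_genomic_site([1_250_000, 1_200_000, 1_900_000])
print(site, site_size_bp(site))  # (1200000, 1900000) 700000
```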
As used herein, the term “gene of interest is operably linked to an endogenous promoter, or part of an endogenous promoter” means that the coding sequence is under the transcriptional control of the endogenous promoter or part of endogenous promoter. In other words, the promoter, or part, controls the expression of the gene of interest. The said promoter sequence does not need to be contiguous with the sequence of the gene of interest, as long as the promoter still controls the expression of the gene (e.g. a transcribed sequence which is not translated, can be interposed between the promoter and the coding sequence of the gene of interest). In a preferred embodiment, after the gene editing occurred, the endogenous promoter is contiguous to the gene of interest.
As used herein, the term “endogenous promoter” means that the promoter is native to the organism.
As used herein, the term “the said promoter distal from a naturally occurring promoter of the gene of interest” means that the endogenous promoter is distant from the gene of interest in a non-modified organism. The distance between the naturally occurring promoter of the gene of interest and the distal promoter can be kilobase-sized or megabase-sized. Said distal promoter can be located 5′ or 3′ of the naturally occurring promoter. In other words, the endogenous promoter is distal compared to the naturally occurring promoter, which is a proximal promoter. As used herein, the term “proximal” refers more specifically to a nucleotide sequence upstream (5′) of the coding sequence of the gene, generally from 1 base to about 500 bases from the start site. The term “from 1 base to 500 bases” means every value of this range, even if not explicitly recited, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500.
As used herein, the term “distal promoter” refers also to a promoter which is distant from the gene of interest in a non-modified organism, and which regulates the expression of a gene different from the gene of interest in a non-modified organism. It is only after the gene-editing that the said distal promoter regulates the expression of the gene of interest. In other words, in a non-modified organism, the gene of interest is operably linked to its naturally occurring promoter and not to the distal endogenous promoter or part of it, whereas after the gene-editing the gene of interest is operably linked to the distal endogenous promoter or part of it.
In a specific embodiment of the disclosure that may be combined with the previous embodiments, the desired editing is a deletion or an inversion.
As used herein, “deletion” means that the genetic material which represents the target genomic site is lost from a chromosome. According to a specific embodiment of the disclosure that may be combined with the previous embodiments, the deletion is a deletion of a genomic region comprising the full naturally occurring promoter. In this embodiment, the full naturally occurring promoter is replaced either by the full distal endogenous promoter or by a functional part of the full distal endogenous promoter. Alternatively, said deletion is a deletion of a genomic region comprising only a part of the naturally occurring promoter. In this last embodiment, the gene editing thus creates a chimeric promoter containing two parts: one part derived from the naturally occurring promoter and one part derived from the endogenous promoter, with the proviso that such chimeric promoter is a functional promoter.
As used herein, “inversion” means that the genomic region is reversed end to end within the target genomic site. In other words, it corresponds to a rearrangement within a single chromosome. According to a specific embodiment of the disclosure that may be combined with the previous embodiments, the inversion is an inversion of a genomic region resulting in the replacement of the naturally occurring promoter by said endogenous promoter. Alternatively, said inversion is an inversion of a genomic region resulting in the replacement of a part of the naturally occurring promoter by a part of said endogenous promoter. In this last embodiment, the gene editing thus creates two chimeric promoters, each containing two parts: one part derived from the naturally occurring promoter and one part derived from the endogenous promoter. In an embodiment, the two chimeric promoters can be functional, and in another embodiment only one chimeric promoter is functional: the one which is operably linked to the gene of interest.
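The effect of a deletion or an inversion on the order of sequence elements can be illustrated on a toy sequence. The sketch below is a simplified string model, not a description of the repair process; the sequences, coordinates and function names are hypothetical, and the inverted segment is modelled as the reverse complement of the original segment.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def apply_deletion(chrom, cut1, cut2):
    """Remove chrom[cut1:cut2] and rejoin the two flanking ends."""
    return chrom[:cut1] + chrom[cut2:]


def apply_inversion(chrom, cut1, cut2):
    """Reverse chrom[cut1:cut2] end to end (as its reverse complement) in place."""
    return chrom[:cut1] + reverse_complement(chrom[cut1:cut2]) + chrom[cut2:]


# Toy chromosome: distal promoter | intervening region (with native promoter) | gene.
distal, intervening, gene = "AAAA", "CCGGTT", "ATGAAC"
chrom = distal + intervening + gene

deleted = apply_deletion(chrom, 4, 10)
print(deleted)   # AAAAATGAAC: the distal promoter is now contiguous to the gene

inverted = apply_inversion(chrom, 4, 10)
print(inverted)  # AAAAAACCGGATGAAC: same length, intervening segment reversed
```

In this toy model a deletion shortens the chromosome by the size of the target genomic site, whereas an inversion keeps its length unchanged, and applying the same inversion twice restores the original sequence.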
In a specific embodiment that may be combined with the previous embodiments, the size of said deletion is from 10 kb to 10 Mb. In a specific embodiment that may be combined with the previous embodiments, the size of said inversion is from 10 kb to 100 Mb. As used herein, “from 10 kb to 10 Mb” or “from 10 kb to 100 Mb” mean every value of this range, even if not explicitly recited, for example from 100 kb to 10 Mb, from 1000 kb to 10 Mb, from 100 kb to 1 Mb, from 1000 kb to 1 Mb, from 10 kb to 5 Mb, from 100 kb to 5 Mb, from 200 kb to 10 Mb, from 300 kb to 10 Mb, from 400 kb to 10 Mb, from 500 kb to 10 Mb, from 600 kb to 10 Mb, from 700 kb to 10 Mb, from 800 kb to 10 Mb, from 900 kb to 10 Mb, from 200 kb to 5 Mb, from 300 kb to 5 Mb, from 400 kb to 5 Mb, from 500 kb to 5 Mb, from 600 kb to 5 Mb, from 700 kb to 5 Mb, from 800 kb to 5 Mb, from 900 kb to 5 Mb, from 1 Mb to 10 Mb, from 2 Mb to 10 Mb, from 3 Mb to 10 Mb, from 4 Mb to 10 Mb, from 5 Mb to 10 Mb, from 6 Mb to 10 Mb, from 7 Mb to 10 Mb, from 8 Mb to 10 Mb, from 9 Mb to 10 Mb, from 10 kb to 9 Mb, from 10 kb to 8 Mb, from 10 kb to 7 Mb, from 10 kb to 6 Mb, from 10 kb to 5 Mb, from 10 kb to 4 Mb, from 10 kb to 3 Mb, from 10 kb to 2 Mb, from 10 kb to 1 Mb, or from 100 kb to 100 Mb, from 1000 kb to 100 Mb, from 10 kb to 1 Mb, from 10 kb to 10 Mb, from 10 kb to 20 Mb, from 10 kb to 30 Mb, from 10 kb to 40 Mb, from 10 kb to 50 Mb, from 10 kb to 60 Mb, from 10 kb to 70 Mb, from 10 kb to 80 Mb, from 10 kb to 90 Mb, from 100 kb to 20 Mb, from 100 kb to 30 Mb, from 100 kb to 40 Mb, from 100 kb to 50 Mb, from 100 kb to 60 Mb, from 100 kb to 70 Mb, from 100 kb to 80 Mb, from 100 kb to 90 Mb, from 1 Mb to 20 Mb, from 1 Mb to 30 Mb, from 1 Mb to 40 Mb, from 1 Mb to 50 Mb, from 1 Mb to 60 Mb, from 1 Mb to 70 Mb, from 1 Mb to 80 Mb, or from 1 Mb to 90 Mb.
In a specific embodiment of the disclosure that may be combined with the previous embodiments, the endogenous promoter drives a higher expression of the gene of interest, changes the pattern of expression of the gene of interest during the development cycle of a plant, changes the spatial pattern of expression of the gene of interest or is a promoter which is activated by biotic or abiotic stress. Abiotic stress is for example light or drought stress. Biotic stress is for example viral, bacterial or fungal invasion of the plant.
According to the disclosure, gene editing is performed to create at least two breaks at the target genomic site. As used herein, the term “break” refers to a cleavage on both DNA strands. A double-strand break can result from two separate single-strand breaks, one on each DNA strand. Accordingly, performing the desired editing at the target genomic site requires, on each side of the target genomic site, either one double-strand break or two single-strand breaks (one on each DNA strand).
In a specific embodiment, the at least two breaks occur in a promoter, in an untranslated region, in a gene-gene junction region, in an exon or in an intron. The term “untranslated region” preferably refers to a 5′UTR region. For example, one of the breaks can occur upstream of the initiating methionine codon (ATG) of the coding sequence of the gene of interest. In a specific embodiment, the gene-editing system is designed to maintain the Kozak sequence of the gene of interest.
The gene editing systems which can be used are preferably chosen among a zinc finger nuclease (ZFN) gene editing system, a transcription activator-like effector nuclease (TALEN) gene editing system, a clustered regularly interspaced short palindromic repeats (CRISPR) gene editing system, or a meganuclease gene editing system. Such a gene editing system comprises at least one enzyme, preferably chosen among: meganuclease, zinc-finger nuclease, transcription-activator like effector nuclease, CRISPR-nickase or CRISPR-nuclease. Such enzymes may be a wild-type protein, or a mutant with the proviso that the enzyme still possesses its nuclease or nickase activity. In a more preferred embodiment, the enzyme is chosen among Cas3, Cas9, Cas12a, or dCas9-FokI, dCpf1-FokI, chimeric FEN1-FokI, a nickase Cas9, chimeric dCas9 non-FokI nuclease and dCpf1 non-FokI nuclease.
According to the disclosure, the most preferred gene editing system is a CRISPR gene editing system. As used herein, the term “CRISPR gene editing system”, which can also be called “CRISPR-Cas system” in an embodiment, refers to a system which relies on two components: (i) a guide sequence, which is a specific RNA sequence that hybridizes to a target DNA region and directs the enzyme (the ‘CRISPR-associated protein’) to the target DNA region to perform the gene editing, and (ii) the CRISPR-associated protein, which is a non-specific nuclease or nickase.
As used herein, the term “guide sequence” refers to a “guide RNA”, also called “gRNA”. Preferably, said guide sequence comprises a crRNA and optionally a tracrRNA. The “CRISPR RNA”, also called crRNA, is a 17-20 nucleotide sequence complementary to the target DNA, and the tracrRNA is a binding scaffold for the Cas9 nuclease. For Cas12a, the guide RNA comprises only the crRNA.
The crRNA and the tracrRNA may be present on the same molecule or may be present on two physically distinct molecules. According to the disclosure, the guide sequence may refer to a single guide, so that the crRNA and the tracrRNA form a single molecule.
In specific embodiments of the disclosure, the CRISPR system may comprise multiple guide sequences capable of hybridizing to multiple target sequences within the target genomic site.
Preferably, the said “CRISPR-associated protein” binds to the target DNA region only in the presence of a specific sequence, called the protospacer adjacent motif (PAM), on the non-targeted DNA strand. Depending on the nuclease, the cut occurs 3-4 nucleotides upstream of the PAM sequence (as is typical for Cas9) or around 18-19 nucleotides downstream of the PAM sequence (as is typical for Cas12a). Therefore, in a preferred embodiment, each end of the target genomic site (i.e. each end of the target DNA region) is adjacent to a Protospacer Adjacent Motif (PAM) recognized by a CRISPR-associated protein, such as 5′NGG3′ (with N representing either A, T, C or G) for Cas9 of Streptococcus pyogenes or a T-rich PAM for Cas12a such as 5′TTN or 5′TTTN (with N representing either A, T, C or G). In this embodiment, the locations in the genome that can be targeted by the CRISPR-associated proteins are limited by the locations of these PAM sequences.
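The constraint that PAM locations place on candidate cleavage sites can be illustrated by scanning a sequence for 5′NGG3′ motifs. The following minimal sketch considers only the strand given as input and assumes a 20-nucleotide protospacer and a cut 3 nucleotides upstream of the PAM; the function name and the example sequence are hypothetical, and a practical guide design would also scan the opposite strand and assess off-target sites.

```python
import re


def find_cas9_sites(seq, protospacer_len=20):
    """List candidate Cas9 sites (5'-NGG-3' PAM) on the given strand.

    For each PAM, returns the protospacer located immediately 5' of the PAM
    and 'cut_index', the 0-based index of the first base after the cut,
    placed 3 nucleotides upstream of the PAM.
    """
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping PAM motifs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < protospacer_len:
            continue  # not enough sequence 5' of the PAM for a full protospacer
        sites.append({
            "protospacer": seq[pam_start - protospacer_len:pam_start],
            "pam": match.group(1),
            "cut_index": pam_start - 3,
        })
    return sites


# Hypothetical sequence: a 20-nt protospacer followed by a TGG PAM.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
for site in find_cas9_sites(example):
    print(site["protospacer"], site["pam"], site["cut_index"])
```

With this example sequence a single site is reported, with the PAM starting at index 20 and the cut placed at index 17.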
In specific embodiments of the disclosure that may be combined with the previous embodiments, the CRISPR system may comprise at least one nuclear localization signal (NLS). Preferably, said CRISPR-associated protein, more preferably the nuclease, is operably linked to the NLS.
According to the disclosure, several strategies are available to deliver the CRISPR gene editing system: a DNA delivery format, a RNA delivery format or a ribonucleoprotein delivery format.
In the DNA delivery format, the system is provided as one or several DNA molecules which enter the cell and translocate to the nucleus, where the sequences encoding the CRISPR-associated protein and the guide sequence are transcribed. After translation of the CRISPR-associated protein, said CRISPR-associated protein, preferably a nuclease, and the guide sequence assemble in the cell cytoplasm to form a ribonucleoprotein (RNP) complex. Next, said RNP complex enters the nucleus to perform the desired gene editing.
In the RNA delivery format, the system is provided as one or several RNA molecules. The sequences comprising the guide RNA and the mRNA corresponding to the CRISPR-associated protein are co-transfected into the cell cytoplasm. The mRNA is then translated to produce a CRISPR-associated protein, preferably a nuclease. Then, an RNP complex is formed and it enters the nucleus to perform the desired gene editing.
In the ribonucleoprotein delivery format, an RNP complex comprising a guide RNA and a CRISPR-associated protein is provided. Next, said RNP complex enters the nucleus to perform the desired gene editing.
In a specific embodiment of the disclosure, the CRISPR gene editing system then comprises a delivery system comprising, and operably configured to deliver into a eukaryotic cell, either CRISPR-Cas complex components or one or more polynucleotide sequences comprising or encoding said components, wherein said CRISPR-Cas complex components comprise:
As used herein, the term “CRISPR-Cas complex components” refers to the components required to form an RNP complex (i.e. at least one guide sequence and a functional nuclease or a nickase, preferably a nuclease). Preferably, the CRISPR gene editing system is a CRISPR-Cas vector system encoding a CRISPR-Cas complex, said vector system comprising:
As used herein, the term “vector” refers to a DNA sequence in which it is possible to insert fragments of foreign nucleic acid, the vectors making it possible to introduce foreign DNA into the eukaryotic cell. Examples of vectors are plasmids, cosmids, yeasts, yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), artificial chromosomes derived from the P1 bacteriophage (PACs), and viral vectors such as lentiviral, baculoviral or adeno-viral, adeno-associated viral vectors or plant viral vectors. In a specific embodiment, the CRISPR-Cas complex is delivered via liposomes, nanoparticles, exosomes, microvesicles, or a gene-gun. In a specific embodiment, the CRISPR-Cas complex is delivered in plants via electroporation, Agrobacterium transformation, or direct precipitation by means of PEG.
As used herein, the term “regulatory element linked to a polynucleotide sequence” means that the polynucleotide sequence is linked to the regulatory element in a manner that allows for expression of said polynucleotide. Said regulatory element is preferably a promoter. In a specific embodiment, the said vector system can comprise one or several regulatory elements, such as promoters, enhancers, internal ribosomal entry sites, polyadenylation signals, poly U sequences, etc. Said regulatory elements may direct constitutive expression of the polynucleotide sequence in many types of host cells or only in certain types (i.e. tissue specific regulatory sequences). A tissue-specific promoter may direct expression primarily in a desired tissue of interest or in a particular type of cells. Regulatory elements may also direct expression in a temporal-dependent manner, such as in a cell-cycle dependent or developmental stage-dependent manner, which may or may not also be tissue or cell-type specific.
In another embodiment, the CRISPR gene editing system is a vector system encoding a CRISPR-Cas complex, said vector system comprising:
In a specific embodiment, the said vector system comprises one or more expression cassettes for driving the expression of one or more guide RNAs and of the CRISPR-associated protein. Each of the expression cassettes comprises a promoter sequence, a polynucleotide sequence encoding at least one guide sequence, and a terminator.
In a specific embodiment, the disclosure relates to a method to modify the expression of a gene of interest in an organism, as mentioned above, which comprises the steps of:
In a specific embodiment, some steps can be performed before the step a) mentioned above. Such steps can be:
Said “step of analyzing the region surrounding the gene of interest” can be performed by any method known in the art.
In a specific embodiment, the disclosure relates to a method to modify the expression of a gene of interest, as mentioned above, wherein said gene of interest is a fertility-restorer gene, preferably RFL29 gene, more preferably represented by SEQ ID NO:14 (RFL29a). In another embodiment, the fertility-restorer gene is RFL79, more preferably represented by SEQ ID NO:58. In another embodiment, the fertility-restorer gene is RFL29b, more preferably represented by SEQ ID NO: 2. Such embodiment corresponds to a second aspect of the disclosure.
In a second aspect, the disclosure then relates to a method to enhance the expression of a fertility-restorer gene of interest in a plant, said method comprising the steps of:
In a specific embodiment, said method aims to enhance the expression of a fertility-restorer gene of interest in a wheat plant.
In a specific embodiment, the desired editing to enhance the expression of a fertility-restorer gene of interest is preferably a deletion or an inversion. SEQ ID NO:14 notably refers to a RFL29a gene in Triticum aestivum (var. Spelt) P1190962. In a specific embodiment, the distal promoter to drive RFL29 is expressed (active) in Z32 and Z39 spikes. In a specific embodiment, the fertility-restorer gene is RFL29 gene and said endogenous promoter identified to replace the naturally-occurring promoter of RFL29 is chosen among PK-like promoter, RAE1-like promoter or At5g02240-like promoter. PK-like promoter is notably represented by SEQ ID NO:10 (TaPK-like promoter), RAE1-like promoter by SEQ ID NO:11 (TaRAE1-like promoter) and At5g02240-like promoter by SEQ ID NO:59 (TaAt5g02240-like promoter).
In another specific embodiment, the fertility-restorer gene is an RFL79 gene, more preferably represented by SEQ ID NO:58 in Triticum aestivum.
In a specific embodiment, said CRISPR gene-editing system is a delivery system comprising, and operably configured to deliver into a plant cell, either CRISPR-Cas complex components or one or more polynucleotide sequences comprising or encoding said components, wherein said CRISPR-Cas complex components comprise:
In a specific embodiment of this method, the fertility-restorer gene is an RFL29 gene and the two guide sequences hybridize to the following targets:
In a specific embodiment of this method, the fertility-restorer gene is an RFL29 gene and the said CRISPR gene-editing system comprises the following components:
The present disclosure also relates to a plant, or a plant part, which comprises a gene-editing system designed to perform a desired editing at a target genomic site, so that after the desired editing, a gene of interest in the plant is operably linked to an endogenous promoter, said endogenous promoter being distal from a naturally occurring promoter of the gene of interest in the genome of a non-modified plant.
The present disclosure further relates to a plant obtained by the method to modify the expression of a gene of interest in an organism as mentioned above, notably the method to enhance the expression of a fertility-restorer gene of interest in a plant. In a specific embodiment, the disclosure relates thus to a wheat plant wherein the RFL29 gene is preferably linked to an endogenous promoter being distal from the naturally occurring promoter of RFL29 gene in the genome of a non-modified plant, such as PK-like promoter, RAE1-like promoter or At5g02240-like promoter.
Several techniques can be used to identify whether a plant was obtained by a method according to the present disclosure, and/or to determine if the gene editing was correctly performed. Any method known in the art may be used. Some suitable methods include, but are not limited to, sequencing, hybridization assays, polymerase chain reaction (PCR), ligase chain reaction (LCR), or combinations thereof. For example, such a method of identification can comprise a step of PCR after the desired editing was performed, in order to detect bands corresponding to the desired editing, notably inversion or deletion. The step of PCR amplification is generally followed by a sequencing step. Alternatively, the method of identification can comprise a step to determine the expression level of the gene of interest and to compare it with the expression level of the gene of interest in a non-modified plant, thereby verifying the modification of the expression of the gene of interest, preferably a higher expression of the gene of interest. In another alternative embodiment, said identification method can comprise a step to determine the level of the expressed protein encoded by the gene of interest and to compare it with the level of the expressed protein encoded by the gene of interest in a non-modified plant, thereby verifying the modification of the expression level of the protein encoded by the gene of interest, preferably a higher level of the protein.
When said plant was obtained by a method to enhance the expression of a fertility-restorer gene of interest, as disclosed above, a further step of phenotypic analysis can be performed in order to determine if said plant has a fertility restoration phenotype.
As mentioned above, a plant having a fertility restoration phenotype can be obtained according to the methods of the disclosure. It is then desirable to cross such a plant with another plant, preferably when the plant is wheat.
Therefore, the present disclosure also relates to a method to restore fertility in wheat, comprising a step of crossing a sterile wheat plant with a plant wherein the expression of a fertility-restorer gene has been enhanced, as mentioned above. Preferably the sterile wheat plant is female wheat and the plant with an enhanced fertility restorer gene is male wheat.
The Examples below are given for illustration purposes only.
The T. timopheevii CMS restorer gene RF3 has been identified as a PPR protein on Chr1B referred to as RFL29 (TraesCS1B01G038500) (WO2019086510). This gene is present in most wheat lines such as Chinese Spring though its level of expression is low as measured by RNAseq data (
There are at least 3 RFL29 variants in wheat: RFL29a, RFL29b and RFL29c. RFL29b, present in Chinese Spring, is a less effective restorer than the RFL29a allele found in lines such as Spelt (WO2019086510, Walkowiak et al. 2020).
To determine if RFL29-mediated fertility restoration can be improved, RFL29a and RFL29b were placed under the control of the strong ZmUbiquitin promoter and transformed into a wheat line containing T. timopheevii CMS. Full male fertility was observed in single copy T-DNA transformants (Melonek et al. 2021). This indicates that an increase in RFL29 expression would be sufficient to create a single locus restorer.
In this example, the inventors sought to identify a strong promoter in a gene upstream of RFL29 and to perform a genomic deletion such that the strong promoter is brought in front of RFL29 (
RNAseq data from the wheat Chinese Spring were used to identify candidate promoters that could be brought in front of the RFL29 promoter. RFL29 should be expressed in the anther to give restoration, so candidate promoters should have good expression levels in RNAseq samples from the floral spike. In particular, overexpression of RFL29b from the tapetum-specific ZmMac2 promoter was sufficient to restore fertility to CMS wheat (Melonek et al. 2021). The expression of the wheat tapetum-specific gene TaCHSL1 (Wu et al., 2008) was detected in RNAseq floral spike samples Z32 and most strongly in Z39, with no expression at stage Z65 (
The predicted protein encoded by TraesCS1B01G041300 shows 99% identity to a protein from Aegilops tauschii subsp. tauschii annotated as RAE1-like (accession XP_020173482.1). RAE1 proteins are required for export of mRNA from the nucleus. TraesCS1B01G041300 was thus named RAE1-like. This protein is represented by SEQ ID NO:8.
TraesCS1B01G038800 shows 98% identity to a protein annotated as probable serine/threonine-protein kinase PBL11 from Aegilops tauschii subsp. tauschii (accession XP_020157135.1) and was named TaPK-like. This protein is represented by SEQ ID NO:5.
The TaPK-like gene is about 7-fold more expressed than RFL29 in Z39 spikes and RAE1-like about 44-fold more expressed (
To confirm that the promoter regions of TaPK-like and TaRAE1-like are sufficient to restore fertility when used to drive RFL29 expression, the putative promoter regions of TaPK-like and TaRAE1-like were amplified from the wheat variety Fielder and linked to the coding region of RFL29a (represented by SEQ ID NO:14).
The chimeric pTaPK-like::RFL29a (SEQ ID NO:12) and pTaRAE1-like::RFL29a (SEQ ID NO:13) gene cassettes were cloned into a plant binary vector containing a plant transformation selectable marker forming plasmids pBIOS12569 and pBIOS12526 respectively.
These binary vectors were transformed into agrobacteria, forming strains T11861 and T11844, which were transformed into a Fielder CMS line, a wheat line bearing a CMS sterility trait, by Agrobacterium-mediated transformation, and the fertility of the transformants was scored.
The scoring method is outlined in
The transformed T0 plants that had the level of fertility of Fielder were classified as fully fertile. Both the pTaPK-like::RFL29a and pTaRAE1-like::RFL29a constructs gave fully fertile plants; however, the level of restoration was greater for pTaRAE1-like::RFL29a, with 92% of T0 plants being fully fertile as opposed to 61% for pTaPK-like::RFL29a (Table 1).
Fertility restoration of pTaPK-like::RFL29a and pTaRAE1-like::RFL29a transformants. T0 plants were phenotyped for fertility as described in
Plants with the fertility of wild-type Fielder were classified as fully fertile. All plants are single-copy for the transgene apart from 4 plants from construct pTaPK-like::RFL29a (indicated as *).
As described in example 1, the promoters of TaPK-like and TaRAE1-like are candidate promoters to boost or replace the natural RFL29 promoter. The wheat line ‘Spelt’ has the favourable RFL29 allele (RFL29a) therefore it is desirable to engineer a strong RFL29 restorer in this or other lines that possess RFL29a. This strong restorer can then be introgressed into any wheat line to create a male line for hybrid seed production.
The deletion could replace a part of the RFL29a promoter by bringing enhancer elements of TaPK-like or TaRAE1-like (
The Spelt RFL29a (SEQ ID NO: 16), TaPK-like (SEQ ID NO: 19) and TaRAE1-like (SEQ ID NO: 22) genomic sequences were identified by BLASTN analysis of the Spelt genome sequence. The regions in the coding sequences were analysed for sites that could be cleaved using CRISPR Cas9 or Cas12a guide RNAs (gRNAs).
TTTGAAGGTTCACAGAAGGAGAGATGG
In Table 2, the combinations of gRNAs are read vertically.
Such combinations are designed to maintain the Kozak translational initiation sequences of either TaRFL29a (such as combinations with targets 972f or 983r) or that of TaPK-like (478f or 488r) or of TaRAE1-like (1155f or 1151f).
The selected pairs of gRNAs are constructed such that they are expressed from the wheat U6 promoter (SEQ ID NO: 33) and the gRNA cassettes cloned into a plant binary transformation vector containing a ZmUbiquitin promoter::Cas9 gene cassette. In the case of use of target 1151f, the plant binary vector also contains a Cas12a gene cassette.
These binary vectors are then transformed into the wheat variety Spelt using Agrobacterium-mediated transformation. Primary transformants are then screened for the desired deletion using a PCR screen with primers in the coding region of RFL29a combined with primers in the TaPK-like or TaRAE1-like promoters. PCR bands of approximately the expected size for the deletion are sequenced to confirm the junction sequence. The junction sequence should maintain the functions of the TaPK-like or TaRAE1-like promoter and allow the translation of RFL29a. Junctions that lack a Kozak sequence, remove the ATG of RFL29, introduce a novel ATG sequence or alter an intron splice junction are less desirable.
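The PCR screen described above relies on knowing what the edited locus and the junction amplicon should look like. The following minimal sketch, under the simplifying assumption of two blunt cuts rejoined without insertions or deletions at the junction, computes the edited sequence and the expected amplicon size. The function names and the toy sequences are hypothetical and stand in for the real promoter and RFL29a sequences, which are not reproduced here.

```python
def delete_between(seq: str, cut_a: int, cut_b: int) -> str:
    """Return seq with the bases between the two cut indices removed.

    Cut indices are 0-based positions of the first base after each cut.
    Assumes a precise rejoining of the two blunt ends.
    """
    lo, hi = sorted((cut_a, cut_b))
    return seq[:lo] + seq[hi:]


def expected_amplicon_size(fwd_start: int, rev_end: int,
                           cut_a: int, cut_b: int) -> int:
    """Size of a PCR product spanning the deletion junction.

    fwd_start is the index of the first base of the forward primer and
    rev_end is the index just past the last base covered by the reverse
    primer, both given on the unedited sequence.
    """
    lo, hi = sorted((cut_a, cut_b))
    if not (fwd_start <= lo and hi <= rev_end):
        raise ValueError("primers must flank both cut sites")
    return (rev_end - fwd_start) - (hi - lo)


# Toy locus (hypothetical sequences): distal promoter, intervening DNA,
# native promoter, then the start of the coding sequence.
distal_promoter = "CCGGCC"
intervening = "TTTTTTTTTT"
native_promoter = "GAGAGA"
cds_start = "ATGGCT"
locus = distal_promoter + intervening + native_promoter + cds_start

cut_1 = len(distal_promoter)           # just after the distal promoter
cut_2 = len(locus) - len(cds_start)    # just in front of the ATG
edited = delete_between(locus, cut_1, cut_2)
```

In this toy case the edited sequence is the distal promoter joined directly to the ATG, which is the kind of junction that sequencing of the PCR band is meant to confirm.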
Plants with a desirable junction sequence between the TaPK-like promoter and the RFL29a CDS or with the TaRAE1-like promoter and the RFL29a CDS are used as males in crosses to the Fielder CMS line. F1 progeny of these crosses are sown and scored for male fertility.
RNAseq data from wheat Chinese Spring spikes were used to identify candidate promoters that could be brought in front of the RFL29 coding region via a genomic inversion. A gene with an opposing orientation to RFL29 was identified with good spike expression in Chinese Spring (
The predicted protein of this gene (TraesCS1B02G037100) shows 97% identity to a protein from Aegilops tauschii subsp. Tauschii annotated as At5g02240-like (accession XP_020198298.1). TraesCS1B02G037100 was thus named TaAt5g02240-like. This protein is represented by SEQ ID NO:49.
The TaAt5g02240-like gene is about 80-fold more expressed than RFL29 in Z39 spikes.
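Fold differences of this kind are simple ratios of expression values at a given stage. The sketch below shows one way such a ratio could be computed and used to rank candidate genes. The function names, the optional pseudocount and all numeric values are hypothetical placeholders, not measured RNAseq data from the disclosure.

```python
from typing import Dict, List, Tuple


def fold_change(candidate: float, reference: float,
                pseudocount: float = 0.0) -> float:
    """Ratio of candidate to reference expression (e.g. TPM values)."""
    denominator = reference + pseudocount
    if denominator <= 0:
        raise ValueError("reference expression must be positive")
    return (candidate + pseudocount) / denominator


def rank_candidates(expression: Dict[str, float],
                    reference_gene: str) -> List[Tuple[str, float]]:
    """Rank genes by fold change over the reference gene, highest first."""
    reference = expression[reference_gene]
    ranked = [(gene, fold_change(value, reference))
              for gene, value in expression.items()
              if gene != reference_gene]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Placeholder expression values for one stage (not real measurements).
toy_expression = {"reference_gene": 1.5, "candidate_A": 12.0, "candidate_B": 3.0}
toy_ranking = rank_candidates(toy_expression, "reference_gene")
```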
In this example, double-strand DNA breaks (DSBs) are created just in front of the initiating methionine ATG of the coding sequences of RFL29a and TaAt5g02240-like. The Spelt RFL29a (SEQ ID NO:16) and TaAt5g02240-like (SEQ ID NO:50) genomic sequences were identified by BLASTN analysis of the Spelt genome sequence. The regions in front of the coding sequences were analyzed for sites that could be cleaved using CRISPR Cas9 guide RNAs (gRNAs).
The combinations of gRNAs are read vertically.
Such combinations are designed to maintain the Kozak translational initiation sequences of either RFL29a (combinations with targets 972f or 983r) or that of TaAt5g02240-like (81f).
The selected pairs of gRNAs are constructed such that they are expressed from the wheat U6 promoter (SEQ ID NO: 33) and the gRNA cassettes cloned into a plant binary transformation vector containing a ZmUbiquitin promoter::Cas9 gene cassette.
These binary vectors are then transformed into the wheat variety Spelt using Agrobacterium-mediated transformation. Primary transformants are then screened for the desired inversion using a PCR screen with primers in the coding region of RFL29a combined with primers in the TaAt5g02240-like promoter or 5′UTR region. PCR bands of approximately the expected size for the inversion are sequenced to confirm the junction sequence. The junction sequence should maintain the functions of the TaAt5g02240-like promoter and allow the translation of RFL29a. Junctions that lack a Kozak sequence, remove the ATG of RFL29, introduce a novel ATG sequence or alter an intron splice junction are less desirable.
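To make the expected outcome of such an inversion concrete, the following minimal sketch inverts the segment between two cut sites placed just in front of the ATG of two genes lying in opposite orientations. It assumes precise rejoining of blunt ends, and the toy sequences are hypothetical stand-ins for the distal gene and RFL29a; the gene layout is an illustrative assumption.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def invert_between(seq: str, cut_a: int, cut_b: int) -> str:
    """Return seq with the segment between the two cuts inverted.

    Cut indices are 0-based positions of the first base after each cut.
    """
    lo, hi = sorted((cut_a, cut_b))
    return seq[:lo] + reverse_complement(seq[lo:hi]) + seq[hi:]


# Toy divergent gene pair (hypothetical sequences), written on the top strand.
# Gene A reads leftwards, so it appears as its reverse complement.
prom_a, cds_a = "CCCA", "ATGAAA"   # stand-in for the distal gene
prom_b, cds_b = "GGGT", "ATGCCC"   # stand-in for the gene of interest
spacer = "ACAC"
locus = (reverse_complement(cds_a) + reverse_complement(prom_a)
         + spacer + prom_b + cds_b)

cut_1 = len(cds_a)                 # just in front of the ATG of gene A
cut_2 = len(locus) - len(cds_b)    # just in front of the ATG of gene B
edited = invert_between(locus, cut_1, cut_2)
```

After the inversion in this toy example, the promoter of gene A sits directly in front of the coding sequence of gene B, and the two promoters have been exchanged; the length of the locus is unchanged, so the screen depends on primer pairs that only amplify across the new junctions.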
Plants with a desirable junction sequence between the TaAt5g02240-like promoter and the RFL29a CDS are used as males in crosses to the Fielder CMS line. F1 progeny of these crosses are sown and scored for male fertility.
Number | Date | Country | Kind |
---|---|---|---|
21305141.0 | Feb 2021 | EP | regional |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2022/052357 | 2/1/2022 | WO |